Am I weird in feeling like the code in this file is really really... normal? Like, it's verbose in certain ways due to being written in Go, as well as due to not relying on any deep abstractions (and I don't mind this - abstractions are a double-edged sword), but in general, as code, it seems typical - and if the header text didn't exist I wouldn't think twice about the style it's written in.
Maybe the disconnect here is that most of my experience is in enterprise software rather than systems software. Perhaps many of the comments in this file seem unnecessary to regular contributors within the k8s project? Whereas if I were writing this same code in an enterprise (and thus expect it to be read by people far in the future lacking context on all the technical details) I would have put -more- comments in this file, given the sheer complexity of all it's doing.
I wish code like this still felt normal to me, but over the past ~10 years it seems that many people have come to value brevity over explicitness.
I strongly prefer the explicitness, at least for important code like this. More than once in my career I've encountered situations where I couldn't figure out if the current behavior of a piece of code was intentional or accidental because it involved logic that did things like consolidating different conditions and omitting comments explaining their business context and meaning.
That's a somewhat dangerous practice IMO because it creates code that's resistant to change. Or at least resistant to being changed by anyone who isn't its author, though for most practical purposes that's a distinction without a difference. Unnecessarily creating Chesterton's fences is anti-maintainability.
I have a rule for my teams: "Don't write clever code".
I try to constantly reinforce that we don't write code for ourselves, we write it for the next person.
We should be doing everything in our power to decrease their cognitive load.
I try to envision the person that comes after me (who may be me in months or years!) and imagine that they are having a Bad Day and they have to make changes to my code.
Good code is clear, and tells a story. A story that's easy to follow, and easy to drill into.
Not to knock elixir unfairly, but I think that's the basis of my mental block with that language. It seems to be designed from the ground up to violate that rule. Everything in elixir is clever. Very clever. Too clever for me.
>I try to constantly reinforce that we don't write code for ourselves, we write it for the next person.
Heck, it doesn't even need to be another person. Even me 10 months from now, who may have forgotten some context around some code, will appreciate boring code.
That is a good rule. If you want to write clever code, do some code golf. For everything else I heavily prefer explicitness. Smart software architecture hides the verbosity unless you have business with a specific, and most likely specialized, piece of code.
Even with ugly code, you should think twice about refactoring if this particular piece has run successfully for several years and there are no security-related issues.
There are several languages that violate this principle. Brainfuck is probably one of the most prominent; it is of course not to be taken seriously. Overall, the alleged "elegance" of some code is rather annoying if you really need to understand and adapt it.
I think it's super subjective, and I'm sure it's just my mental block from 30 years of C style languages.
I have trouble with the equals sign being "pattern matching". There are other syntax things like that, where it seems like too much is being done in odd (to me) ways that are hard to grok.
I know a lot of people love it, and I really did try, but for whatever reason the syntax just doesn't work for me.
You are kind of backtracking here because I don't see anything about implicitness. I mean okay you can't get used to the syntax and you don't want to -- not something I'd deem a serious reason to drop a language but it's fair enough and it's obviously your right so cool.
But Elixir is anything but implicit. If anything, people periodically raise a stink about wanting some magic in there, and we the wider community just reject them.
Agreed. Explicitness and comments are very useful in understanding the intended functionality and logic, whether or not the code actually implements that intent correctly (and in providing that intent, they can help identify bugs earlier than they would be identified otherwise).
But comments go out of date, and the compiler doesn’t check them against the implementation. To document + enforce the intended functionality, use tests.
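To make that concrete, here's a minimal Go sketch of a test that documents intent in a form CI actually executes; the function and its contract are invented for illustration:

```go
package shuttle

import "testing"

// roundUpGiB and its contract are hypothetical, purely for illustration.
func roundUpGiB(bytes int64) int64 {
	const gib = int64(1) << 30
	return (bytes + gib - 1) / gib
}

// The test name and failure message document the intended behavior, and the
// pipeline re-checks it on every change, which a comment can't do.
func TestRoundUpGiBRoundsPartialUnitsUp(t *testing.T) {
	if got := roundUpGiB(1); got != 1 {
		t.Fatalf("roundUpGiB(1) = %d, want 1 (partial GiB must round up)", got)
	}
}
```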
Outdated context is miles better than no context in my experience. As long as the comment isn't intentionally misleading, it always helps a ton in piecing together what happened, even if the comment was factually wrong when it was written, because I can tell how the original author was mistaken and why it led to the current code.
Outdated comments are great, because it means you probably have a bug right there. If the comment didn't get updated, the code change probably didn't look at all the context and missed things.
Does it matter? If the comment doesn't match the code, there's a bug (in the comment or the code). Either way you need to spend time to understand the context and figure out the correct thing, not trusting either.
Looking at the commit history is a great start. Especially if your team actually empowers people to reject code reviews when the commit messages are unclear or insufficiently detailed.
Generalizations like that are theoretical, and don't always align with reality. There's nothing wrong with comments summarizing the "what", and in fact doing so is a good thing because it can describe anything from the intention of the code to the business logic. "This function merges array X with array Y, converting all the keys to lowercase because they will be used later by foo()."
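A minimal Go sketch of that kind of "what + why" comment; mergeKeys and the case-insensitive lookup it mentions are hypothetical:

```go
package main

import (
	"fmt"
	"strings"
)

// mergeKeys merges src into dst, lowercasing every key because later
// lookups in the (hypothetical) pipeline are case-insensitive.
func mergeKeys(dst, src map[string]string) map[string]string {
	out := make(map[string]string, len(dst)+len(src))
	for k, v := range dst {
		out[strings.ToLower(k)] = v
	}
	for k, v := range src {
		out[strings.ToLower(k)] = v // src wins on key collisions
	}
	return out
}

func main() {
	fmt.Println(mergeKeys(map[string]string{"A": "1"}, map[string]string{"b": "2"}))
}
```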
The "why" can go out of date, e.g. "do X before Y because [specific thing in Y] is dependent on [specific thing in X]". If you rewrite Y to no longer be dependent on X, the comment is now out of date.
The reality is that any comment can go out of date at any time if the code it describes changes enough. But that's really no excuse for failure to maintain comments. Sure, in reality code is messy and inconsistently written, not even counting comments. Comments are an essential parts of your codebase, and while they are used exclusively by humans, that doesn't mean they are any less worthy of being updated and cultivated.
I dunno, the "why" for me is "why are we doing this, and doing it this way?". If that changes, but somehow the comment isn't changed, that would feel really strange. It's not just tweaking a few lines, it's rewriting the whole routine. If all the code changed but not the comment, that would have to be deliberate, and definitely picked up in code review.
Though, obviously, accidents happen, etc. But then that also happens with tests and everything else. I have definitely seen out-of-date tests in code bases, where the test is no longer relevant but still maintained.
So I actually find this helpful because if the why doesn't match the what (code), I know to look back at the history of changes and see why there is a mismatch. This is honestly a great signal that something might have gone sideways in the past while I'm trying to triage a bug or whatever. So even if the comments are out of date, they're still helpful, because I know to go look at why they're out of sync.
Comments go out of date because of bad developers.
The same people who do the bare minimum for tests not to explode. But won’t add a new test case for the new branches they just introduced.
The same people who will mangle the code base introducing bizarre dependencies or copy paste the same piece of code rather than refactor.
People who fail to handle errors correctly. My favorite: wrapping code in an if statement without an else. (else? Get a weird error without logs! Miles away from the call site!)
People who don’t validate inputs.
People who don’t give a monkey about adding context to errors making the thing impossible to debug in prod when they explode.
People who are too lazy or incompetent to do their job properly and will always jump at the opportunity to save 5 minutes now but waste 5 hours of everybody else’s time later. Because of course these people can’t fix their own bugs!
And of course these are the people who make comments go out of date. I’ve seen them implement a thing literally the line below a TODO or FIXME comment and not delete the line.
Comments going out of date is a shit excuse for not writing comments as far as I’m concerned.
The fact that some people are incompetent should not drive engineering decisions. You should always assume a minimal level of competency.
> Comments going out of date is a shit excuse for not writing comments as far as I’m concerned.
I agree.
> Comments go out of date because of bad developers
I disagree.
Comments can also go out of date because
- developer is having a really shit time atm and their head is not in the game (bad looking after people management)
- developer is on a one day a week contract and doesn’t have the time in their contract to write big header comments explaining nuances (bad strategy)
- developer thought it looked obvious to them but it’s not obvious at review time (developer is being classic human)
- developer is getting pushed to deliver the thing now now now (bad workload management)
Most of those are the result of some decision made by someone who was not the developer (they’re all real cases). And they are the “non-code blockers” that good managers solve for us, so we can focus on getting good stuff done.
I’ve been where it seems like you are at. Blaming others for being bad didn’t help me. I had to lower my expectations of others, keeping my expectations for myself. Then get on about teaching willing people how they could be better. Might be making a few assumptions/projecting a bit there, but that’s my experience with “bad developers”.
Being any type of “leader” is lonely. Whether that’s an official role assigned to you or not. Or if it’s just a skill level thing. No one can quite match up to the expectations or understand what we see and why. But explaining it goes a long way with the right ones.
> Just like updating the tests when code is changed, update the comment when the code is changed.
Well, yeah. But the point is that tests can be run in a pipeline that can fail if the tests fail. Comments going out of date has to get caught by a human, and humans make mistakes.
Yeah, but there's a fundamental difference between something like tests, which can be checked automatically, and comments, which have to be checked manually. Because of this, it can be assumed that comments will eventually go out of date.
Good PR review from a skilled and more senior developer catches these things, most of the time.
Just like how tests catch functionality issues, most of the time — bugs still exist in tested software, because people make incorrect assumptions about how/what to test, or implement the test wrong.
> it can be assumed that comments will eventually go out of date.
Don’t make assumptions. That’s just a lazy excuse for not trying.
The same thing could be said for tests
> it can be assumed that tests will eventually go out of date
So why should we bother updating tests? They’re just going to go out of date?!!
Because it makes the codebase easier to work with for someone brand new.
Same as comments.
Pay down the debt for the next person. The next person could even be you in a year’s time after working in a completely different project for 9 months.
Tests only test functionality, they don't test business context. Comments explain business context.
For example, "we have this conditional here because Business needs this requirement (...) satisfied for this customer"
Your test can check that the logic works correctly. But someone coming in, without the comment, will say "why are we doing this? Is this a bug or intentional? Is the test bugged, too?"
Now, they'll see it's intentional and understand what constraints the code was written under. Your test can't send a slack message to a business analyst and ask them if your understanding is correct. The original dev does that, and then leaves a comment explaining the "why".
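A minimal Go sketch of that situation; the customer, ticket number, and discount rule are all invented:

```go
package main

import "fmt"

func applyDiscount(customerID string, total float64) float64 {
	// Business context (hypothetical ticket ACME-1234): Acme Corp's contract
	// guarantees a 10% discount on every order until 2026-01-01. A test can
	// prove this branch fires; only this comment says why it exists.
	if customerID == "acme-corp" {
		return total * 0.9
	}
	return total
}

func main() {
	fmt.Println(applyDiscount("acme-corp", 100.0)) // prints 90
}
```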
> To document + enforce the intended functionality, use tests.
Tests go out of date
Tests increase the maintenance burden
The compiler does not ensure code is tested
Tests get duplicated
Mēh! Tests matter, and testing is very important. Good judgment is required
Just like comments.
Writing code requires professional care at every step. The compiler helps, of course, but being professional is more than writing code that compiles
It involves documents too. And tests. Not too many (tests or documents) but not too few
Undocumented code is an enormous burden to maintain (I am eyebrows-deep in such a project now). It is not enough to just write code and tests; documents, including inline comments, are crucial
Note that in Rust you can include code examples in comments that will run as tests, inline in your main code, instead of having to find the relevant test function to confirm functionality.
Go has something similar: functions marked as examples that are both run as tests and shown and run as examples in the documentation (https://go.dev/blog/examples)
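A minimal sketch of that Go feature (ReverseString is made up; in a real package the Example function lives in a _test.go file):

```go
package stringutil

import "fmt"

// ReverseString returns s with its runes in reverse order.
func ReverseString(s string) string {
	r := []rune(s)
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r)
}

// `go test` compiles and runs ExampleReverseString and fails if the printed
// output doesn't match the Output comment, so the example can't silently rot.
func ExampleReverseString() {
	fmt.Println(ReverseString("shuttle"))
	// Output: elttuhs
}
```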
Interesting. I don't know when that was implemented in Rust but clearly Go has had it for a long time, since that post is dated 2015.
While things like syntax are important, languages adding tooling like this (along with stuff like package managers) is so important to the continued evolution of the craft of software development.
The first developer I ever worked with was very explicit, and I learned some important lessons from him. He had created a system in PHP that controlled printers and it was very explicit. He didn't know what a function was, his code had no functions. It was a 5000 line script that would run on a Windows timer, top to bottom. In some places the control structures were nested 17 deep, I counted; an if-statement inside an if-statement inside a while-loop inside a for-loop inside an if-statement inside a for-loop inside an if-statement, etc, 17 deep. The cyclomatic complexity was in the thousands.
I never could understand that code. I know it wasn't brief, does that mean it was explicit?
I think the truth is brevity and explicitness are orthogonal. Let me ask you this: can code be both brief and explicit? What would that look like?
I’m not sure brevity and explicitness are totally orthogonal; explicitness sort of implies spelling things out in a longer way. That doesn’t mean that something long is thorough, however.
I like the tradeoff the Swiftlang project talks much about: brevity vs clarity (because explicit doesn’t necessarily mean clear, either, as your example shows). I think those are more orthogonal concerns, both important to think about, for different reasons that may often compete.
Every time some code reviewer comes into my PR and says something along the lines of "you know you can just write it this way", where "this way" means obfuscating the code because "clever" and "shorter", I die a little on the inside. This is from experienced devs who should know better. At one point I wrote a comment right above a section I knew would be targeted by this kind of thinking, explaining it must be written this way for clarity. Sometimes that works.
I only recommend that if the more explicit code is less idiomatic. For example, if someone appends to a list in a Python loop where a list comprehension would express the same thing plainly, I’ll suggest it. That’s it.
Otherwise, please optimize for writing stuff I can understand at 2AM when things have broken.
In one of my first positions out of undergrad we had a few devs on the team that got overly caught up in stuff like this. They were no doubt smart people, but their egos got in the way of things way too often. I'm not kidding - we'd get caught up on an if statement curly bracket for a ticket and there would be an argument for 30 minutes to an hour over whether the curly bracket should be there or not. These arguments would go into full blown dissertations, evolving into tangents of BSD coding style or Doom code. Keep in mind this was on a very well known automotive software platform with 20k bugs and counting open.
There is definitely a time and a place. Those were extremely frustrating times and a great learning experience for a young dev.
I’m not a dev, but I manage them. On one team they were spending many hours on code golf and nothing was being built. I pushed the devs toward "passing tests = PR accepted".
In your opinion, what problems might come from removing opinionated code reviews? Why do some reviewers gravitate toward “Here’s how I would have written it?”
Often, different approaches can be used to solve a given requirement. So debate is needed.
But it could be that your team was just divided on approach and style.
They were struggling because they were trying to work out their differences through PR comments. That will be frustrating for everyone. Somebody went and got the PR working "the other way," and now the reviewer is trying to get the author to change the PR to "their way". If it goes on long enough, your devs will head for "the highway"...
If you just mandate "test pass = pr accepted," it will unblock your team short term, but in the long term, the large system will gravitate into many tiny, fiercely defended fiefs, each with different styles. Maintenance will be slow. Debugging will be complex. Wide-reaching refactors will be prone to blockage.
Fix the coding by having the team spend time "not coding." Establish a proposal and design process where an author needs to get team buy-in before they can implement. Establish an accepted standard on coding style. Do wide reviews on critical features and highlight issues where flow needs to go from one end of the system to the other and there are weird boundaries. Do the RCA (root cause analysis) and ask the "five whys". Look for patterns of issues and address them soon rather than pushing them to the back of the backlog. Prefer many small incremental changes over few large changes.
I think the biggest problem that can come of it is lack of standards.
Each dev has their own way of writing things, their own little language. To them it is perfect. And it all works, passes tests. But, if you let them do this then your code becomes sloppy. Styles go in and out. Like reading a book where every other paragraph is written by a different person, and none of them talked to each other. Sometimes you get PascalCase, sometimes camel_case, sometimes snakeCase. Sometimes booleans are real bools, sometimes they're "Y" "N" (yes, real) or sometimes they're ints. Sometimes functions take in a lot of arguments, sometimes they take in a struct with options. Sometimes missing data is nullable, other times it's an empty string, other times it's "N/A". And on and on.
You can enforce a lot of this through automatic linters, but not all. You require a set standard and the PR procedure can enforce the standard.
I used to have opinions but I don't care anymore. After spelunking through all manner of code bases I just match the style and move on. Verbose or terse, comments at the top or inline or none at all, 100 or 5 line functions, etc.
I agree to a point, but I would separate explicit code from excessive commenting. Explicit code is good because it lets you explain to the reader what you're actually trying to do. Excessive comments (or even comments in general) are less so, because the compiler cannot check them for correctness; if someone simply forgets to update a comment or writes it incorrectly, then the only thing that can potentially catch it is a code review.
I haven't looked close enough at it to really know for sure. I'm not saying it's _always_ bad, comments are helpful, but the problem is that unlike code they are not required to actually match reality.
In this case, I see several `if`s with no corresponding `else` even when the `if` section does not throw/return at the end, and that's largely my point. If the "space shuttle code" requirement is not actually rigorously followed, then why go on at length about it? And if it really is that important, then the comment about it is not good enough.
Rather than a comment about it that can be ignored, they should set up a static analyzer to enforce it at build time. That way you're forced to follow the convention and not relying on code reviewers that probably don't even see that comment during their review.
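A minimal sketch of such a checker using Go's own go/ast package (run as, say, `go run check.go pv_controller.go`; a real version would also exempt the error-check and early-return cases the header already allows):

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"os"
)

func main() {
	fset := token.NewFileSet()
	// Parse the Go source file named on the command line.
	file, err := parser.ParseFile(fset, os.Args[1], nil, 0)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Walk the syntax tree and flag every `if` with no `else` branch.
	ast.Inspect(file, func(n ast.Node) bool {
		if stmt, ok := n.(*ast.IfStmt); ok && stmt.Else == nil {
			fmt.Printf("%s: if without matching else\n", fset.Position(stmt.Pos()))
		}
		return true
	})
}
```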
Most likely, this comment was added in response to a botched attempt to simplify code, to serve as a warning for future maintainers to think twice before making a similar attempt.
The commit that added the warning was "Add note about space-shuttle code style"[1], and the one before that was "Revert controller/volume: simplify sync logic in syncUnboundClaim"[2]
FWIW, I don't think the code style in [2] is less simple (slightly more readable, to use `if (X) {} else {}` rather than `if (!X) {} else {}`, for example).
So to me, this reads as the author of [1] is just overcorrecting by adding process, when some test cases or code review would've been more helpful in preventing whatever incident [2] caused.
I found myself thinking the same thing until I got to the hugely nested if statements. I would definitely have created some early-return branches to that thing.
It does feel like the first "make it work" step (from "make it work, make it fast, make it pretty"), and then they just didn't do the "make it pretty" step. I have written code this "ugly" before, with this many comments, when working out a thorny state interaction. But I usually clean it up a bit before submitting for review. Maybe instead I should just put a huge "do not attempt to simplify this code" banner at the top hehe :)
You may be weird, but you're not alone. I too think this looks perfectly normal. I have written similar-looking code and comments (at least from a high-level point of view) for any components that I feel are critical to the reliability of the system. I've never subscribed to the "no comments" fad since my own comments have all too often been invaluable to future me when returning to the code after months or years of attention elsewhere. I can't imagine trying to piece together all of the embedded logic in a component of this complexity without solid comments.
In particular, the "every if has a matching else" claim doesn't seem reliably true. Many of the unmatched ifs are just simple `if err != nil {` checks or other early returns, but outside of those, there do seem to be unmatched ifs.
That said, my experience in enterprise software isn't that extra comments are necessarily present (there was a plague of "// end if" comments in the codebase, but actual descriptive comments were rare).
Yeah I should clarify that I was speaking personally about new code. I definitely encounter plenty of legacy code that is completely inscrutable.
But, I can say in 2024 at least, that most teams I've been on value explaining complex logic (especially logic that can break in subtle ways if not properly maintained) with comments, in new code we write.
In most code I look at, to try and get clarity on something I’m trying to do, I have no idea what’s going on. Variables seem arbitrary, everything seems cryptic, and I can’t be bothered to try and track down what’s happening.
My own code doesn’t look like that at all. I have long descriptive variables, what I think are easy to read and follow functions, etc. Sometimes I think if I want to be “good” I need my stuff to look like what I find, but at the end of the day, I want it to be easy for me (and hopefully others) to maintain. If no one else can read my code, I don’t view it as job security, I view it as handcuffs.
Related article on Space Shuttle Software Quality [0]
Excerpt: "But how much work the software does is not what makes it remarkable. What makes it remarkable is how well the software works. This software never crashes. It never needs to be re-booted. This software is bug-free. It is perfect, as perfect as human beings have achieved. Consider these stats: the last three versions of the program — each 420,000 lines long — had just one error each. The last 11 versions of this software had a total of 17 errors. Commercial programs of equivalent complexity would have 5,000 errors."
> Consider these stats: the last three versions of the program — each 420,000 lines long — had just one error each.
What exactly do they mean by this? If each of the 3 versions had exactly one bug, isn't this just a weird way of saying the first 2 fixes either didn't work or introduced a new bug?
The SRR (software readiness review) process happened after development but prior to certification for launch. Most of the bugs were found here and were found to have existed in the code since the beginning of the program.
These were overwhelmingly low severity discrepancy reports.
If I recall correctly, there was a time when they were finding lots of bugs through SRR, so the main development team started their own "continuous review" designed to catch bugs before going to SRR. This made the SRR people angry because they were finding fewer bugs and felt the development team was focusing on competition over bug numbers rather than the code itself.
> This made the SRR people angry because they were finding fewer bugs and felt the development team was focusing on competition over bug numbers rather than the code itself.
This reminds me of the Quality culture, at my last job, which was a famous Japanese optical corporation.
It was deliberately set up, so there was an adversarial relationship between QA, and Development, with QA holding the aces.
As a Development manager, it used to drive me nuts (Ed. Well, it wasn’t much of a “drive.” More like a short putt). It did result in very high-Quality software, but at the cost of agility.
It reflected their hardware development methodology, which regularly resulted in stunningly high-Quality kit, but I think caused a lot of issues with the usability and effectiveness of their software.
Japan cultivates this through its entire culture, starting from young children. We Westerners are already at least 2 decades behind, sometimes even 4-5 decades ...
I liked an NPR article about how Mayans let their children help with chores from a young age, when in contrast we tell them to go away and hand them an iPad. I read it before having a kid, and now that I do, if my daughter can help with a chore in any way, I let her, and encourage her for helping. She is so overjoyed to be helping out.
The claim that a codebase of 420k lines contains "only one error" is of course absurd, and the members of this forum would laugh anyone out of the room who made such a claim about any other project, pointing out how they cannot possibly know, actual logical contradictions in the claims as described by GP, or just plain ridiculing it without further elaboration.
But since the code in question cannot meaningfully be tested by the public, and people have been indoctrinated to believe the myth that aerospace engineers are an entirely different species that doesn't make mistakes in the sense that the rest of the workforce does, the hubris is accepted until disproven, which it probably won't be for various practical reasons.
Nevermind that the Space Shuttle was a death trap that killed two crews, and that the associated investigations (especially for Challenger) revealed numerous serious issues in quality and safety management. People will continue to just nod their head when they hear nonsense like this, because that's what they have seen others do, and so Schrödinger's Hubris can live on.
That would be the purpose of formal proofs, wouldn’t it?
Formal proofs may not be silver bullets, and we’re never safe from a faulty implementation of the proven algorithms, but this Quanta article on a DARPA project showed impressive results [0].
Formal proofs can only prove that the system matches a specification.
Many (most?) non-trivial bugs are actually flaws in the specification, misunderstandings about what exactly you wanted and what real-world consequences arise from what you specified.
> Formal proofs may not be silver bullets, and we’re never safe from a faulty implementation of the proven algorithms
You also aren't safe from misunderstanding what it is that you've proven about the program.
Which is actually the same problem as other software bugs; you have a specification of something, but you don't know what it is that you've specified, and you wish it were something other than what it is.
Too true. When we translate complex concepts to code & math, either may inadvertently not be exactly what we wanted.
Interesting to think about a formal code verification system that maintained a connection between all four prime artifacts: natural language description of problem as understood & intended solution, vs. code and the properties actually verified.
That is wrong in general. With enough tests you absolutely can show the absence of bugs for certain programs. It is, for example, easy to test "hello world" exhaustively.
Let's say you've tested this thing 1 million times. Each time the output was automatically checked by five different and independently-developed test suites. You're ready to swear there's no possible way for it to fail.
And then someone tries it with a ulimit of 16kB.
Does it run?
Do you _know_?
Do you even know what is correct behavior in this situation?
The system it runs on is part of the specification. A program is correct if fulfills all specified requirements. You're saying a car is defective because it breaks when you put sugar in the tank.
This is a no true Scotsman fallacy. Any time it runs incorrectly, it was invoked with the wrong operating conditions — which seem to be defined as any conditions that cause it to run incorrectly.
Sugar in the tank is an agreeable example because of how obvious it is, but what about something more subtle? An odd condition that leads to the wrong resonant frequency. An unknown software bug that makes the brakes lock up on Tuesdays at 12:00 AM on one specific backroad in a remote part of Virginia. The combinatorial possibilities of operating conditions are too numerous to exhaustively test.
I guess you could say that every function has every quality that it happens to have, so that functions need only exist in order to be “correct.”
You typically write down the specification before claiming that your code is done so you can’t use it to claim that any undesired behavior is not a bug. I naturally agree that in general tests are insufficient to show correctness because the state space is of impractical size, but, as I said above, for certain programs they totally can be.
Or maybe between one version and the next they only found one bug (there may have been bugs in the first version which weren't fixed until the third or later) - this seems more plausible to me since it's... rather difficult to count bugs until after you know about them.
Of course now the greatness of the feat depends on how much testing there was between versions, but given that it was the shuttle there was probably a lot.
And then we find a new category of bug. Consider how we ran millions of different programs for many billions of CPU hours on all of our x86 CPUs before we learned about Spectre and Meltdown.
Not just testing - line-by-line code review of the entire system by a panel of experts. Outside of aerospace/defence/nuclear this style of review is not very common.
I don't know about now, but software verification would also be used in consumer electronics.
Fixing a bug in 10,000 washing machine control boards is very expensive when it entails sending a technician to every house to replace the circuit board.
Yeah, I guess for devices that can't receive OTA updates that makes sense. Though I fear that segment of the market is rapidly shrinking - televisions have ubiquitous software update capabilities now, and even washing machines are increasingly internet connected.
We didn't apply anywhere near that kind of quality control to smartphones or VR headsets. Once users are trained to install OTA updates to fix issues, most of the impetus for extreme quality control outside of the bootloader->OTA update path is gone
It would be interesting to see the NASA approach compared to how SpaceX does things. Considering that they have done manned missions they seem to have very similar requirements.
It'd depend on what software is under consideration. IIRC the UI in Crew Dragon is using more contemporary stuff, Node.js I think. This is fine because they have redundancy, there's minimal crew control anyway, and there are manual overrides behind a panel below the screens.
They have 3 relatively modern CPUs set up to run the same code and error-check each other, such that if one has an issue, there's still redundancy when it's rebooting.
The software controlling critical systems is probably closer to NASA-esque practices.
Electron was "bashed" for the resource overhead on consumer PCs. That's not really relevant to an aerospace firm who can spec their hardware to match the exact resource requirements of their tech stack.
No, I don't think so, 295x is a crazy high factor. The article says 260 people are involved, and let's generously say it took 20 years to write the software (the first mission was 10 years after the program started and it was around for a total of 40 years).
Dividing by 295 means a commercial team of the same size could have done it in less than a month. Or with a 10x smaller team, about 8 months. I don't think either of those are plausible.
We're well in mythical-man-month territory as you try and accelerate the timeline, though. Typical software companies can't coordinate a 260 person team quickly enough to even start a project of that size in a month.
As I skimmed that, the text wrapped at the hyphen in man-hours, and my brain autocompleted it to "295x fewer managers" - and it pretty much rings true...
> a system that is no longer operational due to its poor safety record
the safety problems with the shuttle were, broadly speaking, hardware problems and not software problems.
from "Appendix F - Personal Observations on Reliability of Shuttle" [0], which was Richard Feynman's appendix to the report on the 1986 Challenger disaster:
> To summarize then, the computer software checking system and attitude is of the highest quality. There appears to be no process of gradually fooling oneself while degrading standards so characteristic of the Solid Rocket Booster or Space Shuttle Main Engine safety systems.
he specifically highlighted the quality of the avionics software as an example of how engineering on a project like the Shuttle could be done well, and wasn't doomed to be low-quality and unsafe simply by virtue of being a large complicated government project.
100 successes and 2 failures. About a 1.6% failure rate, from memory.
That’s not a great record. Sure it’s a complex field, and it’s not as dangerous as say being US President, but a failure rate of >1% is not something to write home about.
Its predecessor launch system had 10 successes and 2 failures for the crewed flights. In one of those failures, they got the crew home safely, but it was close. So that's a 17% failure rate and an 8% rate of failures killing the crew.
Not saying the Shuttle's success rate was awesome; I'm glad we demand more nowadays. But it still represented a pretty decent crew safety improvement for the USA's human spaceflight program.
Challenger blew up on launch because of a booster failure due to a faulty O-ring seal.
Columbia burned up on re-entry because a piece of insulating foam broke off from the external tank during launch, damaging the heat tiles on its left wing.
As far as we know, software never caused any dangerous incidents for the shuttle. You can't say that about Arianespace (Ariane 5 flight 501) or SpaceX (a couple of crashes while trying to land - low stakes though) or Airbus ("just" some bad UX in critical situations) or Boeing (software basically killed a few hundred people).
Sure, but I imagine at least some components only really execute a small number of times per flight, or possibly never in the case of certain error handling code. Stretching the metaphor more than is probably appropriate, I'd treat launching the shuttle and having it come back as a big integration test. A system that passes its integration test 100 times isn't necessarily particularly impressive in terms of reliability.
We run our integration test tens of times a day, and it fails once or twice a month. Our system is kinda flaky :(
A 2% failure rate isn't impressive, but I'm fine not crediting any of the shuttle issues to software. My only point is that 100 instance of use for purpose isn't enough, to my mind, to argue that a piece of software is exceptionally reliable.
It wasn't a poor safety record that killed the shuttle. It was the cost, and the anticipation of degraded safety in the future.
Even though more astronauts died because of the two shuttle accidents than in any other NASA disaster, the safety record was absolutely amazing when you consider everything that was going on.
> The space shuttle became obsolete technology after all those years. Would've needed a redesign.
Are people aware of how old the technology is that's currently putting objects and people into space? No, the space shuttle was not obsolete. It was expensive... very expensive. To this day, we still don't have a replacement for its capabilities though.
Meanwhile the X37 is just silently doing everything the Air Force wanted the shuttle to do and probably way better since it doesn't need to meet anyone's long term goals of space exploration.
What's so bad about old technology? Siemens is still selling point mechanisms based on designs that are almost as old as electric motors themselves. They work just fine.
the challenger disaster is noteworthy as a tragic incident because many individuals tried to stop the launch knowing this (ahem, hardware) issue was present. to many people, it was not a surprise when it happened.
But in neither case was it due to a code failure that put the shuttle into an unrecoverable state, but rather one that falls into materials and/or mechanical engineering.
> Even though more astronauts died because of the two shuttle accidents than in any other NASA disaster, the safety record was absolutely amazing when you consider everything that was going on.
That's "the operation was successful but the patient died" logic. Killing over 1% of your riders is not a good safety record! No ifs, no buts.
The situation with the Space Shuttle is more complex than simply poor safety. In terms of missions, it has a better record than many other launch vehicles - 2 fatal missions out of 135 for the shuttle, 2 out of 66 for the Soviet-era Soyuz, and a frighteningly poor 1 fatal mission out of only 12 spaceflights for SpaceShipTwo.
However, the Space Shuttle had a much larger crew capacity than most missions probably needed (up to eight astronauts compared to Apollo or Soyuz's three), especially considering that the majority of Soviet/Roscosmos, ESA and CNSA missions were autonomous and completely unmanned - no crew to endanger!
Perhaps that makes the metaphor even better for Kubernetes: a highly engineered, capable and multi-purpose system requiring considerable attention, and probably used a little more than it should be.
> had a much larger crew capacity than most missions probably needed
We rarely flew the maximum number of passengers. On non-EVA missions we typically only sent up 5 astronauts. For EVA missions we usually sent up 7 with the two extra crew typically being dedicated to the EVA.
EVAs are a real chore. The shuttle is at 14.7 psi with regular atmosphere, but the EVA suits are 4 psi with pure oxygen atmosphere, so you have to spend a lot of time pre-breathing just to put on the suit. It also drains the hell out of you because it's microgravity, not zero gravity, and moving around and positioning in the suit using just your hands all day wears you out fast.
The extra capacity was also useful for bringing other nations' personnel onto a mission with us. They didn't strictly have a large mission role, but it is good diplomacy, and helps other nations build up their own space programs. Plus.. a few times.. we sent up 6 but brought back 7, which is a nice feature.
Anyways, when not sending up extra crew, we used the additional space for equipment and large experiment packages, some of them as large as a single person.
> and probably used a little more than it should be.
NASA did a great job of making space flight look normal and routine. Which is a bummer, because if you dig into any of their voluminous documentation on the shuttle program as a whole, or into individual missions, they are all anything but.
Space is just an absolutely insane environment to work in, let alone to discover and study from, and the shuttle was an excellent platform for that work. Somewhere along the way space commercialization became about building stations and just trucking people back and forth to them. And for that inglorious mission the shuttle is definitely not appropriate.
Anyways.. one of my favorite things about the orbiter.. the front thermal windows need to be tinted, obviously, but what I found out recently is that they used the same green dye used in US banknotes to do it.
Per passenger mile traveled, the most common measure, it's one of the safest vehicles ever created and flown.
> will people even remember the Space Shuttle in a good light?
This is an honest question, since my childhood was squarely in the 1980s, but how can you possibly not? Are you so young that your only perspective on this program and all its missions and accomplishments is purely in retrospect and colored by the zeitgeist of our current civilian space contractors?
Not even close to the safest vehicle unless you mean space vehicles. The Space Shuttle was in orbit for 21k orbits and traveled 542 million miles. Which gives 28 deaths per billion miles.
Airliners are running 0.01 deaths per billion miles. Driving is 15 per billion miles. So it was worse than driving. Airliners beat it by 3 orders of magnitude. Which isn't surprising when US airliners travel the distance of the Space Shuttle in less than a month.
You are off by a factor of at least 5, because it's _passenger_ miles, not _vehicle_ miles. This is also why airlines are "so safe," because we put 300 people on them at a whack, if you were wondering where the "three orders of magnitude" actually comes from. It's still a man made machine being operated by humans.
So it's 5.1. Three times safer than driving. You might apocryphally conclude they were at greater risk taking the astro van to the pad.
>DO-178B. It’s a quality standard for safety-critical aviation products... Your tests have to cause each branch operation in the resulting binary code to be taken and to fall through at least once... took a year of 60 hour weeks... It made a huge, huge difference. We just didn’t really have any bugs for the next eight or nine years.
This controller is intentionally written in a very verbose style. You will notice:
1. Every 'if' statement has a matching 'else' (exception: simple error checks for a client API call)
2. Things that may seem obvious are commented explicitly
We call this style 'space shuttle style'. Space shuttle style is meant to ensure that every branch and condition is considered and accounted for - the same way code is written at NASA for applications like the space shuttle.
----------------------------
^^^^
This bit reminds me of exhaustive checks in typescript code. I try to use them all the time.
Speaking specifically about cases where any not-completely-trivial `if` is matched with an explicit `else`: I wonder to what extent this code could be simplified if the authors of k8s had chosen to design around using structural pattern matching rather than `if`/`else` blocks?
Lots of mainstream languages with support for structural pattern matching have compile-time tooling to check whether a match was exhaustive, which alone could serve as an idiomatic solution while increasing information density in the code.
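Go itself has no compile-time exhaustiveness check, so the nearest idiom is a switch whose default branch makes the "impossible" case loud; community linters such as exhaustive can recover part of the guarantee. A hedged sketch with an invented Phase type:

```go
package main

import "fmt"

// Phase is a hypothetical stand-in for a claim's binding state.
type Phase int

const (
	Pending Phase = iota
	Bound
	Lost
)

// Every known case is handled, and the default turns an unhandled new
// phase into an explicit error instead of silent fall-through.
func describe(p Phase) (string, error) {
	switch p {
	case Pending:
		return "waiting for a volume", nil
	case Bound:
		return "bound to a volume", nil
	case Lost:
		return "volume lost", nil
	default:
		return "", fmt.Errorf("unexpected phase %d", p)
	}
}

func main() {
	s, _ := describe(Bound)
	fmt.Println(s)
}
```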
I saw it in the kube reddit, got a chuckle, and figured I would share here. Guess I got a chuckle 6 years ago too! Can't remember how I found it last time.
I've obviously only skimmed the code, but honestly it doesn't look that bad to me. Sure there are things I would do differently, but I've seen much, much worse. At least the code follows a single convention, and has the appearance that everything was thought through and that there is method behind the madness, as it were. I'd take this any day over the typical mishmash of styles, lazy coding, illogical code structure etc. that I've encountered so many times.
> // 1. Every 'if' statement has a matching 'else' (exception: simple error
> // checks for a client API call)
> // 2. Things that may seem obvious are commented explicitly
Honest question: Why invent "safety" practices and ignore every documented software engineering best practice? 2,000 line long modules and 200-line methods with 3-4 if-levels are considered harmful. Comments that say what the code does instead of specifying why are similarly not useful and likely to go out of date with the actual code. Gratuitous use of `nil`. These are just surface-level observations without getting into coupling, SRP, etc.
"The flight control code for the Armadillo rockets is only a few thousand lines of code, so I took the main tic function and started inlining all the subroutines. While I can't say that I found a hidden bug that could have caused a crash (literally...), I did find several variables that were set multiple times, a couple control flow things that looked a bit dodgy, and the final code got smaller and cleaner."
If Carmack finds value in the approach, perhaps we shouldn't dismiss it out of hand.
Also worth noting his follow-up comment:
"In the years since I wrote this, I have gotten much more bullish about pure functional programming, even in C/C++ where reasonable... When it gets to be too much to take, figure out how to factor blocks out into pure functions"
Thanks. This is the first instance of a respected software engineer arguing in favor of this style that I have read (contrast with Dave Thomas, Kent Beck, Bob Martin, etc.)!
Arbitrary line limits tend to cause unnecessary fragmentation. Add includes, licenses, glue code and comments, and you have unapproachable spaghetti.
Try to keep methods to 200 lines in high performance code, and see your performance crash and burn like Icarus' flight.
When you read the comments in the code, you can see that they consolidated the code into a single module and embedded an enormous amount of know-how to keep it approachable and, more importantly, sustainable.
For someone who doesn't know the language or the logic in a piece of code, the set of comments which outline what the code does is very helpful. In six months, your code will be foreign to you, so it's useful for you, too.
Comments are part of the code and the codebase. If you're not updating them as you update the code around them, you're introducing documentation bugs into your code. Just because the compiler doesn't act on them doesn't mean they are not functional parts of your code. In essence they're your knowledge and lab notebook, embedded in your code, and way more valuable in maintaining the code you wrote. They are more valuable than the code which is executed by the computer.
Best practices are guidelines, not laws or strict rules. You apply them as they fit to your codebase. Do not obey them blindly and create problematic codebases.
Sometimes you have to bend the rules and make your own, and it's totally acceptable when you know what you're doing.
> Try to keep methods to 200 lines in high performance code, and see your performance crash and burn like Icarus' flight.
Are these loops in Kubernetes so hot that extra microseconds for some program stack manipulation will affect performance? I never took Kubernetes as a hyper-real time application.
>Do not obey them blindly and create problematic code bases.
I don't know the code so won't question it specifically, but wouldn't this also apply to "space shuttle programming"? I feel space shuttle programming's job in many ways is in fact to remove ambiguity from code: not by explaining the language, but by explaining the variables and their units. I sure wouldn't mind spamming "units in cm" everywhere or explaining every branch's logic if it's mission critical. Not so much this inconsistent doxygen/javadoc-style documentation on every variable/class. If you're going to go full enterprise programming, commit to it.
Above everything else, the big thing going through my mind reading these is: "a proper linter configuration would have really helped enforce these rules".
> Are these loops in Kubernetes so hot that extra microseconds for some program stack manipulation will affect performance?
Actually, looking at the code itself, pv_controller doesn't look overly hot, but extremely high value. In this case the long methods are intended to keep the logic confined, so one can read end to end and understand what is going on.
In some cases the code doesn't even use Go's automatic type inference (the := syntax), to make the code more readable.
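For illustration, a tiny hypothetical example of that trade-off (the types and function are invented):

```go
package main

import "fmt"

type Claim struct{ Name string }

func listClaims() []*Claim { return []*Claim{{Name: "db-data"}} }

func main() {
	// Explicit type: the reader sees what listClaims returns at a glance.
	var claims []*Claim = listClaims()
	// Inferred form: shorter, but the type has to be looked up.
	inferred := listClaims()
	fmt.Println(len(claims), len(inferred))
}
```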
From what I understand, this code needs to be "kernel level robust", so they kept the overly verbose formatting and collected all three files into a single, overly verbose file.
I don't think this is a bad thing. This is an important piece of a scale-out system which needs to work without fault (debate of this is another comment's subject), and more importantly it's developed by a horde of people. So this style makes sense to put every person touching the code on the same page quick.
> I feel space shuttle programming's job in many ways is in fact to remove ambiguity from code: not by explaining the language, but by explaining the variables and their units. I sure wouldn't mind spamming "units in cm" everywhere or explaining every branch's logic if it's mission critical.
A code comment needs to explain both the logic, and how the programming language implements this logic in the best way possible. In some cases, an optimized statement doesn't look like what the comment above it says it's doing (e.g. the infamous WTF comment on id Software's fast inverse square root, which uses a magic number). In these cases I open a "Magic Alert" comment block to explain what I'm doing and how it translates to the code.
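Here's that very example sketched in Go (the constant and Newton step are the well-known Quake III trick; the comment is what keeps the code legible):

```go
package main

import (
	"fmt"
	"math"
)

// MAGIC ALERT: fast inverse square root, the famous Quake III trick.
// It reinterprets the float's bits as an integer, applies a magic-number
// bit hack to approximate 1/sqrt(x), then refines with one Newton step.
// Without this comment, nothing below looks like what it computes.
func fastInvSqrt(x float32) float32 {
	i := math.Float32bits(x)
	i = 0x5f3759df - (i >> 1)
	y := math.Float32frombits(i)
	return y * (1.5 - 0.5*x*y*y)
}

func main() {
	fmt.Println(fastInvSqrt(4.0)) // ~0.5
}
```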
This becomes more evident in hardware programming and while working around quirks of the hardware you interface with ("why this weird wait?", or "why are you pushing these bytes which has no meaning?"), but it also happens with scientific software which you do some calculation which looks like something else (e.g.: Numerical integration, esp. in near-singular cases).
> Not so much this inconsistent doxygen/javadoc-style documentation on every variable/class. If you're going to go full enterprise programming, commit to it.
This is not inconsistent. It's just stream-of-consciousness commenting. If you read the code from top to bottom, you can say that "aha, they thought this first, then remembered that they have to check this too, etc.", which is also what I do on my codebases [0]. Plus inline comments are shown as help blobs by gopls, so it's a win-win.
I personally prefer to do "full on compileable documentation" on bigger codebases because the entry point is not so visible in these.
> "a proper linter configuraion would have really helped enforce these rules".
gopls and gofmt do great job of formatting the codebase and enforcing good practices, but they don't touch documentation unfortunately.
I tried writing in this "safe" way for quite a while, but I found the number of bugs I wrote was much higher and took way longer than just using railroad-style error handling via early returns.
The problem with having an explicit else for every if block is that the complexity of trying to remember the current context just explodes. I think a reasonable reframe of this rule would be "Every if-conditional block either returns early or it has a matching else block". The pattern of "if (cond) { do special handling }" is definitely way more dangerous than early return and makes it much harder to reason about.
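A small sketch of that contrast (Claim, useVolume, and bindVolume are invented stand-ins, not k8s types):

```go
package main

import (
	"errors"
	"fmt"
)

type Claim struct{ Bound bool }

func useVolume(c *Claim) error  { return nil }
func bindVolume(c *Claim) error { return nil }

// Matching-else style: every branch is explicit, but the reader has to
// carry the whole nesting context in their head.
func processNested(c *Claim) error {
	if c != nil {
		if c.Bound {
			return useVolume(c)
		} else {
			return bindVolume(c)
		}
	} else {
		return errors.New("nil claim")
	}
}

// Railroad style: each guard clause exits immediately, so the happy path
// reads straight down with nothing left to remember.
func processRailroad(c *Claim) error {
	if c == nil {
		return errors.New("nil claim")
	}
	if c.Bound {
		return useVolume(c)
	}
	return bindVolume(c)
}

func main() {
	fmt.Println(processNested(&Claim{Bound: true}), processRailroad(nil))
}
```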
There is no single canonical suite of best practices.
There is also nothing harmful or unharmful about the length of a function or the lines of code in a file. Different languages have their opinions on how you should organise your code but none of them can claim to be ‘best practice’.
Go as a language doesn’t favour code split across many small files.
It's all opinions and "best practice" isn't some objective single rule to uphold.
But generally, best practices are "best" for a reason, some of them empirical. The machine usually won't care but the humans do. E.g. VS or JetBrains will simply reject autocompletion if you make a file too big, and if you override the configuration it will slow down your entire IDE. So there is a "hard" soft-limit on how many lines you put in a file.
Same with line width. Sure, word wrap exists but you do sacrifice ease and speed of readability if you have overly long stretches of code on one line, adding a 2nd dimension to scroll.
Assuming there is a compelling reason for a large file to begin with: with all due respect to VS Code and JetBrains, if the tool chokes because the file is too big, use a better tool.
As for long lines, there is sometimes value in consistently formatting things, even if it makes it somewhat harder to read because the lines run long. For example, it can make similar things all appear in the same column, so it's easy to visually scan down the column to see if something is amiss.
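For instance, gofmt itself aligns adjacent single-line declarations into columns (the durations below are invented):

```go
package main

import (
	"fmt"
	"time"
)

// A value that breaks the column pattern stands out at a glance.
const (
	retryInterval  = 10 * time.Second
	resyncInterval = 30 * time.Second
	bindTimeout    = 5 * time.Minute
)

func main() { fmt.Println(retryInterval, resyncInterval, bindTimeout) }
```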
In any case, since soft wrapping has been available for ages, why do you feel the need to reformat the code at all in order to see long lines?
There's nothing inherently wrong with a 200-line-long method. If the code inside is linear and keeps the same level of abstraction - it can be the best option.
The alternative (let's say 40 5-line-long methods) can be worse (because you have to jump from place to place to understand everything, and you can mess up the order in which they should be called - there's 40! permutations to choose from).
> Why invent "safety" practices and ignore every documented software engineering best practice?
That seems unnecessarily brutal (and untrue).
> 2,000 line long modules and 200-line methods with 3-4 if-levels are considered harmful
Sometimes, not always. Limiting file size arbitrarily is not "best practice". There are times where keeping the context in one place lowers the cognitive complexity in understanding the logic. If these functions are logically tightly related, splitting them out into multiple files will likely make things worse. 2000 lines (a lot of white space and comments) isn't crazy at all for a complicated piece of business logic.
> Comments that say what the code does instead of specifying why are similarly not useful and likely to go out of date with the actual code.
I don't think this is a clear cut best practice either. A comment that explains that you set var a to parameter b is useless, but it can have utility if the "what" adds more context, which seems to be the case in this file from skimming it. There's code and there's business logic and comments can act as translation between the two without necessarily being the why.
> Gratuitous use of `nil`
Welcome to golang. `nil` for error values is standard.
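For readers outside Go, the exempted pattern looks roughly like this (findVolume and the types are hypothetical):

```go
package main

import (
	"errors"
	"fmt"
)

type Volume struct{ Name string }

// findVolume is a hypothetical lookup.
func findVolume(name string) (*Volume, error) {
	if name == "" {
		return nil, errors.New("empty name")
	}
	return &Volume{Name: name}, nil
}

// The idiom the pv_controller header exempts: a nil error means success,
// and the check returns early instead of carrying a matching else.
func syncVolume(name string) error {
	volume, err := findVolume(name)
	if err != nil {
		return fmt.Errorf("finding volume %q: %w", name, err)
	}
	fmt.Println("synced", volume.Name) // happy path continues
	return nil
}

func main() { _ = syncVolume("pv-1") }
```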
Splitting a 200 line method into 20, 10-line methods rarely improves readability, it just tricks you into thinking those 200 lines are simpler than they actually are.
Furthermore, how to split 200 lines into methods is context dependent. Looking through the lens of memory, optimality, simplicity, different flows of concern, and you'll want to split those 200 lines up differently.
The problem space is complex, hiding that fact doesn't get rid of that fact.
I would like to hear about what makes them bullshitters. I've had and seen really good results in terms of high productivity and low bug count on teams that followed the SOLID principles described in Robert Martin's Clean Architecture as well as Kent Beck's "make it work, make it right, make it fast." I've also universally observed the opposite results on teams that didn't.
Many stupid things came out of their work, like "comments are bad", or ridiculous practices like refactoring reasonably sized functions into very small functions of just a few LoC, e.g. 3.
Ive seen Uncle Bobs refactor where he modifies thread safe code and introduces static properties to make code look elegant, but actually changes it behavior in multi thread environment, so basically didnt perform a refactor, but just introduced bugs
But code looks better, so great thing to put into the book, right?
>Kent Beck's "make it work, make it right, make it fast."
How can such a trivial thing even be attributed to someone?
This sort of code strikes me as an ideal candidate for translation into a declarative, rule-based, table-driven system. Such a thing is more comprehensible and more verifiable than ad-hoc imperative if-clause-rich code.
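To make "table-driven" concrete, here's a minimal sketch in Go (the controller's language). All names here are invented for illustration; the real controller's state space is far richer:

    package rules

    type state struct {
        boundToClaim  bool
        claimExists   bool
        bindCompleted bool
    }

    type rule struct {
        name    string
        matches func(state) bool
        action  string
    }

    // Rules are evaluated top to bottom; the first match wins, and the
    // final catch-all row makes exhaustiveness explicit.
    var table = []rule{
        {"orphaned", func(s state) bool { return s.boundToClaim && !s.claimExists }, "release volume"},
        {"half-bound", func(s state) bool { return s.boundToClaim && !s.bindCompleted }, "finish binding"},
        {"default", func(s state) bool { return true }, "nothing to do"},
    }

    func decide(s state) string {
        for _, r := range table {
            if r.matches(s) {
                return r.action
            }
        }
        return "" // unreachable: the default rule always matches
    }

Verifying "every case is handled" then becomes "read the table" instead of tracing a tree of if-clauses.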
Messy code of this sort is usually a sign of a missing abstraction.
The Go ideology is basically to just write down all the code, in a more or less straightforward translation of what you would have written in C, and not to try to abstract anything.
I once had to go through some code that was objectively terrible. The guy who wrote it was a mechanical engineer, close to retirement at the time, and self-taught in programming. He had absolutely none of the background you can expect a professional programmer to have, and in particular abstraction seemed like a foreign concept to him. Have 50 buttons, each doing essentially the same thing, and you would see 50 copy-pasted blocks of code. Particularly ironic given he was using Java, a language known for its culture of abuse of design patterns and abstraction.
But despite being a huge mess, that code was surprisingly readable. You could look anywhere in it and understand what it was doing. No calling through interfaces: when you see foo.bar(), you can just follow the symbol in your IDE and that's the instruction that will run. Often there wasn't even a function, just code, thousands of lines of it, with the different cases just dealt with by ifs.
Maybe it was the most pleasant "bad code" I've had to work with. Code using the wrong abstraction, or too much of it, is much worse, because it is just as buggy and ugly, but you don't even know what to expect when all you see is an interface call, the actual code is in a completely different part of the software, and what connects the two is in yet another place. With more "factories", "managers", "dispatchers", etc. than actual logic.
If you saw the 50 copy-pasted blocks of code I'll bet you mentally abstracted it and assumed they were all doing the same thing. The problem comes when one of them isn't quite doing the same thing. That isn't possible with a real abstraction.
Perhaps as a mechanical engineer the guy understood the more important things, though, like separation of concerns and a layered model, i.e. architecture. Truly bad code mixes all concerns into one ball of mud. Feel like forking a new process right from the GUI layer (the only layer) based on some business logic for that one button press? No problem!
Being able to look at a single piece of code in isolation and understand what it's doing isn't the challenge. Anyone can do that for any code. The challenge is knowing how it runs in context of the larger program. What are the downstream implications of the way it's done? How many times is this run and why? Who is this code responsible to and why would it change (e.g. is it business logic, or just a UI thing)? This is the kind of thing an architecture gives you. You don't need to go all in with abstracting everything, but you do need some architecture.
I (well really, a coworker of mine) just today discovered a JetBrains action “Copy GitHub URL” [sic] that, if you have lines of code selected in the IDE, includes those lines in the copied URL fragment. So so so much better than my old workflow of stopping what I’m doing, going to the file in GitHub, and selecting the lines there to share links to bits of code.
// CSINameTranslator can get the CSI Driver name based on the in-tree plugin name
type CSINameTranslator interface {
    GetCSINameFromInTreeName(pluginName string) (string, error)
}
Do people actually find comments like the above useful?
On a big open-source project like Kubernetes, they're probably happy with the tradeoff between "some exported names have inane and obvious comments" and "our linter requires open-source contributors to document their exported names in a consistent way."
I consider comments to live at the "conceptual" level while code lives at the "physical" level.
This way, when you are debugging, your mind can read the code at the conceptual level and easily disregard irrelevant blocks of code. Without comments, I need to mentally drop down to the physical level and implement a compiler in my brain for the same result.
In this example it's faster for me to understand what the code does by reading the comment than to parse the code itself. If you're just scanning over a lot of code and looking for what you care about it could be helpful.
Yes, if it's your first day on the job and you don't know that CSI is a driver. Yes, if GetCSINameFromInTreeName ever gets renamed to something less obvious.
Joined a new company recently. They kept telling me their codebase was a mess. Yet I've been finding it to be remarkably refreshing. There are extensive comments everywhere. Lots of white space (remember to let your code breathe). Existing linting policies are extensive and thorough. The code is well structured and remarkably easy to follow through where that does what. I think it helps that it's a very small team with maybe 5-6 people max over time. It's such a pleasure to explore the project while learning the system and starting to fix some small long term issues. Really, really nice.
I wrote a toy kubernetes CSI driver recently and found it quite pleasing to do. Bare minimum means implementing just 3 grpc api calls - An informational one and publish/unpublish. Amazon's EFS or EBS CSI drivers are good examples because they're a pretty small codebase. I don't know exactly how this code interacts with the CSI driver itself, but it appears that it is the logic that ultimately results in the volume manipulation calls the controller makes against the CSI driver. It's nice that the complexity is all here, I was actually pretty surprised how simple the drivers themselves are.
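For a sense of how small that surface is, here's a rough sketch of those three calls, assuming the generated Go bindings from the CSI spec repo (github.com/container-storage-interface/spec). The driver name and the method bodies are placeholders, and a real driver also has to implement the rest of the Identity/Node services and serve them over gRPC:

    package driver

    import (
        "context"

        "github.com/container-storage-interface/spec/lib/go/csi"
    )

    type toyDriver struct{}

    // The "informational" call: identify the plugin.
    func (d *toyDriver) GetPluginInfo(ctx context.Context, req *csi.GetPluginInfoRequest) (*csi.GetPluginInfoResponse, error) {
        return &csi.GetPluginInfoResponse{Name: "toy.csi.example.com", VendorVersion: "0.0.1"}, nil
    }

    // Publish: expose the volume at the requested target path.
    func (d *toyDriver) NodePublishVolume(ctx context.Context, req *csi.NodePublishVolumeRequest) (*csi.NodePublishVolumeResponse, error) {
        // ... mount req.GetVolumeId() at req.GetTargetPath() here ...
        return &csi.NodePublishVolumeResponse{}, nil
    }

    // Unpublish: undo the publish.
    func (d *toyDriver) NodeUnpublishVolume(ctx context.Context, req *csi.NodeUnpublishVolumeRequest) (*csi.NodeUnpublishVolumeResponse, error) {
        // ... unmount req.GetTargetPath() here ...
        return &csi.NodeUnpublishVolumeResponse{}, nil
    }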
The main reason it is the way it is, and this style is usually not seen in other languages, is that Go does not have operator shorthands for variable fallbacks or for logical error handling.
But overall this helps in improving the application's performance (at a large scale).
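For anyone unfamiliar with "operator for variable fallbacks": the one-liner that other languages spell as a null-coalescing or or-else operator becomes an explicit if in Go. A trivial, invented example:

    // Other languages: name = input ?? "default" (or input || "default").
    // Go spells it out:
    func nameOrDefault(input string) string {
        if input == "" {
            return "default"
        }
        return input
    }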
Personally I love this level of verbosity in code. There are still way too many levels of nested control flow for my taste—I find that makes it exceptionally hard to retain context—but at least there are early returns.
Nice. I wrote about a similar idea back in 2013 (https://jmmv.dev/2013/07/readability-explicitly-state.html) after I found that being explicit about all branches made my code easier to reason about and easier for reviewers to validate. Glad to find that this is “space shuttle style”!
I’ve always disliked how the Go style insists on removing certain branches at the end of functions, for example.
It's a lot easier to glance through a single 2k-line file than it would be to go through twenty separate 100-line files (plus the extra overhead of imports in each file).
I love this style of coding and it's something I've been doing increasingly more over the last few years. Code verbosity is so underrated in so many dev teams. I have never had a co-worker complain that my code was too well documented and too easy to understand, quite the contrary actually.
When someone advocates for more concise code by saying it's "easier and quicker to read", I always counter that if you use a lot of language-feature magic to make code more concise, the mental work to read it stays the same. The only difference is that you're asking your co-workers to leave the editor and google lots of things in order to understand your brief code, versus keeping them in the editor, able to read the simple verbose code in one place without interruption.
One part of what NASA did was have every line of code reviewed by many people, including someone whose sole job was to make sure the comments matched the code and vice versa.
If you do not take such care, extensive comments are inevitably going to drift from the code and lead to incredible confusion.
People write code _not like this_ professionally?? This is 100% of our codebases.
I think the only difference is we say that if you feel the urge to write a comment, write an [equivalent] log statement instead, so we can then use it in production for failure tracing.
Is there an automated checker for this? I also have ad-hoc conventions for some code I write, but as long as there's nothing but me between the code and the convention, it'll get broken right away.
Dunno. Maybe a table with functions and conditions?
The point isn't the specific form that support would take. The point is that it has some form that implies the intent of the programmer.
It has N states and they are enumerated. It has code that runs on transition and state. It has these variables which make up conditions. The state doesn't become a dead end and all states are reachable. etc.
The point is to enable the compiler to know: "Hey, this is a state machine. You can check <invariant> because it is a state machine."
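A minimal sketch of what "declared as a state machine, with checkable invariants" might look like in Go. The states and transitions are invented; the point is that dead ends and unreachable states become a mechanical check instead of something a reviewer has to spot:

    package main

    import "fmt"

    type State int

    const (
        Pending State = iota
        Bound
        Released
        numStates
    )

    // Allowed transitions, enumerated in one place.
    var transitions = map[State][]State{
        Pending:  {Bound},
        Bound:    {Released},
        Released: {Pending}, // recycled volumes go back to Pending
    }

    // validate checks the invariants: no dead ends, every state reachable.
    func validate() error {
        reachable := map[State]bool{Pending: true} // start state
        for s, next := range transitions {
            if len(next) == 0 {
                return fmt.Errorf("state %d is a dead end", s)
            }
            for _, n := range next {
                reachable[n] = true
            }
        }
        for s := State(0); s < numStates; s++ {
            if !reachable[s] {
                return fmt.Errorf("state %d is unreachable", s)
            }
        }
        return nil
    }

    func main() {
        if err := validate(); err != nil {
            panic(err)
        }
        fmt.Println("state machine invariants hold")
    }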
Is this code unit tested? If it's so critical that every branch has to be accounted for, I would assume that time would be well-spent combinatorically testing its inputs, right?
This kind of thing should be enforced with linting rules and an automated action that rejects any pull request which violates the rules, not with a comment.
I assume no and ultimately that is the point of code like this. As you say, if you can somehow correctly identify every possible branch in your tests then you could write the actual code any way you like. But then the tests would have to look like this, otherwise you'd have abstraction in your tests and you couldn't be sure if it covers every branch. Who tests the tests?
You test the test by mutating the code. Once you have 100% condition/decision branch coverage you can automatically sweep the code with changes and the tests will fail.
I have a little harness I use for this. It steps through each non-comment line of code and changes signs, flips comparison directions, offsets values, swaps variables, adds negations, replaces computations with constants, etc.: basically changes that are more or less guaranteed to compile.
Then it runs the compiler to produce a stripped, optimized output. If compilation succeeds, it checks that the resulting md5 is different from all prior results, then runs the tests. If the tests pass, it saves the passing code (which, to be clear, is a meta-test failure), and later I sweep through and either determine that the mutation produced functionally equivalent code or I improve the tests (and fix the resulting bugs they expose).
The identical compiled binary test eliminates a lot of false positives.
But that kind of approach doesn't work unless you get to ~100% condition/decision branch coverage since obviously any condition that isn't tested will be free to mutate.
Hm. Maybe if I update it I'll add some attempt at sampling LLM rewrites of functions. :P
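To illustrate the idea at its smallest (a toy example, not the harness itself): a mutator rewriting the comparison below to `>` is killed by the test, while rewriting it to `<=` survives because that mutant is functionally equivalent at the boundary - exactly the case that has to be reviewed by hand.

    // clamp.go
    package clamp

    // A mutator might rewrite `v < max` as `v > max` (killed by the test
    // below) or as `v <= max` (survives: functionally equivalent at the
    // boundary, since v == max returns the same value either way).
    func Clamp(v, max int) int {
        if v < max {
            return v
        }
        return max
    }

    // clamp_test.go
    package clamp

    import "testing"

    func TestClamp(t *testing.T) {
        cases := []struct{ v, max, want int }{
            {4, 5, 4}, // below the bound
            {5, 5, 5}, // at the bound
            {6, 5, 5}, // above the bound
        }
        for _, c := range cases {
            if got := Clamp(c.v, c.max); got != c.want {
                t.Errorf("Clamp(%d, %d) = %d, want %d", c.v, c.max, got, c.want)
            }
        }
    }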
Interesting approach. I hadn't considered actually implementing such "brute force" methods. I guess it's similar to fuzzing.
I think the problem is you are then moving your "real" condition/decision branch documentation into your tests. The tests are then basically a guard rail for just in case someone modifies some abstract bit of code and it changes the behaviour of some otherwise opaque decision branch. The approach in OP seems to be to just move the "real" logic/documentation into the code itself and do away with the abstractions (and perhaps the tests too).
Yes, I agree though the issue there is that changes which aren't believed to change the behavior might, as there isn't a way to tell except by being a very careful programmer and reviewer.
This is how important code should be written. No abstractions, with comments saying why things are implemented this way (instead of just saying what the code does).
// ==================================================================
// PLEASE DO NOT ATTEMPT TO SIMPLIFY THIS CODE.
// KEEP THE SPACE SHUTTLE FLYING.
// ==================================================================
//
// This controller is intentionally written in a very verbose style. You will
// notice:
//
// 1. Every 'if' statement has a matching 'else' (exception: simple error
// checks for a client API call)
// 2. Things that may seem obvious are commented explicitly
//
// We call this style 'space shuttle style'. Space shuttle style is meant to
// ensure that every branch and condition is considered and accounted for -
// the same way code is written at NASA for applications like the space
// shuttle.
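For a concrete flavor of rule 1, an invented fragment in the same spirit (not copied from the controller):

    // Check whether this volume is bound to a claim.
    if volume.Spec.ClaimRef == nil {
        // Volume is unbound; there is nothing to sync here.
        return nil
    } else {
        // Volume is bound to a claim. Fall through and verify below that
        // the claim still exists before deciding the volume's fate.
    }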
When I looked into Go I found it a bit surprising that someone had created a non-expression-based language as late as ~2009.
I have not familiarized myself with the arguments against expression-based design, but as a naive individual contributor/end-user of languages, expressions seem like one of the few software engineering decisions that doesn't actually "depend": designing languages around expressions seems unequivocally superior.
I used to be skeptical about introducing complex expressions to C-syntax languages for a long time until I saw how well Kotlin handled `when`.
Now every time I use typescript or go I have trouble trying to express what I want to say because `when` and similar expressions are just such a convenient way to think about a problem.
In go that means I usually end up extracting that code into a separate function with a single large `switch` statement with every case containing a `return` statement.
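That workaround looks something like this (an invented example): the switch arms all return, so the helper behaves like a single `when` expression.

    func sizeClass(n int) string {
        switch {
        case n < 10:
            return "small"
        case n < 100:
            return "medium"
        default:
            return "large"
        }
    }

    // usage: label := sizeClass(len(items))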
Yeah expression based languages are a pleasure to work with. A lot of what was good about CoffeeScript was absorbed into ES6, but the expression oriented nature of it was never fully replicated in JS/TS.
More to the point, the comment in the code mentions the “combinatorial” explosion of conditions that have to be carefully maintained by fallible meat brains.
In most modern languages something like this could be implemented using composition with interfaces or traits. Especially in Rust it’s possible to write very robust code that has identical performance to the if-else spaghetti, but is proven correct by the compiler.
I’m on mobile, so it’s hard to read through the code, but I noticed one section that tries to find an existing volume to use, and if it can’t, then it will provision one instead.
This could be three classes that implement the same interface:
class ExistingVolume : IVolumeAllocator
class CreateVolume : IVolumeAllocator
class SeqVolumeAllocator : IVolumeAllocator
The last class takes a list of IVolumeAllocator abstract types as its input during construction and will try them in sequence. It could find and then allocate, or find in many different places before giving up and allocating, or allocating from different pools trying them in order.
Far more flexible and robust than carefully commented if-else statements!
Similarly, there's a number of "feature gate" if-else statements adding to the complexity. Let's say the CreateVolume class has two variants, the original 'v1' and an experimental 'v2' version. Then you could construct a SeqVolumeAllocator thus:
    allocator = new SeqVolumeAllocator(
        featureFlag ? new CreateVolumeV2() : new CreateVolumeV1(),
        new ExistingVolume());
And then you never have to worry about the feature flag breaking control flow or error handling somewhere in a bizarre way.
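Since the controller itself is Go, the same idea renders there as a small interface plus a sequencing type. A sketch with invented names (assuming an `errors` import and the three concrete allocators above):

    type VolumeAllocator interface {
        Allocate(claim string) (volume string, err error)
    }

    // Seq tries each allocator in order and returns the first success.
    type Seq []VolumeAllocator

    func (s Seq) Allocate(claim string) (string, error) {
        for _, a := range s {
            if v, err := a.Allocate(claim); err == nil {
                return v, nil
            }
        }
        return "", errors.New("all allocators failed")
    }

    // Feature gating then happens once, at construction:
    //   create := VolumeAllocator(CreateVolumeV1{})
    //   if featureFlag {
    //       create = CreateVolumeV2{}
    //   }
    //   allocator := Seq{ExistingVolume{}, create}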
See the legendary Andrei Alexandrescu demonstrating a similar design in his CppCon talk "std::allocator is to allocation what std::vector is to vexation": https://youtu.be/LIb3L4vKZ7U
In the statement/expression-oriented axis of languages, Go is a statement oriented language (like C, Pascal, Ada, lots of others). This is in contrast to expression oriented languages like the Lisp family, most, if not all, functional languages, Ruby, Smalltalk and some others.
Expressions produce a value, statements do not. That's the key distinction. In C, if statements do not produce a value. In Lisp, if expressions do. This changes where the expression/statement is able to be used and, consequently, how you might construct programs.
A simple example for anyone who might not appreciate why this can be so nice.
In languages where if is a statement (i.e. produces no value), you'd write code like
    int value;
    if (condition) {
        value = 5;
    } else {
        value = 10;
    }
Instead of just
int value = if(condition) {5} else {10}
Some languages leave ifs as statements but add a ternary operator as a way to get the same effect, which is an acceptable workaround. But at least for me, there are times I appreciate an if statement because it stands out more, making it obvious what I'm doing.
Not necessarily. In many Lisps you can bind the result of a condition like so.
    (let [thing (cond
                  pred-1 form-1
                  ...
                  pred-n form-n)]
      (do something with thing))
This makes laying out React components in ClojureScript feel "natural" compared to JSX/TSX, where instead one nests ternaries or performs a handful of early returns. Both of these options negatively impact readability of code.
cond is neither an operator nor a statement, it's an expression. This is a demonstration of a conditional expression handling multiple conditions, which GP wanted.
More importantly, pattern matching is not necessary here.
You misunderstood. They were talking specifically about languages that only have ternary operators as a way to do if-as-expression, and why they prefer languages with either real if-else if-else expressions or full switch/pattern matching as expression.
As someone who writes a fair bit of C#, making switches and ifs into expressions, and adding discriminated unions (which they are actually working on), are my biggest "please give me this".
On the plus side, I dabble in F#, which is so much more expressive.
Same for me in the Scala vs. Java world; it's hard to go back once you get used to how awesome expressions-over-statements and algebraic data types/case enums/"discriminated unions" are. But I haven't done much C# (yet) myself, so could you clarify for me: does C# have discriminated unions? I didn't think the language supported that (only F# has them)?
The c# team is working on a version of them they are calling Typed Unions, not guaranteed yet but there is an official proposal that I believe is 2 weeks old.
I'm assuming you mean "non-exception". Apologies if I assume incorrectly.
In case I'm correct, this is from Andrew Gerrand, one of the creators of Go:
> The reason we didn't include exceptions in Go is not because of expense. It's because exceptions thread an invisible second control flow through your programs making them less readable and harder to reason about.
> In Go the code does what it says. The error is handled or it is not. You may find Go's error handling verbose, but a lot of programmers find this a great relief.
> In short, we didn't include exceptions because we don't need them. Why add all that complexity for such contentious gains?
You appear to have misread "expression" as "exception"; this is completely unrelated. An expression-based language is one that lets you do `let blah = if foo then bar else baz`, for example.
I don't think he misread, because I also was puzzled. I had never heard of the term "expression" used in this way, and I imagine I'm not alone. I do greatly appreciate the clarification from you and jtsummers though. I knew of the distinction, but I didn't know of a term for it until today.
I honestly struggle with this because it's an "I know it when I see it" thing. E.g. here, const boo = foo ? bar : baz suffices, and ~every language I know has that.
My poor attempt at a definition (it covers the languages I'm familiar with in practice, but presumably not in theory): a language where switch statements return a value.
Go doesn’t have a ternary operator, you are supposed to write something like
    boo := baz
    if foo {
        boo = bar
    }
One of the many cases where Go’s designers decided they would ban something they disliked about C (in this case, complicated ternary operator chains), but thought Google programmers were too stupid to understand any idea from a more modern language than C, so didn’t add any replacement.
(I’m not exaggerating or being flippant: Google programmers being too stupid to understand modern programming languages has literally been cited as one of the main design goals of Go).
The second you add a ternary operator people are gonna nest them, but the same is true for if/switch/match expressions, unfortunately. I don't think they meant stupid literally; it's more like the KISS philosophy applied to language design, for maintainability/readability/code quality reasons. Google employs some of the smartest programmers in the world.
Nesting them is not so bad if the syntax makes it obvious what the precedence is, which isn't true of C, but is of Rust for example.
Anyway, complicated code should be avoided whenever possible, true, but banning the ternary operator (and similar constructs like match/switch statements as expressions) does nothing to make code simpler. It just forces you to transform
let x = (some complicated nested expression);
into
var x;
// (some complicated nested tree of statements where `x` is set conditionally in several different places)
A language in which matching on a structure is not a statement but instead returns a value; the special case of matching on a boolean (this is often spelled `if`); one which doesn't have a `throw` statement (but instead models it as a generic function `Exception -> 'a`, for example); etc.
The `if` statement is just less ergonomic than the ternary operator, because statements don't compose as well as expressions do. A language which has a lazy ternary operator, and which lets you use an expression of type `unit` as a statement, does not require an `if` statement at all, because `if a then (b : unit) else (c : unit)` is identically `a ? b : c`. The converse is not true: you can't use `if` statements to mimic the ternary operator without explicitly setting up some state to mutate.
> I assume: a language where switch statements return a value
A language where everything is a value. Yes, a switch statement could be considered a value. More specifically - these are expressions that can be (but don't necessarily have to be) evaluated into a value. The most practical and introductory example of this is probably Ruby (called case: http://ruby-doc.com/docs/ProgrammingRuby/html/tut_expression...).
Python, JS, Ruby all have facilities to do this to varying extents. For a "true" expression-based language you will want to look at something like Clojure.
Yeah, you nailed the limitation. A switch-type expression that returns a value is a pretty universal feature in expression-based languages, often in the form of a pattern-matching expression.
Check out the ‘case’ statement in elixir for an example.
In languages that support it, it usually becomes an incredibly commonly used expression because it’s just so applicable and practical.
I went right into the code and looked for 'if' statements without 'else' statements. There are plenty. I don't see how you can have any exceptions to this rule if you are truly committed to capturing all branches.
If the 'if' condition matching always results in a thrown exception, a return, or likewise, then you don't really need an 'else' unless you're using a language which supports conditions and resumption (conformant Common Lisp implementations, and not really anything else I know of). The 'else', implicitly, is that the flow of control leaves the scope of the 'if' block at all.
(I haven't read far enough into the code to know that this is what they're doing, but the head matter I did read suggests as much. It's a common enough pattern, especially around eg argument validation and other sanity checks a function might perform to ensure it can do meaningful work at all.)
(I do wish HN supported an inline monospace markup, the <code> to a four-space indent's <pre>...)
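A small invented Go fragment of the pattern being described (assuming a Volume type and a doPublish helper): when the if-branch always exits, the "else" is simply the rest of the function.

    func publish(vol *Volume) error {
        if vol == nil {
            return errors.New("no volume to publish")
        }
        // Implicit else: from here on, vol is known to be non-nil.
        return doPublish(vol)
    }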
No, "guard" in Swift is special. Once you're in the "else" block, you MUST return or call a non-returning function (e.g. abort).
It's specifically designed to prevent bugs where flow control accidentally resumes from an error handler.
It's the same idea as the "every 'if' must have an 'else'" guideline in the code being discussed, except with "guard" the compiler will detect violations. It's a good thing.
I did not say exceptions could be resumed (correctly 'restarted'); I said conditions could. Most languages with which I'm familiar do not have the latter.
One typically available restart is to ignore the condition and resume execution immediately after, as you describe. Another is to re-evaluate the form in which the condition was signaled. In that case, the conditional may well be itself re-evaluated with a different result, executing a different branch.
An if-like conditional construct will not, by itself, re-evaluate the condition when an exception occurs, unless it is a special exception-aware construct (a weird macro someone made). The conditional construct would itself have to have an internal restart point around the conditional expression, which intercepts the exception.
"Condition" is just a silly name for "exception". It does not mean "restartable exception". It's a terminology that Common Lisp copied from PL/I. At the time Common Lisp was being standardized, it was not a common programming language feature, so the naming didn't matter. In the decades since, the world went to "exception".
The word "condition" already has a clear meaning in computing, referring to a logical state ("condition control register", "conditional branch", ...). The "condition variable" synchronization primitive (which has no condition-like state!) is bad enough; we don't need to heap more meanings on those words.
Note that processor instruction sets have exceptions, precisely restartable, down to the instruction, without requiring a cooperating restart point. E.g. any code can hit a page fault: the exception handling will fix it up, making a page present at the faulting address, and then restart the instruction that faulted.
Yeah, I immediately scrolled down to see how silly I thought it looked, and.. it's not, there are plenty of ifs without elses. If you're going to have exceptions to that rule where it's 'simple' and not necessary, then congratulations you're using if (and else or not) just like everyone else?
hahaha well Kubernetes is the opposite of a space shuttle that keeps on flying. It crashes all the time, version updates break things, etc. If you want stability, go with Apache or nginx.
yeah maybe it does, but when you have a hammer everything looks like a nail. Kubernetes is used all the time nowadays on simple projects just because it's the most intellectually pleasing solution. This creates high maintenance costs by locking companies into extremely complicated software where it's just not needed.
That's a pretty good track record, and indicates a level of fail-safe design (everything continued working, even though a critical component kept crash-looping).
You obviously have not operated kubernetes long enough. With large enough scale you'll find the cluster controllers crashing for various reasons, getting out of sync with each other, etcd crashing or getting locked, and a whole bag of bugs.
If you think nginx is the right tool for solving problems like container deployment, service discovery, cluster scaling, and secret management, then I suppose it's not surprising that you think Kubernetes "crashes all the time" and that a 1 year rolling support window for software releases is an insurmountable obstacle. Kubernetes has a lot of genuine issues and rough edges, but you're kind of showing your ass when you make comments like this.
There are just very few applications that actually need all of this - maybe 1 to 0.1%; for instance, Vercel might need it. But 95-99% can just run on several "simple" servers and keep deployment times within minutes, no complicated stuff needed. Yet Kubernetes gets pushed all the time.
there is nothing easy or robust about kubernetes. Hence all the tooling around it, things breaking down if you don't update, dependencies not being compatible all the time. Server management should cost as little time as possible and be stress-free.
There are many ways to solve the up/down issue and depends on which language you are running.
I agree if you are running kubernetes yourself you are absolutely right. I was thinking more about managed clusters in the cloud. Every provider offers managed kubernetes, then you aren't even vendor locked.
Why was Space Shuttle code so good and the engineering so bad? The thing was expensive and shit and had a 1.5% catastrophic failure rate for passenger transport. Soyuz was two orders of magnitude better.
Russia/USSR has a reputation for MacGyvering things and the US has a reputation for gold-plating, but the US ship killed people every 65 flights while the Russian ship has over 1500 launches without a death.
Maybe engineers should learn either from Soviet/Russian engineers or from NASA software engineers. Both make more reliable things.
> I need to give you the issue from the NASA point of view so you can understand the pressures that they were under. In a developmental program, any developmental program, the program manager essentially has four areas to trade. The first one is money. Obviously, he can go get more money if he falls behind schedule. If he runs into technical difficulties or something goes wrong, he can go ask for more money. The second one is quantity. The third one is performance margin. If you are in trouble with your program, and it isn’t working, you shave the performance. You shave the safety margin. You shave the margins. The fourth one is time. If you are out of money, and you’re running into technical problems, or you need more time to solve a margin problem, you spread the program out, take more time. These are the four things that a program manager has. If you are a program manager for the shuttle, the option of quantity is eliminated. There are only four shuttles. You’re not going to buy any more. What you got is what you got. If money is being held constant, which it is—they’re on a fixed budget, and I’ll get into that later—then if you run into some kind of problem with your program, you can only trade time and margin. If somebody is making you stick to a rigid time schedule, then you’ve only got one thing left, and that’s margin. By margin, I mean either redundancy—making something 1.5 times stronger than it needs to be instead of 1.7 times stronger than it needs to be—or testing it twice instead of five times. That’s what I mean by margin.
> It has always been amazing to me how many members of Congress, officials in the Department of Defense, and program managers in our services forget this little rubric. Any one of them will enforce for one reason or another rigid standard against one or two of those parameters. They’ll either give somebody a fixed budget, or they’ll give somebody a fixed time, and they forget that when they do that, it’s like pushing on a balloon. You push in one place, and it pushes out the other place, and it’s amazing how many smart people forget that.
Multiple books have been written on this very topic, but the TL;DR is that the problem was not the engineering, but the absurd, often mutually contradictory design decisions forced on it for political reasons.
Where are you getting 1500 Soyuz flights? As far as I can find the number is more like 150. It also experienced failure and loss of crew on two missions (Soyuz 1, Soyuz 11).
It's correct. I confused the Soyuz launcher with the Soyuz spacecraft like a newbie. It's true that the Soyuz spacecraft had early failures (up to 1971, none after), while the Shuttle had late failures (2003, then the project was decommissioned). I suppose design quality improved in Soyuz and engineering quality decreased in the Shuttle: an incompetent early Soyuz design team, an incompetent late Shuttle engineering team, and a lack of ethics in making such broken devices.