“Everything should be built top-down, except the first time.” – Alan Perlis
Thought it might be fun to bring up one of the most heated debates in programming. I wonder if I’m in a particularly contentious mood today…
Back in the 1960s, Alan Kay was among the first to see that humans and computers would be connected and would interact in networks. A few software greats saw something like Twitter coming before smartphones or even the internet existed, but Kay in particular has a remarkable track record of predicting the future throughout his career. Of course, he isn’t some sort of prophetic computer genie. You could say he cheated, since he simply invented the future instead of predicting it, which, if you ask me, is much more consistent than speculating. Many people know him as a father of the GUI, the personal computer, and object-oriented programming, and his lab also produced the laser printer (the computer mouse, often lumped in with these, actually came from Douglas Engelbart’s group). Altogether, Alan Kay’s inventions have produced over a trillion dollars of value. His lab, Xerox PARC, was the breeding ground for the technology that Microsoft, Apple, and, of course, Xerox used to create the computer revolution. Not only did he invent all of these things, but he also had a knack for predicting technology trends by playing what he called the Wayne Gretzky game.

A big reason Gretzky was the GOAT was that he didn’t focus on where the puck was; he focused on where the puck was going. He knew the game and what his opponents would do next, and he played a few steps ahead of them, often acting on predictions of the future rather than present knowledge. He analyzed, predicted, and took risks to great success. At PARC, they took a play out of his book and decided to focus on where the puck was going in 30 years; they just did it with computing. They knew the fundamentals of the field, and it was fairly clear that an information superhighway would solve enormous throughput problems. That’s why they were building iPads in the 70s. They rightfully could have focused on more pressing things, but then we wouldn’t have gotten the computer revolution, or at the very least it wouldn’t have come so fast. When asked “Is Software Engineering Still an Oxymoron?”, Alan Kay briefly mentioned Facebook’s social media system (the company was still called Facebook at the time) and commented on the lack of hard engineering work in critical infrastructure. Quite a few companies relied on Facebook for authentication and other critical services. So many, in fact, that when it went down in October 2021, the website Downdetector logged over 10 million outage reports. They broke a major section of the internet. That outage caused their stock price to drop 5%, and Mark Zuckerberg lost 6 billion dollars of net worth in one day.
By contrast, the internet itself has never had such an outage in the history of its existence. Certain systems never break, while others break in critical ways if not designed rigorously, and it goes to show there are wide gaps in engineering effort between technologies. Modern social media, on the other hand, got little thought about what makes it a good system, what its model of governance should be, and what outcomes could come from it. It was simply invented, and off to the races it went. There have been some consequential negative ramifications of that, such as mental health problems, grifters, scantily clad clout chasing, and increasing rates of suicide. We’ve known for a long time that anything large that deals with humans needs a modern equivalent of Asimov’s Laws of Robotics. For the uninformed, this is the governance approach to the invention of robots, designed to protect us from being swallowed alive in the AI singularity by some sort of omniscient superintelligence (or something like that). The laws are as follows:
- A robot may not injure a human being or, through inaction, allow a human being to come to harm.
- A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.
- A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.
Sometimes I think to myself that we’re feeding our children to the metaverse to increase profits for shareholders. The fallout of doing that in an unprincipled way will only exacerbate the situation. The internet as a system, by contrast, had a tremendous amount of engineering work put into it. It drives value and doesn’t have blackouts. We are, however, starting to see blackouts more and more as we switch to cloud technologies. When AWS had a 5-hour outage, large sections of the internet fell apart. Engineering companies ground to a standstill as several major infrastructure services made it impossible to deploy software or, in some cases, even access their code. Software engineering is a very new field that has yet to be fully developed, but we’re growing accustomed to coping with the lack of engineering. We can look to systems such as the internet and draw conclusions about why they are so fault-tolerant. The most beautiful example of fault tolerance, in my opinion, is the AXD301, a 1998 ATM (Asynchronous Transfer Mode) switch from Ericsson, programmed in Erlang, the language Joe Armstrong co-created there.
The AXD301 has achieved a NINE nines reliability (yes, you read that right, 99.9999999%). Let’s put this in context: 5 nines is reckoned to be good (5.2 minutes of downtime/year). 7 nines almost unachievable … but we did 9.
Why is this? No shared state, plus a sophisticated error recovery model.
– Joe Armstrong
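To make those nines concrete, here’s a quick back-of-the-envelope conversion from availability percentages to allowed downtime per year, sketched in Python and assuming a 365.25-day year:

```python
# Rough availability-to-downtime conversion (assumes a 365.25-day year).
MINUTES_PER_YEAR = 365.25 * 24 * 60

for nines in range(3, 10):
    availability = 1 - 10 ** -nines                  # e.g. 5 nines -> 0.99999
    downtime_minutes = MINUTES_PER_YEAR * (1 - availability)
    print(f"{nines} nines: {downtime_minutes:.4f} min/year "
          f"({downtime_minutes * 60:.3f} seconds)")
```

Five nines works out to roughly 5.3 minutes of downtime a year; nine nines is on the order of 30 milliseconds.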
In manufacturing and other engineering-heavy industries, the more commonly used benchmark for fault tolerance is Six Sigma: about 99.99966% defect-free output, or 3.4 defects per million opportunities. Coined by Motorola and adopted by Honeywell and General Electric, Six Sigma has become a standard of manufacturing quality. In 2011, it even got its own published ISO standard definition. Know of any production software systems you’ve interacted with that are Six Sigma? Absolutely not. Erlang programming is one of the most important things an engineer can pick up. If you’re not familiar with its actor model for concurrency and its error recovery model, you’re missing out big time. Even if you never write a single line of production Erlang, it’s critical to know how this happened.
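For a feel of the recipe Armstrong is describing, here is a minimal sketch of the same idea in Python rather than Erlang: workers that keep all state local, communicate only through messages, and get restarted with clean state by a supervisor when they crash. The names and the one-for-one restart policy are my own toy illustration, not Ericsson’s design, and nothing like the sophistication of real Erlang/OTP:

```python
import multiprocessing as mp
import threading

def worker(inbox, outbox):
    """All state is local to this process; it talks to the world only via messages."""
    while True:
        n = inbox.get()
        if n < 0:
            raise ValueError("negative input")   # simulate a bug: let it crash
        outbox.put(n * n)

def supervisor(inbox, outbox, stop):
    """One-for-one strategy: whenever the worker process dies, start a fresh one."""
    while not stop.is_set():
        proc = mp.Process(target=worker, args=(inbox, outbox), daemon=True)
        proc.start()
        proc.join()                              # returns only if the worker died
        if not stop.is_set():
            print("worker crashed; restarting with clean state")

if __name__ == "__main__":
    inbox, outbox = mp.Queue(), mp.Queue()
    stop = threading.Event()
    threading.Thread(target=supervisor, args=(inbox, outbox, stop), daemon=True).start()

    inbox.put(7)
    print(outbox.get())   # 49
    inbox.put(-1)         # poison message: the current worker crashes on it...
    inbox.put(8)
    print(outbox.get())   # ...but the restarted worker still answers: 64
    stop.set()
```

The point isn’t the Python; it’s the shape: isolate state behind message passing, let the broken piece crash, and recover from a known-good starting point.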
Perhaps the biggest reason we don’t have “real” engineering in software is the lack of powerful enough tooling. We’re still building our software development tools. Other disciplines, such as the construction that produced the Empire State Building, have the luxury of being practices humans have been honing for millennia. They have power tools and a mature idea of how to build; our field is much too new to have either. We’re much too young to pull off a feat similar to the Empire State Building while we’re still programming with relatively new tooling. All fields had to evolve over centuries to get to where they are today. With cooking, for instance, it was not well understood how recipes worked, but they were collected into cookbooks and they evolved. We didn’t know why things tasted good or bad before food science advanced. Mathematics evolved similarly, but it has had the luxury of thousands of years. Computer Science and Software Engineering are practically toddlers compared to Mathematics and Cooking. Science and technology could be considered the most extraordinary art form of the 21st century. It takes a cycle of tinkering, aesthetics, engineering, research, experimentation, philosophy, and much more. It’s hard work, but an artful process.
Computing is roughly where architecture was before the arch was invented. Kay compares it to the ancient Egyptians, “who made large structures with hundreds of thousands of slaves toiling for decades to pile stone upon stone: they used weak ideas and weak tools, pretty much like most software development today”. Our industry only recently adopted distributed systems programming, functional programming, and fault tolerance in any widespread way, and tools such as container orchestration have only just started going mainstream. We have to take a step back, realize we’re still new to this, and not get discouraged by the lack of sophistication in our tooling. Instead, I think it’s best to be mindful that we live at a beautiful point in human evolution where we have only just begun our relationship with the field of Software Engineering and are actively figuring out the first steps. We may very well look back on these times and miss the good old days, before we had the tools to build an Empire State Building. It’s humbling, but exciting, to be in the first few stages of learning about these marvelous and powerful machines. It’s best to approach the craft with childlike wonder, since we’re still in our infancy in this field.
In the short term, since we don’t yet have Software Engineering, we’re forced to build flexibility into our systems, and I find this to be a very rewarding limitation. Laziness and late binding, dynamic and extensible systems, API versioning, and metaprogramming are all so much fun because the field requires dynamism to keep these software systems afloat. We have to build things modularly and separate concerns, and I find this does wonderful things for your thinking. As a software engineer, I need heavy abstract reasoning skills to tease apart complex concepts into a series of smaller ones. The difficult art of decomposing problems into smaller problems is a joy, because you can take that skill and apply it to the rest of your life as well. I’m forced to exercise my simplification muscles, so I find myself naturally tending to simplify my life. I know about temporal coupling in software systems just as well as I know about temporal coupling in organizing events with friends, running my business, and my relationships with others. Simplicity and flexibility aren’t just needed by computers; humans need them too. It’s a blessing to practice that on a daily basis.
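To make the flexibility point a bit more concrete, here’s a tiny Python sketch of late binding and extensibility: callers ask for behavior by name, the concrete function is bound at call time, and new or newly versioned handlers can be plugged in without touching the calling code. The names here (handler, dispatch, greet.v1) are hypothetical, invented just for this illustration.

```python
from typing import Callable, Dict

# Registry of named handlers; nothing is wired together until call time.
HANDLERS: Dict[str, Callable[[dict], dict]] = {}

def handler(name: str):
    """Register a function under a name so it can be looked up later by string."""
    def register(fn: Callable[[dict], dict]) -> Callable[[dict], dict]:
        HANDLERS[name] = fn
        return fn
    return register

@handler("greet.v1")
def greet_v1(payload: dict) -> dict:
    return {"message": f"Hello, {payload['name']}!"}

@handler("greet.v2")          # a newer version coexists with the old one
def greet_v2(payload: dict) -> dict:
    return {"message": f"Hello, {payload['name']}!", "version": 2}

def dispatch(name: str, payload: dict) -> dict:
    # The caller only knows a string; the concrete function is bound here, late.
    return HANDLERS[name](payload)

print(dispatch("greet.v1", {"name": "Alan"}))
print(dispatch("greet.v2", {"name": "Alan"}))
```

The same shape shows up in API versioning and metaprogramming: keep the binding late and the system stays malleable.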