New bridge, old book: the shape of software progress

The Bay Bridge’s new eastern span is about to open. When they started building it over a decade ago, I was beginning work on my book Dreaming in Code. As I began digging into the history of software development projects and their myriad delays and disasters, I kept encountering the same line: Managers and executives and team leaders and programmers all kept asking, “Why can’t we build software the way we build bridges?”

The notion, of course, was that somehow, we’d licked building bridges. We knew how to plan their design, how to organize their construction, how to bring them in safely and on time and within budget. Software, by contrast, was a mess. Its creators regularly resorted to phrases like “train wreck” and “death march.”

As I began my research, I could hear, rattling the windows of my Berkeley home, the deep clank of giant pylons being driven into the bed of San Francisco Bay — the first steps down the road that ends today with the opening of this gleaming new span. I wrote the tale of the new bridge into my text as an intriguing potential contrast to the abstract issues that beset programmers.

As it turned out, of course, this mammoth project proved an ironic case in any argument involving bridge-building and software. The bridge took way longer than planned; cost orders of magnitude more than expected; got hung up in bureaucratic delays, political infighting, and disputes among engineers and inspectors; and finally encountered an alarming last-minute “bug” in the form of snapped earthquake bolts.

So much for having bridges down. All that the Bay Bridge project had to teach software developers, really, was some simple lessons: Be humble. Ask questions. Plan for failure as well as success.

Discouraging as that example may be, I’m far more optimistic these days about the software world than I would ever have expected to become while working on Dreaming. Most software gets developed faster and in closer touch with users than ever before. We’ve turned “waterfall development” into a term of disparagement. No one wants to work that way: devising elaborate blueprints after exhaustive “requirements discovery” phases, then cranking out code according to schedules of unmeetable precision — all in isolation from actual users and their changing needs. In the best shops today, working code gets deployed regularly and efficiently, and there’s often a tight feedback loop for fixing errors and improving features.

My own recent experiences working closely with small teams of great developers, both with MediaBugs and now at Grist, have left me feeling more confident about our ability to wrestle code into useful forms while preserving our sanity. Software disasters are still going to happen, but I think collectively the industry has grown better at avoiding them or limiting their damage.

While I was chronicling the quixotic travails of OSAF’s Chandler team for my book, Microsoft was leading legions of programmers down a dead-end path named Longhorn — the ambitious, cursed souffle of an operating system upgrade that collapsed into the mess known as Windows Vista. At the time, this saga served to remind me that the kinds of delays and dilemmas the open-source coders at OSAF confronted were just as likely in the big corporate software world. Evidently, the pain still lingers: When Steve Ballmer announced his retirement recently, he cited “the loopedy-loo that we did that was sort of Longhorn to Vista” as his biggest regret.

But Longhorn might well have been the last of the old-school “death marches.” Partly that’s because we’ve learned from past mistakes; but partly, too, it’s because our computing environments continue to evolve.

Our digital lives now rest on a combination of small devices and vast platforms. The tech world is in the middle of one of the long pendulum swings between client and server, and right now the burden of software complexity is borne most heavily on the server side. The teeming hive-like cloud systems operated by Google, Facebook, Amazon and their ilk, housed in energy-sucking server farms and protected by redundancy and resilient thinking, are among the wonders of our world. Their software is run from datacenters, patched at will and constantly evolving. Such systems are beginning to feel almost biological in their characteristics and complexities.

Meanwhile, the data these services accumulate and the speed with which they can extract useful information from it leave us awe-struck. When we contemplate this kind of system, we can’t help beginning to think of it as a kind of crowdsourced artificial intelligence.

Things are different over on the device side. There, programmers are still coping with limited resources, struggling with issues like load speed and processor limits, and arguing over hoary arcana like memory management and garbage collection.

The developers at the cloud platform vendors are, for the most part, too smart and too independent-minded to sign up for death marches. Also, their companies’ successes have shielded them so far from the kind of desperate business pressures that can fuel reckless over-commitment and crazy gambles.

But the tech universe moves in cycles, not arcs. The client/server pendulum will swing back. Platform vendors will turn the screws on users to extract more investor return and comply with increasingly intrusive government orders. Meanwhile, the power and speed of those little handheld computers we have embraced will keep expanding. And the sort of programmers whose work I celebrated in Dreaming in Code will keep inventing new ways to unlock those devices’ power. It’s already beginning to happen. Personal clouds, anyone?

Just as the mainframe priesthood had to give way to the personal-computing rebels, and the walled-garden networks fell to the open Internet, the centralized, controlled platforms of today will be challenged by a new generation of innovators who prefer a more distributed, self-directed approach.

I don’t know exactly how it will play out, but I can’t wait to see!

Comments

  1. Tim

    Always an optimist, Scott. In spite of everything you know. It is one of the things that makes you fun to read.

    Waterfall development and detailed requirements gathering are alive and well in large enterprises, and in people trying to solve hard technical problems (like some of those cloud vendors). Agile development in its various forms has been a real improvement in software development, but the kinds of problems that can be tackled with those kinds of lightweight processes are limited to those operating in known domains. Managing and adding features to a website and Content Management System is that kind of domain.

    Trying to build failure tolerant loosely coupled systems without single points of failure isn’t.

    As I recall, one of the many reasons that Chandler didn’t ever hit escape velocity was the initial decision to use Python instead of a lower level language — so the developers could develop more rapidly and be more agile, and be multiplatform from the outset. But that initial architectural decision to be more agile attached a ball and chain performance penalty to the product that it always had a hard time overcoming.

    And:
    > The developers at the cloud platform vendors are, for the most part, too smart and too independent-minded to sign up for death marches.

    Have you chatted with some of the AWS developers up in Seattle? From what I hear, death marches are alive and well. And when the cloud products that we are becoming increasingly reliant on go down, they can go down very hard, taking lots of systems with unseen connections down with them.

    Lastly, I do remember the bridge building metaphor in Dreaming in Code, which I still think was a brilliant book, and which I gave to my mother so she could understand what I do for a living, and I do think the irony has been exquisite.

    Miss seeing you around Berkeley, hope you are having fun in Seattle.

    Thanks,
    Tim

  2. Scott Rosenberg

    Thanks, Tim. I know waterfalls and death marches haven’t left the building. But they feel decreasingly connected with the most important work. On the other hand, no, I haven’t hung out with AWS developers. Sounds like I should.

    I continue to stand against fatalism — largely because I think it’s my natural bent and needs to be counterbalanced. Optimism of the intellect, pessimism of the will.
