The $6 Billion Software Bug
In August 2003, the largest power blackout in North American history hit the northeastern United States and southeastern Canada. More than 50 million people lost power, some for up to two days. At least 11 people died as a result of the blackout. Total cost: $6 billion.
Investigators concluded that the problem started when a power line in Ohio sagged and hit an overgrown tree. Normally, such an event remains an isolated incident: alarm systems alert power system operators to the problem so they can take appropriate steps to contain it. Unfortunately, in this case, a software bug caused the alarm system to malfunction. This started a cascade of errors that led to more than 50 million people losing electrical power. The lack of power also led to water supply problems and the shutdown of major transportation systems, including railroads and airlines. Gas stations could not dispense fuel. Cell phone service was interrupted (although wired phones continued working).
While lack of proper maintenance was the root cause of the blackout, without the software bug the impact would have been small and limited to relatively few people. With the software error, tens of millions of people were affected, and the economic cost ran into the billions of dollars.
There are many other examples of the serious consequences of faulty software:
In 2013, the health insurance exchange websites created under the Patient Protection and Affordable Care Act (Obamacare) went live. As millions of Americans tried to access them, the websites crashed and became unresponsive. As a result, millions of dollars had to be spent to get the websites working properly.
In 2005, Toyota recalled more than 150,000 Prius hybrids to fix software that caused the gasoline engine to shut down unexpectedly.
In 2004, a software error disrupted communication at Los Angeles International Airport's air traffic control system; 800 flights were affected. Fortunately, there were no midair collisions, although there were several near misses.
From 1985 to 1987, software errors in the Therac-25 radiation therapy device allowed the device to deliver lethal doses of radiation to patients. At least five patients died.
In 1990, an error in the control systems of AT&T's long-distance switches left 60,000 people without long-distance service.
Have you ever experienced software that did not operate correctly? What consequences did you experience?
What can be done to limit software errors?
Why is it important to catch software errors early in the development process?