When 911 Fails: What Safeguards Are In Place?

An investigation into the situation that left 2.3 million residents in northern Virginia without telephone access to 911 for up to four days.

People know what to do when they get into a car accident: Call 911. They know what to do when the traffic signals go out: Call 911. But what do you do when 911 goes out?

The question was more than merely academic for some 2.3 million residents in northern Virginia who lost telephone access to emergency services for up to four days in June in the wake of a quick and violent thunderstorm known as a derecho.

Congress moved as swiftly as the derecho itself to express its displeasure with the 911 outage. In a letter to FCC Chairman Julius Genachowski, U.S. Reps. Jim Moran, Gerry Connolly and Frank Wolf wrote: “In the event of an emergency situation, whether it be a natural disaster or man-made threat, the public needs confidence that they can get through to 911 operators. This storm exposed a weakness in our response system, and now that we know it exists, we must fix it.”

To fix it, you first have to understand it. With multiple investigations under way in Washington, Emergency Management asked Fairfax County Director of Public Safety Communications Steve Souder to describe the events surrounding the 911 blackout.

It was the evening of June 29 ...



Initial surge


While there had been bad weather looming on the radar, no one saw anything like the derecho coming. “The severity, the speed and the wind velocity that struck was quite unpredicted,” Souder said.

911 centers throughout the region were immediately swamped with calls of downed wires and trees, felled poles, roof damage — all the typical carnage in the aftermath of intense wind and rain, but coming at an extraordinary pace.

During the three hours of the storm, 10:30 p.m. to 1:30 a.m., emergency call traffic in Fairfax County reached 415 percent above a normally busy Friday evening. Fire crews went out on rescue calls. Police took over manual traffic control at intersections where power outages had knocked out the signals.

There were early, though by no means catastrophic, blips in 911. When the Arlington emergency system lost power briefly, systems including 911 rolled to the uninterruptible power supply and then onto generators, as designed. Operationally, the power never went out. The generators ran for 10 hours until power was restored.

Emergency managers braced for the morning, when homeowners would wake up and realize the full extent of the damage to their properties. By 7 a.m., there would surely be a fresh wave of 911 calls.

Except there wasn’t.

The new shift came on at 7 a.m. “And virtually at the same time we noticed that we were not getting any 911 calls — a most unusual thing — and we had not been notified by Verizon that there was any problem,” Souder said. The emergency lines were out; the nonemergency lines were gone. The administrative and business lines: all dead.

“I got an email in the morning saying there were major problems, no 911 service at all, contact the department of public safety communications ASAP,” Souder said. “I tried to contact them by my personal cellphone, but I found that even that was out.”

He headed into work, where things were just as bad as he’d imagined. There were no telecommunications, period. “Obviously we have an alternative center that we would normally go to, but in this case it also was without any 911 service. This thing was as total as anyone knows of in the 44 years that 911 has been around.”

911 went out at 6:30 a.m., and Fairfax managers didn’t hear from Verizon until three hours later. Why the lag? It’s far from clear. “Verizon has said their delay in notifying Fairfax County and other jurisdictions was because of technical problems and internal and external communications issues,” Souder said.

In fact, the communications rift was even greater, Souder said. Verizon knew it had major problems on its hands well before 911 systems went down, but didn’t reach out to local officials for nine hours. “I have described that as akin to the captain of the Titanic not telling his passengers the ship had struck an iceberg until the bow of the boat was about to hit the bottom of the ocean,” Souder said.

“Obviously this was a situation that we had never encountered before. The only thing we could do was post a message on Twitter that, in fact, 911 service was out,” he said. “We told people to go to the nearest police or fire station, to flag down the nearest police or fire unit and tell them of your emergency. It was not something we would ever want to do, but it was the only thing we could do.”

The outage lasted in some places until July 3.


Picking it apart


Investigations are ongoing: Verizon is looking into its own systems; the FCC is sifting through the ashes; the Virginia State Corporation Commission (the state regulatory body) is conducting its inquiry; and the governor’s office and Metropolitan Washington Council of Governments are examining the outage.

Everyone wants to know why and how it happened, but at least the “what” seems clear.

A Verizon report from August put it succinctly: “External power failures affected more than 100 Verizon locations. At each of these locations, batteries and nearly all the back-up generators worked as designed, allowing us to continue service. However, at two of these locations, generators failed to start, disabling hundreds of network transport systems, and causing Verizon to lose much of its visibility into its network in the impacted area.”

Two generators failed to fire up, and the entire 911 network in northern Virginia went down. The few emergency calls that did find a way through arrived without location information. The failed generators triggered a cascade effect that ultimately knocked out four public safety answering points. Compounding the problem, system failures included the downing of monitoring capabilities, making it impossible for Verizon to see into its northern Virginia network facilities, thus hindering initial efforts to assess and repair damages, Verizon reported.

A critical question remains: Why didn’t the generators start?

Verizon conducted testing using third-party experts and found that in the Arlington facility, air had entered the fuel system, resulting in a lack of fuel in the lines. The fuel lines for both Arlington generators have been replaced.

Verizon found that the Fairfax generator did not power up because the auto-start mechanisms failed. Those mechanisms should start the generator once commercial power is lost. “But they did not operate correctly and have since been replaced,” Verizon reported.

However, all this leaves Souder far from satisfied. The 911 system is used 240,000 times a day, 87 million times a year. “It is the gateway to public safety,” he said. “It has to be flawless. It has to be robust. It has to stand up when it is most needed.”

He’s not the only one frustrated.


Proposed changes


Discussing the derecho in a July 19 hearing, the FCC’s Genachowski declared, “911 outages are unacceptable, and we must and will work with all stakeholders to address this serious issue.”

The chairman sought to put a human face on the crisis. “In Prince William County, Va., someone called 911 to report a man suffering cardiac arrest and got a busy signal. He finally got help, fortunately, but only after the caller tracked down authorities on a nonemergency line. The derecho made clear the absolutely vital role of our communications networks, particularly during emergencies.”

While various agencies investigate the outage, officials in northern Virginia already are making recommendations.

“Our feeling was that, while we still didn’t know a lot, we did know some things that would make it a whole lot better, and we didn’t want to wait six or eight months until the official reports came in,” said Souder.

The 911 directors of the city of Alexandria and the counties of Arlington, Fairfax, Loudoun, Prince William and Stafford have recommended that Verizon adopt five steps in response to the storm, most of them focused on communications. Verizon has agreed to all.


  • Officials recommended that Verizon sign onto the National Incident Management System (NIMS) model to address future incidents. Verizon said it employs an “all-hazards approach” to its business continuity, disaster recovery, facility preparedness and emergency management programs. That process utilizes NIMS principles, and thresholds for invoking that process have been strengthened to more readily bring those procedures to bear in similar situations.
     
  • Officials recommended that Verizon obtain and use a reverse notification phone call system to give notice about an interruption of 911 service. Verizon notes that since March 2011, it has employed a broadcast email process to provide specific ticket information to individual public safety answering points. Verizon said it will expand that process to include texting and will work with 911 directors to establish the correct contact lists and process details.
     
  • Emergency managers recommended that the company develop a semiannual drill or exercise with each jurisdiction on actions to be taken in the event of a 911 outage. Verizon said it will engage the assistance of its Business Continuity Emergency Management team to work with the company’s 911 Customer Care Center organization to develop and exercise procedures for such drills.
     
  • Verizon should provide, during the first week of each month, a current contact list for the account manager assigned to a given jurisdiction, along with contact information for four immediately escalating Verizon personnel up to a vice president level. The company agreed to do this.
     
  • Officials asked that Verizon have a representative present at the jurisdictions’ emergency operations centers (EOCs) to provide accurate information concerning 911 service and outages. Verizon said it will work with the 911 directors to explore ways to accommodate this request, perhaps through virtual participation in an EOC via an instant messaging-like application.

These initiatives won’t guarantee that the failures seen during the derecho never recur, but they should give emergency planners a solid start against future crises. “Not every one of these will apply in every location, but we know what we are looking for. So while the solution in each situation may be different, the overall plan will be comprehensive,” said Maureen Davis, Verizon vice president of network technology for the Mid-Atlantic region.

Beyond these specific remedies, Verizon walks away with a greater appreciation for the overall vulnerabilities of its systems.

“For us, one of the major lessons has to do with ways to deal with significant and multiple catastrophic events,” Davis said. “We have had commercial power failures and generator failures at central offices before, but I can’t think of a single instance where we were working more than one at a single moment. We all just need to be continually vigilant about backup strategy, redundancy, diversity, so we understand how much of it we have and how much of it we need.”

Emergency managers meanwhile found in these events a none-too-subtle reminder that their means of public outreach is still in need of upgrades.

“We need to be more engaged with how social media can be utilized to notify the public,” Souder said. “We didn’t do a bad job, but we did it on the fly. It isn’t something we normally do. We know the social media phenomenon has a role here, but how to use it, how to engage it — these are weighty things.”

The 911 center doesn’t have its own Twitter account. The communication after the derecho ran through the county government’s account. Clearly, Souder said, something more formal is needed.

At the same time, local planners are assembling an interjurisdictional 800 MHz talk group for 911 events, a network that would reach across the region’s trunked radio systems without interfering with the police or fire mutual aid radio systems. “It will allow us one more communications path,” said Souder.


Lessons learned


Was there an element of human error here? Not in the immediate sense: The breakdown was fundamentally mechanical. But there clearly were systemic difficulties, a failure to communicate down the line at crucial moments, and a shortfall in emergency procedures that might apply in the most extreme circumstances.

Could the situation happen again? Whatever flaws may have lurked inside those generators, it took an extraordinary meteorological event to bring them to light. History has shown that extreme weather can drive extreme consequences. It might not happen in exactly this way, but surely a catastrophic 911 failure can never be ruled out. What’s more relevant is the response.

Emergency managers have articulated a new set of safeguards that should help to drive response in future events. By the time investigators have finished combing through this crisis, they likely will have learned enough to stand up more robust defenses to aid during future disasters.

 

Adam Stone is a contributing writer for Government Technology magazine.