As a result, residents lost access to some major state functions, including DMV Express, Vermont's service for online license renewals and registrations. The outage struck around 3 p.m. on Thursday, April 20, with the sites and services going live again by 10:30 a.m. Friday, he said.
The disruption “was significant, in that it was our primary public-facing presence and all of our websites, but services were able to still be offered through in-person options,” Nailor said.
Public safety systems were unaffected and internal processes remained intact, enabling residents to be helped onsite and letting the state continue processing tax returns.
What happened? The root of the problem lies in Washington, D.C., where a severed cable set off a domino effect, ultimately toppling Vermont’s web presence.
Nailor said state websites went out after the company that handles much of Vermont’s web hosting and content management lost Internet at its D.C.-based data center. The company in question, Tyler Technologies, told GovTech that this outage hit when its network vendor’s fiber connections were severed, “impacting us as well as other organizations.” That network vendor is AT&T, Nailor said; AT&T did not respond to a request for comment by press deadline.
This kind of incident was unprecedented for Vermont.
“Tyler and its predecessor, NIC, have been providing these services [to Vermont] for around 17 years, and this is the first event of this magnitude in that time,” Nailor said. “They continue to provide us with like four nines of uptime, 99.99 percent uptime. This is an unheard of event.”
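For a sense of scale, "four nines" conventionally means 99.99 percent availability. The short Python sketch below compares the yearly downtime that figure allows with the approximate length of this outage. It assumes a 365-day year and uses 2023 as the calendar year (an assumption; only the weekday and date appear in the reporting), with the 3 p.m. and 10:30 a.m. times taken from the state's account. It is illustrative arithmetic, not a statement of any contractual service level.

```python
from datetime import datetime

HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year


def downtime_budget_hours(availability_percent: float) -> float:
    """Hours of downtime per year permitted at a given availability level."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


# Approximate outage window from the article; the year is an assumption.
outage_start = datetime(2023, 4, 20, 15, 0)
outage_end = datetime(2023, 4, 21, 10, 30)
outage_hours = (outage_end - outage_start).total_seconds() / 3600

if __name__ == "__main__":
    for level in (99.9, 99.99):
        budget = downtime_budget_hours(level)
        print(f"{level}% uptime allows about {budget * 60:.0f} minutes of downtime per year")
    print(f"This outage lasted roughly {outage_hours:.1f} hours")
```

Under these assumptions, a single event of this length uses up many years' worth of a four-nines downtime allowance, which is consistent with Nailor's description of it as out of the ordinary.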
During the outage, Vermont agencies turned to their social media teams to provide residents with status updates and information on alternate ways for getting services, Nailor said.
The bulk of the outage occurred during the evening and overnight, which may have helped reduce disruptions. As of yesterday, the state was still gathering data about the extent to which the incident kept residents from getting their needs met during that time or caused any backlogs. But Nailor said anecdotal accounts suggest that call centers “weren’t getting overwhelmed” in response to the event.
The state, network vendor and Tyler Technologies worked together to restore services. Vermont was able to get sites and services back up and running Friday, while Tyler Technologies’ D.C. data center was still down. Doing so required switching over to a secondary data center in Texas, where, for resiliency, Tyler had also stored the state’s content, Nailor said.
To make the switchover, Vermont first needed to make domain name system (DNS) changes so website addresses would direct to content at the Texas data center, something the state did with Tyler’s help, Nailor said.
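One way to confirm that a DNS change of this kind has taken effect is to resolve the hostname and check which data center's addresses come back. The Python sketch below illustrates the idea. The address sets are hypothetical placeholders drawn from reserved documentation ranges, not Vermont's or Tyler's real addresses, and the function names are invented for illustration.

```python
import socket

# Hypothetical addresses (reserved documentation ranges), not real infrastructure.
PRIMARY_DC = {"192.0.2.10"}       # stands in for the D.C. data center
SECONDARY_DC = {"198.51.100.10"}  # stands in for the Texas data center


def resolve_ipv4(hostname: str, port: int = 443) -> set:
    """Return the set of IPv4 addresses the hostname currently resolves to."""
    infos = socket.getaddrinfo(
        hostname, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    return {info[4][0] for info in infos}


def serving_site(addresses: set) -> str:
    """Classify a resolved address set as primary, secondary, or neither."""
    if addresses and addresses <= SECONDARY_DC:
        return "secondary"
    if addresses and addresses <= PRIMARY_DC:
        return "primary"
    return "mixed-or-unknown"
```

Calling `serving_site(resolve_ipv4(hostname))` for each public hostname would show whether traffic is being pointed at the backup site. Because resolvers cache records until their time-to-live expires, results can differ between networks for a while after a change.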
Meanwhile, the network provider worked to repair connectivity in D.C.
“AT&T restored an aerial WAN into it [the data center] over the weekend, but the permanent repair was not completed until late in the day [Mon. April 24],” Nailor said. Vermont planned to undo the DNS changes Tuesday and switch back to that primary D.C.-area data center.
The state intends to use the incident as a learning opportunity, and Nailor said it is in discussions with Tyler about disaster recovery testing and practices.
“What you'll see come from this is a regimented testing protocol that will allow us to be assured that the DR contingencies that are in place will work when they're needed at the time,” Nailor said.
That’ll include coordinating with Tyler Technologies to conduct simulated Internet outages at the primary data center and practice rolling over connectivity to the secondary data center, he said. That switchover should occur either with no service disruption or with services restored “within seconds,” without humans needing to intervene.
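The kind of drill described here can be sketched in a few lines of Python: mark the primary site unreachable, confirm that site selection moves to the secondary without manual steps, then restore the original state. This is a toy model under stated assumptions; the `DataCenter` class and both functions are hypothetical, and a real test would involve live health checks and actual traffic routing rather than a flag.

```python
from dataclasses import dataclass


@dataclass
class DataCenter:
    name: str
    reachable: bool = True  # stands in for the result of a real health check


def choose_active(primary: DataCenter, secondary: DataCenter) -> DataCenter:
    """Prefer the primary site; fall back to the secondary if it is down."""
    if primary.reachable:
        return primary
    if secondary.reachable:
        return secondary
    raise RuntimeError("no data center reachable")


def run_drill(primary: DataCenter, secondary: DataCenter) -> str:
    """Simulate a primary outage and report which site would serve traffic."""
    original_state = primary.reachable
    primary.reachable = False
    try:
        return choose_active(primary, secondary).name
    finally:
        # Always undo the simulated outage, even if selection fails.
        primary.reachable = original_state


if __name__ == "__main__":
    dc = DataCenter("D.C.")
    tx = DataCenter("Texas")
    print("Serving during drill:", run_drill(dc, tx))
    print("Serving after drill:", choose_active(dc, tx).name)
```

Running a check like this on a schedule, rather than only after an incident, is the point of a regimented testing protocol: a failure in the drill surfaces a broken contingency before residents are affected.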
Tyler Technologies said that, as another step, it is also “working with the state to evaluate a migration to the AWS cloud.”