Most technology leaders know that sinking feeling. The phone rings, and the voice at the other end says, “The mainframe just crashed.” Or, “We lost power at the data center and some of the uninterruptible power supply units (or the generator) didn’t work properly.” Just as scary: “Our vendor’s network is down. The incident is impacting thousands of customers.”

Computer and network outages — and the corresponding ramifications — come with the IT territory. Even when services are outsourced, the ultimate responsibility still rests with the public CIO. Despite mind-numbing thoughts of “what if,” our teams must implement recovery efforts just as a fire department responds to fires. And yes, seconds matter.

While the need to activate a full-scale disaster recovery plan may be rare, operations personnel deal with varying types of critical incidents regularly. But how effective is your team in these situations? What’s your recovery time objective when things go wrong? Simply stated: Are you ready for the next significant outage? 

Key Considerations

So what are some of the keys to a successful outage remediation?

1) Understand the outage scope, your options and timelines. Just as the military wants intelligence regarding enemy movements in a war, operations leaders must quickly grasp the extent of an operational emergency. Good monitoring tools, end-to-end system management capabilities and qualified operations staff are essential for achieving timely restoration of service. 

Tip: Beyond asking what happened, ask if anything changed. Can you roll back to the previous configuration? Use requests for change and change control boards to track activity. In Michigan, we activate our Emergency Contact Center during major incidents to ensure that the right priority is placed on the situation. All key resources gather (virtually or in person) to coordinate recovery options.
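To make the "ask if anything changed" question concrete, here is a minimal Python sketch that pulls recent changes out of a change log as rollback candidates. The `ChangeRecord` fields, the 24-hour lookback window and the sample entries are all assumptions made up for illustration. They do not describe Michigan's tooling or any particular change-management product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ChangeRecord:
    """A hypothetical change-request entry; the field names are illustrative."""
    change_id: str
    system: str
    applied_at: datetime
    rollback_available: bool


def recent_changes(changes, outage_start, affected_systems,
                   lookback=timedelta(hours=24)):
    """Return changes applied to affected systems within `lookback`
    before the outage began, newest first."""
    window_start = outage_start - lookback
    candidates = [
        c for c in changes
        if c.system in affected_systems
        and window_start <= c.applied_at <= outage_start
    ]
    return sorted(candidates, key=lambda c: c.applied_at, reverse=True)


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    outage = datetime(2024, 5, 14, 9, 30)
    log = [
        ChangeRecord("RFC-101", "mainframe", datetime(2024, 5, 14, 8, 45), True),
        ChangeRecord("RFC-099", "network", datetime(2024, 5, 13, 22, 0), False),
        ChangeRecord("RFC-090", "mainframe", datetime(2024, 5, 10, 12, 0), True),
    ]
    for change in recent_changes(log, outage, {"mainframe", "network"}):
        action = "roll back" if change.rollback_available else "investigate"
        print(change.change_id, change.system, action)
```

Sorting newest first reflects a common working assumption that the most recent change is the likeliest culprit. Your team may prefer a different ordering or a shorter window.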

2) Develop clear roles and responsibilities. Early decisions are often the key. Who’s in charge and what resources are available? Should we keep fixing the problem or activate the disaster recovery plan? What resources or vendor relationships can help?

Seasoned pros who have been through outages know that conflicting information and competing interests often emerge. Sometimes the technical staff will underestimate the issue or overestimate their ability to remediate what happened, making matters worse.

Tip: Developing “run books,” compilations of the procedures and operations that the system administrator or operator carries out, can help navigate outages. A good run book includes procedures for every anticipated scenario and generally uses step-by-step decision trees to determine the most effective course of action.
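A run book decision tree can be as simple as a set of yes/no questions that lead to a named procedure. The Python sketch below is one possible shape, not a standard run book format. The class names, the answer keys and the tree itself are hypothetical, and the questions are borrowed from the ones raised in this column.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Action:
    """A leaf: the procedure the operator should carry out."""
    procedure: str


@dataclass
class Question:
    """An internal node: a yes/no check with a branch for each answer."""
    key: str
    text: str
    if_yes: Union["Question", Action]
    if_no: Union["Question", Action]


def walk(node, answers):
    """Follow the tree using recorded answers.
    Returns the list of (question, answer) steps and the final Action."""
    steps = []
    while isinstance(node, Question):
        answer = answers[node.key]
        steps.append((node.text, answer))
        node = node.if_yes if answer else node.if_no
    return steps, node


# Hypothetical tree for illustration; a real run book would hold many more.
RUNBOOK = Question(
    key="recent_change",
    text="Did anything change shortly before the outage?",
    if_yes=Question(
        key="rollback_possible",
        text="Can the previous configuration be restored?",
        if_yes=Action("Roll back to the previous configuration"),
        if_no=Action("Escalate to the vendor or system owner"),
    ),
    if_no=Question(
        key="fix_within_rto",
        text="Can service be restored within the recovery time objective?",
        if_yes=Action("Continue fixing the problem in place"),
        if_no=Action("Activate the disaster recovery plan"),
    ),
)

if __name__ == "__main__":
    steps, action = walk(RUNBOOK, {"recent_change": False, "fix_within_rto": False})
    for text, answer in steps:
        print(f"{text} -> {'yes' if answer else 'no'}")
    print("Action:", action.procedure)
```

Recording each question and answer along the way also leaves a trail that can feed the root-cause analysis after service is restored.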

3) Promote excellent communication. When critical systems are down, everyone counts the minutes. Perception is reality, and while some loss-of-service situations will make the local news and others won’t, public perception can impact your actions. Remember that communication continues after systems are restored. A good root-cause analysis listing lessons learned — including people, process and technology activities — should be provided to clients after appropriate review. 

Tip: Develop an emergency communication plan for dealing with internal and external stakeholders. Don’t let this become shelfware; practice different scenarios during tabletop exercises. Meeting customer expectations and building confidence in your statements are as important as restoring service. Don’t make promises you can’t keep.

In May, Michigan had two outages that made the news. Fortunately, our experienced public information officer handled all media inquiries with expert precision. He knew what questions would be asked, whom to contact internally to get the facts and what to say about restoration times.

In conclusion: Despite our best efforts, technology outages are inevitable. Cloud computing and more smartphones in the enterprise will further complicate end-to-end service restoration and escalate the need to partner with vendors. Prepare now for the unexpected.

Dan Lohrmann is Michigan’s CTO and previously served as the state’s first chief information security officer. He has 25 years of worldwide security experience, and has won numerous awards for his leadership in the information security field.
