April 13, 2008 By Dan Lohrmann
Most public CIOs are asked this simple question: Is everything backed up? The answers range from an overconfident "yes" to a dejected "can we please change the subject?" No matter your answer, I challenge you to make disaster recovery (DR) a process and not a destination.
In 1996, Government Technology published an article called "Disaster Recovery Planning Gets No Respect." Here's an excerpt: "Montana, like many other state and local governments, has found that its disaster recovery plans and budget have not kept pace with the rapid growth in computing." The article argued strongly for more DR resources, citing facts from a 1987 University of Texas study.
Two decades later, dependence on IT is greater than ever, and the impact on government is even larger when systems are unavailable. Most technical operations now have 24/7/365 expectations due to the growth of Internet-enabled applications.
And yet, even after Y2K, 9/11, the Northeast Blackout of 2003 and Hurricane Katrina, CIOs still struggle with the same DR funding problems. Despite numerous studies that demonstrate the importance of planning for emergencies, many governments still give a low priority to actual spending.
Here are four recommendations:
Know what's critical. Start by identifying your critical systems and sensitive data. Assemble business and technology experts who can answer simple questions, such as: What can't our government live without? What legislative mandates apply to DR? If certain databases were lost, what would we do?
While many organizations have a hard time ranking their priorities, most can group systems into critical categories. Create an ongoing process to update this list every year.
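As a rough illustration of grouping systems into categories, here is a minimal Python sketch. The tier labels, system names and owners are all hypothetical placeholders, not a recommended taxonomy; the real list should come from your business and technology experts.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    name: str
    tier: str   # hypothetical labels: "critical", "important", "deferrable"
    owner: str  # business unit that owns the function


def group_by_tier(systems):
    """Return a dict mapping each tier to a sorted list of system names."""
    groups = defaultdict(list)
    for system in systems:
        groups[system.tier].append(system.name)
    return {tier: sorted(names) for tier, names in groups.items()}


# Illustrative inventory only; every entry here is made up.
inventory = [
    System("payroll", "critical", "Finance"),
    System("public-web-portal", "important", "Communications"),
    System("911-dispatch", "critical", "Public Safety"),
    System("training-wiki", "deferrable", "HR"),
]

tiers = group_by_tier(inventory)
for tier, names in tiers.items():
    print(f"{tier}: {', '.join(names)}")
```

Keeping the inventory in a simple, reviewable form like this makes the yearly update a matter of editing a list rather than rebuilding the analysis.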
Determine current capabilities. This "as is" analysis is harder than it sounds, but you need to know what your capabilities are before you can build your case for more redundancy, much less fix anything. Pick the three or four most likely disaster scenarios and have your team figure out their potential impact on critical systems.
Don't assume data is backed up just because your tapes or other media go offsite. Has your team tried before to restore those tapes? What if the hardware is destroyed? Could you read the media with other hardware? Don't forget to examine all critical system components such as alternate power sources and networks (including the resiliency of items such as DNS and DHCP).
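One way to check that a restore actually worked is to compare checksums of the restored files against a manifest recorded when the backup was taken. The sketch below assumes such a manifest exists as a mapping of relative file paths to SHA-256 digests; the function names and the manifest format are illustrative assumptions, not part of any particular backup product.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(manifest, restored_dir):
    """Compare restored files against expected checksums.

    manifest: dict of relative path -> expected SHA-256 hex digest
              (assumed to have been recorded at backup time).
    Returns a list of (relative path, problem) tuples; empty means all matched.
    """
    problems = []
    root = Path(restored_dir)
    for rel_path, expected in sorted(manifest.items()):
        candidate = root / rel_path
        if not candidate.is_file():
            problems.append((rel_path, "missing"))
        elif sha256_of(candidate) != expected:
            problems.append((rel_path, "checksum mismatch"))
    return problems
```

An empty result only shows that the files came back intact. It does not prove the application can run from them, so a full recovery test is still needed.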
Get information to key decision-makers. Once you have the data from the first two steps, provide options to key business leaders who own those functions. They may be shocked by your list.
Start the dialog and agree on a plan. The bottom line is that business units must be aware of, and willing to accept, the risk for any missing pieces. On the other hand, IT must ensure expectations are met or exceeded in providing the DR services the business is paying for and counting on.
The National Association of State Chief Information Officers has created some great materials to help you make your business case for DR.
Test your plan and measure effectiveness. Testing is important, and not only for well-funded projects that have great DR plans. You need to test your plans at least once a year. Some organizations neglect to test their backup tapes - and they're surprised later when recovery efforts fail.
I encourage CIOs to have their teams work with emergency management coordinators so technology recovery is built into broader emergency response efforts. Participating in important exercises on potential disasters, such as pandemic flu, will create opportunities to highlight and correct weaknesses in systems that need repair.
You can improve DR by ensuring your customers know your current status and by keeping business continuity as one of your overall IT priorities.