Fixing Online Health Insurance Exchanges Will Require Long-term Care (Contributed)

Bugs and outages can be mitigated with more careful monitoring.

by Chris LaPoint, SolarWinds / December 11, 2013

Everyone knows the rollout of HealthCare.gov, the online face of the Affordable Care Act, has seen its share of bumps and bruises. State exchanges also came out of the gate incapacitated, with networks in Maryland, Oregon and other states experiencing major issues.

Now that federal and state governments are making concerted efforts to fix what ails their sites, it’s time to ask: What must be done to ensure their long-term health and viability?

Suffice it to say, this will call for ongoing treatment rather than outpatient care. Otherwise, the specters of portal outages, missing applications and login credentials, and other stumbling blocks that have marred the sites’ launches will continue to rear their ugly heads.

To ensure this does not happen, IT managers and developers working on HealthCare.gov and state-run health insurance exchanges must act as caregivers who take a holistic approach. They must pay close attention to potential health risks by homing in on two fronts:

Integrated, end-to-end monitoring of the entire infrastructure – The problem with HealthCare.gov, in particular, is that it was created like Frankenstein’s monster, cobbled together with different pieces from separate organizations. Given this, it is not terribly surprising that the site did not work, or that finger pointing followed its launch. But too much time has been spent trying to blame different components of the exchanges and hunting for root causes where there may not be any. Let’s not say that it was a network problem, or an app server issue, or a database or storage matter; that’s not constructive now, and will not be in the months ahead.

To avoid this problem in the future, teams working on the sites must have a way to easily monitor the entire system from end to end. This means having a single view of all of the different components, including applications, databases, servers, storage and more, along with the ability to receive automatic alerts if any of them should fail. This will allow administrators to pinpoint issues immediately and act on them quickly. An integrated, end-to-end approach such as this should help avoid a repeat of the issues that have plagued the sites since October.
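To make the idea of a single view with automatic alerts concrete, here is a minimal sketch in Python. It is an illustration only, not a description of any exchange's actual tooling: the component names and the check functions are hypothetical stand-ins, and a real deployment would replace them with genuine probes of its own servers, databases and storage.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CheckResult:
    component: str
    healthy: bool
    detail: str


def run_checks(checks: Dict[str, Callable[[], bool]]) -> List[CheckResult]:
    """Run every component check and gather the results in one place."""
    results = []
    for name, check in checks.items():
        try:
            ok = check()
            detail = "ok" if ok else "check reported failure"
        except Exception as exc:
            # A check that crashes is treated as an unhealthy component.
            ok, detail = False, f"check raised {exc!r}"
        results.append(CheckResult(name, ok, detail))
    return results


def failing(results: List[CheckResult]) -> List[str]:
    """Names of the components that are currently unhealthy."""
    return [r.component for r in results if not r.healthy]


def alert(results: List[CheckResult], notify: Callable[[str], None] = print) -> None:
    """Send one alert per unhealthy component through the notify callback."""
    for r in results:
        if not r.healthy:
            notify(f"ALERT: {r.component}: {r.detail}")


if __name__ == "__main__":
    # Hypothetical checks; each would normally probe a real component.
    demo_checks = {
        "application": lambda: True,
        "database": lambda: True,
        "storage": lambda: False,
    }
    alert(run_checks(demo_checks))
```

The point of the design is that every component reports into the same list, so an administrator sees one status board rather than several disconnected ones.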

End user experience and application performance monitoring – The question that has been debated seemingly endlessly is, “Why didn’t those working on the exchanges recognize the issues before the launch?” It’s likely that they did not have a reliable way of monitoring the end user experience and the performance of vital applications.

While sites like Amazon may have the end user experience down pat, it’s a whole new operation for state and local governments. With HealthCare.gov, the federal government in effect threw its hat into the online merchant business, apparently without thoroughly testing the entire user experience before going live. States have experienced similar issues; Oregon’s rollout has gone so terribly awry that organizers have resorted to paper applications until the state’s online exchange can be fixed, while Maryland’s site has been interminably slow and glitch-prone. In every case, the user experience has been listed as critical.

Clearly this needs to change, beginning with the implementation of sustained and comprehensive end user monitoring. Administrators need to be able to closely watch every step of the transaction process, from initial sign-up, to the selection of insurance plans, and through to purchase. They must have alerts in place that will notify them of any type of hiccup, however small. This will give them a better idea of where individual pain points may lie.
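As a rough illustration of watching each step of a transaction, the Python sketch below walks through a scripted sign-up, plan selection and purchase, timing every step and raising an alert when one fails or runs slowly. The step functions and the two-second threshold are assumptions made for the example; in practice each step would drive the real site the way a user would.

```python
import time
from typing import Callable, List, Tuple


def monitor_transaction(
    steps: List[Tuple[str, Callable[[], None]]],
    slow_after_s: float = 2.0,
    clock: Callable[[], float] = time.perf_counter,
) -> List[str]:
    """Run the steps in order and return an alert for each slow or failed step."""
    alerts = []
    for name, action in steps:
        start = clock()
        try:
            action()
        except Exception as exc:
            alerts.append(f"{name}: failed ({exc!r})")
            break  # later steps depend on this one, so stop here
        elapsed = clock() - start
        if elapsed > slow_after_s:
            alerts.append(f"{name}: slow ({elapsed:.2f}s)")
    return alerts


if __name__ == "__main__":
    # Hypothetical steps standing in for scripted visits to the real site.
    demo_steps = [
        ("sign-up", lambda: None),
        ("plan selection", lambda: None),
        ("purchase", lambda: None),
    ]
    for message in monitor_transaction(demo_steps):
        print("ALERT:", message)
```

Run on a schedule, a check like this can surface a hiccup in one step before real applicants report it.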

Likewise, application performance monitoring (APM) will also play an important role in keeping the sites healthy. Traditionally APM has been difficult even with a basic application, and HealthCare.gov and its sister sites are far from basic, but there is now the ability to insert trace amounts of code directly into applications themselves, allowing administrators to collect detailed timing information on transactions and understand what’s happening at every step. This code-based APM approach is called Real User Monitoring or RUM. Like a physician who is able to use the latest diagnostic equipment to identify a potentially threatening disease, RUM can provide administrators with unprecedented insight into problems that can adversely impact the performance of applications. Perhaps there is a problem with the application itself that is causing users to have trouble accessing a list of health-care providers, or getting information on the plans most suitable for them. Detailed APM can help pinpoint these issues, allowing administrators to patch them up accordingly. Not only will this make for a smoother user experience, it will also help prevent application downtime – something the online exchanges have already experienced quite enough of.
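For readers curious what inserting a trace amount of code can look like, here is a small Python sketch of in-application timing. It is a simplified illustration rather than how any commercial APM product works internally: the `timed` decorator, the `TIMINGS` store and the `list_providers` function are all hypothetical names invented for the example.

```python
import functools
import time
from collections import defaultdict
from typing import Callable, Dict, List

# Collected durations in seconds, keyed by a label for each traced operation.
TIMINGS: Dict[str, List[float]] = defaultdict(list)


def timed(label: str) -> Callable:
    """Decorator that records how long each call to the wrapped function takes."""
    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Record the duration even if the call raised an error.
                TIMINGS[label].append(time.perf_counter() - start)
        return wrapper
    return decorator


def average(label: str) -> float:
    """Mean recorded duration for a label, or 0.0 if nothing was recorded."""
    samples = TIMINGS.get(label, [])
    return sum(samples) / len(samples) if samples else 0.0


@timed("list_providers")
def list_providers(zip_code: str) -> List[str]:
    # Hypothetical stand-in for a real provider lookup.
    return [f"Provider near {zip_code}"]


if __name__ == "__main__":
    list_providers("21201")
    print(f"list_providers average: {average('list_providers'):.6f}s")
```

One extra line above a function is enough to start collecting timing data for it, which is what makes this style of instrumentation practical across a large application.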

The good news is that these sites are not terminal patients, and can be completely viable over time. It’s just a matter of administrators deploying the correct procedures and gaining a better view of the entire system and experience. Thankfully, they already have the tools needed to do so, but they need to implement them, stat.

Chris LaPoint is the vice president of Product Management at SolarWinds, an IT management software provider based in Austin, Texas.