Government agencies have eagerly embraced open data and transparency. This newfound abundance of data has spawned new questions and challenges about how to organize the information so that citizens and policymakers can use it.
Sunshiny days have come to government, as transparency Web sites, iPhone apps, online dashboards and data feeds have proliferated across the Internet. The movement is changing the playing field for how citizens obtain government information. Since President Barack Obama made transparency and open data early priorities of his administration, public agencies have put thousands (if not millions) of data sources online, creating a rich pool of material that once was hidden in dusty file cabinets and available only through freedom-of-information requests.
Although most agree that openness is a welcome aim, this newfound abundance of data has spawned new questions and challenges: Can government agencies link together these far-flung data sources coherently so that citizens can find what they're looking for? And can open data be packaged and presented in a manner that decision-makers and the public can use to affect policymaking?
Those are big-ticket tasks, but they are issues that thought leaders in the nation's IT community are beginning to confront, according to San Francisco CIO Chris Vein. "There are wonderful plans, at least in our heads, to connect all of this stuff, and I think we could," he said. The topic has been broached, Vein said, in preliminary discussions with Beth Noveck, Obama's deputy CTO for open government, and Federal CIO Vivek Kundra.
"What we've brainstormed -- nothing more than that -- is kind of a 'government vertical' and trying to see if we couldn't build something across all the various levels of government to see if that would be particularly useful," Vein said. "Obviously there's interest -- it's just a matter of time and priority."
Like any other project that requires cooperation and a shared vision by the feds, states and locals, linking together open data wouldn't be simple -- and that's probably an understatement.
One person who knows firsthand how challenging such an effort can be is Richard Seline, an entrepreneur and former Bush 41 White House official who dreamed up the National Dashboard, a politically neutral consortium that tracks government spending by combining 50 million records of federal grants with thousands more from cities, counties and states. Seline has assembled a coalition of seven government associations, including the National Governors Association and the National Association of Counties (NACo), with the ultimate goal of creating a "comprehensive analytics and modeling tool" that uses Recovery Act reporting data, federal grants and contracts, and ongoing spending at all government levels. For now, it remains an unfinished dream.
"The technology of getting information sources to all link up -- that is, frankly, a somewhat simple solution now," Seline said. "The difficulty isn't connecting the knowledge pipes. The issue is reading the information and making sense of all the different pieces of information that are coming together."
In other words, governments need to figure out what they want to know from the data, and then coordinate discussions across government levels so that data is aligned by subject matter -- like Vein's suggestion of a vertical. For example, Seline said it's within reach that data for a broad topic like transportation could be integrated and connected within an "online information tree" that includes the U.S. Department of Transportation and flows down to include the National Highway Traffic Safety Administration, state departments of transportation, the counties that receive much of the road money, and local mass transit departments. This kind of "taxonomy" has become an organizing principle of the conceptual Nationaldashboard.org.
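To make the "online information tree" concrete, here is a minimal sketch in Python of how such a transportation taxonomy might be represented. It is an illustration only, not the National Dashboard's actual design: the `Node` class, the `walk` and `build_transportation_tree` helpers, the level labels and the exact parent-child placement of each agency are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One agency or program in a hypothetical subject-matter tree."""
    name: str
    level: str  # assumed labels: "federal", "state", "county", "local"
    children: list["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        """Attach a child node and return it so calls can be chained."""
        self.children.append(child)
        return child


def walk(node, depth=0):
    """Yield (depth, node) pairs in top-down, depth-first order."""
    yield depth, node
    for child in node.children:
        yield from walk(child, depth + 1)


def build_transportation_tree():
    """Build an illustrative tree; the nesting shown is an assumption."""
    root = Node("U.S. Department of Transportation", "federal")
    root.add(Node("National Highway Traffic Safety Administration", "federal"))
    state = root.add(Node("State department of transportation", "state"))
    county = state.add(Node("County road program", "county"))
    county.add(Node("Local mass transit department", "local"))
    return root


if __name__ == "__main__":
    # Print the tree with two spaces of indentation per level.
    for depth, node in walk(build_transportation_tree()):
        print("  " * depth + f"{node.name} ({node.level})")
```

Organizing sources this way would let a citizen start at the broad topic and follow one branch down to a county or city, rather than hunting across separate Web sites.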
"The idea of having hundreds of these Web sites on transparency is a check mark for open government, but are we actually causing something new to happen with this new information and knowledge? If we're not, then we should step back and ask much more purposeful questions of ourselves," Seline said. Simply put, the data needs to be organized in such a way that it tells policymakers and the public a story, instead of just releasing data for data's sake.
For that kind of organization to become a reality, somebody or some agency must step up and take the lead, said David Tucker, the CIO of Vermont. There are policy and technical considerations that would have to be accounted for. "I would think there would have to be some sort of standard data format and an agreement on the structure of the portal," he said. "From the citizens' perspective, it certainly makes sense for somebody to try to pull all those resources together so that people don't have to hunt around."
What an all-encompassing national dashboard would look like is unclear, although some think it could be modeled after or integrated into Data.gov, the online data portal Kundra launched in spring 2009, shortly after he joined the White House. The online data clearinghouse makes a wide range of data sets from various federal agencies available to the public in different formats. Some states and cities emulated the concept by opening data portals of their own, which offers a further opportunity for creating a connected matrix of such Web sites. Vein added that it's possible that governments that have released transparency apps and data tools on the iPhone -- San Francisco, Washington, D.C., and the U.S. Office of Management and Budget, to name a few -- would someday offer them at a one-stop online storefront. It could very well be built upon the existing Apps.gov, another of Kundra's innovations.
But it's presumptuous to assume that only government will drive the creation of a national dashboard. Seline's effort is among several that are surfacing in the private sector. Another possibility is that this nationwide collaboration could become a public-private partnership. When the federal government launched Recovery.gov, business-to-government solutions provider Onvia launched a complementary Web site called Recovery.org, which used Recovery Act spending data to visualize on a national map those stimulus projects all the way down to the county level. Onvia also is a technology partner of the National Dashboard -- and a good example of the commingling of public and private entities that are involved in the transparency movement.
One more unknown is whether a nationwide dashboard would be purely a data repository, or if it would attempt to visualize data so that it tells an interpretive story. As governments already have proved, releasing data online can be done -- and has been done.
Figuring out and coordinating what the locals, states and federal government want to visualize -- what story those stakeholders want to tell with the data -- is probably the biggest challenge of creating a national dashboard, said David Keen, the chief financial officer for NACo. Should all the data go on an interactive map, be organized thematically or topically, or be tailored to a user's geolocation? The scale of this challenge could require some kind of government summit, he said.
"What you have [now] are all these disparate data sets that were captured for individual purposes, so the first thing you have to do is get the data sets aligned. You have to have discussions to find out what you're trying to accomplish," Keen said. "If it's economic development, that's one set of data metrics. If it's housing, that's another. But if you're just trying to capture data to have data, then you're going to be lucky if it's meaningful when you bring it all together."
The difficulty of that type of data coordination was a hard lesson learned during the beginning months of the American Recovery and Reinvestment Act, when the federal government mandated in 2009 a reporting process that was more rigorous than any grant-making effort that had come before.
Local governments were asked to report metrics they had never calculated before -- the number of jobs created, in particular -- and were forced to drill down on their spending data in more detail. Officials quickly discovered there were gaps in the data and the reporting process, and that at the time even the federal government's data weren't reported uniformly across all agencies, Keen said.
"And so it ends up running into issues, and that was apparent in the stimulus. The [federal government] said, 'OK, we want to funnel this money down to the local level using the current channels with existing projects,' but, 'Oh, by the way, we're going to change the metrics so now you guys need to report jobs and these have to be shovel-ready projects,' which weren't designed to create jobs, but were designed to fill operational needs."
After months of discussion, last summer the federal government published the necessary data formulas for calculating the number of jobs created and a common data format for the information. But even before states and locals began submitting their data, some lawmakers criticized Recovery.gov's design, complaining that it didn't do a good job of explaining how the stimulus was benefiting everyday Americans. The Web site's board realized the site needed to be more robust, so it re-bid the portal in a redesign that cost $18 million.
So if Recovery.gov was a test case, then one can only imagine a national dashboard's potential for spurring disagreements and competing agendas. The problem with organizing information, in Keen's mind, is the same problem that's plagued public-sector initiatives for years: The three levels of government -- the federal, the state and the local -- aren't talking to each other. They're talking at each other.
"The feds are able to show data on the money they've pumped out, but it goes to the state level -- where it doesn't really get reported. It gets filtered -- sent to the local level, and in most cases they report back to the feds. But that doesn't capture all the other data for the activity that's generated at the local level," Keen said.
If there's reason for optimism, it's that the idea of a national dashboard and the increasing numbers of government transparency and open data Web sites are good starting points because they're forcing participants to look at the issue of "data capture" holistically. It has caused people to start asking questions. "I think in the end, at each individual level it'll get there, but until all three are brought to the table -- and maybe that's through [the White House Office of] Intergovernmental Affairs -- you're never going to have seamless data," Keen said.
He said a national dashboard initiative would have to take into account the dire financial conditions of states and locals. Since cities and counties are cutting their budgets severely, they would want to know they could use the analytics they contribute to the dashboard in ways that help them save money, like reducing the number of information requests that a department handles.
The need for such an initiative -- especially as the Obama White House continues its long march forward with transparency and expects states and locals to follow -- will only intensify because there's a danger that the amount of available data will become overwhelming, some observers say. People might not know where to look for what they want to find, perhaps creating an environment that is in some ways too transparent.
"The difficulty is, how many citizens are going to go looking site by site, project by project, unless you are one of the 10 percent of people who have a beef with something that didn't get funded or are directly involved in something, or just out of pure curiosity?" Seline asked. "We should step back and try to figure out the information, how much information, and more importantly, what's the point of the information."
What that organizing effort ultimately becomes is -- and perhaps this is appropriate -- an open question.