Government thinks open data is an add-on that boosts transparency, but it’s more than that.
Most open data portals don’t look like labors of love. They look like abandoned last-minute science fair projects, pie charts sagging because someone didn’t use enough glue stick. The current open data movement is more than a decade old, but some are still asking why they should even bother.
“Right now, it is irrational for almost anybody who works in government to open data. It makes no sense,” Waldo Jaquith said. “Most people, it’s not in their job description to open data — they’re just the CIO. So if he fails to open data, worst case, nothing bad happens. But if he does open some data and it has PII [personally identifying information], then his worst case is that he’s hauled before a legislative subcommittee, grilled, humiliated and fired.”
Though perhaps it’s not immediately apparent, Jaquith is the director of U.S. Open Data and one of the movement’s most active advocates. But he’s also a realist. Open data is struggling to gain financial and spiritual backing. Open data may fizzle out within the next two years, said Jaquith, and a glance at government’s attitude toward the entire “open” concept supports that timeline.
The people who are really into open data — like Jaquith — aren’t the fad-following type. Open data’s disciples believe in it because they’ve seen that just a little prodding in the right spots can make a big difference. In 2014, Jaquith bought a temporary license for Virginia’s business registration data for $450 and published the records online. That data wasn’t just news to the public — it had been kept from Virginia’s municipal governments too. Before that, the state’s municipal governments had no way of knowing which businesses existed within their boundaries and, therefore, they had no way of knowing which businesses weren’t paying license fees and property taxes. Jaquith estimated (“wildly,” he admits) that this single data set is worth $100 million to Virginia’s municipal governments collectively.
The disconnect between the massive operational potential that open data holds and government’s slow movement toward harnessing it can be explained simply. Government thinks open data is an add-on that boosts transparency, but it’s more than that. Open data isn’t a $2 side of guacamole that adds flavor to the burrito. It’s the restaurant’s mission statement.
Here are six ideas that can help government more fully realize open data’s transformative power.
Open data isn’t just about transparency and economic development. If it were, those things would have happened by now. People still largely don’t know what their governments are doing and no one’s frequenting their city’s open data portal to find out — they read the news. Open data portals haven’t stopped corruption; the unscrupulous simply reroute their activities around the spotlight. And if anyone’s using open data to build groundbreaking apps that improve the world and generate industry, they’re doing a great job keeping it a secret. For government, open data is about working smarter.
“I’m tired of the argument of ‘Oh, it will unlock value to the private sector,’” Jaquith said. “That’s nice. I hope people make billions of dollars off of that. But nobody in any government is going to spend any real amount of time on all the work that goes into opening all the data sets on a sustainable, complete basis because some stranger somewhere might get rich.”
Open data’s most basic advantage is that it makes life easier for government workers. Information that’s requested regularly can be put online, freeing workers to do other tasks. At its best, open data uncovers interjurisdictional insights that save money and improve operations. And no matter how tenuous, peripheral bonuses like transparency and economic development are still there too. Governments aren’t gaining the benefits of open data today because there’s not been a rigorous effort to integrate the concept of openness into public-sector work.
One unnamed city that ranks respectably in the U.S. City Open Data Census has more than 1,000 records on its open data portal. But only 132 of those records are data sets, and 86 of those data sets are pieces of a single budget that have been split apart, leaving roughly 50 distinct data sets. This is a common practice across the public sector and one that reveals intent. For the most part, governments aren’t publishing their data because they know it’s a useful resource that ought to be easily accessible, well curated, neat and current so that it can be used by all. It’s because 1,000 sounds better than 50 when an official is giving a speech or addressing stakeholders, and they’re not the ones who have to use it.
Governments use data. Open data portals are designed for displaying and sharing information in an organized way. Therefore, governments should use a tool designed for the thing they’re trying to do. Even putting aside the “open” concept, public-sector offices around the nation would benefit hugely from having a common, shared pool of data they can draw upon when they need reliable information. Putting the data online is the most practical way to do that — and it also happens to meet the political dictates of transparency — but government should be doing this for its own sake.
“The most common mistake I see governments make with open data is thinking that publication is the end of the activity, rather than beginning of the activity,” said Dan O’Neil, executive director of the Smart Chicago Collaborative. “Because publishing data can be, if we live in a perfect world, simply a prefatory step to allowing residents to talk about how data affects their lives and helps them live better. But usually, what happens is they publish data and they run as fast as they can in the other direction.”
Open data has outgrown the novelty phase, and that means it needs organizational and policy support to survive. It needs comprehensive planning and believers who will act. People wouldn’t be giving up much if they abandoned open data today, O’Neil said, because open data hasn’t done much. The tragedy of giving up now, he said, would purely be a loss of prospect, because open data could change the world if the focus were shifted away from technology and toward the needs of the people.
An organization called City Bureau encourages young non-white people to become reporters in an effort to restore balance to journalistic coverage of Chicago’s South and West sides. Another journalistic endeavor on Chicago’s South Side, the Invisible Institute, serves as a watchdog that uses investigative reporting, litigation and public discussion to further its civil rights goals. O’Neil’s world is one of civic tech and social justice, but regardless of whether a person supports these particular groups ideologically, everyone can learn from their approach.
“That’s where it’s at,” O’Neil said. “Getting data that isn’t open and making it open and then having an actual community strategy around analyzing not just the data, but the social justice issues around the general milieu.”
Government needs to do the same if open data is to find meaning. Just putting data online and hoping for the best isn’t wrong, but it doesn’t do much. Open data needs a clear plan, and it needs to come from a wide patronage within government.
“The most common mistake is focusing on the project over the practice,” said Will Saunders, Washington state’s open data guy (his actual title). “It’s always attractive to have an executive sponsor, and a lot of times open data projects get started as a transparency commitment, as ‘a hallmark of my administration’ kind of thing. [Sometimes] you wind up having a diligent, small group of folks who facilitate the publication of data and then if there’s a leadership change in three or four years, then a lot of the sustainability just isn’t there.”
Washington could be publishing three to four times more data than it is today, Saunders said, but the state holds back, favoring a slower, automated approach to publication so that the efforts will stick.
“Program managers know that they can and should publish, and when they do, they tend to link it to their own programmatic goals as opposed to a specific political commitment,” he said. “What I typically do is work with agencies to see if there’s a way I can encourage them to make publication part of their program design, and if I can’t, then I wait for another day.”
This approach is slower, but like proper diet and exercise, experts recommend it because it works.
Open data’s relevance will grow only if efforts mature. In Washington and elsewhere, data sets are often used for purposes different from what was originally intended. Opportunities to repurpose data will appear more frequently as the information becomes better organized, shared and understood. One severe obstacle to that prospect is that today there exist few standard schemas for publishing data. Roads, for instance, cross every boundary the nation has, and yet road data takes a new format in each jurisdiction. Today, without standards, a large project that uses open road data sounds like more trouble than it’s worth.
Government has a hard time following publishing standards today because not many exist. The President’s Task Force on 21st Century Policing is developing some standards for police data, Data.gov is working toward a standard that will let companies like Uber publish their ride data meaningfully, and programs like Bloomberg’s What Works Cities initiative are positioned to develop standards across city lines. Comprehensive and accessible publishing rules would reduce the work required to free data sets and would solve many of today’s data sharing and comprehension snags.
The public isn’t qualified to tell the government how it should be using its data, because the public doesn’t understand government. Most people think “the government” means the president or Congress. No one understands the challenges of government better than those who run it and those are the people who should guide the use of public-sector data.
Utah is growing its open data automation daily under the guidance of experts. The technology office monitors which data sets its offices need and educates stakeholders on how to use that information. The state auditor, the health-care system and external data requesters are among those learning, said Dave Fletcher, Utah’s CTO.
“Increasingly we’re working on an initiative that we’re calling data-driven government to make better decisions based on data,” Fletcher said, adding that the state shares statewide data with counties so that information like graduation rates, unemployment rates, taxes and air quality measures is easily accessible to commissioners.
Drew Mingl, Utah’s open data coordinator, said people are grateful to have a definitive centralized source of state information that can yield new insights. Data now being drawn from the state’s Medicare system, for example, showed a $25,000 difference in the cost of hip replacement surgery between two neighboring counties.
“People are now making better, more informed decisions because we’ve put all this state data in one place where they can get access to it,” Mingl said.
Los Angeles runs one of the best open data portals in the nation. It ranks first on the U.S. City Open Data Census, with nearly 100 percent of the city’s data open to the public. It’s not perfect, but what it has, it gained through the knowledge of the city’s experienced workers.
Ted Ross, general manager of L.A.’s Information Technology Agency, said the city wanted three things from its portal: a way for average citizens to view data casually, capabilities for data scientists who wanted to do more with the data, like download it or use APIs, and the ability to integrate federated data sets from across systems. Contracting a vendor was the easiest way to reach those goals, Ross said, so rather than develop the portal in-house, that’s what Los Angeles did.
The city listens to the people who use data most to guide its efforts: journalists, researchers, officials and technology staff, Ross said. This feedback ensures the city’s doing more than fulfilling a political mandate, he said.
L.A. has done more with its data than leave it dangling. Vision Zero, a multinational road safety program, promotes roadway design to reduce pedestrian injury and death, and it’s powered by the city’s open data.
“We worked with USC, who volunteered about 25 graduate-level data science students and three professors, and we basically analyzed for causation and commonality, and trends relating to those, and they can help identify some of the high-value networks,” Ross said. “That’s a prime example of taking open data and ... using it as a platform to interact with a local university and actually identify information and insight that’s being leveraged to save lives.”
Open data doesn’t need to save lives — and it usually won’t. Its value is in supporting the core functions of government, which are basic things like keeping parks and water clean and trash cans empty, said Josh Baron, applications delivery manager for Ann Arbor, Mich., and that should be the goal of everyone who works in government.
“Our No. 1 job,” Baron said, “is to support the lines of business who are out there making the city a wonderful place to live.”