Editor's note: Read part I of How Government Can Unlock Economic Benefits from Open Data here.
Every local government would love to have double-digit increases in bus ridership, lower service costs, improved wellness in the community and fast-growing firms like Zillow in their backyard. So far, a handful of mostly large cities have made an investment of resources to open broad sets of data for public use and are setting some best practices, if not outright economic improvement. The short list includes Baltimore, Boston, Chicago, Los Angeles, New York, Philadelphia, San Francisco and Seattle. On the county side, Alameda County, Calif., and Montgomery County, Md., stand out as open data innovators. (For more details on local government open data, visit the city and county section of Data.gov and the U.S. City Open Data Census.)
As the open data movement spreads to smaller jurisdictions, the opportunities and challenges of extracting economic value become more pronounced. Riverside, Calif., with a population of 316,619, is on the cusp between a mid-sized and large city. It has been active in opening data sets and engaging potential users. Lea Deesing, the city’s chief innovation officer, who admits to being an open data evangelist, has done all the right things to push open data. Internally, she meets regularly with agency heads to discuss what data sets have value and should be opened to the public. Meanwhile, the city has hosted a range of events, including hackathons, coding forums and start-up weekends, to get the word out about Riverside’s open data. “We have a high-tech ecosystem in Riverside and it seems to be growing,” she said.
Are Hackathons Really Necessary?
Hackathons have become the most demonstrable way of showing that a city or county has opened its data. They have done more to create buzz and excitement about open data than just about anything else out there. The media covers the events regularly, highlighting how coders come into a room and by the end of the day there are apps (or prototypes) available that use government data in a new and innovative way for the benefit of the community. The best apps win prizes and government has a new service tool at minimal cost.
But when it comes to extracting long-term economic and social value from open data, hackathons might not be the best way to go. Stefaan Verhulst, co-founder of GovLab, sees three problems with hackathons. “First, there’s a lot of duplication going on with hackathons, with lots of them solving the same problem over and over. Second, their implementation process focuses on showcasing possible ways of using data without the follow-through needed. Third, they go after problems at the margins. Cities and counties have some very big problems, but most hackathons don’t focus on them,” he said.
Waldo Jaquith, director of the U.S. Open Data Institute, said hackathons can be useful in building up community, connecting people and testing how they can use data in a lab-like setting. “But beyond that, they are almost always a waste of time,” he said. His particular peeve is that the vast majority of people who show up at hackathons are coders and app developers, not experts in the particular program from which the data has come. “Those are the people you want involved in deciding the smartest way to use open data, not coders.”
What’s not so clear is the economic impact of Riverside’s open data. “It’s not easy to calculate the formula for economic development for open data at this point. It’s really difficult,” said Deesing.
Two high-value data sets are permitting and geospatial data. Deesing said both have economic value that could come in the form of lower costs (through shared permitting information between contractors and subcontractors) and new business potential (GIS data for developers). Although Riverside doesn’t have a formal policy on open data, open data will become part of the city’s next IT strategic plan and has the support of city leaders. At the operational level, published data sets are kept fresh and actively updated.
But the concept is still new and Deesing said she keeps up a constant drumbeat of talks and presentations to help sustain the momentum. “We need more executive understanding in government and in industry about this,” she said.
Even big cities, with plenty of resources and experience with open data, are finding that reaching the level of “trend setter” isn’t always easy. In 2012, New York City passed what was considered a ground-breaking open data law designed to give the public free access to data that was once locked up. But by 2014, things were not going well. Open data advocates and city officials complained the data sets were “proving to be messy, incomplete and in some cases useless in the format in which it is presented,” reported The New York Times.
Data from the New York Police Department has been the most popular, but it was also the most complicated to use when it first came out. Many of the statistics on crime and crashes were in PDF and Excel formats, technically in compliance with the new law but not in a computer-readable format, complained some critics.
Today, however, the situation has changed for the better, said Gale Brewer, Manhattan Borough president and a longtime advocate for open data in the city. With 1,300 data sets now accessible, Mayor Bill de Blasio’s administration, including the police department, has made major strides in opening up city information, according to Brewer.
So far, most of the progress in the city has been focused on improving civic life. Brewer cited a recent hackathon that led to the creation of an app that helps city residents accurately measure their heat and water usage. She is also setting up a training program so the volunteers who run Manhattan’s more than 600 community boards can learn how to use the open data sets to make better decisions that affect a range of issues, from the disabled and children to health care and schools.
Nevertheless, the economic impact of open data on New York City remains anecdotal, as it does in other cities and counties. While McKinsey and Capgemini have tried to measure the value at the national level, little is known at the local level. “There’s a growing awareness, but lots needs to be done in terms of measuring how much progress has been made,” said GovLab’s Stefaan Verhulst.
Part of the problem is that what constitutes economic value is so diffuse. There are the firms that use the data directly, such as Zillow, and create new lines of business, new revenue and new jobs. But there’s also the value that is created indirectly. A person who uses a transit app that’s driven by a city’s open data and switches from driving a car to riding a bus could end up saving time and money. How do you capture that value and put a price tag on it? While it can be done, it’s not easy and government has other competing priorities for its limited resources.
But finding a way to show where the value lies in open data is critical to its success. If the value isn’t identified and measured, government officials who decide how to spend tax dollars will be less willing to make a long-term investment toward sustaining open data. As Verhulst explained, evidence is needed to show that open data is worth the effort. “If that’s not done, it is going to get harder to keep the movement accelerating,” he said.
Already, there’s been some discussion of an open data bubble, with too many sets published without the use and participation that would warrant long-term investment. “I don’t think open data is in danger of disappearing, but I have seen data portals set up and then certain functions being turned off,” said Wendy Carrara, a senior policy adviser with Capgemini.
Want to Open Your Data? There’s a Vendor for That
Cities and counties have lots of data they would like to open up. The problem is that most don’t have the resources, including manpower and know-how, to do it themselves. So it comes as no surprise that a host of companies, some legacy IT vendors, some Internet startups, have stepped up to offer products and services to make the job of opening data a bit easier.
Companies most directly connected with the open data movement are ones that offer data portal platforms. Junar, based in Dallas and Silicon Valley, is a cloud-based data platform company, which simplifies the data publishing process, according to its website. The firm has worked with a number of local governments, including Palo Alto, Pasadena and Sacramento, all in California.
CKAN is an open source solution offered by the Open Knowledge Foundation, a nonprofit group that advocates for free and open data around the world. CKAN (which stands for Comprehensive Knowledge Archive Network) is aimed at helping governments to manage, publish and share data. Others include DKAN, the Open Government Platform and Socrata, all of which embed certain services in their platforms, including data management, content management, data publishing, data discovery and workflow.
Socrata is perhaps the best known among the cloud-based data platform providers. Started in 2007 by founder Kevin Merritt, Socrata today has more than 70 government customers around the globe, including New York City, Seattle and San Francisco, as well as numerous counties. Governments pay Socrata a monthly subscription fee to use its platform, which can provide a range of services, including a Web site for interacting with government data, data hosting, application programming interfaces for third-party developers and tools for custom information products.
Safoun Rabah, Socrata’s vice president of product, likes to speak metaphorically when explaining the importance of open data to governments and society. “Data is the fuel for innovation and new apps,” he said. It also provides transparency, helps with business compliance, makes government work better and improves the citizen experience. “But government can’t do it all by itself. Socrata can help manage the flow of data, provide the portfolio of apps that make data consumable, as well as the know-how governments need,” he said.
Carrara’s colleague, Dinand Tinholt, spoke of a tipping point. “There may be a shakeout with some data that isn’t useful,” he said. “Open data could lose its buzz.”
To avoid losing the buzz, local governments have to get smarter about open data. “Cities need to focus on the best data sets. Less is more when it comes to putting value into open data,” said Tinholt. That should make it a bit easier to find the evidence that is needed to not only show open data initiatives can work, but which data sets have value and what that value is worth. When government learns how to prioritize what data to release, it can maximize the potential. “That’s important because prioritized data sets can get some traction going economically,” he added.
In addition to keeping the data fresh, machine-readable and available to a wide group of potential users, there are several other, more elusive goals to making open data economically viable. One is to set standards to ensure data is interoperable. Right now, there are thousands of standards out there, making open data interoperability problematic. Most experts agree that the number of standards needs to be reduced so that it becomes easier to connect different types of data around a common criterion, such as geolocation. “Data becomes more valuable when you can link it with other data sets,” said Verhulst. “It can result in insights you wouldn’t have had before.”
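To make the idea of linking data sets around a common criterion concrete, here is a minimal Python sketch. It is an illustration only: the records, field names and the helper functions grid_cell and link_by_location are hypothetical and do not come from any city's actual portal. It assumes two data sets, housing listings and transit stops, that share nothing except latitude and longitude, and joins them by rounding each coordinate pair to a coarse grid cell.

```python
from collections import defaultdict


def grid_cell(lat, lon, precision=2):
    """Round a coordinate pair to a coarse grid cell that serves as a shared join key."""
    return (round(lat, precision), round(lon, precision))


def link_by_location(listings, stops, precision=2):
    """Return a copy of each listing with the names of stops in the same grid cell."""
    # Index the transit stops by grid cell.
    stops_by_cell = defaultdict(list)
    for stop in stops:
        cell = grid_cell(stop["lat"], stop["lon"], precision)
        stops_by_cell[cell].append(stop["name"])

    # Look up each listing's cell in the index.
    linked = []
    for listing in listings:
        cell = grid_cell(listing["lat"], listing["lon"], precision)
        linked.append({**listing, "nearby_stops": stops_by_cell.get(cell, [])})
    return linked


# Hypothetical records, invented for illustration only.
listings = [
    {"address": "100 Example St", "lat": 33.9534, "lon": -117.3962},
    {"address": "200 Sample Ave", "lat": 33.9801, "lon": -117.3741},
]
stops = [
    {"name": "Stop A", "lat": 33.9541, "lon": -117.3958},
    {"name": "Stop B", "lat": 33.9205, "lon": -117.4412},
]

linked = link_by_location(listings, stops)
for row in linked:
    print(row["address"], row["nearby_stops"])
```

Rounding to a grid is a deliberately crude stand-in for a shared standard: two points that sit close together but on opposite sides of a cell boundary will not be matched, so a real project would more likely rely on a proper spatial join or an agreed identifier such as a parcel or stop ID. The point of the sketch is simply that once both publishers express location the same way, the link takes a few lines of code.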
Local governments also need to be active users of their own data. Not just to gain the benefits that come from an expansive view of how data can improve services and operations, but to understand how data works and to use that knowledge with companies and organizations that could benefit from it. Verhulst calls that “data literacy” and said local governments need to be more data literate so they can make the data more user-friendly. It’s the process that the NYPD went through — instead of posting data in PDF format, it learned how to make police information more accessible. Today, NYPD data is among the most popular data sets in New York.
Finally, local governments need to start thinking about data in a more collaborative, rather than competitive way. Data sharing shouldn’t stop at jurisdictional borders, said Daniel Castro. He points to transit and housing information as good examples of data sets that have more value when used in a collaborative fashion. The payback is in more business opportunities for the Zillows out there. “Local governments will have to accept the challenge of working through the fact that some of this stuff will be outside their control,” he said.