Special Report: Can Governments Keep Up Their Open Data Initiatives?

Open data may be cool, but is it useful, sustainable and cost effective?

by Wayne Hanson / May 30, 2012
Image from Wikipedia.org

About This Report

This report is based on the activities of the Digital Communities program, a network of public- and private-sector IT professionals who are working to improve local governments’ delivery of public service through the use of digital technology. The program — a partnership between Government Technology and e.Republic’s Center for Digital Government — consists of task forces that meet online and in person to exchange information on important issues local government IT professionals face.

More than 1,000 government and industry members participate in Digital Communities task forces focused on digital infrastructure, law enforcement and big city/county leadership. The Digital Communities program also conducts the annual Digital Cities and Digital Counties surveys, which track technology trends and identify and promote best practices in local government.

Digital Communities quarterly reports appear in Government Technology magazine in March, June, September and December.



Open data may be cool, as some tech publications put it, but is it useful, sustainable, cost effective? After all, with an Internet full of data and information, how many people really are interested in machine-readable data sets of city or county expenditures, or the locations of public toilets, boat ramps or building footprints?

Most Americans first heard of open data a few years ago when President Barack Obama put stimulus finances online to avoid some of the problems that arise when tax money starts gushing into government programs opposed by political opponents — and thus subject to intense scrutiny, spin and the making of political hay.

About that time, Vivek Kundra — then-CTO of Washington, D.C. — decided to democratize district data, saying — in a Government Technology article — that he did it for several reasons: “No. 1 was to drive transparency; No. 2 was to engage citizens; No. 3 was to ensure that we were lowering the cost of government operations.” Kundra put hundreds of data sets onto the city’s website so the general public could easily access real-time feeds — in XML and other formats — of everything governmental. The district also sponsored the Apps for Democracy contest, which gave prizes to contestants who integrated the data feeds into useful apps. New York City and other cities followed suit, and open data seemed like a big deal.

Kundra took open data with him when he moved from the district in 2009 to become the federal government’s first CIO, and Aneesh Chopra — Virginia’s secretary of technology and another enthusiastic government transparency advocate — was named federal CTO.

But developments began to cast doubt on open data’s sustainability. While Kundra launched the federal government open data site Data.gov, his successor in the District of Columbia announced that Apps for Democracy may be “more cool than useful” to citizens of the district, and so the program would be allowed to drop off the twig.

And even though the federal government’s Data.gov expanded to include more data sets, the lights dimmed in 2011, when funding for the program was slashed by 75 percent. So it evidently wasn’t high enough on the priority list to weather the nation’s budget crisis. Of course, teachers, police officers and many others across the country suffered a similar fate, so the jury is still out on what really matters. And while they left a legacy of transparency and IT innovation, both Chopra and Kundra moved on to other things.

According to Data.gov — which may be a bit outdated now after its funding was cut — 31 states and 15 cities maintain open data sites. Are they still “engaging citizens, driving transparency and ensuring we are lowering the cost of government operations” or are they doomed to fade into the sunset like so many other cool ideas?

We pick up the story there, as the U.S. emerges from the recession, and the baton passes from Kundra and Chopra to local governments infused with an enthusiasm for transparency, open government and open data.


Trust and Transparency

In a cynical age, in a highly polarized political climate, under the duress of a recession, trust in government has fallen to historic lows. Recently, according to The New York Times, trust in the federal government dropped to single digits. Public trust in state and local government is a bit better, according to a Gallup poll, but in such a climate, reports of government secrecy and abuse of power strike an ugly chord that can resonate broadly.

In 2010, for example, it was discovered that the city of Bell, Calif. — a Los Angeles suburb with about 40,000 residents — paid a $1.5 million annual compensation package to its city manager and enormous salaries for several other officials. Angry residents besieged City Hall. Staff members were recalled and indicted, and California now requires all city and county governments to report employee salaries, which are then posted on the Internet.

Contracts — in which taxpayer funds are used to purchase goods and services — have been another contentious issue that begs for transparency. According to an OMB Watch report last year, states are leading in that regard, while an audit of 10 federal agencies showed that none disclosed the terms of contracts. On the other hand, local jurisdictions like Frederick, Md.; Cook County, Ill.; and Louisville, Ky., have gone so far as to upload even their checkbooks to public view.

But building trust is only the first level of engagement for transparency. There is utility in data once it’s made available, and nothing delivers utility like open data that can be captured, manipulated, compared, graphed, mapped and so on. Just as used items can be recycled into new products, data captured for one government purpose can be extracted, evaluated and compared, resulting in new ideas that reveal opportunities and problems heretofore overlooked. And there is even a higher calling for open data: democracy itself.

San Diego Gets Results from 24-Hour Hackathon

Pet lovers, allergy sufferers and fitness enthusiasts are just some of the groups that stand to benefit from a 24-hour hackathon held in March in San Diego, aimed at encouraging software developers to come up with the next great city app. The hackathon was one component of the broader AT&T San Diego Apps Challenge, launched by Mayor Jerry Sanders. Developers competed for a share of $50,000 in prize money donated by challenge sponsors, including San Diego Gas & Electric, CONNECT and CleanTech San Diego. “[The Apps Challenge] fits right in with San Diego’s entrepreneurial, innovative spirit, and we know our tech-savvy population will make it a huge success,” said Sanders when he announced the contest in January. — Noelle Knell

“The people of America need to start thinking of themselves as citizens again,” said Jennifer Pahlka, executive director of Code for America. “And if you’re a citizen, it’s not just about the benefits you get, but also about the responsibilities.”

For instance, Pahlka cited an open source software app that helps citizens locate and dig out fire hydrants when they are covered with snow. “Hawaii modified the app to use for testing tsunami sirens and Seattle is going to use it to clean leaves out of storm drains. Other cities are going to roll it out for a variety of things.”

Pahlka points out that one Code for America volunteer said: “I’m a citizen by choice here. We’re immigrants, and this country has been good to me and I believe the skills that I have as a Web developer are what I should be giving back to my country.”

From Electronic to Engaged

“E-government” has a long history in government, from the first websites modeled after the telephone book, to today’s sites with archived videos of city council meetings, searchable legislation, online payment, email contact information, even the jurisdiction’s checkbook register listing specific expenditures and salaries. The data is viewable with any Web browser, and can be copied, pasted or printed.

Open data is a more technical approach to government data with some big advantages. A programmer, for example, can dip into the content of an open data site, and instead of just copying and pasting data from it, can use an application programming interface (API) to link to live data sets on the site. These open data sets — because of the API — are machine-readable, meaning they can be manipulated by computer, merged with other data sets, mapped, etc., revealing new information and insight — like how many minutes it will take for the next bus to arrive, or even such complex subjects as cross-linking political contributions with successful lobbying and legislation efforts.
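To make "machine-readable" concrete, here is a minimal Python sketch of the kind of merge a programmer might perform. It assumes two small JSON payloads shaped like what an open data API could return; the field names (route, stop, minutes_away, name) and the sample values are hypothetical and are not taken from any city's actual feed.

```python
import json

# Hypothetical JSON payloads, shaped like what an open data API might return.
# Field names and values are illustrative only, not from a real city feed.
ARRIVALS_JSON = (
    '[{"route": "10", "stop": "S1", "minutes_away": 4},'
    ' {"route": "22", "stop": "S2", "minutes_away": 11}]'
)
STOPS_JSON = (
    '[{"stop": "S1", "name": "Main St & 1st Ave"},'
    ' {"stop": "S2", "name": "City Hall"}]'
)


def merge_on_key(left, right, key):
    """Join two lists of records on a shared key, keeping fields from both."""
    right_by_key = {row[key]: row for row in right}
    merged = []
    for row in left:
        match = right_by_key.get(row[key])
        if match is not None:
            # Fields from the left record win if both sides share a name.
            merged.append({**match, **row})
    return merged


def load_and_merge(arrivals_json, stops_json):
    """Parse both payloads and join arrivals to stop descriptions."""
    arrivals = json.loads(arrivals_json)
    stops = json.loads(stops_json)
    return merge_on_key(arrivals, stops, "stop")


if __name__ == "__main__":
    for row in load_and_merge(ARRIVALS_JSON, STOPS_JSON):
        print(f"Route {row['route']} reaches {row['name']} in {row['minutes_away']} min")
```

In practice the two strings would come from live API requests rather than constants. The point of the sketch is only that structured records can be joined by a program, which is not possible with a printed report or a scanned PDF.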

For instance, Seattle’s open data site contains information on Fire Department emergencies, which has been turned into useful applications by citizen developers. “One application takes the fire feed and shows on a smartphone where the fires are,” said former CIO Bill Schrier, who now directs e.Republic’s Digital Communities program. “We had a couple of one- or two-day hackathons where developers used the data and built those applications.” Schrier suggested looking at http://dev.socrata.com/gallery (a gallery of applications) to see several examples built using data feeds from data.seattle.gov, hosted by Socrata, a Seattle-area company.

Many law enforcement sites carry photos of sex offenders and their approximate addresses. Through open data applications, this information can be linked to Google Maps to display offenders’ proximity to parks and schools. Google Maps was built with an API and is thus very useful for open data applications. The federal government’s Data.gov site, Data.seattle.gov and other state and local open data sites also are built with APIs. It’s an approach that makes it easier for agencies to share their data.
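The proximity check behind such a map can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular application works: the record layout (id, name, lat, lon) and the coordinates are hypothetical, and distance is computed with the standard haversine great-circle formula rather than a mapping service.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(points, landmarks, radius_km):
    """Return (point_id, landmark_name, distance_km) for each pair within radius_km."""
    hits = []
    for point in points:
        for landmark in landmarks:
            distance = haversine_km(point["lat"], point["lon"],
                                    landmark["lat"], landmark["lon"])
            if distance <= radius_km:
                hits.append((point["id"], landmark["name"], round(distance, 2)))
    return hits


if __name__ == "__main__":
    # Hypothetical records; a real application would load these from open data feeds.
    points = [{"id": "A", "lat": 47.6062, "lon": -122.3321},
              {"id": "B", "lat": 47.7000, "lon": -122.3300}]
    landmarks = [{"name": "Example Park", "lat": 47.6070, "lon": -122.3330}]
    print(within_radius(points, landmarks, radius_km=0.5))
```

A mapping API would then plot the matching pairs. The comparison itself needs only coordinates, which is why published location data is so readily reused.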

“Government has enterprise data management systems optimized for collecting and capturing and processing data in a course of conducting business,” said Socrata founder and CEO Kevin Merritt. “Open data is the opposite side of the same coin, which is basically making that data useful and acceptable to people who can extract value from it.”

Code-a-Thon Tackles Health-Care

IT developers descended on Blacksburg, Va., in April to compete for cash prizes at the Hokie Health Code-a-thon. Home to Virginia Tech, this region in southwest Virginia also boasts many prominent health-care organizations. The two-day event challenged teams of student and professional software developers to come up with the next great innovation to improve health care. The Code-a-thon was part of a series from Health 2.0, which hosts similar events throughout the United States, Europe and Asia. At the event, former U.S. CTO Aneesh Chopra headlined a panel of experts delivering presentations relating to the role of technology in modern health care.

Those people include programmers who prefer a “programmatic interface” to information, said Merritt, as well as researchers, journalists and others who have systems to deal with masses of information.

Individuals who want to look something up or who want to make sense of government data, but are not technically sophisticated, are the beneficiaries of open data applications that deliver information in usable form. For example, users may be homebuyers who want to evaluate neighborhood schools and crime rates, or people who want to track voting records of their elected officials or watch government spending.

Another advantage of open data is that it can provide real-time information, such as when the next bus will arrive.

“If you think about the evolution of software development, over the last five years, most modern Web developers will tell you that the majority of the work they do is in mashup development,” Merritt said. “They are cobbling and stitching together applications by weaving APIs from disparate sources and disparate applications. Google Maps is the most common mashup out there and the reason it is the most common is that it led with an API strategy long ago. And developers are recognizing now that, ‘Hey if you wrap an API around this data, I’ll start to build some interesting things with it.’” Open data done right, he added, can tell a story about what that data means.

John Wonderlich, policy director for the Sunlight Foundation, said open data is helping to reshape e-government. “There have been a lot of efforts in e-government in the past decade and a half. Efforts to index information or get it to show up on Google search, and there was a lot of ‘let’s get more things online, because we want it to be visible to the Web.’ So it’s not brand new,” he said. “I think the newer thing is the way in which this information can be reused, analyzed and republished, and not just by specialists, but also by amateur or nonprofessional developers who are making tools or websites to use public information. So the new and disruptive thing is that there is suddenly a lot more use and appreciation for the kinds of reuse that are possible by putting things online.”

San Francisco CIO Jon Walton agreed. “Back when we started focusing on e-government in the late ’80s and early ’90s, it was about automation,” he said. “Taking manual systems or maybe old mainframe systems, and trying to come up with a website or an automated way to do it. That was e-government … open data for me is not just about automating manual processes, but is an example of how you engage with the citizen. ‘E’ before was ‘electronic.’ E now is about ‘engagement.’ That’s the contribution I think that technology now provides to government.”

Photo: Jon Walton, CIO, San Francisco. Photo by David Kidd.

Open Data in the Big Apple

In March, New York City Mayor Michael Bloomberg signed open data legislation, saying that the NYC BigApps 3.0 open data competition had created dozens of useful programs to help New Yorkers do everything from pick a restaurant to find a parking space. “At the contest’s core is a simple premise,” said Bloomberg. “This data belongs to the public, and if we make it accessible to everyone, the possibilities are limitless.”

The city’s open data site — which now gets 2,000-3,000 visits per day in addition to those accessing data through the APIs — grew from about 100 data sets to nearly 1,000 in the last 2.5 years, and now includes a data set of nearly 20 million requests to the city’s 311 nonemergency service system from 2004 to 2011, showing the date, time, originating agency, location, status and contextually relevant details about each 311 request.

While detailed government data may give notoriously cantankerous New Yorkers more ammunition, former CIO Carole Post said transparency and openness of government are essential. “There’s now legislation in place, and I don’t see this going anywhere but continuing to advance and to evolve.”

Take Aim at Open Data, Social Media

This November, voters in 37 states will have an easier time finding election information via social media and mobile devices, thanks to the Voting Information Project (VIP), an initiative of the Pew Center on the States, Microsoft, AT&T, Foursquare, Google, state elections offices, media partners and others. VIP takes state election information, translates it into an open programming format and organizes it into application programming interfaces (APIs), so that developers can create user-friendly apps from the data. “There’s polling place information, there’s what’s on the ballot, and before the November elections there will be location-specific ‘rules of the road,’” said Pew Senior Associate Matthew P. Morse. “Voters retrieve information specific to them through their addresses.” — Wayne Hanson

New York City modeled its data and apps contest after Washington, D.C.’s Apps for Democracy. But city officials say New York deserves “pioneer status” given the city’s size, scale and complexity. And though the city could have been satisfied with posting static PDFs, the decision was to use APIs to make the data live and breathe.

The city may be a pioneer in another respect as well. According to Nicholas T. Sbordone, spokesman for New York’s Department of Information Technology and Telecommunications (DoITT), the first deliverable under Bloomberg’s open data legislation is a data set policy and technical standards document. “In the spirit of open data,” said Sbordone, “what we’re doing is opening up the process of developing those standards via wiki, available at www.nyc.gov/datastandards.” The city also organized three related events during Internet Week in May.

San Francisco Opens Up

Walton said that San Francisco’s open data program was launched to address two problems. First, the public was frustrated at a perceived lack of openness and transparency. And second, the city’s data was simply hard to find.

“A lot of data was already available but it was available on paper, or in a specific format or a CD which has limited use, and other times, it was just hard to find — we’re a very large organization and every department had its own website, who to call and how to drill down on the website. So even if the data was available, it was challenging.

“So initially, open data was just to organize and open our data to the public,” Walton said. “We just created a portal, we surveyed all the departments and said ‘What data are you already providing and can you provide that in digital format rather than printing it out?’ If so, we’re going to post it on this simple portal in machine-ready format.” Those basic actions began the process of expanding public engagement, he said.

“We did things like creating a form where [citizens] could ask us questions about the data if they didn’t understand it. A lot of the public knows their own piece of the data, and they could give us feedback on when they thought the data was incorrect. Then they might say, ‘I looked at all the data sets and there’s this one piece of data that’s valuable, can you make that one available as well?’ So that allowed us to become the facilitator to the public. It opened up the dialog and made it much more collaborative.”

Walton said he thought open data would become more sustainable over time but was surprised at the immediate interest. “I knew people would use the data. I didn’t realize community groups and small startup firms would take that data and use it as a building block of part of an application. I was amazed. We put up 200 data sets, and almost immediately applications were being generated.”

Ordinarily, if the public demanded a certain application, the city would have done an RFP, asked for money and hired a consultant, Walton said. “And two years later, maybe we would have had an application.” Now, the city is providing the tools to the public to meet its own needs.

Safety Data Community

As this magazine went to press, the Obama administration was preparing to launch safety.data.gov, a “safety data community,” designed to “develop and deploy a range of digital tools and mobile applications to enhance public and product safety.” An introductory blog said the site would be a “one-stop shop for government safety data,” creating a “cross-sector, collaborative safety portal that better fulfills the needs of the public, private, and civil sectors.”

This increased coordination and transparency is intended to make it easier to compile safety information from multiple federal sources, including Justice, Labor and Occupational Safety and Health Administration, Health, the Consumer Product Safety Commission, and others, focusing on incident data, enforcement actions, product safety and exposure data. The site will operate on principles of open data and will have social functionality like blogs, forums, wikis and rating systems.

The open data site also will create “a one-stop shop for analytical software and decision support tools that agencies make publicly available,” according to the blog.

“For example, the site links to the Pedestrian and Bicycle Crash Analysis Tool from walkinginfo.org. This tool is designed to assist state and local pedestrian/bicycle coordinators, planners and engineers with improving walking and bicycling safety through a comprehensive database of crashes between motor vehicles and pedestrians or bicycles.

“Another tool linked through Safety.Data.Gov is the FRA Web Accident Prediction System,” continued the blog, “which provides access to railroad safety information, including accidents and incidents and highway-rail crossing data. From this site users can run dynamic queries and view current statistical information on railroad safety.”

According to the administration, potential uses for the site and the data include:

  • Finding ways to visually inspect satellite imagery for roadway characteristics at fatal crash sites.
  • Calculating economic costs of transportation safety issues, especially in terms of system safety.
  • Enhancing bike safety reporting with a crowdsourcing tool to self-report bike crashes or dangerous conditions.

Transit data is online, for example, so a programmer can link public transit timetables and routes with GPS data and create an app. A quick glance at the iPhone app store shows many San Francisco transit applications including “Routesy” in both free and pay versions that track routes and schedules with station arrival times.
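The core calculation in an arrival-time app is simple enough to sketch in Python. The version below is a rough illustration under stated assumptions, not the logic of any app named here: the timetable is a hypothetical list of "HH:MM" strings for a single stop, and the delay_minutes value stands in for whatever lateness estimate a developer might derive from a vehicle's GPS position.

```python
from datetime import datetime, timedelta


def minutes_until_next(timetable, now, delay_minutes=0):
    """Return whole minutes until the next expected arrival, or None if none remain.

    timetable     -- scheduled arrivals for today as "HH:MM" strings (hypothetical format)
    now           -- current time as a datetime
    delay_minutes -- assumed lateness, e.g. estimated from a GPS feed
    """
    delay = timedelta(minutes=delay_minutes)
    upcoming = []
    for hhmm in timetable:
        scheduled = datetime.combine(now.date(),
                                     datetime.strptime(hhmm, "%H:%M").time())
        expected = scheduled + delay
        if expected >= now:
            upcoming.append(expected)
    if not upcoming:
        return None
    wait = min(upcoming) - now
    return int(wait.total_seconds() // 60)


if __name__ == "__main__":
    schedule = ["08:00", "08:15", "08:30"]
    current = datetime(2012, 5, 30, 8, 10)
    print(minutes_until_next(schedule, current))                   # on-time service
    print(minutes_until_next(schedule, current, delay_minutes=3))  # running 3 minutes late
```

A production app would also handle service that runs past midnight and per-vehicle delays. The sketch only shows why a published timetable plus a live position feed is enough raw material for a useful tool.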

“That’s an amazing thing, an empowering thing,” Walton said. “And users are self-selecting what’s important. You’re not having to predetermine that or guess. So now we have links to some 60 applications that have been developed. I think there are a couple of challenges going forward, to complete the journey, but even if we stopped here, I would call it a success. And I think it is sustainable as a model. As with anything, the more data you make available, the more useful the open data initiative is, the more applications you can generate.”

One challenge that worries Walton is what happens if a company that developed a popular app stops supporting it, the app becomes unavailable or stops working, and 100,000 users get angry. “What’s my responsibility as the CIO? I feel that if there is an application that is shown to be a benefit to the citizens, I need to try to make that sustainable over the long run. Do you subsidize it, find a way to partner with them, do you buy the code from them?”

One solution for improving sustainability of open data apps may lie in multigovernment partnerships. An informal group of big-city CIOs — known as the G7 (for group of seven) — is launching a website that will house standardized data from member cities, making it easier for them to share applications. These multicity applications could be attractive to investors, said Walton, eliminating the need for government support.

“We’ve started trying to formalize our data between cities ... to agree on data schemas, data models, so that when someone writes an application for San Francisco that same application will work in all the other six cities,” he said. “We think that’s a step in the right direction, as that will create a much broader enthusiasm base.”
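What agreeing on a schema buys developers can be shown with a short Python sketch. Everything in it is hypothetical: the city keys, the per-city field names and the shared target fields (id, name, latitude, longitude) are invented for illustration and do not describe the G7 group's actual data models.

```python
# Hypothetical per-city field names, each mapped onto one shared schema.
# None of these names come from a real city data set.
FIELD_MAPS = {
    "san_francisco": {"stop_id": "id", "stop_name": "name",
                      "lat": "latitude", "lon": "longitude"},
    "seattle": {"StopID": "id", "StopDesc": "name",
                "Y": "latitude", "X": "longitude"},
}


def normalize(city, record):
    """Translate one city-specific record into the shared schema."""
    mapping = FIELD_MAPS[city]
    missing = [source for source in mapping if source not in record]
    if missing:
        raise KeyError(f"{city} record is missing fields: {missing}")
    return {target: record[source] for source, target in mapping.items()}


if __name__ == "__main__":
    sf_record = {"stop_id": "1001", "stop_name": "Market St",
                 "lat": 37.7749, "lon": -122.4194}
    sea_record = {"StopID": "77", "StopDesc": "Pike St",
                  "Y": 47.6097, "X": -122.3331}
    # After normalization, one application can read both records the same way.
    print(normalize("san_francisco", sf_record))
    print(normalize("seattle", sea_record))
```

If cities publish in a common format from the start, the translation table disappears entirely, and an application written for one city runs unchanged against the others.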

The G7 — Chicago, Los Angeles, New York City, San Francisco, Boston, Seattle and Philadelphia — also intends to standardize the domain as city.data.gov. “The cities continue to do what they do best, innovate locally and push their data, but at the same time, have an eye to how this benefits cities and counties all over the country,” Walton said.

The group intends to prototype one or two applications that will work across all the cities. Then the plan is to get developers together to start building applications.

Open Data — Value Delivered?

Open data is seen as a partnership between the jurisdiction, the public and developers. Not surprisingly, civic engagement and collaboration are nearly always cited as the primary value. But are there tangible benefits as well? And what are the costs?

Peter Threlkel from the Oregon Secretary of State’s Office was looking for a way to make the state’s trademark information available online. Staff investigated a proprietary database solution, but instead decided to leverage the state’s existing open data site.

“It’s a small program with about 5,000 active trademarks registered and about 30 new filings a month,” Threlkel said, “so we could not justify spending half a million dollars on the Oracle solution. We used to make the trademark information and images of the filings available as a public records request on CD-ROM for $100 a month to a handful of customers. Now the Socrata solution at data.oregon.gov allows us to make the trademark information available and searchable online for free, so our customers and the public can access the data and images at their convenience.”

Early e-government efforts had to cope with analog records that needed to be run through optical character recognition equipment and then cleaned up for use. Since most government records are now born digital, that’s no longer the case, said Socrata’s Merritt. “Now, the mechanism to get that data from the business system out to a public-facing open-data portal is straightforward and cost effective.”

New York City’s experience seems to align with that assessment. DoITT’s Director of Research and Development, Andrew Nicklin, and Open Data Coordinator Albert Webber were shifted into the open data program. Nicklin, Webber, a small team of city employees and a few support staff did the whole thing in the back office. The site is externally hosted and involved minimal costs.

The return on investment is mostly intangible, although city officials said there is an economic development side of it. The key, however, is citizen engagement, accountability and transparency.

Neither San Francisco nor New York City had hard evidence of a decrease in Freedom of Information Act (FOIA) or discovery requests as a result of open data initiatives, but discussions with public-sector and nonprofit advocates yielded anecdotal evidence that discovery filings were often unnecessary because the data was easily available online.




Apps Help Patients, Doctors Battle Cancer

Two health IT applications designed to help doctors and patients stay a step ahead of cancer each took home a $20,000 prize in a January challenge to develop cancer prevention and treatment apps using public data. Ask Dory, an app that helps users find information about clinical trials for cancer based on data from ClinicalTrials.gov, was one of the winners. The other winner, My Cancer Genome, allows doctors to see a list of therapeutic options for cancer, based on a patient’s tumor gene mutations. In an interview with Government Technology, Wil Yu, special assistant for innovations with the Office of the National Coordinator for Health Information Technology, said the winning apps stood apart from other submissions based on their ability to manipulate cancer data and provide a level of analysis that was both accurate and significant to users. — Brian Heaton

Seeing the Future

What might the future hold for open data initiatives? “I see a lot of struggle around influence and security as contentious issues that will take a lot of work to sort out what should be public and what should not,” said Wonderlich, of the Sunlight Foundation. “Those lines are not drawn appropriately as it stands now. Beyond that struggle, I think is better, more rigorous decision-making around information management. Looking at all the information an agency has and making good decisions about what should be public and what shouldn’t, rather than just a response to a FOIA request. I’m hoping we end up in a place where information is managed, just like HR is managed, or anything else. We make good decisions about what is public and what isn’t, not because of the interest of the department but because there’s a strong and clear public interest behind good decisions about what to make public.”

“We’re on the eve of a period of time in which every government will have an open data portal just like they have a website,” said Merritt. “This is a trend initiated by the District of Columbia and their citywide data catalog, and by the federal government in Data.gov. We now see open data portals in many cities and counties like Baltimore, Austin, New Orleans, Seattle, San Francisco and Chicago; states like Illinois, Oregon, Oklahoma, Washington, Maryland and Hawaii are all sharing their data.”

In a move that some think could signal a revival of Data.gov, and others dismiss as a campaign-year stunt, the Obama administration on March 8 launched ethics.data.gov, which, according to a White House release, “fulfills a campaign promise to centralize ethics and lobbying information for voters.” Jessica McGilvray, the American Library Association’s director for the Office of Government Relations, mourned the demise of the Statistical Abstract of the United States, whose funding disappeared in the 2012 Census Bureau budget, but saw the launch of ethics.data.gov as a sign of progress.


Wayne Hanson

Wayne E. Hanson served as a writer and editor with e.Republic from 1989 to 2013, having worked for several business units including Government Technology magazine, the Center for Digital Government, Governing, and Digital Communities. Hanson was a juror from 1999 to 2004 with the Stockholm Challenge and Global Junior Challenge competitions in information technology and education.
