Open data may be cool, but is it useful, sustainable and cost effective?
This report is based on the activities of the Digital Communities program, a network of public- and private-sector IT professionals who are working to improve local governments’ delivery of public service through the use of digital technology. The program — a partnership between Government Technology and e.Republic’s Center for Digital Government — consists of task forces that meet online and in person to exchange information on important issues local government IT professionals face.
More than 1,000 government and industry members participate in Digital Communities task forces focused on digital infrastructure, law enforcement and big city/county leadership. The Digital Communities program also conducts the annual Digital Cities and Digital Counties surveys, which track technology trends and identify and promote best practices in local government.
Digital Communities quarterly reports appear in Government Technology magazine in March, June, September and December.
Open data may be cool, as some tech publications put it, but is it useful, sustainable and cost effective? After all, with an Internet full of data and information, how many people really are interested in machine-readable data sets of city or county expenditures, or the locations of public toilets, boat ramps or building footprints?
Most Americans first heard of open data a few years ago when President Barack Obama put stimulus finances online to avoid some of the problems that arise when tax money starts gushing into government programs opposed by political opponents — and thus subject to intense scrutiny, spin and the making of political hay.
About that time, Vivek Kundra — then-CTO of Washington, D.C. — decided to democratize district data, saying — in a Government Technology article — that he did it for several reasons: “No. 1 was to drive transparency; No. 2 was to engage citizens; No. 3 was to ensure that we were lowering the cost of government operations.” Kundra put hundreds of data sets onto the city’s website so the general public could easily access real-time feeds — in XML and other formats — of everything governmental. The district also sponsored the Apps for Democracy contest, which gave prizes to contestants who integrated the data feeds into useful apps. New York City and other cities followed suit, and open data seemed like a big deal.
Kundra took open data with him when he moved from the district in 2009 to become the federal government’s first CIO, and Aneesh Chopra — Virginia’s secretary of technology and another enthusiastic government transparency advocate — was named federal CTO.
But developments began to cast doubt on open data’s sustainability. While Kundra launched the federal government open data site Data.gov, his successor in the District of Columbia announced that Apps for Democracy may be “more cool than useful” to citizens of the district, and so the program would be allowed to drop off the twig.
And even though the federal government’s Data.gov expanded to include more data sets, the lights dimmed in 2011, when funding for the program was slashed by 75 percent. So it evidently wasn’t high enough on the priority list to weather the nation’s budget crisis. Of course, teachers, police officers and many others across the country suffered a similar fate, so the jury is still out on what really matters. And while they left a legacy of transparency and IT innovation, both Chopra and Kundra moved on to other things.
According to Data.gov — which may be a bit outdated now after its funding was cut — 31 states and 15 cities maintain open data sites. Are they still “engaging citizens, driving transparency and ensuring we are lowering the cost of government operations” or are they doomed to fade into the sunset like so many other cool ideas?
We pick up the story there, as the U.S. emerges from the recession, and the baton passes from Kundra and Chopra to local governments infused with an enthusiasm for transparency, open government and open data.
In a cynical age, in a highly polarized political climate, under the duress of a recession, trust in government has fallen to historic lows. Recently, according to The New York Times, trust in the federal government dropped to single digits. Public trust in state and local government is a bit better, according to a Gallup poll, but in such a climate, reports of government secrecy and abuse of power strike an ugly chord that can resonate broadly.
In 2010, for example, it was discovered that the city of Bell, Calif. — a Los Angeles suburb with about 40,000 residents — paid a $1.5 million annual compensation package to its city manager and enormous salaries for several other officials. Angry residents besieged City Hall. Staff members were recalled and indicted, and California now requires all city and county governments to report employee salaries, which are then posted on the Internet.
Contracts — in which taxpayer funds are used to purchase goods and services — have been another contentious issue that begs for transparency. According to an OMB Watch report last year, states are leading in that regard, while an audit of 10 federal agencies showed that none disclosed the terms of contracts. On the other hand, local jurisdictions like Frederick, Md.; Cook County, Ill.; and Louisville, Ky., have gone so far as to upload even their checkbooks to public view.
But building trust is only the first level of engagement for transparency. There is utility in data once it’s made available, and nothing beats utility like open data that can be captured, manipulated, compared, graphed, mapped and so on. Just as used items can be recycled into new products, data captured for one government purpose can be extracted, evaluated and compared, resulting in new ideas that reveal opportunities and problems heretofore overlooked. And there is even a higher calling for open data: democracy itself.
San Diego Gets Results from 24-Hour Hackathon
Pet lovers, allergy sufferers and fitness enthusiasts are just some of the groups that stand to benefit from a 24-hour hackathon held in March in San Diego, aimed at encouraging software developers to come up with the next great city app. The hackathon was one component of the broader AT&T San Diego Apps Challenge, launched by Mayor Jerry Sanders. Developers competed for a share of $50,000 in prize money donated by challenge sponsors, including San Diego Gas & Electric, CONNECT and CleanTech San Diego. “[The Apps Challenge] fits right in with San Diego’s entrepreneurial, innovative spirit, and we know our tech-savvy population will make it a huge success,” said Sanders when he announced the contest in January. — Noelle Knell
“E-government” has a long history in government, from the first websites modeled after the telephone book, to today’s sites with archived videos of city council meetings, searchable legislation, online payment, email contact information, even the jurisdiction’s checkbook register listing specific expenditures and salaries. The data is viewable with any Web browser, and can be copied, pasted or printed.
Open data is a more technical approach to government data with some big advantages. A programmer, for example, can dip into the content of an open data site, and instead of just copying and pasting data from it, can use an application programming interface (API) to link to live data sets on the site. These open data sets — because of the API — are machine-readable, meaning they can be manipulated by computer, merged with other data sets, mapped, etc., revealing new information and insight — like how many minutes it will take for the next bus to arrive, or even such complex subjects as cross-linking political contributions with successful lobbying and legislation efforts.
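To make "machine-readable" concrete, here is a minimal sketch in Python of the bus-arrival idea. It does not call any real city's API: the sample feed, its field names ("route", "stop", "arrival") and the function name are all assumptions made for illustration. The point is only that a program can read structured records and compute an answer, rather than a person copying and pasting from a Web page.

```python
import json
from datetime import datetime

# Hypothetical sample of what a transit feed might return. The field names
# and values are invented for this sketch, not taken from a real open data site.
SAMPLE_FEED = """
[
  {"route": "7", "stop": "3rd & Pine", "arrival": "2012-06-01T08:15:00"},
  {"route": "7", "stop": "3rd & Pine", "arrival": "2012-06-01T08:27:00"},
  {"route": "43", "stop": "3rd & Pine", "arrival": "2012-06-01T08:09:00"}
]
"""


def minutes_until_next_bus(feed_text, route, now):
    """Return whole minutes until the next arrival on `route`, or None."""
    records = json.loads(feed_text)
    waits = []
    for rec in records:
        if rec["route"] != route:
            continue
        arrival = datetime.fromisoformat(rec["arrival"])
        if arrival >= now:
            # Keep only arrivals that have not happened yet.
            waits.append((arrival - now).total_seconds() / 60)
    return int(min(waits)) if waits else None


now = datetime(2012, 6, 1, 8, 5)
print(minutes_until_next_bus(SAMPLE_FEED, "7", now))
```

In a real application the sample text would be replaced by a response fetched from the site's API, and the field names would follow whatever schema that site publishes.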
For instance, Seattle’s open data site contains information on Fire Department emergencies, which has been turned into useful applications by citizen developers. “One application takes the fire feed and shows on a smartphone where the fires are,” said former CIO Bill Schrier, who now directs e.Republic’s Digital Communities program. “We had a couple of one- or two-day hackathons where developers used the data and built those applications.” Schrier suggested looking at http://dev.socrata.com/gallery, a gallery of applications, to see several examples built using data feeds from data.seattle.gov, hosted by Socrata, a Seattle-area company.
Many law enforcement sites carry photos of sex offenders and their approximate addresses. Through open data applications, this information can be linked to Google Maps to display offenders’ proximity to parks and schools. Google Maps was built with an API and is thus very useful for open data applications. The federal government’s Data.gov, Data.seattle.gov and other state and local open data sites also are built with APIs. It’s an approach that makes it easier for agencies to share their data.
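The proximity idea comes down to combining two location data sets and measuring the distance between their points. The Python sketch below shows one way that could work, using the standard haversine formula for great-circle distance. The coordinates, place names and the 0.3 km threshold are invented for illustration, and a real application would more likely lean on a mapping API than on hand-rolled geometry.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def places_within(point, places, max_km):
    """Return names of `places` no farther than `max_km` from `point`."""
    lat, lon = point
    return [
        p["name"]
        for p in places
        if haversine_km(lat, lon, p["lat"], p["lon"]) <= max_km
    ]


# Hypothetical records standing in for two separate open data sets.
registered_address = (47.6100, -122.3300)
parks_and_schools = [
    {"name": "Example Park", "lat": 47.6105, "lon": -122.3300},
    {"name": "Example School", "lat": 47.6500, "lon": -122.3300},
]

print(places_within(registered_address, parks_and_schools, max_km=0.3))
```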
“Government has enterprise data management systems optimized for collecting and capturing and processing data in a course of conducting business,” said Socrata founder and CEO Kevin Merritt. “Open data is the opposite side of the same coin, which is basically making that data useful and acceptable to people who can extract value from it.”
Code-a-Thon Tackles Health Care
IT developers descended on Blacksburg, Va., in April to compete for cash prizes at the Hokie Health Code-a-thon. Home to Virginia Tech, this region in southwest Virginia also boasts many prominent health-care organizations. The two-day event challenged teams of student and professional software developers to come up with the next great innovation to improve health care. The Code-a-thon was part of a series from Health 2.0, which hosts similar events throughout the United States, Europe and Asia. At the event, former U.S. CTO Aneesh Chopra headlined a panel of experts delivering presentations relating to the role of technology in modern health care.
Photo: Jon Walton, CIO, San Francisco. Photo by David Kidd.
In March, New York City Mayor Michael Bloomberg signed open data legislation, saying that the NYC BigApps 3.0 open data competition had created dozens of useful programs to help New Yorkers do everything from pick a restaurant to find a parking space. “At the contest’s core is a simple premise,” said Bloomberg. “This data belongs to the public, and if we make it accessible to everyone, the possibilities are limitless.”
The city’s open data site — which now gets 2,000-3,000 visits per day in addition to those accessing data through the APIs — grew from about 100 data sets to nearly 1,000 in the last 2.5 years, and now includes a data set of nearly 20 million requests to the city’s 311 nonemergency service system from 2004 to 2011, showing the date, time, originating agency, location, status and contextually relevant details about each 311 request.
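A data set like this one lends itself to simple analysis once it is downloaded. As a rough sketch, the Python below counts requests per agency from a few made-up rows. The column names ("created_date", "agency", "complaint_type", "status") and the rows themselves are assumptions for illustration and may not match the city's actual schema.

```python
import csv
import io
from collections import Counter

# Made-up rows in the general shape of a 311 export. Column names and values
# are illustrative assumptions, not the published schema or real requests.
SAMPLE_311 = """created_date,agency,complaint_type,status
2011-03-01T09:12:00,DOT,Pothole,Closed
2011-03-01T10:40:00,DSNY,Missed Collection,Open
2011-03-02T08:05:00,DOT,Street Light Out,Closed
2011-03-02T14:30:00,DOT,Pothole,Open
"""


def requests_per_agency(csv_text):
    """Count how many requests each agency received."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["agency"] for row in reader)


counts = requests_per_agency(SAMPLE_311)
for agency, total in counts.most_common():
    print(agency, total)
```

The same few lines would work on a full export by reading from a file instead of the sample string, although a data set of millions of rows might call for a database or a data analysis library.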
While detailed government data may give notoriously cantankerous New Yorkers more ammunition, former CIO Carole Post said transparency and openness of government are essential. “There’s now legislation in place, and I don’t see this going anywhere but continuing to advance and to evolve.”
Take Aim at Open Data, Social Media
This November, voters in 37 states will have an easier time finding election information via social media and mobile devices, thanks to the Voting Information Project (VIP), an initiative of the Pew Center on the States, Microsoft, AT&T, Foursquare, Google, state elections offices, media partners and others. VIP takes state election information, translates it into an open programming format and organizes it into application programming interfaces (APIs), so that developers can create user-friendly apps from the data. “There’s polling place information, there’s what’s on the ballot, and before the November elections there will be location-specific ‘rules of the road,’” said Pew Senior Associate Matthew P. Morse. “Voters retrieve information specific to them through their addresses.” — Wayne Hanson
Walton said that San Francisco’s open data program was launched to address two problems. First, the public was frustrated at a perceived lack of openness and transparency. And second, the city’s data was simply hard to find.
“A lot of data was already available but it was available on paper, or in a specific format or a CD which has limited use, and other times, it was just hard to find — we’re a very large organization and every department had its own website, who to call and how to drill down on the website. So even if the data was available, it was challenging.
“So initially, open data was just to organize and open our data to the public,” Walton said. “We just created a portal, we surveyed all the departments and said ‘What data are you already providing and can you provide that in digital format rather than printing it out?’ If so, we’re going to post it on this simple portal in machine-ready format.” Those basic actions began the process of expanding public engagement, he said.
“We did things like creating a form where [citizens] could ask us questions about the data if they didn’t understand it. A lot of the public knows their own piece of the data, and they could give us feedback on when they thought the data was incorrect. Then they might say, ‘I looked at all the data sets and there’s this one piece of data that’s valuable, can you make that one available as well?’ So that allowed us to become the facilitator to the public. It opened up the dialog and made it much more collaborative.”
Walton said he thought open data would become more sustainable over time but was surprised at the immediate interest. “I knew people would use the data. I didn’t realize community groups and small startup firms would take that data and use it as a building block of part of an application. I was amazed. We put up 200 data sets, and almost immediately applications were being generated.”
Ordinarily, if the public demanded a certain application, the city would have done an RFP, asked for money and hired a consultant, Walton said. “And two years later, maybe we would have had an application.” Now, the city is providing the tools to the public to meet its own needs.
Safety Data Community
As this magazine went to press, the Obama administration was preparing to launch safety.data.gov, a “safety data community,” designed to “develop and deploy a range of digital tools and mobile applications to enhance public and product safety.” An introductory blog said the site would be a “one-stop shop for government safety data,” creating a “cross-sector, collaborative safety portal that better fulfills the needs of the public, private, and civil sectors.”
This increased coordination and transparency is intended to make it easier to compile safety information from multiple federal sources, including Justice, Labor and Occupational Safety and Health Administration, Health, the Consumer Product Safety Commission, and others, focusing on incident data, enforcement actions, product safety and exposure data. The site will operate on principles of open data and will have social functionality like blogs, forums, wikis and rating systems.
The open data site also will create “a one-stop shop for analytical software and decision support tools that agencies make publically available,” according to the blog.
“For example, the site links to the Pedestrian and Bicycle Crash Analysis Tool from walkinginfo.org. This tool is designed to assist state and local pedestrian/bicycle coordinators, planners and engineers with improving walking and bicycling safety through a comprehensive database of crashes between motor vehicles and pedestrians or bicycles.
“Another tool linked through Safety.Data.Gov is the FRA Web Accident Prediction System,” continued the blog, “which provides access to railroad safety information, including accidents and incidents and highway-rail crossing data. From this site users can run dynamic queries and view current statistical information on railroad safety.”
According to the administration, potential uses for the site and the data include:
Open data is seen as a partnership between the jurisdiction, the public and developers. Not surprisingly, civic engagement and collaboration are nearly always cited as the primary value. But are there tangible benefits as well? And what are the costs?
Peter Threlkel from the Oregon Secretary of State’s Office was looking for a way to make the state’s trademark information available online. Staff investigated a proprietary database solution, but instead decided to leverage the state’s existing open data site.
“It’s a small program with about 5,000 active trademarks registered and about 30 new filings a month,” Threlkel said, “so we could not justify spending half a million dollars on the Oracle solution. We used to make the trademark information and images of the filings available as a public records request on CD-ROM for $100 a month to a handful of customers. Now the Socrata solution at data.oregon.gov allows us to make the trademark information available and searchable online for free, so our customers and the public can access the data and images at their convenience.”
Early e-government efforts had to cope with analog records that needed to be run through optical character recognition equipment and then cleaned up for use. Since most government records are now born digital, that’s no longer the case, said Socrata’s Merritt. “Now, the mechanism to get that data from the business system out to a public-facing open-data portal is straightforward and cost effective.”
New York City’s experience seems to align with that assessment. DoITT’s Director of Research and Development, Andrew Nicklin, and Open Data Coordinator Albert Webber were shifted into the open data program. Nicklin, Webber, a small team of city employees and a few support staff did the whole thing in the back office. The site is externally hosted and involved minimal costs.
The return on investment is mostly intangible, although city officials said there is an economic development side of it. The key, however, is citizen engagement, accountability and transparency.
Neither San Francisco nor New York City had hard evidence that open data initiatives had reduced Freedom of Information Act (FOIA) or discovery requests. But public-sector and nonprofit advocates offered anecdotal evidence that discovery filings were often unnecessary because the data was easily available online.
Apps Help Patients, Doctors Battle Cancer
Two health IT applications designed to help doctors and patients stay a step ahead of cancer each took home a $20,000 prize in a January challenge to develop cancer prevention and treatment apps using public data. Ask Dory, an app that helps users find information about clinical trials for cancer based on data from ClinicalTrials.gov, was one of the winners. The other winner, My Cancer Genome, allows doctors to see a list of therapeutic options for cancer, based on a patient’s tumor gene mutations. In an interview with Government Technology, Wil Yu, special assistant for innovations with the Office of the National Coordinator for Health Information Technology, said the winning apps stood apart from other submissions based on their ability to manipulate cancer data and provide a level of analysis that was both accurate and significant to users. — Brian Heaton
What might the future hold for open data initiatives? “I see a lot of struggle around influence and security as contentious issues that will take a lot of work to sort out what should be public and what should not,” said Wonderlich, of the Sunlight Foundation. “Those lines are not drawn appropriately as it stands now. Beyond that struggle, I think is better, more rigorous decision-making around information management. Looking at all the information an agency has and making good decisions about what should be public and what shouldn’t, rather than just a response to a FOIA request. I’m hoping we end up in a place where information is managed, just like HR is managed, or anything else. We make good decisions about what is public and what isn’t, not because of the interest of the department but because there’s a strong and clear public interest behind good decisions about what to make public.”
“We’re on the eve of a period of time in which every government will have an open data portal just like they have a website,” said Merritt. “This is a trend initiated by the District of Columbia and their citywide data catalog, and by the federal government in Data.gov. We now see open data portals in many cities and counties like Baltimore, Austin, New Orleans, Seattle, San Francisco and Chicago; states like Illinois, Oregon, Oklahoma, Washington, Maryland and Hawaii are all sharing their data.”
In a move that some think could signal a revival of Data.gov, and others dismiss as a campaign-year stunt, the Obama administration on March 8 launched ethics.data.gov, which, according to a White House release, “fulfills a campaign promise to centralize ethics and lobbying information for voters.” Jessica McGilvray, the American Library Association’s director for the Office of Government Relations, mourned the demise of the Statistical Abstract of the United States, whose funding disappeared in the 2012 Census Bureau budget, but saw the launch of ethics.data.gov as a sign of progress.