The Louisville Metro Government in Kentucky started its open data efforts in 2011 with a homegrown Web portal, and is now automating the publishing processes and using the data for performance improvement. As it does so, Louisville is working with a handful of vendors specializing in open data catalog publishing. “We are at a crossroads,” said Jason Ballard, director of the Department of Information Technology. The department has partnered with a company called NuCivic, which is developing an open source platform called DKAN, and is working with Socrata, the open data publishing vendor, on performance management.
“We may end up with a hybrid,” Ballard said, pointing to the expense of more mainstream platforms. “With some other products, such as DKAN, we may be able to achieve similar results for a better cost and sustainability.” The goal is to provide access to day-to-day work done by government employees in as close to real time as possible. “That’s where we are going,” he said.
This is a good time to be a chief data officer or CIO charged with creating an open data program, thanks to a widening range of data publishing options, including open source, that did not exist just a few years ago. That’s good news for cost-conscious state and local governments.
The budget for open data publishing platforms is often a big constraint, and decisions depend on internal capacity, according to Timothy Herzog, an open data specialist at the World Bank. “It is very common for cities and counties to have a bright young person familiar enough with open source and facile enough with the tools to stand up something fairly quickly on their own for a few thousand dollars plus staff time,” he said. For jurisdictions that have bigger budgets, a number of commercial choices are available.
The Regional Approach
Regional collaboration could be the next phase of open data publishing. The city of Pittsburgh has partnered with Allegheny County and the University of Pittsburgh on a regional data portal that doesn’t belong to any one entity but is collectively managed and co-mingles data from multiple governments. “What they are doing is exciting,” said Accela’s Mark Headd. “If you are a resident of Pittsburgh, you are getting services not just from the city but the county as well, and there is much more potential to provide transparency if you co-mingle that data and make it easier to use. We are going to break out of the approach of the last few years of a city having its own data portal, and there is a bright line around the boundary of its portal.”
Ultimately, which solution works depends on what government wants to do. At its most basic level, an open data portal can just be a directory, like a phone book: If a citizen wants a certain piece of data, the portal points to the link where that data is hosted. “If that is what the government wants, the technology hurdle is significantly lower,” said Headd. “If a government wants a robust, managed API because they feel that developers or researchers or journalists want to leverage that, then they have to ask if they have the appropriate people in house to support that or is it something they should outsource.”
And if the aim of government is to nurture and grow the community of civic data activists and startups, then they have to look at their open data portal as a mechanism for engagement, not just a place where you go get data.
Whichever route they take to publish open data, public agencies should strive to make the data as easily accessible as possible, not only to people but also to automated systems, said Herzog. Open data platforms that are really doing it right allow users to download the data, but also include an application programming interface (API) to make data available in a consistent machine-readable format. With an API, it is possible for a visualization platform or other consumer technology to ingest that data or create middleware to make that translation automatic. “That is one of the value-adds that a good open data platform will give you,” said Herzog.
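As a rough illustration of the middleware Herzog describes, the Python sketch below takes a JSON response of the kind an open data API might return and rewrites it as CSV for a tool that only reads spreadsheets. The function name, field names and sample records are invented for this example and do not come from any particular portal or vendor.

```python
import csv
import io
import json


def json_records_to_csv(payload: str) -> str:
    """Convert a JSON array of flat records into CSV text.

    Assumes the API returns a list of objects; keys may differ per record.
    """
    records = json.loads(payload)
    # Collect column names in first-seen order so the output is stable.
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow(record)  # keys missing from a record become empty cells
    return buffer.getvalue()


# Hypothetical sample response; a real portal defines its own fields.
SAMPLE = (
    '[{"permit_id": "A-1", "status": "issued"},'
    ' {"permit_id": "A-2", "issued_on": "2015-12-01"}]'
)

if __name__ == "__main__":
    print(json_records_to_csv(SAMPLE))
```

Because the response arrives in a consistent machine-readable shape, the same few lines keep working as new records are published, which is the practical difference between an API and a one-off file download.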
A manager of a midsize city might have to spend $150,000 to bring in a vendor to design a turnkey solution and put a platform online, Herzog said. But some out-of-the-box software-as-a-service (SaaS) solutions are available for about $150 to $200 per month. “On the other hand, if you are CIO of the U.S. Census Bureau, that model is not going to make sense,” he added. “It doesn’t fit how you do procurement, for one thing.”
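To put those two price points side by side, the short Python sketch below works out how many months of subscription fees it would take to equal the one-time turnkey figure. It uses only the dollar amounts Herzog cites; it deliberately ignores staff time, customization, support and data volume, all of which would change a real comparison.

```python
def months_to_match(upfront_cost: float, monthly_fee: float) -> float:
    """Months of subscription fees needed to equal a one-time cost."""
    if monthly_fee <= 0:
        raise ValueError("monthly_fee must be positive")
    return upfront_cost / monthly_fee


TURNKEY_COST = 150_000  # one-time vendor-built platform, per Herzog's estimate

if __name__ == "__main__":
    for fee in (150, 200):  # low and high ends of the quoted SaaS range
        months = months_to_match(TURNKEY_COST, fee)
        print(f"${fee}/month: {months:.0f} months ({months / 12:.1f} years)")
    # $150/month: 1000 months (83.3 years)
    # $200/month: 750 months (62.5 years)
```

On sticker price alone the subscription wins by decades, which is why the deciding factors tend to be the ones left out here: internal capacity, procurement rules and how much the platform needs to be tailored.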
Louisville spent about $150,000 in 2014 to launch its new website and, as of December, had spent about $85,000 in 2015 with DKAN and NuCivic, Ballard noted. “We’ll see where that takes us and continue to invest where we need to in order to evolve to the end state where we get to complete transparency,” he said.
Another city that has invested in open data publishing is Seattle, which started working with Socrata in 2010. The city wanted developer-friendly tools and easy-to-use interfaces, including APIs that allow nontechnical people to access information about city government, according to Michael Mattmiller, Seattle’s chief technology officer. “We also are thinking about how we can help employees use these tools to glean information.”
Although he didn’t provide figures on the city’s open data budget, Mattmiller said Seattle hasn’t looked closely at open source options. “When there is a product that meets our needs, where we have seen other municipalities be successful with it, and our target users are familiar with it, it is hard to make the case we should build something else and devote technology resources to it.”
Government Technology surveyed some of the leading vendors offering solutions in the open data platform market to find out how the market is evolving and whether customers’ needs are changing.
There seems to be consensus that Socrata is out in front of the open data publishing market. Socrata CEO Kevin Merritt said he started the company in 2007 to build Web-based databases for small and midsize businesses. “One of those businesses that started using our cloud-based database was the White House,” he said. “We started looking at this opportunity and became passionate about governments putting their data online for a number of external stakeholders.”
It became evident that governments should be making data that taxpayers fund available to stakeholders downstream, Merritt recalled. In early 2009, Socrata pivoted and decided to go all-in and help governments make data available. “We now lead that market with more than 300 customers using our platform to make their data available,” said Merritt.
Socrata has worked with several jurisdictions on the cutting edge of open data, including Seattle, Chicago, New York and Illinois.
Governments go through a maturity curve during the open data adoption process, according to Merritt. Their needs evolve, and what they want to accomplish by making data available changes as they go along. The foundation is to put up a catalog and make files searchable, discoverable and downloadable.
Every government has to start there, but that is no longer sufficient. “You have to get to the point where your data becomes an important element in your own data-driven decision making and becomes part of the economic development you are promoting in your jurisdiction, and that is where I think we set ourselves apart,” Merritt said. “We have customers at every step of the way in terms of maturity.”
Junar (which means to view and to know) was founded by Diego May and Javier Pajaro about five years ago. With offices in Silicon Valley and Latin America, Junar provides a cloud-based open data platform. May, the company’s CEO, noted that while governments can build these tools from scratch themselves, “we put a lot of brain power into thinking about what citizens are looking for when they go to open data portals and how cities need to open up data. We offer something that solves the problem right away.”
Customers include Palo Alto and Sacramento County, Calif. He said customers like that Junar offers a solution that’s modular. “We offer a standard open data platform package that is priced based on the population of the city.” How does he distinguish Junar’s offerings? “Socrata is a great company,” May said, “but I would say we are simpler to deploy and easier to use. That allows us to be more cost-effective. The total cost of ownership is lower.”
As CIO of the New York State Senate from 2009 to 2011, Andrew Hoppin led the deployment of a major website for the state, NYSenate.gov, using the Drupal content management system. Building on that experience, Hoppin is today the co-founder and president of NuCivic, which leads the development of DKAN, an open source platform that was rolled out at the end of 2013.
New York-based NuCivic was purchased by GovDelivery in 2014. “We worked with a small number of customers first to figure out how to build a great open source product, and then we were ready for prime time,” Hoppin said. The plan is to take advantage of the reach of GovDelivery, which has about 1,000 government customers in the United States and Europe. Most governments want to do something with open data to get information out or even internally to drive efficiency, said Hoppin, adding that open source software that’s also available as a service (or “OpenSaaS”) offers increased agility and affordability.
Accela, a company that offers government software that streamlines land, permitting, asset, licensing, right-of-way, legislative management, and resource and recreation management, is in a somewhat unique position in terms of open data, said the company’s developer evangelist, Mark Headd, the former chief data officer of Philadelphia. “If you look at other open data platforms, particularly the commercial platforms, you need to take data from another system and put it in their platform,” he noted. “Accela is the system of record for a lot of this data that gets put in open data portals.”
Governments use Accela’s system to conduct business, such as issuing licenses or permits. “All the data accumulates in our system, so we can help them bring that data out,” Headd explained. Accela has an open data platform built on the open source platform CKAN that is free for its customers to use. “My job as developer evangelist is to help our customers understand how they can leverage that data inside their Accela system,” he said. “We have over 100 customers currently on our civic data platform.”
In terms of how governments think about open data publishing, Headd said budget resources are one consideration, along with the amount of data they currently have available to publish. “It doesn’t make much sense for a government to invest a lot of resources in an open data portal if they are not at a point where they are ready to publish a lot of data,” he explained. “Conversely, if you have a lot of data and have a lot of usage, then there are commercial options available.” Governments also have to decide how invested they want to become in any one vendor, especially since this is a relatively new area, he added.
Cities, counties and state agencies have been using Esri’s ArcGIS for years to manage spatial data and share it on the public Web. But in 2014 Esri launched a new application, ArcGIS Open Data, to give customers a free and quick way to set up public-facing websites where people can find and download data in a variety of open formats. “We added capabilities to ArcGIS that are expected or required of open data catalogs such as exporting in several common formats and speaking the new DCAT [Data Catalog Vocabulary] specification,” said Andrew Turner, chief technology officer of the Esri Research and Development Center in Washington, D.C.
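For readers unfamiliar with DCAT, the Python sketch below builds a simplified catalog record using terms from the W3C vocabulary (a dataset with a title, description, keywords and one downloadable distribution). It is an illustration of what such a record can look like, not a description of what ArcGIS Open Data actually emits; the helper name, the dataset and the URL are all invented for this example.

```python
import json


def dcat_dataset(title, description, keywords, download_url, media_type):
    """Return a simplified DCAT-style dataset record as a JSON-LD dict."""
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
        },
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dct:description": description,
        "dcat:keyword": list(keywords),
        # Each distribution is one concrete way to obtain the data.
        "dcat:distribution": [
            {
                "@type": "dcat:Distribution",
                "dcat:downloadURL": {"@id": download_url},
                "dcat:mediaType": media_type,
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical dataset on a placeholder domain.
    record = dcat_dataset(
        title="Street Centerlines",
        description="Road centerline geometry maintained by the GIS office.",
        keywords=["roads", "transportation"],
        download_url="https://data.example.gov/streets.geojson",
        media_type="application/geo+json",
    )
    print(json.dumps(record, indent=2))
```

The value of a shared vocabulary is that a harvester can read catalogs from different vendors the same way, which matters once data from several governments is pooled in one portal.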
One advantage is that the new application is part of the same data management and dissemination platform ArcGIS has had for a long time. “What we have seen agencies saying is, ‘Here is the subset of all my government data that I really want the public to discover first.’ This is the curated front page to their data catalogs and applications,” Turner said. Many ArcGIS customers are using this platform as their only open data catalog. “Once they find out about it, they don’t see why they would go pay for a separate open data solution. This is what the police chief and mayor already look at. They think, ‘Why don’t I use this same platform for open data and leverage my existing investment for new potential innovations?’”
Another viable option is to go the open source route, and CKAN (Comprehensive Knowledge Archive Network) is the leading open source data portal platform. CKAN is written in the Python programming language and maintained by the Open Knowledge Foundation, which can also provide full support and hosting. (The federal Data.gov portal has used both CKAN and Socrata.)
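CKAN portals expose a JSON "Action API," and the Python sketch below shows roughly how a developer might use its package_search action to look up datasets by keyword. The portal address is a placeholder, the two helper functions are invented for this example, and the sample response is a trimmed, made-up stand-in for what a live portal would return; no network request is made.

```python
import json
from urllib.parse import urlencode


def build_search_url(portal_base: str, query: str, rows: int = 5) -> str:
    """Build a CKAN package_search URL for a keyword query."""
    params = urlencode({"q": query, "rows": rows})
    return f"{portal_base.rstrip('/')}/api/3/action/package_search?{params}"


def dataset_titles(response_text: str) -> list:
    """Pull dataset titles out of a package_search JSON response."""
    body = json.loads(response_text)
    if not body.get("success"):
        raise RuntimeError("CKAN reported a failed request")
    # Fall back to the machine name when a dataset has no display title.
    return [pkg.get("title") or pkg["name"] for pkg in body["result"]["results"]]


# Trimmed, made-up response in the shape the Action API uses.
SAMPLE_RESPONSE = json.dumps({
    "success": True,
    "result": {
        "count": 2,
        "results": [
            {"name": "building-permits", "title": "Building Permits"},
            {"name": "street-closures", "title": ""},
        ],
    },
})

if __name__ == "__main__":
    # data.example.gov is a placeholder, not a real portal.
    print(build_search_url("https://data.example.gov/", "permits", rows=2))
    print(dataset_titles(SAMPLE_RESPONSE))
```

Pointing the first helper at a real CKAN site and fetching the resulting URL would be the next step; the parsing side stays the same.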
NuCivic’s Hoppin describes why open source looked attractive to him when he was building a solution as CIO of the New York State Senate. “I want to be in the driver’s seat with my own technology,” he said. “I don’t want to be locked in with a vendor, even one that is a fantastic vendor.” In the Senate, requirements changed all the time, he added, sometimes for political or budget reasons, not necessarily technology reasons. “Open source gives you the ability to control your own destiny. I want the ability to fire my vendor and more important, I want the ability to innovate,” Hoppin said. “If I want to do something that is a novel idea, it would be nice to take care of that myself directly with the recent college graduate I just hired who has the tech skills to do it.”