Do Universities, Research Institutions Hold the Key to Open Data’s Next Chapter?

Where government has raw data, professors and researchers have expertise and analytics programs.

Government produces a lot of data — reams of it, roomfuls of it, rivers of it. It comes in from citizen-submitted forms, fleet vehicles, roadway sensors and traffic lights. It comes from utilities, body cameras and smartphones. It fills up servers and spills into the cloud. It’s everywhere.

And often, all that data sits there not doing much. A governing entity might have robust data collection and it might have an open data policy, but that doesn’t mean it has the computing power, expertise or human capital to turn those efforts into value.

The amount of data available to government and the computing public promises to keep multiplying. The growing smart cities trend, for example, involves installing networks of sensors on everything from utility poles to garbage bins.

As all this happens, a movement — a new spin on an old concept — has begun to take root: partnerships between government and research institutes. Usually housed within universities and laboratories, these partnerships aim to match strength with strength. Where government has raw data, professors and researchers have expertise and analytics programs.

Several leaders in such partnerships, spanning some of the most tech-savvy cities in the country, see increasing momentum toward the concept. For instance, the John D. and Catherine T. MacArthur Foundation in September helped launch the MetroLab Network, an organization of more than 20 cities that have partnered with local universities and research institutes for smart-city-oriented projects.

Those projects varied widely in focus. In Houston, one of Rice University’s ideas was to help the city collect data to better determine where it should place bicycle-sharing racks. The University of Washington sought to help Seattle install weather sensors to track hyperlocal precipitation in an effort to predict when residents would strain the power grid with higher electricity demand.

The network included some partnerships that had already existed for at least a few years. One of them was the Urban Center for Computation and Data (Urban CCD), a collaborative involving faculty from the University of Chicago and researchers from the Argonne National Laboratory, which partnered with Chicago officials to put data to work enhancing municipal knowledge and solving problems.


Charlie Catlett, director of the Urban Center for Computation and Data, is working with Chicago to place sensors on city-owned infrastructure to demonstrate the Internet of Things’ potential. Photo by Alex Garcia

The partnership has focused largely on big data and breaking down silos among city agencies. In 2013, the Urban CCD helped the city win $1 million from the Bloomberg Philanthropies Mayors Challenge to expand situational awareness data. During major events (Chicago plays host to more than a few), Urban CCD Director Charlie Catlett wants to ensure the city’s platform can scale up to include more situational awareness data than ever before, as well as support more users and run predictive analytics programs.

It’s a project that might have forced the city to inflate its technology budget if it weren’t for the partnership, Catlett said.

“That was to look at the analytics the city does and see if we could accelerate new innovations in that area without having to hire an army of additional data scientists,” he said.

The $1 million award illustrates another way people with academic backgrounds can help cities, he said: grant applications. As part of its smart cities initiative, the White House and several federal agencies have offered $160 million that city-research institute partnerships can compete for.

Some of that money has already gone to the Urban CCD for a project called the Array of Things. The idea, Catlett said, is to construct a real-life metropolitan demonstration of what the Internet of Things can be. Chicago is gearing up to place an array of sensors on city-owned infrastructure that will be capable of taking environmental measurements as well as recognizing objects. The planned number of sensors grew from 30 to 500 after the project won $3.1 million from the National Science Foundation.

The potential applications for the project, from a municipal and research perspective, stretch as far as the technological capabilities of the sensors because the collaborative plans to crowdsource ideas for how to use the array. Some early applications include air pollution maps, congestion tracking and flood damage prediction.

And the Array of Things is just one project. Catlett has biweekly meetings with Chicago’s tech officials. That’s the thing about partnerships: They establish ties that allow the projects to keep on coming.


Carlo Ratti believes the world is letting a vast and valuable data resource slip away unmined. That resource is sewage, which carries a sampling of the collective bacteria that act as markers of public health. Working with the city of Cambridge, Mass., Ratti and the SENSEable City Lab hope to demonstrate the potential of collecting and analyzing bacteria in sewage and identifying where they come from. The first application is to predict disease outbreaks, which could help contain them and reduce health-care costs, but the lab sees more possibilities for the project in the future. For instance, sewage-based data could be isolated to individual neighborhoods. Eventually the data could act as a new kind of human census. Photo via Shutterstock.

At the core of the concept of city-research institution partnerships is the idea that both parties bring something to the table that the other is lacking. The cities bring data that researchers otherwise would not have, while the researchers bring expertise and ideas.

“In general, we believe in feedback loops from city officials and citizens while developing a project,” wrote Carlo Ratti, director of the SENSEable City Lab hosted within the Massachusetts Institute of Technology, in an email to Government Technology. “By working in the same city, these flows of information speed up, with great benefits for all the parties involved.”

The cities often get very tangible benefits from the partnerships: They outsource labor to the research institutions involved, they get ideas for how to use data, and they can uncover trends that help make operations more efficient or perhaps cut down on wasteful spending. For instance, if Chicago were to send out work crews to replace old light bulbs with LEDs, Catlett suggested that the city might save some money by asking those same crews to perform other maintenance while they’re out.


The SENSEable City Lab’s HubCab project creates an interactive visualization of the millions of taxi rides in New York City each year, providing insights into mobility and efficiency.

“Big data is impacting many dimensions of our society. In cities, it can help us better understand the world around us and plan its transformation,” wrote Ratti. “Over a century ago, Élisée Reclus stated that good ‘surveying,’ i.e., data collection, is the first fundamental step in city planning. It is not different today — if not for the fact that we know our cities much better and can plan their transformation accordingly, [then in] opening the data to citizens. Universities have a lot to offer here — both in terms of helping with the data management platforms and providing ideas for its usage.”

The benefits cities offer to universities can be a little less tangible. They can offer access to data, which allows researchers to peer into things they otherwise might not be able to. They also present a unique opportunity for students to learn and gain practical experience.

“The students know that there are job opportunities in data analytics, so being able to get their hands on city data is a wonderful opportunity,” said Steven Koonin, director of New York University’s Center for Urban Science and Progress.

Catlett said government officials also bring a perspective into the conversation that researchers might not have — a perspective that can help shape the applications researchers pursue.


Working from downtown Brooklyn, researchers from CUSP headed up to the roof of a building in 2014 and set up an 8-megapixel camera pointed at the Manhattan skyline. The camera, which takes pictures every 10 seconds, builds an aggregate view over time of the city’s infrastructure. The idea is to combine the information contained in the visible spectrum of light with radar, infrared and more to identify problems, patterns and trends — for example, plumes of gas released from buildings. CUSP has partnered with an array of public agencies — city, transit, the port authority and others — to get access to data that can supplement the information gleaned from the observatory and broaden its application.
Photo via Shutterstock.

“Researchers are not familiar with the challenges that a city has, from the point of view of an official making [a] decision,” he said. “They may be users of cities as individuals, but what cities offer in terms of a partnership is a glimpse into the complexities of the challenges they’re facing.”


Two recurring themes in projects that universities and research organizations take on in cooperation with government are project evaluation and impact analysis. That’s at least partially driven by the very nature of the open data movement: One reason to open data is to get a better idea of how well the government is operating.

“A lot of the open data laws were motivated by a desire for transparency, which is a great thing, but putting data out there so that the government is more transparent is a much different exercise than putting the data out there so that it’s usable — so the public sector can benefit from it, the private sector can benefit from it,” said Mike Holland, CUSP’s executive director.

That means that sometimes inefficient or broken systems will be uncovered. When that happens, Koonin said, it helps push public servants toward creating better systems.

“If services end up not working, that can be embarrassing,” he said. “But it’s a tool for creating better government.”

A large-scale example happened in September when the University of Southern California released a study examining data points from traffic monitors across Los Angeles in an attempt to discover whether a light rail expansion had truly reduced congestion as its proponents had said it would when pitching the project to the city. The study, which its authors believe to be the most extensive and granular of its kind ever conducted in the U.S., showed that the light rail expansion didn’t actually cut down highway congestion. It did, however, increase overall transportation along that corridor of the city, a benefit unto itself, according to the researchers.

The takeaway, study co-author Sandip Chakrabarti told Government Technology in November, was that good data analysis enables city officials to have more informed conversations about major projects like light rail expansions.

But there are other considerations that can go into a major transportation investment, which is part of the reasoning behind the Urban CCD’s “Plenario” project. Plenario pulls together open data sets covering a wide range of agencies and allows users to compare and map them. The result, said Catlett, is an ability to peer into various possible outcomes for city operations.

“Let’s say it’s rapid transit or a new park or a new road in a part of a city,” he said. “You’d like to know what the impact of that investment is. Let’s say it’s $50 million. You want to know: Did it create more jobs? Did it help the neighborhood be safer?”


Open data may have been part of the impetus for city-university partnerships, in that the availability of more data lured researchers wanting to work with it and extract value. But those partnerships have, in turn, led to government officials opening more data than ever before for useful applications.

Sort of.

“I think what you’re seeing is not just open data, but kind of shades of open — the desire to make the data open to university researchers, but not necessarily the broader public,” said Beth Noveck, co-founder of New York University’s GovLab.


Much of what GovLab does is about opening up access to data, and that is the whole point of Docker for Data. The project aims to simplify and quicken the process of extracting and loading large data sets so they will respond to Structured Query Language commands by moving the computing power of that process to the cloud. The docker can be installed with a single line of code, and its website plays host to already-extracted data sets. Since its inception, the website has grown to include more than 100 gigabytes of data from more than 8,000 data sets. From Baltimore, for example, one can easily find information on public health, water sampling, arrests, senior centers and more. Photo via Shutterstock.

That selective openness is partly because researchers are a controlled group who can be required to sign memorandums of understanding and trained to protect privacy and prevent security breaches when government hands over sensitive data. Protecting that data is a top concern of agencies that manage it, and it shows in the GovLab’s work.

It was something Noveck found to be very clear when she started working on a project she simply calls “Arnold” because of project support from the Laura and John Arnold Foundation. The project involves building a better understanding of how different criminal justice jurisdictions collect, store and share data. The motivation is to help bridge the gaps between people who manage the data and people who should have easy access to it. When Noveck’s center conducted a survey among criminal justice record-keepers, the researchers found big differences between participants.

“There’s an incredible disparity of practices that range from some jurisdictions that have a very well established, formalized [memorandum of understanding] process for getting access to data, to just — you send an email to a guy and you hope that he responds, and there’s no organized way to gain access to data, not just between [researchers] and government entities, but between government entities,” she said.


The infusion of federal money and the launch of the MetroLab Network are both big boosts to the concept of government-research institution partnerships. But the people involved in those partnerships see more reasons to be optimistic.

One of them is the self-sustaining nature of those partnerships. When university students are offered the chance to do the work of data scientists, it puts them in a good position to pursue data, IT or other related jobs within the public sector.

The rising number of chief information officers, innovation officers, data officers and data scientists across the country also helps support such partnerships, according to several sources. People in those positions, with resources at their disposal and easy access to government databases, are more likely to seek out researchers who can help them put raw data to good use.

“When you have a data scientist in-house, that’s what leads to the recognition that you need to do more data-driven projects … and you need to turn toward partners,” Noveck said.

Ultimately what that means is that there are more people working to help government get smarter — even though not all of them work directly for the government.

“Open data by itself doesn’t necessarily translate into smart decisions unless smart people are willing to work on them,” she said. 

Ben Miller is the associate editor of data and business for Government Technology. His reporting experience includes breaking news, business, community features and technical subjects. He holds a Bachelor’s degree in journalism from the Reynolds School of Journalism at the University of Nevada, Reno, and lives in Sacramento, Calif.