Tom Schenk is Chicago’s second chief data officer, leading the city’s analytics and open data initiatives since September 2014. He previously served as the city’s director of analytics and performance management and as a senior research analyst for Northwestern University’s Feinberg School of Medicine. Schenk now focuses on Chicago’s citywide data repository WindyGrid and related predictive analytics pilots.
Last year Chicago deployed a predictive analytics project forecasting rodent infestations. What’s next?
Since then we’ve been working on research around predicting and optimizing where to inspect for food poisoning, namely restaurant inspections. Of the 15,000 establishments in Chicago, the goal is to identify which restaurants we need to visit first. We finished that pilot, and it went very well. We were able to find critical violations earlier than we could in the normal process, so the data-driven approach is driving more efficiency. In the upcoming months we’re going to have some more research projects, which we’re just piloting right now.
WindyGrid collects 7 million rows of data each day, much of it in real time, across departments. How is Chicago working to offer this technology to other cities?
Our commitment to open source, on the software side, is our biggest effort right now. WindyGrid uses MongoDB as its back-end database, which is an open source solution. But we want to make the entire platform an open source solution so other cities or entities can use its interoperability for geospatial data. So we’re rewriting a lot of the code to achieve more user functionality that will work on mobile devices, which is a big objective of ours, and also to ensure our code base can be completely open source and literally put on GitHub.
Do you think WindyGrid will be used only by large cities or will it be easy enough for smaller cities to deploy?
Certainly large cities would be able to deploy it. For smaller cities, we hope this will be an easily deployable platform or framework, but the true measure of that is only going to come after we get through development. It’s certainly going to be useful for them, but it’s going to depend on other things such as the availability of data in smaller cities. Many cities do have 311 systems, and that is one important source of data for us and other large cities, but deploying the system depends a lot on the data sources available — for example, the degree to which information is digitized.
How does open source software foster city partnerships?
A great part of using open source technology is that it lets us stay somewhat agnostic toward vendors while still leveraging partnerships with other companies, which need access to such software to do research. For instance, with the restaurant inspections we partnered with Allstate Insurance, which has a team of data scientists. And like many companies, Allstate has employees who take on pro bono projects for the community. In our case, Allstate’s team of data scientists worked with us to develop a predictive analytics model to improve quality of life for Chicago residents. So leveraging open source technologies, and not requiring software licenses to do research, gives us the nimbleness to partner with companies, knowing they don’t need software licenses to contribute.