How Chicago’s Data Dictionary is Enhancing Open Government

The Chicago Data Dictionary is a searchable archive of “data describing data” containing information about the variety of data in Chicago’s numerous databases.

Right now, the City of Chicago is working on documenting all of its data. That’s right: all of Chicago’s public data, across all databases, in all its departments and sister agencies.

The project, called the Chicago Data Dictionary, is a massive, public metadata repository—a searchable archive of “data describing data”—that gives users information about the variety of data in the City of Chicago’s numerous databases. As the next phase of Chicago’s government transparency initiative, the Data Dictionary complements the City’s open data portal by providing background information on where such data comes from.

While it may not be the city’s chicest tech initiative, the Data Dictionary is nonetheless an ambitious and colossal project that is enhancing the city’s data landscape.

Why Does a City Need a Data Dictionary?

A database is only as good as the data it contains. Its validity, however, can suffer if its data is not defined clearly. Thus, data dictionaries, or metadata repositories, are important because they provide database users with key “ground rules” for understanding complex, often jargon-riddled, databases. Data dictionaries also allow users to find data quickly with a simple query.
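
To make the idea concrete, here is a minimal sketch of what a data dictionary entry and a "simple query" over it might look like. It does not reflect the Chicago Data Dictionary's actual design: the `FieldEntry` structure, the `search` helper, and the sample databases and columns are all hypothetical, invented for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class FieldEntry:
    """Metadata for one column: the "data describing data"."""
    database: str
    table: str
    column: str
    description: str
    tags: list[str] = field(default_factory=list)


def search(entries, term):
    """Return entries whose column name, description, or tags mention the term."""
    needle = term.lower()
    return [
        e for e in entries
        if needle in e.column.lower()
        or needle in e.description.lower()
        or any(needle in t.lower() for t in e.tags)
    ]


# Hypothetical sample entries, invented for this example.
ENTRIES = [
    FieldEntry("permits_db", "building_permits", "issue_dt",
               "Date the permit was issued", ["permits", "dates"]),
    FieldEntry("service_db", "requests", "sr_type",
               "Category code for a service request", ["311"]),
]

if __name__ == "__main__":
    # A plain-language search turns up a cryptically named column.
    for hit in search(ENTRIES, "permit"):
        print(f"{hit.database}.{hit.table}.{hit.column}: {hit.description}")
```

The point of the sketch is that a user who searches for "permit" finds the column `issue_dt`, whose name alone would not reveal its meaning. The plain-language description supplies the "ground rules" that the raw database lacks.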

In the case of a public data dictionary, “users” can include just about anyone who accesses municipal data. Public data dictionaries can benefit academic researchers and software developers who want to know what kinds of data a city holds and how they can access it for research or application development. They can also assist city staff who manage city databases and work to improve their efficiency.

Since government open data initiatives are still new, public data dictionaries are uncommon. In Cambridge, Massachusetts, the Cambridge Information Technology Department (ITD) is creating a Data Dictionary for its Geographical Information Systems (GIS) division. Cambridge’s dictionary provides information about the city’s geographical data use, coding, history, and other attributes.

Like Cambridge’s program, most metadata repositories cover only a single department, project or database. The Chicago Data Dictionary is a radical step: it takes the standard metadata repository model and amplifies it across an entire city.

Building a Metadata Repository in Chicago

The Chicago Data Dictionary is part of Mayor Rahm Emanuel’s vision to use technology to make government more efficient and transparent. The initiative also expands upon Chicago’s goal to be the nation’s leader in open data. 

In March 2012, the Mayor sponsored an ordinance for the Data Dictionary, and it quickly passed through City Council. With the assistance of a $300,000 grant from the John D. and Catherine T. MacArthur Foundation, Chapin Hall, a research center at the University of Chicago, led the initiative along with Chicago’s Department of Innovation and Technology (DoIT).

Nine months later, an Executive Order issued by the Mayor mandated that city agencies regularly publish and update their public data on the City’s data portal. The Order specifically mentioned the Data Dictionary as a tool that would “improve City operations, services and analytical decision-making.” 

Now one year into the project, Chapin Hall and DoIT are continuing work on the first phase of a three-phase plan to develop the Data Dictionary. In the past year, Chapin Hall has incorporated more than 12 city databases into the Dictionary; currently, it is identifying, processing, and cataloguing over 100 additional municipal databases.

However, Chicago’s government contains far more than 100 databases. If the repository is to include every City and sister agency database, how can Chicago ever complete such a herculean task?

This is the wrong question to ask. As the project’s scope implies, compiling the Dictionary is no quick job, nor is it ever a “done” job. Because new municipal databases may be added and existing ones changed, the Data Dictionary requires continued maintenance to ensure that its users receive useful and up-to-date information.

This brings us to the right question: how can the Data Dictionary improve the way Chicago’s citizens and government understand and use their City’s data?   

One way to do so is to make the Data Dictionary available online, even as its development continues. Chapin Hall designed its homepage simply and efficiently, helping convey its purpose as a querying tool for users.

A second way to do so is to share the design of the Dictionary itself, so that other cities and organizations may benefit by adopting it. As with many of Chicago’s other open-source projects, DoIT will make the source code for the Chicago Data Dictionary available on GitHub for anyone who wishes to build a metadata repository of their own.

Moreover, while some of Chicago’s open-source initiatives, such as the SmartData predictive analytics platform, are intended to be replicated by other cities, Chicago’s Data Dictionary model can serve a purpose for any type of organization. A ready-made API could be a gift to database administrators in nonprofits and private companies alike who use databases regularly.  

A New Tool for the Public

When thinking of new and innovative ways data can improve cities, most people don’t think of metadata repositories. But without a better understanding of the “data about the data,” many of these new benefits may not develop in the first place.

Chicago’s Data Dictionary, a bibliographic giant growing bigger by the day, is providing the City with just that resource. The next time someone in Chicago has a question about their city’s data, they will know where to look first.

This story originally appeared on Data-Smart City Solutions.

Government Technology editor Noelle Knell has more than 15 years of writing and editing experience, covering public projects, transportation, business and technology. A California native, she has worked in both state and local government, and is a graduate of the University of California, Davis, with majors in political science and American history.