Open data offers great promise, but also some risk.
At the beginning of this year, President Trump signed into law the Open, Public, Electronic and Necessary Government Data Act, requiring that nonsensitive government data be made available in machine-readable, open formats by default.
As researchers who study data governance and cyber law, we are excited by the possibilities of the new act. But much effort is needed to fill in missing details – especially since these data can be used in unpredictable or unintended ways.
The federal government would benefit from considering lessons learned from open government activities in other countries and at state and local levels.
Open government is the governing doctrine which holds that citizens have the right to access the documents and proceedings of the government to allow for effective public oversight. The doctrine has drawn increased attention in recent years, as a growing list of nations agrees to participate in a global voluntary commitment to democratic reforms through the Open Government Partnership initiative.
The United States was one of the Open Government Partnership’s eight founding countries in 2011, and the partnership grew out of domestic open government initiatives launched in the first months of the Obama presidency.
In December 2009, Obama issued a directive requiring federal agencies to proactively publish government information online in open formats and to take other steps toward building a culture of openness around data.
This initiative launched the Data.gov website that publishes government databases, as well as WeThePeople.gov, for petitioning the government; Challenge.gov, for competing to help the government solve problems; and USASpending.gov, disclosing and tracking the federal budget.
Open government data have already produced direct impacts on Americans’ daily lives. For example, detailed city profiles offer information such as demographics, crime rates, weather patterns and home values. These data, in turn, allow developers to build more robust applications for individuals, such as a health inspection score app or AccuWeather, which provides minute-by-minute precipitation forecasts.
Because the Obama administration’s efforts toward openness were driven by executive orders and not legislation, they faced possible rollback by later administrations. The Trump administration’s early removal of climate science-related data from agency websites, for example, raised concerns among researchers and others about its commitment to transparency and accountability.
The Trump administration has, however, recognized the value of government data for driving innovation and economic growth, holding federal grantees accountable and improving the effectiveness of public services.
The OPEN Government Data Act, signed into law on Jan. 14, enjoyed broad bipartisan support in Congress and built directly on the Obama-era agenda for openness.
Taking effect in January 2020, the act requires government agencies to make their data freely and publicly available in open, machine-readable formats, unless other considerations – such as intellectual property, privacy or national security concerns – indicate otherwise.
Agencies must also develop strategic plans for managing their data; develop a comprehensive and metadata-enriched inventory of their data, minus some national security-related data; and appoint a chief data officer to manage agency data and maximize its value to the government and the public.
In February, the White House issued America’s fourth National Action Plan. This plan echoes the OPEN Government Data Act and emphasizes the need to make federally funded science publicly available in the interest of economic growth, innovation and public health.
As Democratic Rep. Derek Kilmer of Washington said in an interview with Federal Times, “Passing the OPEN Government Data Act was a big step, but it wasn’t the last step.”
While the law requires that all agencies designate a nonpolitical chief data officer, only a few agencies have so far filled this role. Even the existence of a data officer does not guarantee success; the key will be whether the data officers can build a robust agency culture of data sharing and openness.
“There’s, I think, naturally, a tendency within government or any other large institution to favor risk aversion and opacity,” Christian Troncoso, policy director at Business Software Alliance, commented to Fedscoop.com. “People take a sort of siloed view of what they’re working on and don’t necessarily appreciate the fact that the data they may be generating in the course of a project could also be helpful to their colleagues within the agency, certainly, but then to their colleagues across government as well.”
Several features of the act are designed to promote data sharing and best practices in data management. For example, the Office of Management and Budget must create guidance for agencies and a council of agency chief data officers, and collaborate with others to build an online repository of open data tools and standards.
But in practice, the act may still leave significant room for agency discretion in judging data to be restricted, too costly or not worth making open. This could lead to serious gaps in open data. While open government data is a lofty goal, without coordinated implementation its potential may go unrealized.
The OPEN Government Data Act requires agencies to walk a fine line: making data as open as possible, but as closed as necessary because of, for example, cybersecurity concerns over sensitive information.
Federal law already limits some disclosures. For example, under the Confidential Information Protection and Statistical Efficiency Act, statistical agencies face strict rules for protecting personally identifiable information. Employees at agencies such as the Census Bureau face fines and potential jail time for improper disclosure of data.
Existing laws have contributed to a culture of withholding data when sensitive data are present within the data set. But the OPEN Government Data Act could lead to new tensions between openness and privacy, because expanding the universe of open data increases the risk that data that appear anonymized will become personally identifiable once they are combined with other data sets.
For example, in 2014, a London researcher was able to trace an individual’s movements from Transport for London’s open data. With just a little more information, the researcher claims he could have easily identified the individual.
In 2014, another group of researchers successfully deanonymized New York City Taxi and Limousine Commission data. The researchers were able to track specific taxi medallion numbers and, in some cases, specific passenger trips.
While these instances are among a small number of reported cases, we are concerned that other data releases may lead to unintended consequences, such as open data being used to track the movements of individuals. Appropriately, the OPEN Government Data Act calls on OMB and agencies to consider the risks of reidentification from data pooling as they carry out their open data activities.
But it seems to us that truly deidentifying data is an increasingly elusive goal. The act also requires agencies to collect and analyze information on how their data are being used. While this makes sense in the broader context of maximizing the usefulness of government data, it raises its own privacy issues. Implementation choices will be key – and the role of data officers will be critical.
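To make the idea of reidentification from data pooling concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of the London or New York cases: both data sets, the field names and the helper functions `smallest_group` and `link` are hypothetical. The sketch shows how a release with names removed can be joined to a second, named data set on shared attributes such as ZIP code, birth year and sex.

```python
from collections import Counter

# Hypothetical "anonymized" release: names removed, quasi-identifiers kept.
released = [
    {"zip": "47401", "birth_year": 1980, "sex": "F", "trips": 42},
    {"zip": "47401", "birth_year": 1980, "sex": "M", "trips": 7},
    {"zip": "47405", "birth_year": 1975, "sex": "F", "trips": 19},
    {"zip": "47405", "birth_year": 1975, "sex": "F", "trips": 3},
]

# Hypothetical second public data set that does carry names.
public_roster = [
    {"name": "Person A", "zip": "47401", "birth_year": 1980, "sex": "F"},
    {"name": "Person B", "zip": "47401", "birth_year": 1980, "sex": "M"},
    {"name": "Person C", "zip": "47405", "birth_year": 1975, "sex": "F"},
    {"name": "Person D", "zip": "47405", "birth_year": 1975, "sex": "F"},
]

# Attributes assumed to appear in both data sets.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def key(record):
    """Build the tuple of quasi-identifier values for one record."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def smallest_group(records):
    """Size of the smallest group of records sharing the same quasi-identifiers."""
    return min(Counter(key(r) for r in records).values())


def link(released_records, roster):
    """Pair each released record with a name when the match is unambiguous."""
    names_by_key = {}
    for person in roster:
        names_by_key.setdefault(key(person), []).append(person["name"])
    released_counts = Counter(key(r) for r in released_records)

    matches = []
    for record in released_records:
        names = names_by_key.get(key(record), [])
        # A match counts only if exactly one person and one record share the key.
        if len(names) == 1 and released_counts[key(record)] == 1:
            matches.append((names[0], record))
    return matches


k = smallest_group(released)
reidentified = link(released, public_roster)
```

In this toy example, two of the four released records are unique on the shared attributes and can be tied to a named person, while the other two remain ambiguous because they share the same combination. The smallest-group measure is one simple way an agency might gauge this risk before publishing; real assessments would need to account for far more auxiliary data than a single roster.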
Anjanette Raymond, Associate Professor of Business Law and Ethics; Director, Program on Data Management and Information Governance, Ostrom Workshop, Indiana University; Beth Cate, Clinical Associate Professor of Public and Environmental Affairs, Indiana University; and Scott Shackelford, Associate Professor of Business Law and Ethics; Director, Ostrom Workshop Program on Cybersecurity and Internet Governance; Cybersecurity Program Chair, IU-Bloomington, Indiana University