OpenFDA: Making Federal Public Health Data Sets Accessible

Taha Kass-Hout, chief health informatics officer for the U.S. Food and Drug Administration, talks about the creation of openFDA.

by David Raths / January 13, 2015

Even skeptics about the responsiveness of an organization as large as the FDA have had to admit that the agency’s first chief health informatics officer, Taha Kass-Hout, has shaken things up with the creation of the Office of Informatics and Technology Innovation (OITI).

Kass-Hout came to the FDA in March 2013 from the Centers for Disease Control and Prevention, where he had helped with the adoption of cloud computing. At the FDA, Kass-Hout’s first endeavor was the creation of openFDA, an initiative launched in June 2014 to make it easier for Web developers, researchers and the public to access public health data sets collected by the agency. In a recent interview, Government Technology asked Kass-Hout about the creation of openFDA.

What was the impetus behind openFDA?

Previously, it was almost impossible for developers and researchers to access the data easily. Also, the Freedom of Information Act Office was getting lots of requests, many of them asking for the same things. If you wanted to look at acetaminophen over time, for instance, you had to download all these separate files, stitch the data together and de-duplicate it. We talked to some developers who said it had taken them almost two years just to construct the data. So the data was public but not easy to access. We wanted to make it easier and more transparent, both for the industry to submit information and for the consumers of the information to access it.

What were the next steps and some decisions you had to make?

When we thought about openFDA, we saw it as sort of the sandbox for how to deal with all the other problems we have to deal with. We have a wide variety of data types, from genomic to regulatory to clinical research. I engaged my team, primarily FDA employees, but we brought in a small company from Silicon Valley [Iodine]. We chose to use a search-based application programming interface (API) that gives developers the ability to search through text within the data. The open source code and documentation are shared on GitHub. We hope this will encourage the industry to move to this API and big data approach. At the same time, we wanted to stop at the API and not force one set of applications or another on people. This method allows them to build their own applications on top of openFDA, giving them flexibility to determine what types of data they would like to search and how they would like to present that data to end users. This enables a wide variety of applications to be built on one common platform.

What types of data sets have proved most interesting to developers so far?

In public meetings with people interested in getting data from the FDA, the first choice was adverse-event reporting of drugs. So that was the first data set we made available: 3.8 million adverse event reports received between 2004 and 2013. You can search by a generic name, brand name, active ingredient, inactive ingredient, etc. The second was recall data; the third was device adverse events, and the fourth was labeling for more than 65,000 products. So it is almost like this trinity: adverse events, recalls and labeling. Through engagement with the community, we are thinking about adding other data sets.
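
For readers curious what a search by generic or brand name might look like in practice, here is a minimal sketch in Python that only builds a query URL and does not contact any server. The endpoint address, the field paths and the helper name build_event_query are assumptions made for illustration; none of them come from the interview, so check them against the openFDA documentation before relying on them.

```python
from urllib.parse import urlencode

# Assumed address of the public drug adverse-event endpoint
# (not stated in the interview).
BASE_URL = "https://api.fda.gov/drug/event.json"

# Assumed field paths, used here for illustration only.
SEARCH_FIELDS = {
    "generic_name": "patient.drug.openfda.generic_name",
    "brand_name": "patient.drug.openfda.brand_name",
}


def build_event_query(kind: str, term: str, limit: int = 5) -> str:
    """Return a URL that searches adverse-event reports for one drug name.

    `kind` selects which assumed field to search; `term` is matched as a
    quoted phrase; `limit` caps the number of reports requested.
    """
    field = SEARCH_FIELDS[kind]
    params = {"search": f'{field}:"{term}"', "limit": limit}
    # urlencode percent-encodes the colon and quotes so the URL is valid.
    return f"{BASE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_event_query("generic_name", "acetaminophen"))
```

Keeping the URL construction separate from any network call mirrors the approach Kass-Hout describes: the API stops at the query, and each developer decides what to fetch and how to present it.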

Do you think the creation of the OITI can spark new ways to think about things in the FDA?

My office, the OITI, is focused on where innovation should take us, looking at data standards and knowledge management issues. But I also work closely with the CIO who runs the Office of Information Management. So the innovation operation has ties with FDA centers and offices, as well as with industry and the development community, to allow us to deliver high-impact solutions that can help us achieve our mission. 

David Raths, contributing writer