In 2012 the Harvard Business Review declared chief data scientist “The Sexiest Job of the 21st Century.” Sexy perhaps, but hardly universal. In January, Indeed.com had about 90 listings for the title, Monster.com had 19 and the federal USAJobs.gov had none. Still, the title has gained some traction in the public sphere. The White House has appointed a national chief data scientist. The U.S. Department of Commerce and the Environmental Protection Agency have their own, as do some state agencies.
“The rise of the chief data scientist reflects the growing acknowledgment that government must make data-driven decisions to be effective,” said Jennifer Bachner, director of the master’s program in government analytics at Johns Hopkins University.
With the recent arrival of the chief data officer, some might ask why another C-suite data executive is needed. Bachner said the two play very different roles. “The term ‘scientist’ implies that performing and overseeing analyses are core responsibilities. In contrast, the term ‘officer’ might lead people to think that the position primarily involves data management — collection, storage, distribution,” she said. “Today’s chief data scientist needs to be able to use data to develop actionable recommendations, not just gather, store and distribute it.”
As open data, big data and data analytics become commonplace, the role of the data scientist is on the rise. Many in government are eager to see how these tools might help them build models, ponder what-ifs and convert raw information into policy guidance.
Here’s a look at how public-sector data science chiefs are defining this emerging role.
In February 2015 the United States got its first chief data scientist, giving a substantial boost to a job title that is still in its formative stages in many levels of government.
Housed within the White House’s Office of Science and Technology Policy, under the CTO, the new data leader has been called to “harness the power of technology and innovation to help government better serve the American people,” the White House said in announcing the appointment of DJ Patil as chief data scientist and deputy CTO for data policy.
It’s a mouthful of a title, but the task is fairly straightforward. Patil said his job is not so much to crunch the numbers as it is to determine how the data can be used to inform policy.
He points to new data insights in health care, such as the ability to use genomic data and bioinformatics to drive breakthroughs. The federal Precision Medicine Initiative drives these innovations, and that policy is guided in part by Patil and his three-person team. “We are there to help ensure data will be used responsibly, that the rules are there to unleash the power of data and finally to make sure that everyone will benefit,” he said.
Patil comes to the job with a daunting tech resume. He served most recently as the vice president of product at RelateIQ, which was acquired by Salesforce, and previously held positions at LinkedIn, Greylock Partners, Skype, PayPal and eBay.
Today Patil receives his assignments directly from the president, reports to U.S. CTO Megan Smith and collaborates with a wide range of multidisciplinary partners. “You need people who have deep experience in policy, in law, in criminal justice, and then we also have to have the technology people at the table,” he said. “The more we can bring the best minds together, the more effective we are.” Patil has supported a range of presidential priorities, from mental health and suicide prevention to the management of data from public safety body cameras.
As for that sprawling title and the rise of the chief data scientist, Patil tends to play down their importance. “I don’t personally see a big distinction between a data officer and a data scientist,” he said. “It always comes down to the role. A CIO in one shop is oftentimes different from a CIO in another shop. It is very specific to the set of problems that you work on.”
How did all this lead to Kane’s present job helping to manage the state’s Medicaid program as chief data scientist? He calls it a natural progression. “The more you start interacting at the C-suite level, the more you talk about the financial impact on organizations; data always plays a role in that,” he said. “I was always creating another database, trying to understand why something was happening that we couldn’t see from the surface.”
Kane served with the agency for five years before taking on the role of top data scientist. During that time, everyone’s understanding of data was changing. “Our leadership recognized that as the Medicaid program in general was growing, there were also new types of data coming in all the time,” he said.
The potential impact of all that data is enormous, with some 3.5 million claims coming in every week. “That might not be so large for Google, but with health-care data we are just scratching the surface of what can be done,” Kane said. For the data chief and his five-person team, “one of our missions is to help get a handle on this, to look for new and innovative ways of looking at the data.”
Structurally, Kane is part of a larger data management effort. He reports to the chief of the Bureau of Medicaid Data Analytics, which oversees the data program. Kane crunches the data, making it accessible and easy to interpret. A separate business unit looks for ways to put that data to work in practical policy.
Kane also advises the CIO of the Agency for Health Care Administration. He may propose broad ideas, such as implementing a visualization suite, which the CIO eventually adopted agencywide. That was a big change. “Now we can do exploratory analyses or targeted analyses in such a short time, versus what would have taken you days in the past,” said Kane. “Now the answer will just show up right in front of you.”
Overall, Kane sees the chief data scientist as being essentially a builder of bridges, someone who not only manages data, but also helps to integrate that information into the larger vision of the organization.
“It bridges the gap between the technical person and the C-level person, between the clinician and the analyst,” he said. “If you have a complement of skills, if you can do the dirty work and still present it to the board at the end of the day, then there are fewer disconnects. And there are not that many people who can do that.”
The way Curt Savoie has it figured, the data chief’s role in the public arena is to make life better for the citizenry. As Boston’s principal data scientist, a position he held until February 2016, he looked to the city’s top official to set the agenda for how that is to happen.
When Mayor Martin Walsh took office, he promised to tackle a clunky permitting process that seemed to slow investment. “So that filtered down to our saying, ‘How can we help?’” Savoie said. He and his team of about a dozen analysts dug into permit applications, inspection records and other existing data. “We want to send it back into the organizations so that management can make better decisions.”
This basic premise — good data drives good policy — forms the core of Savoie’s work. He said that while he loves analytics, “when I think of what will possibly add value or sustain this work in the years past when I am here, it’s the policy stuff.”
Two years ago, for instance, he crafted an executive order calling for the wide-scale opening of city data. “This is important. Before that, we had a lot of holdouts in different departments, people who weren’t particularly interested, who saw open data as a political risk,” he said. “For the mayor to come out and say this is important, it gives those middle managers a little bit of political coverage.”
Much of Savoie’s work has to do with data literacy, walking city managers through the basics of what data is and why it matters. “If I throw a bunch of numbers at a manager, they might not know how to interpret those,” he said. “Sometimes you have to guide people along the path. And they need to see that it is not just a ‘think’ piece, that this is something actionable that they can move on.”
Rather than asking managers to rise up to the data, it’s sometimes more helpful to get down in the weeds, “to start with something they know,” Savoie said. “They may not have the numbers, but they know their business, they know what they are talking about. If you can speak to what they know, that’s when you really start to provide a service.”
Savoie is not just serving city departments by helping them improve their services. He is also serving the public at large. For the top data scientist these days, that can require a tricky balance.
“You have the obligation to protect the constituents,” he said. It’s not enough just to stay within what the law allows. “There is also just being considerate, having that view of who you work for. It’s about being responsible to the citizens while still using all this information for their benefit. That duality can be tough to battle.”