Editor’s note: The following is the third installment of the Digital Communities special section in the September issue of Government Technology magazine.
How to Start
If analytics sounds like something your city or county could use, how does one begin? According to Maryland Chief Innovation Officer Michael Powell, it’s simple: All you need is some data and a lively sense of curiosity. And he should know: He’s been in the analytics game a long time. Powell began working on Baltimore’s groundbreaking CitiStat program back in 2001 and now works with Maryland’s StateStat program under Gov. Martin O’Malley.
Most cities and counties have what Powell calls “single-purpose” data residing in databases and spreadsheets. The trick is to see what kinds of questions can be asked about it that will squeeze out additional value. “The first thing is to just take the data that you have, and sit down and explore it,” said Powell. The process often leads to taking different sets of single-purpose data and comparing them.
Baltimore, for example, cracked down on unregistered rental properties by comparing two sets of single-purpose data. “In Maryland, if you have a rental property, you are required to register it,” Powell said. He compared a rental registration billing database with a real property database that indicates if a property is owner-occupied or a rental. “I found that a large number of rental properties were not paying their rental registration,” he said. As a result, the Housing Department started using the real property database to identify potential rental properties, and revenue increased by more than half a million dollars per year.
Powell suspects that Baltimore is not alone in having untapped data. “The reality was most homeowners who were renting out their properties didn’t know that they were required to register them. So we were able to increase revenues just by getting people into compliance. Nobody had thought to look at that before.”
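The kind of cross-check Powell describes can be sketched in a few lines of Python. This is only an illustration, not Baltimore’s actual process: the records, parcel IDs and field names (`parcel_id`, `owner_occupied`) are invented for the example, and it assumes both data sets share a common property identifier.

```python
# Hypothetical records; parcel IDs and field names are illustrative only.
real_property = [
    {"parcel_id": "P-001", "owner_occupied": True},
    {"parcel_id": "P-002", "owner_occupied": False},
    {"parcel_id": "P-003", "owner_occupied": False},
    {"parcel_id": "P-004", "owner_occupied": False},
]
rental_registrations = [
    {"parcel_id": "P-002"},
    {"parcel_id": "P-005"},
]


def find_unregistered_rentals(properties, registrations):
    """Return parcel IDs that look like rentals but have no registration."""
    # Build a set of registered parcels for fast membership checks.
    registered = {r["parcel_id"] for r in registrations}
    return sorted(
        p["parcel_id"]
        for p in properties
        if not p["owner_occupied"] and p["parcel_id"] not in registered
    )


unregistered = find_unregistered_rentals(real_property, rental_registrations)
print(unregistered)
```

In practice the hard part is usually the shared identifier: if the two systems record addresses or parcel numbers differently, they have to be cleaned up before a comparison like this means anything.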
People with the technical skills needed to staff what Powell describes as a “small, lean group of smart young analysts” aren’t hard to find. Analytical thinkers can be found in economics departments or may have a degree in statistics.
“In my long experience doing CitiStat and then StateStat, we have a variety of people who were teachers and policy folks,” Powell said. “I was a GIS person. For geographic analysis, the mapping part is not hard. It’s making sure you have good data quality, and that you have curious people who have an understanding of the business.”
By knowing what it’s like to be a health or restaurant inspector, those tasked with data analysis can think of smart questions to ask when working with the data.
Now that many jurisdictions have 311 systems, said Powell, that data can be repurposed to stimulate questions and find solutions.
One concept that helps translate analytics findings to staff and residents is a “dashboard” — a list of goals, for example, and some quick way of communicating how the jurisdiction is doing with regard to those goals. Maryland’s StateStat has O’Malley’s 16 policy goals, with red, yellow or green indicators that signify progress toward each goal. Clicking on a goal brings up more detailed information about it and the analytics underlying it.
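A red, yellow or green indicator can be as simple as a comparison between a measured value and its target. The sketch below is a hypothetical illustration, not how StateStat computes its indicators: the goal names, the numbers and the 80 percent “yellow” cutoff (`warn_ratio`) are all assumptions made for the example.

```python
def status_color(actual, target, warn_ratio=0.8):
    """Green at or above target, yellow within warn_ratio of it, else red."""
    if target <= 0:
        raise ValueError("target must be positive")
    ratio = actual / target
    if ratio >= 1.0:
        return "green"
    if ratio >= warn_ratio:
        return "yellow"
    return "red"


# Hypothetical goals as (actual, target) pairs; higher is assumed to be better.
goals = {
    "Goal A": (105, 100),
    "Goal B": (85, 100),
    "Goal C": (40, 100),
}

dashboard = {name: status_color(actual, target)
             for name, (actual, target) in goals.items()}

for name, color in dashboard.items():
    print(f"{name}: {color}")
```

Where the cutoffs sit is a policy decision rather than a technical one, which is part of why the detail behind each indicator matters more than the color itself.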
While the ease of interpreting a green, yellow or red marking is important from a transparency perspective, Powell said it’s really only the tip of the iceberg. The real value of analytics is to drive deep conversations around specific issues. To illustrate, he outlined a state-level conversation focused on unemployment insurance.
“You sit the unemployment insurance people down and say, ‘Show us what this stuff looks like.’ And then you start asking questions like, ‘Can we look at the difference between people who find a job within six months of receiving unemployment benefits and people whose benefits expire and they age out of the program? Can we look at the jobs people find and understand what industries they are moving into? Can we integrate that unemployment data set with a data set on education, and understand the education background of those who end up unemployed?’ People don’t think to ask those questions unless they explore the data that they have. This is the opportunity.”
So how are such efforts funded? Try an innovation fund, where general fund money is set aside for innovative projects. When those projects succeed, a portion of the savings goes back into the innovation fund and the fund grows, fueling more projects. Powell, who maintains that the return on investment on such projects can be “staggering,” said that’s a very workable model.
Where does the rubber meet the road? “Curiosity,” Powell said. “Take some data and put it on a map, and you might be surprised at what you find.”
Beth Blauer, who wrote the GovStat Program How-To Guide, also cut her teeth in the Maryland StateStat program and is now director of GovStat for Seattle-based Socrata, an open data platform provider.
According to Blauer, the most difficult part of running such a program is “finding the will to simply get started.” In other words, even if the impulse to start an analytics program arises at the grass-roots level, it takes top-level leadership (the mayor, county executive or governor, for example) to turn the data analysis into real action that solves problems.
To do that, said Blauer, tell a story. “I always bring an arsenal of stories.” For example, she tells about what happened when foster home locations were mapped in Maryland and then compared to a map of where sexual offenders lived. They found “overlap.”
“Or you take data around children who live in families that are eligible for supplemental food but are not receiving free or reduced-cost lunch,” she said. “Those are data sets that are being generated for many different purposes and are not always coordinated in a way that will help solve a real-life problem.” Doing so, said Blauer, gets action. “Political leaders say, ‘OK, I get it! These are problems I’m trying to solve.’”
Open data seems essential to a robust analytics program, but Blauer said that “open to the public” is a default definition of “open data” that misses the point. Open data, she explained, is “freeing data so that it can be made actionable.” That means data governance, data curation and comprehensive metadata that enables internal use. Then it can also be made available to the public. “Places like Chicago and San Francisco have used the data internally and then pushed it out to the public,” she said. “The applications developed on that data are much stronger, much more reliable.”
Future of Analytics
Big retail stores like Target, Wal-Mart and Amazon reuse the sales data generated by brick-and-mortar cash registers and online sales. Buying a certain brand of laundry detergent may trigger a discount coupon for a specific brand of fabric softener, because analytics has shown that purchasers of one most often favor the other. Online searches for coffee makers may trigger ads for coffee makers on other sites, buyers of one book title are told that others “most often also purchased this book,” and so on. Even patterns of heavy and light sales days help stores predict staffing needs.
“That’s where analytics in government is going,” Powell said. “We have data we’ve collected for specific purposes, like processing claims or enumerating crimes, and we’re starting to see there’s a bunch more value in that, which can answer questions that we don’t even know we have. The first step is exploring it.”
Predicting things with analytics is likely to become more useful over time. Powell, for example, said it’s possible to predict the risk that a parolee or probationer will reoffend. “It doesn’t make sense to supervise every person the same,” he said. “So we have built a statistical algorithm that says how likely they are to commit a crime while they are on parole or probation. We can classify different tiers of people that we supervise, and we supervise them differently based on what tier they are in.”
While some risk is intuitive, analytics is starting to get much smarter, and subtle indicators that may be missed, even by experienced staff, are starting to be recognized by the algorithm. And that means better safety for the public and better allocation of scarce resources.
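The tiering step Powell describes, once a risk estimate exists, amounts to sorting people into bands. The sketch below shows only that last step and is not Maryland’s algorithm: it assumes some model has already produced a risk estimate between 0 and 1 for each case, and the cutoffs (`low_cutoff`, `high_cutoff`), tier names and case IDs are invented for illustration.

```python
def supervision_tier(risk, low_cutoff=0.2, high_cutoff=0.6):
    """Map an estimated risk (0 to 1) to a supervision tier."""
    if not 0.0 <= risk <= 1.0:
        raise ValueError("risk must be between 0 and 1")
    if risk >= high_cutoff:
        return "high"
    if risk >= low_cutoff:
        return "medium"
    return "low"


# Hypothetical risk estimates, assumed to come from a separate model.
caseload = {"case-1": 0.05, "case-2": 0.35, "case-3": 0.80}

tiers = {case_id: supervision_tier(risk) for case_id, risk in caseload.items()}

for case_id, tier in tiers.items():
    print(f"{case_id}: {tier} supervision")
```

The statistical work lies in producing the risk estimate; the tiers simply make it usable for deciding how supervision resources are divided.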
Some barriers to sharing and comparing different kinds of data are disappearing, said Powell. In the last 10 years, for example, the amount of digital data that’s available to be queried has grown exponentially and likely will continue to climb. And compatibility barriers also are falling. Years ago, even a document in WordPerfect was unreadable without the program, and customized systems, relational databases, etc., required expensive data translation and extraction. But technology is solving those issues, and 10 years from now Powell expects those barriers to be gone, or at least of minor concern.
But one barrier to large-scale collection and analysis of data isn’t likely to disappear soon: privacy, especially following revelations of the NSA’s PRISM spying program. The safety and security of personally identifiable information such as medical records and phone calls is a major concern. “We’re not trying to hone in on information on one person,” said Powell of the state’s big data and analytics programs. “We’re interested in a broad look at health, unemployment or crime in the state of Maryland.”
So in this time and in this place, analytics might be our best hope of making sense of masses of data too large to comprehend. It may provide insight into trends too large or too small, too slow or too fast for us to perceive, and help us manage our most complex concerns, our largest counties and smallest towns. But it also has the potential to be turned against us to create a surveillance state. The analytics predicting the outcome of that conflict have yet to be developed.