Records management has never been the subject of a best-selling book. The corps that dedicates its working life to the discipline just isn’t that large. The subject matter is dense and difficult to explain. It is even harder to do. Expensive too. And it has long been relegated to the category of problem rather than seen as a source of potential solutions for the vexing issues of governing.
Against that background, a hopeful development is bringing together the discipline of old-school records managers and the energy of young hipster data scientists, who have more than thick-rimmed eyeglasses in common. Although they may differ in approach, one acutely aware of the constraints of the current environment, the other leaning into the art of the possible, both see the challenges of a universe of public records that is rapidly expanding in volume and complexity.
Their collective mettle is being tested by the rise of big data — or at least its much-ballyhooed imminent advent — commonly characterized by high velocity, high volume and high variety. The scalable, extensible horsepower of cloud computing brings the velocity. Government adds only modestly to the volume, but its real contribution lies in variety: what it does bring is no garden-variety source of data.
Open data, particularly the kind that can be freely used from government sources, is not just part of that variety — it also holds the promise of being the ballast of big data, anchoring the amalgam of extremely large data sets of disparate origins that are being combined and interrogated to reveal patterns, trends and associations. The unique attribute of public records cannot be overstated: Government is the holder of the singular, authoritative record to which all others refer.
The public value of records grows as governments’ “sense making” capacity increases, surfacing insights hiding in plain sight. The private value of records also increases as they are put to new uses.
Predictive analytics — the subject of regular columns in these pages by Stephen Goldsmith of the Data-Smart City Solutions initiative at Harvard’s Ash Center — and still-nascent exponential technologies are providing ever more powerful platforms to see correlations hiding in plain sight and, increasingly, act on them.
For all the progress of showcase initiatives, including Chicago’s SmartData project running on its WindyGrid platform, we remain much closer to the beginning of this journey than the end.
Technological and societal changes are radically expanding the universe of public records. Unstructured data, audio, video and social media are generally acknowledged to fall within the definition of a public record. Each brings nontrivial questions about how to properly manage it in huge volumes in a way that meets statutory requirements.
The introduction of dash cameras in law enforcement, and now the push for body cameras, is overwhelming agencies’ ability to capture, store, secure, index, search and retrieve huge volumes of video in ways that meet legal tests for evidence.
Social media brings with it myriad records issues for law enforcement and all public agencies. Questions of appropriate use, the creation of public forums, and deciding what and how to archive are being worked out agency by agency, city by city, state by state. The state archivist in Illinois is in the midst of comprehensive rule-making for social media archiving. The early practice of capturing screen images of a social mention will no longer do.
A tweet maxes out at 140 characters. Behind it lie more than 2,000 characters of metadata containing details such as user identity, time stamps and other contextual information. Under the emerging rules in Illinois and elsewhere, the metadata too would be included in the definition of a public record.
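To make the point concrete, here is a minimal sketch of what archiving a tweet together with its metadata might look like. The payload is hypothetical and heavily simplified (field names such as `screen_name` and `created_at` echo common social media conventions but are assumptions, not a real API contract); the point is simply that the archived record must capture far more than the visible text.

```python
import json

# A simplified, hypothetical tweet payload: the visible text plus a
# few of the metadata fields (user identity, time stamps, context)
# that emerging archiving rules would also treat as part of the
# public record. All values here are illustrative, not real data.
tweet = {
    "text": "City council meeting moved to 7 p.m. tonight.",
    "metadata": {
        "id_str": "123456789012345678",            # hypothetical tweet ID
        "created_at": "Mon Apr 06 19:02:11 +0000 2015",
        "user": {
            "id_str": "987654321",
            "screen_name": "example_city_hall",     # hypothetical account
            "verified": True,
        },
        "source": "Example Social Client",
        "geo": None,
        "retweet_count": 12,
        "lang": "en",
    },
}

def archive_record(payload):
    """Serialize the full payload, text and metadata together, so the
    archived record preserves context as well as content."""
    return json.dumps(payload, sort_keys=True)

record = archive_record(tweet)
text_size = len(tweet["text"])
full_size = len(record)
print(f"visible text: {text_size} chars; full record: {full_size} chars")
```

Even in this stripped-down example, the serialized record is several times the size of the message itself; a screen capture of the text alone would lose everything else.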
If the shock of the new isn’t enough, long-standing debates around privacy and security have new urgency as the Internet shows us over and over again what happens when the public record is actually public.