Data scientist Felienne Hermans explains the fickle problems of spreadsheets and how to make them user friendly in the long term.
For office-goers from the top of the ladder to the bottom, in the public sector or private, the question isn’t if they hate their spreadsheets — it's how much.
The horror stories are a testament to that hatred. After all, it was a spreadsheet malfunction that caused the town of Framingham, Mass., to believe it had $1.5 million more in its 2011 budget than it did. It was also a spreadsheet that made organizers of the 2012 London Summer Olympics overbook four swimming events by 10,000 people. And again, it was another spreadsheet that nearly cost California’s Kern County $12 million in tax revenues when a gaffe hid $1.26 billion worth of taxable real-estate.
These tales of spreadsheet woe are only some of the scores heard by Felienne Hermans, an assistant professor at Delft University of Technology based in the Netherlands. Hermans has a Ph.D. centered in spreadsheet data extraction and is the founder of a startup called Infotron, a company that employs algorithms for spreadsheet analysis. Her typical day is always a hopscotch among lines, columns and rows.
In January, Hermans visited California to speak at the O’Reilly Strata Big Data Conference held in Santa Clara, and recently spoke with Government Technology to offer followup advice on how everyone from organizations to average users can improve and untangle their spreadsheets.
Spreadsheets are quick to create and stay a while. Herman’s Ph.D. research, that surveyed and studied a number of enterprise-level businesses, showed that each spreadsheet lives five years — roughly as long as a pet mouse or hamster.
“They live for a long time,” Hermans said, and as such, she recommended a good filing and folder system — one that is intend to endure not just for months, but years.
Most spreadsheet makers often think their work will only be seen by a small few. However, within any large organization, Hermans said on average 50 percent of business decisions are based upon spreadsheets, and about 12 people will view or work with a single spreadsheet file. Surprisingly, Hermans said that despite the many hands a spreadsheet passes through, only 33 percent of them have a page that includes a manual explaining how to read and use them. Illustrating the sticky point by way of example, Hermans said this can be problematic for employers going through a hiring process, as new staff — even if endowed with expert skills — could likely have no idea how to interpret the data.
“They evolve into something that multiple people are using,” Hermans said. “And I would say you have to give some sort of description of what you want the spreadsheet to do.”
Many spreadsheets “smell.” It’s a fact rarely mentioned, mainly because most people don’t realize spreadsheets are code, Hermans said. Spreadsheets, she reasoned, answer many problems and have similar qualities and challenges of programming code; because of this, they also have “smells.” In this sense; the word “smell” draws its definition from programmers who describe it to mean code that is confusing, sloppy or just simply poor quality.
“Excel formulas are just as expressive as any other type of code languages,” Hermans said.
With spreadsheets, the formulas can start to smell and become difficult or complicated to work with afterward. Examples of this include extra-long formula operations, cell values that need to be traced back to cells in different pages for context, and in cases of duplicated formulas, where a cell’s formula contains other cell formulas.
Consciously thinking about clarity and simplicity is the overall premise Hermans encouraged here. Make sure cell values are well labeled, formulas are as short as possible and a cell's formula on one page draws on a value of another page. And first, consider placing the value — with a label — somewhere on the new page so it can be easily referenced.
Hermans has additionally been working with other researchers to craft a free software tool called Bumblebee that can be found on her blog. Named after the Autobot character in Transformers, the program allows users to simplify spreadsheet formulas by transforming them into shorter and clearer versions — a tool similar to tools that improve code. It does this by way of a simple and easy-to-use interface that highlights potential cell formula transformations inside each spreadsheet.