A City University of New York faculty member leads a team that made Wikipedia printable and quantified the massive amount of data on the crowdsourced website.
Wikipedia has grown into a huge collection of crowdsourced knowledge. But it's hard to visualize just how massive it is — unless it's printed.
Michael Mandiberg, a member of the doctoral faculty at The Graduate Center - City University of New York (CUNY), has been working on the Print Wikipedia project for the last six years with a team of designers and programmers, including Denis Lunev, Jonathan Kiritharan, Kenny Lozowski, Patrick Davison and Colin Elliot. A consistent editor of Wikipedia and an artist, Mandiberg wondered what value a book in the Digital Era holds and wanted to make something different out of the online collection.
"I was really interested in what a book meant in a digital era, and thinking about sort of the end of print and what value a book still held — and what to do with them as they were increasingly being discarded," Mandiberg said.
He had already used a laser cutter to cut words like "data" and "base" into open books as part of his art experiments. So why not create a software program that makes Wikipedia printable in bound volumes?
The team worked closely with the Wikimedia Foundation and the on-demand printing site Lulu.com as they worked to make Wikipedia available for printing. The first software program they created uses Wikipedia's monthly database backup to run a code that generates PDFs of the text. For that many physical volumes, it takes about three to five weeks to turn them into PDFs. But because they had to work through typical bugs and revise their program, they spent April through mid-June running and rerunning these processes.
Once the first software program finished the PDFs, a second program actually uploaded them to Lulu.com and simulated the human actions of clicking on buttons to customize an order. While the website previously had an application programming interface (API), they scrapped it because it wasn't being used. So the Print Wikipedia team took the creative maker/hacker approach with the second software program that Mandiberg created.
Because of his experience on this project, Mandiberg has been able to share real-life examples of how to troubleshoot problems, work with collaborators and embrace failure as an iterative process in his classes at The Graduate Center - CUNY. He teaches students from a new media art practitioner's perspective about how to start with a small experimental project and make progress toward a final product instead of starting with a big research project and ultimately narrowing its the focus.
The sheer volume of data in projects like these is too big for humans to understand because gigabytes and terabytes don't actually portray its size.
"If this is a post-digital era, it is also an era of big data, and big data controls us," Mandiberg said. "We are constantly creating data as we move through the world, but we don't really have a sense of the scale or the size of this."
And by scale, we're talking thousands of books. If someone wanted to print the Wikipedia collection, it would be an order of 7,473 physical volumes of dictionaries that run about 700 pages each and cost half a million dollars. But don't worry, no one's actually printed all of them.
Through Print Wikipedia, Mandiberg has learned that people care about books and that books are a useful measure of knowledge. He sold 50 copies of Wikipedia volumes at an art exhibition about the project in New York City in June, and hundreds of people are buying select volumes from the website.
"Print is dead, but people really are emotionally attached to books," Mandiberg said. "They hold a very kind of important affective place in our collective cultural experience."