Stanford and Cornell are leading library efforts to help machines understand bibliographic information so it will be searchable on the Web.
Academic libraries are making progress on their quest to break bibliographic information out of closed catalog systems and share it on the Web for more people to find.
This spring, Cornell, Harvard and Stanford completed a two-year planning grant project and started work on two new implementation grants from the Andrew W. Mellon Foundation that totaled nearly $3 million. These related grants tackle several key problems for libraries today.
The MAchine-Readable Cataloging (MARC) data format emerged nearly 40 years ago, and allows computers to share and use bibliographic information in library catalogs so patrons can find out more about the items libraries have in their collections. But outside of the library world, few organizations use the MARC data format, and external machines can't read it, making it difficult for people to find information about library holdings on the Web, said Philip E. Schreur, assistant university librarian for technical and access services at Stanford University's Green Library.
Enter the Linked Data for Libraries projects. The two-year grants awarded this spring support library initiatives that develop and advance the use of linked open data, which uses hyperdata links instead of hypertext links on the semantic Web to help machines and people discover content that wasn't available before. This move to linked data requires a change in library processes, tools and staff skills that a number of organizations (including Cornell, Harvard, Stanford, Columbia and Princeton universities, along with the Library of Congress and the University of Iowa) are working on in different ways.
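To make the idea concrete, linked data is generally expressed as subject-predicate-object statements, the model behind the Resource Description Framework. The short Python sketch below is an illustration only and is not taken from any of the grant projects: the `example.org` identifiers, the title and the author name are hypothetical, and the helper functions `objects` and `author_names` are invented for this example. Only the Dublin Core and FOAF vocabulary addresses are real.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """One linked-data statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Real, widely used vocabularies for describing works and people.
DC = "http://purl.org/dc/terms/"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

# Hypothetical identifiers, made up for this illustration.
BOOK = "http://example.org/book/1"
AUTHOR = "http://example.org/person/1"

# A tiny "graph": each fact about the book is its own statement, and the
# author is a link to another resource rather than a plain text string.
graph = [
    Triple(BOOK, DC + "title", "An Example Title"),
    Triple(BOOK, DC + "creator", AUTHOR),
    Triple(AUTHOR, FOAF_NAME, "Example Author"),
]


def objects(triples: List[Triple], subject: str, predicate: str) -> List[str]:
    """Return every object stated for the given subject and predicate."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]


def author_names(triples: List[Triple], book: str) -> List[str]:
    """Follow the creator link from a book, then read each creator's name."""
    names: List[str] = []
    for creator in objects(triples, book, DC + "creator"):
        names.extend(objects(triples, creator, FOAF_NAME))
    return names


if __name__ == "__main__":
    print(author_names(graph, BOOK))
```

Because the author is a link rather than a text field, any program that knows the shared vocabulary can follow it to other statements about that person, which a record locked inside a single catalog format does not offer.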
"The whole point is to shift to linked data because it has a structure called 'resource description framework' that allows a machine to actually understand what the data is," Schreur said, adding that while this shift could take eight to 10 years and several grant renewals, he hopes it will be done in four to five years before new technology comes out.
The grant project that Stanford is leading focuses on practical steps libraries can take to transition to linked data, while Cornell's project focuses on developing tools to enable that process. The two teams will work together to figure out their needs and how the tools can help them, specifically with items including rare books, music, art objects, maps and globes, and rare materials.
Cornell has been working on linked data since 2003 and developed VIVO, an open-source software tool used to discover scholarly work. Within VIVO, the university also designed an integrated ontology editor and semantic Web app called Vitro that allows library staff to create metadata. Cornell is working to expand Vitro's use to include broader kinds of metadata.
"There is a lot in what we do that's colored by the way we did things in the past," said Chew Chiat Naun, director of cataloging and metadata services at Cornell, "and this is kind of a chance to re-examine some of that."