2014 LITA Forum Preconferences

Wednesday, November 5, 1:00 – 5:00 pm, and Thursday, November 6, 8:00 am – noon

Linked Data for Libraries: How libraries can make use of Linked Open Data to share information about library resources and to improve discovery, access, and understanding for library users

Speakers: Dean B. Krafft and Jon Corson-Rikert, Cornell University Library

Linked Open Data (LOD) provides an expressive and extensible mechanism for sharing information (metadata) about all the materials research libraries make available. This workshop introduces the principles and practices of creating and consuming Linked Open Data via a series of examples from sources relevant to libraries. The workshop will provide an introduction to the technologies, tools, and types of data typically involved in creating and working with Linked Open Data and the semantic web. It will also address the challenges of data quality, interoperability, authoritativeness, privacy, and other issues accompanying the adoption of new technologies as these apply to making use of Linked Open Data.

Following orientation and the presentation of some introductory examples, the bulk of the workshop will explore Linked Open Data in two contexts: research networking and library resource discovery. Research networking encompasses two principal activities: showcasing information about research and researchers, and making that information interoperable across a network of institutions. The evolution of the VIVO (http://vivoweb.org) software and ontology as an open, community-based platform provides a series of short case studies illustrating many key features of linked data including integrating data from many sources, developing ontologies to represent information across disciplines, and working through issues of scaling from one institution to regional, national, and worldwide networks. Examples will illustrate creating linked data; disambiguating data sources; harvesting and querying within and across linked data stores; building services to support re-use of linked data in other applications; and visualizations and other forms of reporting and analysis.

The second context reflects work underway in the two-year Mellon-funded Linked Data for Libraries (LD4L) project (https://wiki.duraspace.org/display/ld4l/), where three major research libraries at Cornell, Harvard, and Stanford are collaborating to develop ways to describe and exchange library-relevant metadata as Linked Open Data. This metadata goes well beyond standard MARC to include information on how materials are used, organized, interrelated, and embedded in the process of research and scholarship. Background work at Cornell, Harvard, and Stanford involving linked data and usage data will be reviewed to set the stage for the current project. The workshop will then present current LD4L ontology work and interactions with sample data to illustrate open research issues, including migration from MARC to BIBFRAME; shifting from reliance on strings to identifiers and independent entities or authorities; integration of new types of data to enrich discovery and analysis; and tools for leveraging linked data in other library applications.

Dr. Dean B. Krafft is the Chief Technology Strategist and Director of Information Technology at the Cornell University Library, where he serves as part of the Library’s senior management team. His academic interests include research data management and curation, digital archiving and preservation, and the use of semantic web technologies to support the discovery of and access to research, researchers, and scholarly information resources. Prior to joining the Library in 2008, Dr. Krafft served as the principal investigator on the NSF-funded National Science Digital Library technical network services project, leading the team that built the library infrastructure and operated the library services. Before that, he served for many years as Director of IT for Computing and Information Science at Cornell. Dr. Krafft received his Ph.D. in Computer Science from Cornell University in 1981. Since the 1990s, he has been continuously involved in digital library research, serving as the principal investigator or co-principal investigator on a series of federally funded research projects.

Jonathan Corson-Rikert is Head of Information Technology Services at Albert R. Mann Library, the agriculture and life sciences library at Cornell University. He has worked since 2003 on the practical application of Semantic Web technologies through VIVO (http://vivoweb.org), a project to make information about research and researchers more interconnected and discoverable on the Web. He served from 2009 to 2012 as the Development Lead for the National Institutes of Health "VIVO: Enabling National Networking of Scientists" project (vivoweb.org) and continues in a leadership role in the VIVO international open source community, a project with DuraSpace (duraspace.org).

Learn Python by Playing with Library Data

Speaker: Francis Kayiwa, Kayiwa Consulting

What can be more fun than learning Python? Learning Python by hacking on library data! In this workshop, you'll learn Python basics by reading files, looking at MARC (yes, MARC), building data structures, and analyzing library data (those logs aren't going to appreciate themselves). By the end, you will have set up your Python environment, installed some useful packages, and learned how to write simple programs that you can use to impress your colleagues back at work.
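For a taste of the kind of program the description has in mind, here is a minimal sketch that reads log lines, builds a data structure, and summarizes it. It is not taken from the workshop materials: the log format, the sample lines, and the function name count_actions are all invented for illustration, and it uses only the Python standard library.

```python
from collections import Counter

# Hypothetical sample data: these log lines and their format are invented
# for illustration and do not come from any real library system.
SAMPLE_LOG = """\
2014-11-05 13:02:11 SEARCH title="linked data"
2014-11-05 13:04:37 VIEW record=12345
2014-11-05 13:05:02 SEARCH title="python"
2014-11-06 08:15:40 VIEW record=67890
2014-11-06 08:16:03 SEARCH title="marc"
"""


def count_actions(log_text):
    """Count how often each action (the third field) appears in the log."""
    counts = Counter()
    for line in log_text.splitlines():
        # Split into date, time, action, and the rest of the line.
        parts = line.split(maxsplit=3)
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        counts[parts[2]] += 1
    return counts


actions = count_actions(SAMPLE_LOG)
for action, total in actions.most_common():
    print(action, total)
```

In practice the text would come from a file (for example via open(path).read()) rather than an inline string, and a MARC-focused exercise would likely use a dedicated package instead of plain string splitting.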