Session 2: Automating Legacy Data Cleanup Projects


This session was presented live on 06/08/16. The recording and webinar materials are now available.

This 90-minute session will demonstrate two case studies in automating cleanup processes for legacy data. Presenters will demonstrate workflows and the use of tools such as MARCEdit and OpenRefine for batch editing in large retrospective projects.

Learning Objectives:

  • Automate the cleanup of legacy metadata
  • Understand tools for metadata refinement and conversion

Presentation Titles & Presenters:

  • Editing Legacy Metadata for ETDs: Description of a Best Practice Using the MARCEdit Plug-In Tool
Presented by Marielle Veve, Metadata Librarian, University of North Florida

Abstract: As the standards for describing & encoding metadata for electronic theses and dissertations (ETDs) constantly change, so does the need to retrospectively edit records that do not conform to the new rules. The process of retrospectively editing individual ETD records to meet new standards can be grueling and time-consuming, but with the creation of new tools and developments, it no longer has to be that way.
Marielle Veve has been a Metadata Librarian at the University of North Florida Libraries since 2013, where she creates, transforms, and consults on metadata for a number of digitized collections, including electronic theses and dissertations. She holds a Master of Library & Information Science from Louisiana State University (2002) and a Master of Science in Instructional Technology from the University of Tennessee (2012). Prior to her position at the University of North Florida, Marielle served as Cataloging & Metadata Librarian at the University of Tennessee from 2006 to 2013 and as Catalog Librarian for Latin American materials at Tulane University from 2003 to 2006.
  • Looking Back, Moving Forward: A Large-Scale Metadata Remediation Effort
Presented by Maggie Dickson, Metadata Architect, Duke University
Abstract: In 2016, Duke University Libraries is migrating all of its digital collections to our new Fedora/Hydra/Blacklight-based platform, Tripod 3. In preparation for this migration, we are undertaking a large-scale remediation of all descriptive metadata, which consists of more than 1.6 million statements about approximately 112,000 items, created over the course of twenty years, by many different people, and using many different schemas and standards (or not). We have formed a task group to make decisions, identify and engage stakeholders, and guide the workflow. This involves reviewing existing properties and values and evaluating the adoption of standards and vocabularies, with an eye toward linked open data and sharing our resources with the DPLA and beyond. The remediation itself (which at the time of this proposal is ongoing) is being completed using OpenRefine, scripting, and many good old spreadsheets. This presentation will describe the process, its challenges and successes, and future directions.
Maggie Dickson is the Metadata Architect for Duke University Libraries, where she provides leadership and support for the creation and maintenance of metadata for digital projects, the Duke Digital Repository, and beyond. Prior to this position, she worked as a digital projects librarian at the North Carolina Digital Heritage Center, the University of North Carolina at Chapel Hill, and North Carolina State University.

For more information, including how to register, visit the 2016 ALCTS Virtual Preconference web page.

This virtual preconference is generously sponsored by Springer Nature.

Additional ALCTS Events at Annual Conference

For a listing of other conference events, visit the ALCTS conference web page.