“The Future Is Now: Global Authority Control”
ACIG Program and Business Meeting, 25th anniversary celebration
ALA Annual Meeting, Chicago, Illinois, July 13, 2009, 1:30-5:30
PROGRAM
Mary
Mastraccio, ACIG chair, welcomed the audience.
In the interest of time, she did not make extensive introductions of
speakers.
Tim Spalding
(founder, LibraryThing)
Authority Control 2.0?
Spalding
briefly described LibraryThing, software that allows individuals to catalog
their personal libraries, but with features that encourage social networking
and cooperative cataloging, as well as idiosyncratic projects, such as
cataloging the libraries of dead people (e.g. Thomas Jefferson, Marilyn
Monroe). A major feature of LibraryThing
is the use of “tags,” or user-formulated and user-supplied subject terms. These are unstructured, non-hierarchical
terms, and Spalding sees that as a strength.
53 million tags have been added; LibraryThing software aggregates them. LibraryThing is not without authority
control; the two major means are disambiguation and Common Knowledge. Users participate in disambiguation through
“combine” and “split” functions. This
can be done with authors, with works (Spalding seemed to regard a textual work
with different titles in the same language and a textual work translated into
different languages as equally eligible for combination), or with tags. “Authorized forms” are established by
usage. One rule is that combined tags
have to be alike in both meaning and usage.
All the changes made are logged, and can be reversed. Common Knowledge
is a set of data stored in a fielded wiki; users can and do use it to record
things that neither publishers nor catalogers would routinely provide
(character names in a novel, related places and events). 1.5 million edits occur in Common Knowledge
each year. Authority control follows the
Wikipedia model; though not enforced by software, rules are applied by “heads”
(knowledgeable individuals in a field).
Spalding acknowledged shortcomings—data in LibraryThing is not always systematic
or perfect; it is not immune to the “curse of either/or thinking;” and there is
no good way to merge authority files. He
finished by admonishing librarians to see LibraryThing users as valuable
allies—they care about the same things that we do.
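The combine and split functions described above, with every change logged and reversible, can be illustrated with a minimal sketch. This is not LibraryThing's implementation; the class name TagStore, its methods, and the example tags are all hypothetical, chosen only to show how a change log makes each action undoable.

```python
from dataclasses import dataclass, field


@dataclass
class TagStore:
    """Toy store mapping each tag to the canonical tag it is combined into.

    Hypothetical illustration only; not LibraryThing's data model.
    """
    canonical: dict = field(default_factory=dict)  # tag -> canonical tag
    log: list = field(default_factory=list)        # (action, tag, previous, new)

    def resolve(self, tag):
        # A tag that was never combined is its own canonical form.
        return self.canonical.get(tag, tag)

    def combine(self, tag, target):
        """Point `tag` at `target` and record the previous state in the log."""
        previous = self.canonical.get(tag)
        self.canonical[tag] = target
        self.log.append(("combine", tag, previous, target))

    def split(self, tag):
        """Make `tag` stand on its own again and record the previous state."""
        previous = self.canonical.pop(tag, None)
        self.log.append(("split", tag, previous, None))

    def undo_last(self):
        """Reverse the most recent logged change by restoring the prior state."""
        _action, tag, previous, _new = self.log.pop()
        if previous is None:
            self.canonical.pop(tag, None)
        else:
            self.canonical[tag] = previous


store = TagStore()
store.combine("sci-fi", "science fiction")
print(store.resolve("sci-fi"))
store.undo_last()
print(store.resolve("sci-fi"))
```

Because each log entry stores the state before the change, undoing a combine or a split needs no special cases beyond restoring that earlier value.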
Jeanne Spala (Civica/CMI—ILS vendor)
Global Authorities in the Local Catalog—(presentation
on their ILS product)
Spala
described authority control components in her company’s ILS, Spydus. The catalog affords extensive use of external
links to enriched content and Web 2.0 features (data from Google Books,
Facebook, LibraryThing, etc.). There is
robust global-change software that allows for very granular changes. The system is designed for search and display of data
in any language (the current list of supported languages is short). The database supports multiple versions of
MARC bibliographic records, though authority records are maintained in MARC21.
Michael
Kreyche (Kent State U. systems librarian)
Spanish Equivalents for LCSH
Kreyche has
developed a bilingual database of Spanish equivalents for LCSH (available at http://lcsh-es.org ). While a login is required for searching, use
of the database is free. His purpose was
to bring together as many as possible of the disparate lists (some electronic, some not) that
have been independently developed. With
the increase in Spanish-speaking library users in the U.S., the need is
increasing; and Kreyche hopes to develop a global collaborative for further development
and maintenance. He began in 2005 with
headings in bibliographic records from the San Francisco Public Library OPAC,
then added terms from the Queens (N.Y.) Public Library, which then incorporated
the SFPL headings in its own catalog.
Other lists added include CSIC (a Spanish list on CD-ROM), MARC
authority records from the National Library of Spain, Bilindex from 1984 (the
English-Spanish index in the back), and Simon Spero’s LCSH (otherwise known as
the Fred 2.0 Project). In 2007 Kreyche
procured NEH funding, which he used to convert his data to MARC, hire
student programmers, and launch the Connexion Web service, which inserts
Spanish-language headings into bib records.
He finished with another plea for collaborators.
Karen
Smith-Yoshimura (OCLC)
Cooperative Identities Hub
Names as
identifiers are ambiguous. They depend
on context and can change over time.
Current authority practice aims to create unique headings, usually by
adding dates, as LCNAF does, but dates are insufficient for most names and not always helpful. We have lots of sources for disambiguation,
but they are widely dispersed. Within
the RLG Division of OCLC, the Networking Names Advisory Group was formed to answer
the question “In an ideal world, what information would we want for tracking
names?” The group created user scenarios
for librarians, students, archivists, institutional repositories, and
others. The group developed a prototype
Cooperative “Identities Hub”—a framework in which to concatenate/merge
authoritative information on creators, using social networking, to provide a
gateway to all forms of a creator’s name without preferring any single
form. The objectives were 1) to bring
together information about creators from disparate sources and expose a wider
community (not just librarians) to the data; 2) to increase efficiency of
metadata creation; 3) to make it easier to identify creators and their works
regardless of language or discipline; 4) to determine a preferred form within a given
context; and 5) generally to expand knowledge of people and corporate bodies
beyond what the library catalog offers. The
hub data elements include at least one form of name; life events (e.g. origin,
places of output, affiliations); associated entities; some works (though not
necessarily all); a short biography; and unique identifiers. At OCLC, the most visible outcomes of this
initiative have been WorldCat Identities and OCLC’s participation in the
Virtual International Authority File (VIAF).
Among pending developments in the former is a Merge Identities function,
open to anyone. While the function will
operate in the WorldCat.org realm, corresponding changes to underlying
Connexion data will be made as well.
Thom Hickey
(OCLC)
The Virtual International Authority File
The Virtual
International Authority File in its current form began in 2003 as a joint
venture of the Library of Congress and the Deutsche Nationalbibliothek. The Bibliothèque nationale de France joined
about a year later; a large number of libraries began contributing files in
2008. A salient principle of the VIAF
is that rather than attempting to identify a single authorized form of a name,
national/regional variations are allowed to co-exist. Its current scope is limited to personal names
and geographic headings, but eventually all sorts of named entities will be
included (but no subject headings). The
software mines data from bibliographic records associated with the entity and
merges that information with existing authority records. The database contains 10.4 million name
records that form 8.7 million clusters, to each of which is assigned an
identifier. “Match points” include
titles of works, various sorts of dates, joint authors, and LCCNs, matched either fully or
in part. The goal is 99.5% accuracy in matching.
Next steps are to continue
adding participants, expand the scope from current personal/geographic limits,
and get data and files from more sources (e.g. rights agencies, ISNI
(International Standard Name Identifier), regional and specialized files). Benefits for OCLC include improved FRBR
matching of English to non-English records, aid in authority control, and
better regional tailoring of WorldCat.org.
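The idea of grouping authority records into clusters by shared “match points” can be sketched as follows. This is a simplified illustration, not VIAF's actual matching algorithm: the record identifiers, the match-point strings, and the min_shared threshold are all assumptions made for the example, and real matching weighs evidence far more carefully.

```python
from itertools import combinations


def cluster_records(records, min_shared=1):
    """Group records that share at least `min_shared` match points.

    Hypothetical sketch: each record is a dict with an "id" and a set of
    "match_points" (e.g. titles, dates). Uses union-find to merge groups.
    """
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the cluster root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(records)), 2):
        shared = records[i]["match_points"] & records[j]["match_points"]
        if len(shared) >= min_shared:
            parent[find(i)] = find(j)  # merge the two clusters

    clusters = {}
    for i, rec in enumerate(records):
        clusters.setdefault(find(i), []).append(rec["id"])
    return sorted(clusters.values())


# Invented sample data: two records share a title, the third shares nothing.
records = [
    {"id": "LC-1", "match_points": {"title:walden", "birth:1817"}},
    {"id": "DNB-7", "match_points": {"title:walden", "death:1862"}},
    {"id": "BnF-3", "match_points": {"title:candide", "birth:1694"}},
]
print(cluster_records(records))
```

Each resulting cluster would then receive its own identifier, while the national forms of the name inside it remain side by side.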
Janis Young
(LC)
Authorities and Vocabularies, LC’s New
SKOS-Based Service
Authorities
and Vocabularies came from a desire and a demand that LCSH and allied data be
accessible on the Internet for free. The
vehicle of choice was SKOS (Simple Knowledge Organization System), based on the
Resource Description Framework. The current
address of the database is http://id.loc.gov. The database provides human and programmatic
access to the LCSH vocabularies, including subject, genre/form, children’s,
subdivision, and validation records.
There are also links from LCSH to RAMEAU, the authority file of the
Bibliothèque nationale de France. A
primary benefit is the capacity to download the entire file of a controlled
vocabulary for free, though a search interface can locate individual
records. Some of the capabilities and
limitations can be identified by reading information at the Technical Center
link. Coming soon are the Thesaurus of
Graphic Materials (TGM) and the MARC code lists for languages, geography, and
relator terms.
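To give a sense of what a SKOS representation of a subject heading involves, here is a minimal sketch that expresses one concept as subject-predicate-object triples. The SKOS and RDF namespace URIs and the property names (prefLabel, altLabel, broader) are those of the W3C vocabularies; the function name, the example.org URIs, and the mapping of an authorized heading to prefLabel and of cross-references to altLabel are assumptions for illustration, not a description of how id.loc.gov builds its data.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def concept_triples(uri, pref_label, alt_labels=(), broader=()):
    """Return (subject, predicate, object) triples for one SKOS concept.

    Hypothetical helper: the heading becomes skos:prefLabel, variant forms
    become skos:altLabel, and broader terms become skos:broader links.
    """
    triples = [
        (uri, RDF_TYPE, SKOS + "Concept"),
        (uri, SKOS + "prefLabel", pref_label),
    ]
    triples += [(uri, SKOS + "altLabel", label) for label in alt_labels]
    triples += [(uri, SKOS + "broader", parent) for parent in broader]
    return triples


# Invented example URIs under example.org, not real id.loc.gov identifiers.
for triple in concept_triples(
    "http://example.org/subjects/cats",
    "Cats",
    alt_labels=["Domestic cats"],
    broader=["http://example.org/subjects/felidae"],
):
    print(triple)
```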
Diane
Hillmann (Information Institute of Syracuse)
Registering the RDA Vocabularies
Diane asked
“Why register vocabularies?” Her answers
were that users benefited from 1) enhanced visibility (especially outside the
library community); 2) learning of changes; and 3) having broader
participation. Vocabulary owners receive help in
1) managing “versioning” and changes; 2) notifying users and
maintainers; and 3) gaining support for vocabulary extension and “community
formation.” Registration was desirable to make the RDA data
elements and vocabulary lists more useful to the broader audience at which they
are aimed.
As a result of meetings in May 2007 between members of the Joint
Steering Committee for the Development of RDA (JSC), the Dublin Core Metadata
Initiative (DCMI), and other Semantic Web groups, the DCMI/RDA Task Group was
formed. It was charged to build a formal
representation of RDA elements and vocabularies and to set up an Application
Profile, with the end of including the data elements and vocabularies in the
National Science Digital Library Metadata Registry (NSDL Registry).
Among the
challenges the Task Group faced were 1) timing, with the text of RDA still in
flux; 2) technology; and 3) standards and conventions, which were not settled
for RDA vocabularies. The Task Group
discovered that most data elements and vocabulary concepts had changed at least once,
sometimes drastically, and that the Group was last in line to see
revised texts. Registration of the FRBR
entities is intended, but has been delayed by problems with the IFLA
(International Federation of Library Associations) Web site.
Other
challenges: 1) including relationships to FRBR entities as part of the element
definitions creates limitations for use of RDA outside the library community; 2) the
number of techniques for relating RDA elements and FRBR entities adds to the
complication of the schema; 3) decisions made about how to preserve traditional
library data linkages limit usefulness for other communities.
This summary
does not do justice to the detail and breadth of Diane’s presentation. More information can be found at the NSDL
Registry site (http://metadataregistry.org/), and visitors can “play” in the “Sandbox” to see other schemas and perhaps
create their own. While the process has
been difficult, doing the work in parallel with RDA development makes
implementation possible more quickly.
This work will also serve as a basis for migration from MARC to
broader-use platforms.
ACIG 25th anniversary
Attendees
were invited to join in eating cake in celebration of ACIG’s 25th
anniversary, and in honor of the group’s “founding mother,” Barbara Tillett
(Library of Congress). Barbara
reminisced briefly about the group’s genesis and its accomplishments. Early meetings showed an overwhelming
interest in issues related to authority control, and this interest
persists. The downside is that the
issues themselves have persisted; Barbara expressed hope that today’s talks
were emblematic of change to come.