Tom Zillner, Editor
CORC: New Tools and Possibilities for Cooperative Electronic Resource Description
Karen Calhoun, John J. Riemer, eds. New York: Haworth, 2001. 184p. (ISBN 0-7890-1304-5).
It is noteworthy that the week I submitted this review for publication was the same week that OCLC's Cooperative Online Resource Catalog (CORC) service came to an end--not because of its failure, but because of its success. After more than three years of ongoing, extensive upgrades, CORC has evolved into OCLC's new Web-based Connexion interface. This could not have come about without the commitment, time, energy, and expertise of the early pioneers, a group well represented by the authors of this volume, who worked in CORC from the time of its experimental infancy. CORC began as an experimental cooperative project in early 1999 and became a full OCLC production service in July 2000. CORC was both a database of metadata records for online resources and an experimental interface for creating those records, along with sharable pathfinders. But, as this volume makes clear, CORC was also much more than that for those who participated in it.
This book consists of fourteen separate articles written by individuals or groups of coauthors, published simultaneously in this book and in the Journal of Internet Cataloging, volume 4, number 1/2, 2001. The book also includes a preface by Jay Jordan, president of OCLC, and an introduction by the two coeditors, Karen Calhoun of Cornell University and John J. Riemer of the University of Georgia. All fourteen articles center around the writers' early experiences with the OCLC CORC service, but contain widely different perspectives. The subtitle of the volume conveys the theme that ties these diverse articles together: "new tools and possibilities for cooperative electronic resource description." The new tools explored here are those that developed within the CORC service, opening up new possibilities for cooperation in creating metadata for electronic resources, especially the burgeoning number of resources available on the Web. The types of cooperation covered include cooperation between OCLC and CORC participants, cooperation among participating institutions, and cooperation within institutions among technical services, public services, and collection development staff.
The articles in this volume were written when CORC was still an experimental project and its records had not yet been integrated into WorldCat. Given the rapid changes in the world of metadata and in the CORC service itself, some of the information in these articles is already out of date. But on the whole the book remains remarkably relevant for recent users of CORC, and now of Connexion, with regard to cooperative description of online resources. Especially useful, perhaps, are the articles that describe the development of local workflows and the application of Dublin Core (DC), MARC, and CORC functionality to institutional processes. The volume still has much to offer institutions that are already engaged in creating metadata for online resources, but is especially helpful for those that have yet to begin. Readers will find a wealth of theoretical and practical information in this book, and they will discover blueprints and models to consider when planning their own institutions' workflows.
The first two groups of articles in the volume look at CORC from a theoretical perspective, the first three offering a big-picture view of CORC. First off, Thomas B. Hickey of OCLC writes about collaboration in CORC, providing a good overview of the CORC project. He emphasizes the ways in which the project continues and expands the long tradition of cooperation and collaboration developed between libraries and OCLC, especially in the areas of systems and standards, and manifest in both the CORC Resource Catalog and CORC Pathfinders. Hickey stresses standards as the mainstay of library cooperation.
In their article on CORC and the future of libraries, authors Charlene Hurt and William Gray Potter share the perspectives of two library administrators. They suggest that CORC may offer the hope of being "the universal bibliography," containing both local and universal records, and controlling collections previously not well controlled by libraries, such as special collections, manuscripts, archives, and media, along with Web sites, e-journals, and other new e-resources. This vision rests, I assume, on the potential of CORC to eventually include records in multiple metadata formats in addition to AACR/MARC and Dublin Core. Concluding the first group of articles, John J. Riemer writes about the rationale and possibilities for a relationship between CORC and the Program for Cooperative Cataloging (PCC). He includes some interesting comments about the dynamic tension that exists between the DC and MARC standards and how PCC members can monitor and help establish core-level record equivalents in other metadata schemes.
The second grouping of articles represents a research and development view of CORC. The four articles in this section discuss ways in which OCLC is investigating methods for letting computers automate and simplify the resource-description process, allowing human beings to concentrate on the more intellectually demanding tasks that computers cannot do (at least not yet, if ever). One such area of exploration concerns ways to simplify the use of subject terminology in metadata records. Lois Mai Chan, Eric Childress, Rebecca Dean, Edward T. O'Neill, and Diane Vizine-Goetz write about the development of Faceted Application of Subject Terminology (FAST). FAST provides a new approach to subject vocabulary for Dublin Core based on Library of Congress Subject Headings (LCSH), but applied with a simpler syntax. The core idea of FAST is the separation of LC subject terminology into four complementary facets: topical, geographic, form, and time period. Rather than relying on precoordinated subject strings, computerized metadata systems post-coordinate these facets based on users' search criteria. FAST is not intended to replace fully developed LCSH strings in AACR content records, but to allow for a simpler form of resource description by those not well-versed in the complexities of LCSH. This would still allow for subject searches of far greater richness than simple uncontrolled keyword searching.
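The faceting idea described above can be illustrated with a minimal sketch. It is not OCLC's FAST algorithm; it only assumes that a precoordinated heading arrives as MARC 650 subfield pairs ($a topical term, $x general subdivision, $z geographic, $v form, $y chronological) and sorts each part into one of the four facets named in the article. The function names facet_heading and matches, and the sample heading, are hypothetical.

```python
from collections import defaultdict

# MARC 650 subfield codes mapped to the four facets named in the review.
# Treating $x (general subdivision) as topical is a simplifying assumption.
SUBFIELD_TO_FACET = {
    "a": "topical",
    "x": "topical",
    "z": "geographic",
    "v": "form",
    "y": "period",
}


def facet_heading(subfields):
    """Split one precoordinated heading, given as (code, value) pairs, into facets."""
    facets = defaultdict(set)
    for code, value in subfields:
        facet = SUBFIELD_TO_FACET.get(code)
        if facet is not None:
            facets[facet].add(value)
    return dict(facets)


def matches(record_facets, query):
    """Post-coordinate at search time: every requested facet value must be present."""
    return all(
        value in record_facets.get(facet, set())
        for facet, value in query.items()
    )


# Hypothetical heading: Railroads -- Ohio -- History -- 19th century -- Maps
heading = [
    ("a", "Railroads"),
    ("z", "Ohio"),
    ("x", "History"),
    ("y", "19th century"),
    ("v", "Maps"),
]
facets = facet_heading(heading)
```

With the heading broken apart this way, a searcher can combine any facets in any order, for example matches(facets, {"geographic": "Ohio", "form": "Maps"}), without knowing the order in which LCSH would have strung the subdivisions together.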
Authors Carol Jean Godby and Ray Reighart offer a summary of the WordSmith research project's contribution to CORC. The primary goal of this project is to automatically identify significant subject terminology in machine-readable text so as to harvest and automatically generate subject terms in records for electronic resources. These terms may be accepted by resource describers in whole or in part as uncontrolled subject keywords, or they may serve as a starting point for development of controlled vocabulary terms within a metadata record.
In her article "Dewey in CORC: Classification in Metadata and Pathfinders," Vizine-Goetz discusses OCLC's efforts to map the vocabulary of LCSH to Dewey Decimal Classification (DDC) and to experiment with automated DDC classification for e-resources in CORC at the time of record creation. This type of automated classification remains an area of active research at OCLC.
Childress, one of the leaders within the CORC project at OCLC, writes about "Crosswalking Metadata in the OCLC CORC Service." From its first release, CORC offered support for both the DC Metadata Element Set (DCMES) and OCLC-MARC as available views of all resource records. The key to supporting this capability has been a behind-the-scenes "crosswalk," based on a metadata-to-metadata conversion specification. Childress points out that OCLC plans to develop additional crosswalking capabilities for other major metadata standards in the future.
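A table-driven crosswalk of the kind described above might be sketched as follows. The mapping table is a simplified, hypothetical illustration and is not the conversion specification CORC used; the element-to-field pairings, the function names, and the sample record are all assumptions made for the example.

```python
# Hypothetical, simplified mapping table; NOT CORC's actual specification.
# Each Dublin Core element maps to a (MARC tag, subfield code) pair.
DC_TO_MARC = {
    "title": ("245", "a"),
    "creator": ("720", "a"),
    "subject": ("653", "a"),
    "description": ("520", "a"),
    "publisher": ("260", "b"),
    "date": ("260", "c"),
}
# Reverse table for viewing a MARC record as Dublin Core.
MARC_TO_DC = {target: element for element, target in DC_TO_MARC.items()}


def dc_to_marc(dc_record):
    """Convert (element, value) pairs to (tag, code, value) triples.

    Returns the converted fields plus any pairs the table cannot map.
    """
    fields, unmapped = [], []
    for element, value in dc_record:
        target = DC_TO_MARC.get(element.lower())
        if target is None:
            unmapped.append((element, value))
        else:
            tag, code = target
            fields.append((tag, code, value))
    return fields, unmapped


def marc_to_dc(fields):
    """Convert (tag, code, value) triples back to (element, value) pairs."""
    return [
        (MARC_TO_DC[(tag, code)], value)
        for tag, code, value in fields
        if (tag, code) in MARC_TO_DC
    ]


# Hypothetical record with one element the table does not cover.
record = [
    ("Title", "Railroad maps of Ohio"),
    ("Creator", "Example, Ann"),
    ("Rights", "Public domain"),
]
marc_fields, leftover = dc_to_marc(record)
round_trip = marc_to_dc(marc_fields)
```

In this sketch the Rights element has no entry in the table, so it lands in leftover and is missing from round_trip. That is one simple way information can fail to survive a switch between views.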
This crosswalking between DC and MARC is one of the great achievements of the CORC project, but it also exhibits, in the opinion of some CORC users not represented in this volume, the potential for significant confusion among library catalogers and other resource describers, and can result in compromising the integrity of both AACR/MARC and DC metadata created according to their own unique standards. But this is not to diminish the value of CORC's role in developing an actual working crosswalk, extremely important for general metadata interoperability, nor the fact that this capability within CORC proved central to several libraries' cross-divisional workflows for online resource description, as recounted in later articles in this book.
The third and fourth groupings of articles are written by implementers of local metadata projects using CORC. The third group deals with CORC implementation within cataloging departments and within cross-functional institutional teams. Jeff Edmunds and Roger Brisson (Pennsylvania State University) stress that, at the time of their writing, CORC was very much a work in progress. They note that CORC included both important innovations and significant drawbacks, as well as many automated capabilities still in their infancy or yet to be realized as useful in practice, such as metadata harvesting and automated DDC generation. In the section of their article titled "CORC As a Testbed for Innovation," the authors point out that many previous technologies and many years of work of the OCLC Office of Research went into CORC, and that during the early period of beta testing the CORC interface was continuously revised, much of it in direct response to feedback from the early users of CORC based on their hands-on experience. The authors see the CORC project as enabling a melding of two cultures: those of cataloging specialists and of computing and Internet specialists.
Norm Medeiros, Robert F. McDonald, and Paul Wrynn write about utilizing CORC to develop and maintain access to biomedical Web sites at New York University School of Medicine. As part of their move from subject-specific biomedical Web pages to using CORC pathfinders, they developed an excellent list of resource types to be considered for selection and those to be eliminated from consideration. Like many of the other authors in this work, they selected DC as a standard that allowed a broader group of staff to create metadata in CORC. Rather than using the library OPAC, their Web team used a subject-based approach within the library's Web site as the locus for patrons to access these resources. "In this approach, users select their subject interest, and have the option to view e-journals, e-texts (when available), Web sites, online catalog (MEDCat) resources, and recent Medline citations on that particular subject" (120).
The next two articles discuss CORC as the basis for collaboration across internal library divisions. Ann Caldwell, Dominque Coulombe, Ronald Fark, and Michael Jackson write about collaboration between catalogers and reference librarians in the OCLC CORC project at Brown University. They provide a good deal of practical information, including a list of expected outcomes, implementation strategy and training schedule, local guidelines for selection of no-fee Web resources for cataloging, and a list of results of the project.
Calhoun describes how the CORC at Cornell project allowed principal functional groups that typically worked independently--technical services, collection development, and public services--to work together collaboratively. The project initiated an experimental workflow for Internet resources, differing in three ways from their existing workflow: (a) selectors prepared the preliminary records in CORC, using the DC standard; (b) reference librarians as well as selectors identified, chose, and created preliminary records for Internet resources; and (c) later, catalogers used CORC to finish the records in MARC format and exported the metadata to the local catalog database and to the Library Gateway. Using systems analysis theory to model and discuss the processes of describing Internet resources, Calhoun finds that "distributed description is both feasible and beneficial" (138). She states that being able to flip back and forth between DC and MARC views of a record was the keystone of their workflow.
The fourth and final grouping of articles deals with the use of CORC and DC for special categories of material, namely, serials, digital art, and digital images of maps. Wayne Jones concludes that the DC Element Set can work well for describing serials, except for two areas: dates and volume/date designation. He recommends defining the DC Coverage element for the volume/date designation and prescribing a four-digit date in all cases for a single year, with unknown digits indicated by a "u" or some other method. Also necessary would be the use of a qualifier with the DC Date element, such as Date.Issued.
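The four-character year convention mentioned above can be sketched in a few lines. This is only an illustration of the idea as summarized here, not a rule taken from the DC specification or from the article; the function name pad_year and the choice to raise an error on bad input are assumptions.

```python
def pad_year(known_digits, width=4, filler="u"):
    """Pad the known leading digits of a year to a fixed width.

    Unknown trailing digits are written as the filler character.
    """
    if not (known_digits.isascii() and known_digits.isdigit()):
        raise ValueError("expected one or more leading digits of a year")
    if len(known_digits) > width:
        raise ValueError("more digits than the fixed width allows")
    return known_digits.ljust(width, filler)


# A serial known only to have begun in the 1990s, recorded with a
# qualified Date element as the article suggests.
issued = ("Date.Issued", pad_year("199"))
```

Here issued holds the pair ("Date.Issued", "199u"), and a fully known year such as "2001" passes through unchanged.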
Ann Hanlon and Ann Copeland recount the results of an investigation into using DC and CORC to describe collections of digital art for the @art gallery, an online gallery of digital art created exclusively for the Web environment. The works in question were created originally in digital form and are not reproductions of tangible works. They found that DC could work relatively well if, and only if, multiple local qualifiers were added. For example, they needed to be able to identify the specific role of multiple creators in a digital art project and to connect specific names with specific titles within a collection-level record. After experimenting with using DC in CORC, they chose not to use CORC for their resource description, largely because of the limited use of qualifiers as well as the DC-to-MARC and MARC-to-DC mapping problems, which, among other things, were not able to maintain the connection between multiple artists and titles.
Finally, David Yehling Allen shares his experiences with using the DC in CORC to catalog digital images of maps. Although he feels that creating bibliographic records for digital maps in the CORC database seems to be the best available solution to the problem of bibliographic control of the thousands of digitized maps available on the Web, he recounts the many problems that he encountered in doing this. Among many other interesting comments, he states:
In theory, it should be much easier for novice catalogers to work with the DC rather than MARC, but in practice I encountered many difficulties. Most of the problems arise from the newness of both the DC and CORC, and from the lack of established standards and documentation for working with them. . . . Ironically, one source of confusion has been the very simplicity of DC cataloging (165).
These comments will resonate with many who take DC seriously and have attempted to use it. Allen stresses the importance of developing best-practice guides for the application of DC to describing cartographic materials as the best solution to these problems.
Looking at the volume as a whole, the majority of authors are quite positive about the CORC project, whether it be CORC's theoretical possibilities, OCLC's ongoing research and development efforts, or what CORC allowed the authors to accomplish practically within their institutions and for their end-users. For example, the ability to use DC and map it into MARC allowed reference librarians and other non-catalogers to create simple records without MARC tagging, and allowed catalogers to then upgrade those records to full AACR-MARC standards when chosen for inclusion in their library OPAC. The ever-developing technical capabilities of the CORC system, now evolved into the Connexion interface, such as dynamic, hyperlinked authority control and setting local "holdings," are also regarded as major positives of the system.
But several of the authors noted problems centering on the quantity and quality of the records included in CORC, the application of standards for resource description, and the results of mapping between DC and MARC. For example, Edmunds and Brisson remark on the fact that the majority of records in CORC were those seeded from the experimental InterCat and NetFirst projects, the latter especially containing low-quality, sub-standard records, and many describing resources of popular interest or trivial Web sites of low value to most libraries. The authors comment that the problems with the quantity and quality of records in the CORC database are (at the time of their writing) the result of a lack of rigid selection and cataloging standards. They raise additional significant issues entailed in the co-existence of CORC records created according to both DC and MARC cataloging standards. Calhoun in her article also writes: "While we agreed that even a skeletal DC record enables better indexing and retrieval, we found the present heterogeneous mixture of DC practices encoded in CORC records less than optimal" (141). And Hanlon and Copeland recount in their article on using DC for digital art that "We were disappointed . . . with the restrictions placed on us by the implementation of DC in CORC and, to some extent, with the way the DC/MARC crosswalk maps data between the two views" (155). Given that these comments were written before CORC records were merged into WorldCat, it would be very interesting to know what these authors think today about the implications of this merger for record-sharing in WorldCat, predicated as it has been on careful adherence to commonly held national standards, upon which OCLC members have relied as the mainstay of cooperative cataloging.
The above problems notwithstanding, this book reminds us of what marvelous work OCLC has done in encouraging and enabling the description of Internet resources by catalogers and other librarians through the initial CORC project. The book demonstrates how OCLC and librarians have greatly benefited each other by working collaboratively on the CORC project and how the project also enabled librarians to work collaboratively with each other in innovative ways, across internal institutional divisions, bringing noncatalogers into the resource description process. This book reflects the fact that the development of the current Web-based cataloging interface, with hyperlinked authority control and many other technical innovations, has been a phenomenal advance for the library and cataloging communities, all rooted in the once experimental CORC project. These authors are among the pioneers who made possible these features, now embodied in CORC's successor, and their words not only give us a window on the recent past but also remain on the whole remarkably relevant for the cooperative description of electronic resources today. --Steven Jack Miller, University of Wisconsin/Milwaukee Libraries.