Archiving of Electronic Business Reference Sources
1998 Publishers Open Forum
January 12, 1998
Summary of Publishers' Presentations
Representatives of Disclosure, Dun & Bradstreet, and UMI made presentations in response to the BRASS Business Reference Sources Committee's Discussion Paper.
DISCLOSURE
Presented by: John Viglotti (Vice President - Product Management)
Disclosure was founded 30 years ago and was the first company to release SEC products on CD-ROM. It holds complete archives of all SEC filings for the last 30 years. The approaches and formats Disclosure has used to archive documents have been driven by the needs of the SEC, and for many years the only acceptable storage media were paper and microfiche. There are approximately 500 SEC form types, but not all are required to be filed electronically. Since 1996, Disclosure has provided the SEC with limited microfiche archives while migrating toward CD-ROM and online storage. The archival format of a given document therefore depends on its age. Disclosure records are now stored as TIFF images and in ASCII format. The process is to migrate from microfiche to TIFF, and from CD-ROM to online. The formats currently available are:
Prior to 1988: Microfiche and paper were the only media used. At a later date, this historical microfiche will be converted to TIFF format, making it readable and available via the Web.
1988-present: Document images are on CD-ROMs distributed in various packages such as LaserD and Disclosure Select. Plans are to continue to do so into the foreseeable future. There have been no technology problems in reading the document CD-ROMs from inception to date. However, pre-June 1990 Compact Disclosure discs cannot be read on current CD-ROM drives.
1994-present: In addition to the CD-ROM format, SEC filing forms are available via Global Access, a Web-based product. By August 1998 all 500 SEC form types will be accessible via Global Access.
Amid changing technology and its attendant challenges for data migration, Disclosure remains committed to maintaining archival records of its holdings.
DUN & BRADSTREET
Presented by: Doug Doremus (Director - Reference Services Marketing)
In the past several years people have expressed interest in how archival information will be made available from D & B, prompting Mr. Doremus to examine the issue closely. The questions he asked himself were how D & B will deliver this information, and whether there will be a marketplace for it. At present, it appears that only academic institutions are interested in historical information, and if the marketplace is not sufficient, the information cannot be produced. Consistency of data collection is another important issue: if the methods have varied over the years, will this be a point of contention for academics? Mr. Doremus is currently researching these questions and is interested in our input. Electronic archival products and delivery options will be determined by the potential audiences and markets. To this end, focus groups will be convened to reveal just what is wanted and how much users would be willing to pay for this information.
As it stands now, archival data is available as follows:
Industry Norms: The current product contains three years of data, and D & B has ten years of this data archived electronically. From this point on, all data will be electronically archived.
Marketing Information: Fifteen years of this type of data are archived and readily available at D & B. They have every credit report they've ever published in a vault, but getting at this data is admittedly very difficult. Again, if there is an adequate market, a product or system could be developed with this information, either on CD-ROM or the Web.
Directories: All the historical directories are housed in Bethlehem, PA, and there is currently a pay service providing photocopies of desired information. To render accessing historical information easier in the future, D & B will allow institutions with standing orders to keep the directories and archive them locally. In the past, subscribers were obligated to return old directories to D & B. In their own archives, D & B originally archived the directories on floppy disks, many of which became corrupted. To remedy this, directories are now archived on CD-ROM.
UMI
Presented by: Dan Arbour (Vice President of Marketing - Library Division)
UMI feels that there is a big marketplace for archival information and that microform as an archival medium is not obsolete. In fact, 1997 was their best year ever for microform sales. Sales figures aside, they are also using electronic media and delivery options. They are facing migration issues in selectively digitizing their microform "vault." Another big question that they are facing is how to accurately archive Internet resources. Do you archive all the changes, or just a set "edition"?
UMI is building a digital archive. To determine what features users want in it, UMI held two focus groups, one for academic libraries and one for public libraries. The groups expressed the following preferences:
- Web-based delivery system
- Full-text and full image viewing
- Full-text searching
- Full-issue browsing
- Comprehensive indexing
- Local printing
- Multiple source linking, e.g., in topic searching.
UMI admits that there are migration issues to work out, and they will probably not be able to deliver all the preferences.
UMI has been involved in OCR and ASCII conversion of early text. In conversion of pre-1930 text some inaccuracy has resulted from both of these methods.
UMI is an aggregator. They have massive amounts of information from many sources going back many years, and face differing licensing problems for each. The following are some of their major products:
ABI Inform: Available back to 1971 on CD-ROM and online.
Dissertations: They have all dissertations with citations as far back as 1861. Of the 1.4 million titles they have, over a million are available in full text, which can be ordered from UMI in print or on microform. Starting in 1997, UMI will scan new dissertations from cover to cover and make them available on the Web for a fee.
Wall Street Journal and the New York Times: UMI has the full text of the WSJ and NYT back to their inceptions on microfilm. In addition, they have been scanning images since 1988. The WSJ will be loaded back to 1984 on ProQuest online.
Lastly, the advent of electronic formats has impacted the areas of copyright and licensing. Issues that have arisen include:
- resolving the question of ownership of data versus access to it;
- difficulty of obtaining perpetual access guarantees from publishers;
- changing of a source's publisher may produce a change in policies;
- increasing financial rewards for authors;
- influence of consortia on licensing data;
- individual publisher concerns for an archival model that is compatible with their own.
Mary Jean Pavelsek
International Business and Economics Librarian
Elmer Holmes Bobst Library
New York University
Publishers Open Forum, ALA Midwinter Conference, January 12, 1998
Disclaimer: This publication has been placed on the web for the convenience of BRASS members. Information and links will not be updated. Posted 13 February 1998.