DIRECT: A Decentralized Image Retrieval System for the National STEM Digital Library

Jinshan Tang, Sridhar R. Avula, and Scott T. Acton


This paper describes the Decentralized Image Retrieval for Education (DIRECT) service for the National Science, Technology, Engineering, and Mathematics (STEM) Digital Library (NSDL). DIRECT augments NSDL by providing content-based image-retrieval (CBIR) functionality. CBIR allows the user to designate a query image so that the service can search the library for images of similar content. DIRECT matches images not by text metadata but by the color or texture of the image objects; the matching process does not depend on a match between the cataloger description and the user description. DIRECT is a peer-to-peer service built for decentralized digital libraries. The content-based image-retrieval service is available to all collections in NSDL without imposing new standards or protocols. With DIRECT, NSDL can support images that have yet to be cataloged or have incomplete metadata without adding overhead (expenditure of additional resources) to the collections.

In order to support education in science, technology, engineering, and mathematics (STEM), the National Science Foundation (NSF) has established the National Science, Technology, Engineering, and Mathematics Education Digital Library (NSDL) program. The aim of the program is to create a digital library that ultimately meets the “needs of students and teachers at all levels—kindergarten to undergraduate, graduate, and lifelong learning—in both individual and collaborative settings, as well as formal and informal modes.” 1 The resultant program is an integrated and distributed information environment for accessing quality-assured digital resources from many sources. NSDL tasks include the following: (1) supply educators with easy access to high-quality, organized educational resources and the tools, interfaces, and other services to help them effectively use these resources; (2) support students with straightforward access to scientific data that encourages learning by direct experience; and (3) support a new integrated community by addressing the full spectrum of science, mathematics, engineering, and technology education. 2

One challenging research problem that needs to be addressed in the NSDL research community is mining images in the digital library. The most common method of searching for images involves matching a text query to text metadata that have been entered manually. Text-based searches often prove ineffectual for the following reasons: (1) the image does not have associated metadata; (2) the metadata are incomplete; (3) the metadata include technical terms unknown to the user; and (4) the subjectivity of the metadata leads to a mismatch in terms used by the cataloger and user. To assist users in finding desired images from the expected tens of millions of images in NSDL, innovative services must be pursued to increase the usefulness of image retrieval for both teachers and students. Content-based image retrieval (CBIR) techniques can be designed to meet this aim.

The advantage of CBIR over exhaustive manual text annotation is obvious. As thousands of images are added to digital libraries each day (in such application areas as earth science and telemedicine), the ability to effectively catalog and manually describe each image is precluded. Since images constitute a vital vehicle for learning (from preschool to higher education), the effective maintenance of rapidly expanding image libraries is a critical goal. CBIR allows users to find images that have not been cataloged or have been partially cataloged. Of course, CBIR will not replace text-based searches entirely where text annotation exists, but CBIR will augment the searching capability with image-related information. Adding CBIR to digital libraries provides users with a new mechanism for accessing data that does not require knowledge of technical terms and jargon. Metadata often use technical terms to describe the content of the image; a text-based search would require the user to imitate the metadata vocabulary. CBIR also provides a mechanism for accessing images without requiring strong English skills and is particularly usable by young children and nonnative English speakers.

CBIR techniques have obtained considerable attention during the past decade. Examples include MIT’s Photobook system, Columbia’s VisualSEEK system, UCSD’s Virage, and IBM’s QBIC system. 3 In each case, the search and retrieval of images is based on a central database. These systems place the responsibility of extracting and indexing features on the collections and digital librarians. Centralized image-retrieval systems cannot meet the needs of NSDL as it is a heterogeneous collection of distributed resources.

In this paper, the authors describe the development of a decentralized image-retrieval system for education (DIRECT). With DIRECT, CBIR is a service that runs autonomously; the collections do not need to be modified to accommodate DIRECT. DIRECT applies a consistent CBIR mechanism to the entire system and gives CBIR functionality to collections without effort by the collection providers. Collections that do not have their own CBIR system or are not able to invest in a CBIR system (such as small collections or collections that do not contain a large proportion of images) gain greater distribution of their holdings via DIRECT’s CBIR functionality. The DIRECT service finds collections in the library and requests the images using the same method that the portal uses to search and request images. DIRECT then performs feature extraction on the images and stores the features in its own database. A portal or client application can query DIRECT, and it uses the features of the desired image to search the library for images with the same features. DIRECT automatically monitors collections in order to keep the DIRECT index up to date.

Another advantage of DIRECT is that DIRECT supports not only global feature-based image retrieval, but local feature image retrieval. Local feature image retrieval is a segment-based image retrieval approach. In this approach, the critical process is image segmentation—the task of locating the constituent homogenous regions of the image. In past CBIR systems, segmentation has been performed manually or by computer-assisted methods. We are describing a fully automated method based on k-means classification. In our approach, the given image is segmented into dominant components and the properties of such components as color and texture are used to retrieve similar images containing similar segments.

Background and Related Work

NSDL

Launched in December 2002, NSDL is a comprehensive source and public network that provides educational materials and a lifelong learning environment for science, technology, engineering, and mathematics education at all levels. NSDL is attaining immediate utility at the precollege, undergraduate, and graduate levels, as well as providing a mechanism for continuing education.

The NSDL vision is to create a “digital learning place.” 4 Realizing this vision entails an environment that is more than just a repository of digital resources, but rather an environment that “encourages and supports users in their efforts to create, discover, explore, and interact with digital resources.” 5 NSDL is “a center of innovation in digital libraries as applied to education, and a community center for groups focused on digital library-enabled science education.” 6

This paper focuses on the DIRECT system for CBIR. Currently, DIRECT has been implemented as a stand-alone prototype using a subset of the available NSDL collections. DIRECT is expected to be fully integrated into NSDL by late 2004.

Peer-to-Peer Network Systems

A peer-to-peer network system provides a new way of utilizing distributed resources. In a peer-to-peer system, all participants in the system, such as image collections, education collections, searching services, and image retrieval services, are regarded as peers. Control of the system is distributed throughout the peers. Participants are peers in a peer-to-peer system. One of the benefits of such a peer-to-peer system is that it utilizes the resources of all the peers involved. Unlike an Internet application that must invest more resources to support more and more users, a peer-to-peer application gains resources and capabilities with each participant, making the system scalable and inexpensive for thousands to hundreds of thousands of participants.

Although peer-to-peer systems were used in the earliest incarnations of the Internet, these systems have been overshadowed by the centralized Internet applications of the Web. Lately, personal computers have reached a level of ubiquity that promises significant potential for peer-to-peer systems.

Several existing systems use peer-to-peer services, such as Napster, Gnutella, and Freenet. 7 Napster transfers files in a peer-to-peer manner while maintaining a centralized index. When using the system, a client will query the central system and receive a list of peers that have the desired files. The client then will contact the peer to download the file. When peers join, they notify the central system, and the central system updates the index. 8 This type of system is very fast and efficacious and avoids the costs of a central library but incurs the cost of a centralized index and imposes updating the index on the peers.

Gnutella and Freenet are completely decentralized. 9 A client can literally query each peer in the system when searching for a file. This query is accomplished by connecting to several neighboring peers and sending the query to each. These peers will, in turn, relay the query to each of their neighboring peers, and the query will extend in a like manner until a timeout is encountered. If the file is found, the response is passed back through the network of neighbors, and the file is transferred from peer to peer. These services are completely decentralized and distribute the cost of the system among the peers, but the search mechanism is extremely slow and usually will not search the entire system before the user-invoked time limit has been reached.

Gnutella decreases the response time and increases scalability. This system introduces peers that query their neighbors and automatically create indexes. This indexing peer is a reflector node and contains the index for ten or twenty regular nodes. These reflector nodes, when used as the backbone of the Gnutella network, can reduce the number of nodes that need to be searched and improve performance. 10 However, the Gnutella performance is inefficient in comparison to centralized systems. 11

DIRECT

Image Retrieval Service in NSDL

As pointed out in the beginning of this article, centralized image-retrieval systems, which place the responsibility of extracting and indexing features on the collections and digital librarians, cannot meet the needs of NSDL as it is a heterogeneous collection of distributed resources. The query of distributed, heterogeneous collections of images requires a CBIR service. DIRECT is such a system being developed that satisfies this requirement.

In DIRECT, CBIR will be a service that will run autonomously; the collections will not need to be modified to accommodate DIRECT. DIRECT will apply a consistent CBIR mechanism to the entire system and gives CBIR functionality to collections without effort by the collection providers. Collections that do not have their own CBIR system or are not able to invest in a CBIR system (such as small collections or collections that do not contain a large proportion of images) will gain greater distribution of their holdings via DIRECT’s CBIR functionality.

The DIRECT service will find collections in the library and request the images using the same method that the portal uses to search and request images. DIRECT will perform feature extraction on the images and store the features in its own database. A portal or client application can query DIRECT, and it will use the features of the desired image to search its database for images with the same features. DIRECT will automatically monitor collections in order to keep its database up-to-date. The structure of DIRECT service is shown in figure 1.

Image Features Used in DIRECT

DIRECT supports both global feature-based image retrieval and local feature-based image retrieval. For global feature-based image retrieval, global color histograms and texture measurements are used to conduct an image query within the digital libraries. For local feature-based image retrieval, there is a segment-specific approach to feature extraction and content description for digital images. 12 The image query and retrieval engine is based on object features extracted by segmentation, and image retrieval occurs by matching the constituent image segments.

In global feature-based image retrieval, global color histograms and global texture measures are used to perform image retrieval. For each image in the database, DIRECT records a quantized (sixteen-element) histogram for each color band. The histogram is a function that records the frequency of occurrence of each color. Although perceptually motivated color spaces exist, such as l*a*b*, DIRECT utilizes the RGB color space to achieve enhanced processing versatility. If DIRECT required an l*a*b* representation for color, the majority of images in the distributed database could not meet the specification, due to lack of information about the true color content. The color histograms used in this CBIR system are quantized using a preset table that exploits the color discrimination of the human visual system. 13 In order to expedite searching, DIRECT constructs a histogram feature library in the centralized database. As each band of each image is represented by only a sixteen-element histogram, for each image, only forty-eight integers are needed to be stored in the centralized library. After construction of the histogram feature database, a distance measure is used to compare the similarity between the histogram of the query image and the image in the database. In DIRECT, histogram intersection is used due to its simplicity and robustness. 14 Figures 2 (a) and (a*) show retrieved images when only color is employed.

In terms of texture extraction, DIRECT captures the granularity (periodicity) and the directionality of two-dimensional patterns in the digital library images so that users can retrieve images containing similar patterns (textures). The texture retrieval method in DIRECT is based on the wavelet. 15 Figures 2 (b) and (b*) show retrieved images using only global texture features in DIRECT.

Besides allowing only one aspect of the image (color or texture) to be utilized in image retrieval, DIRECT also provides image retrieval combining texture and color. In order to combine two features to perform image retrieval, a matching functional is required. The matching functional determines the relevance of match between the query image specified by the student or teacher and an image in the distributed digital library. The first step in constructing the matching functional is normalization of the color and texture features for each image. Given a query image Q and a database image D, the matching functional is computed using the following:

D (Q, D) = w 1D H(Q, D) + w 2D T(Q, D)

D H(Q, D) represents the dissimilarity in color histograms. D T(Q, D) represents the normalized dissimilarity of texture.

The parameters w 1, w 2 determine the relative weighting for the image features. In the initial DIRECT system, we allow only discrete weights of 0, 1, 2, and 3. A weight of zero would exclude a particular feature (for example, texture) from the query. A weighting using 1, 2, and 3 would assign a prioritization to features in the search process. When all weights are equal (=1), the characteristics of color and texture are considered equally in the image search. Figures 2 (c) and (c*) are the retrieved images when color and texture are combined.

For local feature-based image retrieval, the image is segmented into different regions and the image retrieval is based on the segments rather than global features. Segmentation is the process of partitioning the image into the constituent homogeneous regions, and a segment is one of the homogeneous regions. In DIRECT, a k-means image segmentation is used. 16 Figure 3 shows an example of DIRECT’s image segmentation algorithm. After the segmentation, DIRECT retains the five largest segments (called the dominant segments) and uses them to perform image retrieval based on the individual color and texture features of the segments.

Local feature image retrieval using color and texture features is similar to image retrieval based on global color histograms and global texture measures, except that features are extracted from segments, not globally from the images. Color and texture databases are generated using the local color and texture features. The color data are extracted as follows: (1) for each image, DIRECT obtains five dominant segments; (2) for a given dominant segment, the color content is extracted in the form of quantized RGB histograms, as with global image feature extraction; (3) the construction of the local texture feature database is also produced in a manner similar to the construction of the global texture feature database: to obtain the texture feature, a square is inscribed in the segment, and an image block is extracted; and (4) the texture feature for the image block obtained by a wavelet method is used as the texture feature of the segment. 17

When performing retrieval, the query image first is segmented and the dominant components found are highlighted so that the user can select one component. Once a dominant component is chosen, it is treated as an image and the retrieval mechanism based on global features can be used to perform the retrieval. The major difference is that the feature match is evaluated between two segments, not two images.

Figure 4 shows an example of image retrieval based on segments. Here, in the query image, the component representing the body of water was used to retrieve the images from the database. Figure 5 shows the retrieval results using the global features. In both cases, the same weights (=1) were used for texture and color.

Implementation of Decentralized Peer-to-Peer File Retrieval in DIRECT

A major component of the DIRECT service is supporting decentralized peer-to-peer file retrieval. DIRECT will work as a service in the peer-to-peer environment of the NSDL collections to retrieve images from image collections. All participants in the system such as image collections and the DIRECT searching service will be regarded as peers, and control of the system will be distributed throughout these peers.

DIRECT creates a network of services by creating a brokered peer-to-peer system. A broker is an entity that connects service providers with service users or clients in a way that prevents either party from needing to know about the other a priori. 18 The service provider informs the broker that it provides a certain type of service. The client queries the broker for a certain type of service and, if there is a match, the broker connects the client with the service provider. In the DIRECT approach, the portals are clients; the collections are service providers, and such entities as the index and the content-based image-retrieval portion of DIRECT are both clients and service providers.

Currently, DIRECT adopts Java’s Jini to implement a broker in a peer-to-peer system. 19 Jini provides an application known as a lookup server that acts as a broker for the system. The lookup server comes with a predefined application program interface (API) for finding and accessing it. When a service provider registers a service, it gives the Jini lookup server a set of attributes describing the service and a service object, which is usually a stub to a remote object. A stub is a special object that represents an object located in another process or on another machine. The stub has the same methods as the remote object it represents and when the program calls a method on the stub, the stub communicates across the wire to the remote object and calls that method on the remote object. A stub is a mechanism for allowing an object to appear local, when the object actually resides remotely. 20

When the client contacts the Jini lookup server, the client can search for the right service by the attributes given when the services were registered. When the client finds the appropriate service, the stub is sent, and then the client can call methods on the remote object through the stub.

The power of Jini resides in the “write once, run anywhere” idea of Java. 21 The stub object is written in Java, and the code for it can literally be sent across the wire to a remote machine and executed there, even if the machine has never seen the code previously. Therefore, a portal does not need to know how a collection implements a search protocol to query it. For instance, University of California–Berkeley’s Simple Digital Library Interface Protocol defines a Java API but also defines several ways the stub can communicate with the remote object (for example, Hypertext Transfer Protocol [HTTP] or Common Object Request Broker Architecture [CORBA]). 22 This ability to allow the service provider to choose the form of communication is an important aspect of such a decentralized service as DIRECT.

The Jini lookup servers enable a decentralized system, but the broker itself is not only centralized but a single point of failure. To solve this dilemma, DIRECT executes multiple lookup servers in different geographic areas, and services are registered with each of the lookup servers. DIRECT allows a lookup server to fail and not debilitate the entire library, and it also prevents the broker mechanism from being centralized with one system.

Another feature of Jini is the event system. The Jini lookup server will broadcast an event whenever a new service is added, or if an existing service is removed or modified. Clients can register to receive these notifications. The indexes use this mechanism to keep up-to-date. They receive notification whenever a collection joins, leaves, or announces its metadata has changed. The index then queries the collection to update its tables. The event mechanism allows a client to react to a service provider without having the service provider know about the client.

Jini provides the functionality to implement DIRECT as a decentralized peer-to-peer service. Jini supplies a broker so that DIRECT can find collections dynamically and allows mobile software that can communicate with each collection using the collection’s desired communication mechanism. Last, Jini provides an event mechanism so that DIRECT can be notified when it needs to update its index.

Conclusion

In this paper the authors have described DIRECT, a decentralized image-retrieval system for the NSDL. DIRECT uses distributed database technology to exchange digital images in a peer-to-peer environment. With a flexible, sustainable, decentralized approach, the existing NSDL collections and services do not have to be significantly modified to utilize DIRECT. Most importantly, DIRECT gives the ability to search NSDL images by content. Such a content-based image-retrieval tool will allow access to the constituent collections of the library by such underrepresented groups as nonnative English speakers. The service also will allow search and retrieval for images that have not been annotated manually. As the number of digital images in the NSDL rapidly increases, automatic segmentation and description of the digital image content, as proposed with DIRECT, becomes a critical need.

References

   1. L. Zia, National Science, Technology, Engineering, and Mathematics Education Digital Library (NSDL): Program Solicitation, NSF-02-054, 2002. Accessed Dec. 1, 2002, http://www.nsf.gov/pubs/2002/nsf02054/nsf02054.html#TOC.

   2. C. Manduca, F. McMatin, and D. Mogk, Pathways to Progress: Vision and Plans for Developing the NSDL, 2001. Accessed Jan., 21, 2004, http://doclib.comm.nsdlib.org/PathwaysToProgress.pdf.

   3. A. Pentland, R. W. Picard, and S. Sclaroff, “Photobook: Content-Based Manipulation of Image Databases,” International Journal of Computer Vision 18, no. 3 (1996): 233–54; J. R. Smith and S. F. Chang, “Querying by Color Regions Using the VisualSEEK Content-based Visual Query System,” Intelligent Multimedia Information Retrieval, ed. Mark T. Maybury (1996); J. R. Bach et al., “Virage Image Search Engine: An Open Framework for Image Management,” Symposium on Electronic Imaging: Science and TechnologyStorage and Retrieval for Image and Video Databases IV, 2670 (1996), 76–87; M. Flickner et al.,“Query by Image and Video Content: The QBIC system,” IEEE Computer 28, no. 9 (1995), 23–33.

   4. Manduca, McMatin, and Mogk, Pathways to Progress.

   5. Ibid.

   6. National Science Foundation, “NSDL: The National Science Digital Library” (initial version). Accessed Dec. 5, 2002, http://nsdl.org/render.userLayoutRootNode.uP.

   7. Napster, How Napster Works. Accessed Dec. 10, 2002, http://wiki.cs.uiuc.edu/cs427/Napster:+How+it+works; DSS Group, Gnutella: To the Bandwidth Barrier and Beyond, 2000. Accessed June 10, 2002, http://dss.clip2.com; Freenet, The Free Network Project, 2001. Accessed May 7, 2002, http://freenetproject.org.

   8. Napster, How Napster Works.

   9. DSS Group, Gnutella: To the Bandwidth Barrier and Beyond; Freenet, The Free Network Project.

   10. DSS Group, Gnutella: To the Bandwidth Barrier and Beyond.

   11. M. Ripeanu, “Peer-to-Peer Architecture Case Study: Gnutella Network,” Proceedings of the First (2001) International Conference on Peer-to-Peer Computing, eds. Ross Lee Graham and Nahid Shahmehri, Linkoping University, Sweden, Aug. 27–28, 2001 (Piscataway, N.J. : IEEE, 2002), 99–100.

   12. Sridhar Avula, Jinshan Tang, and Scott Acton, “Image Retrieval Using Segmentation,” Proceedings of the 2003 IEEE Systems and Information Engineering Design Symposium, eds. Matthew H, Jones, Barbara E. Tawney, K. Preston White Jr., University of Virginia, Charlottesville, Apr. 24–25, 2003 (Charlottesvile, Va.: Department of Systems and Information Engineering, Univ. of Virginia, 2003), 289–94.

   13. Y. Ma and B. S. Manjunath, “NETRA: A Toolbox for Navigating Large Image Databases,” Proceedings of the International Conference on Image Processing, Santa Barbara, California, Oct. 26–29, 1997, vol. 1 (New York: Institute of Electrical and Electronics Engineers, 1997), 26–29.

   14. M. Swain and D. Ballard, “Colour Indexing,” International Journal of Computer Vision 7, no. 1 (1991): 11–32.

   15. M. Do and M. Vetterli, “Wavelet-Based Texture Retrieval Using Generalized Gaussian Density and Kullback-Leibler Distance,” IEEE Transactions on Image Processing 11, no. 2 (2002): 146–58.

   16. J. MacQueen, “Some Methods for Classification and Analysis of Multivariate Observations,” Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1 (Berkeley: Univ. of California Pr., 1967): 281–97.

   17. Do and Vetterli, “Wavelet-Based Texture Retrieval,” 146; Avula, Tang, and Acton, “Image Retrieval Using Segmentation,” 289.

   18. B. Knighten, Peer-to-Peer Computing, 2002. Accessed Dec. 10, 2002, http://cnscenter.future.co.kr/resource/hot-topic/p2p/PtP_IDF_Rev1.11-web.pdf.

   19. S. Oaks and H. Wong, Jini in a Nutshell: A Desktop Quick Reference (Sebastopol, Calif.: O’Reilly, 2000).

   20. Ibid., 102.

   21. Ibid., 111.

   22. SDLIP: The Simple Digital Library Interoperability Protocol, 1999. Accessed Dec. 1, 2002, http://www-diglib.stanford.edu/~testbed/doc2/SDLIP/SDLIPProjDesc/sdlipProject.htm.


Jinshan Tang (jt6w@virginia.edu) is a Research Scientist, Sridhar R. Avula (sar4v@virginia.edu) is a Research Assistant, and Scott T. Acton (acton@virginia.edu) is the Walter N. Munster Associate Professor of Electrical and Computer Engineering, Department of Electrical and Computer Engineering, University of Virginia, Charlottesville.