JCDL'03: Proceedings of the 3rd ACM/IEEE-CS Joint Conference on Digital Libraries

Fullname: ACM/IEEE Joint Conference on Digital Libraries
Editors: Catherine C. Marshall
Location: Houston, Texas, USA
Dates: 2003-May-27 to 2003-May-31
Publisher: ACM
Standard No: ISBN 0-7695-1939-3, IEEE Catalog No.: PR01939; ACM DL: Table of Contents hcibib: DL03
Papers: 88
Pages: 393
  1. Music and digital libraries: from users to algorithms
  2. Automatic metadata creation
  3. Managing resources and services
  4. Information retrieval and data mining
  5. Knowledge and representation
  6. User interaction
  7. OAI in action
  8. Multimedia issues in digital libraries
  9. Designing and accessing scientific digital libraries
  10. Digital libraries in the classroom
  11. Standards, mark-up, and metadata
  12. Tools for building digital libraries
  13. Correction and analysis
  14. Demonstrations
  15. Posters
  16. Workshops

Music and digital libraries: from users to algorithms

An ethnographic study of music information seeking: implications for the design of a music digital library 5-16
  Sally Jo Cunningham; Nina Reeves; Matthew Britland
At present, music digital library systems are being developed based on anecdotal evidence of user needs, intuitive feelings for user information seeking behavior, and a priori assumptions of typical usage scenarios. Emphasis has been placed on basic research into music document representation, efficient searching, and audio-based searching, rather than on exploring the music information needs or information behavior of a target user group. This paper focuses on eliciting the native music information strategies employed by people searching for popular music (that is, music sought for recreational or enjoyment purposes rather than to support a serious or scientific exploration of some aspect of music). To this end, we conducted an ethnographic study of the searching/browsing techniques employed by people in the researchers' local communities as they use two common sources of music: the public library and music stores. We argue that the insights provided by this type of study can inform the development of searching/browsing support for music digital libraries.
Content-based indexing of musical scores 18-26
  Richard A. Medina; Lloyd A. Smith; Deborah R. Wagner
This paper describes a method of automatically creating a content-based index of musical scores. The goal is to capture the themes, or motifs, that appear in the music. The method was tested by building an index of 25 orchestral movements from the classical music literature. For every movement, the system captured the primary theme, or a variation of the primary theme. In addition, it captured 13 of 28 secondary themes. The resulting index was 14% of the size of the database. A further reduction of 2% is possible; however, this discards secondary themes. A listening experiment using five orchestral movements showed that people can reliably recognize secondary themes after listening to a piece of music; therefore, it may be necessary to retain secondary themes in a score index.
Structural analysis of musical signals for indexing and thumbnailing 27-34
  Wei Chai; Barry Vercoe
A musical piece typically has a repetitive structure. Analysis of this structure will be useful for music segmentation, indexing and thumbnailing. This paper presents an algorithm that can automatically analyze the repetitive structure of musical signals. First, the algorithm detects the repetitions of each segment of fixed length in a piece using dynamic programming. Second, the algorithm summarizes this repetition information and infers the structure based on heuristic rules. The performance of the approach is demonstrated visually using figures for qualitative evaluation, and by two structural similarity measures for quantitative evaluation. Based on the structural analysis result, this paper also proposes a method for music thumbnailing. The preliminary results obtained using a corpus of Beatles' songs show that automatic structural analysis and thumbnailing of music are possible.
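The first step named in the abstract above, detecting repetitions of a fixed-length segment with dynamic programming, can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes the piece has already been reduced to a sequence of discrete symbols (the paper works on audio signals), and the function name `find_repetitions` and the `max_cost` threshold are hypothetical. The technique shown is standard approximate substring matching by edit distance.

```python
def find_repetitions(sequence, start, length, max_cost):
    """Return end positions (1-based) of substrings of `sequence` that match
    the segment sequence[start:start + length] within `max_cost` edits.
    The segment's own end position is included in the result."""
    pattern = sequence[start:start + length]
    m = len(pattern)
    # prev[i]: cheapest alignment of pattern[:i] with a substring ending at
    # the previous position. Before any input, matching pattern[:i] costs i.
    prev = list(range(m + 1))
    hits = []
    for j, symbol in enumerate(sequence, start=1):
        curr = [0] * (m + 1)  # curr[0] = 0: a match may begin anywhere
        for i in range(1, m + 1):
            substitute = prev[i - 1] + (pattern[i - 1] != symbol)
            skip_input = prev[i] + 1      # extra symbol in the piece
            skip_pattern = curr[i - 1] + 1  # symbol missing from the piece
            curr[i] = min(substitute, skip_input, skip_pattern)
        if curr[m] <= max_cost:
            hits.append(j)
        prev = curr
    return hits


if __name__ == "__main__":
    piece = list("abcdXabcdYabed")
    print(find_repetitions(piece, start=0, length=4, max_cost=0))
```

Running the same comparison for every fixed-length segment yields the raw repetition information that a second, rule-based stage would then summarize into a structure.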

Automatic metadata creation

Automatic document metadata extraction using support vector machines 37-48
  Hui Han; C. Lee Giles; Eren Manavoglu; Hongyuan Zha; Zhenyue Zhang; Edward A. Fox
Automatic metadata generation provides scalability and usability for digital libraries and their collections. Machine learning methods offer robust and adaptable automatic metadata extraction. We describe a Support Vector Machine classification-based method for metadata extraction from the header part of research papers and show that it outperforms other machine learning methods on the same task. The method first classifies each line of the header into one or more of 15 classes. An iterative convergence procedure is then used to improve the line classification by using the predicted class labels of its neighbor lines in the previous round. Further metadata extraction is done by seeking the best chunk boundaries of each line. We found that discovery and use of the structural patterns of the data and domain-based word clustering can improve the metadata extraction performance. An appropriate feature normalization also greatly improves the classification performance. Our metadata extraction method was originally designed to improve the metadata extraction quality of the digital libraries Citeseer [17] and EbizSearch [24]. We believe it can be generalized to other digital libraries.
Bibliographic attribute extraction from erroneous references based on a statistical model 49-60
  Atsuhiro Takasu
In this paper, we propose a method for extracting bibliographic attributes from reference strings captured using Optical Character Recognition (OCR) and an extended hidden Markov model. Bibliographic attribute extraction can be used in two ways. One is reference parsing in which attribute values are extracted from OCR-processed references for bibliographic matching. The other is reference alignment in which attribute values are aligned to the bibliographic record to enrich the vocabulary of the bibliographic database. In this paper, we first propose a statistical model for attribute extraction that represents both the syntactical structure of references and OCR error patterns. Then, we perform experiments using bibliographic references obtained from scanned images of papers in journals and transactions and show that useful attribute values are extracted from OCR-processed references. We also show that the proposed model has advantages in reducing the cost of preparing training data, a critical problem in rule-based systems.
Automated semantic annotation and retrieval based on sharable ontology and case-based learning techniques 61-72
  Von-Wun Soo; Chen-Yu Lee; Chung-Cheng Li; Shu Lei Chen; Ching-chih Chen
Effective information retrieval (IR) using domain knowledge and semantics is one of the major challenges in IR. In this paper we propose a framework that can facilitate image retrieval based on a sharable domain ontology and thesaurus. In particular, case-based learning (CBL) using a natural language phrase parser is proposed to convert a natural language query into resource description framework (RDF) format, a semantic-web standard of metadata description that supports machine readable semantic representation. The same parser is also extended to perform semantic annotation on the descriptive metadata of images and convert metadata automatically into the same RDF representation. The retrieval of images can then be conducted by matching the semantic and structural descriptions of the user query with those of the annotated descriptive metadata of images. We tested the framework in our problem domain by retrieving the historical and cultural images taken from Dr. Ching-chih Chen's "First Emperor of China" CD-ROM [25] as part of our productive international digital library collaboration. We have constructed and implemented the domain ontology, a Mandarin Chinese thesaurus, as well as the similarity match and retrieval algorithms in order to test our proposed framework. Our experiments have shown the feasibility and usability of these approaches.

Managing resources and services

Towards a cultural heritage digital library 75-86
  Gregory Crane; Clifford Wulfman
This paper surveys research areas relevant to cultural heritage digital libraries. The emerging National Science Digital Library promises to establish the foundation on which those of us beyond the scientific and engineering community will likely build. This paper thus articulates the particular issues that we have encountered in developing cultural heritage collections. We provide a broad overview of audiences, collections, and services.
The DSpace institutional digital repository system: current functionality 87-97
  Robert Tansley; Mick Bass; David Stuve; Margret Branschofsky; Daniel Chudnov; Greg McClellan; MacKenzie Smith
In this paper we describe DSpace, an open source system that acts as a repository for digital research and educational material produced by an organization or institution. DSpace was developed during two years' collaboration between the Hewlett-Packard Company and MIT Libraries. The development team worked closely with MIT Libraries staff and early adopter faculty members to produce a 'breadth-first' system, providing all of the basic features required by a digital repository service. As well as functioning as a live service, DSpace is intended as a base for extending repository functionality, particularly to address long-term preservation concerns. We describe the functionality of the current DSpace system, and briefly describe its technical architecture. We conclude with some remarks about the future development and operation of the DSpace system.
Metis: lightweight, flexible, and Web-based workflow services for digital libraries 98-109
  Kenneth M. Anderson; Aaron Andersen; Neet Wadhwani; Laura M. Bartolo
The Metis project is developing workflow technology designed for use in digital libraries by avoiding the assumptions made by traditional workflow systems. In particular, digital libraries have highly distributed sets of stake-holders who nevertheless must work together to perform shared activities. Hence, traditional assumptions that all members of a workflow belong to the same organization, work in the same fashion, or have access to similar computing platforms are invalid. The Metis approach makes use of event-based workflows to support the distributed nature of digital library workflow and employs techniques to make the resulting technology lightweight, flexible, and integrated with the Web. This paper describes the conceptual framework behind the Metis approach as well as a prototype which implements the framework. The prototype represents a "proof-of-concept" of the Metis framework and approach as we show how it can both model and execute a peer review workflow drawn from a "real-world" digital library. After describing related work, the paper concludes with a discussion of future research opportunities in the area of digital library workflow and outlines how Metis is being deployed to a small set of digital libraries for additional evaluation.

Information retrieval and data mining

Protein association discovery in biomedical literature 113-115
  Yueyu Fu; Javed Mostafa; Kazuhiro Seki
Protein association discovery can directly contribute toward developing protein pathways; hence it is a significant problem in bioinformatics. LUCAS (Library of User-Oriented Concepts for Access Services) was designed to automatically extract and determine associations among proteins from biomedical literature. Such a tool has notable potential to automate database construction in biomedicine, instead of relying on experts' analysis. This paper reports on the mechanisms for automatically generating clusters of proteins. A formal evaluation of the system, based on a subset of 2000 MEDLINE titles and abstracts, has been conducted against the Swiss-Prot database, in which the associations among concepts are entered manually by experts.
Genescene: biomedical text and data mining 116-118
  Gondy Leroy; Hsinchun Chen; Jesse D. Martinez; Shauna Eggers; Ryan R. Falsey; Kerri L. Kislin; Zan Huang; Jiexun Li; Jie Xu; Daniel M. McDonald; Gavin Ng
To access the content of digital texts efficiently, it is necessary to provide more sophisticated access than keyword based searching. Genescene provides biomedical researchers with research findings and background relations automatically extracted from text and experimental data. These provide a more detailed overview of the information available. The extracted relations were evaluated by qualified researchers and are precise. A qualitative ongoing evaluation of the current online interface indicates that this method to search the literature is more useful and efficient than keyword based searching.
Taxonomies for automated question triage in digital reference 119-121
  Jeffrey Pomerantz; R. David Lankes
This study identifies (1) several taxonomies of questions at different levels of linguistic analysis, according to which questions received by digital reference services are classified, and (2) a simple categorization of triage recipients. The utility of these taxonomies and categorizations of triage recipients is discussed as the basis for systems for automating triage and other steps in the digital reference process.
Topic detection and interest tracking in a dynamic online news source 122-124
  Andrew J. Kurtz; Javed Mostafa
Digital libraries in the news domain may contain frequently updated data. Providing personalized access to such dynamic resources is an important goal. In this paper, we investigate the area of filtering online dynamic news sources based on personal profiles. We experimented with an intelligent news-sifting system that tracks topic development in a dynamic online news source. Vocabulary discovery and clustering are used to expose current news topics. User interest profiles, generated from explicit and implicit feedback, are used to customize the news retrieval system's interface.
Methods for precise named entity matching in digital collections 125-127
  Peter T. Davis; David K. Elson; Judith L. Klavans
In this paper, we describe an interactive system, built within the context of the CLiMB project, which permits a user to locate the occurrences of named entities within a given text. The named entity tool was developed to identify references to a single art object (e.g. a particular building) with high precision in text related to images of that object in a digital collection. We start with an authoritative list of art objects, and seek to match variants of these named entities in related text. Our approach is to "decay" entities into progressively more general variants while retaining high precision. As variants become more general, and thus more ambiguous, we propose methods to disambiguate intermediate results. Our results will be used to select records into which automatically generated metadata will be loaded.
An application of multiple viewpoints to content-based image retrieval 128-130
  James C. French; A. C. Chapin; Worthy N. Martin
Content-based image retrieval uses features that can be extracted from the images themselves. Using more than one representation of the images in a collection can improve the results presented to a user without changing the underlying feature extraction or search technologies. We present an example of this "multiple viewpoint" approach, multiple image channels, and discuss its advantages for an image-seeking user. This approach has also been shown to dramatically improve retrieval effectiveness in content-based image retrieval systems [3].

Knowledge and representation

Convergence of knowledge management and E-learning: the GetSmart experience 135-146
  Byron Marshall; Yiwen Zhang; Hsinchun Chen; Ann Lally; Rao Shen; Edward Fox; Lillian N. Cassel
The National Science Digital Library (NSDL), launched in December 2002, is emerging as a center of innovation in digital libraries as applied to education. As a part of this extensive project, the GetSmart system was created to apply knowledge management techniques in a learning environment. The design of the system is based on an analysis of learning theory and the information search process. Its key notion is the integration of search tools and curriculum support with concept mapping. More than 100 students at the University of Arizona and Virginia Tech used the system in the fall of 2002. A database of more than one thousand student-prepared concept maps has been collected with more than forty thousand relationships expressed in semantic, graphical, node-link representations. Preliminary analysis of the collected data is revealing interesting knowledge representation patterns.
Acquisition, representation, query and analysis of spatial data: a demonstration 3D digital library 147-158
  Jeremy Rowe; Anshuman Razdan; Arleyn Simon
The increasing power of techniques to model complex geometry and extract meaning from 3D information creates complex data that must be described, stored, and displayed to be useful to researchers. Responding to the limitations of two-dimensional (2D) data representations perceived by discipline scientists, the Partnership for Research in Spatial Modeling (PRISM) project at Arizona State University (ASU) developed modeling and analytic tools that raise the level of abstraction and add semantic value to 3D data. The goals are to improve scientific communication, and to assist in generating new knowledge, particularly for natural objects whose asymmetry limits study using 2D representations. The tools simplify analysis of surface and volume using curvature and topology to help researchers understand and interact with 3D data. The tools automatically extract information about features and regions of interest to researchers, calculate quantifiable, replicable metric data, and generate metadata about the object being studied. To help researchers interact with the information, the project developed prototype interactive, sketch-based interfaces that permit researchers to remotely search, identify and interact with the detailed, highly accurate 3D models of the objects. The results support comparative analysis of contextual and spatial information, and extend research about asymmetric manmade and natural objects.
Leveraging a common representation for personalized search and summarization in a medical digital library 159-170
  Kathleen R. McKeown; Noemie Elhadad; Vasileios Hatzivassiloglou
Despite the large amount of online medical literature, it can be difficult for clinicians to find relevant information at the point of patient care. In this paper, we present techniques to personalize the results of search, making use of the online patient record as a sophisticated, pre-existing user model. Our work in PERSIVAL, a medical digital library, includes methods for re-ranking the results of search to prioritize those that better match the patient record. It also generates summaries of the re-ranked results which highlight information that is relevant to the patient under the physician's care. We focus on the use of a common representation for the articles returned by search and the patient record which facilitates both the re-ranking and the summarization tasks. This common approach to both tasks has a strong positive effect on the ability to personalize information.

User interaction

Visualizing and exploring Picasso's world 173-175
  Carlos Monroy; Richard Furuta; Enrique Mallen
We discuss the preliminary use of a visualization tool called Interactive Timeline Viewer (ItLv) in visualizing and exploring a collection of art works by Pablo Ruiz Picasso. Our data set is composed of a subset of the On-line Picasso Project, a significantly-sized on-line art repository of the renowned Spanish artist. We also include a brief discussion about how this visualization tool can help art scholars to study and analyze an artist's life and works.
Graded access to sensitive materials at the archive of the indigenous languages of Latin America 176-178
  Heidi Johnson
The Archive of the Indigenous Languages of Latin America (AILLA) is a web-accessible repository of multi-media resources in and about the indigenous languages of Latin America. In this paper, I describe the Graded Access System developed at AILLA to protect sensitive materials by allowing resource producers -- academics and indigenous people -- finely-grained control over the resources they house in the archive.
Learning digital library technology across borders 179-181
  Silvia Barcellos Southwick; Richard Southwick
This paper describes the background context and initial findings from an ongoing case study of an electronic theses and dissertations (ETD) digital library (DL) project in Brazil. The specific focus of the case study centers on the activities of a Brazilian government agency acting as a mediator between software developers -- primarily academic institutions in the United States -- and university clients in Brazil. The authors highlight the loosely integrated nature of the DL technology, and the uncertain relationship between developers and users in terms of support. These circumstances reinforce a view of technology transfer as a process of organizational learning. As a consequence, the mediating institution in the study is viewed as assuming multiple roles in advancing the project.
Personal spaces in the context of OAI 182-183
  Natalia Reyes-Farfan; J. Alfredo Sanchez
We describe MiBiblio 2.0, a highly personalizable user interface for a federation of digital libraries under the OAI Protocol for Metadata Harvesting (OAI-PMH). MiBiblio 2.0 allows users to personalize their space by choosing the resources and services they need, and to organize, classify and manage their workspaces, including resources from any of the federated libraries. Results can be kept in personal spaces and organized into categories using a drag-and-drop interface.
PoPS: mobile access to digital library resources 184-185
  Nohema Castellanos; J. Alfredo Sanchez
Mobile devices represent new opportunities for accessing digital libraries (DLs) but also pose a number of challenges given the diversity of their hardware and software features. We describe a framework aimed at facilitating the generation of interfaces for access to DL resources from a wide range of mobile devices.
How to turn the page 186-188
  Yi-Chun Chu; Ian H. Witten; Richard Lobb; David Bainbridge
Can digital libraries provide a reading experience that more closely resembles a real book than a scrolled or paginated electronic display? This paper describes a prototype page-turning system that realistically animates full three-dimensional page-turns. The dynamic behavior is generated by a mass-spring model defined on a rectangular grid of particles. The prototype takes a PDF or E-book file, renders it into a sequence of PNG images representing individual pages, and animates the page-turns under user control. The simulation behaves fairly naturally, although more computer graphics work is required to perfect it.
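A mass-spring model on a rectangular grid of particles, as mentioned in the abstract above, can be sketched in a few lines. This is a generic illustration, not the prototype's implementation: the abstract does not say which springs connect the particles or how the motion is integrated, so the structural-springs-only layout, the semi-implicit Euler step, and all names and default parameter values (`make_grid`, `step`, `stiffness`, `damping`, `pinned`) are assumptions.

```python
import math


def make_grid(rows, cols, spacing):
    """Create a flat sheet of particles with zero initial velocity."""
    pos = [[(c * spacing, r * spacing, 0.0) for c in range(cols)]
           for r in range(rows)]
    vel = [[(0.0, 0.0, 0.0) for _ in range(cols)] for _ in range(rows)]
    return pos, vel


def step(pos, vel, spacing, stiffness=50.0, damping=0.5, mass=1.0, dt=0.01,
         gravity=(0.0, 0.0, 0.0), pinned=frozenset()):
    """Advance the sheet by one time step and return new (pos, vel)."""
    rows, cols = len(pos), len(pos[0])
    # Start with gravity and velocity-proportional damping on each particle.
    force = [[[mass * g - damping * v for g, v in zip(gravity, vel[r][c])]
              for c in range(cols)] for r in range(rows)]
    # Add Hooke's-law forces for springs to the right and lower neighbours.
    for r in range(rows):
        for c in range(cols):
            for r2, c2 in ((r, c + 1), (r + 1, c)):
                if r2 >= rows or c2 >= cols:
                    continue
                delta = [b - a for a, b in zip(pos[r][c], pos[r2][c2])]
                dist = math.sqrt(sum(d * d for d in delta))
                if dist == 0.0:
                    continue
                pull = stiffness * (dist - spacing)
                for k in range(3):
                    f = pull * delta[k] / dist
                    force[r][c][k] += f
                    force[r2][c2][k] -= f
    new_pos = [row[:] for row in pos]
    new_vel = [row[:] for row in vel]
    for r in range(rows):
        for c in range(cols):
            if (r, c) in pinned:  # e.g. particles along the book's spine
                new_vel[r][c] = (0.0, 0.0, 0.0)
                continue
            v = tuple(vel[r][c][k] + dt * force[r][c][k] / mass
                      for k in range(3))
            new_vel[r][c] = v
            new_pos[r][c] = tuple(pos[r][c][k] + dt * v[k] for k in range(3))
    return new_pos, new_vel
```

Pinning one column of particles plays the role of the spine; a page turn would then be driven by applying an external force or moving a grabbed particle, which this sketch leaves out.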

OAI in action

Repository synchronization in the OAI framework 191-198
  Xiaoming Liu; Kurt Maly; Mohammad Zubair; Michael L. Nelson
The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) began as an alternative to distributed searching of scholarly eprint repositories. The model embraced by the OAI-PMH is that of metadata harvesting, where value-added services (by a "service provider") are constructed on cached copies of the metadata extracted from the repositories of the harvester's choosing. While this model dispenses with the well known problems of distributed searching, it introduces the problem of synchronization. Stated simply, this problem arises when the service provider's copy of the metadata does not match the metadata currently at the constituent repositories. We define some metrics for describing the synchronization problem in the OAI-PMH. Based on these metrics, we study the synchronization problem of the OAI-PMH framework and propose several approaches for harvesters to implement better synchronization. In particular, if a repository knows its update frequency, it can publish it in an OAI-PMH Identify response using an optional About container that borrows from RDF Site Syndication (RSS) Format.
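The synchronization problem described in the abstract above can be made concrete with a small sketch. The abstract does not define its metrics, so the `freshness` function below is a hypothetical stand-in (the share of repository records whose cached datestamp is current), not the paper's definition. The request builder uses the standard OAI-PMH `ListRecords` verb with its `metadataPrefix` and `from` arguments; the base URL is a placeholder.

```python
from urllib.parse import urlencode


def freshness(repository, cache):
    """Fraction of repository records whose cached copy is up to date.

    Both arguments map record identifiers to datestamp strings.
    Hypothetical metric for illustration only."""
    if not repository:
        return 1.0
    current = sum(1 for ident, stamp in repository.items()
                  if cache.get(ident) == stamp)
    return current / len(repository)


def incremental_request(base_url, last_harvest, metadata_prefix="oai_dc"):
    """Build a selective-harvest URL for records changed since last_harvest."""
    query = urlencode({
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "from": last_harvest,
    })
    return base_url + "?" + query


if __name__ == "__main__":
    repo = {"oai:example:1": "2003-05-01", "oai:example:2": "2003-05-20"}
    cached = {"oai:example:1": "2003-05-01", "oai:example:2": "2003-04-02"}
    print(freshness(repo, cached))
    print(incremental_request("http://example.org/oai", "2003-05-01"))
```

A harvester that knows a repository's update frequency could schedule such incremental requests to keep the measured freshness high without polling more often than needed.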
eBizSearch: an OAI-compliant digital library for eBusiness 199-209
  Yves Petinot; Pradeep B. Teregowda; Hui Han; C. Lee Giles; Steve Lawrence; Arvind Rangaswamy; Nirmal Pal
Niche Search Engines offer an efficient alternative to traditional search engines when the results returned by general-purpose search engines do not provide a sufficient degree of relevance and when nontraditional search features are required. Niche search engines can take advantage of their domain of concentration to achieve higher relevance and offer enhanced features. We discuss a new digital library niche search engine, eBizSearch, dedicated to e-business and e-business documents. The ground technology for eBizSearch is CiteSeer, a special-purpose automatic indexing document digital library and search engine developed at NEC Research Institute. We present here the integration of CiteSeer in the framework of eBizSearch and the process necessary to tune the whole system towards the specific area of e-business. We show how using machine learning algorithms we generate metadata to make eBizSearch Open Archives compliant. eBizSearch is a publicly available service and can be reached at [13].
The OAI-PMH static repository and static repository gateway 210-217
  Patrick Hochstenbach; Henry Jerez; Herbert Van de Sompel
Although the OAI-PMH specification is focused on making it straightforward for data providers to expose metadata, practice shows that in certain significant situations deployment of OAI-PMH conformant repository software remains problematic. In this paper, we report on research aimed at devising solutions to further lower the barrier to making metadata collections harvestable. We provide an in-depth description of an approach in which a data provider makes a metadata collection available as an XML file with a specific format -- an OAI Static Repository -- which is made OAI-PMH harvestable through the intermediation of software -- an OAI Static Repository Gateway -- operated by a third party. We describe the properties of both components, and provide insights into our experience with an experimental implementation of a Gateway.

Multimedia issues in digital libraries

How fast is too fast?: evaluating fast forward surrogates for digital video 221-230
  Barbara M. Wildemuth; Gary Marchionini; Meng Yang; Gary Geisler; Todd Wilkens; Anthony Hughes; Richard Gruss
To support effective browsing, interfaces to digital video libraries should include video surrogates (i.e., smaller objects that can stand in for the videos in the collection, analogous to abstracts standing in for documents). The current study investigated four variations (i.e., speeds) of one form of video surrogate: a fast forward created by selecting every Nth frame from the full video. In addition, it tested the validity of six measures of user performance when interacting with video surrogates. Forty-five study participants interacted with all four versions of the fast forward surrogate, and completed all six performance tasks with each. Surrogate speed affected performance on four of the measures: object recognition (graphical), action recognition, linguistic gist comprehension (full text), and visual gist comprehension. Based on these results, we recommend a fast forward default speed of 1:64 of the original video keyframes. In addition, users should control the choice of fast forward speed to adjust for content characteristics and personal preferences.
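The surrogate studied in the abstract above, a fast forward made by selecting every Nth frame, reduces to a one-line sampling operation. The sketch below assumes the video is available as an in-memory sequence of frames; the function name `fast_forward` is hypothetical.

```python
def fast_forward(frames, n):
    """Return a surrogate made of every n-th frame, starting with the first."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return frames[::n]


if __name__ == "__main__":
    video = list(range(640))          # stand-in for 640 decoded frames
    surrogate = fast_forward(video, 64)  # the 1:64 default recommended above
    print(len(surrogate), surrogate[:3])
```

Exposing `n` as a user-adjustable setting corresponds to the study's second recommendation, that users control the fast forward speed.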
Event-based retrieval from a digital library containing medical streams 231-233
  Mohamed Kholief; Kurt Maly; Stewart Shen
We describe a digital library that contains streams and supports event-based retrieval. Streams used in the digital library are CT scan, medical text, and audio streams. Events, such as 'tumor appeared', were generated and represented in the user interface to enable doctors to retrieve and playback segments of the streams. This paper concentrates on describing the data organization and the user interface.
Music representation in a digital music library 234-236
  Donald Byrd; Eric Isaacson
The Variations2 digital music library currently supports music in audio and score-image formats. In a future version, we plan to add music in a symbolic form. This paper describes our work defining a music representation suitable for the needs of our users.
A quantified fidelity criterion for parameter-embedded watermarking of audio archives 237-239
  A. R. Gurijala; J. R. Deller
A novel algorithm for speech watermarking through parametric modeling is enhanced by inclusion of a quantified fidelity criterion. Watermarking is effected through solution of a set-membership filtering (SMF) problem, subject to an l∞ fidelity criterion in the signal space. The SMF approach provides flexibility in obtaining watermark solutions that trade off watermark robustness and stegosignal fidelity.
Fourth-phase digital libraries: pacing, linking, annotating and citing in multimedia collections 240-244
  J. Alfredo Sanchez; J. Anibal Arias
We discuss the implications of the use of current multimedia collections and posit that it is possible to build what we term fourth-phase digital libraries (4PDLs). In 4PDLs users can take advantage of both the powerful audiovisual channels and the proven practices developed for media such as text. We demonstrate how various technologies can be integrated to produce a 4PDL.

Designing and accessing scientific digital libraries

On querying geospatial and georeferenced metadata resources in G-portal 245-255
  Zehua Liu; Ee-Peng Lim; Wee-Keong Ng; Dion H. Goh
G-Portal is a web portal system providing a range of digital library services to access geospatial and georeferenced resources on the Web. Among them are the storage and query subsystems that provide a central repository of metadata resources organized under different projects. In G-Portal, all metadata resources are represented in XML (Extensible Markup Language) and they comply with resource schemas defined by their creators. The resource schemas are extended versions of a basic resource schema, making it easy to accommodate all kinds of metadata resources while maintaining the portability of resource data. To support queries over the geospatial and georeferenced metadata resources, an XQuery-like query language known as RQL (Resource Query Language) has been designed. In this paper, we present the RQL language features and provide some experimental findings about the storage design and query evaluation strategies for RQL queries.
A scientific digital library in context: an Earth Radiation Budget Experiment collection in the atmospheric sciences data center digital library 256-257
  Michelle Ferebee; Gregory Boeshaar; Kathryn Bush; Judy Hertz
At the NASA Langley Research Center, the Earth Radiation Budget Experiment (ERBE) Data Management Team and the Atmospheric Sciences Data Center are developing a digital collection for the ERBE project. The main goal is long-term preservation of a comprehensive information environment. The secondary goal is to provide a context for these data products by centralizing the 25-year research project's scattered information elements. The development approach incorporates elements of rapid prototyping and user-centered design in a standards-based implementation. A working prototype is in testing with a small number of users.
Designing a language for creating conceptual browsing interfaces for digital libraries BIBAFull-Text 258-260
  Tamara Sumner; Sonal Bhushan; Faisal Ahmad; Qianyi Gu
Conceptual browsing interfaces can help educators and learners to locate and use learning resources in educational digital libraries; in particular, resources that are aligned with nationally-recognized learning goals. Towards this end, we are developing a Strand Map Library Service, based on the maps published by the American Association for the Advancement of Science (AAAS). This service includes two public interfaces: (1) a graphical user interface for use by teachers and learners and (2) a programmatic interface that enables developers to construct conceptual browsing interfaces using dynamically generated components. Here, we describe our iterative, rapid prototyping design methodology, and the initial round of language type components that have been implemented and evaluated.
Content access characterization in digital libraries BIBAFull-Text 261-262
  Greg Janee; James Frew; David Valentine
To support non-trivial clients, such as data exploration and analysis environments, digital libraries must be able to describe the access modes that their contents support. We present a simple scheme that distinguishes four content accessibility classes: download (byte-stream retrieval), service (API), web interface (interactive), and offline. These access modes may recursively nest in alternative (semantically equivalent) or multipart (component) hierarchies. This scheme is simple enough to be easily supported by DL content providers, yet rich enough to allow programmatic clients to automatically identify appropriate access point(s).
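As an illustration only, the four access classes and their recursive nesting might be modeled as below. This is a minimal sketch, not the authors' scheme: the class names, the `location` field, and the `find_access_points` helper are all hypothetical, and the example URLs are placeholders.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Union


class AccessClass(Enum):
    """The four content accessibility classes named in the abstract."""
    DOWNLOAD = "download"            # byte-stream retrieval
    SERVICE = "service"              # API
    WEB_INTERFACE = "web interface"  # interactive
    OFFLINE = "offline"


@dataclass
class AccessPoint:
    access_class: AccessClass
    location: str  # hypothetical field: a URL or a free-text description


@dataclass
class Alternative:
    """Semantically equivalent ways to reach the same content."""
    options: List["Node"] = field(default_factory=list)


@dataclass
class Multipart:
    """Components that together make up the content."""
    parts: List["Node"] = field(default_factory=list)


Node = Union[AccessPoint, Alternative, Multipart]


def find_access_points(node, wanted):
    """Walk the nested hierarchy and collect access points of one class."""
    if isinstance(node, AccessPoint):
        return [node] if node.access_class is wanted else []
    children = node.options if isinstance(node, Alternative) else node.parts
    found = []
    for child in children:
        found.extend(find_access_points(child, wanted))
    return found


# Placeholder content: either one download, or a two-part service.
example = Alternative([
    AccessPoint(AccessClass.DOWNLOAD, "https://example.org/data.zip"),
    Multipart([
        AccessPoint(AccessClass.SERVICE, "https://example.org/api/tiles"),
        AccessPoint(AccessClass.SERVICE, "https://example.org/api/meta"),
    ]),
])
```

A programmatic client could call `find_access_points(example, AccessClass.SERVICE)` to locate the API endpoints without knowing how deeply they are nested.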
SCENS: a system for the mediated sharing of sensitive data BIBAFull-Text 263-265
  Song Ye; Fillia Makedon; Tilmann Steinberg; Li Shen; James Ford; Yuhang Wang; Yan Zhao; Sarantos Kapidakis
This paper introduces SCENS, a Secure Content Exchange Negotiation System suitable for the exchange of private digital data that reside in distributed digital repositories. SCENS is an open negotiation system with flexibility, security and scalability. SCENS is currently being designed to support data sharing in scientific research, by providing incentives and goals specific to a research community. However, it can easily be extended to apply to other communities, such as government, commercial and other types of exchanges. It is a trusted third party software infrastructure enabling independent entities to interact and conduct multiple forms of negotiation.

Digital libraries in the classroom

Understanding educator perceptions of "quality" in digital libraries BIBAFull-Text 269-279
  Tamara Sumner; Michael Khoo; Mimi Recker; Mary Marlino
The purpose of the study was to identify educators' expectations and requirements for the design of educational digital collections for classroom use. A series of five focus groups was conducted with practicing teachers, pre-service teachers, and science librarians, drawn from different educational contexts (i.e., K-5, 6-12, College). Participants expect that the added value of educational digital collections is the provision of: (1) 'high quality' teaching and learning resources, and (2) additional contextual information beyond that in the resource. Key factors that influence educators' perceptions of quality were identified: scientific accuracy, bias, advertising, design and usability, and the potential for student distraction. The data showed that participants judged these criteria along a continuum of tolerance, combining consideration of several factors in their final judgements. Implications for collections accessioning policies, peer review, and digital library service design are discussed.
Integrating digital libraries into learning environments: the LEBONED approach BIBAFull-Text 280-290
  Frank Oldenettel; Michael Malachinski; Dennis Reil
This paper presents the project LEBONED that focuses on the integration of digital libraries and their contents into web-based learning environments. We describe in general how the architecture of a standard learning management system has to be modified to enable the integration of digital libraries. An important part of this modification is the LEBONED Metadata Architecture which depicts the handling of metadata and documents imported from digital libraries. The main components of this architecture and their interrelation are presented in detail. Afterwards we show a practical application of the concepts described before: The integration of the digital library eVerlage into the learning management system Blackboard.
The interactive shared educational environment: user interface, system architecture and field study BIBAFull-Text 291-300
  Xiangming Mu; Gary Marchionini; Amy Pattee
The user interface and system architecture of a novel Interactive Shared Educational Environment (ISEE) are presented. Based on a lightweight infrastructure, ISEE enables relatively low bandwidth network users to share videos as well as text messages. Smartlink is a new concept introduced in this paper. Individual information presentation components, like the video player and text chat room, are "smartly" linked together through video timestamps and hyperlinks. A field study related to children's book selections using ISEE was conducted. The results indicated that the combination of three information presentation components, including video player with storyboard, shared browser, and text chat room, provided an effective and more comfortable collaboration and learning environment for the given tasks than text reviews or text chat alone or in combination. The video player was the most preferred information component. Text comments in the chat room that did not synchronize with the video content distracted some participants due to limited cognitive capacity. Using smartlink to synchronize various information components or "channels" is our attempt to reduce the user's working memory load in information enriched distance learning environments made possible by digital libraries.

Standards, mark-up, and metadata

XML semantics and digital libraries BIBAFull-Text 303-305
  Allen Renear; David Dubin; C. M. Sperberg-McQueen; Claus Huitfeldt
The lack of a standard formalism for expressing the semantics of an XML vocabulary is a major obstacle to the development of high-function interoperable digital libraries. XML document type definitions (DTDs) provide a mechanism for specifying the syntax of an XML vocabulary, but there is no comparable mechanism for specifying the semantics of that vocabulary -- where semantics simply means the basic facts and relationships represented by the occurrence of XML constructs. A substantial loss of functionality and interoperability in digital libraries results from not having a common machine-readable formalism for expressing these relationships for the XML vocabularies currently being used to encode content. Recently a number of projects and standards have begun taking up related topics. We describe the problem and our own project.
Utility of an OAI service provider search portal BIBAFull-Text 306-308
  Sarah L. Shreeves; Christine Kirkham; Joanne Kaczmarek; Timothy W. Cole
The Open Archives Initiative (OAI) Protocol for Metadata Harvesting (PMH) facilitates efficient interoperability between digital collections, in particular by enabling service providers to construct, with relatively modest effort, search portals that present aggregated metadata to specific communities. This paper describes the experiences of the University of Illinois at Urbana-Champaign Library as an OAI service provider. We discuss the creation of a search portal to an aggregation of metadata describing cultural heritage resources. We examine several key challenges posed by the aggregated metadata and present preliminary findings of a pilot study of the utility of the portal for a specific community (student teachers). We also comment briefly on the potential for using text analysis tools to uncover themes and relationships within the aggregated metadata.
The Dienst-OAI gateway BIBAFull-Text 309-311
  Terry L. Harrison; Michael L. Nelson; Mohammad Zubair
Though the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is becoming the de facto standard for digital libraries, some of its predecessors are still in use. Although a limited number of Dienst repositories continue to be populated, others are precariously unsupported. The Dienst Open Archive Gateway (DOG) is a gateway between the OAI-PMH and the Dienst (version 4.1) protocol. DOG allows OAI-PMH harvesters to extract metadata records (in RFC-1807 or Dublin Core) from Dienst servers.
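For readers unfamiliar with harvesting, the sketch below shows how an OAI-PMH harvester typically forms a `ListRecords` request. It is illustrative only: the gateway base URL is a placeholder, the helper name is hypothetical, and which metadata prefixes a particular gateway accepts is an assumption (`oai_dc` is the Dublin Core prefix defined by the protocol).

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        # Optional selective harvesting by set.
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Placeholder base URL; a real gateway would publish its own.
url = list_records_url("http://gateway.example.org/oai")
```

A harvester would fetch this URL and parse the XML response; the request itself carries no state beyond these query parameters.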
The XML log standard for digital libraries: analysis, evolution, and deployment BIBAFull-Text 312-314
  Marcos Andre Goncalves; Ganesh Panchanathan; Unnikrishnan Ravindranathan; Aaron Krowne; Edward A. Fox; Filip Jagodzinski; Lillian Cassel
We describe current efforts and developments building on our proposal for an XML log standard format for digital library (DL) logging analysis and companion tools. Focus is given to the evolution of formats and tools, based on analysis of deployment in several DL systems and testbeds. Recent development of analysis tools also is discussed.
A quantitative analysis of unqualified Dublin Core Metadata Element Set usage within data providers registered with the Open Archives Initiative BIBAFull-Text 315-317
  Jewel Ward
This research describes an empirical study of how the unqualified Dublin Core Metadata Element Set (DC or DCMES) is used by 100 Data Providers (DPs) registered with the Open Archives Initiative (OAI). The research was conducted to determine whether or not the DCMES is used to its full capabilities. Eighty-two of 100 DPs have metadata records available for analysis. DCMES usage varies by type of DP. The average number of Dublin Core elements per record is eight, with an average of 91,785 Dublin Core elements in each DP. Five of the 15 elements of the DCMES are used 71% of the time. The results show the unqualified DCMES is not used to its fullest extent within DPs registered with the OAI.
Extracting geometry from digital models in a cultural heritage digital library BIBAFull-Text 318-320
  Thomas L. Milbank
This paper describes research to enhance the integration between digital models and the services provided by the document management systems of digital libraries. Processing techniques designed for XML texts are applied to X3D models, allowing specific geometry to be automatically retrieved and displayed. The research demonstrates that models designed on object-oriented paradigms are most easily exploited by XML document management systems.

Tools for building digital libraries

Assembling and enriching digital library collections BIBAFull-Text 323-334
  David Bainbridge; John Thompson; Ian H. Witten
People who create digital libraries need to gather together the raw material, add metadata as necessary, and design and build new collections. This paper sets out the requirements for these tasks and describes a new tool that supports them interactively, making it easy for users to create their own collections from electronic files of all types. The process involves selecting documents for inclusion, coming up with a suitable metadata set, assigning metadata to each document or group of documents, designing the form of the collection in terms of document formats, searchable indexes, and browsing facilities, building the necessary indexes and data structures, and putting the collection in place for others to use. Moreover, different situations require different workflows, and the system must be flexible enough to cope with these demands. Although the tool is specific to the Greenstone digital library software, the underlying ideas should prove useful in more general contexts.
A system for building expandable digital libraries BIBAFull-Text 335-345
  Donatella Castelli; Pasquale Pagano
Expandability is one of the main requirements of future digital libraries. This paper introduces a digital library service system, OpenDLib, that has been designed to be highly expandable in terms of content, services and usage. The paper illustrates the mechanisms that enable expandability and discusses their impact on the development of the system architecture.
The Web-DL environment for building digital libraries from the Web BIBAFull-Text 346-357
  Pavel P. Calado; Marcos A. Goncalves; Edward A. Fox; Berthier Ribeiro-Neto; Alberto H. F. Laender; Altigran S. da Silva; Davi C. Reis; Pablo A. Roberto; Monique V. Vieira; Juliano P. Lage
The Web contains a huge volume of unstructured data, which is difficult to manage. In digital libraries, on the other hand, information is explicitly organized, described, and managed. Community-oriented services are built to attend to specific information needs and tasks. In this paper, we describe an environment, Web-DL, that allows the construction of digital libraries from the Web. The Web-DL environment will allow us to collect data from the Web, standardize it, and publish it through a digital library system. It provides support for the services and organizational structure normally available in digital libraries, while benefiting from the breadth of the Web's contents. We experimented with applying the Web-DL environment to the Networked Digital Library of Theses and Dissertations (NDLTD), thus demonstrating that the rapid construction of DLs from the Web is possible. Also, Web-DL provides an alternative as a large-scale solution for interoperability between independent digital libraries.

Correction and analysis

Distributed proofreading BIBAFull-Text 361-363
  Gregory B. Newby; Charles Franks
Distributed proofreading allows many people working individually across the Internet to contribute to the proofreading of a new electronic book. This paper describes Project Gutenberg's Distributed Proofreading project, along with our general procedures for creating an electronic book from a physical book. Distributed proofreading has promise for the future of Project Gutenberg, and is likely to be a useful strategy for other digital library projects.
Correcting broken characters in the recognition of historical printed documents BIBAFull-Text 364-366
  Michael Droettboom
This paper presents a new technique for dealing with broken characters, one of the major challenges in the optical character recognition (OCR) of degraded historical printed documents. A technique based on graph combinatorics is used to rejoin the appropriate connected components. It has been applied to real data with successful results.
Correcting common distortions in camera-imaged library materials BIBAFull-Text 367-368
  Michael S. Brown; Desmond Tsoi
We present a technique to correct image distortion that can occur when library materials are imaged by cameras. Our approach provides a general framework to undo a variety of common distortions, including binder curl, fold distortion, and combinations of the two. Our algorithm is described and demonstrated on several examples.
Link attachment (preferential and otherwise) in contributor-run digital libraries BIBAFull-Text 369-371
  Miles Efron; Donald Sizemore
Ibiblio is a digital library whose materials are submitted and maintained by volunteer contributors. This study analyzes the emergence of hyperlinked structures within the ibiblio collection. In the context of ibiblio, we analyze the suitability of Barabasi's model of preferential attachment to describe the distribution of incoming links. We find that the degree of maintainer activity for a given site (as measured by the voluntary development of descriptive metadata) is a stronger link count predictor for ibiblio than is a site's age, as the standard model predicts. Thus we argue that the efforts of ibiblio's contributors positively affect the popularity of their materials.
Automatic disambiguation of Latin abbreviations in early modern texts for humanities digital libraries BIBAFull-Text 372-373
  Jeffrey A. Rydberg-Cox
Early modern books written in Latin contain many abbreviations of common words that are derived from earlier manuscript practice. While these abbreviations are usually easily deciphered by a reader well-versed in Latin, they pose technical problems for full text digitization: they are difficult to OCR or have typed and -- if they are not expanded correctly -- they limit the effectiveness of information retrieval and reading support tools in the digital library. In this paper, I will describe a method for the automatic expansion and disambiguation of these abbreviations.

Demonstrations

Educational tools in support of the Stanford MediaServer BIBAFull-Text 377
  Derek Stevenson; Chih-Chien Chao; Sakti Srivastava; Jeremy C. Durack; Amy Ladd; Kevin Montgomery; Jenn Stringer; Parvati Dev
Medical media resources exist in a variety of analog and digital formats. Collections are generally organized and stored by their owners, each of whom utilizes their own method of cataloging and retrieval. As faculty retire, move on, or pass away, institutions risk losing the expertise that enhances the value of the media. The Stanford MediaServer has previously been deployed to catalog, organize, and centralize management of such media collections via the World Wide Web. Educational tools have been developed on top of existing MediaServer infrastructure to address a range of pedagogical models, and to promote widespread adoption within the Stanford Medical School curriculum and departments. These tools include Slide Show, Export to PowerPoint, Teaching File, and e-Books. With the exception of e-Books, these tools use web-based wizards to lead the user through the steps for creating each component.
   Slide Shows consist of an ordered set of images and provide the underpinning data structures for PowerPoint and Teaching File creation. Slide Shows can be assembled from any accessible media in the MediaServer and shared with other users of the system.
   Export to PowerPoint is a utility function to address the widespread use of PowerPoint in medical education and multimedia presentation. It allows Slide Shows to be converted to PowerPoint and downloaded to the client system for offline use, easing the process of assembling media and creating a PowerPoint document. This function leverages XML Web Services and the SOAP protocol to achieve the desired outputs.
   Teaching Files are used to illustrate a particular educational topic, and consist of a multi-page interface. Each page contains media and annotations specific to the educational topic at hand. Annotations are stored with the Teaching File and not with the collated media. Individual pages are assembled by choosing existing Slide Shows and further annotating the media.
   E-Books are web-based books built on a particular design template provided by the MediaServer. Authors can integrate media from the MediaServer into these e-Books, which are assembled through the use of 3rd party tools such as Macromedia Dreamweaver.
   MediaServer resources were deployed in a gross anatomy course through the use of these tools and integration with third party applications, including a three-dimensional stereo viewing system. This pilot project was well received by the course participants and evaluation of usage data is ongoing.
   These educational media tools must be further evaluated for their teaching efficacy. These tools will be evaluated with volunteer faculty contributing media and creating Slide Shows, PowerPoint documents, Teaching Files, and e-Books. These educational modules will then be used for medical school classes. Feedback will be integrated into further development of new educational tools, providing new views into the large Stanford MediaServer dataset.
   Access rights management and security is paramount for the protection of digital media. The existing MediaServer security system will be enhanced to address privacy concerns, while providing faculty the flexibility to appropriately create and share educational units with their students and colleagues. Standard APIs will also be created to allow third-party developers to access the media in the MediaServer and deliver it through their own web-based applications.
   This work was partially funded by gifts from the Yamazaki-Yang Family Foundation, the Siminoff Family Foundation, Sun Microsystems, and Silicon Graphics.
Processing and formatting system for digital collections BIBAFull-Text 378
  Frances Webb
This system is being used to build structure data for the HEARTH digital collection and to manage the collection under the DLXS system. It allows student workers or unskilled employees to build structure metadata from scanned images for both monographs and serials, and manages the process of delivering the titles under DLXS once prepared. It allows supervisors to manage the work, simplifying tasks like re-assigning the in-progress work of graduated students.
CMedPort: a cross-regional Chinese medical portal BIBAFull-Text 379
  Yilu Zhou; Jialun Qin; Hsinchun Chen; Zan Huang; Yiwen Zhang; Wingyan Chung; Gang Wang
CMedPort is a cross-regional Chinese medical Web portal developed in the AI Lab at the University of Arizona. We will demonstrate the major system functionalities.
V2V: a second variation on query-by-humming BIBFull-Text 380
  William P. Birmingham; Kevin O'Malley; Jon W. Dunn; Ryan Scherle
A digital collections management system based on open source software BIBAFull-Text 381
  Allison Zhang; Don Gourley
Robust and flexible digital collections management and presentation software is essential for creating and delivering digital collections. But digital library technologies and contents are not static. Continual evolution and investment are required to maintain the digital library. Few commercial digital library products are comprehensive and extensible enough to support this evolution. Many of these systems are in early release and have not been used and tested widely. Some require an initial investment in license fees or staff time that we could not afford. None of the products covered the full range of functionality needed for our digital library.
Object-oriented modeling, import and query processing of digital documents BIBFull-Text 382
  Andre Zeitz; Ilvio Bruder
Stanford encyclopedia of philosophy: a dynamic reference work BIBFull-Text 383
  Colin Allen; Uri Nodelman; Edward N. Zalta
Digital library service integration BIBFull-Text 384
  Xin Chen; Dong-ho Kim; Nikechi Nnadi; Himanshu Shah; Prateek Shrivastava; Michael Bieber; Il Im; Yi-Fang Wu
5SGraph demo: a graphical modeling tool for digital libraries BIBAFull-Text 385
  Qinwei Zhu; Marcos Andre Goncalves; Edward A. Fox
The current demand from non-experts who wish to build digital libraries is strong worldwide. However, since DLs are complex systems, it usually takes a huge amount of effort and time to create and tailor a digital library to satisfy specific needs and requirements of target communities/societies. What is desired is a simplified modeling process and rapid generation of digital libraries. To enable this, digital libraries should be modeled with descriptive domain-specific languages [1]. In a domain-specific modeling language, the models are made up of elements representing concepts, rules, and terminology that are part of the domain world, as opposed to the code world or generic modeling languages (e.g., UML [2]). A visual modeling tool would be helpful to non-experts so they may model a digital library without knowing the theoretical foundations and the syntactical details of the descriptive language.
   In this demonstration, we present a domain-specific visual modeling tool, 5SGraph, aimed at modeling digital libraries. 5SGraph is based on a metamodel that describes DLs using the 5S theory [3]. The output from 5SGraph is a digital library model that is an instance of the metamodel, expressed in the 5S description language (5SL) [4]. 5SGraph presents the metamodel in a structured toolbox, and provides a top-down visual building environment for designers (see Figure 1). The visual proximity of the metamodel and instance model facilitates requirements gathering and simplifies the modeling process. Furthermore, 5SGraph maintains semantic constraints specified by the 5S metamodel and enforces these constraints over the instance model to ensure semantic consistency and correctness. 5SGraph enables component reuse to reduce the time and efforts of designers. 5SGraph also is designed to be flexible and extensible, able to accommodate and integrate several other complementary tools (e.g., to model scenarios or complex digital objects), reflecting the interdisciplinary nature of digital libraries. The tool has been tested with real users and several modeling tasks in a usability experiment [5] and its usefulness and learnability have been demonstrated.
ICON (Innovation Curriculum Online Network): the national digital library for technological literacy BIBAFull-Text 386
  Quentin M. Briggs
The International Technology Education Association (ITEA), in partnership with the Eisenhower National Clearinghouse (ENC) and funded by the National Science Foundation, has created a comprehensive digital library collection for K-12 technological literacy in an accessible virtual environment. ICON, or the Innovation Curriculum Online Network, is a central source for information dealing with technology and innovation.
   ICON serves as a national electronic roadmap to connect users, such as teachers, professors, students, museum staff, and parents, with information about our human built and innovated world. Users may use the digital library to access resources ranked according to technological literacy content and pedagogy, to interact with quality instructional resources, and to enhance online search capabilities relevant to the needs of the user population. The focused digital library contains online resources including websites, electronic files, information about professional organizations, government agencies, public and private foundations, and commercial enterprises. Identification and selection of these resources are in alignment with national standards, grade and age level appropriateness, sound instructional and disciplinary content, and current availability of and access to materials.
   ENC has built a robust electronic infrastructure to support: the development of relevant and appropriate metadata (in conjunction with other synergistic NSDL projects); the processing of records and abstracts; the development of value-added user interfaces; and the maintenance of computer services for optimum and continuous digital library operations. An advisory board is providing annual input into digital library development and identification of quality digital resources. Formal evaluation of the ICON project is being conducted by Horizon Research.
   Field testing of the collection and its services is being undertaken with diverse groups of users to evaluate ease of navigation and discovery of content-rich, pedagogically sound resources.
   A variety of methods of sustainability for the collection are being explored, including public and/or private sponsorship and subscriber support. ICON was officially launched in March 2003 (www.icontechlit.org) and presented in a special interest session on March 13, 2003 at the ITEA Conference in Nashville, Tennessee. Currently, ICON has established tools for simple search, advanced search, and browse by technology concepts. The technology concepts are classified according to the National Standards for Technological Literacy, initially driven by the Technology for All Americans Project (http://www.iteawww.org/TAA/Listing.htm). Continuous user feedback will be monitored through a "contact us" link established not only to receive communications on problems with the digital library but also to allow user questions and site evaluations. Users may also "Suggest a Resource" to ICON to be considered for inclusion in the collection.
NanoPort: an example for building knowledge portals for scientific domains BIBAFull-Text 387
  Jialun Qin; Zan Huang; Yilu Zhou; Michael Chau; Chunju Tseng; Alan Yip; T. Gavin Ng; Fei Guo; Zhi-Kai Chen; Hsinchun Chen
We describe the NanoPort (www.nanoport.org) system to demonstrate a general framework of building domain-specific knowledge portals. These portals consolidate diverse information resources and provide rich functionalities to support effective information retrieval and knowledge discovery.
EconPort: a digital library for Microeconomics education BIBAFull-Text 388
  Hsinchun Chen; Daniel Zeng; Riyad Kalla; Zan Huang; James C. Cox; J. Todd Swarthout
We present the EconPort system (www.econport.org), a digital library for Microeconomics education that incorporates experimental economics software and automated e-commerce agents.

Posters

Displaying resources in context: using digital libraries to support changes in undergraduate education BIBAFull-Text 391
  Cathryn A. Manduca; Sean Fox
Education digital libraries strive to foster major improvements in education by supporting adoption of more effective teaching methods. We present initial efforts to assist faculty in changing teaching practice by displaying digital library resources in portals that address a specific educational issue and provide the full spectrum of resources needed to both motivate and implement a change in practice.
A proposal for digital library protection BIBAFull-Text 392
  Hideyasu Sasaki; Yasushi Kiyoki
We propose systematic digital library protection by patentable content-based retrieval processes, especially on image digital libraries in specified domains, without any excessively exclusive protection in general domains.
Content-based summarization for personal image library BIBAFull-Text 393
  Joo-Hwee Lim; Jun Li; Philippe Mulhem; Qi Tian
With the accumulation of consumers' personal image libraries, the problem of managing, browsing, querying and presenting photos effectively and efficiently becomes critical. We propose a framework for automatic organization of personal image libraries based on analysis of image creation time stamps and image contents to facilitate browsing and summarization of images.
Modularization framework for digital museum exhibition BIBAFull-Text 394
  Bai-Hsun Chen; Sheng-Hao Hung; Jen-Shin Hong
Conventionally, digital museum online exhibitions are constructed using handcrafted HTML pages which require tedious hypermedia composing. This paper proposes a sophisticated modularization framework for exhibition website construction by integrating XML and Flash MX. A typical exhibition page is differentiated into several "layers" containing specific types of "media elements". Several categories of modularized Flash-based "media-handlers" are used to process and present the layers containing media elements. A complete set of media-handlers presenting the content are then integrated together to give the final page presentation. Based on this modularization framework, the workflow for exhibition construction and management is significantly improved.
Sustainability issues and activities for the NSDL BIBAFull-Text 395
  David J. McArthur; Sarah Giersch; Howard Burrows
This poster will review the work on sustainability of digital libraries in the context of the NSF-supported National Science Digital Library (NSDL) program. Applied to digital libraries, sustainability is a broad term, referring to everything from technical issues about the digital preservation of materials, to the social questions surrounding the long-term accessibility of resources to the public at large.
Contribution and collaboration strategies for the National Science Digital Library (nsdl.org): investigating technological solutions to facilitate social evolution of a collaborative infrastructure BIBAFull-Text 396
  Elly Cramer; Dean Krafft; Diane Hillmann; John Saylor; Carol Terrizzi
The NSDL community consists of large, discipline-diverse, and decentralized user groups made up of collaborator communities who create, aggregate, and contribute digital resources to the NSDL. NSDL Core Integration provides "wholesale" services to NSDL collaborator communities who may "retail" those services through their own portals, perhaps packaged with additional content selected to meet their specialized users' needs. NSDL "wholesale" services will support rich representations of complex data relationships. NSDL will distribute access to aggregations and annotations stored in the NSDL metadata repository that have been harvested, normalized (based on the scalable library production model in use at nsdl.org), and exposed for re-harvest. "Retailers" may use the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) to harvest these structured data relationships and make them available for use in other library services.
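To make the harvesting step concrete, here is a minimal sketch of how a "retailer" might build an OAI-PMH `ListRecords` request URL. The verb and the `metadataPrefix`, `set`, and `resumptionToken` arguments come from the OAI-PMH 2.0 protocol. The helper name `list_records_url` and the base URL `https://repository.example.org/oai` are hypothetical and do not refer to any actual NSDL endpoint.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL (no network access is performed)."""
    if resumption_token is not None:
        # In OAI-PMH 2.0, resumptionToken is an exclusive argument:
        # it is sent with the verb only, to fetch the next chunk of a list.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Hypothetical repository base URL, used purely for illustration.
BASE_URL = "https://repository.example.org/oai"
first_request = list_records_url(BASE_URL)
next_request = list_records_url(BASE_URL, resumption_token="abc")
```

A harvester would issue the first request, then keep following resumption tokens from each response until the repository returns none.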
FLOW: co-constructing low barrier repository infrastructure in support of heterogeneous knowledge collection(s) BIBAFull-Text 397
  Karen S. Baker; Anna K. Gold; Frank Sudholt
Institutional repositories are being constructed today to address the needs of scholarly communication in a digital environment [1, 2]. The success of such institutional infrastructures as knowledge collections depends in part on offering low barriers for participation and on supporting heterogeneous knowledge inputs and outputs. The San Diego Supercomputer Center (SDSC) in partnership with CERN (European Center for Nuclear Research), the Scripps Institution of Oceanography (SIO), and the University of California, San Diego (UCSD) Science & Engineering Library, has modified CERN's CDSware software to initiate the process of creating a local low barrier repository.
MetaTest: evaluation of metadata from generation to use BIBFull-Text 398
  Elizabeth D. Liddy; Eileen E. Allen; Christina M. Finneran; Geri Gay; Helene Hembrooke; Laura A. Granka
Finding and using data in educational digital libraries BIBAFull-Text 399
  Rajul Pandya; Ben Domenico; Mary Marlino
THREDDS (THematic Real-time Earth Distributed Data Servers) services catalog geophysical data and other data services to support discovery and use by researchers. THREDDS, however, doesn't support data discovery and use by learners and educators (i.e. novices). Educational digital libraries, like DLESE (Digital Library for Earth System Education) provide rich metadata descriptions that are effective in helping novices locate and use most types of learning resources. DLESE, however, doesn't provide a way for novices to discover geophysical data in immediately usable forms. The VGEE (Visual Geophysical Exploration Environment) supports novices' discovery and use of geophysical data by linking THREDDS services with educational curricula and learner-centered data tools. The curricula are cataloged in DLESE and so can be discovered in educational settings. These curricula then guide novices to the appropriate tools and illustrate meaningful use of the data. More generally, by coupling data to curricular documents, text-based discovery tools (e.g. search engines) can be extended to data.
An XQuery engine for digital library systems BIBAFull-Text 400
  Ji-Hoon Kang; Chul-Soo Kim; Eun-Jeong Ko
XML is now a standard markup language for web information. Many application areas are producing XML documents on the web. This situation urges digital library systems to deal with not only typical text documents but also XML documents. XML documents are semi-structured. Some queries based on the structures are useful and necessary.
   MPEG-7 is a metadata standard for multimedia objects. MPEG-7 metadata can describe some features such as color histogram of image, so that a multimedia digital library system using MPEG-7 for metadata representation can provide content-based search for multimedia objects. MPEG-7 is defined by XML schema. In order to retrieve MPEG-7 metadata, a query language for XML data is required.
   A standard query language is very helpful for interoperability among digital library systems over the Internet. XQuery, which has been influenced by most of the previous XML query languages, is a forthcoming standard for querying XML data.
   In this paper we propose an XQuery Engine, as depicted in the figure, that can be used as an XQuery processing module in a digital library system that supports XML documents. We assume a generic digital library system architecture. It consists of four modules: a user interface, an XQuery Engine, an Information Retrieval Engine, and an XML Repository. The user interface module gives a user an easy way to search XML documents and transforms a given user query into an equivalent XQuery. The XQuery Engine module takes an XQuery as input and provides a query plan for the Information Retrieval Engine as output. The Information Retrieval Engine executes a query plan by communicating with the XML Repository, which stores XML documents.
   The XQuery Engine parses an input XQuery and constructs a syntax tree for the query. Then, it transforms the syntax tree into a query plan, called a Primitive Operation Tree (POT). Each node of a POT represents an atomic operation in terms of the Information Retrieval Engine and can be interpreted and processed by that engine. The result set is given back to the XQuery Engine, which in turn transforms the result into an XML document of the form required by the user interface. The final result in XML is returned to the user interface.
   Our approach has the following useful aspects. First, any user interface that generates XQuery is able to access any digital library system including our XQuery Engine. Second, we define a set of primitive operations for POTs so that they can become a standard interface between an XQuery Engine and an Information Retrieval Engine for our generic digital library system that supports XML documents. Third, some query optimizations over POTs can be done in the XQuery Engine so that better searching performance is expected.
   Currently we are developing an XQuery Engine prototype. It will be installed inside an MPEG-7-based Digital Library System that supports content-based searching for images. The XQuery specification is an ongoing working draft and is not yet complete. Since the current version of the XQuery specification does not define full functions for information retrieval, we need to extend XQuery syntax by adding some functions such as rankby().
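The Primitive Operation Tree described above can be pictured with a small sketch. The abstract does not list the actual primitive operations, so the operation names (`scan`, `select`, `rank_by`), the `PotNode` class, and the in-memory repository below are all assumptions for illustration. The `rank_by` node is only a guess at what a rankby()-style extension might do, not the authors' definition.

```python
from dataclasses import dataclass, field


@dataclass
class PotNode:
    """One node of a query plan: an atomic operation plus its child operations."""
    op: str                                   # hypothetical primitive operation name
    args: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def execute(node, repository):
    """Evaluate a plan bottom-up against an in-memory list of records."""
    inputs = [execute(child, repository) for child in node.children]
    if node.op == "scan":
        return list(repository)
    if node.op == "select":
        key, value = node.args["field"], node.args["equals"]
        return [record for record in inputs[0] if record.get(key) == value]
    if node.op == "rank_by":
        key = node.args["field"]
        return sorted(inputs[0], key=lambda record: record[key], reverse=True)
    raise ValueError(f"unknown primitive operation: {node.op}")


# Stand-in for the XML repository: three records with made-up relevance scores.
repository = [
    {"id": "img1", "type": "image", "score": 0.4},
    {"id": "doc1", "type": "text", "score": 0.9},
    {"id": "img2", "type": "image", "score": 0.7},
]

# Plan: scan everything, keep images, then rank by score (highest first).
plan = PotNode("rank_by", {"field": "score"}, [
    PotNode("select", {"field": "type", "equals": "image"}, [PotNode("scan")]),
])
result_ids = [record["id"] for record in execute(plan, repository)]
```

Keeping the plan as a plain tree of named operations is what would let a query engine rewrite it, for instance by pushing selections below ranking, before handing it to the retrieval engine.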
CephSchool: a pedagogic portal for teaching biological principles with cephalopod molluscs BIBFull-Text 401-402
  James B. Wood; Caitlin M. H. Shaw
VIVO: a Video Indexing and Visualization Organizer BIBFull-Text 403
  Meng Yang; Xiangming Mu; Gary Marchionini
The roadies take the stage: on-going development and maintenance of the legacy tobacco documents library at the University of California San Francisco BIBFull-Text 404
  Heidi Schmidt
BiosCi Education Network (BEN) collaborative BIBFull-Text 405
  Linda Akli; Cal T. Collins; Jason Smith; Ron Butler; Amy Chang; Yolanda George; Nancy Gough; Melinda Lowy; Marsha Matyas; Brandon Muramatsu; Susan Musante; Jason Taylor
Palau Community College-Belau National Museum image archives digitization and access project BIBAFull-Text 406
  Imengel Mad
This poster presentation will describe a collaboration project between the Palau Community College (PCC) Library and the Belau National Museum (BNM). The project, funded by a two-year U.S. Institute of Museum and Library Services (IMLS) National Leadership Grant, will enhance access to the BNM Media Collection. The Media Collection is in great demand, and the pressures of human use exacerbate an already tenuous situation for the long-term preservation of the images. While digitization is not viewed as the preservation solution, it will assist the Museum to lessen the impact of human handling. By making the Media Collection more accessible through integration of the PCC Library's online catalog, a much wider audience will be reached, and mishandling of the original images will be significantly reduced.
   The PCC website, currently under final development, will link to the Library WebCollection Plus, which will contain digitized images selected from the extensive photo archives, as well as digitized images of the ethnographic and other objects in the Museum's collection, including contemporary art. This poster session will enable viewers to see the range of images included in the project.
   This poster presentation will enable researchers to learn how this project will support scholarly research.
Steps towards establishing shared evaluation goals and procedures in the National Science Digital Library BIBAFull-Text 407
  Tamara Sumner; Sarah Giersch; Casey Jones
A community-based process was used to develop shared evaluation goals and instruments to begin evaluating the National Science Digital Library (NSDL). Results from a pilot study examining library usage, collections growth, and library governance processes are reported. The methods used in the pilot included web log usage analysis, collections assessment techniques, survey instruments, and semi-structured interviews.
A comparison of two educational resource discovery systems BIBAFull-Text 408
  Tamara Sumner; Sonal Bhushan; Faisal Ahmad; Lynne Davis
We describe the results from a pilot study that compared two different discovery systems designed and built to operate in the same educational digital library -- one based on searching over metadata records and another hybrid system which combined metadata and content-based indexing.
Collections and access policies of the digital material of ten national libraries BIBFull-Text 409
  Alexandros Koulouris; Sarantos Kapidakis

Workshops

Cross-cultural usability for digital libraries BIBAFull-Text 415
  Nadia Caidi; Anita Komlodi
The scope and reach of digital libraries (DL) is truly global, spanning geographical and cultural boundaries, yet few scholars have investigated the influence of culture as it pertains to the design and use of digital libraries. This workshop will examine cross-cultural issues around the use and development of DLs, especially as they relate to supporting cross-cultural usability of DLs.
International workshop on Information Visualization Interfaces for Retrieval and Analysis (IVIRA) at the Joint Conference on Digital Libraries 2003 BIBAFull-Text 416
  Javed Mostafa; Katy Borner
The IVIRA workshop has been organized to attract cutting-edge efforts that concentrate on improving information retrieval and analysis by applying visualization techniques in interface design.
Building a meaningful Web: from traditional knowledge organization systems to new semantic tools BIBAFull-Text 417
  Gail M. Hodge; Marcia Lei Zeng; Dagobert Soergel
This Networked Knowledge Organization Systems/Services (NKOS) workshop focused on the transformation of traditional knowledge organization systems (KOSs) to new forms of knowledge representation that are being developed to support a more semantic-based, meaningful Web environment. The goal of the workshop was to identify principles from more traditional practices that can contribute to the design of new knowledge organization systems and ways to exploit the extensive intellectual capital available in traditional KOSs when developing new KOS tools.
   Traditional KOSs include a broad range of system types from term lists to classification systems and complex thesauri. Term lists may be simple authority lists. Classification systems put resources in broad groups or "buckets". Traditional thesauri are built on broader-narrower, synonymous and associative (or related term) relationships. These and other traditional KOSs were developed in a print environment or in the early days of computerized databases to control the vocabulary used when indexing and searching a specific product, such as a bibliographic database, or when organizing a physical collection such as a library.
   New forms of knowledge representation include ontologies, topic maps, and other semantic Web components. The relationships between concepts in these tools are richer. In particular, the associative relationships and broader-narrower relationships are defined in more detail. New semantic tools emphasize the ability of the computer to process the KOS against a body of text, rather than support the human indexer or trained searcher. These tools are intended for use in the broader, more uncontrolled context of the Web to support information discovery by a larger community of interest or by Web users in general.
   While the traditional KOSs and newer tools are related, the development of the newer forms of KOS tools has, on the whole, not taken advantage of traditional KOSs. There is little understanding of how traditional tools can be transformed for the demands of the Web environment and whether there are lessons that can be learned from the decades of development and maintenance of these traditional systems.
   This workshop compared the traditional KOSs and new approaches to improving the semantic capabilities of the Web. Best practices and lessons learned from the development, maintenance and use of traditional KOSs were identified. Descriptions of projects involving the transformation of traditional KOSs to newer forms emphasized the transition process, including the analysis of the traditional KOS, and the characteristics of the KOS that could be carried through to the new tool. The presenters also discussed the degree to which the traditional KOS and the new tool would be used together in the future, whether there would be parallel or separate maintenance activities, etc.
   Presenters described the development of specific Web service functionality applicable to KOSs. The benefits of this service-based approach and the possibility of universal or community-based KOS services were explored.
   In addition to formal presentations, the workshop participants gave brief updates on their work or interest in this area. A facilitated discussion identified areas where standards, best practices, technologies, or more research are needed to take advantage of the investment in traditional KOSs when developing new tools.
   NKOS is an ad hoc group devoted to the discussion of KOSs as networked interactive information services to support the description and retrieval of diverse information resources through the Internet. This is the 6th in a series of NKOS workshops held in conjunction with JCDL. More information about NKOS is available from http://nkos.slis.kent.edu/.
OAI metadata harvesting workshop BIBAFull-Text 418
  Simeon Warner
This workshop will bring together people with Open Archives Initiative (OAI) [1] metadata harvesting experience to discuss problems and their solutions, and to identify best practices. The focus will be on near- to medium-term practical issues. Participants will have the opportunity to discuss problems or raise issues that they have encountered and will benefit from the shared experience of the other participants. The workshop will combine and distill the OAI harvesting knowledge and experience of the participants to detail 1) best practices and existing solutions to particular harvesting problems; and 2) unresolved problems and issues with current implementations, the specification, or limitations of version 2.0 of the OAI Protocol for Metadata Harvesting (OAI-PMH) [2]. The conclusions of the workshop will be disseminated to the wider OAI community.