| The Semantic GrowBag Algorithm: Automatically Deriving Categorization Systems | | BIBAK | Full-Text | 1-13 | |
| Jörg Diederich; Wolf-Tilo Balke | |||
| Using keyword search to find relevant objects in digital libraries often
yields result sets that are far too large. Based on the metadata associated with
such objects, the faceted search paradigm allows users to structure and filter
the result set, for example, using a publication type facet to show only books
or videos. These facets usually focus on clear-cut characteristics of digital
items; however, it is very difficult to also organize the actual semantic
content information into such a facet. The Semantic GrowBag approach, presented
in this paper, uses the keywords provided by many authors of digital objects to
automatically create light-weight topic categorization systems as a basis for a
meaningful and dynamically adaptable topic facet. Using such emergent semantics
enables an alternative way to filter large result sets according to the
objects' content without the need to manually classify all objects with respect
to a pre-specified vocabulary. We present the details of our algorithm using
the DBLP collection of computer science documents and show some experimental
evidence about the quality of the achieved results. Keywords: faceted search; category generation; higher-order co-occurrence | |||
| Ontology-Based Question Answering for Digital Libraries | | BIBA | Full-Text | 14-25 | |
| Stephan Bloehdorn; Philipp Cimiano; Alistair Duke; Peter Haase; Jörg Heizmann; Ian Thurlow; Johanna Völker | |||
| In this paper we present an approach to question answering over heterogeneous knowledge sources that makes use of different ontology management components within the scenario of a digital library application. We present a principled framework for integrating structured metadata and unstructured resource content in a seamless manner which can then be flexibly queried using structured queries expressed in natural language. The novelty of the approach lies in the combination of different semantic technologies providing a clear benefit for the application scenario considered. The resulting system is implemented as part of the digital library of British Telecommunications (BT). The original contribution of our paper lies in the architecture we present allowing for the non-straightforward integration of the different components we consider. | |||
| Formalizing the Get-Specific Document Classification Algorithm | | BIBA | Full-Text | 26-37 | |
| Fausto Giunchiglia; Ilya Zaihrayeu; Uladzimir Kharkevich | |||
| The paper represents a first attempt to formalize the get-specific document classification algorithm and to fully automate it through reasoning in a propositional concept language without requiring user involvement or a training dataset. We follow a knowledge-centric approach and convert a natural language hierarchical classification into a formal classification, where the labels are defined in the concept language. This allows us to encode the get-specific algorithm as a problem in the concept language. The reported experimental results provide evidence of practical applicability of the proposed approach. | |||
| Trustworthiness Analysis of Web Search Results | | BIBAK | Full-Text | 38-49 | |
| Satoshi Nakamura; Shinji Konishi; Adam Jatowt; Hiroaki Ohshima; Hiroyuki Kondo; Taro Tezuka; Satoshi Oyama; Katsumi Tanaka | |||
| Increased usage of Web search engines in our daily lives means that the
trustworthiness of search results has become crucial. User studies on the
usage of search engines and analysis of the factors used to determine trust
that users have in search results are described in this paper. Based on the
analysis, we developed a system to help users determine the trustworthiness of
Web search results by computing and showing each returned page's topic
majority, topic coverage, locality of supporting pages (i.e., pages linked to
each search result) and other information. The measures proposed in the paper
can be applied to the search of Web-based libraries or can be useful in the
usage of digital library search systems. Keywords: Web search; trustworthiness; page locality; user study | |||
| Improved Publication Scores for Online Digital Libraries Via Research Pyramids | | BIBA | Full-Text | 50-62 | |
| Sulieman Bani-Ahmad; Gultekin Özsoyoglu | |||
| Ranking publications of Online Digital Libraries (ODLs) is useful for (i) providing comparative assessment of publications and (ii) listing relevant ODL search results first in search outputs, enabling users to aggregate pertinent results quickly and easily. Studies show that effective citation-based scoring functions, namely, PageRank, HITS and Citation Count, are highly skewed, and have accuracy problems, possibly due to topic diffusion. In this paper, based on the notion of research pyramids, we propose an a priori technique to assign more effective publication scores. Using the ACM SIGMOD Anthology ODL as a testbed, we show that our approach provides more accurate and less skewed publication scores. | |||
| Key Element-Context Model: An Approach to Efficient Web Metadata Maintenance | | BIBA | Full-Text | 63-74 | |
| Ba-Quy Vuong; Ee-Peng Lim; Aixin Sun; Chew-Hung Chang; Kalyani Chatterjea; Dion Hoe-Lian Goh; Yin Leng Theng; Jun Zhang | |||
| In this paper, we study the problem of maintaining metadata for open Web content. In digital libraries such as DLESE, NSDL and G-Portal, metadata records are created for some good quality Web content objects so as to make them more accessible. These Web objects are dynamic, making it necessary to update their metadata records. As Web metadata maintenance involves manual effort, we propose to reduce that effort by introducing the Key element-Context (KeC) model, which monitors only those changes made to Web page content regions that concern metadata attributes while ignoring other changes. We also develop evaluation metrics to measure the number of alerts and the amount of effort involved in updating Web metadata records. The KeC model has been evaluated on metadata records defined for Wikipedia articles, and its performance under different settings is reported. The model is implemented in G-Portal as a metadata maintenance module. | |||
| A Cooperative-Relational Approach to Digital Libraries | | BIBA | Full-Text | 75-86 | |
| Alessio Malizia; Paolo Bottoni; Stefano Levialdi; Francisco Astorga-Paliza | |||
| This paper presents a novel approach to model-driven development of Digital Library (DL) systems. The overall idea is to allow Digital Library systems designers (e.g. information architects, librarians, domain experts) to easily design such systems by using a visual language. We designed a Domain Specific Visual Language for such a purpose and developed a framework supporting it; this framework helps designers by automatically generating code for the defined Digital Library system, so that they do not have to get involved in technical issues concerning its deployment. In our approach, both Human-Computer Interaction and Computer Supported Collaborative Work techniques are exploited when generating interfaces and services for the specific Digital Library domain. | |||
| Mind the (Intelligibility) Gap | | BIBA | Full-Text | 87-99 | |
| Yannis Tzitzikas; Giorgos Flouris | |||
| Intelligibility, evolution and emulation are some of the key notions for digital information preservation. In this paper we formally define these notions on the basis of modules and inter-module dependencies. Subsequently, we discuss how we can handle the evolution of modules and dependencies. This work can be exploited for building advanced preservation information systems and registries. | |||
| Using XML Logical Structure to Retrieve (Multimedia) Objects | | BIBA | Full-Text | 100-111 | |
| Zhigang Kong; Mounia Lalmas | |||
| This paper investigates the use of the logical structure in XML documents for the retrieval of XML multimedia objects. We study different logical levels and their combinations. Our investigation is carried out on a purpose-built test collection based on the INEX test collection. Our findings are as follows. First, all logical levels allow discriminating between elements contained in different documents, whereas the lower logical levels allow discriminating between elements within the same document. Second, combining the logical levels improves retrieval performance. | |||
| Lyrics-Based Audio Retrieval and Multimodal Navigation in Music Collections | | BIBA | Full-Text | 112-123 | |
| Meinard Müller; Frank Kurth; David Damm; Christian Fremerey; Michael Clausen | |||
| Modern digital music libraries contain textual, visual, and audio data describing music on various semantic levels. Exploiting the availability of different semantically interrelated representations for a piece of music, this paper presents a query-by-lyrics retrieval system that facilitates multimodal navigation in CD audio collections. In particular, we introduce an automated method to time align given lyrics to an audio recording of the underlying song using a combination of synchronization algorithms. Furthermore, we describe a lyrics search engine and show how the lyrics-audio alignments can be used to directly navigate from the list of query results to the corresponding matching positions within the audio recordings. Finally, we present a user interface for lyrics-based queries and playback of the query results that extends the functionality of our SyncPlayer framework for content-based music and audio navigation. | |||
| Automatic Identification of Music Works Through Audio Matching | | BIBA | Full-Text | 124-135 | |
| Riccardo Miotto; Nicola Orio | |||
| The availability of large music repositories poses challenging research problems, which are also related to the identification of different performances of music scores. This paper presents a methodology for music identification based on hidden Markov models. In particular, a statistical model of the possible performances of a given score is built from the recording of a single performance. To this end, the audio recording undergoes a segmentation process, followed by the extraction of the most relevant features of each segment. The model is built by associating a state with each segment and by modeling its emissions according to the computed features. The approach has been tested with a collection of orchestral music, showing good results in the identification and tagging of acoustic performances. | |||
| Roadmap for MultiLingual Information Access in the European Library | | BIBA | Full-Text | 136-147 | |
| Maristella Agosti; Martin Braschler; Nicola Ferro; Carol Peters; Sjoerd Siebinga | |||
| The paper studies the problem of implementing MultiLingual Information Access (MLIA) functionality in The European Library (TEL). The issues that must be considered are described in detail and the results of a preliminary feasibility study are presented. The paper concludes by discussing the difficulties inherent in attempting to provide a realistic full-scale MLIA solution and proposes a roadmap aimed at determining whether this is in fact possible. | |||
| MinervaDL: An Architecture for Information Retrieval and Filtering in Distributed Digital Libraries | | BIBA | Full-Text | 148-160 | |
| Christian Zimmer; Christos Tryfonopoulos; Gerhard Weikum | |||
| We present MinervaDL, a digital library architecture that supports approximate information retrieval and filtering functionality under a single unifying framework. The architecture of MinervaDL is based on the peer-to-peer search engine Minerva, and is able to handle huge amounts of data provided by digital libraries in a distributed and self-organizing way. The two-tier architecture and the use of the distributed hash table as the routing substrate provide an infrastructure for creating large networks of digital libraries with minimal administration costs. We discuss the main components of this architecture, present the protocols that regulate node interactions, and experimentally evaluate our approach. | |||
| A Grid-Based Infrastructure for Distributed Retrieval | | BIBA | Full-Text | 161-173 | |
| Fabio Simeoni; Leonardo Candela; George Kakaletris; Mads Sibeko; Pasquale Pagano; Giorgos Papanikos; Paul Polydoras; Yannis E. Ioannidis; Dagfinn Aarvaag; Fabio Crestani | |||
| In large-scale distributed retrieval, challenges of latency, heterogeneity, and dynamicity emphasise the importance of infrastructural support in reducing the development costs of state-of-the-art solutions. We present a service-based infrastructure for distributed retrieval which blends middleware facilities and a design framework to 'lift' the resource sharing approach and the computational services of a European Grid platform into the domain of e-Science applications. In this paper, we give an overview of the Diligent Search Framework and illustrate its exploitation in the field of Earth Science. | |||
| VIRGIL -- Providing Institutional Access to a Repository of Access Grid Sessions | | BIBA | Full-Text | 174-185 | |
| Ron Chernich; Jane Hunter; Alex Davies | |||
| This paper describes the VIRGIL (Virtual Meeting Archival) system which was developed to provide a simple, practical, easy-to-use method for recording, indexing and archiving large scale distributed videoconferences held over Access Grid nodes. Institutional libraries are coming under increasing pressure to support the storage, access and retrieval of such mixed-media complex digital objects in their institutional repositories. Although systems have been developed to record Access Grid sessions, they do not provide simple mechanisms for repository ingestion, search and retrieval, and they require the installation and understanding of complex Access Grid tools to record and replay the virtual meetings. Our system has been specifically designed to enable both the easy construction and maintenance of an archive of Access Grid sessions by managers and the easy search and retrieval of recorded sessions by users. This paper describes the underlying architecture, tools and Web interface we developed to enable the recording, storage, search, retrieval and replay of collaborative Access Grid sessions within a Fedora repository. | |||
| Opening Schrödingers Library: Semi-automatic QA Reduces Uncertainty in Object Transformation | | BIBA | Full-Text | 186-197 | |
| Lars Ræder Clausen | |||
| Object transformation for preservation purposes is currently a hit-or-miss affair, where errors in transformation may go unnoticed for years since manual quality assurance is too resource-intensive for large collections of digital objects. We propose an approach of semi-automatic quality assurance (QA), where numerous separate automatic checks of "aspects" of the objects, combined with manual inspection, provide greater assurance that objects are transformed with little or no loss of quality. We present an example of using this approach to appraise the quality of OpenOffice's import of Word documents. | |||
| Texts, Illustrations, and Physical Objects: The Case of Ancient Shipbuilding Treatises | | BIBAK | Full-Text | 198-209 | |
| Carlos Monroy; Richard Furuta; Filipe Castro | |||
| One of the main goals of the Nautical Archaeology Digital Library (NADL) is
to assist nautical archaeologists in the reconstruction of ancient ships and
the study of shipbuilding techniques. Ship reconstruction is a specialized task
that requires supporting materials such as reference to fragments and timbers
recovered from other excavations and consultation of shipbuilding treatises.
The latter are manuscripts written in a variety of languages and spanning
several centuries. Due to their diverse provenance, technical content, and time
of writing, shipbuilding treatises are complex written sources. In this paper
we discuss a digital library approach to handle these manuscripts and their
multilingual properties (often including unknown terms and concepts), and how
scholars in different countries are collaborating in this endeavor. Our
collection of treatises raises interesting challenges and provides a glimpse of
the relationship between texts and illustrations, and their mapping to physical
objects. Keywords: Nautical archaeology; ancient technical manuscripts; shipbuilding treatises;
ship reconstruction | |||
| Trustworthy Digital Long-Term Repositories: The Nestor Approach in the Context of International Developments | | BIBAK | Full-Text | 210-222 | |
| Susanne Dobratz; Astrid Schoger | |||
| This paper describes the general approach nestor -- the German "Network of
Expertise in Long-Term Storage of Digital Resources" -- has taken in designing a
catalogue of criteria for trustworthy digital repositories for long-term
preservation and how this approach relates to internationalisation and
standardisation of criteria and developments of evaluation methods to
facilitate the audit and certification process. Keywords: Digital Repositories; Long-Term Preservation; Certification;
Trustworthiness; Auditing; Standardisation | |||
| Providing Context-Sensitive Access to the Earth Observation Product Library | | BIBAK | Full-Text | 223-234 | |
| Stephan Kiemle; Burkhard Freitag | |||
| The German Remote Sensing Data Center (DFD) has developed a digital library
for the long-term management of earth observation data products. This Product
Library is a central part of DFD's multi-mission ground segment Data and
Information Management System (DIMS), which currently hosts one million digital
products, corresponding to 150 terabytes of data. Its data model is regularly
extended to support products of upcoming earth observation missions. The ever
increasing complexity led to the development of operating interfaces which use
a-priori and context knowledge, allowing efficient management of the dynamic
library content. This paper presents the development and operation of
context-sensitive library access tools based on meta modeling and online
grammar interpretation. Keywords: context sensitivity; meta modeling; earth observation; object query
language; information management | |||
| T-Scroll: Visualizing Trends in a Time-Series of Documents for Interactive User Exploration | | BIBA | Full-Text | 235-246 | |
| Yoshiharu Ishikawa; Mikine Hasegawa | |||
| On the Internet, a large number of documents such as news articles and online journals are delivered every day. We often have to review major topics and topic transitions from a large time-series of documents, but it requires much time and effort to browse and analyze the target documents. We have therefore developed an information visualization system called T-Scroll (Trend/Topic-Scroll) to visualize the transition of topics extracted from those documents. The system takes periodical outputs of the underlying clustering system for a time-series of documents and then visualizes the relationships between clusters as a scroll. Using its interaction facility, users can grasp the topic transitions and the details of topics for the target time period. This paper describes the idea, the functions, the implementation, and the evaluation of the T-Scroll system. | |||
| Thesaurus-Based Feedback to Support Mixed Search and Browsing Environments | | BIBA | Full-Text | 247-258 | |
| Edgar Meij; Maarten de Rijke | |||
| We propose and evaluate a query expansion mechanism that supports searching and browsing in collections of annotated documents. Based on generative language models, our feedback mechanism uses document-level annotations to bias the generation of expansion terms and to generate browsing suggestions in the form of concepts selected from a controlled vocabulary (as typically used in digital library settings). We provide a detailed formalization of our feedback mechanism and evaluate its effectiveness using the TREC 2006 Genomics track test set. As to the retrieval effectiveness, we find a 20% improvement in mean average precision over a query-likelihood baseline, whilst increasing precision at 10. When we base the parameter estimation and feedback generation of our algorithm on a large corpus, we also find an improvement over state-of-the-art relevance models. The browsing suggestions are assessed along two dimensions: relevancy and specificity. We present an account of per-topic results, which helps understand for what type of queries our feedback mechanism is particularly helpful. | |||
| Named Entity Identification and Cyberinfrastructure | | BIBA | Full-Text | 259-270 | |
| Alison Babeu; David Bamman; Gregory Crane; Robert Kummer; Gabriel Weaver | |||
| Well-established instruments such as authority files and a growing set of data structures such as CIDOC CRM, FRBRoo, and MODS provide the foundation for emerging, new digital services. While solid, these instruments alone neither capture the essential data on which traditional scholarship depends nor enable the services which we can already identify as fundamental to any eResearch, cyberinfrastructure or virtual research environment for intellectual discourse. This paper describes a general model for primary sources, entities and thematic topics, the gap between this model and emerging infrastructure, and the tasks necessary to bridge it. | |||
| Finding Related Papers in Literature Digital Libraries | | BIBA | Full-Text | 271-284 | |
| Nattakarn Ratprasartporn; Gultekin Özsoyoglu | |||
| This paper is about searching literature digital libraries to find "related"
publications of a given publication. Existing approaches do not take into
account publication topics in the relatedness computation, allowing topic
diffusion across query output publications. In this paper, we propose a new way
to measure "relatedness" by incorporating "contexts" (representing topics) of
publications. We utilize existing ontology terms as contexts for publications,
i.e., publications are assigned to their relevant contexts, where a context
characterizes one or more publication topics. We define three ways of
context-based relatedness, namely, (a) relatedness between two contexts
(context-to-context relatedness) by using publications that are assigned to the
contexts and the context structures in the context hierarchy, (b) relatedness
between a context and a paper (paper-to-context relatedness), which is used to
rank the relatedness of contexts with respect to a paper, and (c) relatedness
between two papers (paper-to-paper relatedness) by using both paper-to-context
and context-to-context relatedness measurements.
Using existing biomedical ontology terms as contexts for genomics-oriented publications, our experiments indicate that the context-based approach is accurate, and solves the topic diffusion problem by effectively classifying and ranking related papers of a given paper based on the selected contexts of the paper. | |||
| Extending Semantic Matching Towards Digital Library Contexts | | BIBAK | Full-Text | 285-296 | |
| László Kovács; András Micsik | |||
| Matching users' goals with available offers is a traditional research topic
for electronic market places and service-oriented architectures. The new area
of Semantic Web Services introduced the possibility of semantic matching
between user goals and services. The authors show in this paper what kinds of
benefits semantic matching may provide for digital libraries. Various practical
examples are given for the usefulness of semantic matching, and a novel
algorithm is introduced for computing semantic matches. The implementation and
operation of matching are explained using a digital document search scenario. Keywords: Semantic matchmaking; discovery | |||
| Towards a Unified Approach Based on Affinity Graph to Various Multi-document Summarizations | | BIBA | Full-Text | 297-308 | |
| Xiaojun Wan; Jianguo Xiao | |||
| This paper proposes a unified extractive approach based on affinity graph to both generic and topic-focused multi-document summarizations. By using an asymmetric similarity measure, the relationships between sentences are reflected in a directed affinity graph for generic summarization. For topic-focused summarization, the topic information is incorporated into the affinity graph using a topic-sensitive affinity measure. Based on the affinity graph, the information richness of sentences is computed by the graph-ranking algorithm on differentiated intra-document links and inter-document links between sentences. Lastly, the greedy algorithm is employed to impose diversity penalty on sentences and the sentences with both high information richness and high information novelty are chosen into the summary. Experimental results on the tasks of DUC 2002-2005 demonstrate the excellent performances of the proposed approaches to both generic and topic-focused multi-document summarization tasks. | |||
| Large-Scale Clustering and Complete Facet and Tag Calculation | | BIBAK | Full-Text | 309-320 | |
| Bolette Ammitzbøll Madsen | |||
| The State and University Library of Denmark is developing an integrated
search system called Summa, and as part of the Summa project a clustering
module and a facet module. Simple clusters have been created for a collection
of more than six and a half million library metadata records using a linear
clustering algorithm. The created clusters are used to enrich the metadata
records, and search results are presented to the user using a faceted browsing
interface alongside a ranked result list. The most frequent tags in the
different facets in the search result can be calculated and presented at a rate
of approximately three million records per second per machine. Keywords: Library Metadata; Large Data Sets; Clustering; Categorisation; Faceted
Browsing | |||
| Annotation-Based Document Retrieval with Probabilistic Logics | | BIBA | Full-Text | 321-332 | |
| Ingo Frommholz | |||
| Annotations are an important part in today's digital libraries and Web information systems as an instrument for interactive knowledge creation. Annotation-based document retrieval aims at exploiting annotations as a rich source of evidence for document search. The POLAR framework supports annotation-based document search by translating POLAR programs into four-valued probabilistic datalog and applying a retrieval strategy called knowledge augmentation, where the content of a document is augmented with the content of its attached annotations. In order to evaluate this approach and POLAR's performance in document search, we set up a test collection based on a snapshot of ZDNet News, containing IT-related articles and attached discussion threads. Our evaluation shows that knowledge augmentation has the potential to increase retrieval effectiveness when applied in a moderate way. | |||
| Evaluation of Visual Aid Suite for Desktop Searching | | BIBAK | Full-Text | 333-344 | |
| Schubert Foo; Douglas Hendry | |||
| The task of searching for documents is becoming more challenging as the
volume of stored data continues to increase and retrieval systems produce
longer result lists. Graphical visualisations can help users understand
large volumes of information more efficiently and effectively. This work
investigates the use of multiple visualisations in a desktop search tool. These
visualisations include a List View, Tree View, Map View, Bubble View, Tile View
and Cloud View. A preliminary evaluation was undertaken by 94 participants to
gauge its potential usefulness and to detect usability issues with its
interface and graphical presentations. The evaluation results show that these
visualisations made it easier and quicker for them to find relevant documents.
All of the evaluators found at least one of the visualisations useful and over
half of them found at least three of the visualisations to be useful. The
evaluation results support the research premise that a combination of
integrated visualisations will result in a more effective search tool. The next
stage of work is to improve the current views in light of the evaluation
findings in preparation for the scalability and longitudinal tests for a series
of increasingly larger result sets of documents. Keywords: Query result processing; query reformulation; tree view; map view; bubble
view; tile view; cloud view; evaluation; search engine; user interface | |||
| Personal Environment Management | | BIBAK | Full-Text | 345-356 | |
| Anna Zacchi; Frank M. Shipman III | |||
| We report on a study of the practices people employ to organize resources
for their activities on their computers. Today the computer is the main working
environment for many people. People use computers to do an increasing number of
tasks. We observed different patterns of organization of resources across the
desktop and the folder structure. We describe several strategies that people
employ to customize the environment in order to easily perform their
activities, access their resources, and overview their current tasks. Keywords: PIM; Document Management; Project Management | |||
| Empirical Evaluation of Semi-automated XML Annotation of Text Documents with the GoldenGATE Editor | | BIBA | Full-Text | 357-367 | |
| Guido Sautter; Klemens Böhm; Frank Padberg; Walter F. Tichy | |||
| Digitized scientific documents should be marked up according to domain-specific XML schemas, to make maximum use of their content. Such markup allows for advanced, semantics-based access to the document collection. Many NLP applications have been developed to support automated annotation. But NLP results often are not accurate enough; and manual corrections are indispensable. We therefore have developed the GoldenGATE editor, a tool that integrates NLP applications and assistance features for manual XML editing. Plain XML editors do not feature such a tight integration: Users have to create the markup manually or move the documents back and forth between the editor and (mostly command line) NLP tools. This paper features the first empirical evaluation of how users benefit from such a tight integration when creating semantically rich digital libraries. We have conducted experiments with humans who had to perform markup tasks on a document collection from a generic domain. The results show clearly that markup editing assistance in tight combination with NLP functionality significantly reduces the user effort in annotating documents. | |||
| Exploring Digital Libraries with Document Image Retrieval | | BIBA | Full-Text | 368-379 | |
| Simone Marinai; Emanuele Marino; Giovanni Soda | |||
| In this paper, we describe a system to perform Document Image Retrieval in Digital Libraries. The system allows users to retrieve digitized pages on the basis of layout similarities and to make textual searches on the documents without relying on OCR. The system is discussed in the context of recent applications of document image retrieval in the field of Digital Libraries. We present the different techniques in a single framework in which the emphasis is put on the representation level at which the similarity between the query and the indexed documents is computed. We also report the results of some recent experiments on the use of layout-based document image retrieval. | |||
| Know Thy Sensor: Trust, Data Quality, and Data Integrity in Scientific Digital Libraries | | BIBAK | Full-Text | 380-391 | |
| Jillian C. Wallis; Christine L. Borgman; Matthew S. Mayernik; Alberto Pepe; Nithya Ramanathan; Mark H. Hansen | |||
| For users to trust and interpret the data in scientific digital libraries,
they must be able to assess the integrity of those data. Criteria for data
integrity vary by context, by scientific problem, by individual, and by a variety
of other factors. This paper compares technical approaches to data integrity
with scientific practices, as a case study in the Center for Embedded Networked
Sensing (CENS) in the use of wireless, in-situ sensing for the collection of
large scientific data sets. The goal of this research is to identify functional
requirements for digital libraries of scientific data that will serve to bridge
the gap between current technical approaches to data integrity and existing
scientific practices. Keywords: data integrity; data quality; trust; user centered design; user experience;
scientific data | |||
| Digital Libraries Without Databases: The Bleek and Lloyd Collection | | BIBA | Full-Text | 392-403 | |
| Hussein Suleman | |||
| Digital library systems are frequently defined with a focus on data collections, traditionally implemented as databases. However, when preservation and widespread access are most critical, some curators are considering how best to build digital library systems without databases. In many instances, XML-based formats are recommended because of many known advantages. This paper discusses the Bleek and Lloyd Collection, where such a solution was adopted. The Bleek and Lloyd Collection is a set of books and drawings that document the language and culture of some Bushman groups in Southern Africa, arguably one of the oldest yet most vulnerable and fragile cultures in the world. Databases were avoided because of the need for multi-OS support, long-term preservation and the use of large collections in remote locations with limited Internet access. While there are many advantages in using XML, scalability concerns are a limiting factor. This paper discusses how many of the scalability problems were overcome, resulting in a viable XML-centric solution for both greater preservation and access. | |||
| A Study of Citations in Users' Online Personal Collections | | BIBA | Full-Text | 404-415 | |
| Nishikant Kapoor; John T. Butler; Sean M. McNee; Gary C. Fouty; James A. Stemper; Joseph A. Konstan | |||
| Users' personal citation collections reflect users' interests and thus offer great potential for personalized digital services. We studied 18,120 citations in the personal collections of 96 users of the RefWorks citation management system to understand these in terms of their resolvability, i.e., how well these citations can be resolved to a unique identifier and to their online sources. While fewer than 4% of citations to articles in Journals and Conferences included a DOI, we were able to increase this resolvability to 50% by using a citation resolver. A much greater percentage of book citations included an ISBN (53%), but using an online resolver found ISBNs for an additional 20% of the book citations. Considering all citation types, we were able to resolve approximately 47% of all citations to either an online source or a unique identifier. | |||
| Investigating Document Triage on Paper and Electronic Media | | BIBAK | Full-Text | 416-427 | |
| George Buchanan; Fernando Loizides | |||
| Document triage is the critical point in the information seeking process
when the user first decides the relevance of a document to their information
need. This complex process is not yet well understood, and subsequently we have
undertaken a comparison of this task in both electronic and paper media. The
results reveal that in each medium human judgement is influenced by different
factors, and confirm some unproven hypotheses. How users claim they perform
triage, and what they actually do, are often not the same. Keywords: Digital Libraries; Interaction Design; Document Triage | |||
| Motivating and Supporting User Interaction with Recommender Systems | | BIBAK | Full-Text | 428-439 | |
| Andreas W. Neumann | |||
| This contribution reports on the introduction of explicit recommender
systems at the University Library of Karlsruhe. In March 2006, a rating service
and a review service were added to the already existing behavior-based
recommender system. Logged-in users can write reviews and rate all library
documents (books, journals, multimedia, etc.); reading reviews and inspecting
ratings are open to the general public. A role system is implemented that
supports the submission of different reviews for the same document from one
user to different user groups (students, scientists, etc.). Mechanism design
problems like bias and free riding are discussed; to address these problems, the
introduction of incentive systems is described. Usage statistics are given and
the question, which recommender system supports which user needs best, is
covered. Summing up, recommender systems are a way to combine the support of
library user interaction with information access beyond catalog searches. Keywords: Recommender system; rating service; review service; mechanism design;
incentive system | |||
| On the Move Towards the European Digital Library: BRICKS, TEL, MICHAEL and DELOS Converging Experiences | | BIBA | Full-Text | 440-441 | |
| Massimo Bertoncini | |||
| In the last few years, a deep paradigm shift has taken place in the Digital Library domain. From several independent online systems and closed library "silos" that store digital heritage content, digital library systems are evolving towards a networked service-based architecture built as a set of fully interoperable local digital library systems. | |||
| Digital Libraries in Central and Eastern Europe: Infrastructure Challenges for the New Europe | | BIBAK | Full-Text | 442-444 | |
| Christine L. Borgman; Tatjana Aparac-Jelusic; Sonja Pigac Ljubi; Zinaida Manzuch; György Sebestyén; András Gábor | |||
| The countries of Central and Eastern Europe (CEE) that were part of the
Soviet Bloc or were non-aligned (Yugoslavia) entered the 1990s with
telecommunications penetration of about fifteen telephones per hundred persons
and a weak technical infrastructure based on pre-Cold War mechanical switching
technology. They lacked digital transmission systems, fiber optics, microwave
links, and automated systems control and maintenance. Until 1990, business,
government, and education made little use of computers, although some
mainframe-based data processing centers handled scientific and military
applications. Communication technologies such as typewriters, photocopiers, and
facsimile machines were registered and controlled to varying degrees in each
country. The CEE countries could not legally make connections between their
computer networks and those of countries outside the Soviet Bloc owing to the
COCOM regulations and other embargoes imposed on the region by the West,
although clandestine network connections were widely known to exist. In the
fifteen-plus years since the collapse of the Soviet Bloc, these countries have
made rapid advances in infrastructure and economics, and several already have
become members of the European Union. Yet many challenges remain, especially
with regard to infrastructure maturity, linguistics, and intellectual property. Keywords: Digital libraries; Central and Eastern Europe; systems design; cultural
heritage; management; political reform; economics; education policy; libraries;
museums; archives; information systems; technology transfer | |||
| Electronic Work: Building Dynamic Services over Logical Structure Using Aqueducts for XML Processing | | BIBA | Full-Text | 445-448 | |
| Miguel A. Martínez-Prieto; Pablo de la Fuente; Jesús Vegas; Joaquín Adiego | |||
| This paper presents, from e-book features, the concept of electronic work as a medium for publishing classic literature in different editions demanded by the Spanish educational system. The electronic work is an entity which, focused on its logical structure, provides a set of interaction services designed by means of Aqueducts, a processing model driven by XML data. | |||
| A Model of Uncertainty for Near-Duplicates in Document Reference Networks | | BIBA | Full-Text | 449-453 | |
| Claudia Hess; Michel de Rougemont | |||
| We introduce a model of uncertainty where documents are not uniquely identified in a reference network, and some links may be incorrect. It generalizes the probabilistic approach on databases to graphs, and defines subgraphs with a probability distribution. The answer to a relational query is a distribution of documents, and we study how to approximate the ranking of the most likely documents and quantify the quality of the approximation. The answer to a function query is a distribution of values and we consider the size of the interval of Minimum and Maximum values as a measure for the precision of the answer. | |||
| Assessing Quality Dynamics in Unsupervised Metadata Extraction for Digital Libraries | | BIBA | Full-Text | 454-457 | |
| Alexander Ivanyukovich; Maurizio Marchese; Patrick Reuther | |||
| Current research in large-scale information management systems is focused on unsupervised methods and techniques for information processing. Such approaches support scalability in regard to present-day exponential growth in information processing needs. In this paper we focus on the problem of automated quality evaluation of a completely unsupervised metadata extraction process in the Digital Libraries domain. In particular, we investigate resulting metadata quality applying specific extraction methodology for scientific documents. We propose and discuss precise quality metrics and measure the dynamics of such quality metrics as a function of the extracted information from the repository and size of the repository. | |||
| Bibliographical Meta Search Engine for the Retrieval of Scientific Articles | | BIBA | Full-Text | 458-461 | |
| Artur Gajek; Stefan Klink; Patrick Reuther; Bernd Walter; Alexander Weber | |||
| The University of Trier maintains the DBLP (Digital Bibliography & Library Project) Computer Science Bibliography which offers bibliographic information about more than 870,000 scientific publications. This paper describes the DBLP WebCrawler, a meta search engine that is able to search for full text publications in PDF format for each DBLP entry on the web. Various search engines such as Google and Yahoo are used as data sources. The retrieved documents are additionally analysed and ranked according to their relevance. The proposed system differs from systems like CiteSeer in that the DBLP WebCrawler builds upon metadata and tries to find relevant full-texts whereas CiteSeer mainly starts with full-texts and extracts metadata. | |||
| In-Browser Digital Library Services | | BIBA | Full-Text | 462-465 | |
| Hussein Suleman | |||
| Service models for digital libraries have looked into how services may be decomposed into modules and components for greater flexibility. These models are, however, mostly aimed at server-side applications. With the emergence of Ajax and similar techniques for processing XML documents within a Web browser, it has now become feasible for a browser to perform far more of the computational tasks traditionally encompassed in server-side DL services. Among other advantages, moving computation to the client can result in improved performance and scalability. As a new twist on service oriented computing, it is argued in this paper that digital library services can be provided partially or wholly through applications that execute client-side. Two case studies are provided to illustrate that such in-browser services are feasible and in fact more powerful and flexible than the traditional server-side service model. | |||
| Evaluating Digital Libraries with 5SQual | | BIBAK | Full-Text | 466-470 | |
| Bárbara Lagoeiro Moreira; Marcos André Gonçalves; Alberto H. F. Laender; Edward A. Fox | |||
| This work describes 5SQual, a quantitative quality assessment tool for
digital libraries based on the 5S framework. 5SQual aims to help administrators
of digital libraries during the implementation and maintenance phases of a
digital library, providing ways to verify the quality of digital objects,
metadata and services. The tool has been designed in a flexible way, which
allows it to be applied to many systems, as long as the necessary data is
available. To facilitate the input of these data, the tool provides a
wizard-like interface that guides the user through its configuration process. Keywords: Digital Libraries; Quality Evaluation; 5S; 5SQual | |||
| Reducing Costs for Digitising Early Music with Dynamic Adaptation | | BIBA | Full-Text | 471-474 | |
| Laurent Pugin; John Ashley Burgoyne; Ichiro Fujinaga | |||
| Optical music recognition (OMR) enables librarians to digitise early music sources on a large scale. The cost of expert human labour to correct automatic recognition errors dominates the cost of such projects. To reduce the number of recognition errors in the OMR process, we present an innovative approach to adapt the system dynamically, taking advantage of the human editing work that is part of any digitisation project. The corrected data are used to perform MAP adaptation, a machine-learning technique used previously in speech recognition and optical character recognition (OCR). Our experiments show that this technique can reduce editing costs by more than half. | |||
| Supporting Information Management in Digital Libraries with Map-Based Interfaces | | BIBAK | Full-Text | 475-480 | |
| Rudolf Mayer; Angela Roiger; Andreas Rauber | |||
| The Self-Organising Map (SOM) has been proposed as an interface for
exploring Digital Libraries, in addition to conventional search and browsing.
With advanced visualisations uncovering the contents and its structure, and
advanced interaction modes such as zooming, panning and area selection, the SOM
becomes a feasible alternative to classical interfaces. However, there are
still shortcomings in helping the user to understand the map -- there are
insufficient methods developed for describing the map to support the user in
the analysis of the map contents. In this paper, we present recent work in
assisting the user in exploring the map by automatically describing maps using
advanced labelling and summarisation of map regions. Keywords: Self-Organising Map; Interface; Summarisation; Clustering | |||
| Policy Decision Tree for Academic Digital Collections | | BIBA | Full-Text | 481-484 | |
| Alexandros Koulouris; Sarantos Kapidakis | |||
| We present the results of a questionnaire survey for the access and reproduction policies of 67 digital collections in 34 libraries (national, academic, public, special etc) from 13 countries. We examine and analyze the above policies in relation to specific factors, such as the acquisition method, copyright ownership, library type (national, academic, etc.), content creation (digitized, born-digital) and content type (audio, video, etc.), and how these factors affect the policies of the examined digital collections. Responses were received from a range of library sectors but by far the best responses came from academic libraries, on which we focus. We extract policy (access, reproduction) rules and alternatives according to these factors that lead to a policy decision tree on digital information management for academic libraries. The resulting decision tree is based on a policy model; the model and tree are divided into two parts: for digitized and born-digital content. | |||
| Personalized Faceted Browsing for Digital Libraries | | BIBA | Full-Text | 485-488 | |
| Michal Tvarozek; Mária Bieliková | |||
| Current digital libraries and online bibliographies share several properties with the Web and thus also share some of its problems. Faceted classifications and Semantic Web technologies are explored as possible approaches to improving digital libraries and alleviating their respective shortcomings. We describe the possibilities of using faceted navigation and its personalization in digital libraries. We propose a method of faceted browser adaptation based on an automatically acquired user model with support for dynamic facet generation. | |||
| The Use of Metadata in Visual Interfaces to Digital Libraries | | BIBAK | Full-Text | 489-494 | |
| Ali Shiri | |||
| This poster reports on a study carried out to investigate and analyze a
specific category of digital library visual interfaces that support information
seeking, exploration and retrieval based on metadata representations, namely
metadata-enhanced visual interfaces. This study has examined 21
metadata-enhanced digital library visual interfaces from the following
perspectives: a) information access and retrieval features supported; b)
metadata elements used; c) visualization techniques and metaphors utilized. The
results show that visual interfaces to digital libraries enhanced with metadata
are becoming more widespread. The study also demonstrates that the combined use
of visualization techniques and metaphors is becoming increasingly prevalent as
a design strategy to support users' information exploration. Keywords: Visual interfaces; metadata; information visualization; digital libraries | |||
| Location and Format Independent Distributed Annotations for Collaborative Research | | BIBA | Full-Text | 495-498 | |
| Fabio Corubolo; Paul B. Watry; John Harrison | |||
| This paper describes the development of a distributed annotation system which enables collaborative document consultation and creates new access to otherwise hard to index digital documents. It takes the annotations one step further: not only the same types of annotations are available across file formats, but robust references to the documents introduce format and location independence, and enable the attachment even when the document has been modified. These features are achieved using standards of the digital library systems, and don't require modification of the original documents or impose further restrictions, thus being infrastructure independent. Integration into the Kepler workflow system allows annotating workflow results, and the automatic creation and indexing of annotations in document oriented workflows, which can be used as a flexible way to archive and index collections in the Cheshire3 search engine. | |||
| NSDL MatDL: Adding Context to Bridge Materials e-Research and e-Education | | BIBAK | Full-Text | 499-500 | |
| Laura M. Bartolo; Cathy S. Lowe; Dean B. Krafft; Robert Tandy | |||
| The National Science Digital Library (NSDL) Materials Digital Library
Pathway (MatDL) has implemented an information infrastructure to disseminate
government funded research results and to provide content as well as services
to support the integration of research and education in materials. This poster
describes how we are integrating a digital repository into open-source
collaborative tools, such as wikis, to support users in materials research and
education as well as interactions between the two areas. A search results
plug-in for MediaWiki has been developed to display relevant search results
from the MatDL repository in the Soft Matter Wiki established and developed by
MatDL and its partners. Collaborative work with the NSDL Core Integration team
at Cornell University is also in progress to enable information transfer in the
opposite direction, from a wiki to a repository. Keywords: Materials Science; wiki; plug-in | |||
| A Framework for the Generation of Transformation Templates | | BIBAK | Full-Text | 501-504 | |
| Manuel Llavador; José Hilario Canós | |||
| This demo shows a set of tools for managing and performing document
transformations. These tools share a common infrastructure consisting of a set
of Web Services and programming libraries to define semantic mappings and
generate the corresponding transformation template automatically. The framework
is currently being used on the Bibshare project to support the conversion
between metadata formats, as well as in other domains related to Digital
Libraries and Software Engineering. Keywords: XML; Interoperability; Metadata Schemas; Document Transformation | |||
| MultiMatch -- Multilingual/Multimedia Access to Cultural Heritage | | BIBA | Full-Text | 505-508 | |
| Giuseppe Amato; Juan M. Cigarrán; Julio Gonzalo; Carol Peters; Pasquale Savino | |||
| Cultural heritage content is everywhere on the web, in contexts such as digital libraries, audiovisual archives, and portals of museums or galleries, in multiple languages and multiple media. MultiMatch, a 30 month specific targeted research project under the Sixth Framework Programme, plans to develop a multilingual search engine designed specifically for the access, organisation and personalised presentation of cultural heritage digital objects. | |||
| The Future of Large-Scale Evaluation Campaigns for Information Retrieval in Europe | | BIBA | Full-Text | 509-512 | |
| Maristella Agosti; Giorgio Maria Di Nunzio; Nicola Ferro; Donna Harman; Carol Peters | |||
| A Workshop on "The Future of Large-scale Evaluation Campaigns" was organised jointly by the University of Padua and the DELOS Network of Excellence and held in Padua, Italy, March 2007. The aim was to perform a critical assessment of the scientific results of such initiatives and to formulate recommendations for the future. This poster summarises the outcome of the discussion with respect to the major European activity in this area: the Cross Language Evaluation Forum. | |||
| Digital 101: Public Exhibition System of the National Digital Archives Program, Taiwan | | BIBAK | Full-Text | 513-514 | |
| Ku-Lun Huang; Hsiang-An Wang | |||
| Since the establishment of the National Digital Archives Program (NDAP),
Taiwan in 2002, the five divisions and their accompanying projects have
generated a huge amount of digital materials. The diverse content is available
for multiple purposes, such as research, value-added applications and
educational projects. The goal is to allow the public to explore the achievements
of NDAP in user-friendly ways. Digital 101, which is also called the Public
Exhibit System (PES), serves to connect various groups interested in Taiwan's
rich cultural heritage.
PES incorporates artistic, creative & interactive user interfaces and popular methods that allow the public to utilize the content of the NDAP. Through collaboration with local artists, PES provides special exhibits and thematic image galleries about Taiwan's rich culture. It is expected to become a gateway worldwide. Keywords: Digital 101; Digital Archive; Public Exhibit System | |||
| aScience: A Thematic Network on Speech and Tactile Accessibility to Scientific Digital Resources | | BIBA | Full-Text | 515-517 | |
| Cristian Bernareggi; Gian Carlo Dalto | |||
| At present, digital scientific resources can hardly be read by visually impaired people. The systems to retrieve and download documents in digital libraries can be easily used also through speech and tactile assistive technologies. The main problems concern the digital formats employed to store documents. Therefore, visually impaired readers often find the right document, but they cannot read it. That often affects the learning process, especially at university. In order to contribute to the preparation of guidelines to provide accessible digital scientific resources and to spread best practices and best experiences achieved by university libraries and support services, the thematic network aScience was established. It is a two-year project supported by the European Union eContentPlus Programme. The web portal www.ascience.eu delivers information about the thematic network activities and it will distribute sample documents of digital scientific literature accessible through speech and tactile assistive technologies. | |||
| PROBADO -- A Generic Repository Integration Framework | | BIBA | Full-Text | 518-521 | |
| Harald Krottmaier; Frank Kurth; Thorsten Steenweg; Hans-Jürgen Appelrath; Dieter W. Fellner | |||
| The number of newly generated multimedia documents (e.g. music, e-learning material, or 3D-graphics) increases year by year. Today, the workflow in digital libraries focuses on textual documents only. Hence, considering content-based retrieval tasks, multimedia documents are not analyzed and indexed sufficiently. To facilitate content-based retrieval and browsing, it is necessary to introduce recent techniques for multimedia document processing into the workflow of today's digital libraries. In this short paper, we introduce the PROBADO-framework which will (a) integrate different types of content-repositories -- each one specialized for a specific multimedia domain -- into one seamless system, and (b) add features available in text-based digital libraries (such as automatic annotation, full-text retrieval, or recommender services) to non-textual documents. Existing libraries will benefit from the framework since it extends existing technology for handling textual documents with features for dealing with the non-textual domain. | |||
| VCenter: A Digital Video Broadcast System of NDAP Taiwan | | BIBAK | Full-Text | 522-524 | |
| Hsiang-An Wang; Chih-Yi Chiu; Yu-Zheng Wang | |||
| VCenter, a platform for broadcasting digital video content, was developed by
the National Digital Archives Program (NDAP), Taiwan. The platform provides a
number of functions, such as digital video archiving, format transformation,
streaming broadcasts, editing, geotagging, and blogging. The concept of Web2.0
is applied in VCenter to increase user participation and improve interaction
between the system and the user.
For videos, VCenter adopts Flash technology because it has a multi-layer architecture and it can handle multimedia content. We can add watermarks or captions as layers to videos without changing the original video's content so that when users browse videos, the multi-layer overlaps the original video layer in real-time. VCenter serves the Union Catalog system of NDAP as a video broadcasting platform. In addition to archiving the valuable videos of NDAP, it allows the general public to archive, broadcast, and share digital videos. Keywords: blogging; digital archive; digital video; Flash; watermark; Web2.0 | |||
| Retrieving Tsunami Digital Library by Use of Mobile Phones | | BIBAK | Full-Text | 525-528 | |
| Sayaka Imai; Yoshinari Kanamori; Nobuo Shuto | |||
| We are developing a Tsunami Digital Library (TDL) which can store and manage
documents about tsunami, tsunami run up simulations, newspaper articles,
fieldwork data, etc. In this paper, we propose public education for
tsunami disaster mitigation as one of the TDL applications. For this education, we
use mobile phones to query TDL because we have to walk through coastal regions. Then,
we have prepared summaries of documents and newspaper articles in TDL, and also
developed query systems for mobile phone retrievals. Keywords: Tsunami Digital Library; Mobile Phone; XML Database | |||
| Using Watermarks and Offline DRM to Protect Digital Images in DIAS | | BIBAK | Full-Text | 529-531 | |
| Hsin-Yu Chen; Hsiang-An Wang; Chin-Lung Lin | |||
| The Digital Image Archiving System (DIAS) is an image management system, the
major functions of which are preserving valuable digital images and serving as
an image provider for external metadata archiving systems.
To enhance the security of images, DIAS enables online adding of watermarks to an image to protect the content owner's copyright. We use the Flash format to add watermarks because it has a multi-layer architecture and it can handle multimedia content. The function allows us to set the watermark as a layer that overlaps the original image. DIAS also provides an offline DRM (Digital Rights Management) mechanism to protect downloaded images. We package an image and its authorized information in an execution file for downloading. Then, when a user executes the file, the program validates the authorized information before showing the image. Using the watermark and offline DRM improves the security of DIAS images. Keywords: digital image; DRM; Flash; watermark | |||
| CIDOC CRM in Action -- Experiences and Challenges | | BIBA | Full-Text | 532-533 | |
| Philipp Nussbaumer; Bernhard Haslhofer | |||
| Integration of metadata from heterogeneous sources is a major issue when connecting cultural institutions to digital library networks. Uniform access to metadata is impeded by the structural and semantic heterogeneities of the metadata and metadata schemes used in the source systems. In this paper we discuss the methodologies we applied to ingest proprietary metadata into the BRICKS digital library network and to process CIDOC CRM metadata in terms of search and retrieval, and how we strove to hide the semantic complexity from the end-user while exploiting the semantic richness of the underlying metadata. | |||
| The Legal Environment of Digital Curation -- A Question of Balance for the Digital Librarian | | BIBAK | Full-Text | 534-538 | |
| Mags McGinley | |||
| Digital curation is about maintaining and adding value to a trusted body of
digital information for current and future use. This requires active management
and on-going appraisal over the entire life-cycle of scholarly and scientific
materials.
Whether there is a desire to make materials as open as possible or a requirement to keep them closed and private (for example in the case of sensitive personal data), legal elements can have a huge impact on the overall ability to effectively curate and preserve digital information over time. The DCC advocates the development of a framework for any curation activity that includes consideration of legal matters throughout, from copyright and licensing models to freedom of information and data protection. Keywords: Digital curation; copyright; licensing; freedom of information; data
protection | |||
| Demonstration: Bringing Lives to Light: Browsing and Searching Biographical Information with a Metadata Infrastructure | | BIBA | Full-Text | 539-542 | |
| Ray R. Larson | |||
| In this demonstration we will show how a metadata infrastructure comprised of gazetteers, biographical dictionaries, and a "Time Period Directory" can be dynamically exploited to help searchers navigate through multiple web-based resources, and displayed in context with related information about "Who?, What?, Where?, and When?" and providing dynamic searches of those external resources. The demonstration will show both a web-based interface and a Google Earth-based geo-temporal browser. | |||
| Repository Junction and Beyond at the EDINA (UK) National Data Centre | | BIBAK | Full-Text | 543-545 | |
| Robin Rice; Peter Burnhill; Christine Rees; Anne Robertson | |||
| EDINA has been funded to undertake a variety of repository-related
development activities to enhance and support access to scholarly and learning
objects in the UK. JORUM is a national learning object repository for sharing
and repurposing educational materials. The purpose of the Depot is to ensure
that all UK academics can enjoy the benefits of Open Access for their
peer-reviewed post-prints by providing a repository for the interim period
before every university has such repository provision. GRADE has been
investigating and reporting on the technical and cultural issues around the
reuse of geospatial data in the context of media-centric, informal and
institutional repositories. With the DataShare project, by supporting academics
who wish to share datasets on which written research outputs are based, a
network of institution-based data repositories will develop a niche model for
deposit of 'orphaned datasets' currently filled neither by centralised
subject-domain data archives nor institutional repositories. Keywords: repositories; open access; research outputs; learning objects; eprints; data
sharing | |||
| A Scalable Data Management Tool to Support Epidemiological Modeling of Large Urban Regions | | BIBAK | Full-Text | 546-548 | |
| Christopher L. Barrett; Keith R. Bisset; Stephen Eubank; Edward A. Fox; Yi Ma; Madhav V. Marathe; Xiaoyu Zhang | |||
| We describe the design and prototype implementation of a data management
tool supporting simulation based models for studying the spread of infectious
diseases in large urban regions. The need for such tools arises due to diverse
and competing disease models, social networks, and experimental designs that
are being investigated. A realistic case study produces large amounts of data.
Organizing such datasets is necessary for effectively supporting analysts and
policy-makers interested in various cases. We report our ongoing efforts to
develop EpiDM -- an integrated information management tool for interrelated
digital resources, where the central piece is EpiDL (a digital library for
efficient access to these datasets). The work is unique in its application
domain: we are not aware of any comparable efforts or tools that can be
generalized for simulation-based modeling of other socio-technical systems.
EpiDL follows the 5S framework developed in the DL
community. Keywords: Computational Epidemiology; Public Health; Socio-technical and Information
Systems; Simulation and Modeling; Digital Library | |||
| Living Memory Annotation Tool -- Image Annotations for Digital Libraries | | BIBA | Full-Text | 549-550 | |
| Wolfgang Jochum; Max Kaiser; Karin Schellner; Franz Wirl | |||
| Digital Libraries are currently discovering the full potential of web technologies in conjunction with building rich user communities and retaining customers. A visit to a digital library should nowadays offer more than passive consumption of content. Both the library and the user can benefit from moving forward from the "content provider" vs. "consumer" paradigm to the "prosumer" paradigm, thus allowing the user to produce and actively contribute content, interact with content and be part of communities of interest. We present a smart annotation tool developed as part of the 'Living Memory' applications in the context of the EU-project BRICKS that supports the prosumer approach by inviting users to contribute new information by annotating content or commenting on other annotations, thereby creating new knowledge in a collaborative way. | |||
| A User-Centred Approach to Metadata Design | | BIBAK | Full-Text | 551-554 | |
| Emma Tonkin | |||
| The process of development of metadata elements and structures can be
approached and supported in a number of different ways. We sketch a
user-centred approach to this process, based around an iterative development
methodology, and briefly outline some major questions, challenges and benefits
related to this approach. Keywords: Metadata; user-centred design; evaluation | |||
| A Historic Documentation Repository for Specialized and Public Access | | BIBA | Full-Text | 555-558 | |
| Cristina Ribeiro; Gabriel David; Catalin Calistru | |||
| The web is currently the information searching and browsing environment of choice for scholars and lay users alike. The goal of most cultural heritage applications is to interest a large audience, and therefore web interfaces are being developed even when part of their functionality is not offered to the general public. We present a web-based interface for managing, browsing and searching a repository of historic documents. The documents pertain to a region which was an important regional power in medieval times, and their originals are in the custody of the Portuguese national archives. The challenges of the project came from its requirements in three aspects: rigorous archival description, the incorporation of document analysis and a flexible search interface. The system is an instance of a multimedia database framework providing both browse and retrieval functionalities to end users and configuration and content management services to the collection administrators. | |||
| Finding It on Google, Finding It on del.icio.us | | BIBAK | Full-Text | 559-562 | |
| Jacek Gwizdka; Michael J. Cole | |||
| We consider search engines and collaborative tagging systems from the
perspective of resource discovery and re-finding on the Web. We performed
repeated searches over nine months on Google and del.icio.us for web pages
related to three topics selected to have different dynamic characteristics. The
results show differences in the resources they provide to the searcher. The
resources tagged on del.icio.us differ strongly from the top results returned
by Google. The results also suggest that changes in the most recently tagged web
pages may be associated with the level of activity in user communities and,
indirectly, with external events. Keywords: Folksonomy; Collaborative tagging; Resource discovery; Search | |||
| DIGMAP -- Discovering Our Past World with Digitised Maps | | BIBA | Full-Text | 563-566 | |
| José Luis Borbinha; Gilberto Pedrosa; Diogo Reis; João Luzio; Bruno Martins; João Gil; Nuno Freire | |||
| DIGMAP is a project that will develop solutions for georeferenced digital libraries, focused especially on historical materials and on the promotion of our cultural and scientific heritage. The final results of the project will consist of a set of services available on the Internet and of reusable open-source software solutions. The main service will be a specialized digital library, reusing metadata from European national libraries, to provide discovery of and access to contents. Relevant metadata from third-party sources will also be reused, as will descriptions of and references to any other relevant external resources. The initiative will deliver a proof of concept that reuses and enriches the contents from several European national libraries. | |||
| Specification and Generation of Digital Libraries into DSpace Using the 5S Framework | | BIBAK | Full-Text | 567-569 | |
| Douglas Gorton; Weiguo Fan; Edward A. Fox | |||
| While digital library (DL) systems continue to become more powerful and
usable, a certain amount of inherent complexity remains in the installation,
configuration, and customization of out-of-the-box solutions like DSpace and
Greenstone. In this work, we build upon past work in the 5S Framework for
Digital Libraries and 5SL DL specification language to devise an XML-based
model for the specification of DLs for DSpace. We pair this way of specifying
DLs with a generator tool which takes a DL specification that adheres to the
model and generates a working DSpace instance that matches the specification. Keywords: digital libraries; specification; generation; DSpace; 5S; 5SL | |||
| EOD -- European Network of Libraries for eBooks on Demand | | BIBAK | Full-Text | 570-572 | |
| Zoltán Mező; Sonja Svoljsak; Silvia Gstrein | |||
| European libraries host millions of books published from 1500 to 1900. Due
to age and value, they are often only accessible to users actually present at
these libraries. EOD (eBooks on Demand) is a Europe-wide service which gives
an answer to this problem by providing eBooks on request from a wide range of
European Libraries. The service is currently carried out within the framework
of the EU project "Digitisation on Demand". EOD is an open network and every
European library is welcome to join. Keywords: eBooks; eBooks on Demand; Digitisation on Demand; Network | |||
| Semantics and Pragmatics of Preference Queries in Digital Libraries | | BIBA | Full-Text | 573-578 | |
| Elhadji Mamadou Nguer | |||
| As information becomes available in increasing amounts, and to growing numbers of users, the shift towards a more user-centered, or personalized, access to information becomes crucial. In this paper we consider the semantics and pragmatics of preference queries over tables containing information objects described through a set of attributes. In particular, we address two basic issues: (1) how to define a preference query and its answer (semantics), and (2) how to evaluate a preference query (pragmatics). The main contributions of this paper are (a) the proposal of an expressive language for declaring qualitative preferences, (b) a novel approach to evaluating a preference query, and (c) the design of a user-friendly interface for preference queries. Although our main motivation originates in digital libraries, our proposal is quite general and can be used in several application contexts. | |||
| Applications for Digital Libraries in Language Learning and the Professional Development of Teachers | | BIBAK | Full-Text | 579-582 | |
| Alannah Fitzgerald | |||
| This poster presents plans for designing and developing learning support
systems for end-users involved in the construction of a Language Learning
Digital Library (LLDL). This is in conjunction with the LLDL project for
developing stimulating interactive educational tasks that can be built on top
of digital libraries made in Greenstone's open source software specifically to
support language teaching and learning. The relevance of the proposed work
includes the development of training modules and in-depth workshops for
language teachers and students involved in the participatory design of
stimulating educational activities that can be uploaded to create digital
library collections. Digital libraries can support language teaching and
learning through the use of authentic media, comprehensive searching
capabilities, and automatically generated precision-targeted exercise material.
They also provide social computing environments for teacher-to-student and
peer-to-peer communications, along with opportunities to collaborate on group
projects. What is more, teachers can build their own digital resource
collections and these can be shared among online teaching communities which
include annotations and reflections on how to best integrate the digital
library technology into their teaching practice. Keywords: Case Studies; Collection Development; Computer-Supported Collaborative
Learning; Computer-Supported Cooperative Work; Educational Issues; Educational
Applications; Information Retrieval; Interoperability; Knowledge Organization;
Multilingual Issues; Multimedia; Digital Libraries; Computer-Aided Language
Learning; Corpus Linguistics; Participatory Design;
Collection Building; Learning Support; Technology Integration; Case Libraries;
Teacher Development | |||