Proceedings of the 2004 ACM Symposium on Document Engineering

Fullname: DocEng'04 Proceedings of the 4th ACM Symposium on Document Engineering
Editors: Ethan V. Munson; Jean-Yves Vion-Dury
Location: Milwaukee, Wisconsin, USA
Dates: 2004-Oct-28 to 2004-Oct-28
Standard No: ISBN: 1-58113-938-1; hcibib: DocEng04
  1. Document querying and transformation
  2. Document adaptation
  3. Time, media, interaction
  4. Document creation I
  5. Document creation II
  6. Document management
  7. Document analysis
  8. Theory and models I
  9. Theory and models II

Document querying and transformation

A three-way merge for XML documents, pp. 1-10
  Tancred Lindholm
Three-way merging is a technique that may be employed for reintegrating changes to a document in cases where multiple independently modified copies have been made. While tools for three-way merge of ASCII text files exist in the form of the ubiquitous diff and patch tools, these are of limited applicability to XML documents.
   We present a method for three-way merging of XML which is targeted at merging XML formats that model human-authored documents as ordered trees (e.g. rich text formats, structured text, drawings, etc.). To this end, we investigate a number of use cases on XML merging (collaborative editing, propagating changes across document variants), from which we derive a set of high-level merge rules. Our merge is based on these rules.
   We propose that our merge is easy to both understand and implement, yet sufficiently expressive to handle several important cases of merging on document structure that are beyond the capabilities of traditional text-based tools. In order to justify these claims, we applied our merging method to the merging tasks contained in the use cases. The overall performance of the merge was found to be satisfactory.
   The key contributions of this work are: a set of merge rules derived from use cases on XML merging, a compact and versatile XML merge in accordance with these rules, and a classification of conflicts in the context of that merge.
Keywords: XML, collaborative editing, conflict, structured text, three-way merge
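To make the three-way idea concrete, here is a minimal Python sketch on flat key-value records rather than ordered XML trees. It is not the merge described in the paper; the function name `three_way_merge` and the dictionary representation are assumptions chosen only to show how a common base version lets a tool tell a one-sided change from a conflict.

```python
from typing import Dict, Set, Tuple


def three_way_merge(
    base: Dict[str, str], ours: Dict[str, str], theirs: Dict[str, str]
) -> Tuple[Dict[str, str], Set[str]]:
    """Merge two edited copies of `base`; return (merged record, conflicting keys)."""
    merged: Dict[str, str] = {}
    conflicts: Set[str] = set()
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t:
            value = o          # both copies agree (this includes both deleting the key)
        elif o == b:
            value = t          # only "theirs" changed relative to the base
        elif t == b:
            value = o          # only "ours" changed relative to the base
        else:
            conflicts.add(key)  # both copies changed the key in different ways
            value = b           # keep the base value as a placeholder
        if value is not None:   # None means the key is absent (deleted)
            merged[key] = value
    return merged, conflicts
```

For example, if one copy changes `title` and the other changes `author`, both edits survive; a key is reported as a conflict only when both copies changed it differently.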
Fast structural query with application to Chinese treebank sentence retrieval, pp. 11-20
  Chia-Hsin Huang; Tyng-Ruey Chuang; Hahn-Ming Lee
In natural language processing, a huge amount of structured data is constantly used for the extraction and presentation of grammatical structures in sentences. For example, the Chinese Treebank corpus developed at the Institute of Information Science, Academia Sinica, Taiwan, is a semantically annotated corpus that has been used to help parse and study Chinese sentences. In this setting, users usually use structured tree patterns instead of keywords to query the corpus.
   In this paper, we present an online prototype system that provides exploratory search ability. The system implements two flexible and efficient structural query methods and employs a user-friendly web-based interface. Although the system adopts the XML format to present the corpora and search results, it does not use conventional XML query languages. As searching the Chinese Treebank corpora is structural in nature and often deals with structural similarities, conventional XML query languages such as XPath and XQuery are inflexible and inefficient. We propose and implement a query algorithm called Parent-Child Relationship Filter (PCRF), which provides flexible and efficient structural search. PCRF is sufficiently flexible to provide several similarity-matching options, such as wildcard, unordered sibling sub-trees, ancestor-descendant matching, and their combinations. In addition, PCRF supports stream-based matching to help users query their XML documents online. We also present three accelerating rules that achieve a 1.5- to 8-fold performance improvement in query time. Our experimental results show that our method achieves a 10- to 1000-fold performance improvement compared to the usual text-based XPath query method.
Keywords: XML, structural query, treebank
Querying XML documents by dynamic shredding, pp. 21-30
  Hui Zhang; Frank Wm. Tompa
With the wide adoption of XML as a standard data representation and exchange format, querying XML documents becomes increasingly important. However, relational database systems constitute a much more mature technology than what is available for native storage of XML. To bridge the gap, one way to manage XML data is to use a commercial relational database system. In this approach, users typically first "shred" their documents by isolating what they predict to be meaningful fragments, then store the individual fragments according to some relational schema, and later translate each XML query (e.g. expressed in W3C's XQuery) to SQL queries expressed against the shredded documents.
   In this paper, we propose an alternative approach that builds on relational database technology but shreds XML documents dynamically. This avoids many of the problems in maintaining document order and reassembling compound data from its fragments. We then present an algorithm to translate a significant subset of XQuery into an extended relational algebra that includes operators defined for the structured text datatype. This algorithm can be used as the basis of a sound translation from XQuery to SQL, and the starting point for query optimization, which is required for XML to be supported by relational database technology.
Keywords: XML, XQuery, dynamic shredding, relational algebra, text ADT
Presenting the results of relevance-oriented search over XML documents, pp. 31-33
  Alda Lopes Gançarski; Pedro Rangel Henriques
In this paper, we discuss how to present the result of searching elements of any type from XML documents relevant to some information need (relevance-oriented search). As the resulting elements can contain each other, we show an intuitive way of organizing the resulting list of elements in several ranked lists at different levels, such that each element is presented only one time. Depending on the size of such ranked lists, their presentation is given by a structure tree, for small lists, or by a sequence of pointers, for large lists. In both cases, the textual content of the implied elements is given. We also analyse the size of ranked lists in a real collection of XML documents.
Keywords: user interface for XML retrieval
The XML world view, p. 34
  Kristoffer H. Rose
XML is unique in its very broad acceptance throughout both the document engineering and data processing communities. This creates a unique opportunity to unify the traditionally separate worlds and ask questions such as "What are the data relations in my document?" and "How can I read a textual version of my data?", all within the single framework provided by XML. In this talk, I'll speculate on how one could view the whole world as a single XML document from which both relational "table" and textual "report" queries are possible.

Document adaptation

Supporting virtual documents in just-in-time hypermedia systems, pp. 35-44
  Li Zhang; Michael Bieber; David Millard; Vincent Oria
Many analytical or computational applications, especially legacy systems, create documents and display screens in response to user queries "dynamically" or in "real time". These "virtual documents" do not exist in advance, and thus hypermedia features must be generated "just in time" - automatically and dynamically. Additionally, the hypermedia features may have to cause target documents to be generated or re-generated. This paper focuses on the specific challenges faced in hypermedia support for virtual documents: dynamic hypermedia functionality, dynamic regeneration, and dynamic anchor re-identification and re-location. It presents a prototype called JHE (Just-in-time Hypermedia Engine) to support just-in-time hypermedia across third-party applications with dynamic content, and discusses issues prompted by this research.
Keywords: dynamic hypermedia functionality, dynamic regeneration, integration architecture, just-in-time hypermedia, re-identification, re-location, virtual documents
A document-based approach to the generation of web applications, pp. 45-47
  Andrea R. de Andrade; Ethan V. Munson; Maria da G. Pimentel
wVIEW is an automated system for generating Web applications that relies extensively on document representations and transformations. wVIEW adopts the widely accepted hypermedia design principle that content, navigation, and presentation are separate concerns. Each of these aspects of the design process is controlled by separate declarative specifications. Only the first specification, the content structure specification, which is described using UML, must be provided. However, the wVIEW user is free to add extensions and customizations to both the data and navigation models in order to make the final application suit specific needs. This paper describes the wVIEW approach and the current prototype, which focuses on the data and navigation modelling aspects. The paper discusses experiences in using XSLT as the primary development tool and shows examples of how the enhancements planned for XSLT address some limitations of the application generation process.
Keywords: XML, XSLT, cocoon, design, web applications
Assisting artifact retrieval in software engineering projects, pp. 48-50
  Mirjana Andric; Wendy Hall; Leslie Carr
The research presented in this paper focuses on the issue of how a recommender system can support the task of searching documents and artifacts constructed in a software development project. The "A LA" (Associative Linking of Attributes) system represents a recommender facility built on top of a document management system. The facility provides assistance in finding items by utilising hypertextually connected metadata. In order to determine metadata relationships, "A LA" employs techniques of content analysis together with exploiting user-generated metadata and usage logs. An evaluation study that compares querying using a full text search approach with the "A LA" method for finding relevant documents was conducted.
Keywords: links, metadata, recommender systems, zigzag
Lightweight integration of documents and services, pp. 51-53
  Nkechi Nnadi; Michael Bieber
This research's primary contribution is providing a relatively straightforward, sustainable infrastructure for integrating documents and services. Users see a totally integrated environment. The integration infrastructure generates supplemental link anchors. Selecting one generates a list of relevant links automatically through the use of relationship rules.
Keywords: automatic link generation, metainformation, relationship rules, service integration
Personal glossaries on the WWW: an exploratory study, pp. 54-56
  James Blustein; Mona Noor
We examine basic issues of glossary tools as part of a suite of annotational tools to help users make meaning from documents from unfamiliar realms of discourse. We specifically evaluated the performance of glossary tools for reading medical information about common diseases by users with no formal medical education.
   We developed both an automatic and an editable glossary tool. Both of them extracted definitions from the text of articles. Only the editable glossary tool allowed users to add, delete, and change entries.
   Both tools were evaluated to find out how useful they were to users reading technical articles online. The analytical results showed that user performance improved without increasing total reading time. The glossary tools were effective and pleasing to users at no decrease in efficiency. This experiment points the way for longer-term studies with adaptable tools, particularly to help users unfamiliar with technical documents. We also discuss the rôle of glossaries as part of a suite of annotational tools to help users make personal (and therefore meaningful) hypertextual document collections.
Keywords: annotation support, evaluation experiment, hyperlinked glossaries, user interfaces

Time, media, interaction

Behavioral reactivity and real time programming in XML: functional programming meets SMIL animation, pp. 57-66
  Peter King; Patrick Schmitz; Simon Thompson
XML and its associated languages are emerging as powerful authoring tools for multimedia and hypermedia web content. Furthermore, intelligent presentation generation engines have begun to appear, as have models and platforms for adaptive presentations. However, XML-based models are limited by their lack of expressiveness in presentation and animation. As a result, authors of dynamic, adaptive web content must often use considerable amounts of script or code. The use of such script or code has two serious drawbacks. First, such code undermines the declarative description possible in the original presentation language, and second, the scripting/coding approach does not readily lend itself to authoring by non-programmers. In this paper, we describe a set of XML language extensions inspired by features from the functional programming world, which are designed to widen the class of reactive systems which could be described in languages such as SMIL. The described features extend the power of declarative modeling for the web by allowing the introduction of web media items which may dynamically react to continuously varying inputs, both in a continuous way and by triggering discrete, user-defined events. The two extensions described herein are discussed in the context of SMIL Animation and SVG, but could be applied to many XML-based languages.
Keywords: DOM, SMIL, SVG, XML, animation, behaviors, continuous, declarative, events, expressions, functional programming, modeling, time
A question answer system using mails posted to a mailing list, pp. 67-73
  Yasuhiko Watanabe; Kazuya Sono; Kazuya Yokomizo; Yoshihiro Okada
The most serious difficulty in developing a QA system is knowledge. In this paper, we first discuss three problems of developing a knowledge base by which a QA system answers how-type questions. Then, we propose a method of developing a knowledge base by using mails posted to a mailing list. Next, we describe a QA system which can answer how-type questions based on the knowledge base. Our system finds question mails which are similar to the user's question and shows the answers to the user. The similarity between the user's question and a question mail is calculated by matching the user's question against a significant sentence in the question mail. Finally, we show that mails posted to a mailing list can be used as a knowledge base by which a QA system answers how-type questions.
Keywords: mailing list, question answer system, sentence extraction
A look at some issues during textual linking of homogeneous web repositories, pp. 74-83
  José Antonio Camacho-Guerrero; Alessandra Alaniz Macedo; Maria da Graça Campos Pimentel
Interacting with services that create links automatically via the Web, users are able to identify relationships among documents stored in different repositories. The fact that automatic linking services do not use queries performed by a human user has an impact on the use of information retrieval techniques for the identification of relationships. Information retrieval techniques can lead to the identification of relationships that should not have been generated (generating non-relevant links) while at the same time failing to identify all relevant relationships (poor recall). Towards improving the quality of the relationships identified, we have investigated some design issues considered during the automatic linking of textual repositories. The investigations have used a collection of documents from online Brazilian newspapers and the Cystic Fibrosis Collection. The results of the investigations have defined procedures, infrastructures and, consequently, the requirements for a configurable linking service, which is also made available as a contribution of this work.
Keywords: homogeneous repositories, information retrieval, linking, semantic structures, web
Interactive multimedia annotations: enriching and extending content, pp. 84-86
  Rudinei Goularte; Renan G. Cattelan; José A. Camacho-Guerrero; Valter R., Jr. Inácio; Maria da Graça C. Pimentel
This paper discusses an approach to the problem of annotating multimedia content. Our approach provides annotation as metadata for indexing, retrieval and semantic processing, as well as content enrichment. We use an underlying model for structured multimedia descriptions and annotations, allowing the establishment of spatial, temporal and linking relationships. We discuss aspects related to documents and annotations used to guide the design of an application that allows annotations to be made with pen-based interaction with Tablet PCs. As a result, a video stream can be annotated during the capture. The annotation can be further edited, extended or played back synchronously.
Keywords: MPEG-7, annotation, multimodal interfaces
A reduced yet extensible audio-visual description language, pp. 87-89
  Raphaël Troncy; Jean Carrive
Enabling intelligent access to multimedia data requires a powerful description language. In this paper, we demonstrate why the MPEG-7 standard fails to fulfill this task. We then introduce our proposal: an audio-visual specific description language, modular, reduced, but designed to be extensible. This language is centered on the notions of descriptor and structure, with well-defined semantics. A descriptor can be a low-level feature automatically extracted from the signal or a higher semantic concept that will be used to annotate the video documents. The descriptors can be combined into structures according to defined models that provide description patterns.
Keywords: MPEG-7, audio-visual description language, descriptor, knowledge representation, semantic web, semantics, structure

Document creation I

The case for explicit knowledge in documents, pp. 90-98
  Leslie Carr; Timothy Miles-Board; Arouna Woukeu; Gary Wills; Wendy Hall
The Web is full of documents which must be interpreted by human readers and by software agents (search engines, recommender systems, clustering processes, etc.). Although Web standards have addressed format obfuscation by using XML schemas and stylesheets to specify unambiguous structure and presentation semantics, interpretation is still hampered by the fundamental ambiguity of information in PCDATA text. Even the most easily distinguishable kinds of knowledge, such as article citations and proper nouns (referring to people, organisations, projects, products, technical concepts), have to be identified by fallible post-hoc extraction processes. The WiCK project has investigated the writing process in a Semantic Web environment where knowledge services exist and actively assist the author. In this paper, we discuss the need to make knowledge an explicit part of the document representation and the advantages and disadvantages of this step.
Keywords: document structure, knowledge writing, semantic web
Creating structured PDF files using XML templates, pp. 99-108
  Matthew R. B. Hardy; David F. Brailsford; Peter L. Thomas
This paper describes a tool for recombining the logical structure from an XML document with the typeset appearance of the corresponding PDF document. The tool uses the XML representation as a template for the insertion of the logical structure into the existing PDF document, thereby creating a Structured/Tagged PDF. The addition of logical structure adds value to the PDF in three ways: the accessibility is improved (PDF screen readers for visually impaired users perform better), media options are enhanced (the ability to reflow PDF documents, using structure as a guide, makes PDF viable for use on hand-held devices), and the re-usability of the PDF documents benefits greatly from the presence of an XML-like structure tree to guide the process of text retrieval in reading order (e.g. when interfacing to XML applications and databases).
Keywords: PDF, XML, logical structure insertion
Aesthetic measures for automated document layout, pp. 109-111
  Steven J. Harrington; J. Fernando Naveda; Rhys Price Jones; Paul Roetling; Nishant Thakkar
A measure of aesthetics that has been used in automated layout is described. The approach combines heuristic measures of attributes that degrade the aesthetic quality. The combination is nonlinear, so that one bad aesthetic feature can harm the overall score. Example heuristic measures are described for the features of alignment, regularity, separation, balance, white-space fraction, white-space free flow, proportion, uniformity, and page security.
Keywords: aesthetics, document, layout
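One way a nonlinear combination of this kind could behave is sketched below in Python. The formula (a power mean over per-feature "badness"), the exponent `p`, and the function name are all assumptions made for illustration; the abstract does not state which combination the authors used.

```python
from typing import Mapping


def combined_aesthetic_score(scores: Mapping[str, float], p: float = 4.0) -> float:
    """Combine per-feature scores in [0, 1] (1 = best) into one score in [0, 1].

    Hypothetical rule: take the power mean (exponent p > 1) of each feature's
    badness (1 - score). Larger p lets the worst feature dominate the result.
    """
    if not scores:
        raise ValueError("at least one feature score is required")
    badness = [(1.0 - s) ** p for s in scores.values()]
    return 1.0 - (sum(badness) / len(badness)) ** (1.0 / p)
```

With this rule, a layout scoring 0.9 on four features and 0.1 on a fifth ends up well below the arithmetic mean of its scores (0.74), which is the "one bad feature harms the whole" behaviour the abstract describes.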
Creation of topic map by identifying topic chain in Chinese, pp. 112-114
  Ching-Long Yeh; Yi-Chun Chen
XML Topic maps enable multiple, concurrent views of sets of information objects and can be applied to different applications, for example, thesaurus-like interfaces to corpora, navigational tools for cross-references or citation systems, information filtering or delivering depending on user profiles, etc. However, enriching the information of a topic map or connecting it with some document's URI is very labor-intensive and time-consuming. To solve this problem, we propose an approach based on natural language processing techniques to identify and extract useful information in raw Chinese text. Unlike most traditional approaches to parsing sentences based on the integration of complex linguistic information and domain knowledge, we work on the output of a part-of-speech tagger and use shallow parsing instead of complex parsing to identify the topics of sentences. The key elements of the centering model of local discourse coherence are employed to extract structures of discourse segments. We use the local discourse structure to solve the problem of zero anaphora in Chinese and then identify the topic, which is the most salient element in a sentence. After we obtain all the topics of a document, we may assign this document to a topic node of the topic map and add the information of the document into the topic element simultaneously.
Keywords: centering model, shallow parsing, topic identification, topic maps, zero anaphora resolution

Document creation II

Techniques for authoring complex XML documents, pp. 115-123
  Vincent Quint; Irène Vatton
This paper reviews the main innovations of XML and considers their impact on the editing techniques for structured documents. Namespaces open the way to compound documents; well-formedness brings more freedom in the editing task; CSS allows style to be associated easily with structured documents. In addition to these innovative features, the wide deployment of XML introduces structured documents in many new applications, including applications where text is not the dominant content type. In languages such as SVG or SMIL, for instance, XML is used to represent vector graphics or multimedia presentations.
   This is a challenging situation for authoring tools. Traditional methods for editing structured documents are not sufficient to address the new requirements. New techniques must be developed or adapted to allow more users to efficiently create advanced XML documents. These techniques include multiple views, semantic-driven editing, direct manipulation, concurrent manipulation of style and structure, and integrated multi-language editing. They have been implemented and experimented with in the Amaya editor and in some other tools.
Keywords: CSS, XML, authoring tools, compound documents, direct manipulation, structured editing, style languages
Instructional information in adaptive spatial hypertext, pp. 124-133
  Luis Francisco-Revilla; Frank Shipman
Spatial hypertext is an effective medium for the delivery of help and instructional information on the Web. Spatial hypertext's intrinsic features allow documents to visually reflect the inherent structure of the information space and represent implicit relationships between information objects. This work presents a study of the effectiveness of spatial hypertext as a medium for delivery of instructional information. Results were gathered based on direct observation of people reading a spatial hypertext document, which was used as informational support for a complex task. Two versions of the spatial hypertext document were used: a non-adaptive and an adaptive one. The document was adapted based upon the inferred relevance of information to the user's knowledge and task requirements. The study produced insights on emergent reading strategies, such as informed link traversals and the use of collections as bookmarks. Observations and evaluation of how people interacted with both document versions showed that the spatial layout and the use of collections as a way to encapsulate information allowed people to read, browse, and navigate very large information spaces while maintaining a clear understanding of the structure of the information. Finally, several differences between the adaptive and non-adaptive versions were identified, showing that adaptation alters not only the display of information but also the way that people read spatial hypertext documents.
Keywords: adaptation, information delivery, spatial hypertext
Page composition using PPML as a link-editing script, pp. 134-136
  Steven R. Bagley; David F. Brailsford
The advantages of a COG (Component Object Graphic) approach to the composition of PDF pages have been set out in a previous paper [1]. However, if pages are to be composed in this way, then the individual graphic objects must have known bounding boxes and must be correctly placed on the page in a process that resembles the link editing of a multi-module computer program. Ideally, the linker should be able to utilize all declared resource information attached to each COG.
   We have investigated the use of an XML application called Personalized Print Markup Language (PPML) to control the link editing process for PDF COGs. Our experiments, though successful, have shown up the shortcomings of PPML's resource handling capabilities, which are currently active at the document and page levels but which cannot be elegantly applied to individual graphic objects at a sub-page level. Proposals are put forward for modifications to PPML that would make easier any COG-based approach to page composition.
Keywords: PDF, PPML, form Xobjects, graphic objects, link editing

Document management

Managing inconsistent repositories via prioritized repairs, pp. 137-146
  Jan Scheffczyk; Peter Rödig; Uwe M. Borghoff; Lothar Schmitz
Whenever a group of authors collaboratively edits interrelated documents, semantic consistency is a major goal. Current document management systems (DMS) lack adequate consistency management facilities. We propose liberal use of formal consistency rules, which permits inconsistencies. In this paper, we focus on deriving repairs for inconsistencies. Our major contributions are: (1) deriving (common) repairs for multiple rules, (2) resolving conflicts between repairs, (3) prioritizing repairs, and (4) support for partial inconsistency resolution, which resolves the most troubling inconsistencies and leaves less important inconsistencies for later handling. The novel aspect of our approach is that we derive repairs from DAGs (directed acyclic graphs) and not from documents directly. That way, the repository is locked during DAG generation only, which is performed incrementally.
Keywords: consistency maintenance, document management, repair
The lifecycle of a digital historical document: structure and content, pp. 147-154
  A. Antonacopoulos; D. Karatzas; H. Krawczyk; B. Wiszniewski
This paper describes the lifecycle of a digital historical document, from template-based structure definition through to content extraction from the scanned pages and its final reconstitution as an electronic document (combining content and semantic information) along with the tools that have been created to realise each stage in the lifecycle. The whole approach is described in the context of different types of typewritten documents relating to prisoners in World-War II concentration camps and is the result of a multinational collaboration under the MEMORIAL project funded (€1.5M) by the European Union (www.memorial-project.info). Extensive tests with historians/archivists and evaluation of the content extraction results indicate the superior performance of the whole semantics-driven approach both over manual transcription and over the semi-automated application of off-the-shelf OCR and the use of a conventional (text and layout) document format.
Keywords: digital libraries, document analysis, document architecture, document engineering, historical documents, text enhancement
Accommodating paper in document databases, pp. 155-162
  Majed AbuSafiya; Subhasish Mazumdar
Although the paperless office has been imminent for decades, documents in paper form continue to be used extensively in almost all organizations. Present-day information systems are designed on the premise that any paper document in use will be either converted into electronic form or merely printed from electronic file(s) accessible to the system. Yet, paper is the medium of choice in many situations, mainly owing to its portability and usability, and the medium of necessity in others, especially where external communication or the traditional notion of authenticity are involved. Humans who find unique attractive features in both paper and electronic forms of documents, must survive this tension between the de-jure banishment of paper and its de-facto prevalence. In this paper, we propose to make paper documents first-class citizens by including them in the model underlying the information system. Specifically, we extend the schema of a document database with the notion of paper documents, physical locations, and the organizational hierarchy. This leads to an overall enhancement of document integrity and the ability to answer queries such as "where are the customer complaint letters we have received today?" and "which documents are in this filing cabinet?". Recent technological advances such as sensors have made the implementation of such a model very realistic.
Keywords: RFID, document databases, document management, enterprise document model, paper documents, paper manifestation
Strategies for document optimization in digital publishing, pp. 163-170
  Felipe Rech Meneguzzi; Leonardo Luceiro Meirelles; Fernando Tarlá Martins Mano; Joao Batista de Souza Oliveira; Ana Cristina Benso da Silva
Recent advances in digital press technology have enabled the creation of high-quality personalized documents, with the potential of generating an entire batch of one-of-a-kind documents. Even though digital presses are capable of printing such document sets as fast as they would print regular press jobs, raster image processing might have to be performed for every different page in the job. Such a process demands a large computational effort, and it is therefore interesting to gather repeated images that are used throughout all documents and rasterize them as few times as possible. Moreover, performing such a process separately from document production in the publishing workflow allows optimization to be performed prior to final printing, thus allowing it to take press hardware specifics into account and reducing the time taken for it to produce the final output. This paper describes techniques to perform this task using PPML as the document description language, as well as the main issues concerning this kind of document optimization. Several gathering policies are described along with explanatory examples. We also provide and discuss experimental data supporting the use of such a strategy.
Keywords: PPML, digital press, personalized printing, variable data printing, variable information documents
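The idea of gathering repeated images so that they are rasterized as few times as possible can be illustrated with a short sketch. It does not use PPML syntax, and the "reuse threshold" policy, the function names, and the sample job are assumptions made for illustration; the paper's own gathering policies are not reproduced here.

```python
from collections import Counter


def gather_reused_images(documents, min_uses=2):
    """Return the image ids that occur at least `min_uses` times in the job.

    `documents` is a list of documents, each a list of pages, each a list of
    image ids (a hypothetical stand-in for a real job description).
    """
    uses = Counter(image for doc in documents for page in doc for image in page)
    return {image for image, count in uses.items() if count >= min_uses}


def count_rasterizations(documents, cached):
    """Count rasterizer runs when images in `cached` are rasterized only once."""
    placed = [image for doc in documents for page in doc for image in page]
    # Each cached image that is actually used costs one run in total.
    runs = len(cached & set(placed))
    # Every other placement is rasterized where it occurs.
    runs += sum(1 for image in placed if image not in cached)
    return runs


job = [
    [["logo", "photo_a"], ["footer"]],
    [["logo", "photo_b"], ["footer"]],
]
shared = gather_reused_images(job)
print(sorted(shared))
print(count_rasterizations(job, set()), count_rasterizations(job, shared))
```

For this two-document job, the naive count is six rasterizer runs, while gathering the shared logo and footer brings it down to four.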

Document analysis

Digital capture for automated scanner workflows BIBAKFull-Text 171-177
  Steven J. Simske; Scott C. Baggs
The use of scanners and other capture devices to incorporate film- and paper-based materials into digital workflows is an important part of "digital convergence", or the bringing of paper-based and electronic documents together into the same electronic workflows. The diversity of captured information, from text and mixed-type documents to photos, negatives, slides and transparencies, requires a combination of document analysis techniques to perform, automatically, the segmentation, classification and workflow assignment of the scanned images. We herein present technologies that provide fast (< 1.0 sec) and reliable (> 95% job accuracy) capture solutions for all of these input content types. These solutions offer near real-time capture that provides automated workflow capabilities to a repertoire of scanning hardware: scanners, all-in-one devices, copiers and multifunctional printers. The techniques used to categorize the documents, perform zoning analysis on the documents, and then perform closed loop quality assurance on the documents are presented.
Keywords: classification, negatives, photos, scanning, segmentation, slides, user interface, zoning
Visual signature based identification of low-resolution document images BIBAKFull-Text 178-187
  Ardhendu Behera; Denis Lalanne; Rolf Ingold
In this paper, we present (a) a method for identifying documents captured from low-resolution devices such as web-cams, digital cameras or mobile phones and (b) a technique for extracting their textual content without performing OCR. The first method associates a hierarchically structured visual signature with the low-resolution document image and then matches it against the visual signatures of the original high-resolution document images, stored in PDF form in a repository. The matching algorithm follows the signature hierarchy, which speeds up the search by guiding it towards fruitful solution spaces. In a second step, the content of the original PDF document is extracted, structured, and matched with its corresponding high-resolution visual signature. Finally, the matched content is attached to the low-resolution document image's visual signature, which greatly enriches the document's content and indexing. We present in this article both the identification and the extraction methods and evaluate them on various documents, resolutions and lighting conditions, using different capture devices.
Keywords: document visual signature, document-based meeting retrieval, documents' content extraction, low-resolution document image identification
NCL 2.0: integrating new concepts to XML modular languages BIBAKFull-Text 188-197
  Heron V. O. Silva; Rogério F. Rodrigues; Luiz Fernando G. Soares; Débora C. Muchaluat Saade
This paper presents the main new features of Nested Context Language (NCL) version 2.0. NCL 2.0 is a modular and declarative hypermedia language, whose modules can be combined with other languages, such as SMIL, to provide new facilities. Among the new features of NCL 2.0, we can highlight the support for handling hypermedia relations as first-class entities, through the definition of hypermedia connectors, and the possibility of specifying any semantics for a hypermedia composition, using the concept of composition templates. Another important goal of this paper is to describe a framework to facilitate the development of NCL parsing and processing tools. Based on this framework, the paper comments on several implemented compilers, which allow, for instance, the conversion of NCL documents into SMIL specifications.
Keywords: NCL, SMIL, XConnector, XTemplate, composition template, framework for parsing and processing XML, hypermedia connector
Document capture using stereo vision BIBAKFull-Text 198-200
  Adrian Ulges; Christoph H. Lampert; Thomas Breuel
Capturing images of documents using handheld digital cameras has a variety of applications in academia, research, knowledge management, retail, and office settings. The ultimate goal of such systems is to achieve image quality comparable to that currently achieved with flatbed scanners even for curved, warped, or curled pages. This can be achieved by high-accuracy 3D modeling of the page surface, followed by a "flattening" of the surface. A number of previous systems have either assumed only perspective distortions, or used techniques like structured lighting, shading, or side-imaging for obtaining 3D shape. This paper describes a system for handheld camera-based document capture using general purpose stereo vision methods followed by a new document dewarping technique. Examples of shape modeling and dewarping of book images are shown.
Keywords: camera based document capture, dewarping, stereo vision

Theory and models I

On modular transformation of structural content BIBAKFull-Text 201-210
  Tyng-Ruey Chuang; Jan-Li Lin
We show that an XML DTD (Document Type Definition) can be viewed as the fixed point of a parametric content model. Based on the parametric content model, we develop a model of modular transformation of XML documents. A fold operator is used to capture a class of functions that consume valid XML document trees in a bottom-up manner. Similarly, an unfold operator is used to generate valid XML document trees in a top-down fashion. We then show that DTD-aware XML document transformation, which consumes a document of one DTD and generates a document of another DTD, can be thought of as both a fold operation and an unfold operation.
   This leads us to model certain DTD-aware document transformations by mappings from the source content models to the target content models. From these mappings, we derive DTD-aware XML document transformational programs. Benefits of such derived programs include automatic validation of the target documents (no invalid document will be generated) and modular property in the composition of these programs (intermediate results from successive transformations can be eliminated).
Keywords: ML, XML, Bird-Meertens formalism, document transformation and validation, functional programming, modules
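The fold and unfold operators mentioned in this abstract can be sketched generically over element trees. The version below is an untyped illustration built on Python's xml.etree.ElementTree; it does not enforce DTD validity, it ignores tail text and attributes, and the function names and signatures are assumptions rather than the paper's definitions (the keywords suggest the original work is set in ML).

```python
import xml.etree.ElementTree as ET


def fold(combine, node):
    """Consume a tree bottom-up: children are folded before their parent.

    `combine(tag, text, child_results)` builds the result for one element.
    """
    child_results = [fold(combine, child) for child in node]
    return combine(node.tag, node.text or "", child_results)


def unfold(expand, seed):
    """Generate a tree top-down from a seed value.

    `expand(seed)` returns `(tag, text, child_seeds)` for one element.
    """
    tag, text, child_seeds = expand(seed)
    node = ET.Element(tag)
    node.text = text
    for child_seed in child_seeds:
        node.append(unfold(expand, child_seed))
    return node


doc = ET.fromstring("<a><b>x</b><c><d>y</d></c></a>")

# Two folds over the same tree: count the elements, then concatenate the text.
element_count = fold(lambda tag, text, kids: 1 + sum(kids), doc)
all_text = fold(lambda tag, text, kids: text + "".join(kids), doc)

# An unfold that grows a chain of nested <level> elements from an integer seed.
chain = unfold(lambda n: ("level", str(n), [n - 1] if n > 0 else []), 2)
print(element_count, all_text, ET.tostring(chain, encoding="unicode"))
```

A transformation from one document type to another can then be pictured as a fold over the source tree whose result is fed, as a seed, to an unfold that builds the target tree.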
Logic-based XPath optimization BIBAKFull-Text 211-219
  Pierre Genevès; Jean-Yves Vion-Dury
XPath [5] was introduced by the W3C as a standard language for specifying node selection, matching conditions, and for computing values from an XML document. XPath is now used in many XML standards such as XSLT [4] and the forthcoming XQuery [10] database access language. Since efficient XML content querying is crucial for the performance of almost all XML processing architectures, a growing need for studying high performance XPath-based querying has emerged. Our approach aims at optimizing XPath performance through static analysis and syntactic transformation of XPath expressions.
Keywords: XML, XPath, axiomatization, containment, efficiency, optimization, query

Theory and models II

Supervised learning for the legacy document conversion BIBAKFull-Text 220-228
  Boris Chidlovskii; Jérôme Fuselier
We consider the problem of document conversion from rendering-oriented HTML markup into a semantic-oriented XML annotation defined by user-specific DTDs or XML Schema descriptions. We represent both source and target documents as rooted ordered trees so the conversion can be achieved by applying a set of tree transformations. We apply the supervised learning framework to the conversion task, according to which the tree transformations are learned from a set of training examples. Because of the complexity of tree-to-tree transformations, we develop a two-step approach to the conversion problem that first labels leaves in the source trees and then recomposes target trees from the leaf labels. We present two solutions based on the leaf classification with the target terminals and paths. Moreover, we develop three methods for the leaf classification. All methods and solutions have been tested on two real collections.
Keywords: XML markup, legacy document conversion, machine learning
Chart-parsing techniques and the prediction of valid editing moves in structured document authoring BIBAKFull-Text 229-238
  Marc Dymetman
We present an approach to controlled document authoring that significantly extends the functionality of existing methods by allowing bottom-up and top-down specifications to be freely mixed. A finite-state automaton is used to represent the partial, evolving, description of the document during authoring. Using a generalization of chart-parsing techniques to FSAs rather than fixed input strings, we show how the authoring system is able to automatically detect the consequences of the choices already made by the author so as to only propose for the next authoring steps choices which may provably lead to a globally valid document.
   We start by considering the case of authoring purely textual documents controlled by a context-free grammar, then show a generalization of this approach to structured documents controlled by a specification whose formal expressive power is at least that of Regular Hedge Grammars (closely related to RELAX NG Schemas) and therefore greater than that of DTDs.
Keywords: XML, computational linguistics, document authoring tools and systems, parsing
Towards efficient implementation of XML schema content models BIBAKFull-Text 239-241
  Pekka Kilpeläinen; Rauno Tuhkanen
XML Schema uses an extension of traditional regular expressions for describing allowed contents of document elements. Iteration is described through numeric attributes minOccurs and maxOccurs attached to content-describing elements such as sequence, choice, and element. These numeric occurrence indicators are a challenge to standard automata-based solutions. Straightforward solutions require space that is exponential with respect to the length of the expressions.
   We describe a strategy to implement unambiguous content model expressions as counter automata, which are of linear size only.
Keywords: XML schema, automaton, regular expression
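The contrast between unrolling occurrence bounds and keeping a counter can be illustrated for the simplest case, a flat sequence of element particles. This is a minimal sketch and not the authors' construction: the Particle class and the matches function are hypothetical names, nested sequence and choice groups are not handled, and the model is assumed to be unambiguous, so that each child element can match at most one particle at any point.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Particle:
    """One element particle of a flat sequence (hypothetical helper type)."""
    name: str
    min_occurs: int = 1
    max_occurs: Optional[int] = 1  # None stands for maxOccurs="unbounded"


def matches(particles, children):
    """Check a list of child element names against a flat sequence model.

    The whole matcher state is two integers (particle index and occurrence
    counter), however large the occurrence bounds are.
    """
    index, count = 0, 0
    for child in children:
        while True:
            if index == len(particles):
                return False  # input continues after the model is exhausted
            particle = particles[index]
            below_max = particle.max_occurs is None or count < particle.max_occurs
            if child == particle.name and below_max:
                count += 1
                break
            if count < particle.min_occurs:
                return False  # current particle has too few occurrences
            index, count = index + 1, 0  # move on and reset the counter
    # At end of input, every remaining particle must allow what was seen.
    while index < len(particles):
        if count < particles[index].min_occurs:
            return False
        index, count = index + 1, 0
    return True


model = [Particle("title"), Particle("author", 1, 3), Particle("note", 0, None)]
print(matches(model, ["title", "author", "author"]))
print(matches(model, ["title"]))
```

A particle such as Particle("item", 0, 1000) costs no more state here than one with maxOccurs of 1, whereas an automaton built by unrolling would need on the order of a thousand states for it.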