| Issues and Tradeoffs in Document Preparation Systems | | BIBA | 1-16 | |
| Brian W. Kernighan | |||
| Users of document preparation systems must balance how much effort they put into producing their documents against how close their output is to what they want. The evolution of document preparation systems is a history of how users and implementers have dealt with this tradeoff as technology improves and as the user population itself evolves. | |||
| Towards Document Engineering | | BIBAK | 17-29 | |
| Vincent Quint; Marc Nanard; Jacques Andre | |||
| This article compares the methods and techniques used in software engineering
with those used for handling electronic documents. It shows the features common
to both domains as well as their differences, and it proposes an approach
that extends the field of document manipulation to document engineering. It
also shows in what respects document engineering differs from software
engineering, and why specific techniques must therefore be developed for building
integrated environments for document engineering. Keywords: Software engineering, Document engineering, Structured editing, Integrated
environments | |||
| Managing Properties in a System of Cooperating Editors | | BIBAK | 31-46 | |
| Donald D. Chamberlin | |||
| Today's workstations make it possible for users to create and interact with
many types of objects. It is desirable that a document creation tool allow all
these types of objects to be mixed and nested without restriction in documents,
that each type of object be treated uniformly wherever it is found, and that
the tool be extensible to new types of objects. The Quill document creation
system addresses these requirements by providing an extensible family of
specialized editors, coordinated by a Shell that provides common services and
presents a consistent user interface. The Shell manages a database that
records the properties of various objects in the document, allows objects to
inherit properties from other objects, and allows users to override properties
when desired. Quill generalizes the concept of properties to include
user-supplied procedures that specify the active behavior of an object during
WYSIWYG editing. Keywords: Document systems, Editors, Markup, Properties, Inheritance, Extensibility | |||
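The property handling described in this abstract (objects inherit properties from other objects, and users may override them) can be illustrated with a minimal sketch. It is not Quill's actual design: the `DocObject` class, its methods, and the use of a simple parent chain in place of the Shell's database are all assumptions made for illustration.

```python
class DocObject:
    """Hypothetical document object with inheritable properties."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent   # object to inherit from (assumed: a single parent)
        self.local = {}        # properties set directly on this object

    def set(self, key, value):
        """Override a property on this object only."""
        self.local[key] = value

    def get(self, key):
        """Return the nearest value, walking up the inheritance chain."""
        node = self
        while node is not None:
            if key in node.local:
                return node.local[key]
            node = node.parent
        raise KeyError(key)


document = DocObject("document")
document.set("font", "Times")
document.set("size", 10)

section = DocObject("section", parent=document)
figure = DocObject("figure", parent=section)
figure.set("font", "Helvetica")          # user override on one object

font_of_section = section.get("font")    # inherited from the document
font_of_figure = figure.get("font")      # overridden locally
size_of_figure = figure.get("size")      # inherited through two levels
```

Looking up a property at read time, rather than copying it into each object, means that a later change to the document-level value reaches every object that has not overridden it.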
| A Logic Grammar Foundation for Document Representation and Document Layout | | BIBAK | 47-64 | |
| Allen L. Brown, Jr.; Howard A. Blair | |||
| We present a powerful grammar-based paradigm for electronic document
markup: coordinated definite clause translation grammars. This markup is of a
declarative character, being, in effect, a collection of constraints on the
logical and physical structure of documents. To the best of our knowledge,
coordinated grammars and their parsers can accommodate all of the descriptive
and layout processing functionality enjoyed by extant electronic markup
languages. We describe an operational prototype that demonstrates the
feasibility of a syntax-directed basis for formalizing and realizing document
layout. Keywords: Document description language, Layout processing, Logic grammar | |||
| Structured Editing - Hypertext Approach: Cooperation and Complementarity | | BIBAK | 65-78 | |
| Anne-Marie Vercoustre | |||
| As Hypertext systems are now widely available, many technical and conceptual
problems have been identified. We argue here that such systems could take
advantage of the proven technology of structured editors in order to provide
both the user and the system with a conceptual document model providing a sound
basis for the hierarchical links. A prototype combining structured editing and
hypertext facilities proposes two approaches to implementing non-hierarchical
links to subtrees: the first one uses the paths from the tree root as anchorage
mechanism, while the second one uses tree pattern matching as a first step
towards semantic and more manageable links. Keywords: Structured editing, Syntax directed editors, Hypertext, Scripted documents,
Anchor | |||
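The first anchoring approach named in this abstract, using the path from the tree root to identify a subtree, can be sketched as follows. The `Node` class and the `path_to` and `resolve` helpers are hypothetical names, and encoding a path as a list of child indexes is an assumption; the abstract does not say how the prototype encodes its paths.

```python
class Node:
    """Hypothetical node of a structured document tree."""

    def __init__(self, label, children=None):
        self.label = label
        self.children = children or []


def path_to(root, target):
    """Return the list of child indexes leading from root to target, or None."""
    if root is target:
        return []
    for index, child in enumerate(root.children):
        rest = path_to(child, target)
        if rest is not None:
            return [index] + rest
    return None


def resolve(root, path):
    """Follow a root path back to the subtree it anchors."""
    node = root
    for index in path:
        node = node.children[index]
    return node


doc = Node("article", [
    Node("title"),
    Node("section", [Node("para"), Node("figure")]),
])
figure = doc.children[1].children[1]

anchor = path_to(doc, figure)     # link anchor stored as a root path
linked = resolve(doc, anchor)     # subtree found again from the path
```

In this sketch a stored path stops pointing at the intended subtree as soon as a sibling is inserted ahead of it, which is one plausible reason for exploring a more content-based second mechanism.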
| An ODA Page Planner for Professional Publishing | | BIBA | 79-92 | |
| Giovanni Guardalben; Mose Giacomello | |||
| By its very nature, Professional Publishing requires that document processing be completed in many steps. This is in stark contrast to Desktop Publishing, where all actions leading to the printed page are performed by a single application and usually by the same person. Nowadays, a typical Professional Publishing environment comprises a large database and processing server, usually on a mainframe, and many external processors performing integrated functions, usually on independent workstations. We believe that the front-end function of layout page planning can be served by local applications running on relatively inexpensive graphics workstations. PcPage is a personal computer application that tries to make the work of layout page planning easier and more efficient. Since page planning is a transitional step in document processing, it interacts with other tools and applications. To do so, it has to be built on rich data structures and standard data exchange mechanisms. With these goals in mind, we based PcPage on the ODA/ODIF ISO standards and we chose Microsoft Windows as its graphics interface environment. This paper describes PcPage's implementation of the ODA hierarchical data structure and the sophisticated user interface built upon it. | |||
| flo -- A Language for Typesetting Flowcharts | | BIBAK | 93-106 | |
| Anthony P. Wolfman; Daniel M. Berry | |||
| flo is a language for including flowcharts in documents typeset using
UNIX ditroff. A basic flowchart can be created with minimal effort by
inputting only the basic algorithm written in a Pascal-like notation. The
example below illustrates the general capability of flo. The flowchart to the
left is obtained from the input to the right.
This input uses default settings except for a sizing parameter in the .FL command. flo is a preprocessor for pic, which is in turn a ditroff preprocessor. flo lets most of its input pass through untouched; it translates flo commands lying between .FL and .FE into pic commands that draw the flowcharts. This paper was typeset camera-ready using flo, pic, ditroff, and other ditroff preprocessors. Keywords: Flowcharting, Typesetting, Ditroff, Pic | |||
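The pass-through behaviour described in this abstract (input outside .FL and .FE is copied unchanged, input between them is translated for pic) can be sketched as a small filter. This is an illustration only: `preprocess` and `toy_translate` are hypothetical names, and the stand-in translator merely emits one pic box per input line between the standard `.PS` and `.PE` delimiters. It does not reproduce flo's notation or its real output.

```python
def preprocess(lines, translate):
    """Copy lines through unchanged, except for .FL ... .FE blocks."""
    out = []
    block = None                      # None while outside a flowchart block
    for line in lines:
        if block is None:
            if line.startswith(".FL"):
                block = []            # start collecting flowchart source
            else:
                out.append(line)      # ordinary input passes through untouched
        elif line.startswith(".FE"):
            out.extend(translate(block))
            block = None
        else:
            block.append(line)
    return out


def toy_translate(block):
    """Stand-in translator: one pic box per line (NOT flo's real output)."""
    body = ["down"]
    for position, statement in enumerate(block):
        if position > 0:
            body.append("arrow")
        body.append('box "%s"' % statement.strip())
    return [".PS"] + body + [".PE"]


source = [".PP", "Some text.", ".FL", "read x", "print x", ".FE", "More text."]
result = preprocess(source, toy_translate)
```

Passing the translator in as a parameter keeps the filtering logic separate from the flowchart layout, which is where a real preprocessor would do almost all of its work.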
| Design of Hypermedia Publications: Issues and Solutions | | BIBAK | 107-124 | |
| Paul Kahn; Julie Launhardt; Krzysztof Lenk; Ronnie Peters | |||
| For a hypermedia collection to function properly, an author must
successfully combine the verbal language of the document content with an
equally persuasive visual language of hypermedia design. This visual language
should help define a sense of hierarchy in the presentation of information,
create a sense of order, structure and clarity, and allow the user to focus on
what is alike and what is different. This paper discusses some of the issues
that face the designer of hypermedia documents being considered by a joint
research team of software engineers, software designers, content specialists
and graphic designers. We discuss specific implementation issues that informed
the creation of Exploring the Moon and The Dickens Web, the first two
hypermedia publications created with IRIS Intermedia version 3.0. In analyzing
these two works as well as ideas for future hypermedia publications, we have
identified a new set of issues which we list at the end of the paper. Keywords: Hypermedia, Graphic design, Intermedia | |||
| Strengths and Weaknesses of Database Models for Textual Documents | | BIBAK | 125-138 | |
| B. N. Rossiter; M. A. Heather | |||
| User requirements in large and complex textbases are discussed in the light
of current models. Examples applying relational and semantic models suggest
criteria for a more fundamental approach involving the merger of
object-oriented programming techniques with database methods in future complex
object textbases. Keywords: Document modelling, Databases, Complex objects | |||
| A Structured Document Database System | | BIBA | 139-151 | |
| Pekka Kilpelainen; Greger Linden; Heikki Mannila; Erja Nikunen | |||
| We describe a database system for writing, editing, and querying structured documents. The structure of the text is described using a context-free grammar, and the operations are implemented using a powerful query language. The system supports the use of user-defined multiple views of the documents: one view can contain all the structure explicitly, while another can contain only part of the document and have only part of the structure visible. This makes the system flexible for different editing tasks. The system is implemented in C using a relational database system. | |||
| The Integration of Structured Documents into DBMS | | BIBAK | 153-168 | |
| Jose Valdeni De Lima; Henri Galy | |||
| The modeling of structured documents creates enormous problems for database
designers. Those problems are related to the requirements to consider the
logical structure and the exchange of documents in an open system. We want to
be able to handle documents, both as atomic objects and as objects composed of
other objects. We first try to classify different possible approaches
according to the typical database concepts. After describing an integration of
the ODA Standard to a functional type model, the "Fact Model", we describe the
implementation of a functional interface built on top of a relational DBMS,
ORACLE. Keywords: Structured documents, Databases, Logical structures, Complex objects, ODA,
ODIF, Functional models, ORACLE, DOEOIS | |||
| Electronic Publishing -- Practice and Experience | | BIBAK | 169-182 | |
| David F. Brailsford; David R. Evans; Geeti Granger | |||
| Electronic Publishing -- Origination, Dissemination and Design ('EP-odd') is
an academic journal which publishes refereed papers in the subject area of
electronic publishing. The authors of the present paper are, respectively,
editor-in-chief, system software consultant and senior production manager for
the journal. EP-odd's policy is that editors, authors, referees and production
staff will work closely together using electronic mail. Authors are also
encouraged to originate their papers using one of the approved text-processing
packages together with the appropriate set of macros which enforce the layout
style for the journal. This same software will then be used by the publisher
in the production phase. Our experiences with these strategies are presented,
and two recently developed suites of software are described: one of these makes
the macro sets available over electronic mail and the other automates the flow
of papers through the refereeing process. The decision to produce EP-odd in
this way means that the publisher has to adopt production procedures which
differ markedly from those employed for a conventional journal. Keywords: Journal production, Computer aided refereeing system, Remote file access | |||
| ADAPT: Automated Document Analysis Processing and Tagging | | BIBAK | 183-192 | |
| John Handley; Stuart Weibel | |||
| ADAPT is a document processing system that automatically builds full-text
databases from document images. The major components of the process are
scanning, image segmentation, optical character recognition (OCR), layout
object identification, and database building. A retrieval system and user
interface complete the functionality. The system features a general document
representation that includes the document image and an SGML tagged version.
Standards are adhered to where applicable. Keywords: Document processing, Full-text retrieval, Document structure analysis,
Abstract syntax notation one | |||
| Recognition Processing for Multilingual Documents | | BIBAK | 193-205 | |
| A. Lawrence Spitz | |||
| We have extended earlier work on document recognition systems to include
multilingual documents, specifically those containing both English and
Japanese. The segmentation process divides the page into areas of homogeneous
content and produces a hierarchical representation of page layout called the
segment map. There is an initial halftone segmentation pass, followed by
text/graphics segmentation. Text segments are subjected to analysis to
determine whether they are English (roman) or Japanese, before routing the
output to the appropriate character recognition process. Graphics segments are
routed to a raster-to-vector converter. Having identified text and graphics
segments, we then attempt to recognize their individual internal structures and
merge all of this information into an intermediate representation from which
output transformations are performed. We have implemented three output
filters, two for commercial document formatting systems, and one into an
international standard document architecture. Keywords: Document recognition, Segmentation, Character recognition, Vectorization,
Multilingual | |||
| Editing Images of Text | | BIBAK | 207-220 | |
| Gary E. Kopec; Steven C. Bagley | |||
| Most document recognition systems are based on the paradigm of format
conversion, in which scanned document images are converted into a structured
symbolic description which can be manipulated by a conventional document
processing system. While this approach is attractive in many respects, there
are situations in which complete recognition and format conversion is either
unnecessary or very difficult to achieve with sufficient accuracy. This paper
describes Image EMACS, a text editor for binary document images which
illustrates an alternative to the format conversion paradigm. The inputs and
outputs of Image EMACS are scanned images of text and the primary document
representation within Image EMACS is the image itself, rather than a symbolic
description of it. The goal of Image EMACS is to allow images of text to be
created and manipulated as if they were conventional text files. The central
insight behind Image EMACS is that many text editing operations may be
implemented directly in terms of geometrical operations on image blobs, without
explicit knowledge of the symbolic character labels (i.e. without character
recognition). Keywords: Document recognition, Text editing, Bitmap editing | |||
| Automatic Generation of Gridfitting Hints for Rasterization of Outline Fonts or Graphics | | BIBA | 221-234 | |
| Sten F. Andler | |||
| The advent of bitmapped displays and printers, high-function page description languages, and outline fonts has dramatically changed the ability of computers to produce typeset documents. Using outline fonts and rasterizing them into bitmaps on demand eliminates costly storage of raster bitmaps for all combinations of device resolution and type size. The problem with these resolution-independent fonts, however, is that aesthetic quality is hard to achieve at low device resolution and/or small font size. This paper presents a method for achieving aesthetic quality without manual intervention. | |||
| Chinese Fonts and Their Digitization | | BIBA | 235-248 | |
| Y. S. Moon; T. Y. Shin | |||
| This paper presents the state of the art in digital Chinese font design. Both academic and industrial achievements are covered. We first highlight the difficulties in Chinese typography which are not encountered in English typesetting. Existing techniques for designing digital Chinese fonts are then examined, with their limitations identified. Finally, we propose future research directions, taking into account the recent trend in outline font technology. | |||
| Documents as User Interfaces | | BIBAK | 249-262 | |
| Eric A. Bier; Aaron Goodisman | |||
| Each year the electronic documents community produces better tools for
creating and changing document elements, including text, illustrations, tables,
equations, video, voice, hypertext links, and animation. At the same time, the
user interface community is working to build interfaces that improve the
quality of interaction by effectively presenting information to the user and
making it easy to act on and manipulate that information. These efforts can be
combined by using documents as user interfaces. This paper describes a
prototype architecture, EmbeddedButtons, that allows arbitrary document
elements to behave as buttons. Using examples from EmbeddedButtons, we
enumerate some of the reasons that user interfaces should be documents and
documents should be user interfaces. Keywords: Active documents, User interfaces, Buttons, EmbeddedButtons | |||
| An Extensible, Object-Oriented System for Active Documents | | BIBA | 263-276 | |
| Paul M. English; Ethan S. Jacobson; Robert A. Morris; Kimbo B. Mundy; Stephen D. Pelletier; Thomas A. Polucci; H. David Scarbro | |||
| An extensible, object-oriented system for describing and executing active documents is discussed. An existing commercial, structured document processing system was extended with a run-time bindable object system and Lisp interpreter. | |||
| The Role of a Descriptive Markup Language in the Creation of Interactive Multimedia Documents for Customized Electronic Delivery | | BIBAK | 277-290 | |
| Gil C. Cruz; Thomas H. Judd | |||
| The emerging broadband telecommunications network promises to support a
myriad of new mass-market information services that may in turn create a
tremendous demand for new source material capable of exploiting the multimedia
transport capability of the network. Authoring such material is presently a
complex and time-consuming process requiring specialized tools. We propose
that a descriptive markup language, based on SGML and enhanced for interactive
multimedia applications, can form the basis for a new set of authoring tools
that will let experienced text authors transfer their skills to multimedia
documents. Experience with a prototype version of such a language in the
production of an experimental electronic magazine indicates that the approach
is valid and useful. Future work includes defining text-like structure in
temporal media and creating a unified set of editing and previewing tools. Keywords: Authoring, Hypermedia, Interactive, Markup, SGML | |||