ACM Transactions on Information Systems 14

Editors: W. Bruce Croft
Dates: 1996
Volume: 14
Publisher: ACM
Standard No: ISSN 1046-8188; HF S548.125 A33
Papers: 16
Links: Table of Contents
  1. TOIS 1996 Volume 14 Issue 1
  2. TOIS 1996 Volume 14 Issue 2
  3. TOIS 1996 Volume 14 Issue 3
  4. TOIS 1996 Volume 14 Issue 4

TOIS 1996 Volume 14 Issue 1

In Memoriam: Gerard Salton 1
 
A Visual Retrieval Environment for Hypermedia Information Systems 3-29
  Dario Lucarella; Antonella Zanzi
We present a graph-based object model that may be used as a uniform framework for direct manipulation of multimedia information. After an introduction motivating the need for abstraction and structuring mechanisms in hypermedia systems, we introduce the data model and the notion of perspective, a form of data abstraction that acts as a user interface to the system, providing control over the visibility of the objects and their properties. A perspective is defined to include an intension and an extension. The intension is defined in terms of a pattern, a subgraph of the schema graph, and the extension is the set of pattern-matching instances. Perspectives, as well as database schema and instances, are graph structures that can be manipulated in various ways. The resulting uniform approach is well suited to a visual interface. A visual interface for complex information systems provides high semantic power, thus exploiting the semantic expressibility of the underlying data model, while maintaining ease of interaction with the system. In this way, we reach the goal of decreasing cognitive load on the user, with the additional advantage of always maintaining the same interaction style. We present a visual retrieval environment that effectively combines filtering, browsing, and navigation to provide an integrated view of the retrieval problem. Design and implementation issues are outlined for MORE (Multimedia Object Retrieval Environment), a prototype system relying on the proposed model. The focus is on the main user interface functionalities, and actual interaction sessions are presented including schema creation, information loading, and information retrieval.
Keywords: Browsing, Complex objects, Direct object manipulation, Graph-Oriented models, Hypermedia applications, Information filtering, Visual interface, Design, Human factors, Management, H.5.1 Information Systems, Information interfaces and presentation, Multimedia Information Systems, Hypertext navigation and maps, H.2.1 Information Systems, Database management, Logical Design, Data models, H.3.3 Information Systems, Information storage and retrieval, Information Search and Retrieval, Query formulation, H.3.3 Information Systems, Information storage and retrieval, Information Search and Retrieval, Selection process, H.5.2 Information Systems, Information interfaces and presentation, User Interfaces, Interaction styles
Sequential Patterns in Information Systems Development: An Application of a Social Process Model 30-63
  Daniel Robey; Michael Newman
We trace the process of developing and implementing a materials management system in one company over a 15-year period. Using a process research model developed by Newman and Robey, we identify 44 events in the process and define them as either encounters or episodes. Encounters are concentrated events, such as meetings and announcements, that separate episodes, which are events of longer duration. By examining the sequence of events over the 15 years of the case, we identify a pattern of repeated failure, followed by success. Our discussion centers on the value of detecting and displaying such patterns and the need for theoretical interpretation of recurring sequences of events. Five alternative theoretical perspectives, originally proposed by Kling, are used to interpret the sequential patterns identified by the model. We conclude that the form of the process model allows researchers who operate from different perspectives to enrich their understanding of the process of system development.
Keywords: Social processes, System implementation, Human factors, Management, K.6.1 Computing Milieux, Management of computing and information systems, Project and People Management, H.4.2 Information Systems, Information systems applications, Types of Systems
Evaluation of Model-Based Retrieval Effectiveness with OCR Text 64-93
  Kazem Taghva; Julie Borsack; Allen Condit
We give a comprehensive report on our experiments with retrieval from OCR-generated text using systems based on standard models of retrieval. More specifically, we show that average precision and recall are not affected by OCR errors across systems for several collections. The collections used in these experiments include both actual OCR-generated text and standard information retrieval collections corrupted through the simulation of OCR errors. Both the actual and simulation experiments include full-text and abstract-length documents. We also demonstrate that the ranking and feedback methods associated with these models are generally not robust enough to deal with OCR errors. It is further shown that the OCR errors and garbage strings generated from the mistranslation of graphic objects increase the size of the index by a wide margin. We not only point out problems that can arise from applying OCR text within an information retrieval environment, we also suggest solutions to overcome some of these problems.
Keywords: Error correction, Feedback, Optical character recognition, Ranking algorithms, Experimentation, Performance, H.3.3 Information Systems, Information storage and retrieval, Information Search and Retrieval, Retrieval models, H.3.1 Information Systems, Information storage and retrieval, Content Analysis and Indexing, Indexing methods, H.3.3 Information Systems, Information storage and retrieval, Information Search and Retrieval, Search process, I.4.1 Computing Methodologies, Image processing and computer vision, Digitization and Image Capture, Scanning
An Extension of Ukkonen's Enhanced Dynamic Programming ASM Algorithm 94-106
  Hal Berghel; David Roach
We describe an improvement on Ukkonen's Enhanced Dynamic Programming (EHD) approximate string-matching algorithm for unit-penalty four-edit comparisons. The new algorithm has an asymptotic complexity similar to that of Ukkonen's but is significantly faster due to a decrease in the number of array cell calculations. A 42% speedup was achieved in an application involving name comparisons. Even greater improvements are possible when comparing longer and more dissimilar strings. Although the speed of the algorithm under consideration is comparable to other fast ASM algorithms, it has greater effectiveness in text-processing applications because it supports all four basic Damerau-type editing operations.
Keywords: Approximate string matching, Dynamic programming, Enhanced dynamic programming, Similarity relations, Algorithms, Performance, F.2.2 Theory of Computation, Analysis of algorithms and problem complexity, Nonnumerical Algorithms and Problems, Pattern matching, H.3.1 Information Systems, Information storage and retrieval, Content Analysis and Indexing, H.4.1 Information Systems, Information systems applications, Office Automation, H.3.3 Information Systems, Information storage and retrieval, Information Search and Retrieval, Search process
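For orientation, the four unit-penalty edit operations named in the abstract above (insertion, deletion, substitution, and transposition of adjacent characters) can be illustrated with the textbook full-matrix dynamic program. This is a minimal baseline sketch, not Ukkonen's algorithm and not the authors' extension: it fills every array cell, which is exactly the work those algorithms reduce. The function name is hypothetical, and the variant shown is the restricted ("optimal string alignment") form, in which a transposed pair is not edited again.

```python
def damerau_distance(a: str, b: str) -> int:
    """Unit-cost edit distance with insert, delete, substitute, and
    adjacent transposition (restricted / optimal-string-alignment form).

    Baseline sketch only: every cell of the (len(a)+1) x (len(b)+1)
    matrix is computed.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i          # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j          # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            # Adjacent transposition, e.g. "ht" <-> "th".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]
```

In a name-comparison setting such as the one the abstract mentions, a call like `damerau_distance("smith", "smiht")` treats the swapped letters as a single edit rather than two substitutions.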

TOIS 1996 Volume 14 Issue 2

Document Ranking on Weight-Partitioned Signature Files 109-137
  Dik Lun Lee; Liming Ren
A signature file organization, called the weight-partitioned signature file, for supporting document ranking is proposed. It employs multiple signature files, each of which corresponds to one term frequency, to represent terms with different term frequencies. Words with the same term frequency in a document are grouped together and hashed into the signature file corresponding to that term frequency. This eliminates the need to record the term frequency explicitly for each word. We investigate the effect of false drops on retrieval effectiveness if they are not eliminated in the search process. We have shown that false drops introduce insignificant degradation on precision and recall when the false-drop probability is below a certain threshold. This is an important result since false-drop elimination could become the bottleneck in systems using fast signature file search techniques. We perform an analytical study on the performance of the weight-partitioned signature file under different search strategies and configurations. An optimal formula is obtained to determine for a fixed total storage overhead the storage to be allocated to each partition in order to minimize the effect of false drops on document ranks. Experiments were performed using a document collection to support the analytical results.
Keywords: Access method, Document retrieval, Information retrieval, Signature file, Superimposed coding, Text retrieval, Algorithms, Design, Experimentation, Performance, H.3.3 Information Systems, Information storage and retrieval, Information Search and Retrieval, Retrieval models, H.2.2 Information Systems, Database management, Physical Design, Access methods, H.3.6 Information Systems, Information storage and retrieval, Library Automation, H.3.1 Information Systems, Information storage and retrieval, Content Analysis and Indexing
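The organization described in the abstract above can be pictured with a small sketch: words of a document are grouped by term frequency, each group is hashed into the signature belonging to that frequency, and a query term that matches a document's signature in partition `tf` contributes `tf` to that document's score. Everything concrete here is an assumption made for illustration: the signature width, the number of bits per word, the SHA-256 based hash, the in-memory dictionaries, and the plain sum-of-frequencies score. False drops are deliberately left in, as in the scenario the abstract studies.

```python
import hashlib
from collections import Counter

SIG_BITS = 64        # assumed signature width in bits
BITS_PER_WORD = 3    # assumed number of bits each word sets


def word_mask(word: str) -> int:
    """Superimposed-coding bit pattern for one word (illustrative hash)."""
    mask = 0
    for k in range(BITS_PER_WORD):
        digest = hashlib.sha256(f"{k}:{word}".encode()).digest()
        mask |= 1 << (int.from_bytes(digest[:4], "big") % SIG_BITS)
    return mask


def build_partitions(docs: dict) -> dict:
    """Return {term_frequency: {doc_id: signature}}.

    Words with the same frequency in a document share one signature,
    so the frequency itself is never stored per word.
    """
    partitions = {}
    for doc_id, text in docs.items():
        for word, tf in Counter(text.lower().split()).items():
            sigs = partitions.setdefault(tf, {})
            sigs[doc_id] = sigs.get(doc_id, 0) | word_mask(word)
    return partitions


def rank(partitions: dict, query: str) -> Counter:
    """Score documents by summing the frequency of every partition whose
    signature covers the query word. False drops are not eliminated."""
    scores = Counter()
    for word in query.lower().split():
        mask = word_mask(word)
        for tf, sigs in partitions.items():
            for doc_id, sig in sigs.items():
                if sig & mask == mask:
                    scores[doc_id] += tf
    return scores
```

Because signatures can match words that were never inserted, the scores are lower-bounded by the true term frequencies but may be inflated; how the signature storage is divided among partitions, which the article optimizes, is not modeled in this sketch.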
Using Local Optimality Criteria for Efficient Information Retrieval with Redundant Information Filters 138-174
  Neil C. Rowe
We consider information retrieval when the data -- for instance, multimedia -- is computationally expensive to fetch. Our approach uses "information filters" to considerably narrow the universe of possibilities before retrieval. We are especially interested in redundant information filters that save time over more general but more costly filters. Efficient retrieval requires that decisions be made about the necessity, order, and concurrent processing of proposed filters (an "execution plan"). We develop simple polynomial-time local criteria for optimal execution plans and show that most forms of concurrency are suboptimal with information filters. Although the general problem of finding an optimal execution plan is likely to be exponential in the number of filters, we show experimentally that our local optimality criteria, used in a polynomial-time algorithm, nearly always find the global optimum with 15 or fewer filters, a sufficient number of filters for most applications. Our methods require no special hardware and avoid the high processor idleness that is characteristic of massive-parallelism solutions to this problem. We apply our ideas to an important application, information retrieval of captioned data using natural-language understanding, a problem for which the natural-language processing can be the bottleneck if not implemented well.
Keywords: Boolean algebra, Conjunction, Filters, Natural language, Optimization, Queries, Performance, H.3.3 Information Systems, Information storage and retrieval, Information Search and Retrieval, Search process
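As a small illustration of what a local criterion for filter ordering can look like, the sketch below uses the classic adjacent-exchange rule for a conjunctive sequence of independent filters: sort by cost divided by rejection probability. This is a standard textbook result chosen for illustration, not necessarily one of the article's criteria, and it ignores redundancy, optional filters, and concurrency, which the article also treats. The tuple layout `(cost, pass_probability)` and both function names are assumptions.

```python
def expected_cost(filters):
    """Expected per-item cost of applying filters in the given order.

    Each filter is (cost, pass_probability). An item stops at the first
    filter that rejects it; pass probabilities are assumed independent.
    """
    total, reach = 0.0, 1.0
    for cost, p_pass in filters:
        total += reach * cost   # only items that reached this filter pay
        reach *= p_pass
    return total


def order_by_rank(filters):
    """Sort by cost / (1 - pass_probability), cheapest-per-rejection first.

    A filter that never rejects anything is ranked last.
    """
    def rank(f):
        cost, p_pass = f
        return float("inf") if p_pass >= 1.0 else cost / (1.0 - p_pass)
    return sorted(filters, key=rank)
```

Under the independence assumption, swapping two adjacent filters that violate this ordering never increases `expected_cost`, which is why a purely local test is enough to reach the best sequential order in this simplified setting.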
TROLL: A Language for Object-Oriented Specification of Information Systems 175-211
  Ralf Jungclaus; Gunter Saake; Thorsten Hartmann; Cristina Sernadas
TROLL is a language particularly suited for the early stages of information system development, when the universe of discourse must be described. In TROLL the descriptions of the static and dynamic aspects of entities are integrated into object descriptions. Sublanguages for data terms, for first-order and temporal assertions, and for processes, are used to describe respectively the static properties, the behavior, and the evolution over time of objects. TROLL organizes system design through object-orientation and the support of abstractions such as classification, specialization, roles, and aggregation. Language features for state interactions and dependencies among components support the composition of the system from smaller modules, as does the facility of defining interfaces on top of object descriptions.
Keywords: Formal specification, Information system design, Object-oriented conceptual modeling, Design, Languages, D.2.1 Software, Software engineering, Requirements/Specifications, Languages, D.3.3 Software, Programming languages, Language Constructs and Features, H.1.0 Information Systems, Models and principles, General, D.3.2 Software, Programming languages, Language Classifications
Computerized Performance Monitors as Multidimensional Systems: Derivation and Application 212-235
  Rebecca A. Grant; Chris A. Higgins
An increasing number of companies are introducing computer technology into more aspects of work. Effective use of information systems to support office and service work can improve staff productivity, broaden a company's market, or dramatically change its business. It can also increase the extent to which work is computer mediated and thus within the reach of software known as Computerized Performance Monitoring and Control Systems (CPMCSs). Virtually all research has studied CPMCSs as unidimensional systems. Employees are described as "monitored" or "unmonitored" or as subject to "high," "moderate," or "low" levels of monitoring. Research that does not clearly distinguish among possible monitor designs cannot explain how designs may differ in effect. Nor can it suggest how to design better monitors. A multidimensional view of CPMCSs describes monitor designs in terms of object of measurements, tasks measured, recipient of data, reporting period, and message content. This view is derived from literature in control systems, organizational behavior, and management information systems. The multidimensional view can then be incorporated into causal models to explain contradictory results of earlier CPMCS research.
Keywords: Computerized performance evaluation, Computerized work monitoring, Work monitoring system design, Measurement, Management, Performance, Theory, H.4.2 Information Systems, Information systems applications, Types of Systems, Logistics, K.4.3 Computing Milieux, Computers and society, Organizational Impacts, H.1.2 Information Systems, Models and principles, User/Machine Systems, Human factors, K.7.m Computing Milieux, The computing profession, Miscellaneous, C.4 Computer Systems Organization, Performance of systems, Modeling techniques, H.4.1 Information Systems, Information systems applications, Office Automation, Time management

TOIS 1996 Volume 14 Issue 3

Natural-Language Retrieval of Images Based on Descriptive Captions 237-267
  Eugene J. Guglielmo; Neil C. Rowe
We describe a prototype intelligent information retrieval system that uses natural-language understanding to efficiently locate captioned data. Multimedia data generally require captions to explain their features and significance. Such descriptive captions often rely on long nominal compounds (strings of consecutive nouns) which create problems of disambiguating word sense. In our system, captions and user queries are parsed and interpreted to produce a logical form using a detailed theory of the meaning of nominal compounds. A fine-grain match can then compare the logical form of the query to the logical forms for each caption. To improve system efficiency, we first perform a coarse-grain match with index files, using nouns and verbs extracted from the query. Our experiments with randomly selected queries and captions from an existing image library show an increase of 30% in precision and 50% in recall over the keyphrase approach currently used. Our processing times have a median of seven seconds as compared to eight minutes for the existing system, and our system is much easier to use.
Keywords: Captions, Multimedia database, Type hierarchy, Algorithms, Experimentation, Human factors, Performance, H.3.3 Information Systems, Information storage and retrieval, Information Search and Retrieval, Selection process, H.3.3 Information Systems, Information storage and retrieval, Information Search and Retrieval, Search process, H.3.3 Information Systems, Information storage and retrieval, Information Search and Retrieval, Query formulation, H.3.1 Information Systems, Information storage and retrieval, Content Analysis and Indexing, Indexing methods, H.3.1 Information Systems, Information storage and retrieval, Content Analysis and Indexing, Linguistic processing, I.2.4 Computing Methodologies, Artificial intelligence, Knowledge Representation Formalisms and Methods, Predicate logic, I.2.7 Computing Methodologies, Artificial intelligence, Natural Language Processing, Language parsing and understanding
Extending Object-Oriented Systems with Roles 268-296
  Georg Gottlob; Michael Schrefl; Brigitte Rock
In many class-based object-oriented systems the association between an instance and a class is exclusive and permanent. Therefore these systems have serious difficulties in representing objects taking on different roles over time. Such objects must be reclassified any time they evolve (e.g., if a person becomes a student and later an employee). Class hierarchies must be planned carefully and may grow exponentially if entities may take on several independent roles. The problem is even more severe for object-oriented databases than for common object-oriented programming. Databases store objects over longer periods, during which the represented entities evolve. This article shows how class-based object-oriented systems can be extended to handle evolving objects well. Class hierarchies are complemented by role hierarchies, whose nodes represent role types an object classified in the root may take on. At any point in time, an entity is represented by an instance of the root and an instance of every role type whose role it currently plays. The approach extends traditional object-oriented concepts, such as classification, object identity, specialization, inheritance, and polymorphism, in a natural way. The practicability of the approach is demonstrated by an implementation in Smalltalk. Smalltalk was chosen because it is widely known, which is not true for any particular class-based object-oriented database programming language. Roles can be provided in Smalltalk by adding a few classes. There is no need to modify the semantics of Smalltalk itself. Role hierarchies are mapped transparently onto ordinary classes. The presented implementation can easily be ported to object-oriented database programming languages based on Smalltalk, such as Gemstone's OPAL.
Keywords: Delegation, Inheritance, Object-oriented databases, Object specialization, Roles, Design, Languages, D.1.5 Software, Programming techniques, Object-oriented Programming, D.2.10 Software, Software engineering, Design, Methodologies, D.2.10 Software, Software engineering, Design, Representation, D.3.3 Software, Programming languages, Language Constructs and Features, H.2.3 Information Systems, Database management, Languages, Database (persistent) programming languages, D.2.1 Software, Software engineering, Requirements/Specifications
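The idea in the abstract above, one root instance plus one instance per role currently played, with roles acquired and dropped over time, can be sketched with ordinary classes and delegation. The article's implementation is in Smalltalk; the Python below is only an illustrative analogue, and all class names, method names, and attribute values in it are hypothetical.

```python
class Person:
    """Root type: identity and permanent properties live here."""

    def __init__(self, name):
        self.name = name
        self._roles = {}            # role type -> role instance

    def add_role(self, role_cls, **attrs):
        role = role_cls(self, **attrs)
        self._roles[role_cls] = role
        return role

    def drop_role(self, role_cls):
        self._roles.pop(role_cls, None)

    def plays(self, role_cls):
        return role_cls in self._roles

    def as_role(self, role_cls):
        return self._roles[role_cls]


class Role:
    """Role instance: holds role-specific state, delegates the rest."""

    def __init__(self, player, **attrs):
        self._player = player
        self.__dict__.update(attrs)

    def __getattr__(self, name):
        # Called only when normal lookup fails: delegate to the root.
        return getattr(self._player, name)


class Student(Role):
    pass


class Employee(Role):
    pass
```

A person can become a student and later an employee by calling `add_role` and `drop_role` on the same `Person` object, so no reclassification of the root instance is needed and its identity is preserved throughout.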
A General Explanation Component for Conceptual Modeling in CASE Environments 297-329
  Jon Atle Gulla
In information systems engineering, conceptual models are constructed to assess existing information systems and work out requirements for new ones. As these models serve as a means for communication between customers and developers, it is paramount that both parties understand the models, as well as that the models form a proper basis for the subsequent design and implementation of the systems. New CASE environments are now experimenting with formal modeling languages and various techniques for validating conceptual models, though it seems difficult to come up with a technique that handles the linguistic barriers between the parties involved in a satisfactory manner. In this article, we discuss the theoretical basis of an explanation component implemented for the PPP CASE environment. This component integrates other validation techniques and provides a very flexible natural-language interface to complex model information. It describes properties of the modeling language and the conceptual models in terms familiar to users, and the explanations can be combined with graphical model views. When models are executed, it can justify requested inputs and explain computed outputs by relating trace information to properties of the models.
Keywords: Conceptual modeling, Explanation generation, Help systems, Linguistics, Paraphrasing, Requirements engineering, Design, Documentation, Human factors, D.2.2 Software, Software engineering, Design Tools and Techniques, Computer-aided software engineering (CASE), I.2.7 Computing Methodologies, Artificial intelligence, Natural Language Processing, C.4 Computer Systems Organization, Performance of systems, Modeling techniques
Bias in Computer Systems 330-347
  Batya Friedman; Helen Nissenbaum
From an analysis of actual cases, three categories of bias in computer systems have been developed: preexisting, technical, and emergent. Preexisting bias has its roots in social institutions, practices, and attitudes. Technical bias arises from technical constraints or considerations. Emergent bias arises in a context of use. Although others have pointed to bias in particular computer systems and have noted the general problem, we know of no comparable work that examines this phenomenon comprehensively and which offers a framework for understanding and remedying it. We conclude by suggesting that freedom from bias should be counted among the select set of criteria -- including reliability, accuracy, and efficiency -- according to which the quality of systems in use in society should be judged.
Keywords: Bias, Computer ethics, Computers and society, Design methods, Ethics, Human values, Standards, Social computing, Social impact, System design, Universal design, Values, Design, Human factors, H.1.2 Information Systems, Models and principles, User/Machine Systems, D.2.0 Software, Software engineering, General, K.4.0 Computing Milieux, Computers and society, General

TOIS 1996 Volume 14 Issue 4

Self-Indexing Inverted Files for Fast Text Retrieval 349-379
  Alistair Moffat; Justin Zobel
Query-processing costs on large text databases are dominated by the need to retrieve and scan the inverted list of each query term. Retrieval time for inverted lists can be greatly reduced by the use of compression, but this adds to the CPU time required. Here we show that the CPU component of query response time for conjunctive Boolean queries and for informal ranked queries can be similarly reduced, at little cost in terms of storage, by the inclusion of an internal index in each compressed inverted list. This method has been applied in a retrieval system for a collection of nearly two million short documents. Our experimental results show that the self-indexing strategy adds less than 20% to the size of the compressed inverted file, which itself occupies less than 10% of the indexed text, yet can reduce processing time for Boolean queries of 5-10 terms to under one fifth of the previous cost. Similarly, ranked queries of 40-50 terms can be evaluated in as little as 25% of the previous time, with little or no loss of retrieval effectiveness.
Keywords: Full-text retrieval, Index compression, Information retrieval, Inverted file, Query processing, Design, Performance, H.3.1 Information Systems, Information storage and retrieval, Content Analysis and Indexing, Indexing methods, E.4 Data, Coding and information theory, Data compaction and compression, H.3.2 Information Systems, Information storage and retrieval, Information Storage, File organization, H.3.3 Information Systems, Information storage and retrieval, Information Search and Retrieval, Search process
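The effect of an internal index inside each inverted list can be sketched as follows: a conjunctive query starts from the shortest list and, for each candidate document, jumps to the one block of a longer list that could contain it instead of scanning from the start. This is a simplified illustration under stated assumptions: the lists here are uncompressed Python lists, the block size is arbitrary, and the `scanned` counter stands in for the decoding work that skipping saves in a compressed list. The class and function names are hypothetical.

```python
from bisect import bisect_right


class SkippedList:
    """Sorted document-id list with a small internal index of block heads."""

    def __init__(self, doc_ids, block=4):
        self.doc_ids = sorted(doc_ids)
        self.block = block
        self.heads = self.doc_ids[::block]   # first id of every block
        self.scanned = 0                     # entries actually examined

    def contains(self, doc_id):
        # Locate the only block that can hold doc_id, then scan it.
        b = bisect_right(self.heads, doc_id) - 1
        if b < 0:
            return False
        start = b * self.block
        for value in self.doc_ids[start:start + self.block]:
            self.scanned += 1
            if value == doc_id:
                return True
            if value > doc_id:
                return False
        return False


def conjunctive_query(lists):
    """Documents present in every list, driven by the shortest list."""
    ordered = sorted(lists, key=lambda lst: len(lst.doc_ids))
    candidates = list(ordered[0].doc_ids)
    for other in ordered[1:]:
        candidates = [d for d in candidates if other.contains(d)]
    return candidates
```

With a rare term and a common term, the number of entries examined in the long list is bounded by the number of candidates times the block size, rather than by the length of the list.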
Information System Behavior Specification by High Level Petri Nets 380-420
  Andreas Oberweis; Peter Sander
The specification of an information system should include a description of structural system aspects as well as a description of the system behavior. In this article, we show how this can be achieved by high-level Petri nets -- namely, the so-called NR/T-nets (Nested-Relation/Transition Nets). In NR/T-nets, the structural part is modeled by nested relations, and the behavioral part is modeled by a novel Petri net formalism. Each place of a net represents a nested relation scheme, and the marking of each place is given as a nested relation of the respective type. Insert and delete operations in a nested relational database (NF2-database) are expressed by transitions in a net. These operations may operate not only on whole tuples of a given relation, but also on "subtuples" of existing tuples. The arcs of a net are inscribed with so-called Filter Tables, which allow (together with an optional logical expression as transition inscription) conditions to be formulated on the specified (sub-) tuples. The occurrence rule for NR/T-net transitions is defined by the operations union, intersection, and "negative" in lattices of nested relations. The structure of an NR/T-net, together with the occurrence rule, defines classes of possible information system procedures, i.e., sequences of (possibly concurrent) operations in an information system.
Keywords: Behavior specification, Complex objects, Conceptual design, Nested relations, Petri nets, Design, Languages, Management, H.1.1 Information Systems, Models and principles, Systems and Information Theory, General systems theory, H.2.1 Information Systems, Database management, Logical Design, Data models, H.2.3 Information Systems, Database management, Languages, Data manipulation languages (DML)
The Model-Assisted Global Query System for Multiple Databases in Distributed Enterprises 421-470
  Waiman Cheung; Cheng Hsu
Today's enterprises typically employ multiple information systems, which are independently developed, locally administered, and different in logical or physical designs. Therefore, a fundamental challenge in enterprise information management is the sharing of information for enterprise users across organizational boundaries; this requires a global query system capable of providing on-line intelligent assistance to users. Conventional technologies, such as schema-based query languages and hard-coded schema integration, are not sufficient to solve this problem. This article develops a new approach, a "model-assisted global query system," that utilizes an on-line repository of enterprise metadata -- the Metadatabase -- to facilitate global query formulation and processing with certain desirable properties such as adaptiveness and open-systems architecture. A definitional model characterizing the various classes and roles of the required metadata as knowledge for the system is presented. The significance of possessing this knowledge (via a Metadatabase) toward improving the global query capabilities available previously is analyzed. On this basis, a direct method using model traversal and a query language using global model constructs are developed along with other new methods required for this approach. It is then tested through a prototype system in a computer-integrated manufacturing (CIM) setting.
Keywords: Enterprise integration, Global query system, Heterogeneous distributed information systems, Metadatabase, Design, Performance, H.2.4 Information Systems, Database management, Systems, Query processing, H.2.3 Information Systems, Database management, Languages, Query languages, H.2.4 Information Systems, Database management, Systems, Distributed databases, H.2.7 Information Systems, Database management, Database Administration, Data dictionary/directory, H.3.3 Information Systems, Information storage and retrieval, Information Search and Retrieval, Query formulation, H.5.2 Information Systems, Information interfaces and presentation, User Interfaces