GW Tables of Contents: 96 97 99 01 03 05 07 09 11 13

GW 1997: Gesture Workshop

Fullname: GW 1997: Gesture and Sign Language in Human-Computer Interaction: International Gesture Workshop Proceedings
Editors: Ipke Wachsmuth; Martin Fröhlich
Location: Bielefeld, Germany
Dates: 1997-Sep-17 to 1997-Sep-19
Publisher: Springer Berlin Heidelberg 1998
Series: Lecture Notes in Computer Science 1371
Standard No: DOI: 10.1007/BFb0052983; hcibib: GW97; ISBN: 978-3-540-64424-8 (print), 978-3-540-69782-4 (online)
Papers: 25
Pages: 308
Links: Online Proceedings | Conference Website
  1. Invited Papers
  2. Semiotics of Gesture and Movement
  3. Hidden Markov Models
  4. Motion Analysis and Synthesis
  5. Techniques for Multimodal Interfaces
  6. Neural Network Methods
  7. Applications

Invited Papers

Research Challenges in Gesture: Open Issues and Unsolved Problems BIBAFull-Text 1-11
  Alan Wexelblat
Gesture today remains a sideline in computer interfaces. I argue that this is due to several longstanding deficiencies in the theoretical foundations of the field. We must act to correct these deficiencies and strengthen the research community in order to avoid becoming a footnote in the history of computer science. I specify fundamental unsolved problems in the areas of naturalness, anthropology and systems building. I also suggest some things that we could do to make our research community stronger and more able to tackle these problems.
Progress in Sign Languages Recognition BIBAFull-Text 13-21
  Alistair D. N. Edwards
The automatic recognition of sign language is an attractive prospect; the technology exists to make it possible, while the potential applications are exciting and worthwhile. To date the research emphasis has been on the capture and classification of the gestures of sign language and progress in that work is reported. However, it is suggested that there are some greater, broader research questions to be addressed before full sign language recognition is achieved. The main areas to be addressed are sign language representation (grammars) and facial expression recognition.

Semiotics of Gesture and Movement

Movement Phase in Signs and Co-Speech Gestures, and Their Transcriptions by Human Coders BIBAFull-Text 23-35
  Sotaro Kita; Ingeborg van Gijn; Harry van der Hulst
The previous literature has suggested that the hand movement in co-speech gestures and signs consists of a series of phases with qualitatively different dynamic characteristics. In this paper, we propose a syntagmatic rule system for movement phases that applies to both co-speech gestures and signs. Descriptive criteria for the rule system were developed for the analysis of video-recorded continuous production of signs and gestures. The analysis involves segmenting a stream of body movement into phases and identifying different phase types. Two human coders used the criteria to analyze signs and co-speech gestures produced in natural discourse. It was found that the criteria yielded good inter-coder reliability. These criteria can be used in the automatic recognition of signs and co-speech gestures to segment continuous production and identify the potentially meaning-bearing phases.
Classifying Two Dimensional Gestures in Interactive Systems BIBAFull-Text 37-48
  Axel Kramer
This paper motivates and presents a classification scheme for two-dimensional gestures in interactive systems. Most pen-based systems allow the user to perform gestures in order to enter and execute commands, but the usage of gestures can be found in other interactive systems as well. Much research so far has been focused on how to implement two-dimensional gestures, how to recognize the user's input, or what context to use gestures in.
   Instead, the focus of this paper is to explore and classify interactive characteristics of two-dimensional gestures as they are used in interactive systems. The benefits for the field are three-fold. First, such a classification describes one design space for the usage of two-dimensional gestures in interactive systems and thus presents possible choices to system designers. Second, empirical researchers can make use of such a classification to make systematic choices about aspects of gesture based systems that are worth studying. Finally, it can serve as a starting point for drawing parallels and exploring differences to gestures used in three-dimensional interfaces.
Are Listeners Paying Attention to the Hand Gestures of an Anthropomorphic Agent? An Evaluation Using a Gaze Tracking Method BIBAKFull-Text 49-59
  Shuichi Nobe; Satoru Hayamizu; Osamu Hasegawa; Hideaki Takahashi
The information that listeners are looking at and paying attention to is significant in the evaluation of the human-anthropomorphic agent interaction system. A pilot study was conducted, using a gaze tracking method, on relevant aspects of an anthropomorphic agent's hand gestures in a real-time setting. It revealed that a highly informative, one-handed gesture with seemingly-interactive speech attracted attention when it had a slower stroke and/or a long post-stroke hold at the Center-Center space and upper position.
Keywords: gestures; anthropomorphic agents; gaze tracking method; human-computer interaction
Gesture-Based and Haptic Interaction for Human Skill Acquisition BIBAFull-Text 61-68
  Monica Bordegoni; Franco De Angelis
This paper describes the preliminary results of the research work currently ongoing at the University of Parma, and partially carried out within a basic research project funded by the European Union. The research work aims at applying the techniques used in gesture analysis and recognition to understanding the human skill involved in grasping and manipulating non-rigid objects. The various grasping gestures have been classified on the basis of quantitative features extracted from the hand gesture analysis. Finally, it is planned to map the formalized skill onto a robotic system that will be able to grasp and manipulate non-rigid objects, even when facing unexpected situations.

Hidden Markov Models

High Performance Real-Time Gesture Recognition Using Hidden Markov Models BIBAFull-Text 69-80
  Gerhard Rigoll; Andreas Kosmala; Stefan Eickeler
An advanced real-time system for gesture recognition is presented, which is able to recognize complex dynamic gestures, such as "hand waving", "spin", "pointing", and "head moving". The recognition is based on global motion features, extracted from each difference image of the image sequence. The system uses Hidden Markov Models (HMMs) as the statistical classifier. These HMMs are trained on a database of 24 isolated gestures, performed by 14 different people. With the use of global motion features, a recognition rate of 92.9% is achieved for person- and background-independent recognition.
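   A minimal NumPy sketch of global motion features computed from difference images, of the kind such an HMM classifier could be trained on (illustrative only, not the authors' implementation; the particular features, names and threshold are assumptions):

      import numpy as np

      def global_motion_features(frames, threshold=15):
          """One feature vector per difference image of a grey-scale sequence.

          frames: array of shape (T, H, W), e.g. uint8 grey-scale images.
          Returns an array of shape (T-1, 5): motion mass, centroid (x, y)
          and spread (std x, std y) of the thresholded difference image.
          """
          feats = []
          for prev, curr in zip(frames[:-1], frames[1:]):
              diff = np.abs(curr.astype(np.int16) - prev.astype(np.int16))
              mask = diff > threshold                      # pixels that changed
              ys, xs = np.nonzero(mask)
              if xs.size == 0:                             # no motion in this frame
                  feats.append(np.zeros(5))
                  continue
              mass = xs.size / mask.size                   # fraction of moving pixels
              feats.append(np.array([mass,
                                     xs.mean(), ys.mean(),  # centre of motion
                                     xs.std(), ys.std()]))  # spatial spread
          return np.stack(feats)

      # Example: a random 20-frame sequence stands in for a recorded gesture.
      rng = np.random.default_rng(0)
      sequence = rng.integers(0, 256, size=(20, 64, 64), dtype=np.uint8)
      print(global_motion_features(sequence).shape)        # (19, 5)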
Velocity Profile Based Recognition of Dynamic Gestures with Discrete Hidden Markov Models BIBAFull-Text 81-95
  Frank G. Hofmann; Peter Heyer; Günter Hommel
In this paper we present a method for the recognition of dynamic gestures with discrete Hidden Markov Models (HMMs) from a continuous stream of gesture input data. The segmentation problem is addressed by extracting two velocity profiles from the gesture data and using their extrema as segmentation cues. Gestures are captured with a TUB-SensorGlove. The paper focuses on the description of the gesture recognition method (including data preprocessing) and describes experiments evaluating its performance. The paper combines and further develops ideas from some of our previous work.
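   The segmentation idea can be illustrated with a short NumPy sketch that extracts a velocity profile from sampled positions and uses its local minima as segmentation cues (a sketch under assumed data formats, not the TUB-SensorGlove pipeline; function names and the min_gap heuristic are invented):

      import numpy as np

      def velocity_profile(positions, dt=0.01):
          """Velocity magnitude of a sampled 3D position stream (shape (T, 3))."""
          return np.linalg.norm(np.diff(positions, axis=0), axis=1) / dt

      def segmentation_cues(speed, min_gap=5):
          """Indices of local minima of the velocity profile, used as cut points.

          min_gap suppresses minima that follow another cue too closely.
          """
          cues = []
          for i in range(1, len(speed) - 1):
              if speed[i] <= speed[i - 1] and speed[i] < speed[i + 1]:
                  if not cues or i - cues[-1] >= min_gap:
                      cues.append(i)
          return cues

      # Example: two bell-shaped strokes in a row produce a velocity profile
      # whose minimum marks the boundary between the two gesture candidates.
      t = np.linspace(0, 2, 200)
      x = np.where(t < 1, 1 - np.cos(np.pi * t), 3 - np.cos(np.pi * (t - 1)))
      positions = np.stack([x, np.zeros_like(x), np.zeros_like(x)], axis=1)
      print(segmentation_cues(velocity_profile(positions)))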
Video-Based Sign Language Recognition Using Hidden Markov Models BIBAFull-Text 97-109
  Marcell Assan; Kirsti Grobel
This paper is concerned with the video-based recognition of signs. Concentrating on the manual parameters of sign language, the system aims for the signer-dependent recognition of 262 different signs taken from Sign Language of the Netherlands. For Hidden Markov Modelling a sign is considered a doubly stochastic process, represented by an unobservable state sequence. The observations emitted by the states are regarded as feature vectors that are extracted from video frames. This work deals with three topics: first, the recognition of isolated signs; second, the influence of variations of the feature vector on the recognition rate; and third, an approach for the recognition of connected signs. The system achieves recognition rates up to 94% for isolated signs and 73% for a reduced vocabulary of connected signs.
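   The "doubly stochastic process" view can be made concrete with the standard forward algorithm; the following sketch scores a quantised observation sequence against per-sign discrete HMMs and picks the most likely sign (illustrative only; the toy models, codebook and sizes are assumptions, not the paper's system):

      import numpy as np

      def log_likelihood(obs, pi, A, B):
          """Forward algorithm for a discrete HMM, in log space for stability.

          obs: sequence of symbol indices (quantised feature vectors).
          pi:  initial state distribution, shape (N,)
          A:   state transition matrix, shape (N, N)
          B:   emission probabilities, shape (N, M) for M codebook symbols.
          """
          logB = np.log(B)
          alpha = np.log(pi) + logB[:, obs[0]]
          for o in obs[1:]:
              # log-sum-exp over previous states, then emit the next symbol
              m = alpha.max()
              alpha = m + np.log(np.exp(alpha - m) @ A) + logB[:, o]
          m = alpha.max()
          return m + np.log(np.exp(alpha - m).sum())

      # Toy example: two 3-state sign models over a codebook of 4 symbols.
      rng = np.random.default_rng(1)
      def random_model(n_states=3, n_symbols=4):
          pi = rng.dirichlet(np.ones(n_states))
          A = rng.dirichlet(np.ones(n_states), size=n_states)
          B = rng.dirichlet(np.ones(n_symbols), size=n_states)
          return pi, A, B

      models = {"SIGN_A": random_model(), "SIGN_B": random_model()}
      observation = [0, 1, 1, 3, 2]                     # quantised frame features
      scores = {s: log_likelihood(observation, *m) for s, m in models.items()}
      print(max(scores, key=scores.get), scores)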

Motion Analysis and Synthesis

Corpus of 3D Natural Movements and Sign Language Primitives of Movement BIBAFull-Text 111-121
  Sylvie Gibet; James Richardson; Thierry Lebourque; Annelies Braffort
This paper describes the development of a corpus or database of hand-arm pointing gestures, considered as a basic element for gestural communication. The structure of the corpus is defined for natural pointing movements carried out in different directions, heights and amplitudes. It is then extended to movement primitives habitually used in sign language communication. The corpus is based on movements recorded using an optoelectronic recording system that allows the 3D description of movement trajectories in space. The main technical characteristics of the capture and pretreatment system are presented, and perspectives are highlighted for recognition and generation purposes.
On the Use of Context and A Priori Knowledge in Motion Analysis for Visual Gesture Recognition BIBAFull-Text 123-134
  Karin Husballe Munk; Erik Granum
The correspondence analysis part of a model based vision system is investigated theoretically and through a synthetic image sequence showing a human hand gesture. The purpose of the study is to find and describe ways of improving the conditions for robust tracking, by introducing a priori knowledge such as structural information from the model and temporal context of the observed motion.
   Primary performance characteristics are the size of the search space for correspondence analysis, and the prediction error under various conditions.
   Theoretical models for the search space dependencies on connectivity properties and on prediction accuracy are developed. Observations from the image sequence suggest simple predictors for the context of smooth motion, and their expected influence on the search space is verified. Special consideration must be given to the handling of motion trajectory discontinuities, and alternatives are suggested.
Automatic Estimation of Body Regions from Video Images BIBAKFull-Text 135-145
  Hermann Hienz; Kirsti Grobel
In our approach video-based recognition of sign language requires the extraction of sign parameters. Each sign can be characterised by means of manual (handshape, hand orientation, location and movement) and non-manual (trunk, head, gaze, facial expression, mouth) parameters. This paper introduces a software module which, as part of the developed automatic sign language recognition system, is able to extract relevant body regions from digitised video images. The recognition of body regions is crucial for determining the location of signs. The proposed software module uses a rule-based system for analysing the body contour in order to compute the 2D position of the shoulders, the top of the head and the vertical axis of the body. Based on these results the position of the eyes is calculated directly from the segmented face of the signer. The positions of the remaining face regions (nose, forehead, mouth, cheek, chin) and trunk regions (shoulder belt, chest, belly, hip) are determined by means of two estimators, which use a priori known geometric data of the face and fuzzy techniques. Experiments indicate that our approach leads to good estimation of body regions, all of which are computed in real time.
Keywords: Estimation of body regions; sign language recognition; digital image processing; gesture analysis
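   A toy sketch of the general idea of estimating remaining face and trunk regions from a few detected landmarks using a priori body proportions (the proportions and region set here are invented for illustration and are not the paper's calibrated values or fuzzy estimators):

      import numpy as np

      def estimate_regions(head_top, shoulders_y, eyes):
          """Estimate face and trunk region centres from a few 2D landmarks.

          head_top:    (x, y) of the top of the head
          shoulders_y: y coordinate of the shoulder line
          eyes:        ((x, y), (x, y)) left and right eye positions
          All proportions below are illustrative placeholders.
          """
          eyes = np.asarray(eyes, dtype=float)
          eye_centre = eyes.mean(axis=0)
          eye_dist = np.linalg.norm(eyes[1] - eyes[0])
          face_h = eye_centre[1] - head_top[1]           # forehead height proxy
          return {
              "forehead": (eye_centre[0], head_top[1] + 0.5 * face_h),
              "nose":     (eye_centre[0], eye_centre[1] + 0.6 * eye_dist),
              "mouth":    (eye_centre[0], eye_centre[1] + 1.1 * eye_dist),
              "chin":     (eye_centre[0], eye_centre[1] + 1.6 * eye_dist),
              "chest":    (eye_centre[0], shoulders_y + 2.0 * eye_dist),
          }

      print(estimate_regions(head_top=(100, 20), shoulders_y=120,
                             eyes=((85, 60), (115, 60))))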
Rendering Gestures as Line Drawings BIBAFull-Text 147-157
  Frank Godenschweger; Thomas Strothotte; Hubert Wagener
This paper discusses computer generated illustrations and animation sequences of hand gestures. The animation of gestures in particular is very useful in teaching sign language.
   We propose algorithms for rendering 3D models of hands as line drawings and for designing animations of line drawn gestures. Presentations of gestures as line drawings, as opposed to photorealistic representations, have several advantages. Most importantly, the abstract nature of line drawings emphasizes the essential information a picture is to express and thus supports easier comprehension. Especially when line drawings are rendered from simple 3D models (of human parts), they are aesthetically more pleasing than photorealistic renderings of the same model. This leads us to the assumption that simpler 3D models suffice for line drawn illustrations and animations of gestures, which in consequence facilitates the 3D modeling task and speeds up the rendering. Other advantages of line drawings include fast transmission in networks such as the Internet, and the wide scale-independence they exhibit.

Techniques for Multimodal Interfaces

Investigating the Role of Redundancy in Multimodal Input Systems BIBAFull-Text 159-171
  Karen McKenzie Mills; James L. Alty
A major concern of Human Computer Interaction is to improve communication between people and computer applications. One possible way of improving such communication is to capitalise on the way human beings use speech and gesture in a complementary manner, exploiting the redundancy of information between these modes. Redundant data input via multiple modalities gives considerable scope for the resolution of error and ambiguity. This paper describes the implementation of a simple, inexpensive tri-modal input system accepting touch, two-dimensional gesture and speech input. Currently the speech and gesture recognition systems operate separately. Truth maintenance and blackboard system architectures in a multimodal interpreter are proposed for handling the integration between modes and task knowledge. Preliminary results from the two-dimensional gesture recognition system are presented. Rule induction is used for analysis of the gesture data and preliminary classification results are presented. Current implementations and future work on redundancy are also discussed.
Gesture Recognition of the Upper Limbs -- From Signal to Symbol BIBAFull-Text 173-184
  Martin Fröhlich; Ipke Wachsmuth
To recognise gestures performed by people without disabilities during verbal communication -- so-called coverbal gestures -- a flexible system with a task-oriented design is proposed. Flexibility is addressed via different kinds of modules -- conceived of as agents -- which are grouped in different levels. They can easily be reconfigured or rewritten to suit another application. This system of layered agents uses an abstract body model to transform the data captured by the six-degree-of-freedom sensors and the data gloves into a first-level symbolic description of gesture features. In a first integration step, the first-level symbols are integrated into second-level symbols describing a whole gesture. Second-level symbolic gesture descriptions are the entities which can be integrated with speech tokens to form multi-modal utterances.
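   A minimal sketch of the signal-to-symbol idea: raw data-glove flexion values are mapped to first-level symbols, which are then integrated into a second-level whole-gesture symbol (thresholds, symbol names and the posture set are invented; this is not the authors' agent system):

      # Illustrative first-level symbol extraction: map raw data-glove finger
      # flexion angles (degrees) to a coarse symbolic hand-shape description.
      def first_level_symbols(flexion):
          """flexion: dict of finger name -> average joint flexion in degrees."""
          symbols = {}
          for finger, angle in flexion.items():
              if angle < 20:
                  symbols[finger] = "EXTENDED"
              elif angle < 60:
                  symbols[finger] = "BENT"
              else:
                  symbols[finger] = "CURLED"
          return symbols

      def second_level_symbol(symbols):
          """Integrate per-finger symbols into a whole-hand gesture symbol."""
          extended = [f for f, s in symbols.items() if s == "EXTENDED"]
          if extended == ["index"]:
              return "POINTING"
          if len(extended) == 5:
              return "FLAT_HAND"
          if not extended:
              return "FIST"
          return "UNKNOWN"

      frame = {"thumb": 70, "index": 10, "middle": 80, "ring": 75, "little": 82}
      print(second_level_symbol(first_level_symbols(frame)))   # POINTING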
Exploiting Distant Pointing Gestures for Object Selection in a Virtual Environment BIBAFull-Text 185-196
  Marc Erich Latoschik; Ipke Wachsmuth
Developing state-of-the-art multimedia applications nowadays calls for the use of sophisticated visualisation and immersion techniques, commonly referred to as Virtual Reality. While Virtual Reality now achieves good results both in image quality and in fast user feedback using parallel computation techniques, the methods for interacting with these systems need to be improved. In this paper we introduce, first, a multimedia application that uses a gesture-driven interface and, second, the architecture for an expandable gesture recognition system. After different gesture types for interaction in a virtual environment are discussed with respect to the required functionality, the implementation of a specific gesture detection module for distant pointing recognition is described, and the whole system design is tested for its task adequacy.
An Intuitive Two-Handed Gestural Interface for Computer Supported Product Design BIBAFull-Text 197-208
  Caroline Hummels; Gerda Smets; Kees Overbeeke
More and more researchers emphasize the development of humanising computer interaction, thus bringing us closer to intuitive interfaces. Gestural interface research fits in with these new developments. However, the existing gestural interfaces hardly take advantage of the possibilities gestures offer; they even force the user to learn a new language. We propose a gestural interface for product design that exploits the use of gestures. This interface supports the perceptual-motor skills of the designer and the expressive and creative design process. To develop this task-specific gestural interface we emphasize the importance of explorative experiments to obtain the meaning of gestures used for product design. We show with two experiments that an accurate interpretation of a created product can be made, even when designers are allowed full freedom in their gestures. MOVE ON, a computer-supported design application, is our first step towards full-freedom gestural human-computer interaction. Creating task-specific human-computer interaction using limitless gestures is feasible, although extensive research is necessary and ongoing.

Neural Network Methods

Detection of Fingertips in Human Hand Movement Sequences BIBAFull-Text 209-218
  Claudia Nölker; Helge Ritter
This paper presents a hierarchical approach with neural networks to locate the positions of the fingertips in grey-scale images of human hands. The first chapters introduce and sum up the research done in this area. Afterwards, our hierarchical approach and the preprocessing of the grey-scale images are described. A low-dimensional encoding of the images is obtained by means of Gabor filters, and a special kind of artificial neural net, the LLM-net, is employed to find the positions of the fingertips. The capabilities of the system are demonstrated on three tasks: locating the tip of the forefinger and of the thumb, finding the pointing direction regardless of the operator's pointing style, and detecting all five fingertips in hand movement sequences. The system is able to perform these tasks even when the fingertips are in an area with low contrast.
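   A short sketch of a Gabor-filter-based low-dimensional image encoding of the kind such a fingertip detector could feed into a neural net (illustrative only; kernel parameters, grid size and orientations are assumptions, and the LLM-net itself is not shown):

      import numpy as np
      from scipy.signal import convolve2d

      def gabor_kernel(size=21, wavelength=8.0, theta=0.0, sigma=4.0):
          """Real (cosine) Gabor kernel of shape (size, size)."""
          half = size // 2
          y, x = np.mgrid[-half:half + 1, -half:half + 1]
          xr = x * np.cos(theta) + y * np.sin(theta)
          envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
          return envelope * np.cos(2.0 * np.pi * xr / wavelength)

      def encode(image, orientations=4, grid=8):
          """Low-dimensional encoding: filter responses on a coarse grid."""
          h, w = image.shape
          features = []
          for k in range(orientations):
              kern = gabor_kernel(theta=k * np.pi / orientations)
              resp = convolve2d(image, kern, mode="same", boundary="symm")
              ys = np.linspace(0, h - 1, grid).astype(int)
              xs = np.linspace(0, w - 1, grid).astype(int)
              features.append(np.abs(resp[np.ix_(ys, xs)]).ravel())
          return np.concatenate(features)    # orientations * grid * grid values

      img = np.random.default_rng(2).random((128, 128))
      print(encode(img).shape)               # (256,)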
Neural Architecture for Gesture-Based Human-Machine-Interaction BIBAFull-Text 219-232
  Hans-Joachim Böhme; Anja Brakensiek; Ulf-Dietrich Braumann; Markus Krabbes; Horst-Michael Gross
We present a neural architecture for gesture-based interaction between a mobile robot and human users. One crucial problem for natural interface techniques is robustness under highly varying environmental conditions. Therefore, we propose a multiple-cue approach for the localisation of a potential user in the operation field, followed by the acquisition and interpretation of their gestural instructions. The whole approach is motivated in the context of a reliable operation scenario, but can easily be extended for other applications, such as videoconferencing.
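   The multiple-cue idea can be sketched as the fusion of a crude skin-colour cue and a frame-difference motion cue into one saliency map whose maximum gives a user-position hypothesis (an editorially simplified sketch; the cues, weights and thresholds are assumptions, not the proposed neural architecture):

      import numpy as np

      def skin_cue(rgb):
          """Very crude skin-colour evidence in [0, 1] per pixel (red dominant)."""
          r, g, b = (rgb[..., i].astype(float) for i in range(3))
          return np.clip((r - np.maximum(g, b)) / 255.0, 0.0, 1.0)

      def motion_cue(prev_grey, curr_grey):
          """Normalised frame-difference evidence in [0, 1] per pixel."""
          diff = np.abs(curr_grey.astype(float) - prev_grey.astype(float))
          return diff / (diff.max() + 1e-6)

      def localise_user(rgb, prev_grey, curr_grey, w_skin=0.5, w_motion=0.5):
          """Fuse the cues and return the (row, col) of the strongest response."""
          saliency = (w_skin * skin_cue(rgb) +
                      w_motion * motion_cue(prev_grey, curr_grey))
          return np.unravel_index(np.argmax(saliency), saliency.shape)

      rng = np.random.default_rng(3)
      rgb = rng.integers(0, 256, size=(120, 160, 3), dtype=np.uint8)
      prev_g = rng.integers(0, 256, size=(120, 160), dtype=np.uint8)
      curr_g = rng.integers(0, 256, size=(120, 160), dtype=np.uint8)
      print(localise_user(rgb, prev_g, curr_g))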
Robotic Gesture Recognition BIBAFull-Text 233-244
  Jochen Triesch; Christoph von der Malsburg
Robots of the future should communicate with humans in a natural way. We are especially interested in vision-based gesture interfaces. In the context of robotics several constraints exist, which make the task of gesture recognition particularly challenging. We discuss these constraints and report on progress being made in our lab in the development of techniques for building robust gesture interfaces which can handle these constraints. In an example application, the techniques are shown to be easily combined to build a gesture interface for a real robot grasping objects on a table in front of it.
Image Based Recognition of Gaze Direction Using Adaptive Methods BIBAFull-Text 245-257
  Axel Christian Varchmin; Robert Rae; Helge Ritter
Human-machine interfaces based on gaze recognition can greatly simplify the handling of computer applications. However, most of the existing systems have problems with changing environments and different users. As a solution we use (i) adaptive components which can be trained online and (ii) detect common facial features, i.e. eyes, nose and mouth, for gaze recognition. In a first step an adaptive color histogram segmentation method roughly determines the region of interest including the user's face. Within this region we then use a hierarchical recognition approach to detect the facial features. In the last stage of our system these feature positions are used to estimate gaze direction by detailed analysis of the eye region. We achieve an average precision of 1.5° for the gaze pan and 2.5° for the tilt angle while the user looks at a computer screen. The system runs at a rate of one frame per second on a common workstation.
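   A minimal sketch of colour-histogram-based face-region segmentation in the spirit of the first processing step (a chromaticity histogram learned from sample pixels and backprojected onto the frame); the histogram size, colour space and threshold are assumptions, and the online adaptation and later stages are not shown:

      import numpy as np

      def colour_histogram(pixels, bins=16):
          """2D (red, green) chromaticity histogram learned from sample pixels."""
          p = pixels.astype(float)
          s = p.sum(axis=1, keepdims=True) + 1e-6
          rg = p[:, :2] / s                                # chromaticity in [0, 1]
          hist, _, _ = np.histogram2d(rg[:, 0], rg[:, 1],
                                      bins=bins, range=[[0, 1], [0, 1]])
          return hist / hist.sum()

      def backproject(image, hist, bins=16):
          """Per-pixel probability of belonging to the learned colour model."""
          img = image.astype(float)
          s = img.sum(axis=2, keepdims=True) + 1e-6
          rg = img[..., :2] / s
          idx = np.clip((rg * bins).astype(int), 0, bins - 1)
          return hist[idx[..., 0], idx[..., 1]]

      # Toy usage: learn from a patch assumed to contain the face, then segment.
      rng = np.random.default_rng(4)
      frame = rng.integers(0, 256, size=(120, 160, 3), dtype=np.uint8)
      face_patch = frame[40:60, 70:90].reshape(-1, 3)      # assumed face sample
      prob = backproject(frame, colour_histogram(face_patch))
      mask = prob > np.percentile(prob, 90)                # rough region of interest
      print(mask.shape, mask.sum())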

Applications

Towards a Dialogue System Based on Recognition and Synthesis of Japanese Sign Language BIBAFull-Text 259-271
  Shan Lu; Seiji Igi; Hideaki Matsuo; Yuji Nagashima
This paper describes a dialogue system based on the recognition and synthesis of Japanese sign language. The purpose of this system is to support conversation between people with hearing impairments and hearing people. The system consists of five main modules: sign-language recognition and synthesis, voice recognition and synthesis, and dialogue control. The sign-language recognition module uses a stereo camera and a pair of colored gloves to track the movements of the signer, and sign-language synthesis is achieved by regenerating the motion data obtained by an optical motion capture system. An experiment was done to investigate changes in the gaze-line of hearing-impaired people when they read sign language, and the results are reported.
The Recognition Algorithm with Non-contact for Japanese Sign Language Using Morphological Analysis BIBAFull-Text 273-284
  Hideaki Matsuo; Seiji Igi; Shan Lu; Yuji Nagashima; Yuji Takata; Terutaka Teshima
This paper documents a recognition method for deciphering Japanese sign language (JSL) using projected images. The goal of the movement recognition is to foster communication between hearing-impaired people and people capable of normal speech. We use a stereo camera for recording three-dimensional movements, an image processing board for tracking movements, and a personal computer as an image processor charting the recognition of JSL patterns. This system works by formalizing the space area of the signers according to the characteristics of the human body, determining components such as location and movements, and then recognizing sign language patterns.
   The system is able to recognize JSL by determining the extent of similarities in the sign field, and does so even when vibrations in hand movements occur and when there are differences in body build. We obtained useful results from recognition experiments with 38 different JSL signs performed by two signers.
Special Topics of Gesture Recognition Applied in Intelligent Home Environments BIBAFull-Text 285-296
  Markus Kohler
This report shows how to realize a gesture recognition system for controlling appliances in home environments. It gives a brief overview of an existing system and clarifies details on the ergonomic remote control of devices by gestures with the help of a vision system. The focus is on motion detection, object normalization and identification, and the modelling and prediction of motion by the Kalman filter. A main interest was to show, through the example of ARGUS, how the Kalman filter should be modelled and initialized for physical human motion. The initialization problem of the Kalman filter in a vision-based system for human motion tracking differs from initialization for physical systems, where manuals report measurement errors. Most aspects mentioned in this report were implemented in the ARGUS prototype.
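   A generic constant-velocity Kalman filter sketch for 2D image-plane hand tracking, highlighting the initialization question the report discusses (this is not the ARGUS model; the frame rate, noise values and initial covariance are assumptions):

      import numpy as np

      dt = 1.0 / 25.0                            # video frame period (assumed 25 Hz)

      # Constant-velocity model for 2D hand motion: state [x, y, vx, vy]
      F = np.array([[1, 0, dt, 0],
                    [0, 1, 0, dt],
                    [0, 0, 1, 0],
                    [0, 0, 0, 1]], dtype=float)   # state transition
      H = np.array([[1, 0, 0, 0],
                    [0, 1, 0, 0]], dtype=float)   # only position is measured
      q, r = 50.0, 4.0                            # assumed process / measurement noise
      Q = q * np.diag([dt**4 / 4, dt**4 / 4, dt**2, dt**2])
      R = r * np.eye(2)

      def kalman_step(x, P, z):
          """One predict/update cycle; z is the measured 2D hand position."""
          x_pred = F @ x
          P_pred = F @ P @ F.T + Q
          S = H @ P_pred @ H.T + R                # innovation covariance
          K = P_pred @ H.T @ np.linalg.inv(S)     # Kalman gain
          x_new = x_pred + K @ (z - H @ x_pred)
          P_new = (np.eye(4) - K @ H) @ P_pred
          return x_new, P_new

      # Initialization: the first measurement fixes the position, while the
      # velocity is unknown, so its variance is set large.
      z0 = np.array([80.0, 60.0])
      x = np.array([z0[0], z0[1], 0.0, 0.0])
      P = np.diag([r, r, 1e4, 1e4])
      for z in [np.array([82.0, 61.0]), np.array([85.0, 63.0])]:
          x, P = kalman_step(x, P, z)
      print(np.round(x, 2))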
BUILD-IT: An Intuitive Design Tool Based on Direct Object Manipulation BIBAKFull-Text 297-308
  Morten Fjeld; Martin Bichsel; Matthias Rauterberg
Natural interaction, in the context of this paper, means human action in a world of tangible objects and live subjects. We introduce the concept of action regulation and relate it to observable human behaviour. A tool bringing together motor and cognitive action is a promising way to assure complete task regulation. Aiming for such tools, we propose a set of guidelines for the next generation of user interfaces, the Natural User Interface (NUI). We present a NUI instantiation called BUILD-IT, featuring video-mediated interaction in a task-specific context. This multi-brick interaction tool renders virtual objects tangible and allows multiple users to interact simultaneously in one common space. A few user experiences are briefly described.
Keywords: Augmented Reality; natural interaction; Natural User Interface; graspable objects; computer mediated design