
GW 1999: Gesture Workshop

Fullname: GW 1999: Gesture-Based Communication in Human-Computer Interaction: International Gesture Workshop Proceedings
Editors: Annelies Braffort; Rachid Gherbi; Sylvie Gibet; Daniel Teil; James Richardson
Location: Gif-sur-Yvette, France
Dates: 1999-Mar-17 to 1999-Mar-19
Publisher: Springer Berlin Heidelberg
Series: Lecture Notes in Computer Science 1739
Standard No: DOI: 10.1007/3-540-46616-9; hcibib: GW99; ISBN: 978-3-540-66935-7 (print), 978-3-540-46616-1 (online)
Links: Online Proceedings | Conference Website
  1. Section 1: Human Perception and Production of Gesture
  2. Section 2: Localisation and Segmentation
  3. Section 3: Recognition
  4. Section 4: Sign Language
  5. Section 5: Gesture Synthesis and Animation
  6. Section 6: Multimodality
  7. Round Table

Section 1: Human Perception and Production of Gesture

Seeing Biological Motion -- Is There a Role for Cognitive Strategies? (pp. 3-22)
  Winand H. Dittrich
The aim of the paper is to suggest components of a model for the processing of human movement information introducing the concept of 'motion integrators'. Two approaches to the perception of biological motion are contrasted: the low-level and the high-level processing approach. It is suggested that conceptually-driven processes play a prominent role in motion recognition. Examples from experimental psychology and neurobiology are discussed. Our quasi-automatic perception of biological motion seems to involve resource-dependent cognitive processes and an 'interactive-encoding' hypothesis is elaborated further. In particular, the role of attentional mechanisms and the influence of concept use are highlighted. Finally, recent findings are interpreted in connection to specific encoding strategies.
The Expressive Power of Gestures: Capturing Scent in a Spatial Shape (pp. 23-36)
  Caroline Hummels; Kees Overbeeke
Our engagement with consumer products has diminished gradually over the last decades, causing considerable usability problems. To resolve these problems, the designer's emphasis should shift from creating products that are beautiful in appearance to creating beautiful interactions with products. Consequently, the designer needs new tools such as gestural sketching. To develop a gestural design tool, we tested the suitability of gestures for capturing expressive ideas and the capability of outsiders to recognise this expression. Scents were used to make this expression measurable. Twenty-two creators made four dynamic sculptures expressing these scents. Half of those sculptures were made through gesturing and half through traditional sketching. Subjects were asked to match the scents and the sculptures. Results show that there is no significant difference between sketching and gesturing. Depending on the scent, an interpreter was able to capture the expression when looking at the gestures. These findings support the potential of a gestural design tool.
Non-obvious Performer Gestures in Instrumental Music (pp. 37-48)
  Marcelo M. Wanderley
This paper deals with the gestural language of instrumentalists playing wind instruments. It discusses the role of non-obvious performer gestures that may nevertheless influence the final sound produced by the acoustic instrument. These gestures have not commonly been considered in sound synthesis, although they are an integral part of the instrumentalist's full gestural language. The structure of this paper will be based on an analysis of these non-obvious gestures followed by some comments on how to best classify them according to existing research on gesture reviewed in the introduction; finally, the influence of these gestures on the sound produced by the instrument will be studied and measurement and simulation results presented.
The Ecological Approach to Multimodal System Design (pp. 49-52)
  Antonella De Angeli; Frederic Wolff; Laurent Romary; Walter Gerbino
Following the ecological approach to visual perception, this paper presents a framework that emphasizes the role of vision in referring actions. In particular, affordances are used to explain gesture variability in multimodal human-computer interaction. This proposal is consistent with empirical findings obtained in different simulation studies showing how referring gestures are determined by the mutuality of information coming from the target and the set of movements available to the speaker. A prototype that follows anthropomorphic perceptual principles to analyze gestures has been developed and tested in preliminary computational validations.
Analysis of Trunk and Upper Limb Articular Synergies (pp. 53-57)
  Agnès Roby-Brami; Mounir Mokhtari; Isabelle Laffont; Nezha Bennis; Elena Biryukova
A new method of recording and reconstruction of upper-limb kinematics has been developed in order to analyze the mechanisms of motor recovery in disabled patients. It has been applied to the analysis of unconstrained gestures in normal subjects and in patients with hemiparesis following stroke. The results show evidence of new motor strategies and new co-ordinations developed to compensate for the motor impairment. In the future, this method could be applied to gesture recognition and synthesis and to the development of enhanced learning environments in rehabilitation.

Section 2: Localisation and Segmentation

GREFIT: Visual Recognition of Hand Postures (pp. 61-72)
  Claudia Nölker; Helge Ritter
In this paper, we present GREFIT (Gesture REcognition based on FInger Tips) which is able to extract the 3-dimensional hand posture from video images of the human hand. GREFIT uses a two-stage approach to solve this problem.
   This paper builds on previously presented results of a system that locates the 2-D positions of the fingertips in images. We now describe the second stage, in which the 2-D position information is transformed by an artificial neural net into an estimate of the 3-D configuration of an articulated hand model, which is also used for visualization. This model is designed according to the dimensions and movement possibilities of a natural human hand.
   The virtual hand imitates the user's hand with astonishing accuracy and can track postures from grey-scale images at a rate of 10 Hz.
Towards Imitation Learning of Grasping Movements by an Autonomous Robot (pp. 73-84)
  Jochen Triesch; Jan Wieghardt; Eric Maël; Christoph von der Malsburg
Imitation learning holds the promise of robots which need not be programmed but instead can learn by observing a teacher. We present recent efforts being made at our laboratory towards endowing a robot with the capability of learning to imitate human hand gestures. In particular, we are interested in grasping movements. The aim is a robot that learns, e.g., to pick up a cup at its handle by imitating a human teacher grasping it like this. Our main emphasis is on the computer vision techniques for finding and tracking the human teacher's grasping fingertips. We present first experiments and discuss limitations of the approach and planned extensions.
A Line-Scan Computer Vision Algorithm for Identifying Human Body Features (pp. 85-96)
  Damian M. Lyons; Daniel L. Pelletier
A computer vision algorithm for identifying human body features, called the nine-grid algorithm, is introduced in this paper. The algorithm identifies body features via a two-level hierarchy. The lower level makes a series of measurements on the image from line-scan input. The upper level uses a set of heuristics to assign the measurements to body features. A ground truth study is presented, showing the performance of the algorithm for four classes of activity that we consider typical of in-home user interface applications of computer vision. The study showed that the algorithm correctly identified features, to a close degree of accuracy, 77% of the time. Closer investigation of the results suggested refinements to the algorithm that would improve this score.
Hand Posture Recognition in a Body-Face Centered Space (pp. 97-100)
  Sébastien Marcel; Olivier Bernier
We propose a model for image space discretisation based on face location and on body anthropometry. In this body-face space, a neural network recognizes hand postures. The neural network is a constrained generative model already applied to face detection.
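A face-anchored discretisation of the kind this abstract describes can be sketched as follows. The grid proportions, region size, and cell coordinates below are illustrative assumptions, not the authors' actual anthropometric model: the idea is only that hand positions are expressed in a grid scaled by, and anchored to, the detected face.

```python
import math

def body_face_cell(face_box, hand_xy, n_cols=4, n_rows=4):
    """Map a hand position to a cell of a grid anchored on the detected face.

    face_box: (x, y, w, h) of the face in image coordinates.
    The grid spans a body-sized region below and around the face, scaled by
    the face size -- a stand-in for real anthropometric ratios.
    Returns (col, row), or None if the hand is outside the region.
    """
    x, y, w, h = face_box
    cx = x + w / 2.0
    # Assumed proportions: the interaction region is 4 face-widths wide
    # and 4 face-heights tall, starting at the top of the face.
    region_w, region_h = 4.0 * w, 4.0 * h
    left, top = cx - region_w / 2.0, y
    col = math.floor((hand_xy[0] - left) / region_w * n_cols)
    row = math.floor((hand_xy[1] - top) / region_h * n_rows)
    if 0 <= col < n_cols and 0 <= row < n_rows:
        return col, row
    return None

# A hand just below and to the right of a 40x40 face detected at (100, 50):
cell = body_face_cell((100, 50, 40, 40), (150, 120))
```

Because every coordinate is normalised by the face box, the same grid cell corresponds to the same body-relative posture regardless of where the user stands in the image, which is what makes such a space a convenient input representation for a posture classifier.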

Section 3: Recognition

Vision-Based Gesture Recognition: A Review (pp. 103-115)
  Ying Wu; Thomas S. Huang
The use of gesture as a natural interface serves as a motivating force for research in the modeling, analysis, and recognition of gestures. In particular, human-computer intelligent interaction needs vision-based gesture recognition, which involves many interdisciplinary studies. A survey of recent vision-based gesture recognition approaches is given in this paper. We shall review methods of static hand posture and temporal gesture recognition. Several application systems of gesture recognition are also described in this paper. We conclude with some thoughts about future research directions.
Person Localization and Posture Recognition for Human-Robot Interaction (pp. 117-128)
  Hans-Joachim Böhme; Ulf-Dietrich Braumann; Andrea Corradini; Horst-Michael Gross
The development of a hybrid system for (mainly) gesture-based human-robot interaction is presented, thereby describing the progress in comparison to the work shown at the last gesture workshop (see [2]). The system makes use of standard image processing techniques as well as of neural information processing. The performance of our architecture includes the detection of a person as a potential user in an indoor environment, followed by the recognition of her gestural instructions. In this paper, we concentrate on two major mechanisms: (i) the contour-based person localization via a combination of steerable filters and three-dimensional dynamic neural fields, and (ii) our first experiences concerning the recognition of different instructional postures via a combination of statistical moments and neural classifiers.
Statistical Gesture Recognition Through Modelling of Parameter Trajectories (pp. 129-140)
  Jérôme Martin; Daniela Hall; James L. Crowley
The recognition of human gestures is a challenging problem that can contribute to a natural man-machine interface. In this paper, we present a new technique for gesture recognition. Gestures are modelled as temporal trajectories of parameters. Local sub-sequences of these trajectories are extracted and used to define an orthogonal space using principal component analysis. In this space the probability density function of the training trajectories is represented by a multidimensional histogram, which forms the basis for the recognition. Experiments on three different recognition problems show the general utility of the approach.
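The recipe in this abstract (local sub-sequences, a PCA subspace, and a histogram as density estimate) can be sketched in a few dozen lines. The window size, number of components, bin counts, smoothing constant, and the synthetic gesture classes below are all illustrative assumptions, not the authors' settings:

```python
import numpy as np

def subsequences(traj, w):
    """All length-w local sub-sequences of a (T, d) parameter trajectory."""
    return np.stack([traj[i:i + w].ravel() for i in range(len(traj) - w + 1)])

class TrajectoryModel:
    """PCA subspace over sub-sequences plus a multidimensional histogram
    as the probability density -- a sketch of the abstract's recipe."""

    def __init__(self, w=8, k=2, bins=10, lim=4.0):
        self.w, self.k, self.bins, self.lim = w, k, bins, lim

    def fit_space(self, trajs):
        X = np.vstack([subsequences(t, self.w) for t in trajs])
        self.mu = X.mean(axis=0)
        _, _, Vt = np.linalg.svd(X - self.mu, full_matrices=False)
        self.axes = Vt[:self.k]               # orthogonal space from PCA

    def project(self, trajs):
        X = np.vstack([subsequences(t, self.w) for t in trajs])
        return (X - self.mu) @ self.axes.T

    def density(self, trajs):
        """Histogram of projected training sub-sequences, normalised and smoothed."""
        H, _ = np.histogramdd(self.project(trajs), bins=self.bins,
                              range=[(-self.lim, self.lim)] * self.k)
        return (H + 1e-6) / (H + 1e-6).sum()

    def log_likelihood(self, traj, pdf):
        Z = self.project([traj])
        idx = np.clip(((Z + self.lim) / (2 * self.lim) * self.bins).astype(int),
                      0, self.bins - 1)
        return float(np.log(pdf[tuple(idx.T)]).sum())

# Toy demo with two synthetic one-parameter gesture classes.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 50)[:, None]
sines = [np.sin(2 * np.pi * t) + 0.05 * rng.standard_normal(t.shape) for _ in range(20)]
ramps = [2 * t - 1 + 0.05 * rng.standard_normal(t.shape) for _ in range(20)]

model = TrajectoryModel()
model.fit_space(sines + ramps)                 # shared orthogonal space
pdf_sine, pdf_ramp = model.density(sines), model.density(ramps)

probe = np.sin(2 * np.pi * t) + 0.05 * rng.standard_normal(t.shape)
scores = {"sine": model.log_likelihood(probe, pdf_sine),
          "ramp": model.log_likelihood(probe, pdf_ramp)}
best = max(scores, key=scores.get)
```

Classification picks the class whose histogram assigns the probe's sub-sequences the highest joint log-likelihood; the smoothing constant keeps empty bins from producing log(0).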
Gesture Recognition for Visually Mediated Interaction (pp. 141-151)
  A. Jonathan Howell; Hilary Buxton
This paper reports initial research on supporting Visually Mediated Interaction (VMI) by developing person-specific and generic gesture models for the control of active cameras. We describe a time-delay variant of the Radial Basis Function (TDRBF) network and evaluate its performance on recognising simple pointing and waving hand gestures in image sequences. Experimental results are presented that show that high levels of performance can be obtained for this type of gesture recognition using such techniques, both for particular individuals and across a set of individuals. Characteristic visual evidence can be automatically selected, depending on the task demands.
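The time-delay idea behind a TDRBF network can be sketched as follows: Gaussian hidden units see short windows of consecutive frames (so local motion, not just posture, drives the activations), and per-window class scores are pooled over the sequence. This is a minimal illustration of the idea, not Howell and Buxton's network; the training procedure (sampled centres, least-squares output layer) and the toy gestures are assumptions:

```python
import numpy as np

def windows(frames, delay):
    """Stack `delay` consecutive feature frames into time-delay input vectors."""
    return np.stack([frames[i:i + delay].ravel()
                     for i in range(len(frames) - delay + 1)])

class TDRBF:
    """Minimal time-delay radial basis function classifier (a sketch)."""

    def __init__(self, centres, width, W, delay):
        self.centres, self.width, self.W, self.delay = centres, width, W, delay

    def scores(self, frames):
        X = windows(frames, self.delay)
        d2 = ((X[:, None, :] - self.centres[None]) ** 2).sum(-1)
        phi = np.exp(-d2 / (2 * self.width ** 2))   # Gaussian RBF activations
        return (phi @ self.W).mean(axis=0)          # pool evidence over time

def train_tdrbf(gestures, labels, n_classes, delay=5, width=1.0,
                n_centres=30, seed=0):
    """Centres sampled from training windows; linear output layer solved
    by least squares against one-hot class targets."""
    rng = np.random.default_rng(seed)
    X = np.vstack([windows(g, delay) for g in gestures])
    Y = np.vstack([np.repeat(np.eye(n_classes)[y][None],
                             len(g) - delay + 1, axis=0)
                   for g, y in zip(gestures, labels)])
    centres = X[rng.choice(len(X), n_centres, replace=False)]
    d2 = ((X[:, None, :] - centres[None]) ** 2).sum(-1)
    phi = np.exp(-d2 / (2 * width ** 2))
    W, *_ = np.linalg.lstsq(phi, Y, rcond=None)
    return TDRBF(centres, width, W, delay)

# Toy data: 'point' sweeps the hand rightwards, 'wave' oscillates it.
rng = np.random.default_rng(1)
t = np.linspace(0, 1, 40)

def point():
    return np.stack([t, np.zeros_like(t)], 1) + 0.05 * rng.standard_normal((40, 2))

def wave():
    return (np.stack([np.full_like(t, 0.5), 0.3 * np.sin(6 * np.pi * t)], 1)
            + 0.05 * rng.standard_normal((40, 2)))

gestures = [point() for _ in range(10)] + [wave() for _ in range(10)]
labels = [0] * 10 + [1] * 10
net = train_tdrbf(gestures, labels, n_classes=2)
pred = int(np.argmax(net.scores(wave())))       # classify a fresh waving gesture
```

Pooling the per-window scores over the whole sequence is what lets the network accumulate "characteristic visual evidence" across time instead of deciding from a single frame.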
Interpretation of Pointing Gestures: The PoG System (pp. 153-157)
  Rachid Gherbi; Annelies Braffort
We present in this paper a system named PoG. Its role is to recognise and interpret natural pointing gestures in the context of a multimodal interaction. The user's hand gestures are tracked by a camera located above a building plan. The user points to a room on the plan with his index finger while using speech to ask for some information from the system. The PoG system is composed of an extraction process, which computes visual primitives, a recognition process, providing the name of the gesture, and a localisation process, which computes the coordinates of the index tip.
Control of In-vehicle Systems by Gestures (pp. 159-162)
  Jean-François Kamp; Franck Poirier; Philippe Doignon
In this paper, we propose a new input interface for interaction with in-vehicle systems (traffic information, car phones, etc.). The new device is a touchpad of small dimensions designed to record any control gesture drawn with the finger by the user. An analysis is carried out of the tasks to be achieved by the driver, and a method called "brainstorming" is used to generate a set of possible gestures. A neural network approach is applied for the recognition of gestures. The method is tested on a database of 5252 uppercase letters written by 101 different writers. The average error rate is less than 6.5%. The method is also of great interest in terms of speed and memory space.

Section 4: Sign Language

French Sign Language: Proposition of a Structural Explanation by Iconicity (pp. 165-184)
  Christian Cuxac
In this article, I shall attempt to demonstrate that sign languages are linguistic objects which provide us with increasingly tangible means of accessing cognitive activity. This is possible by virtue of the existence in language of the visible, iconic manifestation of a dynamic process, which is set in motion by deaf signers to speak of experience outside of the situation of the utterance.
HMM-Based Continuous Sign Language Recognition Using Stochastic Grammars (pp. 185-196)
  Hermann Hienz; Britta Bauer; Karl-Friedrich Kraiss
This paper describes the development of a video-based continuous sign language recognition system using Hidden Markov Models (HMM). The system aims for automatic signer dependent recognition of sign language sentences, based on a lexicon of 52 signs of German Sign Language. A single colour video camera is used for image recording. The recognition is based on Hidden Markov Models concentrating on manual sign parameters. As an additional component, a stochastic language model is utilised, which considers uni- and bigram probabilities of single and successive signs. The system achieves an accuracy of 95% using a bigram language model.
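The way a bigram language model steers HMM decoding can be shown with a small dynamic-programming sketch. The per-segment sign likelihoods here stand in for real HMM output scores, and the mini-lexicon and probabilities are invented for illustration; only the combination scheme (Viterbi over sign sequences, acoustic score plus weighted bigram score) reflects the abstract:

```python
import math

def decode(seg_scores, bigram, lm_weight=1.0):
    """Viterbi search over sign sequences, combining per-segment HMM
    log-likelihoods with bigram language-model log-probabilities.

    seg_scores: one dict per segment, {sign: log P(obs | sign)}.
    bigram:     {(prev, sign): log P(sign | prev)}, with '<s>' as start.
    Unseen bigrams get a large penalty instead of proper smoothing."""
    UNSEEN = -1e9
    best = {s: ll + lm_weight * bigram.get(('<s>', s), UNSEEN)
            for s, ll in seg_scores[0].items()}
    back = []
    for frame in seg_scores[1:]:
        new, ptr = {}, {}
        for s, ll in frame.items():
            prev, score = max(
                ((p, b + lm_weight * bigram.get((p, s), UNSEEN))
                 for p, b in best.items()),
                key=lambda x: x[1])
            new[s], ptr[s] = score + ll, prev
        best = new
        back.append(ptr)
    sign = max(best, key=best.get)          # best final sign, then trace back
    sentence = [sign]
    for ptr in reversed(back):
        sign = ptr[sign]
        sentence.append(sign)
    return sentence[::-1]

# Acoustically the last segment slightly prefers GO, but the bigram model
# makes GO -> HOME far more likely than GO -> GO, so HOME wins overall.
seg_scores = [{'I': -1.0, 'YOU': -1.2},
              {'GO': -1.0, 'HOME': -3.0},
              {'HOME': -1.5, 'GO': -1.4}]
bigram = {('<s>', 'I'): math.log(0.6), ('<s>', 'YOU'): math.log(0.4),
          ('I', 'GO'): math.log(0.7), ('YOU', 'GO'): math.log(0.5),
          ('GO', 'HOME'): math.log(0.8), ('GO', 'GO'): math.log(0.05)}
sentence = decode(seg_scores, bigram)
```

Raising `lm_weight` trades acoustic evidence against grammatical plausibility, which is how a stochastic grammar lifts accuracy on confusable signs.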
A Method for Analyzing Spatial Relationships Between Words in Sign Language Recognition (pp. 197-209)
  Hirohiko Sagawa; Masaru Takeuchi
There are expressions using spatial relationships in sign language that are called directional verbs. To understand a sign-language sentence that includes a directional verb, it is necessary to analyze the spatial relationship between the recognized sign-language words and to find the proper combination of a directional verb and the sign-language words related to it. In this paper, we propose an analysis method for evaluating the spatial relationship between a directional verb and other sign-language words according to the distribution of the parameters representing the spatial relationship.
Toward Scalability in ASL Recognition: Breaking Down Signs into Phonemes (pp. 211-224)
  Christian Vogler; Dimitris N. Metaxas
In this paper we present a novel approach to continuous, whole-sentence ASL recognition that uses phonemes instead of whole signs as the basic units. Our approach is based on a sequential phonological model of ASL. According to this model the ASL signs can be broken into movements and holds, which are both considered phonemes. This model does away with the distinction between whole signs and epenthesis movements that we made in previous work [17]. Instead, epenthesis movements are just like the other movements that constitute the signs.
   We subsequently train Hidden Markov Models (HMMs) to recognize the phonemes, instead of whole signs and epenthesis movements that we recognized previously [17]. Because the number of phonemes is limited, HMM-based training and recognition of the ASL signal becomes computationally more tractable and has the potential to lead to the recognition of large-scale vocabularies.
   We experimented with a 22-word vocabulary, and we achieved similar recognition rates with phoneme- and word-based approaches. This result is very promising for scaling up the task in the future.

Section 5: Gesture Synthesis and Animation

A Complete System for the Specification and the Generation of Sign Language Gestures (pp. 227-238)
  Thierry Lebourque; Sylvie Gibet
This paper describes a system called GeSsyCa, which is able to produce synthetic sign language gestures from a high-level specification. This specification is made with a language based both on a discrete description of space and on a movement decomposition inspired by sign language gestures. Communication gestures are represented through symbolic commands which can be described by qualitative data and translated into spatio-temporal targets driving a generation system. Such an approach is possible for the class of generation models controlled through key-point information. The generation model used in our approach is composed of a set of sensori-motor servo-loops. Each of these loops resolves in real time the inversion of the servo-loop, from the direct specification of location targets, while satisfying psycho-motor laws of biological movement. The whole control system is applied to the synthesis of communication and sign language gestures, and a validation of the synthesized movements is presented.
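The core mechanism the abstract describes, a servo loop driven by spatio-temporal targets, can be sketched with a critically damped second-order controller. The gains, hold time, and target list below are arbitrary illustrative values, not GeSsyCa's parameters; the point is only that feeding symbolic location targets to such a loop yields smooth, roughly bell-shaped velocity profiles of the kind associated with biological movement:

```python
import numpy as np

def servo_track(targets, dt=0.01, omega=8.0, hold=0.8):
    """Drive a 2-D end-point through successive spatial targets with a
    critically damped second-order servo loop: the acceleration command
    omega^2 * error - 2 * omega * velocity converges smoothly on each
    target before the next one is presented."""
    pos, vel, path = np.zeros(2), np.zeros(2), []
    steps = round(hold / dt)                 # time alloted to each target
    for tgt in targets:
        for _ in range(steps):
            acc = omega ** 2 * (np.asarray(tgt) - pos) - 2 * omega * vel
            vel = vel + acc * dt             # explicit Euler integration
            pos = pos + vel * dt
            path.append(pos.copy())
    return np.array(path)

# Three symbolic key-point targets traced in sequence:
path = servo_track([(1.0, 0.0), (1.0, 1.0), (0.0, 1.0)])
```

Because the loop is inverted online from the target alone, the movement between key points never has to be stored explicitly, which is what makes a discrete, symbolic specification of space sufficient to drive continuous gesture synthesis.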
Sign Specification and Synthesis (pp. 239-251)
  Olivier Losson; Jean-Marc Vannobel
A description in terms of elementary primitives is proposed with a view to sign language synthesis. Gradual combination leads to global sign specification. Grammatical inflexions are also taken into account in the hierarchic sign description that is built. Particular attention is focused on the synthesis of hand configurations from finger primitives and hand properties, and on location and orientation computation issues. From sign feature editing to virtual animation, this lays the foundations of a new interface intended for deaf people.
Active Character: Dynamic Reaction to the User (pp. 253-264)
  Shan Lu; Seiji Igi
This paper describes a computer-character system intended to create a natural interaction between the computer and the user. Using predefined control rules, it generates the movements of the computer character's head, body, hands, and gaze-lines according to changes in the user's position and gaze-lines. This system acquires information about the user's position, facial region, and gaze-lines by using a vision subsystem and an eye-tracker unit. The vision subsystem detects the presence of a person, estimates the three-dimensional position of the person by using information acquired by a stationary camera, and determines the locations of the face and hands. The reactive motions of the computer character are generated according to a set of predefined if-then rules. Furthermore, a motion-description file is designed to define simple and complex kinds of gestures.
Reactiva'Motion Project: Motion Synthesis Based on a Reactive Representation (pp. 265-268)
  Frédéric Julliard; Sylvie Gibet
This work is part of the SAGA (Synthesis and Analysis of Gestures for Animation) project, whose aim is to develop a real-time animation system for articulated human bodies. The purpose of the Reactiva'Motion project is to propose new methods for designing and synthesizing skilled motions requiring coordination features and reacting to external events, such as walking or juggling. Motions are specified by way of a reactive representation; the reactivity results from an execution that takes into account sensory data provided by the environment.
The Emotional Avatar: Non-verbal Communication Between Inhabitants of Collaborative Virtual Environments (pp. 269-273)
  Marc Fabri; David J. Moore; Dave J. Hobbs
Collaborative Virtual Environments (CVEs) are distributed virtual reality systems with multi-user access. Each inhabitant is represented by a humanoid embodiment, an avatar, making them virtually present in the artificial world. This paper investigates how inhabitants of these CVEs can communicate with each other through channels other than speech, and it is primarily concerned with the visualization and perception of facial expressions and body postures in CVEs. We outline our experimental work and discuss ways of expressing the emotional state of CVE inhabitants through their avatars.

Section 6: Multimodality

Communicative Rhythm in Gesture and Speech (pp. 277-289)
  Ipke Wachsmuth
Motivated by the fundamental role that rhythms apparently play in speech and gestural communication among humans, this study was undertaken to substantiate a biologically motivated model for synchronizing speech and gesture input in human-computer interaction. Our approach presents a novel method which conceptualizes a multimodal user interface on the basis of timed agent systems. We use multiple agents to poll presemantic information from different sensory channels (speech and hand gestures) and to integrate it into multimodal data structures that can be processed by an application system which is itself based on agent systems. This article motivates and presents technical work which exploits rhythmic patterns in the development of biologically and cognitively motivated mediator systems between humans and machines.
Temporal Symbolic Integration Applied to a Multimodal System Using Gestures and Speech (pp. 291-302)
  Timo Sowa; Martin Fröhlich; Marc Erich Latoschik
This paper presents a technical approach to temporal symbol integration intended to be generally applicable in unimodal and multimodal user interfaces. It draws its strength from symbolic data representation and an underlying rule-based system, and is embedded in a multi-agent system. The core method for temporal integration is motivated by findings from cognitive science research. We discuss its application to a gesture recognition task and to speech-gesture integration in a Virtual Construction scenario. Finally, an outlook on an empirical evaluation is given.
A Multimodal Interface Framework for Using Hand Gestures and Speech in Virtual Environment Applications (pp. 303-314)
  Joseph J. LaViola, Jr.
Recent approaches to providing users with a more natural method of interacting with virtual environment applications have shown that more than one mode of input can be both beneficial and intuitive as a communication medium between humans and computer applications. Hand gestures and speech appear to be two of the most logical choices, since users will typically be immersed in a virtual world with limited access to traditional input devices such as the keyboard or the mouse. In this paper, we describe an ongoing research project to develop multimodal interfaces that incorporate 3D hand gestures and speech in virtual environments.

Round Table

Stimulating Research into Gestural Human Machine Interaction (pp. 317-331)
  Marilyn Panayi; David M. Roy; James Richardson
This is the summary report of the roundtable session held at the end of Gesture Workshop '99. This first roundtable aimed to act as a forum for discussion of issues and concerns relating to the achievements, future development, and potential of the field of gestural and sign-language-based human-computer interaction.