| The Role of Iconic Gestures in Production and Comprehension of Language: Evidence from Brain and Behavior | | BIBAK | Full-Text | 1-10 | |
| Asli Özyürek | |||
| Speakers in all cultures and at all ages use gestures as they speak (i.e., cospeech
gestures). There have been different views in the literature with regard to
whether and how a specific type of gestures speakers use, i.e., iconic
gestures, interacts with language processing. Here I review evidence showing
that iconic gestures are produced not merely from spatial and/or motoric
imagery but from an interface representation of imagistic and linguistic
representations during online speaking. Similarly, for comprehension,
neuroimaging and behavioral studies indicate that speech and gesture influence
each other's semantic processing during online comprehension. These findings
show overall that the processing of information in the two modalities interacts
during both comprehension and production of language, arguing against models that
propose independent processing of each modality. They also have implications
for AI models that aim to simulate cospeech gesture use in conversational
agents. Keywords: iconic; cospeech gesture; interface; production; comprehension; brain;
behavior | |||
| Speakers' Use of Interactive Gestures as Markers of Common Ground | | BIBAK | Full-Text | 11-22 | |
| Judith Holler | |||
| This study experimentally manipulates common ground (the knowledge, beliefs
and assumptions interlocutors mutually share [6]) and measures the effect on
speakers' use of interactive gestures to mark common ground. The data consist
of narratives based on a video of which selected scenes were known to both
speaker and addressee (common ground condition) or to only the speaker (no
common ground condition). The analysis focuses on those interactive gestures
that have been described in the literature as 'shared information gestures'
[4]. The findings provide experimental evidence that certain interactive
gestures are indeed linked to common ground. Further, they show that speakers
seem to employ at least two different forms of shared knowledge gestures. This
difference in form appears to be linked to speakers' use of gesture in the
grounding process, as addressees provided feedback more frequently in response
to one of the gesture types. Keywords: common ground; interactive gestures; gestural markers; pointing; palm up
open hand gesture | |||
| Gesture Space and Gesture Choreography in European Portuguese and African Portuguese Interactions: A Pilot Study of Two Cases | | BIBAK | Full-Text | 23-33 | |
| Isabel Galhano Rodrigues | |||
| This pilot study focuses on aspects of cultural variation in cospeech
gestures in two interactions with Angolan and European Portuguese participants.
The elements compared are gesture features -- extension, drawn path,
articulation points -- and generated gesture spaces. Posture, interpersonal
distance and other speech-correlated movements were taken into account as
essential parameters for the definition of different kinds of physical spaces.
Some differences were obvious: gestures performed by Angolan speakers were
articulated at the levels of the shoulders, elbows and wrists, thus tracing
considerably larger angles than those traced by gestures performed by
Portuguese speakers. As the Angolan participants sit close to one another,
their extended arms constantly invade the other participants' personal spaces. Keywords: gesture; cultural variations in cospeech gestures; gesture space | |||
| The Embodied Morphemes of Gaze | | BIBAK | Full-Text | 34-46 | |
| Isabella Poggi; Francesca D'Errico; Alessia Spagnolo | |||
| The paper presents some empirical studies aimed at singling out the meanings
of specific items of gaze and of some of their parameters. It argues that the
values on some parameters of gaze items are not comparable to phonemes in a
verbal language, but rather to morphemes, since by themselves they convey some
specific meanings. The different positions of the upper and lower eyelids are
combined and the meanings conveyed by their possible values are investigated.
It is found that wide open upper eyelids and raised lower eyelids convey
activation and effort, while half open and half closed upper eyelids convey
de-activation and relaxation. These embodied morphemes, stemming from
particular physical states, become part of the meanings of gaze items conveyed
by the combination of these eyelid positions. Keywords: multimodality; gaze; lexicon; morphemes; embodied | |||
| On Factoring Out a Gesture Typology from the Bielefeld Speech-and-Gesture-Alignment Corpus (SAGA) | | BIBAK | Full-Text | 47-60 | |
| Hannes Rieser | |||
| The paper is based on the Bielefeld Speech-And-Gesture-Alignment corpus
(SAGA). From this corpus one video film is taken to establish a typological
grid for iconic and referring gesture types, i.e. a multiple inheritance
hierarchy of types proceeding from single gestural features like hand shape to
sequences of entities filling up the whole gesture space. Types are mapped onto
a partial ontology specifying their respective meaning. Multi-modal meaning is
generated via linking verbal meaning and gestural meaning. How verbal and
gestural meaning interface is shown with an example using a quantified NP. It
is argued that gestural meaning extends the restriction of the original
quantified NP. On the other hand it is shown that gestural meaning is not
strong enough to resolve the underspecification of the lexical information. Keywords: SAGA corpus; iconic gesture; gesture typology; partial ontology;
speech-gesture interface | |||
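As a loose illustration of what a multiple inheritance hierarchy over gestural feature types might look like, the following Python sketch composes feature-level types such as hand shape and movement path into a composite gesture type; all class and attribute names are hypothetical and not the SAGA typology itself.

```python
# Hypothetical sketch of a multiple-inheritance gesture type hierarchy;
# names are illustrative only, not taken from the SAGA corpus.

class GestureType:
    """Root of the gesture type hierarchy."""

class HandShapeFeature(GestureType):
    def __init__(self, hand_shape: str):
        self.hand_shape = hand_shape        # e.g. "flat", "fist", "index-extended"

class PathFeature(GestureType):
    def __init__(self, path: str):
        self.path = path                    # e.g. "straight", "arc", "circle"

class IconicGesture(HandShapeFeature, PathFeature):
    """A composite type inheriting from several feature-level types."""
    def __init__(self, hand_shape: str, path: str, referent: str):
        HandShapeFeature.__init__(self, hand_shape)
        PathFeature.__init__(self, path)
        self.referent = referent            # link into a (partial) ontology

g = IconicGesture(hand_shape="flat", path="arc", referent="bridge")
print(isinstance(g, HandShapeFeature), isinstance(g, PathFeature))  # True True
```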
| Function and Form of Gestures in a Collaborative Design Meeting | | BIBAK | Full-Text | 61-72 | |
| Willemien Visser | |||
| This paper examines the relationship between gestures' function and form in
design collaboration. It adopts a cognitive design research viewpoint. The
analysis is restricted to gesticulations and emblems. The data analysed come
from an empirical study conducted on an architectural design meeting. Based on
a previous analysis of the data, guided by our model of design as the
construction of representations, we distinguish representational and
organisational functions. The results of the present analysis are that, even if
form-function association tendencies exist, gestures with a particular function
may take various forms, and gestural movements of a particular form can
fulfil different functions. Reconsidering these results and other research on
gesture, we formulate the assumption that, if formal characteristics do not
suffice to differentiate gestures by function in collaboration, context-dependent
semantic characteristics may be more appropriate. We also envision the
possibility that closer inspection of the data may reveal tendencies of another
nature. Keywords: Gestural interaction; Cognitive design research; Collaborative design;
Collaboration; Architectural design; Gesticulations; Emblems | |||
| Continuous Realtime Gesture Following and Recognition | | BIBAK | Full-Text | 73-84 | |
| Frédéric Bevilacqua; Bruno Zamborlin; Anthony Sypniewski; Norbert Schnell; Fabrice Guédy; Nicolas H. Rasamimanana | |||
| We present an HMM-based system for real-time gesture analysis. The system
continuously outputs parameters describing the time progression of the gesture
and its likelihood. These parameters are computed by comparing the performed
gesture with stored reference gestures. The method relies on a detailed
modeling of multidimensional temporal curves. Compared to standard HMM systems,
the learning procedure is simplified using prior knowledge, allowing the system
to use a single example for each class. Several applications have been
developed using this system in the context of music education, music and dance
performances and interactive installations. Typically, the estimation of the
time progression allows for the synchronization of physical gestures to sound
files by time stretching/compressing audio buffers or videos. Keywords: gesture recognition; gesture following; Hidden Markov Model; music;
interactive systems | |||
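As an illustration of the general idea (not the authors' implementation), the sketch below builds a left-to-right HMM whose states are the frames of a single reference example and uses the forward algorithm to report, frame by frame, an estimated time progression and a per-frame likelihood; all parameter values are assumptions for the toy example.

```python
import numpy as np

# Minimal "gesture following" sketch: each frame of one reference example
# becomes a state of a left-to-right HMM; the forward algorithm then yields,
# frame by frame, the time progression within the gesture and a likelihood.

def make_template_hmm(reference, sigma=0.1, p_stay=0.4, p_next=0.6):
    """reference: (T, D) array of feature frames from one recorded example."""
    T = len(reference)
    A = np.zeros((T, T))
    for i in range(T):
        A[i, i] = p_stay
        A[i, min(i + 1, T - 1)] += p_next
    return reference, A, sigma

def follow(template, observations):
    means, A, sigma = template
    T = len(means)
    alpha = np.zeros(T)
    alpha[0] = 1.0
    for obs in observations:
        # Gaussian emission likelihood of the incoming frame for every state
        b = np.exp(-0.5 * np.sum((means - obs) ** 2, axis=1) / sigma**2)
        alpha = (alpha @ A) * b
        norm = alpha.sum()
        alpha /= norm
        progression = float(np.argmax(alpha)) / (T - 1)   # 0 = start, 1 = end
        yield progression, norm                            # norm ~ per-frame likelihood

# Toy usage: a reference gesture and a slightly noisy, slower performance of it.
ref = np.linspace(0, 1, 20)[:, None]
perf = np.repeat(np.linspace(0, 1, 10), 2)[:, None] + 0.01
results = list(follow(make_template_hmm(ref), perf))
print(results[-1][0])   # close to 1.0: the performance has reached the gesture's end
```

The continuously updated progression value is what would drive time stretching or compressing of an audio buffer in the applications mentioned above.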
| Multiscale Detection of Gesture Patterns in Continuous Motion Trajectories | | BIBAK | Full-Text | 85-97 | |
| Radu-Daniel Vatavu; Laurent Grisoni; Stefan Gheorghe Pentiuc | |||
| We describe a numerical method for scale invariant detection of gesture
patterns in continuous 2D motions. The algorithm is fast due to our
rejection-based reasoning achieved using a new set of curvature-based functions
which we call Integral Absolute Curvatures. Detection rates above 96% are
reported on a large data set consisting of 72,000 samples with demonstrated low
execution time. The technique can be used to automatically detect gesture
patterns in unconstrained motions in order to enable click-free interactions. Keywords: gesture recognition; pattern detection; multiscale; curvature; integral of
curvature; motion trajectory | |||
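The sketch below illustrates the underlying notion of an accumulated absolute-curvature feature compared across windows of several sizes; it does not reproduce the paper's exact Integral Absolute Curvature functions or its rejection-based speed-ups, and the tolerance threshold is an assumption.

```python
import numpy as np

# Rough sketch of a curvature-integral feature: accumulate the absolute turning
# angle along a 2D trajectory, then compare windows of several sizes against
# the value obtained for a template gesture.

def turning_angles(points):
    """points: (N, 2) array of 2D positions."""
    v = np.diff(points, axis=0)
    ang = np.arctan2(v[:, 1], v[:, 0])
    d = np.diff(ang)
    return np.abs(np.arctan2(np.sin(d), np.cos(d)))   # wrap, then take |.|

def integral_abs_curvature(points):
    return float(np.sum(turning_angles(points)))

def detect(stream, template, scales=(30, 60, 120), tol=0.3):
    """Scan a continuous trajectory for windows whose accumulated curvature
    is close to the template's, at several window sizes (multiscale)."""
    target = integral_abs_curvature(template)
    hits = []
    for w in scales:
        for start in range(0, len(stream) - w):
            k = integral_abs_curvature(stream[start:start + w])
            if abs(k - target) / target < tol:          # crude acceptance test
                hits.append((start, w))
    return hits

# Toy usage: look for a roughly circular stroke inside a longer motion.
t = np.linspace(0, 2 * np.pi, 60)
template = np.c_[np.cos(t), np.sin(t)]
stream = np.r_[np.c_[np.linspace(0, 3, 100), np.zeros(100)], template + [3, 0]]
print(detect(stream, template)[:3])
```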
| Recognition of Gesture Sequences in Real-Time Flow, Context of Virtual Theater | | BIBAK | Full-Text | 98-109 | |
| Ronan Billon; Alexis Nédélec; Jacques Tisseau | |||
| Our aim is to put on a short play featuring a real actor and a virtual
actor, who will communicate through movements and choreography, with mutual
synchronization. Gesture recognition in our context of Virtual Theater is
mainly based on the ability of a virtual actor to perceive gestures made by a
real actor. We present a method for real-time recognition. We use properties
from Principal Component Analysis (PCA) to create a signature for each gesture
and a multiagent system to perform the recognition. Keywords: motion-capture; gesture recognition; virtual theatre; synthetic actor | |||
| Deictic Gestures with a Time-of-Flight Camera | | BIBA | Full-Text | 110-121 | |
| Martin Haker; Martin Böhme; Thomas Martinetz; Erhardt Barth | |||
| We present a robust detector for deictic gestures based on a time-of-flight (TOF) camera, a combined range and intensity image sensor. Pointing direction is used to determine whether the gesture is intended for the system at all and to assign different meanings to the same gesture depending on pointing direction. We use the gestures to control a slideshow presentation: Making a "thumbs-up" gesture while pointing to the left or right of the screen switches to the previous or next slide. Pointing at the screen causes a "virtual laser pointer" to appear. Since the pointing direction is estimated in 3D, the user can move freely within the field of view of the camera after the system was calibrated. The pointing direction is measured with an absolute accuracy of 0.6 degrees and a measurement noise of 0.9 degrees near the center of the screen. | |||
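The pointing geometry can be illustrated with a simple ray-plane intersection. The sketch below assumes the pointing ray runs from a body reference point (here the head) through the hand; this is only one plausible convention, not necessarily the paper's estimator.

```python
import numpy as np

# Illustrative sketch (assumed geometry): the pointing direction is the 3D ray
# from a reference point through the hand, intersected with the screen plane.

def pointing_target(head, hand, screen_point, screen_normal):
    head, hand = np.asarray(head, float), np.asarray(hand, float)
    direction = hand - head
    denom = np.dot(screen_normal, direction)
    if abs(denom) < 1e-9:
        return None                       # ray parallel to the screen
    t = np.dot(screen_normal, np.asarray(screen_point) - head) / denom
    if t <= 0:
        return None                       # pointing away from the screen
    return head + t * direction           # 3D intersection point on the screen

# Toy usage: screen is the plane z = 0, user stands about 2 m away.
hit = pointing_target(head=[0.0, 1.7, 2.0], hand=[0.2, 1.4, 1.5],
                      screen_point=[0, 0, 0], screen_normal=[0, 0, 1])
print(hit)   # roughly where the "virtual laser pointer" would appear
```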
| Towards Analysis of Expressive Gesture in Groups of Users: Computational Models of Expressive Social Interaction | | BIBAK | Full-Text | 122-133 | |
| Antonio Camurri; Giovanna Varni; Gualtiero Volpe | |||
| In this paper we present a survey of our research on analysis of expressive
gesture and how it is evolving towards the analysis of expressive social
interaction in groups of users. Social interaction and its expressive
implications (e.g., emotional contagion, empathy) is an extremely relevant
component for analysis of expressive gesture, since it provides significant
information on the context in which expressive gestures are performed. However, most
of the current systems analyze expressive gestures according to basic emotion
categories or simple dimensional approaches. Moreover, almost all of them are
intended for a single user, so that social interaction is often neglected.
After briefly recalling our pioneering studies on collaborative robot-human
interaction, this paper presents two steps in the direction of novel
computational models and techniques for measuring social interaction: (i) the
interactive installation Mappe per Affetti Erranti for active listening to
sound and music content, and (ii) the techniques we developed for explicitly
measuring synchronization within a group of users. We conclude with the
research challenges we will face in the near future. Keywords: expressive gesture analysis and processing; analysis of social interaction
in small groups; multimodal interactive systems | |||
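One simple, hedged way to quantify movement synchronisation within a group, in the spirit of (but much simpler than) the techniques surveyed here, is to average lag-tolerant cross-correlations of per-participant motion-energy signals:

```python
import numpy as np

# Sketch of a group synchronisation index: per-participant motion-energy
# series are compared pairwise with a lag-tolerant normalised cross-correlation.

def max_xcorr(a, b, max_lag=10):
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)
    best = -1.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            x, y = a[lag:], b[:len(b) - lag]
        else:
            x, y = a[:len(a) + lag], b[-lag:]
        best = max(best, float(np.mean(x * y)))
    return best

def group_synchronisation(signals, max_lag=10):
    """signals: list of 1-D motion-energy series, one per participant."""
    n, scores = len(signals), []
    for i in range(n):
        for j in range(i + 1, n):
            scores.append(max_xcorr(signals[i], signals[j], max_lag))
    return float(np.mean(scores))

# Toy usage: two people moving in (slightly lagged) synchrony, one independent.
t = np.linspace(0, 10, 400)
rng = np.random.default_rng(1)
s1 = np.sin(2 * np.pi * t) + 0.1 * rng.normal(size=t.size)
s2 = np.sin(2 * np.pi * (t - 0.05)) + 0.1 * rng.normal(size=t.size)
s3 = rng.normal(size=t.size)
print(group_synchronisation([s1, s2, s3]))
```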
| On Gestural Variation and Coarticulation Effects in Sound Control | | BIBAK | Full-Text | 134-145 | |
| Tommaso Bianco; Vincent Freour; Nicolas H. Rasamimanana; Frédéric Bevilacqua; René Caussé | |||
| In this paper we focus on the analysis of sound producing gestures in the
musical domain. We investigate the behavior of intraoral pressure exerted by a
trumpet performer in the production of single and concatenated notes.
Investigation is carried out with functional data analysis techniques. Results
show that different variation patterns, depending on dynamic level, occur in
single-note production, suggesting that two different motor control programs
are available. Results from the analysis of consecutive notes give
evidence that the coarticulation between two gesture curves cannot be modelled
by linear superposition, and that local coarticulation is affected by
contiguous units. Keywords: coarticulation; music performance; gesture synthesis; anticipation; motor
program; functional statistical analysis | |||
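The superposition question can be made concrete with a small numerical check (synthetic curves, not the study's functional-data pipeline): predict the concatenated-note pressure profile as a shifted sum of single-note profiles and measure how much of the measured profile remains unexplained.

```python
import numpy as np

# Sketch of a linear-superposition test on synthetic pressure curves; data
# layout and curve shapes are assumptions, not the study's recordings.

def superpose(single, offset, length):
    """Shifted sum of two copies of a single-note profile."""
    out = np.zeros(length)
    out[:len(single)] += single
    out[offset:offset + len(single)] += single
    return out

def residual_ratio(measured, single, offset):
    pred = superpose(single, offset, len(measured))
    return float(np.linalg.norm(measured - pred) / np.linalg.norm(measured))

# Toy usage with bell-shaped pressure curves.
t = np.linspace(0, 1, 100)
single = np.exp(-((t - 0.5) ** 2) / 0.02)
measured = superpose(single, 60, 160) * 0.9 + 0.05   # not a pure superposition
print(residual_ratio(measured, single, 60))          # > 0 if superposition fails
```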
| Gesture Saliency: A Context-Aware Analysis | | BIBAK | Full-Text | 146-157 | |
| Matei Mancas; Donald Glowinski; Gualtiero Volpe; Paolo Coletta; Antonio Camurri | |||
| This paper presents a motion attention model that aims at analyzing gesture
saliency using context-related information at three different levels. At the
first level, motion features are compared in the spatial context of the current
video frame; at the intermediate level, salient behavior is analyzed over a short
temporal context; at the third level, computation of saliency is extended to
longer time windows. An attention/saliency index is computed at the three
levels based on an information theory approach. This model can be considered as
a preliminary step towards context-aware expressive gesture analysis. Keywords: Visual attention; expressive gesture; context-aware analysis | |||
| Towards a Gesture-Sound Cross-Modal Analysis | | BIBAK | Full-Text | 158-170 | |
| Baptiste Caramiaux; Frédéric Bevilacqua; Norbert Schnell | |||
| This article reports on the exploration of a method based on canonical
correlation analysis (CCA) for the analysis of the relationship between gesture
and sound in the context of music performance and listening. This method is a
first step in the design of an analysis tool for gesture-sound relationships.
In this exploration we used motion capture data recorded from subjects
performing free hand movements while listening to short sound examples. We
assume that even though the relationship between gesture and sound might be
more complex, at least part of it can be revealed and quantified by linear
multivariate regression applied to the motion capture data and audio
descriptors extracted from the sound examples. After outlining the theoretical
background, the article shows how the method allows for pertinent reasoning
about the relationship between gesture and sound by analysing the data sets
recorded from multiple and individual subjects. Keywords: Gesture analysis; Gesture-Sound Relationship; Sound Perception; Canonical
Correlation Analysis | |||
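For readers who want to try the core technique, here is a short CCA example on synthetic stand-ins for motion features and audio descriptors (scikit-learn's CCA, not the authors' code; feature names and data are assumptions).

```python
import numpy as np
from sklearn.cross_decomposition import CCA

# Canonical correlation analysis between synthetic "motion" and "audio"
# feature matrices that share a common latent structure.

rng = np.random.default_rng(3)
n_frames = 500
latent = rng.normal(size=(n_frames, 2))                 # shared structure

# Hypothetical motion features (e.g. hand speed, height, ...) and audio
# descriptors (e.g. loudness, brightness, ...), both partly driven by `latent`.
motion = latent @ rng.normal(size=(2, 6)) + 0.3 * rng.normal(size=(n_frames, 6))
audio = latent @ rng.normal(size=(2, 4)) + 0.3 * rng.normal(size=(n_frames, 4))

cca = CCA(n_components=2)
motion_c, audio_c = cca.fit_transform(motion, audio)

# Canonical correlations: how strongly each pair of projected components co-varies.
for k in range(2):
    r = np.corrcoef(motion_c[:, k], audio_c[:, k])[0, 1]
    print(f"component {k}: canonical correlation = {r:.2f}")
```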
| Methods for Effective Sonification of Clarinetists' Ancillary Gestures | | BIBAK | Full-Text | 171-181 | |
| Florian Grond; Thomas Hermann; Vincent Verfaille; Marcelo M. Wanderley | |||
| We present the implementation of two different sonification methods of
ancillary gestures from clarinetists. The sonifications are data driven from
the clarinetist's posture which is captured with a VICON motion tracking
system. The first sonification method is based on the velocities of the
tracking markers, the second method involves a principal component analysis as
a data preprocessing step. Further, we develop a simple complementary visual
display with a similar information content to match the sonification. The
effect of the two sonifications with respect to the movement perception is
studied in an experiment where test subjects annotate the clarinetists'
performance, represented by various combinations of the resulting uni- and
multimodal displays. Keywords: sonification; 3D movement data; ancillary gestures; multimodal displays | |||
| Systematicity and Idiosyncrasy in Iconic Gesture Use: Empirical Analysis and Computational Modeling | | BIBAK | Full-Text | 182-194 | |
| Kirsten Bergmann; Stefan Kopp | |||
| Why an iconic gesture takes its particular form is a largely open question,
given the variations one finds across both situations and speakers. We present
results of an empirical study that analyzes correlations between contextual
factors (referent features, discourse) and gesture features, and tests whether
they are systematic (shared among speakers) or idiosyncratic
(inter-individually different). Based on this, a computational model of gesture
formation is presented that combines data-based, probabilistic and model-based
decision making. Keywords: Iconic gesture; meaning-form mapping; systematicity; idiosyncrasy | |||
| To Beat or Not to Beat: Beat Gestures in Direction Giving | | BIBAK | Full-Text | 195-206 | |
| Mariët Theune; Chris J. Brandhorst | |||
| Research on gesture generation for embodied conversational agents (ECAs)
mostly focuses on gesture types such as pointing and iconic gestures, while
ignoring another gesture type frequently used by human speakers: beat gestures.
Analysis of a corpus of route descriptions showed that although annotators show
very low agreement in applying a 'beat filter' aimed at identifying physical
features of beat gestures, they are capable of reliably distinguishing beats
from other gestures in a more intuitive manner. Beat gestures made up more than
30% of the gestures in our corpus, and they were sometimes used when expressing
concepts for which other gesture types seemed a more obvious choice. Based on
these findings we propose a simple, probabilistic model of beat production for
ECAs. However, it is clear that more research is needed to determine why
direction givers in some cases use beats when other gestures seem more
appropriate, and vice versa. Keywords: gesture and speech; gesture analysis; beats; direction giving | |||
| Requirements for a Gesture Specification Language -- A Comparison of Two Representation Formalisms | | BIBAK | Full-Text | 207-218 | |
| Alexis Heloir; Michael Kipp | |||
| We present a comparative study of two gesture specification languages. Our
aim is to derive requirements for a new, optimal specification language that
can be used to extend the emerging BML standard. We compare MURML, which has
been designed to specify coverbal gestures, and a language we call LV,
originally designed to describe French Sign Language utterances. As a first
step toward a new gesture specification language we created EMBRScript, a
low-level animation language capable of describing multi-channel animations,
which can be used as a foundation for future BML extensions. Keywords: embodied conversational agents; gesture description language; comparative
study | |||
| Statistical Gesture Models for 3D Motion Capture from a Library of Gestures with Variants | | BIBAK | Full-Text | 219-230 | |
| Zhenbo Li; Patrick Horain; André-Marie Pez; Catherine Pelachaud | |||
| A challenge for 3D motion capture by monocular vision is 3D-2D projection
ambiguities that may lead to incorrect poses during tracking. In this paper, we
propose improving 3D motion capture by learning human gesture models from a
library of gestures with variants. This library has been created with virtual
human animations. Gestures are described as Gaussian Process Dynamic Models
(GPDM) and are used as constraints for motion tracking. Given the raw input
poses from the tracker, the gesture model helps to correct ambiguous poses. The
benefit of the proposed method is demonstrated with results. Keywords: Gaussian Process; 3D motion capture; gesture model; gesture library | |||
| Modeling Joint Synergies to Synthesize Realistic Movements | | BIBAK | Full-Text | 231-242 | |
| Matthieu Aubry; Frédéric Julliard; Sylvie Gibet | |||
| This paper presents a new method to generate arm gestures that reproduces
the dynamical properties of human movements. We describe a model of synergy,
defined as a coordinative structure responsible for the flexible organization
of joints over time when performing a movement. We propose a generic method
which incorporates this synergy model into a motion controller system based on
any iterative inverse kinematics technique. We show that this method is
independent of the task and can be parametrized to suit an individual using a
novel learning algorithm based on a motion capture database. The method yields
different models of synergies for reaching tasks, which are compared against the
same set of example motions. The quantitative results obtained allow us to
select the best model of synergies for reaching movements and prove that our
method is independent of the inverse kinematic technique used for the motion
controller. Keywords: Virtual Humanoids; Movement Synthesis; Synergy; Reaching Gesture; Joint
Synergies; Movement Learning | |||
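As a rough, hedged illustration of weighting joint contributions inside an iterative inverse-kinematics loop, the sketch below uses fixed per-joint weights on a planar two-joint arm with Jacobian-transpose updates; the paper's learned, time-varying synergy model is considerably richer.

```python
import numpy as np

# Simplified stand-in for joint "synergies" in an IK loop: fixed weights scale
# how much each joint participates in the Jacobian-transpose correction.

L1, L2 = 0.3, 0.25                       # upper-arm and forearm lengths (m)

def fk(q):
    x = L1 * np.cos(q[0]) + L2 * np.cos(q[0] + q[1])
    y = L1 * np.sin(q[0]) + L2 * np.sin(q[0] + q[1])
    return np.array([x, y])

def jacobian(q):
    s1, c1 = np.sin(q[0]), np.cos(q[0])
    s12, c12 = np.sin(q[0] + q[1]), np.cos(q[0] + q[1])
    return np.array([[-L1 * s1 - L2 * s12, -L2 * s12],
                     [ L1 * c1 + L2 * c12,  L2 * c12]])

def reach(target, q=np.zeros(2), synergy=np.array([1.0, 0.5]),
          step=1.0, iters=200):
    """`synergy` scales how much each joint contributes to the correction."""
    for _ in range(iters):
        err = target - fk(q)
        dq = synergy * (jacobian(q).T @ err)   # weighted Jacobian-transpose step
        q = q + step * dq
    return q

q_final = reach(np.array([0.35, 0.25]))
print(q_final, fk(q_final))               # end effector close to the target
```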
| Multimodal Interfaces in Support of Human-Human Interaction | | BIBA | Full-Text | 243-244 | |
| Alex Waibel | |||
| After building computers that paid no attention to communicating with humans, the computer science community has devoted significant effort over the years to more sophisticated interfaces that put the "human in the loop" of computers. These interfaces have improved usability by providing more appealing output (graphics, animations), easier-to-use input methods (mouse, pointing, clicking, dragging) and more natural interaction modes (speech, vision, gesture, etc.). Yet all these interaction modes have still mostly been restricted to human-machine interaction and made severely limiting assumptions about sensor setup and expected human behavior. (For example, a gesture might have to be presented clearly in front of the camera and have a clear start and end time.) Such assumptions, however, are unrealistic and have, consequently, limited the potential productivity gains, as the machine still operates in a passive mode, requiring the user to pay considerable attention to the technological artifact. | |||
| Gestures for Large Display Control | | BIBAK | Full-Text | 245-256 | |
| Wim Fikkert; Paul E. van der Vet; Gerrit C. van der Veer; Anton Nijholt | |||
| The hands are highly suited to interact with large public displays. It is,
however, not apparent which gestures come naturally for easy and robust use of
the interface. We first explored how uninstructed users gesture when asked to
perform basic tasks. Our subjects gestured with great similarity and readily
produced gestures they had seen before, though not necessarily in a human-computer
interface. In a second investigation these and other gestures were rated by a
hundred subjects. A gesture set for explicit command-giving to large displays
emerged from these ratings. It is notable that for a selection task, tapping
the index finger in mid-air, like with a traditional mouse, scored highest by
far. It seems that the mouse has become a metaphor in everyday life. Keywords: Human-centered computing; user interfaces; input devices and strategies;
intuitive hand gestures; large display interaction | |||
| Gestural Attributions as Semantics in User Interface Sound Design | | BIBAK | Full-Text | 257-268 | |
| Kai Tuuri | |||
| This paper proposes a gesture-based approach to user interface sound design,
which utilises projections of body movements in sounds as meaningful
attributions. The approach is founded on an embodied conceptualisation of human
cognition and is justified through a literature review on the subject of
interpersonal action understanding. According to the resulting hypothesis,
stereotypical gestural cues, which correlate with, e.g., a certain
communicative intention, represent specific non-linguistic meanings. Based on
this theoretical framework, a model of a process is also outlined where
stereotypical gestural cues are implemented in sound design. Keywords: gestures; user interfaces; sound design; semantics | |||
| Gestural Interfaces for Elderly Users: Help or Hindrance? | | BIBAK | Full-Text | 269-280 | |
| Christian Stößel; Hartmut Wandke; Lucienne T. M. Blessing | |||
| In this paper we investigate whether finger gesture input is a suitable
input method, especially for older users (60+), with respect to age-related
changes in sensory, cognitive and motor abilities. We present a study in which
we compare a group of older users to a younger user group on a set of 42
different finger gestures on measures of speed and accuracy. The size and the
complexity of the gestures varied systematically in order to find out how these
factors interact with age on gesture performance. The results showed that older
users are a little slower, but not necessarily less accurate than younger
users, even on smaller screen sizes, and across different levels of gesture
complexity. This indicates that gesture-based interaction could be a suitable
input method for older adults. At least not a hindrance -- maybe even a help. Keywords: Gestural interfaces; aging psychology; human factors | |||
| Gestures in Human-Computer Interaction -- Just Another Modality? | | BIBAK | Full-Text | 281-288 | |
| Antti Pirhonen | |||
| The traditional framework in human-computer studies is based on a simple
input-output model of interaction. In many cases, however, splitting
interaction into input and output is not necessarily appropriate. Gestures are
a good example of a modality that is difficult or inappropriate to
conceptualise within the traditional input-output paradigm. In the search for
a more appropriate interaction paradigm, gestures, as a modality, have the
potential to work as a meta-modality in terms of which all other modalities could be
analysed. This paper proposes the use of gestures and gestural metaphors in a
central role in interaction design, and presents a case study as an
illustration of the point. Keywords: gesture; metaphor; human-computer interaction | |||
| Body Posture Estimation in Sign Language Videos | | BIBAK | Full-Text | 289-300 | |
| François Lefebvre-Albaret; Patrice Dalle | |||
| This article deals with posture reconstruction from a single-view (monocular)
video of a signed utterance. Our method makes no use of additional sensors or
visual markers. The head and the two hands are tracked by means of a particle
filter. The elbows are detected as local maxima of a convolution. A non-linear
filter is first used to remove outliers, then criteria drawn from French Sign
Language phonology are used to disambiguate the hands. The posture
reconstruction is achieved using inverse kinematics, Kalman smoothing, and the
correlation between strong- and weak-hand depth that can be
noticed in the signed utterances. The article ends with a quantitative and
qualitative evaluation of the reconstruction. We show how the results could be
used in the framework of automatic Sign Language video processing. Keywords: Sign Language; Posture Reconstruction; Inverse Kinematics; Mono Vision | |||
| Influence of Handshape Information on Automatic Sign Language Recognition | | BIBAK | Full-Text | 301-312 | |
| Gineke A. ten Holt; Marcel J. T. Reinders; Emile A. Hendriks; Huib de Ridder; Andrea J. van Doorn | |||
| Research on automatic sign language recognition (ASLR) has mostly been
conducted from a machine learning perspective. We propose to implement results
from human sign recognition studies in ASLR. In a previous study it was found
that handshape is important for human sign recognition. The current paper
describes the implementation of this conclusion: using handshape in ASLR.
Handshape information in three different representations is added to an
existing ASLR system. The results show that recognition improves, except for
one representation. This refutes the idea that extra (handshape) information
will always improve recognition. Results also vary per sign: some sign
classifiers improve greatly, others are unaffected, and rare cases even show
decreased performance. Adapting classifiers to specific sign types could be the
key for future ASLR. Keywords: sign language; automatic sign language recognition; handshape representation | |||
| Towards Interactive Web-Based Virtual Signers: First Step, a Platform for Experimentation Design | | BIBAK | Full-Text | 313-324 | |
| Jean-Paul Sansonnet; Annelies Braffort; Cyril Verrecchia | |||
| In this paper, we present a Web-based framework for interactive Sign
Language using virtual signing agents. The main feature of this framework is
that it is a full DOM-integrated architecture. Firstly, we discuss the
advantages and the constraints raised by the implementation of proper
interactive Virtual Signers within this full DOM-integrated approach. Secondly,
we discuss an experimental study about Web-based Virtual Signers that take
advantage of the specific interactivity provided by our framework. This study
deals with a structure of Sign Language utterances that requires dynamic
handling of spatio-temporal variability and coarticulation stances in the sign
generation phase. Keywords: Web-based Virtual Signers; Sign Language dynamic generation; Sign
variability; Coarticulation | |||
| Toward Modeling Sign Language Coarticulation | | BIBAK | Full-Text | 325-336 | |
| Jérémie Segouat; Annelies Braffort | |||
| This article presents a study on coarticulation modeling in French Sign
Language. Our aim is to use this model to provide information to deaf people,
by means of a virtual signer. We propose a definition of "coarticulation",
based on an overview of the literature. We explain the methodology we have set
up: from video corpus design, through corpus annotation and analysis, to the
extraction of feature correlations. We present first results and outline what will
be the next steps of this study. Keywords: Sign Language; Coarticulation Modeling; Corpus Design; Corpus Annotation | |||