| International workshop on human-centered multimedia: overview | | BIBAK | Full-Text | 1-2 | |
| Alejandro Jaimes; Nicu Sebe | |||
| In this paper we describe the scope and goals of the International Workshop
on Human-Centered Multimedia held in conjunction with ACM Multimedia 2007, and
give an overview of some of the papers and ideas presented in the first version
of the workshop held in conjunction with the same conference in 2006. Keywords: human-centered computing, human-computer interfaces, multimedia, multimodal
interaction | |||
| Pillows as adaptive interfaces in ambient environments | | BIBAK | Full-Text | 3-12 | |
| Frank Nack; Thecla Schiphorst; Zeljko Obrenovic; Michiel Kauw-A-Tjoe; Simon de Bakker; Angel Perez Rosillio; Lora Aroyo | |||
| We have developed a set of small interactive throw pillows containing
intelligent touch-sensing surfaces, in order to explore new ways to model the
environment, participants, artefacts, and their interactions, in the context of
expressive non-verbal interaction. We present the overall architecture of the
environment, describing a model of the user, the interface (the interactive
pillows and the devices they can interact with), and the context engine. We
describe the representation and process modules of the context engine and
demonstrate how they support real-time adaptation. We present an evaluation of
the current prototype and conclude with plans for future work. Keywords: haptic sensing, human-centred computing, input devices and strategies,
presence, social interaction, tactile UIs, tangible UI, user experience design | |||
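The abstract gives no implementation detail, so purely as a hedged illustration of what a touch-driven context engine of this kind might do, here is a minimal Python sketch; the gesture names, thresholds, and response table are all invented and are not the authors' design:

```python
# Hypothetical sketch of a touch-driven context engine: classify a short
# window of pressure samples into a gesture and map it to a response.
# Thresholds, gesture names, and the response table are illustrative only.
from statistics import mean, pstdev

RESPONSES = {"stroke": "soft ambient light", "pat": "short sound cue", "hold": "warmth"}

def classify_touch(samples):
    """Very crude gesture heuristic over a window of pressure readings (0..1)."""
    avg, spread = mean(samples), pstdev(samples)
    if avg > 0.6 and spread < 0.05:
        return "hold"          # steady, firm contact
    if spread > 0.25:
        return "pat"           # bursty, on-off contact
    return "stroke"            # light, smoothly varying contact

def adapt(samples):
    gesture = classify_touch(samples)
    return gesture, RESPONSES[gesture]

print(adapt([0.7, 0.71, 0.69, 0.7]))   # ('hold', 'warmth')
print(adapt([0.1, 0.9, 0.05, 0.85]))   # ('pat', 'short sound cue')
```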
| Music emotion recognition: the role of individuality | | BIBAK | Full-Text | 13-22 | |
| Yi-Hsuan Yang; Ya-Fan Su; Yu-Ching Lin; Homer H. Chen | |||
| It has been realized in the music emotion recognition (MER) community that
personal difference, or individuality, has a significant impact on the success of
an MER system in practice. However, no previous work has explicitly taken
individuality into consideration in an MER system. In this paper, the
group-wise MER approach (GWMER) and personalized MER approach (PMER) are
proposed to study the role of individuality. GWMER evaluates the importance of
each individual factor such as sex, personality, and music experience, whereas
PMER evaluates whether the prediction accuracy for a user is significantly
improved if the MER system is personalized for the user. Experimental results
demonstrate the effect of personalization and suggest the need for a better
representation of individuality and for better prediction accuracy. Keywords: individuality, music emotion recognition, personalization | |||
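To make the contrast the paper studies concrete, here is a minimal sketch under assumed conditions: synthetic "acoustic" features, listener ratings built from a shared group component plus a per-listener component, and scikit-learn's SVR as a stand-in regressor. None of this reproduces the authors' pipeline or data; it only illustrates why a model trained on one user's own ratings can beat a pooled model for that user.

```python
# Hedged sketch of group-wise vs. personalized emotion regression (GWMER vs.
# PMER in spirit). Features and ratings are synthetic placeholders.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 12))          # per-clip acoustic features (placeholder)
base_val = 0.8 * X[:, 0]                # valence component shared by everyone
user_bias = 0.5 * X[:, 1]               # one listener's idiosyncratic component
y_group = base_val + rng.normal(0, 0.1, 200)
y_user = base_val + user_bias + rng.normal(0, 0.1, 200)

gwmer = SVR().fit(X[:150], y_group[:150])   # trained on pooled ratings
pmer = SVR().fit(X[:150], y_user[:150])     # trained on this user's ratings

print("group model on the user's test ratings:", gwmer.score(X[150:], y_user[150:]))
print("personal model on the user's test ratings:", pmer.score(X[150:], y_user[150:]))
```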
| Tracking pointing gesture in 3D space for wearable visual interfaces | | BIBAK | Full-Text | 23-30 | |
| Yunde Jia; Shanqing Li; Yang Liu | |||
This paper proposes a practical method of tracking pointing gestures in 3D
space for wearable visual interfaces. We integrate dense depth maps and contour
cues to achieve more stable tracking performance. A strategy of fusing
information from selective attention maps and synthetic feature maps is
presented for locating the focus of attention indicated by the pointing gesture.
We have developed a wearable stereo vision system for pointing gesture
tracking, called POGEST, with FPGA-based dense depth mapping at video rate. The
system enables a wearer to locate and select an object in 3D space using
his/her pointing hand. A common focus of attention shared by a wearer and a
computer is established for natural human-computer interaction. We also discuss
an application of locating objects in 3D virtual environments. Keywords: gesture recognition, hand tracking, real time system, visual interface,
wearable vision | |||
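As a hedged sketch of one step such a system needs (not the POGEST implementation), the snippet below casts a 3D pointing ray from assumed hand and fingertip positions, as might be recovered from a dense depth map, and selects the candidate object nearest the ray:

```python
# Hedged sketch: pick the object a pointing ray indicates. Coordinates and
# object names are invented for illustration.
import numpy as np

def pointed_object(hand, fingertip, objects):
    ray = fingertip - hand
    ray /= np.linalg.norm(ray)
    best, best_d = None, np.inf
    for name, pos in objects.items():
        v = pos - hand
        t = v @ ray                      # projection of the object onto the ray
        if t <= 0:
            continue                     # object lies behind the hand
        d = np.linalg.norm(v - t * ray)  # perpendicular distance to the ray
        if d < best_d:
            best, best_d = name, d
    return best

objects = {"cup": np.array([0.5, 0.1, 1.0]), "lamp": np.array([-0.4, 0.3, 0.8])}
print(pointed_object(np.array([0., 0., 0.]),
                     np.array([0.25, 0.05, 0.5]), objects))   # 'cup'
```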
| Affective multimodal mirror: sensing and eliciting laughter | | BIBAK | Full-Text | 31-40 | |
| Willem A. Melder; Khiet P. Truong; Marten Den Uyl; David A. Van Leeuwen; Mark A. Neerincx; Lodewijk R. Loos; B. Stock Plum | |||
| In this paper, we present a multimodal affective mirror that senses and
elicits laughter. Currently, the mirror contains a vocal and a facial
affect-sensing module, a component that fuses the output of these two modules
to achieve a user-state assessment, a user-state transition model, and a
component that presents audiovisual affective feedback meant to keep the user
in, or bring the user to, the intended state. Interaction with this intelligent interface
involves a full cyclic process of sensing, interpreting, reacting, sensing (of
the reaction effects), and interpreting again. The intention of the mirror is
to evoke positive emotions: to make people laugh and to intensify their
laughter. The first user experience tests showed that users exhibit cooperative
behavior, resulting in mutual user-mirror action-reaction cycles. Most users
enjoyed the interaction with the mirror and were immersed in an excellent user
experience. Keywords: affective mirror, face and voice emotion expression, multimodal laughter
recognition | |||
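A minimal sketch of the sense-fuse-react loop the abstract describes, with invented fusion weights, user states, and feedback actions (the actual modules, fusion scheme, and transition model are not specified in the abstract):

```python
# Hedged sketch: fuse vocal and facial laughter scores, step a simple
# user-state model, and choose feedback aimed at escalating laughter.
STATES = ["neutral", "smiling", "laughing"]
FEEDBACK = {"neutral": "play funny distortion",
            "smiling": "exaggerate distortion",
            "laughing": "mirror the laughter back"}

def fuse(vocal_score, facial_score, w_vocal=0.4):
    return w_vocal * vocal_score + (1 - w_vocal) * facial_score

def next_state(state, score):
    i = STATES.index(state)
    if score > 0.7 and i < 2:
        return STATES[i + 1]   # escalate toward laughter
    if score < 0.3 and i > 0:
        return STATES[i - 1]   # user is cooling down
    return state

state = "neutral"
for vocal, facial in [(0.2, 0.5), (0.8, 0.9), (0.9, 0.8)]:
    state = next_state(state, fuse(vocal, facial))
    print(state, "->", FEEDBACK[state])
```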
| Towards open source authoring and presentation of multimedia content | | BIBAK | Full-Text | 41-46 | |
| Nikitas M. Sgouros; Alexandros Margaritis | |||
| Open source principles and methodologies allow open access to both the
development process and its products. This paper describes a number of
significant research issues for the creation of novel development environments
that support open source authoring of multimedia content and dynamic forms of
personalization during content consumption. These environments should allow an
unlimited number of users to modify existing media content and post their
contributions on the net. In addition, they should allow users to visualize the
current state of development in each project, select a subset of the various
contributions and dynamically compose, view and share with other users new
content versions containing all the selected contributions. Furthermore, the
paper describes a pilot web-services-based implementation for such a system
developed in C# that is now freely available on the Web. Keywords: authoring paradigms, open source, personalization tools, visualization | |||
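As a hedged illustration of the composition step described above (select a subset of contributions, compose a new version), here is a toy Python sketch; the data model of base scenes plus keyed overrides is an assumption, not the paper's C# web-services implementation:

```python
# Hedged sketch: apply a user-selected subset of posted contributions on top
# of the base content to compose a new version. Names are illustrative.
base = {"scene1": "original intro", "scene2": "original ending"}
contributions = {
    ("alice", "scene1"): "re-cut intro",
    ("bob", "scene2"): "alternate ending",
    ("carol", "scene2"): "extended ending",
}

def compose(base, contributions, selected):
    """Return a new content version with the selected contributions applied."""
    version = dict(base)
    for author, scene in selected:
        version[scene] = contributions[(author, scene)]  # later picks override
    return version

print(compose(base, contributions, [("alice", "scene1"), ("carol", "scene2")]))
# {'scene1': 're-cut intro', 'scene2': 'extended ending'}
```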
| Preattentive visualization of information relevance | | BIBAK | Full-Text | 47-56 | |
| Matthias Deller; Achim Ebert; Michael Bender; Stefan Agne; Henning Barthel | |||
| When presenting complex, multidimensional data to users, emphasis of
relevant information plays an important role. Especially when data is arranged
according to several criteria, the simultaneous use of multiple visualization
metaphors frequently results in information overload and unintuitive
visualizations. In this paper, we present a comparison of preattentive visual
features specifically for highlighting relevance of data extracted from
electronic documents in an information-rich virtual environment. Several visual
cues were evaluated with regard to their effectiveness, comprehensibility, and
influence on other visualized features. At the same time we introduce two
innovative data handling techniques to achieve practical applicability of our
system: an intuitive way to reduce visual clutter by filtering information
based on its visual depth, and a way to efficiently utilize visualizations of
different dimensions -- dimensional congruence. Keywords: human factors, performance, user-centered design | |||
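The depth-based filtering idea can be sketched as follows; the fade and cull thresholds are invented for illustration and do not come from the paper:

```python
# Hedged sketch: declutter by fading glyphs with viewing depth and culling
# the most distant ones. Threshold values are hypothetical.
def depth_filter(items, fade_start=5.0, cull_depth=12.0):
    """items: list of (label, depth); returns (label, opacity) for drawn items."""
    drawn = []
    for label, depth in items:
        if depth >= cull_depth:
            continue                      # too far away: cull entirely
        if depth <= fade_start:
            opacity = 1.0                 # near items stay fully opaque
        else:
            opacity = 1.0 - (depth - fade_start) / (cull_depth - fade_start)
        drawn.append((label, round(opacity, 2)))
    return drawn

print(depth_filter([("report.pdf", 2.0), ("memo.txt", 8.5), ("draft.doc", 15.0)]))
# [('report.pdf', 1.0), ('memo.txt', 0.5)]
```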
| Local spatiotemporal descriptors for visual recognition of spoken phrases | | BIBAK | Full-Text | 57-66 | |
| Guoying Zhao; Matti Pietikäinen; Abdenour Hadid | |||
| Visual speech information plays an important role in speech recognition
under noisy conditions or for listeners with hearing impairment. In this paper,
we propose local spatiotemporal descriptors to represent and recognize spoken
isolated phrases based solely on visual input. Positions of the eyes determined
by a robust face and eye detector are used for localizing the mouth regions in
face images. Spatiotemporal local binary patterns extracted from these regions
are used for describing phrase sequences. In our experiments with 817 sequences
from ten phrases and 20 speakers, promising accuracies of 62% and 70% were
obtained in speaker-independent and speaker-dependent recognition,
respectively. In comparison with other methods on the Tulips1 audio-visual
database, our method's accuracy of 92.7% clearly outperforms the others.
Advantages of our approach include local processing and robustness to monotonic
gray-scale changes. Moreover, no error-prone segmentation of moving lips is
needed. Keywords: face and eye detection, local spatiotemporal descriptors, mouth region
localization, visual speech recognition | |||
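The descriptors build on local binary patterns (LBP). As a hedged illustration, the sketch below computes plain spatial LBP codes and their 256-bin histogram for a single image plane; the paper's spatiotemporal variant applies the same operator on the XY, XT, and YT planes of the mouth-region volume and concatenates the resulting histograms.

```python
# Basic 8-neighbour LBP over one plane: threshold neighbours against the
# centre pixel, pack the comparison bits into a code, histogram the codes.
import numpy as np

def lbp_histogram(img):
    img = np.asarray(img, dtype=np.int32)
    c = img[1:-1, 1:-1]                        # centre pixels
    neighbours = [img[0:-2, 0:-2], img[0:-2, 1:-1], img[0:-2, 2:],
                  img[1:-1, 2:],   img[2:, 2:],     img[2:, 1:-1],
                  img[2:, 0:-2],   img[1:-1, 0:-2]]
    codes = np.zeros_like(c)
    for bit, n in enumerate(neighbours):
        codes |= (n >= c).astype(np.int32) << bit
    hist, _ = np.histogram(codes, bins=256, range=(0, 256))
    return hist / hist.sum()                   # normalised 256-bin descriptor

frame = np.random.default_rng(1).integers(0, 256, size=(16, 16))
print(lbp_histogram(frame).shape)              # (256,)
```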
| Interconnected media for human-centered understanding | | BIBAK | Full-Text | 67-76 | |
| Katja Einsfeld; Achim Ebert; Jürgen Wölle | |||
| Today, there are many systems with large amounts of complex data sets.
Visualizing these systems in a way that enlightens the user and provides a
profound understanding of the respective information space is one of the big
information visualization research challenges. Keim states that it is no longer
possible to display an overview of these systems as proposed in Shneiderman's
information seeking mantra. To overcome this limitation and to provide a
solution to the dilemma of time multiplexing vs. space multiplexing techniques,
we propose the context-sensitive use of a collection of animated 3D metaphors.
These metaphors are integrated in a flexible framework called HANNAH. This
provides the possibility to interconnect media of various types in order to
bridge the semantic gap as required for human-centered applications according
to Elgammal. Keywords: 3D, human centered interfaces, information visualization | |||
| Multimedia and human-in-the-loop: interaction as content enrichment | | BIBAK | Full-Text | 77-84 | |
| Bruno Emond | |||
| The current work is part of the broadband visual research program at the
Institute for Information Technology (National Research Council Canada). The
research program is currently focused on developing human-centered multimedia
technology to support large group visual communication and collaboration. This
paper outlines some conceptual foundations for the development of a
human-centered multimedia research tool to capture interaction data, which
could be linked to users' cognitive processing. The approach is based on the
notion of multimedia interaction as content enrichment and on cognitive
modeling methodology. Keywords: cognitive modeling, context, human interaction modeling from multimedia,
task modeling in multimedia systems, unified theories of cognition, user | |||
| Too close for comfort?: adapting to the user's cultural background | | BIBAK | Full-Text | 85-94 | |
| Matthias Rehm; Nikolaus Bee; Birgit Endrass; Michael Wissner; Elisabeth André | |||
| The cultural context of the user is a largely neglected aspect of human
centered computing. This is because culture is a very fuzzy concept and even
with a computational model of culture it remains difficult to derive the
necessary information to recognize the user's cultural background. Such
information is only indirectly available and has to be derived from the
observable multimodal behavior of the user. We propose the use of a
dimensional model of culture that allows computational methods to be applied to
derive a user's cultural background and to adjust the system's behavior
accordingly. To this end, a Bayesian network is applied to allow for the
necessary inferences despite the fact that the given knowledge about the user's
behavior is incomplete and unreliable. Keywords: Bayesian network modeling, cultural computing, embodied conversational
agents | |||
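The abstract does not spell out the network, so as a hedged stand-in the sketch below performs the same kind of inference with a hand-rolled Bayesian update over a single binarised cultural dimension; the likelihood tables are invented for illustration:

```python
# Hedged sketch: update a belief over one cultural dimension from noisy
# multimodal observations. Probabilities are made up for illustration.
P_CULTURE = {"individualist": 0.5, "collectivist": 0.5}   # prior belief
LIKELIHOOD = {  # P(observed behaviour | culture)
    ("distance", "large"): {"individualist": 0.7, "collectivist": 0.3},
    ("distance", "small"): {"individualist": 0.3, "collectivist": 0.7},
    ("gestures", "expansive"): {"individualist": 0.6, "collectivist": 0.4},
    ("gestures", "restrained"): {"individualist": 0.4, "collectivist": 0.6},
}

def update(belief, observation):
    posterior = {c: belief[c] * LIKELIHOOD[observation][c] for c in belief}
    z = sum(posterior.values())
    return {c: p / z for c, p in posterior.items()}

belief = dict(P_CULTURE)
for obs in [("distance", "large"), ("gestures", "expansive")]:
    belief = update(belief, obs)
print(belief)   # belief shifts toward 'individualist'
```

A full Bayesian network additionally models dependencies among the cultural dimensions and observations; this naive per-observation update only conveys the direction of inference.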
| Human support improvements by natural man-machine collaboration | | BIBAK | Full-Text | 95-101 | |
| Motoyuki Ozeki; Yasushi Miyata; Hideki Aoyama; Yuichi Nakamura | |||
| In this paper, we propose a novel framework that improves the recognition
performance of human support systems, and then discuss why our framework is
human-centered. A human-centered system should have high recognition ability
with minimum burden on the user. Our framework aims to satisfy this requirement
by placing an artificial agent between the recognition system and the user. If
the system encounters a situation in which recognition is difficult, the agent
requests the user's help. For example, if an object that the system aims to
recognize is hidden by the user's hand, the agent asks the user to move his/her
hand. Based on this idea, we implemented a prototype system with two modules: a
recognition module that recognizes objects and the user's motions, and an agent
module that asks for the user's cooperative action. In our experiment, the
prototype system recovered around 50-70% of the recognition failures caused by
three typical difficult situations. The user study reveals that our prototype
system has the potential to realize natural and considerate human support
systems. Keywords: human-centered computing, interactive system | |||
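A minimal sketch of the agent-mediated recovery loop the abstract describes, with invented confidence thresholds, failure causes, and prompts (not the authors' implementation):

```python
# Hedged sketch: when the recogniser reports low confidence, the agent asks
# the user for a cooperative action tied to the suspected cause.
PROMPTS = {
    "occluded_by_hand": "Could you move your hand away from the object?",
    "object_too_far": "Could you bring the object a little closer?",
    "motion_blur": "Could you hold the object still for a moment?",
}

def handle(result, threshold=0.8):
    """result: (label, confidence, suspected_cause) from the recognition module."""
    label, confidence, cause = result
    if confidence >= threshold:
        return f"Recognised: {label}"
    return PROMPTS.get(cause, "Could you show me the object again?")

print(handle(("teapot", 0.92, None)))                 # Recognised: teapot
print(handle(("teapot", 0.41, "occluded_by_hand")))   # asks the user for help
```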