| Preferences and Patterns of Paralinguistic Voice Input to Interactive Media | | BIBAK | Full-Text | 3-12 | |
| Sama'a Al Hashimi | |||
| This paper investigates the factors that affect users' preferences for non-speech sound input and that determine their vocal and behavioral interaction patterns with a non-speech voice-controlled system. It sheds light on shyness as a psychological determinant and on vocal endurance as a physiological factor. It hypothesizes that there are certain types of non-speech sounds, such as whistling, that shy users are more prone to resort to as input. It also hypothesizes that some non-speech sounds are more suitable for interactions that involve prolonged or continuous vocal control. To examine the validity of these hypotheses, the paper presents and employs a voice-controlled Christmas tree in a preliminary experimental approach to investigate the factors that may affect users' preferences and interaction patterns during non-speech voice control, factors that should guide a developer's choice of non-speech input for a voice-controlled system. Keywords: Paralanguage; vocal control; preferences; voice-physical | |||
| "Show and Tell": Using Semantically Processable Prosodic Markers for Spatial Expressions in an HCI System for Consumer Complaints | | BIBAK | Full-Text | 13-22 | |
| Christina Alexandris | |||
| We attempt to integrate the observed relation between prosodic information and the degree of precision and lack of ambiguity into the processing of the user's spoken input in the CitizenShield ("POLIAS") system for consumer complaints about commercial products. The prosodic information contained in the spoken descriptions provided by consumers is preserved through the use of semantically processable markers, which are classifiable within an ontological framework and signal prosodic prominence in the speaker's spoken input. Semantic processability relates to the reusability and/or extensibility of the present system to multilingual applications or even to other types of monolingual applications. Keywords: Prosodic prominence; Ontology; Selectional Restrictions; Indexical
Interpretation for Emphasis; Deixis; Ambiguity resolution; Spatial Expressions | |||
| Exploiting Speech-Gesture Correlation in Multimodal Interaction | | BIBAK | Full-Text | 23-30 | |
| Fang Chen; Eric H. C. Choi; Ning Wang | |||
| This paper introduces a study on deriving a set of quantitative
relationships between speech and co-verbal gestures for improving multimodal
input fusion. The initial phase of this study explores the prosodic features of
two human communication modalities, speech and gestures, and investigates the
nature of their temporal relationships. We have studied a corpus of natural
monologues with respect to frequent deictic hand gesture strokes, and their
concurrent speech prosody. The prosodic features from the speech signal have
been co-analyzed with the visual signal to learn the correlation of the
prominent spoken semantic units with the corresponding deictic gesture strokes.
Subsequently, the extracted relationships can be used for disambiguating hand
movements, correcting speech recognition errors, and improving input fusion for
multimodal user interactions with computers. Keywords: Multimodal user interaction; gesture; speech; prosodic features; lexical
features; temporal correlation | |||
| Pictogram Retrieval Based on Collective Semantics | | BIBA | Full-Text | 31-39 | |
| Heeryon Cho; Toru Ishida; Rieko Inaba; Toshiyuki Takasaki; Yumiko Mori | |||
| To retrieve pictograms that have semantically ambiguous interpretations, we propose a semantic relevance measure which uses pictogram interpretation words collected from a web survey. The proposed measure uses the ratio and similarity information contained in a set of pictogram interpretation words to (1) retrieve pictograms that have an implicit meaning but no explicit interpretation word and (2) rank pictograms sharing common interpretation word(s) according to query relevance, which reflects the interpretation ratio. | |||
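A minimal sketch of a ratio-and-similarity relevance score in the spirit of the abstract above; the word-similarity function, the example data, and the exact weighting are illustrative assumptions rather than the paper's actual formulation.

```python
# Minimal sketch (assumptions: toy bigram similarity and made-up survey data).
from collections import Counter

def word_similarity(a, b):
    """Toy similarity: 1.0 for identical words, else character-bigram Jaccard."""
    if a == b:
        return 1.0
    ba = {a[i:i + 2] for i in range(len(a) - 1)}
    bb = {b[i:i + 2] for i in range(len(b) - 1)}
    return len(ba & bb) / len(ba | bb) if (ba | bb) else 0.0

def relevance(query, interpretations):
    """interpretations: interpretation words collected for one pictogram."""
    counts = Counter(interpretations)
    total = sum(counts.values())
    # Each interpretation word contributes its survey ratio weighted by its
    # similarity to the query, so implicit meanings can still be retrieved.
    return sum((c / total) * word_similarity(query, w) for w, c in counts.items())

pictograms = {
    "pictA": ["happy", "happy", "smile", "joy"],
    "pictB": ["sad", "cry", "tears"],
}
ranked = sorted(pictograms, key=lambda p: relevance("happiness", pictograms[p]),
                reverse=True)
print(ranked)  # 'pictA' ranks above 'pictB' for the query "happiness"
```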
| Enrich Web Applications with Voice Internet Persona Text-to-Speech for Anyone, Anywhere | | BIBAK | Full-Text | 40-49 | |
| Min Chu; Yusheng Li; Xin Zou; Frank K. Soong | |||
| To embrace the coming age of rich Internet applications and to enrich
applications with voice, we propose a Voice Internet Persona (VIP) service.
Unlike current text-to-speech (TTS) applications, in which users need to painstakingly install TTS engines on their own machines and do all customization themselves, our VIP service consists of a simple, easy-to-use platform that enables users to voice-empower their content, such as podcasts or voice greeting cards. We offer three user interfaces for users to create and tune new VIPs with built-in tools, share their VIPs via this new platform, and generate expressive speech content with selected VIPs. The goal of this work is to popularize TTS features in additional scenarios, such as entertainment and gaming, through the easy-to-access VIP platform. Keywords: Voice Internet Persona; Text-to-Speech; Rich Internet Application | |||
| Using Recurrent Fuzzy Neural Networks for Predicting Word Boundaries in a Phoneme Sequence in Persian Language | | BIBAK | Full-Text | 50-59 | |
| Mohammad-Reza Feizi-Derakhshi; Mohammad Reza Kangavari | |||
| Word boundary detection has applications in speech processing systems. The problem this paper tries to solve is to separate the words in a sequence of phonemes when there are no delimiters between them. First, a recurrent fuzzy neural network (RFNN), together with its relevant structure, is proposed and its learning algorithm is presented. Next, this RFNN is used to predict word boundaries. Experiments were conducted to determine the complete structure of the RFNN. In this paper, three methods are proposed for encoding the input phonemes, and their performance has been evaluated. Further experiments determine the required number of fuzzy rules, and the performance of the RFNN in predicting word boundaries is then tested. Experimental results show acceptable performance. Keywords: Word boundary detection; Recurrent fuzzy neural network (RFNN); Fuzzy neural network; Fuzzy logic; Natural language processing; Speech processing | |||
| Subjective Measurement of Workload Related to a Multimodal Interaction Task: NASA-TLX vs. Workload Profile | | BIBAK | Full-Text | 60-69 | |
| Dominique Fréard; Eric Jamet; Olivier Le Bohec; Gérard Poulain; Valérie Botherel | |||
| This paper addresses workload evaluation in the framework of a multimodal application. Two multidimensional subjective workload rating instruments are compared. The goal is to analyze the diagnostics obtained on four implementations of an application task. In addition, an Automatic Speech Recognition (ASR) error was introduced in one of the two trials. Eighty subjects participated in the experiment. Half of them rated their subjective workload with NASA-TLX and the other half rated it with Workload Profile (WP) enriched with two stress-related scales. Discriminant and variance analyses revealed better sensitivity with WP. The results obtained with this instrument led to hypotheses on the cognitive activities of the subjects during interaction. Furthermore, WP permitted us to classify the two strategies offered for error recovery. We conclude that WP is more informative for the task tested. WP seems to be the better diagnostic instrument for multimodal system design. Keywords: Human-Computer Dialogue; Workload Diagnostic | |||
| Menu Selection Using Auditory Interface | | BIBAK | Full-Text | 70-75 | |
| Koichi Hirota; Yosuke Watanabe; Yasushi Ikei | |||
| An approach to auditory interaction with a wearable computer is investigated. Menu selection and keyboard input interfaces are experimentally implemented by integrating a pointing interface based on motion sensors with an auditory localization system based on HRTFs. The performance of users, or the efficiency of interaction, is evaluated through experiments with subjects. The average time for selecting a menu item was approximately 5-9 seconds depending on the geometric configuration of the menu, and average key input performance was approximately 6 seconds per character. The results did not support our expectation that auditory localization of menu items would be a helpful cue for accurate pointing. Keywords: auditory interface; menu selection; keyboard input | |||
| Analysis of User Interaction with Service Oriented Chatbot Systems | | BIBAK | Full-Text | 76-83 | |
| Marie-Claire Jenkins; Richard Churchill; Stephen Cox; Dan Smith | |||
| Service oriented chatbot systems are designed to help users access
information from a website more easily. The system uses natural language
responses to deliver the relevant information, acting like a customer service
representative. In order to understand what users expect from such a system and how they interact with it, we carried out two experiments which highlighted
different aspects of interaction. We observed the communication between humans
and the chatbots, and then between humans, applying the same methods in both
cases. These findings have enabled us to focus on aspects of the system which
directly affect the user, meaning that we can further develop a realistic and
helpful chatbot. Keywords: human-computer interaction; chatbot; question-answering; communication;
intelligent system; natural language; dialogue | |||
| Performance Analysis of Perceptual Speech Quality and Modules Design for Management over IP Network | | BIBAK | Full-Text | 84-93 | |
| Jinsul Kim; Hyun-Woo Lee; Won Ryu; Seung Ho Han; Minsoo Hahn | |||
| A VoIP system delivering voice packets with guaranteed QoS (Quality of Service) is responsible for digitizing, encoding, decoding, and playing out the speech signal. The key point is that different parts of speech transmitted over IP networks have different perceptual importance, and each part does not contribute equally to the overall voice quality. In this paper, we propose new additive-noise reduction algorithms to improve voice quality over IP networks and present a performance evaluation of perceptual speech quality over IP networks in additive noise environments during real-time phone-call service. The proposed noise reduction algorithm is applied as a pre-processing method before speech coding and as a post-processing method after speech decoding in a single-microphone VoIP system. For noise reduction, this paper proposes a Wiener filter optimized to the estimated SNR of the noisy speech for speech enhancement. Various noisy conditions, including white Gaussian, office, babble, and car noises, are considered with the G.711 codec. We also provide critical message report procedures and management schemes to guarantee QoS over IP networks. Finally, the experimental results show that the proposed algorithm and method improve speech quality. Keywords: VoIP; Noise Reduction; QoS; Speech Packet; IP Network | |||
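As a rough illustration of the SNR-dependent Wiener gain mentioned in the abstract above, the sketch below applies the textbook per-bin gain G = SNR / (1 + SNR); the frame size, windowing, and noise-PSD estimate are assumptions, not details taken from the paper.

```python
# Minimal sketch (assumptions: frame/hop sizes, Hann windowing, and a noise PSD
# estimated from noise-only frames; these details are not from the paper).
import numpy as np

def wiener_enhance(noisy, noise_psd, frame=256, hop=128):
    """Apply a per-bin Wiener gain G = SNR / (1 + SNR), frame by frame."""
    window = np.hanning(frame)
    out = np.zeros(len(noisy))
    for start in range(0, len(noisy) - frame, hop):
        spec = np.fft.rfft(noisy[start:start + frame] * window)
        # A priori SNR estimate per frequency bin, floored at zero.
        snr = np.maximum(np.abs(spec) ** 2 / (noise_psd + 1e-12) - 1.0, 0.0)
        gain = snr / (1.0 + snr)                      # Wiener gain from the SNR
        out[start:start + frame] += np.fft.irfft(gain * spec) * window
    return out

# Toy usage: a sine in white Gaussian noise, noise PSD from noise-only frames.
rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 440 * np.arange(16000) / 8000.0)
noisy = clean + 0.3 * rng.standard_normal(16000)
noise_frames = 0.3 * rng.standard_normal((20, 256)) * np.hanning(256)
noise_psd = np.mean(np.abs(np.fft.rfft(noise_frames, axis=1)) ** 2, axis=0)
enhanced = wiener_enhance(noisy, noise_psd)
```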
| A Tangible User Interface with Multimodal Feedback | | BIBAK | Full-Text | 94-103 | |
| Laehyun Kim; Hyunchul Cho; Se Hyung Park; Manchul Han | |||
| A tangible user interface allows the user to manipulate digital information intuitively through physical objects that are connected to digital content spatially and computationally. It takes advantage of the human ability to manipulate delicate objects precisely. In this paper, we present a novel tangible user interface, the SmartPuck system, which consists of a PDP-based table display, the SmartPuck, which has a built-in actuated wheel and a button for physical interaction, and a sensing module that tracks the position of the SmartPuck. Unlike the passive physical objects in previous systems, the SmartPuck has built-in sensors and an actuator providing multimodal feedback: visual feedback from LEDs, auditory feedback from a speaker, and haptic feedback from the actuated wheel. It gives the feeling of working with a physical object. We introduce new tangible menus to control digital content just as we interact with physical devices. In addition, the system is used to navigate geographical information in the Google Earth program. Keywords: Tangible User Interface; Tabletop display; Smart Puck System | |||
| Minimal Parsing Key Concept Based Question Answering System | | BIBAK | Full-Text | 104-113 | |
| Sunil Kumar Kopparapu; Akhilesh Srivastava; P. V. S. Rao | |||
| The home page of a company is an effective means of showcasing its products and technology. Companies invest major effort, time, and money in designing their web pages to enable their users to access the information they are looking for as quickly and as easily as possible. In spite of all these efforts, it is not uncommon for a user to spend a sizable amount of time trying to retrieve a particular piece of information. Today, the user has to go through several hyperlink clicks or manually search the pages displayed by the site search engine to get to the information. Much time is wasted if the required information does not exist on that website. With websites being increasingly used as sources of information about companies and their products, there is a need for a more convenient interface. In this paper we discuss a system based on a set of Natural Language Processing (NLP) techniques which addresses this problem. The system enables a user to ask for information from a particular website in free-style natural English. The NLP-based system responds to the query by 'understanding' its intent and then using this understanding to retrieve relevant information from its unstructured info-base or structured database and present it to the user. The interface is called UniqliQ because it spares the user from clicking through several hyperlinked pages. The core of UniqliQ is its ability to understand the question without formally parsing it. The system is based on identifying key-concepts and keywords and then using them to retrieve information. This approach enables the UniqliQ framework to be used for different input languages with minimal architectural changes. Further, the key-concept -- keyword approach gives the system an inherent ability to provide approximate answers when exact answers are not present in the information database. Keywords: NL Interface; Question Answering System; Site search engine | |||
| Customized Message Generation and Speech Synthesis in Response to Characteristic Behavioral Patterns of Children | | BIBAK | Full-Text | 114-123 | |
| Ho-Joon Lee; Jong C. Park | |||
| There is a growing need for a user-friendly human-computer interaction
system that can respond to various characteristics of a user in terms of
behavioral patterns, mental state, and personalities. In this paper, we present
a system that generates appropriate natural language spoken messages with
customization for user characteristics, taking into account the fact that human
behavioral patterns usually reveal one's mental state or personality
subconsciously. The system is targeted at handling various situations for five-year-old kindergarteners by giving them caring words during their everyday
lives. With the analysis of each case study, we provide a setting for a
computational method to identify user behavioral patterns. We believe that the
proposed link between the behavioral patterns and the mental state of a human
user can be applied to improve not only user interactivity but also
believability of the system. Keywords: natural language processing; customized message generation; behavioral
pattern recognition; speech synthesis; ubiquitous computing | |||
| Multi-word Expression Recognition Integrated with Two-Level Finite State Transducer | | BIBAK | Full-Text | 124-133 | |
| Keunyong Lee; Ki-Soen Park; Yong-Seok Lee | |||
| This paper proposes a two-level finite state transducer for recognizing multi-word expressions (MWEs) in a two-level morphological parsing environment. In the proposed Finite State Transducer with Bridge State (FSTBS), we define a Bridge State (concerned with connecting the parts of a multi-word expression), a Bridge Character (used in connecting a multi-word expression), and a two-level rule that extends the existing FST. FSTBS can recognize both fixed-type and flexible-type MWEs that are expressible as regular expressions, because it recognizes MWEs during morphological parsing. Keywords: Multi-word Expression; Two-level morphological parsing; Finite State
Transducer | |||
| Towards Multimodal User Interfaces Composition Based on UsiXML and MBD Principles | | BIBAK | Full-Text | 134-143 | |
| Sophie Lepreux; Anas Hariri; José Rouillard; Dimitri Tabary; Jean-Claude Tarby; Christophe Kolski | |||
| In software design, the demand for reuse has driven the rise of web services, components, and other techniques. These techniques allow the reuse of code associated with technical aspects (such as software components). With the development of business components that integrate technical aspects with HCI, the issue of composition has emerged. Our previous work concerned GUI composition based on a UIDL such as UsiXML. With the generalization of Multimodal User Interfaces (MUI), MUI composition principles also have to be studied. This paper aims to extend the existing basic composition principles to handle multimodal interfaces. The same principle as in the previous work, based on tree algebra, can be used at another level (AUI) of the UsiXML framework to support the composition of Multimodal User Interfaces. The paper presents a case study on a food ordering system based on multimodal interaction (coupling GUI and MUI). A conclusion and future work in the HCI domain are presented. Keywords: User interfaces design; UsiXML; AUI (Abstract User Interface); Multimodal
User Interfaces; Vocal User Interfaces | |||
| m-LoCoS UI: A Universal Visible Language for Global Mobile Communication | | BIBAK | Full-Text | 144-153 | |
| Aaron Marcus | |||
| The LoCoS universal visible language developed by the graphic/sign designer
Yukio Ota in Japan in 1964 may serve as a usable, useful, and appealing basis
for mobile phone applications that can provide capabilities for communication
among people who do not share a spoken language. User-interface design issues
including display and input are discussed in conjunction with prototype screens
showing the use of LoCoS for a mobile phone. Keywords: design; interface; language; LoCoS; mobile; phone; user | |||
| Developing a Conversational Agent Using Ontologies | | BIBA | Full-Text | 154-164 | |
| Manish Mehta; Andrea Corradini | |||
| We report on the benefits achieved by using ontologies in the context of a fully implemented conversational system that allows rich real-time communication, through spontaneous speech and gesture, between human users, primarily 10 to 18 years old, and a 3D graphical character. In this paper, we focus on the categorization of ontological resources into domain-independent and domain-specific components in an effort to both augment the agent's conversational capabilities and enhance the system's reusability across conversational domains. We also present a novel method of exploiting the existing ontological resources along with the Google directory categorization for a semi-automatic understanding of user utterances on general-purpose topics such as movies and games. | |||
| Conspeakuous: Contextualising Conversational Systems | | BIBA | Full-Text | 165-175 | |
| S. Arun Nair; Amit Anil Nanavati; Nitendra Rajput | |||
| There has been a tremendous increase in the amount and type of information that is available through the Internet and through the various sensors that now pervade our daily lives. Consequently, the field of context-aware computing has contributed significantly by providing new technologies to mine and use the available context data. We present Conspeakuous -- an architecture for modeling, aggregating and using context in spoken language conversational systems. Since Conspeakuous is aware of the environment through different sources of context, it helps make the conversation more relevant to the user, thus reducing the user's cognitive load. Additionally, the architecture allows the learning of various user/environment parameters to be represented as a source of context. We built a sample tourist information portal application based on the Conspeakuous architecture and conducted user studies to evaluate the usefulness of the system. | |||
| Persuasive Effects of Embodied Conversational Agent Teams | | BIBA | Full-Text | 176-185 | |
| Hien Nguyen; Judith Masthoff; Peter Edwards | |||
| In a persuasive communication, not only the content of the message but also its source and the type of communication can influence its persuasiveness on the audience. This paper compares the effects on the audience of direct versus indirect communication, one-sided versus two-sided messages, and one agent presenting the message versus a team presenting the message. | |||
| Exploration of Possibility of Multithreaded Conversations Using a Voice Communication System | | BIBAK | Full-Text | 186-195 | |
| Kanayo Ogura; Kazushi Nishimoto; Kozo Sugiyama | |||
| Everyday voice conversations require people to obey the turn-taking rule and to keep to a single topic thread; therefore, they are not always an effective way to communicate. Hence, we propose "ChaTEL," a voice communication system for
facilitating real-time multithreaded voice communications. ChaTEL has two
functions to support multithreaded communications: a function to indicate to
whom the user talks and a function to indicate which utterance the user
responds to. Comparing ChaTEL with a baseline system that does not have these
functions, we show that multithreaded conversations occur more frequently with
ChaTEL. Moreover, we discuss why ChaTEL can facilitate multi-threaded
conversations based on analyses of users' speaking and listening behaviors. Keywords: CMC (Computer-Mediated Communication); Multithreaded Conversations | |||
| A Toolkit for Multimodal Interface Design: An Empirical Investigation | | BIBAK | Full-Text | 196-205 | |
| Dimitrios I. Rigas; Mohammad M. Alsuraihi | |||
| This paper introduces a comparative multi-group study carried out to investigate the use of multimodal interaction metaphors (visual, oral, and aural) for improving the learnability (or usability from first-time use) of interface-design environments. An initial survey gathered views about the effectiveness of, and satisfaction with, employing speech and speech recognition to solve some common usability problems. The investigation was then carried out empirically by testing the usability parameters (efficiency, effectiveness, and satisfaction) of three design toolkits (TVOID, OFVOID, and MMID) built especially for the study. TVOID and OFVOID interacted with the user visually only, using typical and time-saving interaction metaphors. The third environment, MMID, added another modality through vocal and aural interaction. The results showed that using vocal commands and the mouse concurrently to complete tasks from first-time use was more efficient and more effective than using visual-only interaction metaphors. Keywords: interface-design; usability; learnability; effectiveness; efficiency;
satisfaction; visual; oral; aural; multimodal; auditory-icons; earcons; speech;
text-to-speech; speech recognition; voice-instruction | |||
| An Input-Parsing Algorithm Supporting Integration of Deictic Gesture in Natural Language Interface | | BIBAK | Full-Text | 206-215 | |
| Yong Sun; Fang Chen; Yu (David) Shi; Vera Chung | |||
| A natural language interface (NLI) enables efficient and effective interaction by allowing a user to submit a single natural language phrase to the system. Free-hand gestures can be added to an NLI to specify the referents of deictic terms in speech. By combining an NLI with other modalities into a multimodal user interface, speech utterance length can be reduced, and users need not specify the referent verbally. Integrating deictic terms with deictic gestures is a critical function in a multimodal user interface. This paper presents a novel approach that extends the chart parsing used in natural language processing (NLP) to integrate multimodal input based on speech and manual deictic gesture. The effectiveness of the technique has been validated through experiments using a traffic incident management scenario in which an operator interacts with a map on a large display at a distance and issues multimodal commands through speech and manual gestures. A preliminary experiment with the proposed algorithm shows encouraging results. Keywords: Multimodal chart parsing; Multimodal Fusion; Deictic Gesture; Deictic Terms | |||
| Multimodal Interfaces for In-Vehicle Applications | | BIBAK | Full-Text | 216-224 | |
| Roman Vilimek; Thomas Hempel; Birgit Otto | |||
| This paper identifies several factors that were observed to be crucial to
the usability of multimodal in-vehicle applications -- a multimodal system is
not of value in itself. Focusing in particular on the typical combination of
manual and voice control, this article describes important boundary conditions
and discusses the concept of natural interaction. Keywords: Multimodal; usability; driving; in-vehicle systems | |||
| Character Agents in E-Learning Interface Using Multimodal Real-Time Interaction | | BIBAK | Full-Text | 225-231 | |
| Hua Wang; Jie Yang; Mark H. Chignell; Mitsuru Ishizuka | |||
| This paper describes an e-learning interface with multiple tutoring character agents. The character agents use eye movement information to facilitate empathy-relevant reasoning and behavior. Eye information is used to monitor users' attention and interests, to personalize agent behaviors, and to exchange information between different learners. The system reacts to multiple users' eye information in real time, and the empathic character agents owned by each learner exchange learner information to help form an online learning community. Based on these measures, the interface infers the learner's focus of attention and responds accordingly with affective and instructional behaviors. The paper also reports some preliminary usability test results concerning how users respond to the empathic functions and interact with other learners using the character agents. Keywords: Multiple user interface; e-learning; character agent; tutoring; educational
interface | |||
| An Empirical Study on Users' Acceptance of Speech Recognition Errors in Text-Messaging | | BIBA | Full-Text | 232-242 | |
| Shuang Xu; Santosh Basapur; Mark Ahlenius; Deborah Matteo | |||
| Although speech recognition technology and voice synthesis systems have become readily available, recognition accuracy remains a serious problem in the design and implementation of voice-based user interfaces. Error correction becomes particularly difficult on mobile devices due to limited system resources and constrained input methods. This research investigates users' acceptance of speech recognition errors in mobile text messaging. Our results show that even though the audio presentation of text messages does help users understand speech recognition errors, users indicate low satisfaction when sending or receiving text messages with errors. Specifically, senders show significantly lower acceptance than receivers due to concerns about follow-up clarifications and how the errors reflect on the sender's personality. We also find that different types of recognition errors greatly affect users' overall acceptance of the received message. | |||
| Flexible Multi-modal Interaction Technologies and User Interface Specially Designed for Chinese Car Infotainment System | | BIBAK | Full-Text | 243-252 | |
| Chen Yang; Nan Chen; Peng-fei Zhang; Zhen Jiao | |||
| In this paper, we present a car infotainment prototype system that aims to develop an advanced concept for an intuitive, use-centered human-machine interface designed especially for Chinese users. On the technology side, we apply several innovative interaction technologies (most of which are specific to the Chinese language) to make interaction easier, more convenient, and more effective. Speech interaction design is elaborated in particular. On the user interface design side, we systematically conducted user investigations to provide enlightening clues for better designing the system's logic flow and aesthetics. Following user-centered design principles and with a deep understanding of the different interaction technologies, our prototype system makes transitions between interaction modalities quite flexible. A preliminary performance evaluation shows that our system attains high user acceptance. Keywords: Car Infotainment; Chinese ASR; Chinese TTS; Chinese NLU; Chinese Finger
Stroke Recognition; Melody Recognition; User-centered design | |||
| A Spoken Dialogue System Based on Keyword Spotting Technology | | BIBA | Full-Text | 253-261 | |
| Pengyuan Zhang; Qingwei Zhao; Yonghong Yan | |||
| In this paper, a keyword spotting based dialogue system is described. It is critical for a dialogue system to understand the user's requests accurately, but the performance of large vocabulary continuous speech recognition (LVCSR) systems is far from perfect, especially for spontaneous speech. In this work, an improved keyword spotting scheme is adopted instead. A fuzzy search algorithm is proposed to extract keyword hypotheses from syllable confusion networks (CNs). CNs are linear and naturally suitable for indexing. To accelerate the search process, CNs are pruned to feasible sizes. Furthermore, we enhance the discriminability of the confidence measure by applying entropy information to the posterior probability of word hypotheses. On Mandarin conversational telephone speech (CTS), the proposed algorithms obtained a 4.7% relative equal error rate (EER) reduction. | |||
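One plausible reading of "applying entropy information to the posterior probability" is to scale a hypothesis's confusion-network posterior by the sharpness of its slot; the sketch below uses that reading as an assumption, since the paper's exact combination is not given here.

```python
# Minimal sketch of an entropy-weighted confidence score for a hypothesis in a
# syllable confusion-network slot (the weighting scheme is an assumption).
import math

def slot_confidence(posteriors, hypothesis):
    """posteriors: dict mapping candidate syllables/words to posterior mass."""
    total = sum(posteriors.values())
    probs = [p / total for p in posteriors.values() if p > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    max_entropy = math.log(len(probs)) if len(probs) > 1 else 1.0
    sharpness = 1.0 - entropy / max_entropy   # close to 1.0 for a confident slot
    return (posteriors[hypothesis] / total) * sharpness

slot = {"bei": 0.6, "pei": 0.3, "mei": 0.1}
print(slot_confidence(slot, "bei"))   # higher when the slot is less ambiguous
```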
| Dynamic Association Rules Mining to Improve Intermediation Between User Multi-channel Interactions and Interactive e-Services | | BIBAK | Full-Text | 265-274 | |
| Vincent Chevrin; Olivier Couturier | |||
| This paper deals with managing multi-channel interaction through intermediation between channels and Interactive e-Services (IeS). After work on modeling and a theoretical framework, we implemented a platform, Ubi-Learn, which manages this kind of interaction through an intermediation middleware based on a Multi-Agent System (MAS): Jade. The issue addressed here is how to choose a channel depending on the user's task. First, we encoded several ad hoc rules (tacit knowledge) into the system. In this paper, we present our new approach based on association rule mining, which allows us to automatically propose several dynamic rules (explicit knowledge). Keywords: Interactive e-Services; Intermediation; association rules mining | |||
| Emotionally Expressive Avatars for Chatting, Learning and Therapeutic Intervention | | BIBAK | Full-Text | 275-285 | |
| Marc Fabri; Salima Y. Awad Elzouki; David J. Moore | |||
| We present our work on emotionally expressive avatars, animated virtual
characters that can express emotions via facial expressions. Because these
avatars are highly distinctive and easily recognizable, they may be used in a
range of applications. In the first part of the paper we present their use in
computer mediated communication where two or more people meet in virtual space,
each represented by an avatar. Study results suggest that social interaction
behavior from the real world is readily transferred to the virtual world.
Empathy is identified as a key component for creating a more enjoyable
experience and greater harmony between users. In the second part of the paper
we discuss the use of avatars as an assistive, educational and therapeutic
technology for people with autism. Based on the results of a preliminary study,
we provide pointers regarding how people with autism may overcome some of the
limitations that characterize their condition. Keywords: Emotion; avatar; virtual reality; facial expression; instant messaging;
empathy; autism; education; therapeutic intervention | |||
| Can Virtual Humans Be More Engaging Than Real Ones? | | BIBA | Full-Text | 286-297 | |
| Jonathan Gratch; Ning Wang; Anna Okhmatovskaia; Francois Lamothe; Mathieu Morales; Rick J. van der Werf; Louis-Philippe Morency | |||
| Emotional bonds don't arise from a simple exchange of facial displays, but often emerge through the dynamic give and take of face-to-face interactions. This article explores the phenomenon of rapport, a feeling of connectedness that seems to arise from rapid and contingent positive feedback between partners and is often associated with socio-emotional processes. Rapport has been argued to lead to communicative efficiency, better learning outcomes, improved acceptance of medical advice and successful negotiations. We provide experimental evidence that a simple virtual character that provides positive listening feedback can induce stronger rapport-like effects than face-to-face communication between human partners. Specifically, this interaction can be more engaging to storytellers than speaking to a human audience, as measured by the length and content of their stories. | |||
| Automatic Mobile Content Conversion Using Semantic Image Analysis | | BIBA | Full-Text | 298-307 | |
| Eunjung Han; JongYeol Yang; HwangKyu Yang; Keechul Jung | |||
| An approach to knowledge-assisted semantic offline content re-authoring based on an automatic content conversion (ACC) ontology infrastructure is presented. Semantic concepts in the context are defined in an ontology: text detection (e.g., connected-component based), features (e.g., texture homogeneity), feature parameters (e.g., texture model distribution), and clustered features (e.g., the k-means algorithm). We show how the adaptation of the layout can facilitate browsing with mobile devices, especially small-screen mobile phones. In a second stage we address content personalization by providing a personalization scheme based on ontology technology. Our experiment shows that the proposed ACC is more efficient than existing methods in providing mobile comic content. | |||
| History Based User Interest Modeling in WWW Access | | BIBAK | Full-Text | 308-312 | |
| Shuang Han; Wenguang Chen; Heng Wang | |||
| The WWW cache stores the user's browsing history, which contains a large amount of information that may be accessed again but has not yet been added to the user's favorites folder. These cached web pages can be used to abstract the user's interests and predict user interaction. For this, a model that describes the user's interests is needed. In this paper, we discuss two models relating the WWW cache, data mining, and user interest: a simple user interest model and a real-time two-dimensional interest model. The latter is described in detail and applied to user interest modeling. An experiment performed on 20 users' interest data sets shows that the real-time two-dimensional interest model is more effective for WWW cache modeling. Keywords: www cache; user interest; interest model; data mining | |||
| Development of a Generic Design Framework for Intelligent Adaptive Systems | | BIBAK | Full-Text | 313-320 | |
| Ming Hou; Michelle Sylvia Gauthier; Simon Banbury | |||
| A lack of established design guidelines for intelligent adaptive systems is
a challenge in designing a human-machine performance maximization system. An
extensive literature review was conducted to examine existing approaches in the
design of intelligent adaptive systems. A unified framework to describe design
approaches using consistent and unambiguous terminology was developed.
Combining design methodologies from both Human Computer Interaction and Human
Factors fields, conceptual and design frameworks were also developed to provide
guidelines for the design and implementation of intelligent adaptive systems. A
number of criteria for the selection of appropriate analytical techniques are
recommended. The proposed frameworks will not only provide guidelines for
designing intelligent adaptive systems in the military domain, but also broadly
guide the design of other generic systems to optimize human-machine system
performance. Keywords: design guidance; design framework; intelligent adaptive interface;
intelligent adaptive system | |||
| Three Way Relationship of Human-Robot Interaction | | BIBAK | Full-Text | 321-330 | |
| Jung-Hoon Hwang; Kang-Woo Lee; Dong-Soo Kwon | |||
| In this paper, we conceptualize human-robot interaction (HRI) such that a
3-way relationship among a human, robot and environment can be established.
Various interactive patterns that may occur are analyzed on the basis of shared
ground. The model sheds light on how uncertainty caused by lack of knowledge
may be resolved and how shared ground can be established through interaction.
We also develop measures to evaluate interactivity, such as Interaction Effort and Interaction Situation Awareness, using information theory as well as Markovian transitions. An experiment is carried out in which human subjects are asked to explain or answer questions about objects through interaction. The results of the experiments show the feasibility of the proposed model and the usefulness of the measures. It is expected that the presented model and measures will serve to increase understanding of the patterns of HRI and to evaluate the interactivity of HRI systems. Keywords: Human-Robot Interaction; Shared Ground; Metrics; Interaction Effort;
Interaction SA | |||
| MEMORIA: Personal Memento Service Using Intelligent Gadgets | | BIBAK | Full-Text | 331-339 | |
| Hyeju Jang; Jongho Won; Changseok Bae | |||
| People would like to record what they experience in order to recall earlier events, share them with others, or even hand them down to the next generation. In addition, our environment is becoming increasingly digitized and the cost of storage media keeps falling. This has led to research on life logs that store people's daily lives. The research area includes collecting experience information conveniently, manipulating and recording the collected information efficiently, and retrieving and providing the stored information to users effectively. This paper describes a personalized memory augmentation service, called MEMORIA, that collects, stores, and retrieves various kinds of experience information in real time using a specially designed wearable intelligent gadget (WIG). Keywords: Intelligent Gadget; Smart Object; Personalized Service; Memory Assistant
System; Memory Augmentation Service | |||
| A Location-Adaptive Human-Centered Audio Email Notification Service for Multi-user Environments | | BIBAK | Full-Text | 340-348 | |
| Ralf Jung; Tim Schwartz | |||
| In this paper, we introduce an application for the discreet notification of mobile persons in a multi-user environment. In particular, we use the user's current position to provide personalized email notification with non-speech audio cues embedded in aesthetic background music. The notification is done in a peripheral way to avoid distracting other people in the surroundings. Keywords: Auditory Display; Ambient Soundscapes; Indoor Positioning | |||
| Emotion-Based Textile Indexing Using Neural Networks | | BIBAK | Full-Text | 349-357 | |
| Na Yeon Kim; Yunhee Shin; Eun Yi Kim | |||
| This paper proposes a neural network based approach to emotion-based textile indexing. Generally, human emotion can be affected by physical features such as color, texture, and pattern. In previous work, we investigated the correlation between human emotion and color or texture. Here, we aim to investigate the correlation between emotion and pattern, and to develop a textile indexing system using pattern information. A survey is first conducted to investigate the correlation between emotion and pattern. The results show that human emotion is deeply affected by certain patterns. Based on this result, an automatic indexing system is developed. The proposed system is composed of feature extraction and classification. The wavelet transform is used to describe the pattern information in the textiles, and a neural network is used as the classifier. To assess the validity of the proposed method, it was applied to recognizing human emotions in 100 textiles, where it produced an accuracy of 90%. This result confirms that our system has the potential to be applied in various applications such as the textile industry and e-business. Keywords: Emotion recognition; neural networks; pattern recognition; feature
extraction; wavelet transform | |||
| Decision Theoretic Perspective on Optimizing Intelligent Help | | BIBAK | Full-Text | 358-365 | |
| Chulwoo Kim; Mark R. Lehto | |||
| With the increasing complexity of systems and information overload, agent technology has become widely used to provide personalized advice (help messages) to users for their computer-based tasks. The purpose of this study is to investigate how to optimize the advice provided by an intelligent agent from a decision-theoretic perspective. The study uses the time associated with processing a help message as the trade-off criterion for whether or not to present it. The proposed approach is expected to provide guidance as to where, when, and why help messages are likely to be effective or ineffective by providing quantitative predictions of the value of help messages in terms of time. Keywords: intelligent agent; intelligent help; decision theoretic perspective; help
optimization | |||
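A minimal sketch of the time-based trade-off described above: present a help message only when its expected time savings exceed the expected reading cost. The probabilities and timings are illustrative assumptions, not values from the study.

```python
# Minimal sketch of a decision-theoretic rule for presenting a help message
# (the numbers below are illustrative assumptions).
def should_present_help(p_helpful, time_saved_if_helpful, reading_time):
    """Decision rule: expected time benefit must exceed the processing cost."""
    expected_benefit = p_helpful * time_saved_if_helpful
    return expected_benefit > reading_time

# A message likely to help and cheap to read is shown; a long shot is not.
print(should_present_help(p_helpful=0.6, time_saved_if_helpful=30.0,
                          reading_time=5.0))   # True
print(should_present_help(p_helpful=0.1, time_saved_if_helpful=20.0,
                          reading_time=5.0))   # False
```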
| Human-Aided Cleaning Algorithm for Low-Cost Robot Architecture | | BIBAK | Full-Text | 366-375 | |
| Seungyong Kim; Kiduck Kim; Tae-Hyung Kim | |||
| This paper presents a human-aided cleaning algorithm that can be implemented on a low-cost robot architecture while its cleaning performance far exceeds that of conventional random-style cleaning. We clarify the advantages and disadvantages of the two notable cleaning robot styles, the random and the mapping styles, and show how the performance of the complicated mapping style can be achieved on a random-style-like robot architecture using the idea of a human-aided cleaning algorithm. Experimental results are presented to show the
cleaning performance. Keywords: Cleaning robots; Random style cleaning; Mapping style cleaning; Human-robot
interaction | |||
| The Perception of Artificial Intelligence as "Human" by Computer Users | | BIBAK | Full-Text | 376-384 | |
| Jurek Kirakowski; Patrick O'Donnell; Anthony Yiu | |||
| This paper deals with the topic of 'humanness' in intelligent agents. Chatbot agents (e.g., Eliza, Encarta) have been criticized for their limited ability to hold human-like conversations. In this study, a Critical Incident Technique (CIT) approach was used to analyze the human and non-human parts of Eliza's conversations. The results showed that Eliza could act like a human insofar as it could greet, maintain a theme, apply damage control, react appropriately to cues, offer cues, use an appropriate language style, and display a personality. It was non-human insofar as it treated language in a formal or unusual way, failed to respond to specific questions, failed to respond to general questions or implicit cues, and exhibited time delays and phrases delivered at inappropriate times. Keywords: chatbot; connectionist network; Eliza; Critical Incident Technique;
humanness | |||
| Speaker Segmentation for Intelligent Responsive Space | | BIBAK | Full-Text | 385-392 | |
| Soonil Kwon | |||
| Information drawn from conversational speech can be useful for enabling
intelligent interactions between humans and computers. Speaker information can
be obtained from speech signals by performing Speaker Segmentation. In this
paper, a method for Speaker Segmentation is presented to address the challenge
of identifying speakers even when utterances are very short (0.5 sec). This
method, involving the selective use of feature vectors, experimentally reduced
the relative error rates by 27-42% for groups of 2 to 16 speakers as compared
to the conventional approach for Speaker Segmentation. Thus, this new approach
offers a way to significantly improve speech-data classification and retrieval
systems. Keywords: Speaker Segmentation; Speaker Recognition; Intelligent Responsive Space
(IRS); Human Computer Interaction (HCI) | |||
| Emotion and Sense of Telepresence: The Effects of Screen Viewpoint, Self-transcendence Style, and NPC in a 3D Game Environment | | BIBAK | Full-Text | 393-400 | |
| Jim Jiunde Lee | |||
| Telepresence, or the sense of "being there", has been discussed in the literature as an essential, defining aspect of a virtual environment, with definitions rooted in behavioral response, signal detection theory, and philosophy; however, this literature has generally ignored the emotional aspects of the virtual experience. The purpose of this study is to examine the concept of presence in terms of people's emotional engagement within an immersive mediated environment. Three main theoretical statements are discussed: (a) objective telepresence: display viewpoint; (b) subjective telepresence: emotional factors and individual self-transcendence styles; (c) social telepresence: program-controlled entities in an online game environment. This study has implications for how research could be conducted to further our understanding of telepresence. Validated psychological subjective techniques for assessing emotions and a sense of telepresence will be applied. The study results could improve our knowledge of the construct of telepresence, as well as better inform us about how a virtual environment, such as an online game, can be designed and managed to create emotional effects. Keywords: Computer game; emotion; self-transcendence style; telepresence | |||
| Emotional Interaction Through Physical Movement | | BIBAK | Full-Text | 401-410 | |
| Jong-Hoon Lee; Jin-Yung Park; Tek-Jin Nam | |||
| As everyday products become more intelligent and interactive, there is growing interest in methods for improving the emotional value attached to products. This paper presents a basic method of using temporal and dynamic design elements, in particular physical movements, to improve the emotional value of products. To utilize physical movements in design, a framework relating movement and emotion was developed as the first step of the research. In the framework, movement representing emotion is structured in terms of three properties: velocity, smoothness, and openness. Based on this framework, a new interactive device, 'Emotion Palpus', was developed, and a user study was conducted. The results of the research are expected to improve the emotional user experience when used as a design method or when directly applied in design practice as an interactive element of products. Keywords: Emotion; Physical Movement Design; Interaction Design; Interactive Product
Design; Design Method | |||
| Towards Affective Sensing | | BIBA | Full-Text | 411-420 | |
| Gordon McIntyre; Roland Göcke | |||
| This paper describes ongoing work towards building a multimodal computer system capable of sensing the affective state of a user. Two major problem areas exist in the affective communication research. Firstly, affective states are defined and described in an inconsistent way. Secondly, the type of training data commonly used gives an oversimplified picture of affective expression. Most studies ignore the dynamic, versatile and personalised nature of affective expression and the influence that social setting, context and culture have on its rules of display. We present a novel approach to affective sensing, using a generic model of affective communication and a set of ontologies to assist in the analysis of concepts and to enhance the recognition process. Whilst the scope of the ontology provides for a full range of multimodal sensing, this paper focuses on spoken language and facial expressions as examples. | |||
| Affective User Modeling for Adaptive Intelligent User Interfaces | | BIBAK | Full-Text | 421-430 | |
| Fatma Nasoz; Christine L. Lisetti | |||
| In this paper we describe the User Modeling phase of our general research
approach: developing Adaptive Intelligent User Interfaces to facilitate enhanced natural communication during Human-Computer Interaction. Natural
communication is established by recognizing users' affective states (i.e.,
emotions experienced by the users) and responding to those emotions by adapting
to the current situation via an affective user model. Adaptation of the
interface was designed to provide multi-modal feedback to the users about their
current affective state and to respond to users' negative emotional states in
order to compensate for the possible negative impacts of those emotions.
A Bayesian Belief Network formalization was employed to develop the User Model, enabling the intelligent system to adapt appropriately to the current context and situation by considering user-dependent factors such as personality traits and preferences. Keywords: User Modeling; Bayesian Belief Networks; Intelligent Interfaces; Human
Computer Interaction | |||
| A Multidimensional Classification Model for the Interaction in Reactive Media Rooms | | BIBA | Full-Text | 431-439 | |
| Ali A. Nazari Shirehjini | |||
| We already live in a world where we are surrounded by intelligent devices that support us in planning, organizing, and performing our daily lives, and their number is constantly increasing. At the same time, the complexity of the environment and the number of intelligent devices must not distract the user from his original tasks. A primary goal is therefore to reduce the user's mental workload. With the emergence of newly available technology, the challenge of maintaining control increases while the additional value decreases. A closer look at enriched environments raises the question of how to offer people a more intuitive way to interact with such an environment. As a result, the design of proper interaction models appears to be crucial for AmI systems. To facilitate the design of proper interaction models, we introduce a multidimensional classification model for interaction in reactive media rooms. It describes the various dimensions of interaction and outlines the design space for the creation of interaction models. In doing so, the proposed work can also be used as a meta-model for interaction design. | |||
| An Adaptive Web Browsing Method for Various Terminals: A Semantic Over-Viewing Method | | BIBAK | Full-Text | 440-448 | |
| Hisashi Noda; Teruya Ikegami; Yushin Tatsumi; Shin'ichi Fukuzumi | |||
| This paper proposes a semantic over-viewing method. The method extracts headings and semantic blocks by analyzing the layout structure of a web page and can provide a semantic overview of the page. It allows users to grasp the overall structure of pages. It also reduces the number of operations needed to reach target information to about 6% by moving along semantic blocks. Additionally, it reduces the cost of web page creation because a single web page's content is adapted to multiple terminals. Evaluations were conducted with respect to effectiveness, efficiency, and satisfaction. The results confirmed that the proposed browser is more usable than the traditional method. Keywords: cellular phone; mobile phone; non-PC terminal; remote controller; web
browsing; overview | |||
| Evaluation of P2P Information Recommendation Based on Collaborative Filtering | | BIBAK | Full-Text | 449-458 | |
| Hidehiko Okada; Makoto Inoue | |||
| Collaborative filtering is a social information recommendation/filtering method, and the peer-to-peer (P2P) computer network is a network on which information is distributed on a peer-to-peer basis (each peer node works as a server, a client, and even a router). This research aims to develop a model of a P2P information recommendation system based on collaborative filtering and to evaluate the ability of the system through computer simulations based on the model. We previously proposed a simple model; the model in this paper is a modified one that focuses more on recommendation agents and user-agent interactions. We have developed a computer simulator program and run simulations with several parameter settings. From the results of the simulations, recommendation recall and precision are evaluated. The finding is that the agents tend to over-recommend, so that the recall score becomes high but the precision score becomes low. Keywords: Multi agents; P2P network; information recommendation; collaborative
filtering; simulation | |||
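For reference, the recall/precision evaluation mentioned above can be computed per simulated user as sketched below; the item sets and the notion of "relevant" are illustrative assumptions, not the paper's simulation model.

```python
# Minimal sketch of recall/precision for a set of recommended items
# (the example sets are illustrative assumptions).
def recall_precision(recommended, relevant):
    """Compute recall and precision of a set of recommended items."""
    hits = len(recommended & relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(recommended) if recommended else 0.0
    return recall, precision

# An agent that over-recommends: high recall, low precision.
recommended = {"a", "b", "c", "d", "e", "f", "g", "h"}
relevant = {"a", "b", "c"}
print(recall_precision(recommended, relevant))   # (1.0, 0.375)
```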
| Understanding the Social Relationship Between Humans and Virtual Humans | | BIBAK | Full-Text | 459-464 | |
| Sung Park; Richard Catrambone | |||
| Our review surveys a range of human-human relationship models and research that might provide insights into the social relationship between humans and virtual humans. This involves investigating several social constructs (expectations, communication, trust, etc.) that are identified as key variables influencing relationships between people, and examining how these variables should be incorporated into the design of an effective and useful virtual human. This theoretical analysis contributes to the foundational theory
of human computer interaction involving virtual humans. Keywords: Embodied conversational agent; virtual agent; animated character; avatar;
social interaction | |||
| EREC-II in Use -- Studies on Usability and Suitability of a Sensor System for Affect Detection and Human Performance Monitoring | | BIBAK | Full-Text | 465-474 | |
| Christian Peter; Randolf Schultz; Jörg Voskamp; Bodo Urban; Nadine Nowack; Hubert Janik; Karin Kraft; Roland Göcke | |||
| Interest in emotion detection is increasing significantly. For research and
development in the field of Affective Computing, in smart environments, but
also for reliable non-lab medical and psychological studies or human
performance monitoring, robust technologies are needed for detecting evidence
of emotions in persons under everyday conditions. This paper reports on
evaluation studies of the EREC-II sensor system for acquisition of
emotion-related physiological parameters. The system has been developed with a
focus on easy handling, robustness, and reliability. Two sets of studies have
been performed covering 4 different application fields: medical, human
performance in sports, driver assistance, and multimodal affect sensing.
Results show that the different application fields pose different requirements
mainly on the user interface, while the hardware for sensing and processing the
data proved to be in an acceptable state for use in different research domains. Keywords: Physiology sensors; Emotion detection; Evaluation; Multimodal affect
sensing; Driver assistance; Human performance; Cognitive load; Medical
treatment; Peat baths | |||
| Development of an Adaptive Multi-agent Based Content Collection System for Digital Libraries | | BIBAK | Full-Text | 475-485 | |
| R. Ponnusamy; T. V. Gopal | |||
| Relevant digital content collection and access are major problems in digital libraries, posing a great challenge to digital library users and content builders. In the present work, an attempt has been made to design and develop a user-adaptive multi-agent system that automatically recommends content for the digital library. An adaptive, dialogue-based user-interaction screen is also provided for accessing the content. Once new content is added to the collection, the system automatically alerts the appropriate users about the new arrivals based on their interests. The interactive Question Answering (QA) system provides sufficient knowledge about user requirements. Keywords: Question Answering (QA) systems; Adaptive Interaction; Digital Libraries;
Multi-Agent System | |||
| Using Content-Based Multimedia Data Retrieval for Multimedia Content Adaptation | | BIBAK | Full-Text | 486-492 | |
| Adriana Reveiu; Marian Dardala; Felix Furtuna | |||
| Effective retrieval and multimedia data management techniques that facilitate the searching and querying of large multimedia data sets are very important in multimedia application development. Content-based retrieval systems must use the multimedia content itself to represent and index data. Representing multimedia data involves identifying the most useful features for describing the multimedia content and the approaches needed for coding the attributes of multimedia data. Multimedia content adaptation manipulates multimedia resources while respecting specific quality parameters, according to the limits imposed by networks and terminal devices. The goal of the paper is to identify a design model for using content-based multimedia data retrieval in multimedia content adaptation. The goal of this design is to deliver multimedia content over various networks and to different types of peripheral devices, in the most appropriate format according to their specific characteristics. Keywords: multimedia streams; content based data retrieval; content adaptation; media
type | |||
| Coping with Complexity Through Adaptive Interface Design | | BIBAK | Full-Text | 493-498 | |
| Nadine B. Sarter | |||
| Complex systems are characterized by a large number and variety of, and
often a high degree of dependency between, subsystems. Complexity, in
combination with coupling, has been shown to lead to difficulties with
monitoring and comprehending system status and activities and thus to an
increased risk of breakdowns in human-machine coordination. In part, these
breakdowns can be explained by the fact that increased complexity tends to be
paralleled by an increase in the amount of data that is made available to
operators. Presenting this data in an appropriate form is crucial to avoiding
problems with data overload and attention management. One approach for
addressing this challenge is to move from fixed display designs to adaptive
information presentation, i.e., information presentation that changes as a
function of context. This paper will discuss possible approaches to, challenges
for, and effects of increasing the flexibility of information presentation. Keywords: interface design; adaptive; adaptable; complex systems; adaptation drivers | |||
| Region-Based Model of Tour Planning Applied to Interactive Tour Generation | | BIBA | Full-Text | 499-507 | |
| Inessa Seifert | |||
| The paper addresses a tour planning problem, which encompasses weakly specified constraints such as different kinds of activities together with corresponding spatial assignments such as locations and regions. Alternative temporal orders of planned activities together with underspecified spatial assignments available at different levels of granularity lead to a high computational complexity of the given tour planning problem. The paper introduces the results of an exploratory tour planning study and a Region-based Direction Heuristic derived from the acquired data. A gesture-based interaction model is proposed, which allows a human user to structure the search space at a high level of abstraction for the subsequent generation of alternative solutions, so that the proposed Region-based Direction Heuristic can be applied. | |||
| A Learning Interface Agent for User Behavior Prediction | | BIBAK | Full-Text | 508-517 | |
| Gabriela Serban; Adriana Tarta; Grigoreta Sofia Moldovan | |||
| Predicting user behavior is an important issue in Human Computer Interaction
([5]) research, having an essential role in developing intelligent user
interfaces. A possible solution to this challenge is to build an intelligent
interface agent ([8]) that learns to identify patterns in users' behavior. The
aim of this paper is to introduce a new agent-based approach to predicting
users' behavior using a probabilistic model. We propose an intelligent
interface agent that uses a supervised learning technique in order to achieve
the desired goal. We have used Aspect Oriented Programming ([7]) in the
development of the agent in order to benefit from the advantages of this
paradigm. Based on a newly defined evaluation measure, we have determined the
accuracy of the agent's prediction on a case study. Keywords: user interface; interface agent; supervised learning; aspect oriented
programming | |||
| Sharing Video Browsing Style by Associating Browsing Behavior with Low-Level Features of Videos | | BIBAK | Full-Text | 518-526 | |
| Akio Takashima; Yuzuru Tanaka | |||
| This paper focuses on a method for extracting video browsing styles and
reusing them. In the video browsing process for knowledge work, users often
develop their own browsing styles to explore videos because their domain
knowledge of the contents is insufficient; the users then interact with videos
according to these styles. The User Experience Reproducer enables users to
browse new videos according to their own browsing style or other users'
browsing styles. Preliminary user studies show that video browsing styles can
be reused with other videos. Keywords: video browsing; active watching; tacit knowledge | |||
| Adaptation in Intelligent Tutoring Systems: Development of Tutoring and Domain Models | | BIBAK | Full-Text | 527-534 | |
| Oswaldo Velez-Langs; Xiomara Argüello | |||
| This paper describes the aspects considered in the development of the
tutoring and domain models of an Intelligent Tutoring System (ITS), in which
the type of instruction the tutoring system will give, the pedagogic
strategies, and the structure of the course are established. The software
development process and its principal functions are also described. This work
is part of a research project on the adaptation process of interfaces in
Intelligent Tutoring Systems at the University of Sinu's TESEEO Research Group
([2]). The final objective of this work is to provide mechanisms for the design
and development of system interfaces for tutoring/training that are effective
and at the same time modular, structured, configurable, flexible and adaptable. Keywords: Adaptive Interfaces; Tutoring Model; Domain Model; Tutoring Intelligent
Systems; Instructional Cognitive Theory | |||
| Confidence Measure Based Incremental Adaptation for Online Language Identification | | BIBAK | Full-Text | 535-543 | |
| Shan Zhong; Yingna Chen; Chunyi Zhu; Jia Liu | |||
| This paper proposes a novel two-pass adaptation method for online language
identification using confidence measure based incremental language model
adaptation. In this system, we first used semi-supervised language model
adaptation to solve the problem of channel mismatch, and then used unsupervised
incremental adaptation to adjust the new language model during online language
identification. For robust adaptation, we compare three confidence measures and
then present a new fusion method with a Bayesian classifier. Tested on the RMTS
(Real-world Multi-channel Telephone Speech) database, experiments show that
with semi-supervised language model adaptation the target language detection
rate rises from 73.26% to 80.02%, and after unsupervised incremental language
model adaptation an extra rise of 3.91% (from 80.02% to 83.93%) is obtained. Keywords: Language Identification; Language Model Adaptation; Confidence Measure;
Bayesian Fusion | |||
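A minimal sketch of the confidence-gated incremental adaptation idea described in this abstract, not the authors' implementation: the unigram-only model, the confidence threshold, and the interpolation weight below are illustrative assumptions.

```python
from collections import Counter

class IncrementalLMAdapter:
    """Toy unigram LM with confidence-gated incremental adaptation.

    Utterances recognized with high confidence are folded back into the
    language model; low-confidence utterances are ignored so that
    misclassified data does not corrupt the model (illustrative only).
    """

    def __init__(self, base_counts, threshold=0.8, interp=0.7):
        self.base = Counter(base_counts)   # counts from supervised data
        self.adapt = Counter()             # counts gathered online
        self.threshold = threshold         # confidence gate
        self.interp = interp               # weight of the base model

    def maybe_adapt(self, tokens, confidence):
        """Add tokens to the adaptation counts only if confidence is high."""
        if confidence >= self.threshold:
            self.adapt.update(tokens)

    def prob(self, token):
        """Interpolate base and adapted unigram probabilities."""
        def rel(counter, tok):
            total = sum(counter.values())
            return counter[tok] / total if total else 0.0
        return (self.interp * rel(self.base, token)
                + (1 - self.interp) * rel(self.adapt, token))

# Example: adapt only on a confidently identified utterance
lm = IncrementalLMAdapter({"ni": 5, "hao": 5, "hello": 1})
lm.maybe_adapt(["ni", "hao", "ma"], confidence=0.92)   # accepted
lm.maybe_adapt(["bonjour"], confidence=0.40)           # rejected by the gate
print(round(lm.prob("ma"), 3))
```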
| Study on Speech Emotion Recognition System in E-Learning | | BIBAK | Full-Text | 544-552 | |
| Aiqin Zhu; Qi Luo | |||
| Aiming at the emotion deficiency in present E-Learning systems, a speech
emotion recognition system is proposed in this paper. A corpus of emotional
speech from various subjects speaking different languages is collected for
developing and testing the feasibility of the system. The potential prosodic
features are first identified and extracted from the speech data. Then we
introduce a systematic feature selection approach which involves the
application of Sequential Forward Selection (SFS) with a General Regression
Neural Network (GRNN), in conjunction with a consistency-based selection
method. The selected features are employed as the input to a Modular Neural
Network (MNN) to realize the classification of emotions. Our simulation
results show that the proposed system gives high recognition performance. Keywords: E-learning; SFS; GRNN; MNN; Affective computing | |||
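The Sequential Forward Selection (SFS) step named in this abstract can be sketched generically; the GRNN scorer and the consistency-based selection are not reproduced here, and the toy scoring function below is purely illustrative.

```python
def sequential_forward_selection(features, score_fn, max_features=None):
    """Greedy SFS: repeatedly add the feature that most improves score_fn.

    `features` is a list of candidate feature names; `score_fn(subset)`
    returns a validation score (e.g., classification accuracy of a GRNN
    trained on that subset). Stops when no candidate improves the score.
    """
    selected, best_score = [], float("-inf")
    max_features = max_features or len(features)
    while len(selected) < max_features:
        candidates = [f for f in features if f not in selected]
        scored = [(score_fn(selected + [f]), f) for f in candidates]
        score, feat = max(scored)
        if score <= best_score:          # no improvement -> stop
            break
        selected.append(feat)
        best_score = score
    return selected, best_score

# Toy scorer: pretend pitch-related features help most (illustrative only)
toy_gain = {"pitch_mean": 0.4, "pitch_range": 0.25, "energy": 0.2, "mfcc1": 0.05}
score = lambda subset: sum(toy_gain.get(f, 0.0) for f in subset) - 0.01 * len(subset)
print(sequential_forward_selection(list(toy_gain), score))
```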
| How Do Adults Solve Digital Tangram Problems? Analyzing Cognitive Strategies Through Eye Tracking Approach | | BIBAK | Full-Text | 555-563 | |
| Bahar Baran; Berrin Dogusoy; Kursat Cagiltay | |||
| The purpose of the study is to investigate how adults solve tangram-based
geometry problems on a computer screen. Two problems with different difficulty
levels were presented to 20 participants. The participants tried to solve the
problems by placing seven geometric objects into correct locations. In order to
analyze the process, the participants and their eye movements were recorded by
a Tobii eye tracking device while solving the problems. The results showed
that the participants employed different strategies while solving problems with
different difficulty levels. Keywords: Tangram; problem solving; eye tracking; spatial ability | |||
| Gesture Interaction for Electronic Music Performance | | BIBAK | Full-Text | 564-572 | |
| Reinhold Behringer | |||
| This paper describes an approach for a system which analyses an orchestra
conductor in real-time, with the purpose of using the extracted information of
time pace and expression for an automatic play of a computer-controlled
instrument (synthesizer). The system in its final stage will use non-intrusive
computer vision methods to track the hands of the conductor. The main challenge
is to interpret the motion of the hand/baton/mouse as beats for the timeline.
The current implementation uses mouse motion to simulate the movement of the
baton. It allows the user to "conduct" a pre-stored MIDI file of a classical
orchestral music work on a PC. Keywords: Computer music; human-computer interaction; gesture interaction | |||
| A New Method for Multi-finger Detection Using a Regular Diffuser | | BIBAK | Full-Text | 573-582 | |
| Li-Wei Chan; Yi-Fan Chuang; Yi-Wei Chia; Yi-Ping Hung; Jane Yung-jen Hsu | |||
| In this paper, we developed a fingertip finding algorithm that works with a
regular diffuser. The proposed algorithm works on images captured by infra-red
cameras placed on one side of the diffuser, observing human gestures taking
place on the other side. Thanks to the diffusion characteristics of the
diffuser, we can separate finger-touch from palm-hover events when the user
interacts with the diffuser. This paper makes three contributions. First, the
technique works with a regular diffuser and an infra-red camera coupled with an
infra-red illuminator, which is easy to deploy and cost-effective. Second, the
proposed algorithm is designed to be robust to casually illuminated surfaces.
Last, with the diffusion characteristics of the diffuser, we can detect
finger-touch and palm-hover events, which is useful for natural user interface
design. We have deployed the algorithm on a rear-projection multi-resolution
tabletop, called I-M-Top. A video retrieval application using the two events in
its UI design is implemented to show its intuitiveness on the tabletop system. Keywords: Multi-Finger Detection; Intuitive Interaction | |||
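The abstract does not give the algorithm's details, so the sketch below only illustrates the general principle it relies on: seen from behind a diffuser, touching fingertips appear as small, bright, sharp blobs, while hovering hands appear dimmer and more diffuse. The OpenCV pipeline and all thresholds are assumptions, not the paper's method.

```python
import cv2
import numpy as np

def classify_blobs(ir_image, touch_thresh=200, hover_thresh=120, min_area=30):
    """Separate finger-touch from palm-hover blobs in a rear IR view of a diffuser.

    Touching fingertips scatter little through the diffuser and show up as
    small, bright, sharp blobs; hovering hands are dimmer and more diffuse.
    Thresholds are illustrative and would need per-setup calibration.
    """
    blur = cv2.GaussianBlur(ir_image, (5, 5), 0)

    _, touch_mask = cv2.threshold(blur, touch_thresh, 255, cv2.THRESH_BINARY)
    _, hover_mask = cv2.threshold(blur, hover_thresh, 255, cv2.THRESH_BINARY)
    hover_mask = cv2.subtract(hover_mask, touch_mask)   # hover = mid-bright only

    def centroids(mask):
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        pts = []
        for c in contours:
            if cv2.contourArea(c) >= min_area:
                m = cv2.moments(c)
                pts.append((m["m10"] / m["m00"], m["m01"] / m["m00"]))
        return pts

    return centroids(touch_mask), centroids(hover_mask)

# Usage with a synthetic frame (a bright "fingertip" on a dim "palm")
frame = np.zeros((240, 320), np.uint8)
cv2.circle(frame, (160, 120), 40, 140, -1)   # hovering palm
cv2.circle(frame, (160, 100), 6, 230, -1)    # touching fingertip
touches, hovers = classify_blobs(frame)
print(len(touches), len(hovers))
```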
| Lip Contour Extraction Using Level Set Curve Evolution with Shape Constraint | | BIBA | Full-Text | 583-588 | |
| Jae Sik Chang; Eun Yi Kim; Se Hyun Park | |||
| In this work, a novel method for lip contour extraction based on level set curve evolution is presented. This method takes into account not only color information but also a lip contour shape constraint represented by a distance function between the evolving curve and a parametric shape model. In this method, the curve is evolved by minimizing an energy function that incorporates the shape constraint function as internal energy, whereas previous curve evolution methods use a simple smoothing function. The new shape constraint function prevents the curve from evolving to arbitrary shapes that occur due to weak color contrast between lip and skin regions. Comparisons with other methods are conducted to evaluate the proposed method, and the results show that the proposed method provides more accurate results than other methods. | |||
| Visual Foraging of Highlighted Text: An Eye-Tracking Study | | BIBAK | Full-Text | 589-598 | |
| Ed Huai-hsin Chi; Michelle Gumbrecht; Lichan Hong | |||
| The wide availability of digital reading material online is causing a major
shift in everyday reading activities. Readers are skimming instead of reading
in depth [Nielsen 1997]. Highlights are increasingly used in digital interfaces
to direct attention toward relevant passages within texts. In this paper, we
study the eye-gaze behavior of subjects using both keyword highlighting and
ScentHighlights [Chi et al. 2005]. In this first eye-tracking study of
highlighting interfaces, we show that there is direct evidence of the von
Restorff isolation effect [VonRestorff 1933] in the eye-tracking data, in that
subjects focused on highlighted areas when highlighting cues are present. The
results point to future design possibilities in highlighting interfaces. Keywords: Automatic text highlighting; dynamic summarization; contextualization;
personalized information access; eBooks; Information Scent | |||
| Effects of a Dual-Task Tracking on Eye Fixation Related Potentials (EFRP) | | BIBA | Full-Text | 599-604 | |
| Hiroshi Daimoto; Tsutomu Takahashi; Kiyoshi Fujimoto; Hideaki Takahashi; Masaaki Kurosu; Akihiro Yagi | |||
| The eye fixation related brain potentials (EFRP) associated with the occurrence of fixation pauses can be obtained by averaging EEGs at the offset of saccades. EFRP is a kind of event-related brain potential (ERP) measurable in eye movement situations. In this experiment, EFRP were examined concurrently with performance and subjective measures to compare the effects of tracking difficulty during a dual task. Twelve participants were assigned four different types of tracking task for 5 min each. The difficulty of the tracking task was manipulated via the ease of tracking a target with a trackball and the ease of giving a correct response to the numerical problem. The workload of each tracking condition thus differed in task quality (the difficulty at the perceptual-motor and/or cognitive level). As a result, the most prominent positive component, with a latency of about 100 ms, was observed in EFRP under all tracking conditions. The amplitude under the condition with the highest workload was smaller than under the condition with the lowest workload, while effects of task quality and a correspondence with subjective difficulty across incremental steps were not observed in this experiment. The results suggest that EFRP is a useful index of excessive mental workload. | |||
| Effect of Glance Duration on Perceived Complexity and Segmentation of User Interfaces | | BIBAK | Full-Text | 605-614 | |
| Yifei Dong; Chen Ling; Lesheng Hua | |||
| Computer users who handle complex tasks like air traffic control (ATC) need
to quickly detect updated information from multiple displays of graphical user
interface. The objectives of this study are to investigate to what extent
computer users can segment a GUI display into distinctive objects within very
short glances, and whether humans perceive complexity differently after
different durations of exposure. Subjects in this empirical study were
presented with 20 screenshots of web pages and software interfaces for
different short durations (100ms, 500ms, 1000ms) and were asked to recall the
visual objects and rate the complexity of the images. The results indicate
that subjects can reliably recall 3-5 objects regardless of image complexity
and exposure duration up to 1000ms. This result agrees with the "magic number
4" of visual short-term memory (VSTM). Perceived complexity is consistent
across the different exposure durations, and it is highly correlated with
subjects' ratings of the ease of segmentation as well as the image
characteristics of density, layout, and color use. Keywords: Visual Segmentation; Perceptual Complexity; Rapid Glance | |||
| Movement-Based Interaction and Event Management in Virtual Environments with Optical Tracking Systems | | BIBAK | Full-Text | 615-624 | |
| Maxim Foursa; Gerold Wesche | |||
| In this paper we present our experience in using optical tracking systems in
Virtual Environment applications. First we briefly describe the tracking
systems we used, and then we describe the application scenarios and present how
we adapted the scenarios to the tracking systems. One of the tracking systems
is markerless, which means that a user does not have to wear any specific
devices to be tracked and can interact with an application through free hand
movements. With our application we compare the performance of the different
tracking systems and demonstrate that it is possible to perform complex actions
in an intuitive way with only little special knowledge of the system and
without any specific devices. This is a step toward a more natural
human-computer interface. Keywords: tracking systems; virtual environments; application scenarios; interaction
techniques | |||
| Multiple People Gesture Recognition for Human-Robot Interaction | | BIBAK | Full-Text | 625-633 | |
| Seok-Ju Hong; Nurul Arif Setiawan; Chil-Woo Lee | |||
| In this paper, we propose gesture recognition in a multiple-people
environment. Our system is divided into two modules: segmentation and
recognition. In the segmentation part, we extract the foreground area from the
input image and select the closest person as the recognition subject. In the
recognition part, we first extract feature points of the subject's two hands
using a contour-based method and a skin-based method. The extracted points are
tracked using a Kalman filter, and the trajectories of both hands are used for
recognizing the gesture. In this paper, we use a simple queue matching method
for recognition. We also apply our system to an animation system. Our method
can select the subject effectively and recognize gestures in a multiple-people
environment; therefore, the proposed method can be used in real-world
applications such as home appliances and humanoid robots. Keywords: Context Aware; Gesture Recognition; Multiple People | |||
| Position and Pose Computation of a Moving Camera Using Geometric Edge Matching for Visual SLAM | | BIBAK | Full-Text | 634-641 | |
| HyoJong Jang; Gye-Young Kim; Hyung-Il Choi | |||
| A prerequisite component of an autonomous mobile vehicle system is the
self-localization ability to recognize its environment and to estimate where it
is. Generally, we can determine the position and the pose using a homography
approach, but it produces errors, especially when position and pose change
simultaneously. In this paper, we propose a method for computing the position
and pose of a camera through analysis of images obtained from a camera-equipped
mobile robot. The proposed method consists of two steps. The first step is to
extract and match feature points in sequential images; here we use KLT
tracking. The second step is to compute the accurate camera position and pose
using geometric edge matching: an iterative matching method that compares edge
models predicted through a perspective transform, computed from the homography
of the matched feature points, against edge models generated at the
corresponding points, until there is no further variation in the matching
error. For the purpose of performance evaluation, we performed a test to
correct the position and pose of a camera installed in a wireless-controlled
vehicle, using a video stream obtained at a 15 Hz frame rate, and we report the
experimental results. Keywords: vSLAM; Perspective Transformation; KLT tracking; Geometric Edge Matching | |||
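A minimal sketch of the first step described in this abstract (KLT feature tracking between consecutive frames, followed by a robust homography from the matched points), using OpenCV; the iterative geometric edge matching of the second step is not reproduced, and all parameters are illustrative.

```python
import cv2
import numpy as np

def track_and_estimate_homography(prev_gray, curr_gray, max_corners=200):
    """KLT tracking of features between consecutive frames, then a RANSAC
    homography from the matched point pairs (step 1 only)."""
    # Detect good features to track in the previous frame
    pts_prev = cv2.goodFeaturesToTrack(prev_gray, maxCorners=max_corners,
                                       qualityLevel=0.01, minDistance=7)
    if pts_prev is None:
        return None

    # Pyramidal Lucas-Kanade (KLT) tracking into the current frame
    pts_curr, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray,
                                                   pts_prev, None)
    good_prev = pts_prev[status.ravel() == 1]
    good_curr = pts_curr[status.ravel() == 1]
    if len(good_prev) < 4:
        return None

    # Robust homography between the matched feature sets
    H, _inliers = cv2.findHomography(good_prev, good_curr, cv2.RANSAC, 3.0)
    return H
```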
| "Shooting a Bird": Game System Using Facial Feature for the Handicapped People | | BIBAK | Full-Text | 642-648 | |
| Jinsun Ju; Yunhee Shin; Eun Yi Kim | |||
| This paper presents a novel computer game system that is controlled using
only the movement of the user's facial features. Our system is specifically
designed for people with severe disabilities and for people with no experience
of using a computer. Using an ordinary PC camera, the proposed game system
detects the user's eye and mouth movements and then interprets the
communication intent to play the game. The game system was tested with 42
people, and the results show that it can be used efficiently and effectively as
an interface for disabled people. Keywords: Augmented game; HCI; Facial feature tracking; neural network | |||
| Human Pose Estimation Using a Mixture of Gaussians Based Image Modeling | | BIBAK | Full-Text | 649-658 | |
| Do Joon Jung; Kyung Su Kwon; Hang Joon Kim | |||
| In this paper, we propose an approach to body part representation,
localization, and human pose estimation from an image. In the image, the human
body parts and the background are represented by a mixture of Gaussians, and
the body part configuration is modeled by a Bayesian network. In this model,
state nodes represent the pose parameters of each body part, and arcs represent
spatial constraints. The Gaussian mixture distribution is used as a parametric
model of the prior distribution for the body parts and the background. We
estimate the human pose by optimizing the pose parameters using likelihood
objective functions. The performance of the proposed approach is illustrated on
various single images, and it improves human pose estimation quality. Keywords: Human Pose Estimation; Mixture of Gaussians; Bayesian Network | |||
| Human Motion Modeling Using Multivision | | BIBA | Full-Text | 659-668 | |
| Byoung-Doo Kang; Jae-Seong Eom; Jong-Ho Kim; Chulsoo Kim; Sang-Ho Ahn; Bum-Joo Shin; Sang-Kyoon Kim | |||
| In this paper, we propose a gesture modeling system based on computer vision that uses real-time 3D modeling information on multiple objects in order to recognize gestures naturally, without friction between the system and the user. It recognizes a gesture after 3D modeling and analysis of the information pertaining to the user's body shape in stereo views of human movement. In the 3D-modeling step, 2D information is extracted from each view using an adaptive color difference detector. Potential objects such as faces, hands, and feet are labeled using the information from 2D detection. We identify reliable objects by comparing the similarities of the potential objects obtained from the two views. We acquire 2D tracking information from the selected objects using the Kalman filter and reconstruct it as a 3D gesture. A joint for each body part is generated in the combined objects. We experimented with ambiguities due to occlusion, clutter, and irregular 3D gestures to analyze the efficiency of the proposed system. In this experiment, the proposed gesture modeling system showed good detection and a processing rate of 30 frames per second, which allows real-time use. | |||
| Real-Time Face Tracking System Using Adaptive Face Detector and Kalman Filter | | BIBA | Full-Text | 669-678 | |
| Jong-Ho Kim; Byoung-Doo Kang; Jae-Seong Eom; Chulsoo Kim; Sang-Ho Ahn; Bum-Joo Shin; Sang-Kyoon Kim | |||
| In this paper, we propose a real-time face tracking system using an adaptive face detector and the Kalman filter. The features used for face detection are five types of simple Haar-like features. To extract only the more significant of these features, we employ principal component analysis (PCA). The extracted features are used as the learning vector of a support vector machine (SVM), which classifies faces and non-faces. The face detector locates faces among the face candidates separated from the background using skin color information that is updated in real time. We track the moving faces with the Kalman filter, which uses the static information of the detected faces and the dynamic information of changes between the previous and current frames. In this experiment, the proposed system showed an average tracking rate of 97.3% and a frame rate of 23.5 frames per second, which makes it suitable for a real-time tracking system. | |||
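A hedged sketch of the feature-reduction and classification stage described above (feature vectors reduced by PCA, then classified by an SVM as face or non-face). The Haar-like feature extraction, the skin-color candidate stage, and the Kalman tracking are not shown, and the random data below merely stands in for precomputed feature vectors.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Assume X holds precomputed Haar-like feature vectors (one row per window)
# and y holds face (1) / non-face (0) labels; random data stands in here.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 200))
y = rng.integers(0, 2, size=400)

# PCA keeps the most significant components of the Haar-like features,
# and an SVM separates faces from non-faces in the reduced space.
face_classifier = make_pipeline(PCA(n_components=30), SVC(kernel="rbf", C=1.0))
face_classifier.fit(X, y)

print(face_classifier.predict(X[:5]))
```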
| Kalman Filtering in the Design of Eye-Gaze-Guided Computer Interfaces | | BIBAK | Full-Text | 679-689 | |
| Oleg Komogortsev; Javed I. Khan | |||
| In this paper, we design an Attention Focus Kalman Filter (AFKF) -- a
framework that offers interaction capabilities by constructing an eye-movement
language, provides real-time perceptual compression through Human Visual System
(HVS) modeling, and improves the system's reliability. These goals are achieved by
an AFKF through identification of basic eye-movement types in real-time, the
prediction of a user's perceptual attention focus, and the use of the eye's
visual sensitivity function and eye-position data signal de-noising. Keywords: Human Visual System Modeling; Kalman Filter; Human Computer Interaction;
Perceptual Compression | |||
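Only the de-noising aspect mentioned in this abstract is sketched here, as a generic constant-velocity Kalman filter over 2D eye-position samples; the eye-movement classification and HVS modeling of the AFKF are not reproduced, and the noise parameters are illustrative.

```python
import numpy as np

def kalman_denoise(gaze_xy, dt=1/60.0, q=50.0, r=4.0):
    """Smooth raw 2D eye-position samples with a constant-velocity Kalman filter.

    State: [x, y, vx, vy]. q scales process noise (how quickly the eye may
    accelerate), r is measurement noise variance in pixels^2 (illustrative).
    """
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1,  0],
                  [0, 0, 0,  1]])
    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]])
    Q = q * np.eye(4)
    R = r * np.eye(2)

    x = np.array([gaze_xy[0][0], gaze_xy[0][1], 0.0, 0.0])
    P = np.eye(4) * 100.0
    smoothed = []
    for z in np.asarray(gaze_xy, dtype=float):
        # Predict
        x = F @ x
        P = F @ P @ F.T + Q
        # Update with the new gaze sample
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ (z - H @ x)
        P = (np.eye(4) - K @ H) @ P
        smoothed.append(x[:2].copy())
    return np.array(smoothed)

# Noisy fixation around (400, 300)
noisy = 400 + np.random.randn(30, 2) * 5
noisy[:, 1] -= 100
print(kalman_denoise(noisy)[-1].round(1))
```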
| Human Shape Tracking for Gait Recognition Using Active Contours with Mean Shift | | BIBAK | Full-Text | 690-699 | |
| Kyung Su Kwon; Se Hyun Park; Eun Yi Kim; Hang Joon Kim | |||
| In this paper, we present a human shape extraction and tracking method for
gait recognition using geodesic active contour models (GACMs) combined with the
mean-shift algorithm. Active contour models (ACMs) are very effective for
dealing with non-rigid objects because of their elastic property, but they have
the limitation that their performance depends mainly on the initial curve. To
overcome this problem, we combine the mean-shift algorithm with the traditional
GACMs. The main idea is very simple: before evolving using the level-set
method, the initial curve in each frame is re-localized near the human region
and resized to enclose the target object. This mechanism reduces the number of
iterations and handles large object motion. Our system is composed of human
region detection and human shape tracking. In the human region detection
module, the silhouette of a walking person is extracted by background
subtraction and morphological operations. The human shape is then correctly
obtained by the GACMs with the mean-shift algorithm. To evaluate the
effectiveness of the proposed method, it is applied to common gait data; the
results show that the proposed method efficiently extracts and tracks accurate
shapes for gait recognition. Keywords: Human Shape Tracking; Geodesic Active Contour Models; Mean Shift; Gait
Recognition | |||
| Robust Gaze Tracking Method for Stereoscopic Virtual Reality Systems | | BIBA | Full-Text | 700-709 | |
| Eui Chul Lee; Kang Ryoung Park; Min Cheol Whang; Junseok Park | |||
| In this paper, we propose a new face and eye gaze tracking method that works by attaching gaze tracking devices to stereoscopic shutter glasses. This paper presents six advantages over previous works. First, through using the proposed method with stereoscopic VR systems, users feel more immersed and comfortable. Second, by capturing reflected eye images with a hot mirror, we were able to increase eye gaze accuracy in a vertical direction. Third, by attaching the infrared passing filter and using an IR illuminator, we were able to obtain robust gaze tracking performance irrespective of environmental lighting conditions. Fourth, we used a simple 2D-based eye gaze estimation method based on the detected pupil center and the 'geometric transform' process. Fifth, to prevent gaze positions from being unintentionally moved by natural eye blinking, we discriminated between different kinds of eye blinking by measuring pupil sizes. This information was also used for button clicking or mode toggling. Sixth, the final gaze position was calculated by the vector summation of face and eye gaze positions and allowing for natural face and eye movements. Experimental results showed that the face and eye gaze estimation error was less than one degree. | |||
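The "simple 2D-based eye gaze estimation method based on the detected pupil center and the 'geometric transform' process" mentioned above can be illustrated with a four-point calibration mapped through a perspective transform. The calibration coordinates below are invented for illustration and do not come from the paper.

```python
import cv2
import numpy as np

# During calibration the user looks at four known screen corners; the measured
# pupil centers and those screen points define a perspective (homography)
# transform.  All coordinates here are illustrative.
pupil_calib = np.float32([[212, 148], [371, 151], [368, 262], [215, 258]])
screen_calib = np.float32([[0, 0], [1920, 0], [1920, 1080], [0, 1080]])
M = cv2.getPerspectiveTransform(pupil_calib, screen_calib)

def pupil_to_screen(pupil_xy):
    """Map a detected pupil center to an on-screen gaze point."""
    pt = np.float32([[pupil_xy]])            # shape (1, 1, 2)
    return cv2.perspectiveTransform(pt, M)[0, 0]

print(pupil_to_screen((290, 205)).round(1))
```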
| EyeScreen: A Gesture Interface for Manipulating On-Screen Objects | | BIBA | Full-Text | 710-717 | |
| Shanqing Li; Jingjun Lv; Yihua Xu; Yunde Jia | |||
| This paper presented a gesture-based interaction system which provides a natural way of manipulating on-screen objects. We generate a synthetic image by linking images from two cameras to recognize hand gestures. The synthetic image contains all the features captured from two different views, which can be used to alleviate the self-occlusion problem and improve the recognition rate. The MDA and EM algorithms are used to obtain parameters for pattern classification. To compute more detailed pose parameters such as fingertip positions and hand contours in the image, a random sampling method is introduced in our system. We describe a method based on projective geometry for background subtraction to improve the system performance. Robustness of the system has been verified by extensive experiments with different user scenarios. The applications of picture browser and visual pilot are discussed in this paper. | |||
| GART: The Gesture and Activity Recognition Toolkit | | BIBAK | Full-Text | 718-727 | |
| Kent Lyons; Helene Brashear; Tracy L. Westeyn; Jungsoo Kim; Thad Starner | |||
| The Gesture and Activity Recognition Toolkit (GART) is a user interface
toolkit designed to enable the development of gesture-based applications. GART
provides an abstraction to machine learning algorithms suitable for modeling
and recognizing different types of gestures. The toolkit also provides support
for the data collection and the training process. In this paper, we present
GART and its machine learning abstractions. Furthermore, we detail the
components of the toolkit and present two example gesture recognition
applications. Keywords: Gesture recognition; user interface toolkit | |||
| Static and Dynamic Hand-Gesture Recognition for Augmented Reality Applications | | BIBAK | Full-Text | 728-737 | |
| Stefan Reifinger; Frank Wallhoff; Markus Ablaßmeier; Tony Poitschke; Gerhard Rigoll | |||
| This contribution presents our approach to an instrumented automatic gesture
recognition system for use in Augmented Reality, which is able to differentiate
static and dynamic gestures. Based on an infrared tracking system, infrared
targets mounted on the user's thumbs and index fingers are used to retrieve
information about the position and orientation of each finger. Our system
receives this information and extracts static gestures with distance
classifiers and dynamic gestures with statistical models. The recognized
gesture is provided to any connected application. We introduce a small
demonstration as the basis for a short evaluation, in which we compare
interaction in a real environment, Augmented Reality with mouse/keyboard, and
our gesture recognition system with respect to properties such as task
execution time and intuitiveness of interaction. The results show that tasks
are executed faster with our gesture recognition system than with the
mouse/keyboard. However, this enhancement entails a slightly lowered wearing
comfort. Keywords: Augmented Reality; Gesture Recognition; Human Computer Interaction | |||
| Multiple People Labeling and Tracking Using Stereo for Human Computer Interaction | | BIBAK | Full-Text | 738-746 | |
| Nurul Arif Setiawan; Seok-Ju Hong; Chil-Woo Lee | |||
| In this paper, we propose a system for multiple-people tracking using
fragment-based histogram matching. The appearance model is based on an improved
HLS color histogram which can be calculated efficiently using an integral
histogram representation. Since histograms lose all spatial information, we
define a fragment-based region representation which retains spatial information
and is robust against occlusion and scale issues by using disparity
information. Multiple-people labeling is maintained by creating an online
appearance representation for each person detected in the scene and calculating
a fragment vote map. Initialization is performed automatically from the
background segmentation step. Keywords: Integral Histogram; Fragment Based Tracking; Multiple People; Stereo Vision | |||
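A rough sketch of the fragment idea described above: split a candidate box into a grid of fragments, compare each fragment's histogram to the stored appearance model, and combine the per-fragment similarities robustly so that a few occluded fragments do not dominate. The improved HLS histogram, the integral-histogram computation, and the exact vote map are not reproduced; the median combination and grid size are assumptions.

```python
import cv2
import numpy as np

def fragment_histograms(hls_image, box, grid=(3, 3), bins=16):
    """Split `box` (x, y, w, h) into grid fragments and return one normalized
    hue histogram per fragment."""
    x, y, w, h = box
    hists = []
    for r in range(grid[0]):
        for c in range(grid[1]):
            fy, fx = y + r * h // grid[0], x + c * w // grid[1]
            fh, fw = h // grid[0], w // grid[1]
            patch = hls_image[fy:fy + fh, fx:fx + fw]
            hist = cv2.calcHist([patch], [0], None, [bins], [0, 180])
            hists.append(cv2.normalize(hist, None).flatten())
    return hists

def fragment_vote_score(model_hists, candidate_hists):
    """Median of per-fragment histogram similarities: robust to a few
    occluded fragments (illustrative stand-in for the paper's vote map)."""
    sims = [cv2.compareHist(m.astype(np.float32), c.astype(np.float32),
                            cv2.HISTCMP_CORREL)
            for m, c in zip(model_hists, candidate_hists)]
    return float(np.median(sims))
```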
| A Study of Human Vision Inspection for Mura | | BIBAK | Full-Text | 747-754 | |
| Pei-Chia Wang; Sheue-Ling Hwang; Chao-Hua Wen | |||
| In the present study, several factors were considered, such as the various
types and sizes of real Mura and Mura inspection experience. The data
collection and experiments were conducted systematically from a human factors
viewpoint. The experimental results showed that Mura size was the most
important factor affecting the visual contrast threshold. The purpose of this
research was to objectively describe the relationships between Mura
characteristics and visual contrast thresholds. Furthermore, a JND model for
the domestic LCD industry was constructed; this model could serve as an
inspection criterion for the LCD industry. Keywords: Mura; JND; vision; LCD | |||
| Tracing Users' Behaviors in a Multimodal Instructional Material: An Eye-Tracking Study | | BIBAK | Full-Text | 755-762 | |
| Esra Yecan; Evren Sumuer; Bahar Baran; Kursat Cagiltay | |||
| This study aims to explore user behaviors in instructional environments
combining multimodal presentation of information. Cognitive load theory and
dual coding theory were taken as the theoretical perspectives for the analyses.
For this purpose, user behaviors were analyzed by recording participants' eye
movements while they were using an instructional material with synchronized
video and PowerPoint slides. 15 participants' eye fixation counts and durations
for specific parts of the material were collected. Findings of the study
revealed that the participants used the slide and video presentations in a
complementary way. Keywords: Producer; PowerPoint; video; eye tracking; cognitive load; dual coding;
multiple channels | |||
| A Study on Interactive Artwork as an Aesthetic Object Using Computer Vision System | | BIBAK | Full-Text | 763-768 | |
| Joonsung Yoon; Jaehwa Kim | |||
| With the recent rapid rise of Human-Computer Interaction and surveillance
systems, various application systems have become a matter of primary concern.
However, these application systems mostly deal with technologies for
recognizing facial characteristics, analyzing facial expressions, and automatic
face recognition. By applying these various technologies and methods of face
recognition, I made an interactive artwork that computes the range of the
hands. This study concerns artwork application theory and the use of a computer
vision system. The approach of this study makes it possible to create artwork
applications in real time. I propose how to utilize, analyze, and create
interactions with the resulting artworks, and I also explain the immersion of
the viewers. The viewers can express their imagination freely, and artists
provide viewers an opportunity not only to enjoy a visual experience, but also
to interact and be immersed in the works via the interface. This interactive
art makes viewers actually take part in the works. Keywords: aesthetic object; artistic desire; interactive art; art and science | |||
| Human-Computer Interaction System Based on Nose Tracking | | BIBAK | Full-Text | 769-778 | |
| Lumin Zhang; Fuqiang Zhou; Weixian Li; Xiaoke Yang | |||
| This paper presents a novel Human-Computer Interaction (HCI) system with a
calibrated mono-camera, which integrates active computer vision technology and
embedded speech command recognition technology. By robustly tracking the motion
of the nose tip as the mouse trace, this system performs mouse tasks with a
recognition rate of more than 85% at a speed of 15 frames per second. To
achieve this goal, we adopt a novel approach based on the symmetry of the nose
plane feature to localize and track invariantly under varying environments.
Compared to other kinds of pointing devices, this HCI system is hands-free,
cheap, real-time, convenient, and non-contact, and it can be used in the fields
of disabled aid, entertainment, and remote control. Keywords: HCI; Nose Tracking; Calibration | |||
| Evaluating Eye Tracking with ISO 9241 -- Part 9 | | BIBAK | Full-Text | 779-788 | |
| Xuan Zhang; I. Scott MacKenzie | |||
| The ISO 9241-9 standard for computer pointing devices proposes an evaluation
of performance and comfort [4]. This paper is the first eye tracking evaluation
conforming to ISO 9241-9. We evaluated three techniques and compared them with
a standard mouse. The evaluation used throughput (in bits/s) as a measurement
of user performance in a multi-directional point-select task. The "Eye Tracking
Long" technique required participants to look at an on-screen target and dwell
on it for 750 ms for selection. Results revealed a lower throughput than for
the "Eye Tracking Short" technique with a 500 ms dwell time. The "Eye+Spacebar"
technique allowed participants to "point" with the eye and "select" by pressing
the spacebar upon fixation. This eliminated the need to wait for selection. It
was the best among the three eye tracking techniques with a throughput of 3.78
bits/s, which was close to the 4.68 bits/s for the mouse. Keywords: Pointing devices; ISO 9241; Fitts' law; performance evaluation; eye
movement; eye tracking | |||
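The throughput figures quoted in this abstract follow the usual ISO 9241-9 / Fitts'-law convention of effective index of difficulty divided by movement time; a minimal computation is sketched below with invented trial data, not the study's data.

```python
import math
import statistics

def throughput(distances, selection_offsets, movement_times):
    """ISO 9241-9 style throughput (bits/s) for one condition.

    distances         : nominal target distances D per trial
    selection_offsets : signed over/undershoot of each selection along the
                        task axis (used to derive the effective width)
    movement_times    : movement time per trial in seconds
    """
    # Effective width from the spread of selection points: We = 4.133 * SDx
    we = 4.133 * statistics.stdev(selection_offsets)
    d_mean = statistics.mean(distances)
    mt_mean = statistics.mean(movement_times)
    ide = math.log2(d_mean / we + 1)          # effective index of difficulty
    return ide / mt_mean

# Toy data: ~500 px movements selected in ~0.8 s with ~20 px scatter
print(round(throughput([500] * 10,
                       [12, -18, 25, -9, 15, -22, 8, -14, 19, -11],
                       [0.8] * 10), 2), "bits/s")
```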
| Impact of Mental Rotation Strategy on Absolute Direction Judgments: Supplementing Conventional Measures with Eye Movement Data | | BIBAK | Full-Text | 789-798 | |
| Ronggang Zhou; Kan Zhang | |||
| By training participants to use map-first mental rotation as their primary
strategy in an absolute navigational task, this study focused on how the
integration of heading information (from the exocentric reference frame) with
target position information (from the egocentric reference frame) affects
absolute direction judgments. Compared with previous studies, the results of
this study showed that (1) responses were not better for north than for south,
(2) responses were slowest for the back position in the canonical position
condition, and (3) the cardinal direction advantage of the right-back position
was not impaired. Eye movement data supported these conclusions only partially
and should be used with caution for similar goals. These findings can be
applied to navigational training and interface design, such as for electronic
spaces. Keywords: absolute direction judgments; mental rotation strategy; eye movement;
reference frame | |||
| Beyond Mobile TV: Understanding How Mobile Interactive Systems Enable Users to Become Digital Producers | | BIBAK | Full-Text | 801-810 | |
| Anxo Cereijo Roibás; Riccardo Sala | |||
| This paper aims to explore the quality of the user experience with mobile
and pervasive interactive multimedia systems that enable the creation and
sharing of digital content through mobile phones. It also discusses the use and
validity of different experimental in-situ and other data gathering and
evaluation techniques for assessing how the physical and social contexts might
influence the use of these systems. This scenario represents an important shift
away from professionally produced digital content for the mass market. It
addresses methodologies and techniques that are suitable for designing
co-creative applications for non-professional users in different contexts of
use, at home or in public spaces. Special focus is given to understanding how
user participation and motivation in small themed communities can be
encouraged, and how social interaction can be enabled through mobile
interfaces. An enhancement of users' creativity, self-authored content sharing,
sociability, and co-experience can be evidence of how creative people can
benefit from Information and Communication Technologies. Keywords: users' generated content; pervasive multimedia; mobileTV | |||
| Media Convergence, an Introduction | | BIBA | Full-Text | 811-814 | |
| Sepideh Chakaveh; Manfred Bogen | |||
| Media convergence is a theory in communications where every mass medium eventually merges to the point where they become one medium due to the advent of new communication technologies. The Media Convergence research theme normally refers to entire production, distribution, and use process of future digital media services from contents production to service delivery through various channels such as mobile terminals, digital TV, or the Internet. | |||
| An Improved H.264 Error Concealment Algorithm with User Feedback Design | | BIBAK | Full-Text | 815-820 | |
| XiaoMing Chen; Yuk Ying Chung | |||
| This paper proposes a new Error Concealment (EC) method for the H.264/AVC
[1] video coding standard using both spatial and temporal information for
intra-frame concealment. Five error concealment modes are offered by this
method. The proposed EC method also allows feedback from users: they can define
and change the thresholds for switching between the five different modes during
the error concealment procedure. As a result, the concealment result for a
video sequence can be optimized by taking advantage of relevant user feedback.
The concealed video quality has been rated by a group of users and compared
with the H.264 EC method without user feedback. The experimental results show
that the proposed new EC algorithm with user feedback performs better (a 3 dB
gain) than the H.264 EC without user feedback. Keywords: H.264; Error Concealment; User Feedback; Video Compression | |||
| Classification of a Person Picture and Scenery Picture Using Structured Simplicity | | BIBAK | Full-Text | 821-828 | |
| Myoung-Bum Chung; Il-Ju Ko | |||
| We can classify various images as either people pictures, if they contain
one or more persons, or scenery pictures, if they lack people, by using face
region detection. However, the precision of such classification is low if it
relies only on existing face region detection techniques. This paper proposes
an algorithm based on the structured simplicity of the picture to perform
classification with higher accuracy. To verify the usefulness of the proposed
method, we carried out a classification experiment using 500 people pictures
and scenery pictures. The experiment using only face region detection in OpenCV
showed an accuracy of 79%, while the experiment using face region detection
combined with structured simplicity showed an accuracy of 86.4%. Therefore, by
using structured simplicity with face region detection, we can perform an
efficient classification of person pictures and scenery pictures. Keywords: Face region detection; Picture classification; Structured simplicity | |||
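The structured-simplicity measure is specific to the paper and not reproduced here; the sketch below only shows the baseline step the abstract compares against, face region detection with OpenCV's bundled frontal-face Haar cascade. The decision rule (any detected face means "person picture") is an illustrative simplification.

```python
import cv2

def is_person_picture(image_path):
    """Baseline classification: a picture is a 'person picture' if the
    OpenCV frontal-face Haar cascade finds at least one face in it."""
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    gray = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5,
                                     minSize=(30, 30))
    return len(faces) > 0

# Example call (hypothetical file name):
# print(is_person_picture("holiday_photo.jpg"))
```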
| Designing Personalized Media Center with Focus on Ethical Issues of Privacy and Security | | BIBAK | Full-Text | 829-835 | |
| Alma Leora Culén; Yonggong Ren | |||
| While considering the development of interactive television (iTV), we also
need to consider new possibilities for personalization of its audio-video
content as well as ethical issues related to such personalization. While
offering immense possibilities for new ways of informing, communicating, gaming
as well as watching selected and personalized broadcasted content, doors also
open to misuse, manipulation and destructive behavior. Our goal is to propose
and analyze a user-centered prototype for iTV, while keeping in mind ethical
principles that we hope would lead to a positive experience of this forthcoming
technology. Keywords: interactive television; experience; ethics; privacy; multi-touch interface | |||
| Evaluation of VISTO: A New Vector Image Search TOol | | BIBAK | Full-Text | 836-845 | |
| Tania Di Mascio; Daniele Frigioni; Laura Tarantino | |||
| We present an experimental evaluation of VISTO (Vector Image Search TOol), a
new content-based image retrieval (CBIR) system that deals with vector images
in SVG (Scalable Vector Graphics) format, unlike most of the CBIR tools
available in the literature, which deal with raster images. The experimental
evaluation of retrieval systems is a critical part of the process of
continuously improving the existing retrieval metrics. While researchers in
text retrieval have long been using a sophisticated set of tools for user-based
evaluation, this does not yet apply to image retrieval. In this paper, we take
a step in this direction and present an experimental evaluation of VISTO in a
framework for the production of 2D animation. Keywords: Content Based Image Retrieval; vector images; SVG; evaluation | |||
| G-Tunes -- Physical Interaction Design of Playing Music | | BIBAK | Full-Text | 846-851 | |
| Jia Du; Ying Li | |||
| In this paper we present G-tunes, a music player that couples a tangible
interface with digital music. The design is based on research into tangible
interfaces and interaction engineering. We offer an overview of the design
concept, explain the prototyping, and discuss the results. One goal of this
project is to create rich experiences for people playing music; another goal is
to explore how external physical expressions relate to humans' inner perception
and emotion, and how we can couple this with the design of a tangible music
player. Keywords: Interaction design; Tangible interaction; Sensory perception; Music player;
Scale; Weight | |||
| nan0sphere: Location-Driven Fiction for Groups of Users | | BIBA | Full-Text | 852-861 | |
| Kevin Eustice; Venkatraman Ramakrishna; Alison Walker; Matthew Schnaider; Nam T. Nguyen; Peter L. Reiher | |||
| We developed a locative fiction application called nan0sphere and deployed it on the UCLA campus. This application presents an interactive narrative to users working in a group as they move around the campus. Based on each user's current location, previously visited locations, actions taken, and on the similar attributes of other users in the same group, the story will develop in different ways. Group members are encouraged by the story to move independently, with their individual actions and progress affecting the narrative and the overall group experience. Eight different locations on campus are involved in this story. Groups consist of four participants, and the complete story unfolds through the actions of all four group members. The supporting system could be used to create other similar types of locative literature, possibly augmented with multimedia, for other purposes and in other locations. We will discuss benefits and challenges of group interactions in locative fiction, infrastructure required to support such applications, issues of determining user locations, and our experiences using the application. | |||
| How Panoramic Photography Changed Multimedia Presentations in Tourism | | BIBAK | Full-Text | 862-871 | |
| Nelson Gonçalves | |||
| This paper gives an overview of the use of panoramic photography, the
panorama concept, and the evolution of presentation and multimedia projects
targeting tourism promotion. The purpose is to stress the importance of
panoramic pictures in the Portuguese design of multimedia systems for the
promotion of tourism. Through photography in on-line and off-line multimedia,
the user can go back in time and see what those landscapes were like in his/her
childhood, for example. Consequently, one of the additional quality options in
our productions is the diachronic view of the landscape. Keywords: Design; Multimedia; CD-ROM; DVD; Web; Photography; Panorama; Tourism;
Virtual Tour | |||
| Frame Segmentation Used MLP-Based X-Y Recursive for Mobile Cartoon Content | | BIBAK | Full-Text | 872-881 | |
| Eunjung Han; Kirak Kim; HwangKyu Yang; Keechul Jung | |||
| With the rapid growth of the mobile industry, the limitation of small mobile
screens is attracting many researchers' attention to transforming on/off-line
contents into mobile contents. Frame segmentation for limited mobile browsers
is the key point of off-line content transformation. The X-Y recursive cut
algorithm has been widely used for frame segmentation in document analysis.
However, this algorithm has drawbacks for cartoon images, which come in various
image types and contain noise, especially in cartoon contents obtained by
scanning; such noise makes it difficult for the X-Y recursive cut algorithm to
find the exact cutting point. In this paper, we propose a method to segment
on/off-line cartoon contents into frames fitted to the mobile screen. We
combine two concepts: an X-Y recursive cut algorithm, which performs well on
noise-free contents, to extract candidate segmentation positions, and
Multi-Layer Perceptrons (MLP) to verify the candidates. This combination
increases the accuracy of frame segmentation and is feasible to apply to
various off-line cartoon images with frames. Keywords: MLP; X-Y recursive; frame segmentation; mobile cartoon contents | |||
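A sketch of the plain X-Y recursive cut that produces the candidate frame rectangles described above; the MLP verification stage is omitted and the gap threshold is illustrative.

```python
import numpy as np

def _find_gap(profile, min_gap):
    """Return (start, end) of the first interior run of empty bins at least
    min_gap long, or None."""
    run_start = None
    for i, v in enumerate(profile):
        if v == 0:
            if run_start is None:
                run_start = i
            if (run_start > 0 and i < len(profile) - 1
                    and i - run_start + 1 >= min_gap):
                return run_start, i
        else:
            run_start = None
    return None

def xy_recursive_cut(page, min_gap=8, axis=0, offset=(0, 0), frames=None):
    """X-Y recursive cut: alternately split a binarized page (1 = ink) at
    horizontal/vertical white gaps; leaves are candidate frame rectangles
    (x, y, w, h).  The paper's MLP verification of candidates is not shown.
    """
    if frames is None:
        frames = []
    if page.size == 0:
        return frames
    profile = page.sum(axis=1) if axis == 0 else page.sum(axis=0)
    gap = _find_gap(profile, min_gap)
    if gap is None:
        if axis == 0:                    # no horizontal cut: try vertical once
            return xy_recursive_cut(page, min_gap, 1, offset, frames)
        y0, x0 = offset
        frames.append((x0, y0, page.shape[1], page.shape[0]))
        return frames
    a, b = gap
    if axis == 0:                        # split rows, recurse on columns next
        xy_recursive_cut(page[:a, :], min_gap, 1, offset, frames)
        xy_recursive_cut(page[b + 1:, :], min_gap, 1,
                         (offset[0] + b + 1, offset[1]), frames)
    else:                                # split columns, recurse on rows next
        xy_recursive_cut(page[:, :a], min_gap, 0, offset, frames)
        xy_recursive_cut(page[:, b + 1:], min_gap, 0,
                         (offset[0], offset[1] + b + 1), frames)
    return frames

# Toy page: two inked frames separated by a 12-pixel white band
page = np.zeros((100, 80), np.uint8)
page[5:40, 5:75] = 1
page[52:95, 5:75] = 1
print(xy_recursive_cut(page, min_gap=10))
```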
| Browsing and Sorting Digital Pictures Using Automatic Image Classification and Quality Analysis | | BIBAK | Full-Text | 882-891 | |
| Otmar Hilliges; Peter Kunath; Alexey Pryakhin; Andreas Butz; Hans-Peter Kriegel | |||
| In this paper we describe a new interface for browsing and sorting of
digital pictures. Our approach is two-fold. First we present a new method to
automatically identify similar images and rate them based on sharpness and
exposure quality of the images. Second we present a zoomable user interface
based on the details-on-demand paradigm enabling users to browse large
collections of digital images and select only the best images for further
processing or sharing. Keywords: Photoware; digital photography; image analysis; similarity measurement;
informed browsing; zoomable user interfaces; content based image retrieval | |||
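The abstract does not specify its sharpness and exposure measures, so the sketch below uses common proxies (variance of the Laplacian for sharpness, distance of the mean brightness from mid-grey for exposure) as one plausible reading, not the authors' method; the weights are arbitrary.

```python
import cv2
import numpy as np

def quality_score(image_bgr, w_sharp=1.0, w_expose=1.0):
    """Rate an image by sharpness and exposure (illustrative proxies only).

    Sharpness: variance of the Laplacian (higher = sharper edges).
    Exposure : penalty for mean brightness far from mid-grey, i.e. strongly
               under- or over-exposed images score lower.
    """
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    sharpness = cv2.Laplacian(gray, cv2.CV_64F).var()
    exposure_penalty = abs(gray.mean() - 128.0) / 128.0   # 0 (good) .. 1 (bad)
    return w_sharp * sharpness - w_expose * 100.0 * exposure_penalty

# Rank a burst of near-duplicate shots and keep the best one (hypothetical paths)
# paths = ["shot1.jpg", "shot2.jpg", "shot3.jpg"]
# best = max(paths, key=lambda p: quality_score(cv2.imread(p)))
```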
| A Usability Study on Personalized EPG (pEPG) UI of Digital TV | | BIBAK | Full-Text | 892-901 | |
| Myo Ha Kim; Sang Min Ko; Jae Seung Mun; Yong Gu Ji; Moon Ryul Jung | |||
| As the use of digital television (D-TV) has spread across the globe,
usability problems on D-TV have become an important issue. However, so far,
very little has been done in usability studies on D-TV. The aim of this study
is to develop evaluation methods for the user interface (UI) of a personalized
electronic program guide (pEPG) of D-TV and to evaluate the UI of a working
prototype of a pEPG using these methods. To do this, first, the structure of
the UI system and navigation for a working prototype of a pEPG was designed
considering the expanded channel range. Second, the evaluation principles used
as the usability method for the working prototype were developed. Third,
lab-based usability testing of the working prototype was conducted with these
evaluation principles. The usability problems found through usability testing
were then used to improve the UI of the working prototype of the pEPG. Keywords: Usability; User Interface (UI); Evaluation Principles; Personalized EPG
(pEPG); Digital Television (D-TV) | |||
| Recognizing Cultural Diversity in Digital Television User Interface Design | | BIBAK | Full-Text | 902-908 | |
| Joonhwan Kim; Sanghee Lee | |||
| Research trends in user interface design and human-computer interaction have
been shifting toward the consideration of use context. The reflection of
differences in users' cultural diversity is an important topic in the consumer
electronics design process, particularly for widely internationally sold
products. In the present study, the authors compared users' responses to
preference and performance to investigate the effect of different cultural
backgrounds. A high-definition display product with digital functions was
selected as a major digital product domain. Four user interface design concepts
were suggested, and user studies were conducted internationally with 57
participants in three major market countries. The tests included users'
subjective preferences on the suggested graphical designs, performances of the
on-screen display navigation, and feedback on newly suggested TV features. For
reliable analysis, both qualitative and quantitative data were measured. The
results reveal that responses to design preference were affected by
participants' cultural background. On the other hand, universal conflicts
between preference and performance were witnessed regardless of cultural
differences. This study indicates the necessity of user studies of cultural
differences and suggests an optimized level of localization in the example of
digital consumer electronics design. Keywords: User Interface Design; Cultural Diversity; Consumer Electronics; Digital
Television; Usability; Preference; Performance; International User Studies | |||
| A Study on User Satisfaction Evaluation About the Recommendation Techniques of a Personalized EPG System on Digital TV | | BIBAK | Full-Text | 909-917 | |
| Sang Min Ko; Yeon Jung Lee; Myo Ha Kim; Yong Gu Ji; Soo Won Lee | |||
| With the growing popularity of digital broadcasting, viewers have the chance
to watch various programs. However, they may have trouble choosing just one
among many programs. To solve this problem, various studies on EPGs and
personalized EPGs have been performed. In this study, we reviewed previous
studies on EPGs, personalized EPGs, and the results of recommendation
evaluations, and we evaluated the recommendations of a PEPG system implemented
as a working prototype. We collected preference information about categories
and channels from 30 subjects and carried out the evaluation through e-mail.
Recall and precision were calculated by analyzing the recommended programs from
an e-mail questionnaire, and an evaluation of subjective satisfaction was
conducted. As a result, we determined how well the result of the evaluation
reflects viewer satisfaction by comparing the variation in subjects'
satisfaction with the variation in the objective evaluation criteria. Keywords: EPG; PEPG; Satisfaction; Digital TV; DTV | |||
| Usability of Hybridmedia Services -- PC and Mobile Applications Compared | | BIBAK | Full-Text | 918-925 | |
| Jari Laarni; Liisa Lähteenmäki; Johanna Kuosmanen; Niklas Ravaja | |||
| The aim is to present results of a usability test of a prototype of a
context-based personalized hybridmedia service for delivering product-specific
information to consumers. We recorded participants' eye movements when they
used the service either with a camera phone or with the web browser of a PC.
The participants' task was to search for product-specific information from the
food product database and test calculators by using both a PC and mobile user
interface. Eye movements were measured by a head-mounted eye tracking system.
Even though the completion of the tasks took longer when the participants used
the mobile phone than when they used the PC, they could complete the tasks
successfully with both interfaces. Provided that the barcode tag was not very
small, taking pictures from the barcodes with a mobile phone was quite easy.
Overall, the use of the service via the mobile phone provides a quite good
alternative for the PC. Keywords: Hybridmedia; usability; eye tracking; barcode reading | |||
| m-YouTube Mobile UI: Video Selection Based on Social Influence | | BIBAK | Full-Text | 926-932 | |
| Aaron Marcus; Angel Perez | |||
| The ease-of-use of Web-based video-publishing services provided by
applications like YouTube has encouraged a new means of asynchronous
communication, in which users can post videos not only to make them public for
review and criticism, but also as a way to express moods, feelings, or
intentions to an ever-growing network of friends. Following the current trend
of porting Web applications onto mobile platforms, the authors sought to
explore user-interface design issues of a mobile-device-based YouTube, which
they call m-YouTube. They first analyzed the elements of success of the current
YouTube Web site and observed its functionality. Then, they looked for unsolved
issues that could give benefit through information-visualization design for
small screens on mobile phones to explore a mobile version of such a
product/service. The biggest challenge was to reduce the number of functions
and the amount of information to fit into a mobile phone screen while remaining
usable, useful, and appealing within the YouTube context of use and user
experience.
Borrowing ideas from social research in the area of social influence processes,
they made design decisions aiming to help YouTube users to make the decision of
what video content to watch and to increase the chances of YouTube authors
being evaluated and observed by peers. The paper proposes a means to visualize
large amounts of video relevant to YouTube users by using their friendship
network as a relevance indicator to help in the decision-making process. Keywords: design; interface; mobile; network; social; user; YouTube; video | |||
| Can Video Support City-Based Communities? | | BIBA | Full-Text | 933-942 | |
| Raquel Navarro-Prieto; Nidia Berbegal | |||
| The goal of our research has been to investigate the different ways in which new communication technologies, especially mobile multimedia communications, could support city-based communities. In this paper we review the research done on the effect of mobile technology, especially mobile video, on communities' communication patterns, and highlight the new challenges and gaps still not covered in this area. Finally, we describe how we have tried to respond to these challenges by using User Centered Design in two very different types of communities: women's associations and elderly people. | |||
| Watch, Press, and Catch -- Impact of Divided Attention on Requirements of Audiovisual Quality | | BIBAK | Full-Text | 943-952 | |
| Ulrich Reiter; Satu Jumisko-Pyykkö | |||
| Many of today's audiovisual application systems offer some kind of
interactivity. Yet, quality assessments of these systems are often performed
without taking into account the possible effects of divided attention caused by
interaction or user task. We present a subjective assessment performed among 40
test subjects to investigate the impact of divided attention on the perception
of audiovisual quality in interactive application systems. Test subjects were
asked to rate the overall perceived audiovisual quality in an interactive 3D
scene with varying degrees of interactive tasks to be performed by the
subjects. As a result we found that the experienced overall quality did not
vary with the degree of interaction. The results of our study make clear that,
when interactivity is offered in an audiovisual application, it is not
generally possible to lower the signal quality without perceptible effects. Keywords: audiovisual quality; subjective assessment; divided attention;
interactivity; task | |||
| Media Service Mediation Supporting Resident's Collaboration in ubiTV | | BIBA | Full-Text | 953-962 | |
| Choonsung Shin; Hyoseok Yoon; Woontack Woo | |||
| A smart home is an intelligent, shared space in which various services coexist and multiple residents with different preferences and habits share these services most of the time. Because space and time are shared, service conflicts may occur when multiple users try to access media services. In this paper, we propose a context-based mediation method, consisting of service mediators and mobile mediators, to resolve service conflicts in a smart home. The service mediators detect service conflicts among residents and recommend preferred media contents on a shared screen and on the residents' own mobile devices by exploiting users' preferences and service profiles. The mobile mediators collect the recommendation information and give users personal recommendations. By combining the service and mobile mediators, residents can negotiate media contents in a conflict situation. Based on experiments in the ubiHome, we observed that mediation is useful for encouraging discussion and helps users choose an appropriate service in a conflict situation. We therefore expect the proposed mediation method to play a vital role in resolving conflicts and providing multiple residents with harmonized services in a smart home environment. | |||
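The abstract above outlines conflict detection plus preference-based recommendation. The sketch below illustrates that general idea in Python; the function names, data shapes, and the simple additive scoring rule are assumptions for illustration, and the actual ubiTV service and mobile mediators are considerably richer.

```python
# Minimal sketch of preference-based conflict mediation for a shared
# media service. Names and the scoring rule are hypothetical.

def detect_conflict(requests):
    """A conflict occurs when residents ask the shared screen for
    different contents at the same time."""
    wanted = {r["content"] for r in requests}
    return len(wanted) > 1

def recommend(requests, preferences):
    """Score each requested content by summing the residents'
    preference ratings (0..1) and return a ranked list that the
    mobile mediators could present for negotiation."""
    scores = {}
    for r in requests:
        content = r["content"]
        scores[content] = sum(p.get(content, 0.0) for p in preferences.values())
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

requests = [{"user": "dad", "content": "news"},
            {"user": "kid", "content": "cartoon"}]
preferences = {"dad": {"news": 0.9, "cartoon": 0.2},
               "kid": {"news": 0.1, "cartoon": 0.8},
               "mom": {"news": 0.6, "cartoon": 0.5}}

if detect_conflict(requests):
    print(recommend(requests, preferences))   # ranked contents for negotiation
```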
| Implementation of a New H.264 Video Watermarking Algorithm with Usability Test | | BIBAK | Full-Text | 963-970 | |
| Mohd Afizi Mohd Shukran; Yuk Ying Chung; XiaoMing Chen | |||
| With the proliferation of digital multimedia content, issues of copyright
protection have become more important because the copying of digital video does
not result in the decrease in quality that occurs when analog video is copied.
One method of copyright protection is to embed a digital code, "watermark",
into the video sequence. The watermark can then unambiguously identify the
copyright holder of the video sequence. In this paper, we propose a new video
watermarking algorithm for H.264 coded video that takes usability factors into
account. Usability testing based on Human Computer Interface (HCI) concepts was
performed on the proposed approach; the tests were chosen to be representative
of most image manipulations and attacks. The proposed algorithm passed all of
the attack tests. The watermarking mechanism presented in this paper therefore
proves robust and efficient for protecting the copyright of H.264 coded video. Keywords: Video watermarking; H.264; Human Computer Interface (HCI) | |||
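The abstract does not describe the embedding scheme itself. For readers unfamiliar with video watermarking, the sketch below shows the generic spread-spectrum embed/detect idea (add a key-dependent pseudo-random pattern, detect it by correlation) on a single luminance frame. This is explicitly not the authors' H.264 algorithm, only an illustrative stand-in, and the parameter values are arbitrary assumptions.

```python
import numpy as np

# Illustrative spread-spectrum watermarking on one luminance frame.
# NOT the H.264 algorithm proposed in the paper.

def embed(frame, key, strength=2.0):
    rng = np.random.default_rng(key)
    pattern = rng.choice([-1.0, 1.0], size=frame.shape)
    marked = frame.astype(float) + strength * pattern
    return np.clip(marked, 0, 255).astype(np.uint8)

def detect(frame, key, threshold=0.5):
    rng = np.random.default_rng(key)
    pattern = rng.choice([-1.0, 1.0], size=frame.shape)
    centered = frame.astype(float) - frame.mean()
    correlation = (centered * pattern).mean()
    return correlation > threshold

frame = np.full((144, 176), 128, dtype=np.uint8)   # flat QCIF-sized frame
marked = embed(frame, key=1234)
print(detect(marked, key=1234))   # True: watermark detected
print(detect(frame,  key=1234))   # False: original frame
```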
| Innovative TV: From an Old Standard to a New Concept of Interactive TV -- An Italian Job | | BIBAK | Full-Text | 971-980 | |
| Rossana Simeoni; Linnea Etzler; Elena Guercio; Monica Perrero; Amon Rapp; Roberto Montanari; Francesco Tesauri | |||
| The current market of television services adopts several broadcast
technologies (e.g. IPTV, DVBH, DTT), delivering different ranges of contents.
These services may be extremely heterogeneous, but they are all affected by the
continuous increase in the quantity of content, a trend that is becoming more
and more complicated to manage. Hence, future television services must respond
to an emerging question: how can navigation through this increasing volume of
multimedia content be facilitated? To answer this question, a
research study was conducted, resulting in a set of guidelines for Interactive
TV development. At first, the current scenario was portrayed through a
functional analysis of existing TV systems and a survey of actual and potential
users. Subsequently, interaction models which could possibly be applied to
Interactive TV (e.g.: peer-to-peer programs) were assessed. Guidelines were
eventually defined as a synthesis of current best practices and new interactive
features. Keywords: Interactive TV; IPTV; enhanced TV; media consumers; peer-to-peer; focus
group; heuristic evaluation | |||
| Evaluating the Effectiveness of Digital Storytelling with Panoramic Images to Facilitate Experience Sharing | | BIBAK | Full-Text | 981-989 | |
| Zuraidah Sulaiman; Nor Laila Md. Noor; Narinderjit Singh; Suet Peng Yong | |||
| Technology advancement has now enabled experience sharing to happen in a
digital storytelling environment that is facilitated through different delivery
technologies such as panoramic images and virtual reality. However, panoramic
images have not been fully explored or formally studied, especially for
assisting experience sharing in a digital storytelling setting. This research
aims to study the effectiveness of interactive digital storytelling in
facilitating the sharing of experience. The interactive digital storytelling artifact was
developed to convey the look and feel of Universiti Teknologi PETRONAS through
the panoramic images. The effectiveness of digital storytelling through
panoramic images was empirically tested based on the adapted Delone and McLean
IS success model. The experiment was conducted on participants who have never
visited the university. Six hypotheses were derived, and the experiment showed
correlations between user satisfaction with digital storytelling using
panoramic images and the application's individual impact in assisting
experience sharing among users. Hence, this research concludes with a model for
producing effective digital storytelling with panoramic images that allows
specific experience sharing to flourish among users. Keywords: Digital storytelling; interactivity; panoramic images; experience sharing;
effective system; effectiveness study; human computer interaction | |||
| User-Centered Design and Evaluation of a Concurrent Voice Communication and Media Sharing Application | | BIBAK | Full-Text | 990-999 | |
| David Wheatley | |||
| This paper describes two user-centered studies undertaken in the development
of a concurrent group voice and media sharing application. The first used paper
prototyping to identify the user values relating to a number of functional
capabilities. These results informed the development of a prototype
application, which was ported to a 3G handset and evaluated in the second study
using a conjoint analysis approach. Results indicated that concurrent photo
sharing was of high user value, while the value of video sharing was limited by
established mental models of file sharing. Overall, higher ratings were found
among female subjects and among less technologically aware subjects, and most
media sharing would be with those who are close and trusted. These and other
results suggest that the reinforcement of social connections, spontaneity, and
emotional communications would be important user objectives of such a media
sharing application. Keywords: User centered design; wireless communications; concurrent media sharing;
cell-phone applications | |||
| Customer-Dependent Storytelling Tool with Authoring and Viewing Functions | | BIBAK | Full-Text | 1000-1009 | |
| Sunhee Won; Mi Young Choi; Gye-Young Kim; Hyung-Il Choi | |||
| Animation is the main content of digital storytelling, and it usually has a
fixed set of characters. We want a customer to appear in the animation as a
main character. For this purpose, we have developed a tool that automatically
implants the facial shape of a customer into existing animation images. Our
tool first takes an image of a customer and extracts a face region along with
features that depict the shape and facial expression of the customer. The tool
has a module that replaces the existing character's face with that of the
customer. This module employs facial expression recognition and warping
functions so that the customer's face fits into the confined region with a
similar facial expression. The tool also has a module that plays the sequence
of images as an animation. This module employs a data compression function,
produces AVI format files, and sends them to the graphics board. Keywords: facial expression recognition; warping functions | |||
| Reliable Partner System Always Providing Users with Companionship Through Video Streaming | | BIBAK | Full-Text | 1010-1018 | |
| Takumi Yamaguchi; Kazunori Shimamura; Haruya Shiba | |||
| This paper presents a basic configuration of a system that provides dynamic
delivery of full-motion video while following target users in ubiquitous
computing environments. The proposed system is composed of multiple computer
displays with radio frequency identification (RFID) tag readers, which are
automatically connected to a network via IP, RFID tags worn by users, and
several network servers. We adopted a passive RFID tag system. The delivery of
full-motion video uses adaptive broadcasting. The system can continuously
deliver streaming data, such as full-motion video, to the display, through the
database and the streaming server on the network, moving from one display to
the next as the user moves through the network. Because the system maintains
the user's location information in real time, it supports the user wherever he
or she is, without requiring a conscious request for the information. This
paper describes a prototype implementation of this framework
and a practical application. Keywords: Ubiquitous; Partner system; Video streaming; Awareness | |||
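The "follow-me" behaviour described above, in which the video stream is handed over to whichever display last read the user's RFID tag, can be summarized with a toy sketch. The StreamingServer class and all identifiers below are hypothetical placeholders, not the paper's actual components.

```python
# Toy sketch of follow-me handoff: each RFID read reports
# (tag_id, display_id); when a user's location changes, the stream
# is redirected to the display where the tag was last seen.

class StreamingServer:
    def __init__(self):
        self.targets = {}                     # tag_id -> display_id

    def redirect(self, tag_id, display_id):
        print(f"stream for {tag_id} -> display {display_id}")
        self.targets[tag_id] = display_id

def handle_tag_read(server, location_db, tag_id, display_id):
    """Called whenever a display's RFID reader detects a tag."""
    previous = location_db.get(tag_id)
    location_db[tag_id] = display_id          # keep location in real time
    if previous != display_id:                # user moved: hand the stream over
        server.redirect(tag_id, display_id)

server, locations = StreamingServer(), {}
for event in [("user42", "kitchen"), ("user42", "kitchen"), ("user42", "hallway")]:
    handle_tag_read(server, locations, *event)
# prints a redirect to "kitchen", then to "hallway"
```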
| Modeling of Places Based on Feature Distribution | | BIBA | Full-Text | 1019-1027 | |
| Yi Hu; Chang Woo Lee; Jong Yeol Yang; Bum Joo Shin | |||
| In this paper, a place model based on a feature distribution is proposed for place recognition. In many previously proposed methods, places are modeled as images or as sets of extracted features, so a database of images or feature sets must be built, and the search time grows exponentially as the database becomes large. The proposed feature distribution method uses global information about each place, and the search space grows linearly with the number of places. In the experiments, we evaluate the performance using different numbers of frames and features for each recognition. Additionally, we show that the proposed method is applicable to many real-time applications such as robot navigation and wearable computing systems. | |||
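The entry above contrasts storing per-image feature sets with modeling each place as a single global feature distribution. The Python sketch below illustrates that general idea with quantized feature histograms compared by L1 distance; the projection used for quantization, the bin count, and the distance measure are assumptions for illustration, not the paper's actual model.

```python
import numpy as np

N_BINS = 64

def feature_histogram(descriptors, lo=-5.0, hi=5.0):
    """Project each local descriptor to its mean value, histogram the
    projections over a fixed range, and normalize to a distribution."""
    proj = np.asarray(descriptors).mean(axis=1)
    hist, _ = np.histogram(proj, bins=N_BINS, range=(lo, hi))
    hist = hist.astype(float) + 1e-9
    return hist / hist.sum()

def recognize(frame_descriptors, place_models):
    """Return the place whose stored distribution is closest (L1), so
    search cost grows with the number of places, not stored images."""
    query = feature_histogram(frame_descriptors)
    return min(place_models,
               key=lambda name: np.abs(place_models[name] - query).sum())

rng = np.random.default_rng(0)
place_models = {
    "corridor": feature_histogram(rng.normal(0.0, 1.0, (500, 16))),
    "office":   feature_histogram(rng.normal(2.0, 1.0, (500, 16))),
}
frame = rng.normal(2.0, 1.0, (80, 16))       # descriptors from a new frame
print(recognize(frame, place_models))        # likely "office"
```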
| Knowledge Transfer in Semi-automatic Image Interpretation | | BIBAK | Full-Text | 1028-1034 | |
| Jun Zhou; Li Cheng; Terry Caelli; Walter F. Bischof | |||
| Semi-automatic image interpretation systems utilize interactions between
users and computers to adapt and update interpretation algorithms. We have
studied the influence of human inputs on image interpretation by examining
several knowledge transfer models. Experimental results show that the quality
of the system performance depended not only on the knowledge transfer patterns
but also on the user input, indicating how important it is to develop
user-adapted image interpretation systems. Keywords: knowledge transfer; image interpretation; road tracking; human influence;
performance evaluation | |||