Speech-based Interaction: Myths, Challenges, and Opportunities
Course Overviews
/
Munteanu, Cosmin
/
Penn, Gerald
Extended Abstracts of the ACM CHI'16 Conference on Human Factors in
Computing Systems
2016-05-07
v.2
p.992-995
© Copyright 2016 ACM
Summary: HCI research has long been dedicated to better and more naturally
facilitating information transfer between humans and machines. Unfortunately,
humans' most natural form of communication, speech, is also one of the most
difficult modalities for machines to understand -- despite, and perhaps
because, it is the highest-bandwidth communication channel we possess. While
significant research efforts, from engineering to linguistics to the cognitive
sciences, have been spent on improving machines' ability to understand speech,
the CHI community (and the HCI field at large) has been relatively timid in
embracing this modality as a central focus of research. This can be attributed
in part to the relatively discouraging levels of accuracy in understanding
speech, in contrast with often-unfounded claims of success from industry, but
also to the intrinsic difficulty of designing and especially evaluating speech
and natural language interfaces. As such, the development of interactive
speech-based systems is mostly driven by engineering efforts to improve such
systems with respect to largely arbitrary performance metrics. Such
developments have often been devoid of any user-centered design principles or
consideration for usability or usefulness. The goal of this course is to inform
the CHI community of the current state of speech and natural language research,
to dispel some of the myths surrounding speech-based interaction, as well as to
provide an opportunity for researchers and practitioners to learn more about
how speech recognition and speech synthesis work, what their limitations are,
and how they could be used to enhance current interaction paradigms. Through
this, we hope that HCI researchers and practitioners will learn how to combine
recent advances in speech processing with user-centered principles to design
more usable and useful speech-based interactive systems.