Human-Centred Machine Learning
Workshop Summaries
Gillies, Marco / Fiebrink, Rebecca / Tanaka, Atau / Garcia, Jérémie / Bevilacqua, Frédéric / Heloir, Alexis / Nunnari, Fabrizio / Mackay, Wendy / Amershi, Saleema / Lee, Bongshin / d'Alessandro, Nicolas / Tilmanne, Joëlle / Kulesza, Todd / Caramiaux, Baptiste
Extended Abstracts of the ACM CHI'16 Conference on Human Factors in
Computing Systems
2016-05-07
v.2
p.3558-3565
© Copyright 2016 ACM
Summary: Machine learning is one of the most important and successful techniques in
contemporary computer science. It involves the statistical inference of models
(such as classifiers) from data. It is often conceived in a very impersonal
way, with algorithms working autonomously on passively collected data. However,
this viewpoint hides considerable human work of tuning the algorithms,
gathering the data, and even deciding what should be modeled in the first
place. Examining machine learning from a human-centered perspective includes
explicitly recognising this human work, as well as reframing machine learning
workflows based on situated human working practices, and exploring the
co-adaptation of humans and systems. A human-centered understanding of machine
learning in human contexts can lead not only to more usable machine learning
tools, but also to new ways of framing learning computationally. This workshop will
bring together researchers to discuss these issues and suggest future research
questions aimed at creating a human-centered approach to machine learning.
Designing speech and language interactions
Workshop summaries
Munteanu, Cosmin / Jones, Matt / Whittaker, Steve / Oviatt, Sharon / Aylett, Matthew / Penn, Gerald / Brewster, Stephen / d'Alessandro, Nicolas
Proceedings of ACM CHI 2014 Conference on Human Factors in Computing Systems
2014-04-26
v.2
p.75-78
© Copyright 2014 ACM
Summary: Speech and natural language remain our most natural forms of interaction,
yet the HCI community has been very timid about focusing its attention on
designing and developing spoken language interaction techniques. While
significant effort is spent, and progress made, in speech recognition,
synthesis, and natural language processing, there is now sufficient evidence
that many real-life applications using speech technologies do not require 100%
accuracy to be useful. This is particularly true if such systems are designed
with complementary modalities that better support their users or enhance the
systems' usability. Engaging the CHI community now is timely -- many recent
commercial applications, especially in the mobile space, are already tapping
the increased interest in and need for natural user interfaces (NUIs) by
enabling speech interaction in their products. This multidisciplinary, one-day
workshop will bring together interaction designers, usability researchers, and
general HCI practitioners to analyze the opportunities and directions to take
in designing more natural interactions based on spoken language, and to look at
how we can leverage recent advances in speech processing in order to gain
widespread acceptance of speech and natural language interaction.
MAGEFACE: Performative Conversion of Facial Characteristics into Speech
Synthesis Parameters
Technologies for Live Entertainment
d'Alessandro, Nicolas / Astrinaki, Maria / Dutoit, Thierry
Proceedings of the 2013 International Conference on INtelligent TEchnologies
for interactive enterTAINment
2013-07-03
p.179-182
Keywords: speech synthesis; software library; performative media; streaming
architecture; HTS; MAGE; realtime audio software; face tracking; mapping
© Copyright 2013 Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
Summary: In this paper, we illustrate the use of the MAGE performative speech
synthesizer through its application to the conversion of realtime-measured
facial features with FaceOSC into speech synthesis features such as vocal tract
shape or intonation. MAGE is a new software library for using HMM-based speech
synthesis in reactive programming environments. MAGE uses a rewritten version
of the HTS engine enabling the computation of speech audio samples on a
two-label window instead of the whole sentence. It is this feature that makes
the realtime mapping of facial attributes to synthesis parameters possible.
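As a rough sketch of the mapping this abstract describes, the following Python fragment receives FaceOSC face-tracking messages and forwards scaled control values to a synthesizer over OSC. It assumes the python-osc package and FaceOSC's usual /gesture/* addresses and default port 8338; the /mage/* addresses and scaling constants are hypothetical, not MAGE's actual control interface.

    # Hedged sketch: FaceOSC features in, synthesis control messages out.
    from pythonosc.dispatcher import Dispatcher
    from pythonosc.osc_server import BlockingOSCUDPServer
    from pythonosc.udp_client import SimpleUDPClient

    synth = SimpleUDPClient("127.0.0.1", 9000)  # hypothetical synth port

    def on_mouth_height(address, height):
        # Map mouth opening onto a vocal-tract-shape control in [0, 1].
        synth.send_message("/mage/tract/openness",
                           min(max(height / 10.0, 0.0), 1.0))

    def on_eyebrow(address, amount):
        # Map eyebrow raise onto an intonation offset (semitones).
        synth.send_message("/mage/pitch/offset", amount * 2.0)

    dispatcher = Dispatcher()
    dispatcher.map("/gesture/mouth/height", on_mouth_height)
    dispatcher.map("/gesture/eyebrow/left", on_eyebrow)
    BlockingOSCUDPServer(("127.0.0.1", 8338), dispatcher).serve_forever()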
MAGE 2.0: New Features and its Application in the Development of a Talking
Guitar
Session 12: Augmented Instrument
Astrinaki, Maria / d'Alessandro, Nicolas / Reboursière, Loïc / Moinet, Alexis / Dutoit, Thierry
NIME 2013: New Interfaces for Musical Expression
2013-05-27
p.38
Keywords: speech synthesis, augmented guitar, hexaphonic guitar
© Copyright 2013 Authors
Summary: This paper describes recent progress in our approach to generating
performative and controllable speech. The goal of Mage, our performative
HMM-based speech and singing synthesis library, is to generate natural-sounding
speech with arbitrary speaker voice characteristics, speaking styles and
expressions, while at the same time giving the user accurate, reactive control
over all the available production levels. Mage allows the user to arbitrarily
switch between voices, control speaking style or vocal identity, manipulate
voice characteristics or alter the targeted context on-the-fly, all while
maintaining the naturalness and intelligibility of the output. Achieving these
controls required redesigning and improving the initial library. This paper
focuses on the improvements to the architectural design and the additional user
controls, and provides an overview of a prototype in which a guitar is used to
reactively control the generation of a synthetic voice at various levels.
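One way to picture the guitar-to-voice control loop described here, sketched below in Python: estimate the pitch of an incoming guitar frame by autocorrelation and forward it as the synthetic voice's fundamental frequency. The /voice/f0 address is invented for illustration; the paper's hexaphonic setup would run one such analysis per string.

    # Illustrative only: autocorrelation pitch tracking driving a voice synth.
    import numpy as np
    from pythonosc.udp_client import SimpleUDPClient

    SR = 44100
    synth = SimpleUDPClient("127.0.0.1", 9000)

    def estimate_f0(frame, fmin=60.0, fmax=1000.0):
        frame = frame - frame.mean()
        corr = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
        lo, hi = int(SR / fmax), int(SR / fmin)
        return SR / (lo + int(np.argmax(corr[lo:hi])))

    def on_audio_frame(frame):
        # Called with successive frames of guitar audio (e.g. 2048 samples).
        if np.sqrt(np.mean(frame ** 2)) > 0.01:  # only while a string sounds
            synth.send_message("/voice/f0", float(estimate_f0(frame)))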
MAGE 2.0: New Features and its Application in the Development of a Talking
Guitar
Demos (2)
Astrinaki, Maria / d'Alessandro, Nicolas / Reboursière, Loïc / Moinet, Alexis / Dutoit, Thierry
NIME 2013: New Interfaces for Musical Expression
2013-05-27
p.100
PENny: An Extremely Low-Cost Pressure-Sensitive Stylus for Existing
Capacitive Touchscreens
Demos (3)
Wang, Johnty / d'Alessandro, Nicolas / Pon, Aura / Fels, Sidney
NIME 2013: New Interfaces for Musical Expression
2013-05-27
p.127
Keywords: input interfaces, touch screens, tablets, pressure-sensitive, low-cost
© Copyright 2013 Authors
Summary: By building a wired passive stylus we have added pressure sensitivity to
existing capacitive touch screen devices for less than $10 in materials, about
1/10th the cost of existing solutions. The stylus makes use of the built-in
audio interface that is available on most smartphones and tablets on the market
today. Limitations of the device include the physical constraint of wires, the
occupation of one audio input and output channel, and added latency of at least
one audio buffer duration. The stylus has been demonstrated in two NIME
applications thus far: a visual musical score drawing application and a singing
synthesis application.
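The abstract does not spell out the sensing chain, but the core idea of pressure sensing through the audio interface can be sketched as follows, assuming the stylus attenuates a tone looped from the audio output back into the input in proportion to tip pressure. Uses the Python sounddevice package; the carrier frequency and calibration constants are invented.

    # Sketch under stated assumptions: output a carrier tone, read the
    # returned amplitude, and treat its level as a pressure estimate.
    import numpy as np
    import sounddevice as sd

    SR = 44100

    def callback(indata, outdata, frames, time, status):
        t = (callback.n + np.arange(frames)) / SR
        outdata[:, 0] = 0.5 * np.sin(2 * np.pi * 1000.0 * t)  # carrier tone
        callback.n += frames
        rms = float(np.sqrt(np.mean(indata[:, 0] ** 2)))
        pressure = min(max((rms - 0.02) / 0.3, 0.0), 1.0)     # toy calibration
        print(f"pressure ~ {pressure:.2f}")

    callback.n = 0
    with sd.Stream(samplerate=SR, channels=1, callback=callback):
        sd.sleep(10_000)  # run for ten seconds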
Investigation of Gesture Controlled Articulatory Vocal Synthesizer using a
Bio-Mechanical Mapping Layer
Paper Session I (Actuation and Visualization)
Wang, Johnty / d'Alessandro, Nicolas / Fels, Sidney / Pritchard, Robert
NIME 2012: New Interfaces for Musical Expression
2012-05-21
p.291
Keywords: Gesture, Mapping, Articulatory, Speech, Singing, Synthesis
© Copyright 2012 Authors
Summary: We have added a dynamic bio-mechanical mapping layer to a real-time
gesture-controlled voice synthesizer system used for musical performance and
speech research. The layer contains a model of the human vocal tract that takes
tongue muscle activations as input and produces tract geometry as output. Using
this mapping layer, we conducted user studies comparing control of the model's
muscle activations via a 2D set of force sensors against a position-controlled
kinematic input space that maps directly to the sound. Preliminary user
evaluation suggests that force input was more difficult to use, but that the
resulting output sound was more intelligible and natural than with the
kinematic controller. This result shows that force input is potentially
feasible for browsing through a vowel space in an articulatory voice synthesis
system, although further evaluation is required.
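To make the two-stage mapping concrete, here is a deliberately toy Python stub: sensor readings stand in for tongue muscle activations, a fixed linear map stands in for the biomechanical vocal-tract model, and the resulting tract geometry is reduced to formant targets. All numbers are invented; the paper's actual layer uses a physical simulation, not this linear placeholder.

    # Toy stand-in for: muscle activations -> tract geometry -> formants.
    import numpy as np

    def muscles_to_geometry(activations):
        # Placeholder biomechanical model: 4 activations -> 8-section
        # area function (cm^2) around a neutral tract shape.
        W = np.random.default_rng(0).uniform(-0.5, 0.5, size=(8, 4))
        return np.clip(np.full(8, 3.0) + W @ activations, 0.3, 6.0)

    def geometry_to_formants(areas):
        # Crude reduction: constriction degree shifts F1, location shifts F2.
        i = int(np.argmin(areas))
        f1 = 800.0 - 60.0 * (3.0 - areas[i])
        f2 = 1200.0 + 120.0 * (i - len(areas) / 2)
        return f1, f2

    print(geometry_to_formants(muscles_to_geometry(np.array([0.2, 0.8, 0.1, 0.4]))))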
MAGE -- A Platform for Tangible Speech Synthesis
Posters+Demos
Astrinaki, Maria / d'Alessandro, Nicolas / Dutoit, Thierry
NIME 2012: New Interfaces for Musical Expression
2012-05-21
p.164
Keywords: speech synthesis, Hidden Markov Models, tangible interaction, software
library, MAGE, HTS, performative
© Copyright 2012 Authors
Summary: In this paper, we describe our pioneering work in developing speech
synthesis beyond the Text-To-Speech paradigm. We introduce tangible speech
synthesis as an alternate way of envisioning how artificial speech content can
be produced. Tangible speech synthesis refers to the ability, for a given
system, to provide some physicality and interactivity to important speech
production parameters. We present MAGE, our new software platform for
high-quality reactive speech synthesis, based on statistical parametric
modeling, more particularly hidden Markov models. We also introduce a new
HandSketch-based musical instrument. This instrument brings pen- and
posture-based interaction on top of MAGE, and demonstrates a first proof of
concept.
A Digital Mobile Choir: Joining Two Interfaces towards Composing and
Performing Collaborative Mobile Music
Posters+Demos
d'Alessandro, Nicolas / Pon, Aura / Wang, Johnty / Eagle, David / Sharlin, Ehud / Fels, Sidney
NIME 2012: New Interfaces for Musical Expression
2012-05-21
p.310
Keywords: singing synthesis, mobile music, interactive display, interface design, OSC,
ChoirMob, Vuzik, social music, choir
© Copyright 2012 Authors
Summary: We present the integration of two musical interfaces into a new music-making
system that seeks to capture the experience of a choir and bring it into the
mobile space. This system relies on three pervasive technologies that each
support a different part of the musical experience: ChoirMob, a mobile
application for performing with an artificial voice; Vuzik, a central composing
and conducting application running on a local interactive display; and a
network protocol that synchronizes the two.
ChoirMob musicians can perform music together at any location where they can
connect to a Vuzik central conducting device displaying a composed piece of
music. We explored this system by creating a chamber choir of ChoirMob
performers, consisting of both experienced musicians and novices, that
performed in rehearsals and live concert scenarios with music composed using
the Vuzik interface.
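The abstract leaves the wire format unspecified; a minimal sketch of the conductor-to-performer link, assuming OSC over UDP with invented /choir/* message names, might look like this in Python (python-osc):

    # Conductor side: push the next note of the piece to one performer.
    from pythonosc.udp_client import SimpleUDPClient
    from pythonosc.dispatcher import Dispatcher
    from pythonosc.osc_server import BlockingOSCUDPServer

    def send_note(performer_ip, pitch_hz, vowel, duration_s):
        SimpleUDPClient(performer_ip, 9001).send_message(
            "/choir/note", [pitch_hz, vowel, duration_s])

    # Performer side: receive notes and hand them to the local voice synth.
    def on_note(address, pitch_hz, vowel, duration_s):
        print(f"sing {vowel} at {pitch_hz} Hz for {duration_s} s")

    dispatcher = Dispatcher()
    dispatcher.map("/choir/note", on_note)
    BlockingOSCUDPServer(("0.0.0.0", 9001), dispatcher).serve_forever()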
ROOM #81 -- Agent-Based Instrument for Experiencing Architectural and Vocal
Cues
d'Alessandro, Nicolas / Calderon, Roberto / Müller, Stefanie
NIME 2011: New Interfaces for Musical Expression
2011-05-30
p.132-135
Keywords: Installation, instrument, architecture, interactive fabric, motion, light,
voice synthesis, agent, collaboration
© Copyright 2011 Authors
Summary: ROOM#81 is a digital art installation which explores how visitors can
interact with architectural and vocal cues to intimately collaborate. The main
space is split into two distinct areas separated by a soft wall, i.e. a large
piece of fabric tensed vertically. Movement within these spaces and interaction
with the soft wall are captured by various kinds of sensors. An agent
constantly uses people's activity to predict their actions, and applies machine
learning to incrementally modify the nature of the light in the room and some
laryngeal aspects of the synthesized vocal spasms. The combination of people
collaborating closely, light changes and vocal responses creates an intimate
experience of touch, space and sound.
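As a minimal sketch of an incrementally adapting agent in this spirit (the update rule and parameter names are illustrative, not the installation's actual agent):

    # Online update: track expected visitor activity, drive light and voice.
    class RoomAgent:
        def __init__(self, rate=0.05):
            self.rate = rate      # learning rate of the incremental update
            self.expected = 0.0   # predicted activity level in [0, 1]

        def observe(self, activity):
            # Move the prediction toward each new observation.
            self.expected += self.rate * (activity - self.expected)

        def outputs(self):
            # More predicted activity -> brighter light, tenser larynx.
            return {"light": self.expected,
                    "vocal_tension": 0.3 + 0.7 * self.expected}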
SQUEEZY: Extending a Multi-touch Screen with Force Sensing Objects for
Controlling Articulatory Synthesis
Wang, Johnty / d'Alessandro, Nicolas / Fels, Sidney S. / Pritchard, Bob
NIME 2011: New Interfaces for Musical Expression
2011-05-30
p.531-532
Keywords: Musical controllers, tangible interfaces, force sensor, multitouch, voice
synthesis.
© Copyright 2011 Authors
Summary: This paper describes Squeezy: a low-cost, tangible input device that adds
multi-dimensional input to capacitive multi-touch tablet devices. Force input
is implemented through force sensing resistors mounted on a rubber ball, which
also provides passive haptic feedback. A microcontroller samples and transmits
the measured pressure information. Conductive fabric attached to the finger
contact area translates the touch to the bottom of the ball, which allows the
touchscreen to detect its position and orientation. The addition of a tangible,
pressure-sensitive input to a portable multimedia device opens up new
possibilities for expressive musical interfaces and Squeezy is used as a
controller for real-time gesture controlled voice synthesis research.
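Host-side handling of the transmitted force data can be sketched as follows, assuming the microcontroller streams one 10-bit ADC reading per line over a serial port (pyserial; the port name, baud rate and line format are assumptions):

    # Read streamed force values and normalize them to [0, 1].
    import serial

    with serial.Serial("/dev/ttyUSB0", 115200, timeout=1) as port:
        while True:
            line = port.readline().decode("ascii", errors="ignore").strip()
            if not line:
                continue
            pressure = int(line) / 1023.0  # 10-bit ADC -> [0, 1]
            # ...combine with the touchscreen's (x, y, orientation) here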
Ubiquitous voice synthesis: interactive manipulation of speech and singing
on mobile distributed platforms
Interactivity 1
d'Alessandro, Nicolas / Pritchard, Robert / Wang, Johnty / Fels, Sidney
Proceedings of ACM CHI 2011 Conference on Human Factors in Computing Systems
2011-05-07
v.2
p.335-340
© Copyright 2011 ACM
Summary: Vocal production is one of the most ubiquitous and expressive human
activities, yet understanding and synthesizing it remains elusive.
vocal synthesis is elevated to include new forms of singing and sound
production, fundamental changes to culture and musical expression emerge.
Nowadays, Text-To-Speech (TTS) synthesis seems unable to suggest innovative
solutions for new computing trends, such as mobility, interactivity, ubiquitous
computing or expressive manipulation. In this paper, we describe our pioneering
work in developing interactive voice synthesis beyond the TTS paradigm. We
present DiVA and HandSketch as our two current voice-based digital musical
instruments. We then discuss the evolution of this performance practice into a
new ubiquitous model applied to voice synthesis, and we describe our first
prototype using a mobile phone and wireless embodied devices in order to allow
a group of users to collaboratively produce voice synthesis in real-time.
Performance: what does a body know?
Interactivity special performances
Pritchard, Bob / Fels, Sid / d'Alessandro, Nicolas / Witvoet, Marguerite / Wang, Johnty / Hassall, Cameron / Day-Fraser, Helene / Cadell, Meryn
Proceedings of ACM CHI 2011 Conference on Human Factors in Computing Systems
2011-05-07
v.2
p.2403-2407
© Copyright 2011 ACM
Summary: What Does A Body Know? is a concert work for Digital Ventriloquized Actor
(DiVA) and sound clips. A DiVA is a real-time, gesture-controlled,
formant-based speech synthesizer using a Cyberglove®, a touch glove, and a
Polhemus Tracker® as its main interfaces. When used in conjunction with the
performer's own voice, solos and "duets" can be performed in real time.
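To make "formant-based speech synthesizer" concrete, here is a bare-bones Python sketch of the technique: a glottal impulse train filtered by a cascade of two-pole resonators at vowel formant frequencies (values roughly approximate an /a/). A DiVA maps glove gestures onto such parameters continuously; this static fragment only illustrates the synthesis principle.

    # Formant synthesis in miniature: impulse train through resonators.
    import numpy as np
    from scipy.signal import lfilter

    SR, F0, DUR = 16000, 120.0, 1.0
    n = int(SR * DUR)
    source = np.zeros(n)
    source[::int(SR / F0)] = 1.0   # glottal impulse train at F0

    signal = source
    for freq, bw in [(700, 130), (1220, 70), (2600, 160)]:  # ~ /a/ formants
        r = np.exp(-np.pi * bw / SR)
        a = [1.0, -2.0 * r * np.cos(2.0 * np.pi * freq / SR), r * r]
        signal = lfilter([1.0 - r], a, signal)

    signal /= np.abs(signal).max()  # normalize before playback or writing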
Advanced Techniques for Vertical Tablet Playing: An Overview of Two Years of
Practicing the HandSketch 1.x
/
d'Alessandro, Nicolas
/
Dutoit, Thierry
NIME 2009: New Interfaces for Musical Expression
2009-06-04
p.173-174
© Copyright 2009 Authors
HandSketch Bi-Manual Controller: Investigation on Expressive Control Issues
of an Augmented Tablet
/
d'Alessandro, Nicolas
/
Dutoit, Thierry
NIME 2007: New Interfaces for Musical Expression
2007-06-06
p.78-81
© Copyright 2007 Authors
Real-time CALM Synthesizer: New Approaches in Hands-Controlled Voice
Synthesis
Paper Session 5: Brain, Hands, and Expression
d'Alessandro, Nicolas / d'Alessandro, Christophe / Le Beux, Sylvain / Doval, Boris
NIME 2006: New Interfaces for Musical Expression
2006-06-04
p.266-271
© Copyright 2006 Authors