HCI Bibliography: Search Results
Database updated: 2016-05-10 Searches since 2006-12-01: 32,346,849
Hosted by ACM SIGCHI
The HCI Bibliography was moved to a new server 2015-05-12 and again 2016-01-05, substantially degrading the environment for making updates.
There are no plans to add to the database.
Please send questions or comments to director@hcibib.org.
Query: Lasecki_W* Results: 40 Sorted by: Date
Records: 1 to 25 of 40
Towards Providing On-Demand Expert Support for Software Developers Software and Programming Tools / Chen, Yan / Oney, Steve / Lasecki, Walter S. Proceedings of the ACM CHI'16 Conference on Human Factors in Computing Systems 2016-05-07 v.1 p.3192-3203
ACM Digital Library Link
Summary: Software development is an expert task that requires complex reasoning and the ability to recall language or API-specific details. In practice, developers often seek support from IDE tools, Web resources, or other developers to help fill in gaps in their knowledge on-demand. In this paper, we present two studies that seek to inform the design of future systems that use remote experts to support developers on demand. The first explores what types of questions developers would ask a hypothetical assistant capable of answering any question they pose. The second study explores the interactions between developers and remote experts in supporting roles. Our results suggest eight key system features needed for on-demand remote developer assistants to be effective, which has implications for future human-powered development tools.

Coding Varied Behavior Types Using the Crowd Demos / Yim, Jinyeong / Jasani, Jeel / Henderson, Aubrey / Koutra, Danai / Dow, Steven / Leung, Winnie / Lim, Ellen / Gordon, Mitchell / Bigham, Jeffrey / Lasecki, Walter Companion Proceedings of ACM CSCW 2016 Conference on Computer-Supported Cooperative Work and Social Computing 2016-02-27 v.2 p.114-117
ACM Digital Library Link
Summary: Social science researchers spend significant time annotating behavioral events in video data in order to quantitatively assess interactions [2]. These behavioral events may be instantaneous changes, continuous actions that span unbounded periods of time, or behaviors that would be best described by severity or other scalar ratings. The complexity of these judgments, coupled with the time and effort required to meticulously assess video, results in a training and evaluation process that can take days or weeks. Computational analysis of video data is still limited due to the challenges introduced by objective interpretation and varied contexts. Glance [4] introduced a means of leveraging human intelligence by recruiting crowds of paid online workers to accurately analyze hours of video data in a matter of minutes. This approach has been shown to expedite work in human-centered fields, as well as generate training data for automated recognition systems. In this paper, we describe an interactive demonstration of an improved, more expressive version of Glance that expands the initial set of supported annotation formats (e.g. time range, classification, etc.) from one to nine. Worker interfaces for each of these options are dynamically generated, along with tutorials, based on the analyst's question. These new features allow analysts to acquire more specific information about events in video datasets.

Measuring text simplification with the crowd Human computation / Lasecki, Walter S. / Rello, Luz / Bigham, Jeffrey P. Proceedings of the 2015 International Cross-Disciplinary Conference on Web Accessibility (W4A) 2015-05-18 p.4
ACM Digital Library Link
Summary: Text can often be complex and difficult to read, especially for people with cognitive impairments or low literacy skills. Text simplification is a process that reduces the complexity of both wording and structure in a sentence, while retaining its meaning. However, this is currently a challenging task for machines, and thus, providing effective on-demand text simplification to those who need it remains an unsolved problem. Even evaluating the simplicity of text remains a challenging problem for both computers, which cannot understand the meaning of text, and humans, who often struggle to agree on what constitutes a good simplification.
    This paper focuses on the evaluation of English text simplification using the crowd. We show that leveraging crowds can result in a collective decision that is accurate and converges to a consensus rating. Our results from 2,500 crowd annotations show that the crowd can effectively rate levels of simplicity. This may allow simplification systems and system builders to get better feedback about how well content is being simplified, as compared to standard measures which classify content into 'simplified' or 'not simplified' categories. Our study provides evidence that the crowd could be used to evaluate English text simplification, as well as to create simplified text in future work.
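The paper does not detail its aggregation scheme here, but one simple way to turn independent worker judgments into a collective rating, as the abstract describes, is a robust central tendency such as the median. A minimal sketch (hypothetical function name, assuming Likert-style simplicity ratings):

```python
from statistics import median

def consensus_rating(ratings):
    """Aggregate independent worker ratings into a collective score.

    The median is robust to a minority of outlier judgments, so a few
    careless workers do not drag the consensus away from the majority.
    """
    if not ratings:
        raise ValueError("need at least one rating")
    return median(ratings)

# Five workers rate a simplified sentence on a 1-5 simplicity scale.
print(consensus_rating([4, 4, 5, 3, 4]))  # -> 4
```

With enough redundant ratings per sentence, the collective score converges even though individual raters disagree, which is the effect the study measures.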

The Effects of Sequence and Delay on Crowd Work Evaluating Crowdsourcing / Lasecki, Walter S. / Rzeszotarski, Jeffrey M. / Marcus, Adam / Bigham, Jeffrey P. Proceedings of the ACM CHI'15 Conference on Human Factors in Computing Systems 2015-04-18 v.1 p.1375-1378
ACM Digital Library Link
Summary: A common approach in crowdsourcing is to break large tasks into small microtasks so that they can be parallelized across many crowd workers and so that redundant work can be more easily compared for quality control. In practice, this can result in the microtasks being presented out of their natural order and often introduces delays between individual microtasks. In this paper, we demonstrate in a study of 338 crowd workers that non-sequential microtasks and the introduction of delays significantly decrease worker performance. We show that interruptions where a large delay occurs between two related tasks can cause up to a 102% slowdown in completion time, and interruptions where workers are asked to perform different tasks in sequence can slow down completion time by 57%. We conclude with a set of design guidelines to improve both worker performance and realized pay, and instructions for implementing these changes in existing interfaces for crowd work.

Apparition: Crowdsourced User Interfaces that Come to Life as You Sketch Them Understanding Crowdwork in Many Domains / Lasecki, Walter S. / Kim, Juho / Rafter, Nick / Sen, Onkur / Bigham, Jeffrey P. / Bernstein, Michael S. Proceedings of the ACM CHI'15 Conference on Human Factors in Computing Systems 2015-04-18 v.1 p.1925-1934
ACM Digital Library Link
Summary: Prototyping allows designers to quickly iterate and gather feedback, but the time it takes to create even a Wizard-of-Oz prototype reduces the utility of the process. In this paper, we introduce crowdsourcing techniques and tools for prototyping interactive systems in the time it takes to describe the idea. Our Apparition system uses paid microtask crowds to make even hard-to-automate functions work immediately, allowing more fluid prototyping of interfaces that contain interactive elements and complex behaviors. As users sketch their interface and describe it aloud in natural language, crowd workers and sketch recognition algorithms translate the input into user interface elements, add animations, and provide Wizard-of-Oz functionality. We discuss how design teams can use our approach to reflect on prototypes or begin user studies within seconds, and how, over time, Apparition prototypes can become fully-implemented versions of the systems they simulate. Powering Apparition is the first self-coordinated, real-time crowdsourcing infrastructure. We anchor this infrastructure on a new, lightweight write-locking mechanism that workers can use to signal their intentions to each other.

Zensors: Adaptive, Rapidly Deployable, Human-Intelligent Sensor Feeds Understanding Crowdwork in Many Domains / Laput, Gierad / Lasecki, Walter S. / Wiese, Jason / Xiao, Robert / Bigham, Jeffrey P. / Harrison, Chris Proceedings of the ACM CHI'15 Conference on Human Factors in Computing Systems 2015-04-18 v.1 p.1935-1944
ACM Digital Library Link
Summary: The promise of "smart" homes, workplaces, schools, and other environments has long been championed. Unattractive, however, has been the cost to run wires and install sensors. More critically, raw sensor data tends not to align with the types of questions humans wish to ask, e.g., do I need to restock my pantry? Although techniques like computer vision can answer some of these questions, they require significant effort to build and train appropriate classifiers. Even then, these systems are often brittle, with limited ability to handle new or unexpected situations, including being repositioned and environmental changes (e.g., lighting, furniture, seasons). We propose Zensors, a new sensing approach that fuses real-time human intelligence from online crowd workers with automatic approaches to provide robust, adaptive, and readily deployable intelligent sensors. With Zensors, users can go from question to live sensor feed in less than 60 seconds. Through our API, Zensors can enable a variety of rich end-user applications and moves us closer to the vision of responsive, intelligent environments.

Exploring Privacy and Accuracy Trade-Offs in Crowdsourced Behavioral Video Coding Understanding Crowdwork in Many Domains / Lasecki, Walter S. / Gordon, Mitchell / Leung, Winnie / Lim, Ellen / Bigham, Jeffrey P. / Dow, Steven P. Proceedings of the ACM CHI'15 Conference on Human Factors in Computing Systems 2015-04-18 v.1 p.1945-1954
ACM Digital Library Link
Summary: Coding behavioral video is an important method used by researchers to understand social phenomena. Unfortunately, traditional hand-coding approaches can take days or weeks of time to complete. Recent work has shown that these tasks can be completed quickly by leveraging the parallelism of large online crowds, but using the crowd introduces new concerns about accuracy, reliability, privacy, and cost. To explore these issues, we conducted interviews with 12 researchers who frequently code behavioral video, to investigate common practices and challenges with video coding. We find accuracy and privacy to be the researchers' primary concerns. To explore this more concretely, we used sample videos to investigate whether crowds can accurately recognize instances of commonly coded behaviors, and show that the crowd yields accurate results. Then, we demonstrate a method for obfuscating participant identity with a video blur filter, and find, as expected, that workers' ability to identify participants decreases as blur level increases. The workers' ability to accurately and reliably code behaviors also decreases, but not as steeply as the identity test. This trade-off between coding quality and privacy protection suggests that researchers can use online crowds to code for some key behaviors in video without compromising participant identity. We conclude with a discussion of how researchers can balance privacy and accuracy on their own data using a system we introduce called Incognito.
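The abstract does not specify Incognito's blur implementation; the knob being traded off is simply the blur radius. A minimal pure-Python sketch (hypothetical function name, grayscale frame as a list of rows) of the kind of box blur whose radius a researcher could sweep:

```python
def box_blur(img, radius):
    """Blur a grayscale frame by averaging each pixel's (2r+1)^2 window.

    Larger radii obscure identity more strongly, but also wash out the
    visual cues coders rely on -- the trade-off the study measures.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[ny][nx]
                      for ny in range(max(0, y - radius), min(h, y + radius + 1))
                      for nx in range(max(0, x - radius), min(w, x + radius + 1))]
            out[y][x] = sum(window) // len(window)  # integer mean of the window
    return out

sharp = [[0, 0, 255], [0, 255, 0], [255, 0, 0]]
print(box_blur(sharp, 1))
# -> [[63, 85, 127], [85, 85, 85], [127, 85, 63]]
```

Sweeping `radius` and re-running both the identification test and the coding task at each setting reproduces the shape of the trade-off curve described above.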

RegionSpeak: Quick Comprehensive Spatial Descriptions of Complex Images for Blind Users Accessibility at Home & on The Go / Zhong, Yu / Lasecki, Walter S. / Brady, Erin / Bigham, Jeffrey P. Proceedings of the ACM CHI'15 Conference on Human Factors in Computing Systems 2015-04-18 v.1 p.2353-2362
ACM Digital Library Link
Summary: Blind people often seek answers to their visual questions from remote sources; however, the commonly adopted single-image, single-response model does not always guarantee enough bandwidth between users and sources. This is especially true when questions concern large sets of information, or spatial layout, e.g., where is there to sit in this area, what tools are on this work bench, or what do the buttons on this machine do? Our RegionSpeak system addresses this problem by providing an accessible way for blind users to (i) combine visual information across multiple photographs via image stitching, (ii) quickly collect labels from the crowd for all relevant objects contained within the resulting large visual area in parallel, and (iii) then interactively explore the spatial layout of the objects that were labeled. The regions and descriptions are displayed on an accessible touchscreen interface, which allows blind users to interactively explore their spatial layout. We demonstrate that workers from Amazon Mechanical Turk are able to quickly and accurately identify relevant regions, and that asking them to describe only one region at a time results in more comprehensive descriptions of complex images. RegionSpeak can be used to explore the spatial layout of the regions identified. It also demonstrates broad potential for helping blind users to answer difficult spatial layout questions.

Towards Integrating Real-Time Crowd Advice with Reinforcement Learning Poster & Demo Session / de la Cruz, Gabriel V. / Peng, Bei / Lasecki, Walter S. / Taylor, Matthew E. Companion Proceedings of the 2015 International Conference on Intelligent User Interfaces 2015-03-29 v.2 p.17-20
ACM Digital Library Link
Summary: Reinforcement learning is a powerful machine learning paradigm that allows agents to autonomously learn to maximize a scalar reward. However, it often suffers from poor initial performance and long learning times. This paper discusses how collecting on-line human feedback, both in real time and post hoc, can potentially improve the performance of such learning systems. We use the game Pac-Man to simulate a navigation setting and show that workers are able to accurately identify both when a sub-optimal action is executed, and what action should have been performed instead. Demonstrating that the crowd is capable of generating this input, and discussing the types of errors that occur, serves as a critical first step in designing systems that use this real-time feedback to improve systems' learning performance on-the-fly.

Increasing the bandwidth of crowdsourced visual question answering to better support blind users Poster abstracts / Lasecki, Walter S. / Zhong, Yu / Bigham, Jeffrey P. Sixteenth International ACM SIGACCESS Conference on Computers and Accessibility 2014-10-20 p.263-264
ACM Digital Library Link
Summary: Many of the visual questions that blind people ask cannot be easily answered with a single image or a short response, especially when questions are of an exploratory nature, e.g. what is in this area, or what tools are available on this work bench? We introduce RegionSpeak to allow blind users to capture large areas of visual information, identify all of the objects within them, and explore their spatial layout with fewer interactions. RegionSpeak helps blind users capture all of the relevant visual information using an interface designed to support stitching multiple images together. We use a parallel crowdsourcing workflow that asks workers to define and describe regions of interest, allowing even complex images to be described quickly. The regions and descriptions are displayed on an auditory touchscreen interface, allowing users to know what is in a scene and how it is laid out.

Legion scribe: real-time captioning by non-experts Demonstration abstracts / Lasecki, Walter S. / Kushalnagar, Raja / Bigham, Jeffrey P. Sixteenth International ACM SIGACCESS Conference on Computers and Accessibility 2014-10-20 p.303-304
ACM Digital Library Link
Summary: The promise of affordable, automatic approaches to real-time captioning imagines a future in which deaf and hard of hearing (DHH) users have immediate access to speech in the world around them by simply picking up their phone or other mobile device. While the challenges of processing highly variable natural language have prevented automated approaches from completing this task reliably enough for use in settings such as classrooms or workplaces [4], recent work in crowd-powered approaches have allowed groups of non-expert captionists to provide a similarly-flexible source of captions for DHH users. This is in contrast to current human-powered approaches, which use highly-trained professional captionists who can type up to 250 words per minute (WPM), but also can cost over $100/hr. In this paper, we describe a real-time demo of Legion:Scribe (or just "Scribe"), a crowd-powered captioning system that allows untrained participants and volunteers to provide reliable captions with less than 5 seconds of latency by computationally merging their input into a single collective answer that is more accurate and more complete than any one worker could have generated alone.
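Scribe's actual merging step is more sophisticated than any short sketch can show, but the core idea of computationally stitching partial inputs into one stream can be illustrated. A toy sketch (hypothetical function name, assuming each worker's input arrives as timestamped words): sort all fragments by capture time and drop words that duplicate what another worker just typed.

```python
def merge_partial_captions(streams):
    """Merge per-worker (timestamp, word) fragments into one caption stream.

    A simplified stand-in for Scribe's alignment: sort every fragment by
    its capture time, then drop a word that repeats the previous word
    within a short window (two workers typing the same word they heard).
    """
    merged = sorted((t, w) for stream in streams for t, w in stream)
    result = []
    for t, w in merged:
        if result and result[-1][1] == w and t - result[-1][0] < 1.0:
            continue  # duplicate of what another worker just contributed
        result.append((t, w))
    return [w for _, w in result]

# Each worker only captures part of the utterance; together they cover it.
worker_a = [(0.0, "the"), (0.4, "quick"), (2.0, "fox")]
worker_b = [(0.5, "quick"), (1.1, "brown"), (2.1, "fox")]
print(merge_partial_captions([worker_a, worker_b]))
# -> ['the', 'quick', 'brown', 'fox']
```

The point of the design is visible even in the toy: no single worker's stream is complete, yet the merged stream is.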

Expert crowdsourcing with flash teams Working with crowds / Retelny, Daniela / Robaszkiewicz, Sébastien / To, Alexandra / Lasecki, Walter S. / Patel, Jay / Rahmati, Negar / Doshi, Tulsee / Valentine, Melissa / Bernstein, Michael S. Proceedings of the 2014 ACM Symposium on User Interface Software and Technology 2014-10-05 v.1 p.75-85
ACM Digital Library Link
Summary: We introduce flash teams, a framework for dynamically assembling and managing paid experts from the crowd. Flash teams advance a vision of expert crowd work that accomplishes complex, interdependent goals such as engineering and design. These teams consist of sequences of linked modular tasks and handoffs that can be computationally managed. Interactive systems reason about and manipulate these teams' structures: for example, flash teams can be recombined to form larger organizations and authored automatically in response to a user's request. Flash teams can also hire more people elastically in reaction to task needs, and pipeline intermediate output to accelerate completion times. To enable flash teams, we present Foundry, an end-user authoring platform and runtime manager. Foundry allows users to author modular tasks, then manages teams through handoffs of intermediate work. We demonstrate that Foundry and flash teams enable crowdsourcing of a broad class of goals including design prototyping, course development, and film animation, in half the work time of traditional self-managed teams.

Glance: rapidly coding behavioral video with the crowd Video / Lasecki, Walter S. / Gordon, Mitchell / Koutra, Danai / Jung, Malte F. / Dow, Steven P. / Bigham, Jeffrey P. Proceedings of the 2014 ACM Symposium on User Interface Software and Technology 2014-10-05 v.1 p.551-562
ACM Digital Library Link
Summary: Behavioral researchers spend a considerable amount of time coding video data to systematically extract meaning from subtle human actions and emotions. In this paper, we present Glance, a tool that allows researchers to rapidly query, sample, and analyze large video datasets for behavioral events that are hard to detect automatically. Glance takes advantage of the parallelism available in paid online crowds to interpret natural language queries and then aggregates responses in a summary view of the video data. Glance provides analysts with rapid responses when initially exploring a dataset, and reliable codings when refining an analysis. Our experiments show that Glance can code nearly 50 minutes of video in 5 minutes by recruiting over 60 workers simultaneously, and can get initial feedback to analysts in under 10 seconds for most clips. We present and compare new methods for accurately aggregating the input of multiple workers marking the spans of events in video data, and for measuring the quality of their coding in real-time before a baseline is established by measuring the variance between workers. Glance's rapid responses to natural language queries, feedback regarding question ambiguity and anomalies in the data, and ability to build on prior context in followup queries allow users to have a conversation-like interaction with their data -- opening up new possibilities for naturally exploring video data.

Powering interactive intelligent systems with the crowd Doctoral symposium / Lasecki, Walter S. Adjunct Proceedings of the 2014 ACM Symposium on User Interface Software and Technology 2014-10-05 v.2 p.21-24
ACM Digital Library Link
Summary: Creating intelligent systems that are able to recognize a user's behavior, understand unrestricted spoken natural language, complete complex tasks, and respond fluently could change the way computers are used in daily life. But fully-automated intelligent systems are a far-off goal -- currently, machines struggle in many real-world settings because problems can be almost entirely unconstrained and can vary greatly between instances. Human computation has been shown to be effective in many of these settings, but is traditionally applied in an offline, batch-processing fashion. My work focuses on a new model of continuous, real-time crowdsourcing that enables interactive crowd-powered systems.

Real-time captioning with the crowd Features / Lasecki, Walter S. / Bigham, Jeffrey P. interactions 2014-05 v.21 n.3 p.50-55
ACM Digital Library Link

Crowd storage: storing information on existing memories Crowdfunding and crowd storage / Bigham, Jeffrey P. / Lasecki, Walter S. Proceedings of ACM CHI 2014 Conference on Human Factors in Computing Systems 2014-04-26 v.1 p.601-604
ACM Digital Library Link
Summary: This paper introduces the concept of crowd storage, the idea that digital files can be stored and retrieved later from the memories of people in the crowd. Similar to human memory, crowd storage is ephemeral, which means that storage is temporary and the quality of the stored information degrades over time. Crowd storage may be preferred over storing information directly in the cloud, or when it is desirable for information to degrade inline with normal human memories. To explore and validate this idea, we created WeStore, a system that stores and then later retrieves digital files in the existing memories of crowd workers. WeStore does not store information directly, but rather encrypts the files using details of the existing memories elicited from individuals within the crowd as cryptographic keys. The fidelity of the retrieved information is tied to how well the crowd remembers the details of the memories they provided. We demonstrate that crowd storage is feasible using an existing crowd marketplace (Amazon Mechanical Turk), explore design considerations important for building systems that use crowd storage, and outline ideas for future research in this area.
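The abstract does not name WeStore's cryptographic construction; a toy sketch of the idea it describes (hypothetical function names, with a trivially insecure XOR cipher standing in for real encryption) is to normalize a worker's free-text memory detail, hash it into a key, and encrypt the file with that key. Retrieval succeeds only if the worker later recalls the same detail:

```python
import hashlib

def key_from_memory(answer: str) -> bytes:
    """Derive a symmetric key from a worker's free-text memory detail.

    Normalizing (strip, lowercase) before hashing makes retrieval tolerant
    of trivial differences in how the memory is restated.
    """
    return hashlib.sha256(answer.strip().lower().encode()).digest()

def xor_crypt(data: bytes, key: bytes) -> bytes:
    """Toy XOR cipher: the same operation both encrypts and decrypts."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

secret = b"meeting at noon"
ciphertext = xor_crypt(secret, key_from_memory("My first car was a red Civic"))
# Later, the same worker recalls the same detail, so the same key is derived.
restored = xor_crypt(ciphertext, key_from_memory("my first car was a red civic"))
print(restored)  # -> b'meeting at noon'
```

The sketch also makes the ephemerality concrete: if the worker's recollection drifts beyond what normalization absorbs, the derived key changes and the file cannot be decrypted, mirroring how stored information degrades with human memory.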

Finding dependencies between actions using the crowd Decisions, recommendations, and machine learning / Lasecki, Walter S. / Weingard, Leon / Ferguson, George / Bigham, Jeffrey P. Proceedings of ACM CHI 2014 Conference on Human Factors in Computing Systems 2014-04-26 v.1 p.3095-3098
ACM Digital Library Link
Summary: Activity recognition can provide computers with the context underlying user inputs, enabling more relevant responses and more fluid interaction. However, training these systems is difficult because it requires observing every possible sequence of actions that comprise a given activity. Prior work has enabled the crowd to provide labels in real-time to train automated systems on-the-fly, but numerous examples are still needed before the system can recognize an activity on its own. To reduce the need to collect this data by observing users, we introduce ARchitect, a system that uses the crowd to capture the dependency structure of the actions that make up activities. Our tests show that over seven times as many examples can be collected using our approach versus relying on direct observation alone, demonstrating that by leveraging the understanding of the crowd, it is possible to more easily train automated systems.
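The multiplying effect described above comes from the fact that one dependency structure licenses many valid action orderings. A minimal sketch (hypothetical names, brute force over the handful of steps in a household activity, not ARchitect's actual machinery) of expanding a crowd-captured dependency structure into every consistent sequence:

```python
from itertools import permutations

def valid_sequences(actions, deps):
    """Enumerate all action orderings consistent with captured dependencies.

    `deps` maps an action to the set of actions that must happen first.
    Each valid ordering can serve as a synthetic training example.
    """
    out = []
    for order in permutations(actions):
        pos = {a: i for i, a in enumerate(order)}
        if all(pos[b] < pos[a] for a, before in deps.items() for b in before):
            out.append(order)
    return out

# Making coffee: grinding and boiling are independent, both precede brewing.
acts = ["grind", "boil", "brew"]
deps = {"brew": {"grind", "boil"}}
print(valid_sequences(acts, deps))
# -> [('grind', 'boil', 'brew'), ('boil', 'grind', 'brew')]
```

From one observed demonstration plus the crowd's dependency judgments, the system obtains every ordering a recognizer should accept, rather than waiting to observe each ordering directly.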

Glance: enabling rapid interactions with data using the crowd Interactivity / Lasecki, Walter S. / Gordon, Mitchell / Dow, Steven P. / Bigham, Jeffrey P. Proceedings of ACM CHI 2014 Conference on Human Factors in Computing Systems 2014-04-26 v.2 p.511-514
ACM Digital Library Link
Summary: Behavioral coding is a common technique in the social sciences and human-computer interaction for extracting meaning from video data [3]. Since computer vision cannot yet reliably interpret human actions and emotions, video coding remains a time-consuming manual process done by a small team of researchers. We present Glance, a tool that allows researchers to rapidly analyze video datasets for behavioral events that are difficult to detect automatically. Glance uses the crowd to interpret natural language queries, and then aggregates and summarizes the content of the video. We show that Glance can accurately code events in video in a fraction of the time it would take a single person. We also investigate speed improvements made possible by recruiting large crowds, showing that Glance is able to code 80% of an hour-long video in just 5 minutes. Rapid coding allows participants to have a "conversation with their data" to rapidly develop and refine research hypotheses in ways not previously possible.

Selfsourcing personal tasks Works-in-progress / Teevan, Jaime / Liebling, Daniel J. / Lasecki, Walter S. Proceedings of ACM CHI 2014 Conference on Human Factors in Computing Systems 2014-04-26 v.2 p.2527-2532
ACM Digital Library Link
Summary: Large tasks can be overwhelming. For example, many people have thousands of digital photographs that languish in unorganized archives because it is difficult and time consuming to gather them into meaningful collections. Such tasks are hard to start because they seem to require long uninterrupted periods of effort to make meaningful progress. We propose the idea of selfsourcing as a way to help people to perform large personal information tasks by breaking them into manageable microtasks. Using ideas from crowdsourcing and task management, selfsourcing can help people take advantage of existing gaps in time and recover quickly from interruptions. We present several achievable selfsourcing scenarios and explore how they can facilitate information work in interruption-driven environments.

Helping students keep up with real-time captions by pausing and highlighting Education / Lasecki, Walter S. / Kushalnagar, Raja / Bigham, Jeffrey P. Proceedings of the 2014 International Cross-Disciplinary Conference on Web Accessibility (W4A) 2014-04-07 p.39
ACM Digital Library Link
Summary: We explore methods for improving the readability of real-time captions by allowing users to more easily switch their gaze between multiple visual information sources. Real-time captioning provides deaf and hard of hearing (DHH) users with access to spoken content during live events, and the web has allowed these services to be provided via remotely-located captioning services, and for web content itself. However, despite caption benefits, spoken language reading rates often result in DHH users falling behind spoken content, especially when the audio is paired with visual references. This is particularly true in classroom settings, where multi-modal content is the norm, and captions are often poorly positioned in the room, relative to speakers. Additionally, this accommodation can benefit other students who face temporary or "situational" disabilities such as listening to unfamiliar speech accents, or if a student is in a location with poor acoustics.
    In this paper, we explore pausing and highlighting as a means of helping DHH students keep up with live classroom content by helping them track their place when reading text involving visual references. Our experiments show that by providing users with a tool to more easily track their place in a transcript while viewing live video, it is possible for them to follow visual content that might otherwise have been missed. Both pausing and highlighting have a positive impact on students' scores on comprehension tests, but highlighting is preferred to pausing, and yields nearly twice as large of an improvement. We then discuss several issues with captioning that we observed during our design process and user study, and then suggest future work that builds on these insights.

Information extraction and manipulation threats in crowd-powered systems Performing crowd work / Lasecki, Walter S. / Teevan, Jaime / Kamar, Ece Proceedings of ACM CSCW 2014 Conference on Computer-Supported Cooperative Work and Social Computing 2014-02-15 v.1 p.248-256
ACM Digital Library Link
Summary: Crowd-powered systems have become a popular way to augment the capabilities of automated systems in real-world settings. Many of these systems rely on human workers to process potentially sensitive data or make important decisions. This puts these systems at risk of unintentionally releasing sensitive data or having their outcomes maliciously manipulated. While almost all crowd-powered approaches account for errors made by individual workers, few factor in active attacks on the system. In this paper, we analyze different forms of threats from individuals and groups of workers extracting information from crowd-powered systems or manipulating these systems' outcomes. Via a set of studies performed on Amazon's Mechanical Turk platform and involving 1,140 unique workers, we demonstrate the viability of these threats. We show that the current system is vulnerable to coordinated attacks on a task based on the requests of another task and that a significant portion of Mechanical Turk workers are willing to contribute to an attack. We propose several possible approaches to mitigating these threats, including leveraging workers who are willing to go above and beyond to help, automatically flagging sensitive content, and using workflows that conceal information from each individual, while still allowing the group to complete a task. Our findings enable the crowd to continue to play an important part in automated systems, even as the data they use and the decisions they support become increasingly important.

Accessibility Evaluation of Classroom Captions / Kushalnagar, Raja S. / Lasecki, Walter S. / Bigham, Jeffrey P. ACM Transactions on Accessible Computing 2014-01 v.5 n.3 p.7
ACM Digital Library Link
Summary: Real-time captioning enables deaf and hard of hearing (DHH) people to follow classroom lectures and other aural speech by converting it into visual text with less than a five second delay. Keeping the delay short allows end-users to follow and participate in conversations. This article focuses on the fundamental problem that makes real-time captioning difficult: sequential keyboard typing is much slower than speaking. We first surveyed the audio characteristics of 240 one-hour-long captioned lectures on YouTube, such as speed and duration of speaking bursts. We then analyzed how these characteristics impact caption generation and readability, considering specifically our human-powered collaborative captioning approach. We note that most of these characteristics are also present in more general domains. For our caption comparison evaluation, we transcribed a classroom lecture in real-time using all three captioning approaches. We recruited 48 participants (24 DHH) to watch these classroom transcripts in an eye-tracking laboratory. We presented these captions in a randomized, balanced order. We show that both hearing and DHH participants preferred and followed collaborative captions better than those generated by automatic speech recognition (ASR) or professionals due to the more consistent flow of the resulting captions. These results show the potential to reliably capture speech even during sudden bursts of speed, as well as for generating "enhanced" captions, unlike other human-powered captioning approaches.

Answering visual questions with conversational crowd assistants Papers / Lasecki, Walter S. / Thiha, Phyo / Zhong, Yu / Brady, Erin / Bigham, Jeffrey P. Fifteenth Annual ACM SIGACCESS Conference on Assistive Technologies 2013-10-21 p.18
ACM Digital Library Link
Summary: Blind people face a range of accessibility challenges in their everyday lives, from reading the text on a package of food to traveling independently in a new place. Answering general questions about one's visual surroundings remains well beyond the capabilities of fully automated systems, but recent systems are showing the potential of engaging on-demand human workers (the crowd) to answer visual questions. The input to such systems has generally been either a single image, which limits the interaction with a worker to one question, or a video stream in which the system pairs the end user with a single worker, limiting the benefits of the crowd. In this paper, we introduce Chorus:View, a system that assists users over the course of longer interactions by engaging workers in a continuous conversation with the user about a video stream from the user's mobile device. We demonstrate the benefit of using multiple crowd workers instead of just one in terms of both latency and accuracy, then conduct a study with 10 blind users that shows Chorus:View answers common visual questions more quickly and accurately than existing approaches. We conclude with a discussion of users' feedback and potential future work on interactive crowd support of blind users.

Real-time captioning by non-experts with legion scribe Posters and demos / Lasecki, Walter S. / Miller, Christopher D. / Kushalnagar, Raja / Bigham, Jeffrey P. Fifteenth Annual ACM SIGACCESS Conference on Assistive Technologies 2013-10-21 p.55
ACM Digital Library Link
Summary: Real-time captioning provides people who are deaf or hard of hearing access to speech in settings such as classrooms and live events. The most reliable approach to providing these captions is to recruit an expert stenographer who is able to type at natural speaking rates, but stenographers charge more than $100 USD per hour and must be scheduled in advance. We introduce Legion Scribe (Scribe), a system that allows 3-5 ordinary people who can hear and type to jointly caption speech in real time. Each person is unable to type at natural speaking rates, and so is asked only to type part of what they hear. Scribe automatically stitches all of the partial captions together to form a complete caption stream. We have shown that the accuracy of Scribe captions approaches that of a professional stenographer, while its latency and cost are dramatically lower.
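The stitching step this summary describes can be illustrated with a minimal sketch (a simplification, not Legion Scribe's actual algorithm; all names and data below are hypothetical): each worker submits timestamped partial captions, and the system merges them in time order, dropping duplicated words where workers overlap.

```python
def stitch(partials):
    """Merge timestamped (time, word) pairs typed by several workers
    into one caption stream, dropping immediate duplicates at overlaps.
    (Real alignment must also handle typos and reordered words.)"""
    merged = sorted((t, w) for worker in partials for t, w in worker)
    stream, last = [], None
    for _, w in merged:
        if last is None or w.lower() != last.lower():
            stream.append(w)
        last = w
    return " ".join(stream)

# Two workers each caught only part of the speech:
workers = [
    [(0.0, "real-time"), (0.8, "captioning"), (1.6, "provides")],
    [(0.9, "captioning"), (1.7, "provides"), (2.4, "access")],
]
print(stitch(workers))  # real-time captioning provides access
```

A timestamp-only merge like this shows why overlap between workers matters: the duplicated words are what let the partial streams be joined into one.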

Chorus: a crowd-powered conversational assistant Crowd & creativity / Lasecki, Walter S. / Wesley, Rachel / Nichols, Jeffrey / Kulkarni, Anand / Allen, James F. / Bigham, Jeffrey P. Proceedings of the 2013 ACM Symposium on User Interface Software and Technology 2013-10-08 v.1 p.151-162
ACM Digital Library Link
Summary: Despite decades of research attempting to establish conversational interaction between humans and computers, the capabilities of automated conversational systems are still limited. In this paper, we introduce Chorus, a crowd-powered conversational assistant. When using Chorus, end users converse continuously with what appears to be a single conversational partner. Behind the scenes, Chorus leverages multiple crowd workers to propose and vote on responses. A shared memory space helps the dynamic crowd workforce maintain consistency, and a game-theoretic incentive mechanism helps to balance their efforts between proposing and voting. Studies with 12 end users and 100 crowd workers demonstrate that Chorus can provide accurate, topical responses, answering nearly 93% of user queries appropriately and staying on topic in over 95% of responses. We also observed that, in terms of speed, quality, and breadth of assistance, Chorus has advantages over both pairing an end user with a single crowd worker and having end users complete tasks themselves. Chorus demonstrates a new future in which conversational assistants are made usable in the real world by combining human and machine intelligence, and may enable a useful new way of interacting with the crowds powering other systems.
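The propose-and-vote mechanism this summary describes can be sketched as follows (a simplification of the idea, not Chorus's actual implementation; the function name and quorum value are hypothetical): workers propose candidate responses and vote among them, and a response is forwarded to the end user once its vote share reaches a quorum.

```python
from collections import Counter

def select_response(votes, quorum=0.4):
    """Given each worker's vote for a candidate response, return the
    top-voted response if its vote share meets the quorum, else None."""
    if not votes:
        return None
    best, count = Counter(votes.values()).most_common(1)[0]
    return best if count / len(votes) >= quorum else None

votes = {"w1": "It opens at 9am", "w2": "It opens at 9am", "w3": "Not sure"}
print(select_response(votes))  # It opens at 9am
```

Requiring agreement from several workers before a response is shown is what lets the crowd appear to the end user as a single, consistent conversational partner.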