[1]
Parting Crowds: Characterizing Divergent Interpretations in Crowdsourced
Annotation Tasks
Crowd Workflows
/
Kairam, Sanjay
/
Heer, Jeffrey
Proceedings of ACM CSCW 2016 Conference on Computer-Supported Cooperative
Work and Social Computing
2016-02-27
v.1
p.1637-1648
© Copyright 2016 ACM
Summary: Crowdsourcing is a common strategy for collecting the "gold standard" labels
required for many natural language applications. Crowdworkers differ in their
responses for many reasons, but existing approaches often treat disagreements
as "noise" to be removed through filtering or aggregation. In this paper, we
introduce the workflow design pattern of crowd parting: separating workers
based on shared patterns in responses to a crowdsourcing task. We illustrate
this idea using an automated clustering-based method to identify divergent, but
valid, worker interpretations in crowdsourced entity annotations collected over
two distinct corpora -- Wikipedia articles and Tweets. We demonstrate how the
intermediate-level view provided by crowd-parting analysis provides insight into
sources of disagreement not easily gleaned from viewing either individual
annotation sets or aggregated results. We discuss several concrete ways this
approach could be applied to improve the quality and efficiency of crowdsourced
annotation tasks.
[2]
Forum77: An Analysis of an Online Health Forum Dedicated to Addiction
Recovery
Managing Chronic Illness through Collaboration
/
MacLean, Diana
/
Gupta, Sonal
/
Lembke, Anna
/
Manning, Christopher
/
Heer, Jeffrey
Proceedings of ACM CSCW 2015 Conference on Computer-Supported Cooperative
Work and Social Computing
2015-02-28
v.1
p.1511-1526
© Copyright 2015 ACM
Summary: Prescription drug abuse is a pressing public health issue, and people who
misuse prescription drugs are turning to online forums for help. Are such
forums effective? We analyze the process of opioid withdrawal, recovery and
relapse on Forum77, MedHelp.org's online health forum for substance abuse
recovery. Applying Prochaska's Transtheoretical Model for behavior change, we
develop a taxonomy describing phases of addiction expressed by Forum77 members.
We examine activity and linguistic features across the phases USING,
WITHDRAWING and RECOVERING. We train statistical classifiers to identify
addiction phase, relapse and whether a user was RECOVERING at the time of her
last post. Applying our classifiers to 2,848 users, we find that while almost
50% relapse, the prognosis for ending in RECOVERING is favorable. Supplementing
our results with users' own accounts of their experiences, we discuss Forum77's
efficacy and shortcomings, and implications for future technologies.
[3]
Predictive translation memory: a mixed-initiative system for human language
translation
Modeling and prediction
/
Green, Spence
/
Chuang, Jason
/
Heer, Jeffrey
/
Manning, Christopher D.
Proceedings of the 2014 ACM Symposium on User Interface Software and
Technology
2014-10-05
v.1
p.177-187
© Copyright 2014 ACM
Summary: The standard approach to computer-aided language translation is
post-editing: a machine generates a single translation that a human translator
corrects. Recent studies have shown this simple technique to be surprisingly
effective, yet it underutilizes the complementary strengths of
precision-oriented humans and recall-oriented machines. We present Predictive
Translation Memory, an interactive, mixed-initiative system for human language
translation. Translators build translations incrementally by considering
machine suggestions that update according to the user's current partial
translation. In a large-scale study, we find that professional translators are
slightly slower in the interactive mode yet produce slightly higher quality
translations despite significant prior experience with the baseline
post-editing condition. Our analysis identifies significant predictors of time
and quality, and also characterizes interactive aid usage. Subjects entered
over 99% of characters via interactive aids, a significantly higher fraction
than that shown in previous work.
[4]
Declarative interaction design for data visualization
Developer tools II
/
Satyanarayan, Arvind
/
Wongsuphasawat, Kanit
/
Heer, Jeffrey
Proceedings of the 2014 ACM Symposium on User Interface Software and
Technology
2014-10-05
v.1
p.669-678
© Copyright 2014 ACM
Summary: Declarative visualization grammars can accelerate development, facilitate
retargeting across platforms, and allow language-level optimizations. However,
existing declarative visualization languages are primarily concerned with
visual encoding, and rely on imperative event handlers for interactive
behaviors. In response, we introduce a model of declarative interaction design
for data visualizations. Adopting methods from reactive programming, we model
low-level events as composable data streams from which we form higher-level
semantic signals. Signals feed predicates and scale inversions, which allow us
to generalize interactive selections at the level of item geometry (pixels)
into interactive queries over the data domain. Production rules then use these
queries to manipulate the visualization's appearance. To facilitate reuse and
sharing, these constructs can be encapsulated as named interactors: standalone,
purely declarative specifications of interaction techniques. We assess our
model's feasibility and expressivity by instantiating it with extensions to the
Vega visualization grammar. Through a diverse range of examples, we demonstrate
coverage over an established taxonomy of visualization interaction techniques.
[5]
BodyDiagrams: improving communication of pain symptoms through drawing
Quantified self
/
Jang, Amy
/
MacLean, Diana L.
/
Heer, Jeffrey
Proceedings of ACM CHI 2014 Conference on Human Factors in Computing Systems
2014-04-26
v.1
p.1153-1162
© Copyright 2014 ACM
Summary: Thousands of people use the Internet to discuss pain symptoms. While
communication between patients and physicians involves both verbal and physical
interactions, online discussions of symptoms typically comprise text only. We
present BodyDiagrams, an online interface for expressing symptoms via drawings
and text. BodyDiagrams augment textual descriptions with pain diagrams drawn
over a reference body and annotated with severity and temporal metadata. The
resulting diagrams can easily be shared to solicit feedback and advice. We also
conduct a two-phase user study to assess BodyDiagrams' communicative efficacy.
In the first phase, users describe pain symptoms using BodyDiagrams and a
text-only interface; in the second phase, medical professionals evaluate these
descriptions. We find that patients are significantly more confident that their
BodyDiagrams will be correctly interpreted, while medical professionals rated
BodyDiagrams as significantly more informative than text descriptions. Both
groups indicated a preference for using diagrams to communicate physical
symptoms in the future.
[6]
Designing a prototype interface for visual communication of pain
Health
/
Jang, Amy
/
MacLean, Diana
/
Heer, Jeffrey
Extended Abstracts of ACM CHI'13 Conference on Human Factors in Computing
Systems
2013-04-27
v.2
p.427-432
© Copyright 2013 ACM
Summary: Thousands of people use Online Health Communities (OHCs) as a forum for
expressing and collaborating on symptoms of pain. Despite the physical nature
of pain, these exchanges typically comprise text. While pain referral diagrams
have served as patient-physician communication aids for decades, little
research has focused on translating them into an interactive digital interface.
We propose that such an interface would provide a more efficient and accurate
mechanism for expressing pain and would facilitate useful discussion around
pain symptoms. In this work-in-progress, we present a pilot study in which
users expressed physical symptoms using pen and paper. Our results uncovered
several design considerations that are currently being used to inform the
design of Body Diagrams, an interactive pain visualization tool that we plan to
deploy to a pain-related OHC in the near future.
[7]
Many people, many eyes: aggregating influences of visual perception on user
interface design
Workshop summaries
/
Reinecke, Katharina
/
Flatla, David R.
/
Solovey, Erin
/
Gutwin, Carl
/
Gajos, Krzysztof Z.
/
Heer, Jeffrey
Extended Abstracts of ACM CHI'13 Conference on Human Factors in Computing
Systems
2013-04-27
v.2
p.3299-3302
© Copyright 2013 ACM
Summary: Many factors influence a user's visual perception of an interface (e.g.,
culture, gender, visual impairment). In general, interface researchers and
designers have considered these factors in isolation, without considering the
combined effect of every factor influencing the visual perception of the user.
As a result, interfaces have been optimized for single factors (e.g., improving
accessibility for individuals with low vision), at the expense of optimizing
for the individual's visual perception experience (e.g., considering cultural
preferences and lighting conditions while assisting users with low vision). In
this workshop, we will begin the process of combining the broad range of visual
perception knowledge to create a holistic approach to understanding users'
visual perception. The resulting knowledge pool will be used for generating
interfaces better suited to the full range of users' visual perception
abilities.
[8]
The efficacy of human post-editing for language translation
Papers: language and translation
/
Green, Spence
/
Heer, Jeffrey
/
Manning, Christopher D.
Proceedings of ACM CHI 2013 Conference on Human Factors in Computing Systems
2013-04-27
v.1
p.439-448
© Copyright 2013 ACM
Summary: Language translation is slow and expensive, so various forms of machine
assistance have been devised. Automatic machine translation systems process
text quickly and cheaply, but with quality far below that of skilled human
translators. To bridge this quality gap, the translation industry has
investigated post-editing, or the manual correction of machine output. We
present the first rigorous, controlled analysis of post-editing and find that
post-editing leads to reduced time and, surprisingly, improved quality for
three diverse language pairs (English to Arabic, French, and German). Our
statistical models and visualizations of experimental data indicate that some
simple predictors (like source text part of speech counts) predict translation
time, and that post-editing results in very different interaction patterns.
From these results we distill implications for the design of new language
translation interfaces.
[9]
"Without the clutter of unimportant words": Descriptive keyphrases for text
visualization
/
Chuang, Jason
/
Manning, Christopher D.
/
Heer, Jeffrey
ACM Transactions on Computer-Human Interaction
2012-10
v.19
n.3
p.19
© Copyright 2012 ACM
Summary: Keyphrases aid the exploration of text collections by communicating salient
aspects of documents and are often used to create effective visualizations of
text. While prior work in HCI and visualization has proposed a variety of ways
of presenting keyphrases, less attention has been paid to selecting the best
descriptive terms. In this article, we investigate the statistical and
linguistic properties of keyphrases chosen by human judges and determine which
features are most predictive of high-quality descriptive phrases. Based on
5,611 responses from 69 graduate students describing a corpus of dissertation
abstracts, we analyze characteristics of human-generated keyphrases, including
phrase length, commonness, position, and part of speech. Next, we
systematically assess the contribution of each feature within statistical
models of keyphrase quality. We then introduce a method for grouping similar
terms and varying the specificity of displayed phrases so that applications can
select phrases dynamically based on the available screen space and current
context of interaction. Precision-recall measures find that our technique
generates keyphrases that match those selected by human judges. Crowdsourced
ratings of tag cloud visualizations rank our approach above other automatic
techniques. Finally, we discuss the role of HCI methods in developing new
algorithmic techniques suitable for user-facing applications.
[10]
Termite: visualization techniques for assessing textual topic models
Interface design
/
Chuang, Jason
/
Manning, Christopher D.
/
Heer, Jeffrey
Proceedings of the 2012 International Conference on Advanced Visual
Interfaces
2012-05-22
p.74-77
© Copyright 2012 ACM
Summary: Topic models aid analysis of text corpora by identifying latent topics based
on co-occurring words. Real-world deployments of topic models, however, often
require intensive expert verification and model refinement. In this paper we
present Termite, a visual analysis tool for assessing topic model quality.
Termite uses a tabular layout to promote comparison of terms both within and
across latent topics. We contribute a novel saliency measure for selecting
relevant terms and a seriation algorithm that both reveals clustering structure
and promotes the legibility of related terms. In a series of examples, we
demonstrate how Termite allows analysts to identify coherent and significant
themes.
[11]
GraphPrism: compact visualization of network structure
Graph visualization
/
Kairam, Sanjay
/
MacLean, Diana
/
Savva, Manolis
/
Heer, Jeffrey
Proceedings of the 2012 International Conference on Advanced Visual
Interfaces
2012-05-22
p.498-505
© Copyright 2012 ACM
Summary: Visual methods for supporting the characterization, comparison, and
classification of large networks remain an open challenge. Ideally, such
techniques should surface useful structural features -- such as effective
diameter, small-world properties, and structural holes -- not always apparent
from either summary statistics or typical network visualizations. In this
paper, we present GraphPrism, a technique for visually summarizing arbitrarily
large graphs through combinations of 'facets', each corresponding to a single
node- or edge-specific metric (e.g., transitivity). We describe a generalized
approach for constructing facets by calculating distributions of graph metrics
over increasingly large local neighborhoods and representing these as a stacked
multi-scale histogram. Evaluation with paper prototypes shows that, with
minimal training, static GraphPrism diagrams can aid network analysis experts
in performing basic analysis tasks with network data. Finally, we contribute
the design of an interactive system using linked selection between GraphPrism
overviews and node-link detail views. Using a case study of data from a
co-authorship network, we illustrate how GraphPrism facilitates interactive
exploration of network data.
[12]
Profiler: integrated statistical analysis and visualization for data quality
assessment
Visual analytics
/
Kandel, Sean
/
Parikh, Ravi
/
Paepcke, Andreas
/
Hellerstein, Joseph M.
/
Heer, Jeffrey
Proceedings of the 2012 International Conference on Advanced Visual
Interfaces
2012-05-22
p.547-554
© Copyright 2012 ACM
Summary: Data quality issues such as missing, erroneous, extreme and duplicate values
undermine analysis and are time-consuming to find and fix. Automated methods
can help identify anomalies, but determining what constitutes an error is
context-dependent and so requires human judgment. While visualization tools can
facilitate this process, analysts must often manually construct the necessary
views, requiring significant expertise. We present Profiler, a visual analysis
tool for assessing quality issues in tabular data. Profiler applies data mining
methods to automatically flag problematic data and suggests coordinated summary
visualizations for assessing the data in context. The system contributes novel
methods for integrated statistical and visual analysis, automatic view
suggestion, and scalable visual summaries that support real-time interaction
with millions of data points. We present Profiler's architecture -- including
modular components for custom data types, anomaly detection routines and
summary visualizations -- and describe its application to motion picture,
natural disaster and water quality data sets.
[13]
Strategies for crowdsourcing social data analysis
Leveraging the crowd
/
Willett, Wesley
/
Heer, Jeffrey
/
Agrawala, Maneesh
Proceedings of ACM CHI 2012 Conference on Human Factors in Computing Systems
2012-05-05
v.1
p.227-236
© Copyright 2012 ACM
Summary: Web-based social data analysis tools that rely on public discussion to
produce hypotheses or explanations of the patterns and trends in data rarely
yield high-quality results in practice. Crowdsourcing offers an alternative
approach in which an analyst pays workers to generate such explanations. Yet,
asking workers with varying skills, backgrounds and motivations to simply
"Explain why a chart is interesting" can result in irrelevant, unclear or
speculative explanations of variable quality. To address these problems, we
contribute seven strategies for improving the quality and diversity of
worker-generated explanations. Our experiments show that using (S1)
feature-oriented prompts, providing (S2) good examples, and including (S3)
reference gathering, (S4) chart reading, and (S5) annotation subtasks increases
the quality of responses by 28% for US workers and 196% for non-US workers.
Feature-oriented prompts improve explanation quality by 69% to 236% depending
on the prompt. We also show that (S6) pre-annotating charts can focus workers'
attention on relevant details, and demonstrate that (S7) generating
explanations iteratively increases explanation diversity without increasing
worker attrition. We used our techniques to generate 910 explanations for 16
datasets, and found that 63% were of high quality. These results demonstrate
that paid crowd workers can reliably generate diverse, high-quality
explanations that support the analysis of specific datasets.
[14]
Interpretation and trust: designing model-driven visualizations for text
analysis
Text visualization
/
Chuang, Jason
/
Ramage, Daniel
/
Manning, Christopher
/
Heer, Jeffrey
Proceedings of ACM CHI 2012 Conference on Human Factors in Computing Systems
2012-05-05
v.1
p.443-452
© Copyright 2012 ACM
Summary: Statistical topic models can help analysts discover patterns in large text
corpora by identifying recurring sets of words and enabling exploration by
topical concepts. However, understanding and validating the output of these
models can itself be a challenging analysis task. In this paper, we offer two
design considerations -- interpretation and trust -- for designing
visualizations based on data-driven models. Interpretation refers to the
facility with which an analyst makes inferences about the data through the lens
of a model abstraction. Trust refers to the actual and perceived accuracy of an
analyst's inferences. These considerations derive from our experiences
developing the Stanford Dissertation Browser, a tool for exploring over 9,000
Ph.D. theses by topical similarity, and a subsequent review of existing
literature. We contribute a novel similarity measure for text collections based
on a notion of "word-borrowing" that arose from an iterative design process.
Based on our experiences and a literature review, we distill a set of design
recommendations and describe how they promote interpretable and trustworthy
visual analysis tools.
[15]
Color naming models for color selection, image editing and palette design
Visionary models + tools
/
Heer, Jeffrey
/
Stone, Maureen
Proceedings of ACM CHI 2012 Conference on Human Factors in Computing Systems
2012-05-05
v.1
p.1007-1016
© Copyright 2012 ACM
Summary: Our ability to reliably name colors provides a link between visual
perception and symbolic cognition. In this paper, we investigate how a
statistical model of color naming can enable user interfaces to meaningfully
mimic this link and support novel interactions. We present a method for
constructing a probabilistic model of color naming from a large, unconstrained
set of human color name judgments. We describe how the model can be used to map
between colors and names and define metrics for color saliency (how reliably a
color is named) and color name distance (the similarity between colors based on
naming patterns). We then present a series of applications that demonstrate how
color naming models can enhance graphical interfaces: a color dictionary &
thesaurus, name-based pixel selection methods for image editing, and evaluation
aids for color palette design.
[16]
Balancing exertion experiences
Movement-based gameplay
/
Mueller, Florian
/
Vetere, Frank
/
Gibbs, Martin
/
Edge, Darren
/
Agamanolis, Stefan
/
Sheridan, Jennifer
/
Heer, Jeffrey
Proceedings of ACM CHI 2012 Conference on Human Factors in Computing Systems
2012-05-05
v.1
p.1853-1862
© Copyright 2012 ACM
Summary: Exercising with others, such as jogging in pairs, can be socially engaging.
However, if exercise partners have different fitness levels then the activity
can be too strenuous for one and not challenging enough for the other,
compromising engagement and health benefits. Our system, Jogging over a
Distance, uses heart rate data and spatialized sound to create an equitable,
balanced experience between joggers of different fitness levels who are
geographically distributed. We extend this prior work by analyzing the
experience of 32 joggers to detail how specific design features facilitated,
and hindered, an engaging and balanced exertion experience. With this
knowledge, we derive four dimensions that describe a design space for balancing
exertion experiences: Measurement, Adjustment, Presentation and Control. We
also present six design tactics for creating balanced exertion experiences
described by these dimensions. By aiding designers in supporting participants
of different physical abilities, we hope to increase participation and
engagement with physical activity and facilitate the many benefits it brings
about.
[17]
Proactive wrangling: mixed-initiative end-user programming of data
transformation scripts
Social information
/
Guo, Philip J.
/
Kandel, Sean
/
Hellerstein, Joseph M.
/
Heer, Jeffrey
Proceedings of the 2011 ACM Symposium on User Interface Software and
Technology
2011-10-16
v.1
p.65-74
© Copyright 2011 ACM
Summary: Analysts regularly wrangle data into a form suitable for computational tools
through a tedious process that delays more substantive analysis. While
interactive tools can assist data transformation, analysts must still
conceptualize the desired output state, formulate a transformation strategy,
and specify complex transforms. We present a model to proactively suggest data
transforms which map input data to a relational format expected by analysis
tools. To guide search through the space of transforms, we propose a metric
that scores tables according to type homogeneity, sparsity and the presence of
delimiters. When compared to "ideal" hand-crafted transformations, our model
suggests over half of the needed steps; in these cases the top-ranked
suggestion is preferred 77% of the time. User study results indicate that
suggestions produced by our model can assist analysts' transformation tasks,
but that users do not always value proactive assistance, instead preferring to
maintain the initiative. We discuss some implications of these results for
mixed-initiative interfaces.
[18]
MUSE: reviving memories using email archives
Social information
/
Hangal, Sudheendra
/
Lam, Monica S.
/
Heer, Jeffrey
Proceedings of the 2011 ACM Symposium on User Interface Software and
Technology
2011-10-16
v.1
p.75-84
© Copyright 2011 ACM
Summary: Email archives silently record our actions and thoughts over the years,
forming a passively acquired and detailed life-log that contains rich material
for reminiscing on our lives. However, exploratory browsing of archives
containing thousands of messages is tedious without effective ways to guide the
user towards interesting events and messages. We present Muse (Memories USing
Email), a system that combines data mining techniques and an interactive
interface to help users browse a long-term email archive. Muse analyzes the
contents of the archive and generates a set of cues that help to spark users'
memories: communication activity with inferred social groups, a summary of
recurring named entities, occurrence of sentimental words, and image
attachments. These cues serve as salient entry points into a browsing interface
that enables faceted navigation and rapid skimming of email messages. In our
user studies, we found that users generally enjoyed browsing their archives
with Muse, and extracted a range of benefits, from summarizing work progress to
renewing friendships and making serendipitous discoveries.
[19]
ReVision: automated classification, analysis and redesign of chart images
Sensing form and rhythm
/
Savva, Manolis
/
Kong, Nicholas
/
Chhajta, Arti
/
Fei-Fei, Li
/
Agrawala, Maneesh
/
Heer, Jeffrey
Proceedings of the 2011 ACM Symposium on User Interface Software and
Technology
2011-10-16
v.1
p.393-402
© Copyright 2011 ACM
Summary: Poorly designed charts are prevalent in reports, magazines, books and on the
Web. Most of these charts are only available as bitmap images; without access
to the underlying data it is prohibitively difficult for viewers to create more
effective visual representations. In response we present ReVision, a system
that automatically redesigns visualizations to improve graphical perception.
Given a bitmap image of a chart as input, ReVision applies computer vision and
machine learning techniques to identify the chart type (e.g., pie chart, bar
chart, scatterplot, etc.). It then extracts the graphical marks and infers the
underlying data. Using a corpus of images drawn from the web, ReVision achieves
image classification accuracy of 96% across ten chart categories. It also
accurately extracts marks from 79% of bar charts and 62% of pie charts, and
from these charts it successfully extracts data from 71% of bar charts and 64%
of pie charts. ReVision then applies perceptually-based design principles to
populate an interactive gallery of redesigned charts. With this interface,
users can view alternative chart designs and retarget content to different
visual styles.
[20]
Peripheral paced respiration: influencing user physiology during information
work
Sensing form and rhythm
/
Moraveji, Neema
/
Olson, Ben
/
Nguyen, Truc
/
Saadat, Mahmoud
/
Khalighi, Yaser
/
Pea, Roy
/
Heer, Jeffrey
Proceedings of the 2011 ACM Symposium on User Interface Software and
Technology
2011-10-16
v.1
p.423-428
© Copyright 2011 ACM
Summary: We present the design and evaluation of a technique for influencing user
respiration by integrating respiration-pacing methods into the desktop
operating system in a peripheral manner. Peripheral paced respiration differs
from prior techniques in that it does not require the user's full attention. We
conducted a within-subjects study to evaluate the efficacy of peripheral paced
respiration, as compared to no feedback, in an ecologically valid environment.
Participant respiration decreased significantly in the pacing condition. Upon
further analysis, we attribute this difference to a significant decrease in
breath rate while the intermittent pacing feedback is active, rather than a
persistent change in respiratory pattern. The results have implications for
researchers in physiological computing, biofeedback designers, and
human-computer interaction researchers concerned with user stress and affect.
[21]
CommentSpace: structured support for collaborative visual analysis
Organizations & enterprise
/
Willett, Wesley
/
Heer, Jeffrey
/
Hellerstein, Joseph
/
Agrawala, Maneesh
Proceedings of ACM CHI 2011 Conference on Human Factors in Computing Systems
2011-05-07
v.1
p.3131-3140
© Copyright 2011 ACM
Summary: Collaborative visual analysis tools can enhance sensemaking by facilitating
social interpretation and parallelization of effort. These systems enable
distributed exploration and evidence gathering, allowing many users to pool
their effort as they discuss and analyze the data. We explore how adding
lightweight tag and link structure to comments can aid this analysis process.
We present CommentSpace, a collaborative system in which analysts comment on
visualizations and websites and then use tags and links to organize findings
and identify others' contributions. In a pair of studies comparing
CommentSpace to a system without support for tags and links, we find that a
small, fixed vocabulary of tags (question, hypothesis, to-do) and links
(evidence-for, evidence-against) helps analysts more consistently and
accurately classify evidence and establish common ground. We also find that
managing and incentivizing participation is important for analysts to progress
from exploratory analysis to deeper analytical tasks. Finally, we demonstrate
that tags and links can help teams complete evidence gathering and synthesis
tasks and that organizing comments using tags and links improves analytic
results.
[22]
Wrangler: interactive visual specification of data transformation scripts
Developers & end-user programmers
/
Kandel, Sean
/
Paepcke, Andreas
/
Hellerstein, Joseph
/
Heer, Jeffrey
Proceedings of ACM CHI 2011 Conference on Human Factors in Computing Systems
2011-05-07
v.1
p.3363-3372
© Copyright 2011 ACM
Summary: Though data analysis tools continue to improve, analysts still expend an
inordinate amount of time and effort manipulating data and assessing data
quality issues. Such "data wrangling" regularly involves reformatting data
values or layout, correcting erroneous or missing values, and integrating
multiple data sources. These transforms are often difficult to specify and
difficult to reuse across analysis tasks, teams, and tools. In response, we
introduce Wrangler, an interactive system for creating data transformations.
Wrangler combines direct manipulation of visualized data with automatic
inference of relevant transforms, enabling analysts to iteratively explore the
space of applicable operations and preview their effects. Wrangler leverages
semantic data types (e.g., geographic locations, dates, classification codes)
to aid validation and type conversion. Interactive histories support review,
refinement, and annotation of transformation scripts. User study results show
that Wrangler significantly reduces specification time and promotes the use of
robust, auditable transforms instead of manual editing.
[23]
Data collection by the people, for the people
Workshops
/
Robson, Christine
/
Kandel, Sean
/
Heer, Jeffrey
/
Pierce, Jeffrey
Proceedings of ACM CHI 2011 Conference on Human Factors in Computing Systems
2011-05-07
v.2
p.25-28
© Copyright 2011 ACM
Summary: Data Collection by the People, for the People is a CHI 2011 workshop to
explore data from the crowd, bringing together mobile crowdsourcing &
participatory urbanism researchers with data analysis and visualization
researchers. The workshop is a two-day event beginning with a day of field work
in the city of Vancouver, trying out mobile crowdsourcing applications and data
analysis tools. Participants are encouraged to contribute applications and
tools which they wish to share. Our goal is to provoke discussion and
brainstorming, enabling both data collection researchers and data
manipulation/analysis researchers to benefit from mutually learned lessons
about crowdsourced data.
[24]
Groups without tears: mining social topologies from email
Social computing and navigation
/
MacLean, Diana
/
Hangal, Sudheendra
/
Teh, Seng Keat
/
Lam, Monica S.
/
Heer, Jeffrey
Proceedings of the 2011 International Conference on Intelligent User
Interfaces
2011-02-13
p.83-92
© Copyright 2011 ACM
Summary: As people accumulate hundreds of "friends" in social media, a flat list of
connections becomes unmanageable. Interfaces agnostic to social structure
hinder the nuanced sharing of personal data such as photos, status updates,
news feeds, and comments. To address this problem, we propose social
topologies, a set of potentially overlapping and nested social groups, that
represent the structure and content of a person's social network as a
first-class object. We contribute an algorithm for creating social topologies
by mining communication history and identifying likely groups based on
co-occurrence patterns. We use our algorithm to populate a browser interface
that supports creation and editing of social groups via direct manipulation. A
user study confirms that our approach models subjects' social topologies well,
and that our interface enables intuitive browsing and management of a personal
social landscape.
[25]
Tracing genealogical data with TimeNets
Information visualization
/
Kim, Nam Wook
/
Card, Stuart K.
/
Heer, Jeffrey
Proceedings of the 2010 International Conference on Advanced Visual
Interfaces
2010-05-26
p.241-248
Keywords: TimeNets, genealogy, timelines, visualization
© Copyright 2010 ACM
Summary: We present TimeNets, a new visualization technique for genealogical data.
Most genealogical diagrams prioritize the display of generational relations. To
enable analysis of families over time, TimeNets prioritize temporal
relationships in addition to family structure. Individuals are represented
using timelines that converge and diverge to indicate marriage and divorce;
directional edges connect parents and children. This representation both
facilitates perception of temporal trends and provides a substrate for
communicating non-hierarchical patterns such as divorce, remarriage, and plural
marriage. We also apply degree-of-interest techniques to enable scalable,
interactive exploration. We present our design decisions, layout algorithm, and
a study finding that TimeNets accelerate analysis tasks involving temporal
data.