
Proceedings of the 2010 Workshop on BEyond time and errors: novel evaLuation methods for Information Visualization

Fullname: BELIV'10: BEyond time and errors: novel evaLuation methods for Information Visualization
Location: Atlanta, Georgia
Dates: 2010-Apr-10 to 2010-Apr-11
Standard No: ISBN 978-1-4503-0007-0; ACM DL: Table of Contents; hcibib: BELIV10
Links: Workshop Website | Umbrella Conference Website
  1. New Metrics I
  2. New Metrics II
  3. Methods
  4. Tasks / Data
  5. Insight Evaluations
  6. Mass Evaluations
  7. Physiological Measurements
  8. Evaluation Lessons

New Metrics I

Developing qualitative metrics for visual analytic environments (pp. 1-7)
  Jean Scholtz
In this paper, we examine reviews for the entries to the 2009 Visual Analytics Science and Technology (VAST) Symposium Challenge. By analyzing these reviews we gained a better understanding of what is important to our reviewers, both visualization researchers and professional analysts. This is a bottom-up approach to the development of heuristics to use in the evaluation of visual analytic environments. This paper presents the meta-analysis and its results.
Many roads lead to Rome: mapping users' problem solving strategies (pp. 8-15)
  Eva Mayr; Michael Smuc; Hanna Risku
Especially in ill-defined problems such as complex, real-world tasks, more than one way leads to a solution. Until now, the evaluation of information visualizations has often been restricted to measuring outcomes only (time and error) or insights into the data set. A more detailed look into the processes which lead to or hinder task completion is provided by analyzing users' problem solving strategies. A study illustrates how these strategies can be assessed and how this knowledge can be used in participatory design to improve a visual analytics tool. To provide users with a tool that functions as a real scaffold, it should allow them to choose their own path to Rome. We discuss how evaluation of problem solving strategies can shed more light on the users' "exploratory minds".

New Metrics II

Exploring information visualization: describing different interaction patterns (pp. 16-23)
  Margit Pohl; Sylvia Wiltner; Silvia Miksch
Interactive Information Visualization methods engage users in exploratory behavior. Detailed information about such processes can help developers to improve the design of such methods. The following study, which is based on software logging, describes patterns of such behavior in more detail. Subjects in our study engaged in some activities (e.g. adding data, changing the form of visualization) significantly more than in others. They adapted their activity patterns to different tasks, but not fundamentally so. In addition, subjects adopted very systematic sequences of actions. These sequences were quite similar across the whole sample, indicating that they might reflect specific problem solving behavior. Davidson's [7] framework of problem solving behavior is used to interpret the results. More research is necessary to show whether similar interaction patterns can be found for the usage of other InfoVis methodologies as well.
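As a concrete, entirely hypothetical illustration of the kind of analysis such software logging enables, the sketch below tallies per-action counts and frequent two-step action sequences from a toy event log; the action names are invented, not taken from the study:

```python
from collections import Counter

# Hypothetical event log: (timestamp_s, action) pairs, as a logging
# component like the one described might record them.
log = [
    (0.0, "add_data"), (2.1, "change_vis"), (3.5, "add_data"),
    (5.0, "filter"), (6.2, "add_data"), (7.9, "change_vis"),
    (9.4, "filter"), (10.8, "add_data"), (12.0, "change_vis"),
]

actions = [a for _, a in log]

# How often each activity occurs (some dominate, as in the study).
counts = Counter(actions)

# Frequent two-step sequences hint at systematic action strategies.
bigrams = Counter(zip(actions, actions[1:]))

print(counts.most_common(2))
print(bigrams.most_common(1))
```

Longer n-grams or sequence-mining methods could be substituted for the bigram count; the point is only that a flat log already supports both frequency and sequence views of the data.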
Towards information-theoretic visualization evaluation measure: a practical example for Bertin's matrices (pp. 24-28)
  Innar Liiv
This paper presents a discussion about matrix-based representation evaluation measures, including a review of related evaluation measures from different scientific disciplines and a proposal for promising approaches. The paper advocates linking or replacing a large portion of indefinable aesthetics with a mathematical framework and theory backed up by an incomputable function -- Kolmogorov complexity. A suitable information-theoretic evaluation measure is proposed together with a practical approximating implementation example for Bertin's Matrices.
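Since Kolmogorov complexity itself is uncomputable, practical measures in this spirit typically substitute a standard compressor. The sketch below is an assumption-laden stand-in rather than the paper's actual measure: it scores two row orderings of a binary Bertin-style matrix by zlib-compressed size, on the premise that a well-ordered matrix is more regular and therefore compresses better.

```python
import zlib

def compressed_size(matrix):
    """Approximate Kolmogorov complexity by zlib-compressed byte length."""
    data = b"".join(bytes(row) for row in matrix)
    return len(zlib.compress(data, 9))

n = 16
# Block-structured matrix: homogeneous quadrants on the diagonal,
# the kind of pattern a well-reordered Bertin matrix exposes.
ordered = [[1 if (r < n // 2) == (c < n // 2) else 0 for c in range(n)]
           for r in range(n)]

# The same rows in a scrambled order, destroying the visible blocks.
perm = [(r * 7) % n for r in range(n)]  # 7 is coprime to 16, so this is a permutation
shuffled = [ordered[p] for p in perm]

# The more regular matrix should compress at least as well, giving a
# computable proxy for the uncomputable ideal.
print(compressed_size(ordered), compressed_size(shuffled))
```

Any general-purpose compressor would do here; zlib is chosen only because it ships with the standard library.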

Methods

Learning-based evaluation of visual analytic systems (pp. 29-34)
  Remco Chang; Caroline Ziemkiewicz; Roman Pyzh; Joseph Kielman; William Ribarsky
Evaluation in visualization remains a difficult problem because of the unique constraints and opportunities inherent to visualization use. While many potentially useful methodologies have been proposed, there remain significant gaps in assessing the value of the open-ended exploration and complex task-solving that the visualization community holds up as an ideal. In this paper, we propose a methodology to quantitatively evaluate a visual analytics (VA) system based on measuring what is learned by its users as the users reapply the knowledge to a different problem or domain. The motivation for this methodology is based on the observation that the ultimate goal of a user of a VA system is to gain knowledge of and expertise with the dataset, task, or tool itself. We propose a framework for describing and measuring knowledge gain in the analytical process based on these three types of knowledge and discuss considerations for evaluating each. We propose that through careful design of tests that examine how well participants can reapply knowledge learned from using a VA system, the utility of the visualization can be more directly assessed.
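One plausible way to quantify "what is learned", sketched here under assumptions of our own rather than the paper's actual instrument, is a Hake-style normalized gain over pre- and post-test scores on a knowledge-transfer task:

```python
def normalized_gain(pre, post, max_score=100.0):
    """Hake-style normalized gain: the fraction of the possible
    improvement that was actually achieved between pre- and post-test."""
    if pre >= max_score:
        return 0.0
    return (post - pre) / (max_score - pre)

# Hypothetical transfer-test scores before and after using a VA system.
participants = [(40, 70), (55, 85), (20, 50)]
gains = [normalized_gain(pre, post) for pre, post in participants]
print([round(g, 2) for g in gains])
```

Normalizing by the headroom (rather than using the raw difference) keeps participants who start with high pre-test scores from being unfairly penalized.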

Tasks / Data

A descriptive model of visual scanning (pp. 35-42)
  Stéphane Conversy; Christophe Hurter; Stéphane Chatty
When designing a representation, a designer implicitly formulates a sequence of visual tasks required to understand and use the representation effectively. This paper aims to make the sequence of visual tasks explicit, in order to help designers elicit their design choices. In particular, we present a set of concepts to systematically analyze what a user must theoretically do to decipher a representation. The analysis consists of a decomposition of the activity of scanning into elementary visualization operations. We show how the analysis applies to various existing representations, and how expected benefits can be expressed in terms of elementary operations. The set of elementary operations forms the basis of a shared, common language for representation designers. The decomposition highlights the challenges encountered by a user when deciphering a representation, and helps designers to exhibit possible flaws in their design, justify their choices, and compare designs.
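One illustrative way to work with such a decomposition (the operation names below are invented, not the paper's vocabulary) is to record each design's theoretical decipherment as a sequence of elementary operations and compare the resulting operation profiles:

```python
from collections import Counter

# Hypothetical elementary operations a reader performs when deciphering
# two alternative representations of the same task.
design_a = ["enter", "seek", "read_value", "seek", "read_value", "compare"]
design_b = ["enter", "seek", "read_value", "memorize", "seek",
            "read_value", "recall", "compare"]

def profile(ops):
    """Count how often each elementary operation occurs."""
    return Counter(ops)

# A shorter sequence, and fewer memory-related operations, suggest a
# lighter decipherment burden for design A than for design B.
print(profile(design_a))
print(len(design_a), len(design_b))
```

Comparing totals is of course crude; a real analysis might weight operations by their cognitive cost, but even raw counts make design differences explicit.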
Generating a synthetic video dataset (pp. 43-48)
  Mark A. Whiting; Jereme Haack; Carrie Varley
A synthetic video dataset, scenario, and task were included in the 2009 VAST Challenge to allow participants an opportunity to demonstrate visual analytic tool use on video data. This was the first time a video challenge had been presented as part of the VAST contest, and it provided interesting challenges in task and dataset development, video analytic tool development, and metrics for judging entries. We describe the considerations and requirements for generating a usable challenge, the video creation itself, and some submissions and assessments from that mini-challenge.

Insight Evaluations

Is your user hunting or gathering insights?: identifying insight drivers across domains (pp. 49-54)
  Michael Smuc; Eva Mayr; Hanna Risku
In recent years, using the number of insights to benchmark visual analytics tools has become a prominent method in the InfoVis community. The insight methodology has become a frequently used instrument to measure the performance of tools developed for highly specialized purposes and highly specialized domain experts. But some tools have a wider target group of experts with knowledge in different domains. The utility of the insight method for expert user groups without specific domain knowledge has been addressed to a far lesser extent. In a case study we illustrate how and where insights from experts with and without domain knowledge differ, and how these findings might enrich the evaluation of visualization tools designed for usage across different domains.
Comparing benchmark task and insight evaluation methods on timeseries graph visualizations (pp. 55-62)
  Purvi Saraiya; Chris North; Karen Duca
A study to compare two different empirical research methods for evaluating visualization tools is described: the traditional benchmark-task method and the insight method. The methods were compared using different criteria such as: the conclusions about the visualization tools provided by each method, the time participants spent during the study, the time and effort required to analyze the resulting empirical data, and the effect of individual differences between participants on the results. The studies used three graph visualization alternatives to associate bioinformatics microarray timeseries data to pathway graph vertices, based on popular approaches used in existing bioinformatics software.

Mass Evaluations

Do Mechanical Turks dream of square pie charts? (pp. 63-70)
  Robert Kosara; Caroline Ziemkiewicz
Online studies are an attractive alternative to the labor-intensive lab study, and promise the possibility of reaching a larger variety and number of people than at a typical university. There are also a number of drawbacks, however, that have made these studies largely impractical so far.
   Amazon's Mechanical Turk is a web service that facilitates the assignment of small, web-based tasks to a large pool of anonymous workers. We used it to conduct several perception and cognition studies, one of which was identical to a previous study performed in our lab.
   We report on our experiences and present ways to avoid common problems by taking them into account in the study design, and taking advantage of Mechanical Turk's features.
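A common safeguard in such online studies, sketched here with invented data rather than the authors' actual procedure, is to embed gold-standard catch questions in the task and exclude workers who miss too many of them:

```python
# Gold-standard catch questions with known answers (hypothetical).
gold = {"q1": "A", "q2": "C", "q3": "B"}

# Hypothetical worker responses: worker id -> {question id: answer}.
responses = {
    "worker1": {"q1": "A", "q2": "C", "q3": "B"},
    "worker2": {"q1": "A", "q2": "B", "q3": "B"},
    "worker3": {"q1": "D", "q2": "B", "q3": "A"},
}

def catch_accuracy(answers):
    """Fraction of catch questions this worker answered correctly."""
    correct = sum(answers.get(q) == a for q, a in gold.items())
    return correct / len(gold)

# Keep only workers who pass at least 2 of the 3 catch trials.
kept = [w for w, ans in responses.items() if catch_accuracy(ans) >= 2 / 3]
print(kept)
```

The threshold is a design decision: too strict and honest but careless workers are lost, too lenient and random clickers contaminate the data.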

Physiological Measurements

Comparing information graphics: a critical look at eye tracking (pp. 71-78)
  Joseph H. Goldberg; Jonathan I. Helfman
Effective graphics are essential for understanding complex information and completing tasks. To assess graphic effectiveness, eye tracking methods can help provide a deeper understanding of scanning strategies that underlie more traditional, high-level accuracy and task completion time results. Eye tracking methods entail many challenges, such as defining fixations, assigning fixations to areas of interest, choosing appropriate metrics, addressing potential errors in gaze location, and handling scanning interruptions. Special considerations are also required when designing, preparing, and conducting eye tracking studies. An illustrative eye tracking study was conducted to assess the differences in scanning within and between bar, line, and spider graphs, to determine which graphs best support relative comparisons along several dimensions. There was excessive scanning to locate the correct bar graph in easier tasks. Scanning across bar and line graph dimensions before comparing across graphs was evident in harder tasks. There was repeated scanning between the same dimension of two spider graphs, implying a greater cognitive demand from scanning in a circle that contains multiple linear dimensions than from scanning the linear axes of bar and line graphs. With appropriate task design and targeted analysis metrics, eye tracking techniques can illuminate visual scanning patterns hidden by more traditional time and accuracy results.
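Defining fixations, the first challenge listed in the abstract, is often handled with a dispersion-threshold (I-DT) scheme. The sketch below applies one to a synthetic gaze trace; the thresholds and data are illustrative, not taken from the study:

```python
def dispersion(window):
    """Dispersion of a gaze window: x-range plus y-range."""
    xs = [x for x, _ in window]
    ys = [y for _, y in window]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def idt_fixations(samples, dispersion_max=1.0, min_points=3):
    """Dispersion-threshold (I-DT) fixation detection: a run of
    consecutive gaze points counts as a fixation while its
    dispersion stays under dispersion_max."""
    fixations = []
    i = 0
    while i + min_points <= len(samples):
        j = i + min_points
        if dispersion(samples[i:j]) <= dispersion_max:
            # Grow the window while the dispersion stays low.
            while j < len(samples) and dispersion(samples[i:j + 1]) <= dispersion_max:
                j += 1
            xs = [x for x, _ in samples[i:j]]
            ys = [y for _, y in samples[i:j]]
            # Report the fixation as the centroid of its window.
            fixations.append((sum(xs) / len(xs), sum(ys) / len(ys)))
            i = j
        else:
            i += 1
    return fixations

# Synthetic gaze trace: a fixation near (10, 10), one saccade sample,
# then a fixation near (50, 20).
trace = [(10.0, 10.0), (10.2, 9.9), (9.9, 10.1), (10.1, 10.0),
         (30.0, 15.0),
         (50.0, 20.0), (50.3, 19.8), (49.9, 20.1), (50.1, 20.0)]
print(idt_fixations(trace))
```

In practice the dispersion and duration thresholds must be tuned to the tracker's sampling rate and noise level, which is exactly the kind of methodological decision the paper flags.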

Evaluation Lessons

Evaluating information visualization in large companies: challenges, experiences and recommendations (pp. 79-86)
  Michael Sedlmair; Petra Isenberg; Dominikus Baur; Andreas Butz
We examine the process and some implications of evaluating information visualization in a large company setting. While several researchers have addressed the difficulties of evaluating information visualizations with regard to changing data, tasks, and visual encodings, considerably less work has been published on the difficulties of evaluation within specific work contexts. In this paper, we specifically focus on the challenges arising in the context of large companies with several thousand employees. We present a collection of evaluation challenges, discuss our own experiences conducting information visualization evaluation within the context of a large automotive company, and present a set of recommendations derived from our experiences. The set of challenges and recommendations can aid researchers and practitioners in preparing and conducting evaluations of their products within a large company setting.