Proceedings of the 2006 AVI Workshop on BEyond time and errors: novel evaLuation methods for Information Visualization

Fullname:BELIV'06 Proceedings of the 2006 AVI Workshop on BEyond time and errors: novel evaLuation methods for Information Visualization
Editors:Enrico Bertini; Catherine Plaisant; Giuseppe Santucci
Location:Venice, Italy
Standard No:ISBN: 1-59593-562-2; hcibib: BELIV06
  1. Challenges with controlled studies
  2. Lessons learned from case studies
  3. Methodologies: novel approaches and metrics
  4. Methodologies: heuristics for information visualization
  5. Developing benchmarks datasets and tasks

Challenges with controlled studies

Evaluating information visualisations (pp. 1-5)
  Keith Andrews
As more experience is being gained with the evaluation of information visualisation interfaces, weaknesses in current evaluation practice are coming to the fore. This position paper presents an overview of currently used evaluation methods, followed by a discussion of my experiences and lessons learned from a series of studies comparing hierarchy browsers.
An explorative analysis of user evaluation studies in information visualisation (pp. 1-7)
  Geoffrey Ellis; Alan Dix
This paper presents an analysis of user studies from a review of papers describing new visualisation applications and uses these to highlight various issues related to the evaluation of visualisations. We first consider some of the reasons why the process of evaluating visualisations is so difficult. We then dissect the problem by discussing the importance of recognising the nature of experimental design, datasets and participants as well as the statistical analysis of results. We propose explorative evaluation as a method of discovering new things about visualisation techniques, which may give us a better understanding of the mechanisms of visualisations. Finally we give some practical guidance on how to do evaluation correctly.
Keywords: case study, evaluation, explorative evaluation, information visualisation

Lessons learned from case studies

Methods for the evaluation of an interactive InfoVis tool supporting exploratory reasoning processes (pp. 1-6)
  Markus Rester; Margit Pohl
Developing Information Visualization (InfoVis) techniques for complex knowledge domains makes it necessary to apply alternative methods of evaluation. In the evaluation of Gravi++ we used several methods and studied different user groups. We developed a reporting system yielding data about the insights the subjects gained during the exploration. It provides complex information about subjects' reasoning processes. Log files are valuable for time-dependent analysis of cognitive strategies. Focus groups provide a different view on the process of gaining insights. We assume that our experiences with all these methods can also be applied in similar evaluation studies on InfoVis techniques for complex data.
Evaluating information visualization applications with focus groups: the CourseVis experience (pp. 1-6)
  Riccardo Mazza
This paper reports our experience of evaluating an application that uses visualization approaches to support instructors in Web based distance education. The evaluation took place in three stages: a focus group, an experimental study, and a semi-structured interview. In this paper we focus our attention on the focus group, and we will show how this evaluation approach can be very effective in uncovering unexpected problems that cannot be identified with analytic evaluations or controlled experiments.
Keywords: focus group, human factors, information visualization evaluation
Evaluating visual table data understanding (pp. 1-5)
  Nathalie Henry; Jean-Daniel Fekete
In this paper, we focus on evaluating how information visualization supports exploration for visual table data. We present a controlled experiment designed to evaluate how the layout of table data affects the user understanding and his exploration process. This experiment raised interesting problems from the design phase to the data analysis. We present our task taxonomy, the experiment procedure and give clues about data collection and analysis. We conclude with lessons learnt from this experiment and discuss the format of future evaluation.
Keywords: controlled experiment, evaluation, information visualization, visual table data

Methodologies: novel approaches and metrics

Visual quality metrics (pp. 1-5)
  Enrico Bertini; Giuseppe Santucci
The definition and usage of quality metrics for Information Visualization techniques is still an immature field. Several proposals are available but a common view and understanding of this issue is still missing. This paper attempts a first step toward a visual quality metrics systematization, providing a general classification of both metrics and usage purposes. Moreover, the paper explores a quite neglected class of visual quality metrics, namely Feature Preservation Metrics, that allow for evaluating and improving in a novel way the effectiveness of basic Infovis techniques.
Metrics for analyzing rich session histories (pp. 1-5)
  Howard Goodell; Chih-Hung Chiang; Curran Kelleher; Alex Baumann; Georges Grinstein
To be most useful, evaluation metrics should be based on detailed observation and effective analysis of a full spectrum of system use. Because observation is costly, ideally we want a system to provide in-depth data collection with allied analyses of the key user interface elements. We have developed a visualization and analysis platform [1] that automatically records user actions and states at a high semantic level [2 and 3], and can be directly restored to any state. Audio and text annotations are collected and indexed to states, allowing users to comment on their current situation as they work, and/or as they review the session. These capabilities can be applied to usability evaluation of the system, letting users describe problems they encountered or suggest improvements to the environment. Additionally, computed metrics are provided at each state [3, 4, and 5]. We believe that the metrics and the associated history data will allow us to deduce patterns of data exploration, to compare users, to evaluate tools, and to understand in a more automated approach the usability of the visualization system as a whole.
Keywords: session history analysis, session history visualization, user monitoring
Strategies for evaluating information visualization tools: multi-dimensional in-depth long-term case studies (pp. 1-7)
  Ben Shneiderman; Catherine Plaisant
After an historical review of evaluation methods, we describe an emerging research method called Multi-dimensional In-depth Long-term Case studies (MILCs) which seems well adapted to study the creative activities that users of information visualization systems engage in. We propose that the efficacy of tools can be assessed by documenting 1) usage (observations, interviews, surveys, logging etc.) and 2) expert users' success in achieving their professional goals. We summarize lessons from related ethnography methods used in HCI and provide guidelines for conducting MILCs for information visualization. We suggest ways to refine the methods for MILCs in modest sized projects and then envision ambitious projects with 3-10 researchers working over 1-3 years to understand individual and organizational use of information visualization by domain experts working at the frontiers of knowledge in their fields.

Methodologies: heuristics for information visualization

Systematic inspection of information visualization systems (pp. 1-4)
  Carmelo Ardito; Paolo Buono; Maria F. Costabile; Rosa Lanzilotti
Recently, several information visualization (IV) tools have been produced and there is a growing number of commercial products. To contribute to a widespread adoption of IV tools, it is indispensable that these tools are effective, efficient and satisfying for the intended users. Various evaluation techniques can be considered and applied at the different phases of the IV software life-cycle. In this paper we propose an inspection technique based on the use of evaluation patterns, called Abstract Tasks, that take into account the specific nature of information visualization systems.
Heuristics for information visualization evaluation (pp. 1-6)
  Torre Zuk; Lothar Schlesier; Petra Neumann; Mark S. Hancock; Sheelagh Carpendale
Heuristic evaluation is a well known discount evaluation technique in human-computer interaction (HCI) but has not been utilized in information visualization (InfoVis) to the same extent. While several sets of heuristics have been used or proposed for InfoVis, it is not yet known what kind of heuristics are useful for finding general InfoVis problems. We performed a meta-analysis with the goal of exploring the issues of heuristic evaluation for InfoVis. This meta-analysis concentrates on issues pertaining to the selection and organization of heuristics, and the process itself. For this purpose, we used three sets of previously published heuristics to assess a visual decision support system that is used to examine simulation data. The meta-analysis shows that the evaluation process and results have a high dependency on the heuristics and the types of evaluators chosen. We describe issues related to interpretation, redundancy, and conflict in heuristics. We also provide a discussion of generalizability and categorization of these heuristics.

Developing benchmarks datasets and tasks

Just how dense are dense graphs in the real world?: a methodological note (pp. 1-7)
  Guy Melancon
This methodological note focuses on the edge density of real world examples of networks. The edge density is a parameter of interest typically when putting up user studies in an effort to prove the robustness or superiority of a novel graph visualization technique. We survey many real world examples all being of equal interest in Information Visualization, and draw a list of conclusions on how to tune edge density when randomly generating graphs in order to build artificial though realistic examples.
Keywords: edge density, graph models, information visualization, interface evaluation, random generation, real world examples
Task taxonomy for graph visualization (pp. 1-5)
  Bongshin Lee; Catherine Plaisant; Cynthia Sims Parr; Jean-Daniel Fekete; Nathalie Henry
Our goal is to define a list of tasks for graph visualization that has enough detail and specificity to be useful to: 1) designers who want to improve their system and 2) to evaluators who want to compare graph visualization systems. In this paper, we suggest a list of tasks we believe are commonly encountered while analyzing graph data. We define graph specific objects and demonstrate how all complex tasks could be seen as a series of low-level tasks performed on those objects. We believe that our taxonomy, associated with benchmark datasets and specific tasks, would help evaluators generalize results collected through a series of controlled experiments.
Keywords: evaluation, graph visualization, task taxonomy
Shakespeare's complete works as a benchmark for evaluating multiscale document navigation techniques (pp. 1-6)
  Yves Guiard; Michel Beaudouin-Lafon; Yangzhou Du; Caroline Appert; Jean-Daniel Fekete; Olivier Chapuis
In this paper, we describe an experimental platform dedicated to the comparative evaluation of multiscale electronic-document navigation techniques. One noteworthy characteristic of our platform is that it allows the user not only to translate the document (for example, to pan and zoom) but also to tilt the virtual camera to obtain freely chosen perspective views of the document. Second, the platform makes it possible to explore, with semantic zooming, the 150,000 verses that comprise the complete works of William Shakespeare. We argue that reaching and selecting one specific verse in this very large text corpus amounts to a perfectly well defined Fitts task, leading to rigorous assessments of target acquisition performance. For lack of a standard, the various multiscale techniques that have been reported recently in the literature are difficult to compare. We recommend that Shakespeare's complete works, converted into a single document that can be zoomed both geometrically and semantically, be used as a benchmark to facilitate systematic experimental comparisons, using Fitts' target acquisition paradigm.
Keywords: Fitts' law, evaluation benchmark, evaluation standard, multiscale document navigation techniques, target acquisition
Threat stream data generator: creating the known unknowns for test and evaluation of visual analytics tools (pp. 1-3)
  Mark A. Whiting; Wendy Cowley; Jereme Haack; Doug Love; Stephen Tratz; Caroline Varley; Kim Wiessner
We present the Threat Stream Data Generator, an approach and tool for creating synthetic data sets for the test and evaluation of visual analytics tools and environments. We have focused on working with information analysts to understand the characteristics of threat data, to develop scenarios that will allow us to define data sets with known ground truth, to define a process of mapping threat elements in a scenario to expressions in data, and creating a software system to generate the data. We are also developing approaches to evaluating our data sets considering characteristics such as threat subtlety and appropriateness of data for the software to be examined.
Keywords: data generator, threat, threat stream
A taxonomy of tasks for guiding the evaluation of multidimensional visualizations (pp. 1-6)
  Eliane R. A. Valiati; Marcelo S. Pimenta; Carla M. D. S. Freitas
The design of multidimensional visualization techniques is based on the assumption that a graphical representation of a large dataset can give more insight to a user, by providing him/her a more intuitive support in the process of exploiting data. When developing a visualization technique, the analytic and exploratory tasks that a user might need or want to perform on the data should guide the choice of the visual and interaction metaphors implemented by the technique. Usability testing of visualization techniques also needs the definition of users' tasks. The identification and understanding of the nature of the users' tasks in the process of acquiring knowledge from visual representations of data is a recent branch in information visualization research. Some works have proposed taxonomies to organize tasks that a visualization technique should support. This paper proposes a taxonomy of visualization tasks, based on existing taxonomies as well as on the observation of users performing exploratory tasks in a multidimensional data set using two different visualization techniques, Parallel Coordinates and RadViz. Different scenarios involving low-level tasks were estimated for the completion of some high-level tasks, and they were compared to the scenarios observed during the users' experiments.
Keywords: information visualization, usability evaluation