| Strategies for Guiding Interactive Search: An Empirical Investigation Into the Consequences of Label Relevance for Assessment and Selection | | BIBA | Full-Text | 1-46 | |
| Duncan P. Brumby; Andrew Howes | |||
| When searching a novel Web page, people often estimate the likelihood that labeled links on the page will lead to their goal. A rational analysis of this activity suggests that people should adjust their estimate of the likelihood that any one item will lead to the goal in a manner that is sensitive to the context provided by the likelihoods that other items on the page will lead to the goal. Two experiments were designed to provide evidence to discriminate between this account and others found in the literature (e.g., satisficing and assess-all accounts). The experiments systematically manipulated the relevance of the distractor items and the location of the target item on the page. The results showed that (a) a high-value item was more likely to be selected when it was first encountered if the relevance of competing distractors was relatively low and (b) more items were assessed prior to selection when the distractors were of greater semantic relevance to the goal. The location manipulation showed that if more distractors were assessed prior to the target item, then the relevance of the distractors had a greater influence on the decision as to whether to select the target immediately. These results suggest that decisions as to when to select an item from the page are sensitive to the context provided by the likelihoods of all of the items so far assessed and not just to the most recent item. The findings are therefore inconsistent with both satisficing and assess-all accounts of interactive search. | |||
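The contrast between these accounts lends itself to a worked illustration. Below is a minimal Python sketch, not the authors' model: the thresholds, the flat prior standing in for unassessed items, and the renormalization scheme are all invented for illustration. It shows how a satisficing rule and a context-sensitive rule diverge: the satisficer selects the first item above a fixed threshold regardless of the other items, whereas the context-sensitive rule selects a high-value item immediately only when the competing distractors assessed so far are weak.

```python
# Illustrative sketch of two selection rules for labeled links assessed
# one at a time. All numbers are invented; this is not the authors' model.

def satisficing(ratings, threshold=0.7):
    """Select the first item whose relevance exceeds a fixed threshold,
    ignoring the relevance of every other item on the page."""
    for i, r in enumerate(ratings):
        if r >= threshold:
            return i, i + 1              # (selected index, items assessed)
    return len(ratings) - 1, len(ratings)

def context_sensitive(ratings, n_items, prior=0.5, threshold=0.4):
    """Select an item once its relevance, renormalized against the items
    assessed so far (with a flat prior for unassessed ones), stands out
    from the context of its competitors."""
    seen = []
    for i, r in enumerate(ratings):
        seen.append(r)
        unseen = n_items - len(seen)
        if r / (sum(seen) + unseen * prior) >= threshold:
            return i, i + 1              # confident enough to stop early
    best = max(range(len(seen)), key=seen.__getitem__)
    return best, len(ratings)            # assessed everything; pick the best

low_context  = [0.1, 0.9, 0.2, 0.1]  # strong target among weak distractors
high_context = [0.6, 0.9, 0.7, 0.6]  # same target among strong distractors

print(satisficing(low_context), satisficing(high_context))
# -> (1, 2) (1, 2): satisficing stops at the target in both contexts
print(context_sensitive(low_context, 4), context_sensitive(high_context, 4))
# -> (1, 2) (1, 4): weak distractors allow immediate selection; strong
#    distractors force assessment of the whole page first
```

The qualitative pattern matches the abstract's findings: only the context-sensitive rule assesses more items before selecting when the distractors are semantically closer to the goal.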
| Exiting the Cleanroom: On Ecological Validity and Ubiquitous Computing | | BIBA | Full-Text | 47-99 | |
| Scott Carter; Jennifer Mankoff; Scott R. Klemmer; Tara Matthews | |||
| Over the past decade and a half, industry and academia have invested considerable time and money in the realization of ubiquitous computing. Yet design approaches that yield ecologically valid understandings of ubiquitous computing systems, which can help designers make design decisions based on how systems perform in the context of actual experience, remain rare. The central question underlying this article is, What barriers stand in the way of real-world, ecologically valid design for ubicomp? Using a literature survey and interviews with 28 developers, we illustrate how issues of sensing and scale cause ubicomp systems to resist iteration, prototype creation, and ecologically valid evaluation. In particular, we found that developers have difficulty creating prototypes that are both robust enough for realistic use and able to handle ambiguity and error, and that they struggle to gather useful data from evaluations because critical events occur infrequently, because the level of use necessary to evaluate the system is difficult to maintain, or because the evaluation itself interferes with use of the system. We outline pitfalls for developers to avoid as well as practical solutions, and we draw on our results to outline research challenges for the future. Crucially, we do not argue for particular processes, sets of metrics, or intended outcomes; rather, we focus on prototyping tools and evaluation methods that support realistic use in realistic settings and that can be selected according to the needs and goals of a particular developer or researcher. | |||
| The Impact of Tangible User Interfaces on Designers' Spatial Cognition | | BIBA | Full-Text | 101-137 | |
| Mi Jeong Kim; Mary Lou Maher | |||
| Most studies of tangible user interfaces for tabletop design systems are undertaken from a technology viewpoint. Although there have been studies focusing on the development of new interactive environments that employ tangible user interfaces for designers, evaluations with respect to designers' spatial cognition are lacking. In this research we study the effects of tangible user interfaces on designers' spatial cognition to provide empirical evidence for the anecdotal views of the effect of tangible user interfaces. To highlight the expected changes in spatial cognition while using tangible user interfaces, we compared designers using a tangible user interface on a tabletop system with 3D blocks to designers using a graphical user interface on a desktop computer with a mouse and keyboard. The ways in which designers use the two different interfaces for 3D design were examined using a protocol analysis method. The results reveal that designers using 3D blocks perceived more spatial relationships among multiple objects and spaces and discovered new visuo-spatial features when revisiting their design configurations. The designers using the tangible interfaces spent more time relocating objects to different locations to test the moves, and they interacted with the external representation through large body movements, implying an immersion in the design model. These two kinds of physical action assist designers' spatial cognition by reducing the cognitive load of mental visual reasoning. Further, designers using the tangible interfaces spent more time restructuring the design problem by introducing new functional issues as design requirements, and they produced more discontinuities in the design process, which provide opportunities for reflection and modification of the design. This research therefore shows that tangible user interfaces change designers' spatial cognition and that these changes are associated with creative design processes. | |||
| Integrating Physical and Digital Interactions on Walls for Fluid Design Collaboration | | BIBA | Full-Text | 138-213 | |
| Scott R. Klemmer; Katherine M. Everitt; James A. Landay | |||
| Web designers use pens, paper, walls, and tables for explaining, developing, and communicating ideas during the early phases of design. These practices inspired The Designers' Outpost. With Outpost, users collaboratively author Web site information architectures on an electronic whiteboard using physical media (sticky notes and images), structuring and annotating that information with electronic pens. This interaction is enabled by a touch-sensitive electronic whiteboard augmented with a computer vision system. The Designers' Outpost integrates wall-scale, paper-based design practices with novel electronic tools to better support collaboration during early-phase design. Our studies with professional designers showed this integration to be especially helpful for fluid transitions to other design tools, access to and exploration of design history, and remote collaboration. | |||
| The Impact of Control-Display Gain on User Performance in Pointing Tasks | | BIBA | Full-Text | 215-250 | |
| Géry Casiez; Daniel Vogel; Ravin Balakrishnan; Andy Cockburn | |||
| We theoretically and empirically examine the impact of control-display (CD) gain on mouse pointing performance. Two techniques for modifying CD gain are considered: constant gain (CG), where CD gain is uniformly adjusted by a constant multiplier, and pointer acceleration (PA), where CD gain is adjusted using a nonuniform function depending on movement characteristics. Both CG and PA are evaluated at various levels of relationship between mouse and cursor movement: from low levels, which have a near one-to-one mapping, through to high levels that aggressively amplify mouse movement. We further derive a model predicting the modification in motor-space caused by pointer acceleration. Experiments are then conducted on a standard desktop display and on a very large high-resolution display, allowing us to measure performance in high-index-of-difficulty tasks where the effect of clutching may be pronounced. The evaluation apparatus was designed to minimize device quantization effects and used accurate 3D motion tracking equipment to analyze users' limb movements. On both displays and with both gain techniques, we found that low levels of CD gain had a marked negative effect on performance, largely because of increased clutching and maximum limb speeds. High gain levels had relatively little impact on performance, with only a slight increase in time when selecting very small targets at high levels of constant gain. On the standard desktop display, pointer acceleration resulted in 3.3% faster pointing than constant gain, and up to 5.6% faster with small targets. This supported the theoretical prediction of motor-space modification but fell short of the theoretical potential, possibly because PA caused an increase in target overshooting. Both techniques were accurately modeled by Fitts' law in all gain settings except when there was a significant amount of clutching. From our results, we derive a usable range of CD gain settings, bounded by speed and accuracy thresholds, given the capabilities of a pointing device, the display, and the expected range of target widths and distances. | |||
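Because the two gain techniques are defined functionally, a short sketch can make the contrast concrete. The following Python snippet is a sketch under stated assumptions: the two-plateau linear ramp, its velocity breakpoints, and the gain levels are invented for illustration and are not the paper's (or any operating system's) actual transfer function. It shows constant gain, a simple pointer-acceleration function, and the Shannon formulation of Fitts' index of difficulty used to characterize high-ID tasks.

```python
import math

def constant_gain(mouse_dx, gain=4.0):
    """CG: cursor displacement is mouse displacement times a fixed
    multiplier, independent of how fast the mouse is moving."""
    return gain * mouse_dx

def pointer_acceleration(mouse_dx, speed, v_low=0.05, v_high=0.25,
                         g_low=2.0, g_high=12.0):
    """PA: CD gain grows with mouse speed (m/s). The linear ramp between
    two plateaus is an illustrative assumption, not the studied function."""
    if speed <= v_low:
        gain = g_low
    elif speed >= v_high:
        gain = g_high
    else:  # linear interpolation between the low and high plateaus
        t = (speed - v_low) / (v_high - v_low)
        gain = g_low + t * (g_high - g_low)
    return gain * mouse_dx

def fitts_id(distance, width):
    """Shannon formulation of Fitts' index of difficulty, in bits."""
    return math.log2(distance / width + 1)

# The same 1 cm of mouse travel moves the cursor much farther during a
# fast ballistic movement than during slow fine positioning, which is
# how PA shrinks motor-space distances without sacrificing precision.
print(pointer_acceleration(0.01, speed=0.30))  # fast: 0.12 m of cursor travel
print(pointer_acceleration(0.01, speed=0.02))  # slow: 0.02 m of cursor travel
print(f"{fitts_id(distance=4.5, width=0.01):.1f} bits")  # ~8.8: a high-ID task
```

Under a rule like this, a low constant gain forces clutching on a large display (the mouse runs out of desk), while PA concentrates high gain in the ballistic phase of a movement, consistent with the trade-offs the abstract describes.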
| A Study of the Evaluator Effect in Usability Testing | | BIBA | Full-Text | 251-277 | |
| Kasper Hornbæk; Erik Frøkjær | |||
| The evaluator effect refers to the observation that usability evaluators in similar conditions identify substantially different sets of usability problems. Yet little is known about the factors involved in the evaluator effect. We present a study of 50 novice evaluators' usability tests and their subsequent comparisons, in teams and individually, of the resulting usability problems. The same problems were analyzed independently by 10 human-computer interaction experts. The study shows an agreement between evaluators of about 40%, indicating a substantial evaluator effect. Team matching of problems following the individual matching appears to improve agreement, and evaluators express greater satisfaction with the teams' matchings. The matchings of individuals, teams, and independent experts show evaluator effects of similar sizes; yet individuals, teams, and independent experts fundamentally disagree about which problems are similar. Previous claims in the literature about the evaluator effect are challenged by the large variability in the matching of usability problems; we identify matching as a key determinant of the evaluator effect. An alternative view of usability problems and evaluator agreement is proposed in which matching is seen as an activity that helps make sense of usability problems and where the existence of a correct matching is not assumed. | |||
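The roughly 40% figure maps onto a standard quantity in the evaluator-effect literature, often called any-two agreement: the Jaccard overlap between the problem sets of each pair of evaluators, averaged over all pairs. A minimal sketch follows; the evaluator problem sets are invented for illustration.

```python
from itertools import combinations

def any_two_agreement(problem_sets):
    """Mean Jaccard overlap |Pi ∩ Pj| / |Pi ∪ Pj| across all pairs of
    evaluators; 100% would mean every evaluator reported the same set
    of usability problems."""
    pairs = list(combinations(problem_sets, 2))
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)

# Hypothetical problem sets from three evaluators testing the same system:
evaluators = [
    {"p1", "p2", "p3", "p5"},
    {"p2", "p3", "p4"},
    {"p1", "p2", "p6"},
]
print(f"{any_two_agreement(evaluators):.0%}")  # -> 33%
```

Note that the measure presupposes that problems have already been matched across evaluators; as the study argues, that matching step is itself a key determinant of the evaluator effect, so the number depends on who does the matching.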
| Scoping Analytical Usability Evaluation Methods: A Case Study | | BIBA | Full-Text | 278-327 | |
| Ann E. Blandford; Joanne K. Hyde; Thomas R. G. Green; Iain Connell | |||
| Analytical usability evaluation methods (UEMs) can complement empirical evaluation of systems: for example, they can often be used earlier in design and can provide accounts of why users might experience difficulties, as well as what those difficulties are. However, their properties and value are only partially understood. One way to improve our understanding is by detailed comparisons using a single interface or system as a target for evaluation, but we need to look deeper than simple problem counts: we need to consider what kinds of accounts each UEM offers, and why. Here, we report on a detailed comparison of eight analytical UEMs. These eight methods were applied to a robotic arm interface, and the findings were systematically compared against video data of the arm in use. The usability issues that were identified could be grouped into five categories: system design, user misconceptions, conceptual fit between user and system, physical issues, and contextual ones. Other possible categories such as user experience did not emerge in this particular study. With the exception of Heuristic Evaluation, which supported a range of insights, each analytical method was found to focus attention on just one or two categories of issues. Two of the three "home-grown" methods (Evaluating Multimodal Usability and Concept-based Analysis of Surface and Structural Misfits) were found to occupy particular niches in the space, whereas the third (Programmable User Modeling) did not. This approach has identified commonalities and contrasts between methods and provided accounts of why a particular method yielded the insights it did. Rather than considering measures such as problem count or thoroughness, this approach has yielded insights into the scope of each method. | |||
| Usability Problem Reports for Comparative Studies: Consistency and Inspectability | | BIBA | Full-Text | 329-380 | |
| Arnold P. O. S. Vermeeren; Jelle Attema; Evren Akar; Huib de Ridder; Andrea J. van Doorn; Cigdem Erbug; Ali E. Berkman; Martin C. Maguire | |||
| This study explores issues of consistency and inspectability in usability test data analysis processes and reports. Problem reports resulting from usability tests performed by three professional usability labs in three different countries are compared. Each of the labs conducted a usability test on the same product, applying an agreed test protocol that was collaboratively developed by the labs. Each lab first analyzed its own findings as it would in its regular professional practice. A few weeks later, the labs analyzed their findings again, this time all applying the same method (SlimDEVAN, a simplified version of DEVAN, a method developed for facilitating comparison of findings from usability tests in an academic setting). Levels of agreement between labs did not improve when they all used SlimDEVAN, suggesting inherent subjectivity in their analyses. The consistency of single analyst teams varied considerably, and a method like SlimDEVAN can help make the analysis process and findings more inspectable. Inspectability is helpful in comparative studies based on identified usability problems because it allows findings to be traced back to original observations and lays bare the subjective parts of the data analysis. | |||
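The inspectability the authors describe, tracing findings back to original observations, amounts to keeping an explicit link from each reported problem to the raw evidence behind it. A minimal data-structure sketch follows; it assumes nothing about SlimDEVAN's actual coding scheme, and the field names and example records are invented.

```python
from dataclasses import dataclass, field

@dataclass
class Observation:
    """A single raw event noted during a test session."""
    session: str
    timestamp: str       # position in the session recording
    description: str

@dataclass
class UsabilityProblem:
    """A reported problem that stays traceable to its evidence."""
    problem_id: str
    summary: str
    observations: list[Observation] = field(default_factory=list)

# Findings from different labs can then be compared by tracing each
# reported problem back to the observations that motivated it:
obs = Observation("lab_A_session_3", "00:12:41",
                  "Participant repeatedly pressed the wrong mode button")
problem = UsabilityProblem("A-07", "Mode buttons are easily confused", [obs])
print(problem.problem_id, "->", problem.observations[0].session)
```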
| Analogical Problem Solving in Casual and Experienced Users: When Interface Consistency Leads to Inappropriate Transfer | | BIBA | Full-Text | 381-405 | |
| Kraig Finstad | |||
| The problem-solving abilities of casual users were compared to those of experienced users in a computer setting. It was hypothesized that casual users would benefit from reduced consistency with other applications. Experience was gauged with a questionnaire and empirical measures. Four interfaces were developed with varying degrees of similarity to Web browsers. Using a Web browser as a source problem, participants were tested with two of the experimental interfaces. The data indicated that the accuracy of casual users was equivalent across consistent and inconsistent interfaces but that the consistent interfaces produced significantly higher latencies. The primary conclusions of the study are that the performance of casual users is improved by superficially inconsistent interfaces and that their performance is equivalent to that of experienced users when a true analogue is present. Commonalities with familiar elements may be a hindrance. | |||
| Reviewing Meetings in TeamSpace | | BIBA | Full-Text | 406-432 | |
| Heather Richter Lipford; Gregory D. Abowd | |||
| A number of prototype meeting capture applications have been created in the past decade, yet relatively little research has focused on the review and long-term use of real captured meeting information. To that end, we have implemented a system called TeamSpace for capturing and reviewing general meetings. In this article, we describe the long-term deployment of TeamSpace to a university research group, along with a pseudo-controlled study involving the same group of users and their meetings. We gained a detailed understanding of the behavioral patterns involved in reviewing meeting content and how to improve on the experience. Our evaluations also demonstrate several of the barriers and challenges in realizing the potential benefits of meeting capture. | |||