HCI Bibliography : Search Results
Database updated: 2016-05-10 Searches since 2006-12-01: 32,646,454
director@hcibib.org
Hosted by ACM SIGCHI
The HCI Bibliography was moved to a new server on 2015-05-12 and again on 2016-01-05, substantially degrading the environment for making updates.
There are no plans to add to the database.
Please send questions or comments to director@hcibib.org.
Query: Roy_D* Results: 23 Sorted by: Date
[1] Word Embedding based Generalized Language Model for Information Retrieval Short Papers / Ganguly, Debasis / Roy, Dwaipayan / Mitra, Mandar / Jones, Gareth J. F. Proceedings of the 2015 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2015-08-09 p.795-798
ACM Digital Library Link
Summary: Word2vec, a state-of-the-art word embedding technique, has gained a lot of interest in the NLP community. The embedding of the word vectors helps to retrieve a list of words that are used in similar contexts with respect to a given word. In this paper, we focus on using word embeddings to enhance retrieval effectiveness. In particular, we construct a generalized language model, where the mutual independence between a pair of words (say t and t') no longer holds. Instead, we make use of the vector embeddings of the words to derive the transformation probabilities between words. Specifically, the event of observing a term t in the query from a document d is modeled by two distinct events: generating a different term t', either from the document itself or from the collection, and then transforming it to the observed query term t. The first event, generating an intermediate term from the document, is intended to capture how well a term contextually fits within a document, whereas the second, generating it from the collection, aims to address the vocabulary mismatch problem by taking into account other related terms in the collection. Our experiments, conducted on the standard TREC collection, show that our proposed method yields significant improvements over LM and LDA-smoothed LM baselines.
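The transformation idea in this abstract can be sketched roughly as follows. The toy vectors, three-word vocabulary, and mixture weights below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np
from collections import Counter

# Toy word vectors (assumed for illustration; the paper uses word2vec embeddings)
EMB = {
    "car":   np.array([0.9, 0.1]),
    "auto":  np.array([0.85, 0.2]),
    "fruit": np.array([0.1, 0.9]),
}

def p_transform(t, t_prime):
    """P(t | t'): cosine similarity of embeddings, normalised over the vocabulary."""
    cos = lambda a, b: float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    sims = {u: max(cos(EMB[u], EMB[t_prime]), 0.0) for u in EMB}
    return sims[t] / sum(sims.values())

def glm_score(t, doc, collection, lam=(0.4, 0.3, 0.2, 0.1)):
    """Mixture of direct generation, document- and collection-mediated
    transformation, and collection smoothing (weights lam are assumptions)."""
    d, c = Counter(doc), Counter(collection)
    p_d = lambda w: d[w] / len(doc)
    p_c = lambda w: c[w] / len(collection)
    direct    = p_d(t)
    from_doc  = sum(p_transform(t, u) * p_d(u) for u in EMB)
    from_coll = sum(p_transform(t, u) * p_c(u) for u in EMB)
    return lam[0]*direct + lam[1]*from_doc + lam[2]*from_coll + lam[3]*p_c(t)

doc  = ["auto", "auto", "fruit"]
coll = ["car", "auto", "fruit", "fruit"]
score = glm_score("car", doc, coll)  # "car" gets credit via its neighbour "auto"
```

Even though "car" never occurs in the document, it receives probability mass through the embedding-based transformation from "auto", which is the vocabulary-mismatch behaviour the abstract describes.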

[2] Multidisciplinary Team Dynamics in Service Design -- The Facilitating Role of Pattern Language Full Papers / Athavankar, Uday / Khambete, Pramod / Roy, Debjani / Chaudhary, Sujata / Kimbahune, Sanjay / Doke, Pankaj / Devkar, Sujit Proceedings of the IndiaHCI 2014 International Conference on Human Computer Interaction 2014-12-07 p.16-25
ACM Digital Library Link
Summary: Service design is an evolving discipline. Service value is co-created by service providers and their customers. The complex nature of services requires collaboration in a multidisciplinary team at the design stage itself to create service systems that lead to a delightful customer experience. While working in a multidisciplinary team for service design, there is a need to effectively capture the knowledge of participants from different disciplines and integrate it in the design process. Team dynamics play an important role in this context, as they are an unconscious, psychological force that influences the direction of a team's behavior and performance. Therefore, there needs to be a language that serves as a lingua franca to improve communication and a medium to ensure effective collaboration within a team. In this paper we share our study of the team dynamics in a multidisciplinary team while designing for services, and highlight the role of pattern language as an effective mediating entity.

[3] Exploring Cards for Patterns to Support Pattern Language Comprehension and Application in Service Design Posters / Athavankar, Uday / Khambete, Pramod / Doke, Pankaj / Kimbahune, Sanjay / Devkar, Sujit / Roy, Debjani / Chaudhary, Sujata Proceedings of the IndiaHCI 2014 International Conference on Human Computer Interaction 2014-12-07 p.112-115
ACM Digital Library Link
Summary: Service Design is a complex activity that requires collaboration among multiple stakeholders. Research indicates that pattern language can help a multidisciplinary team overcome the complexity of service design. This success hinges critically on the comprehension and use of pattern language by the multidisciplinary team. The literature shows that the research focus has been on the use of pattern language, not on the means required for comprehending it.
    To address this need, we explored the use of pattern cards as a tool to support the comprehension of pattern language. In this paper, we share experiences of using pattern cards in studies conducted to understand the complex field of rural healthcare services. Participants from different domains used the cards for easy reference while designing service interventions. Analysis showed that the pattern cards helped the team comprehend patterns easily and supported the externalization of thoughts when used with an experience journey map. They enabled discussions within the team while keeping pace with the design process.

[4] Supporting treatment of people living with HIV / AIDS in resource limited settings with IVRs Personal health and wellbeing / Joshi, Anirudha / Rane, Mandar / Roy, Debjani / Emmadi, Nagraj / Srinivasan, Padma / Kumarasamy, N. / Pujari, Sanjay / Solomon, Davidson / Rodrigues, Rashmi / Saple, D. G. / Sen, Kamalika / Veldeman, Els / Rutten, Romain Proceedings of ACM CHI 2014 Conference on Human Factors in Computing Systems 2014-04-26 v.1 p.1595-1604
ACM Digital Library Link
Summary: We developed an interactive voice response (IVR) system called TAMA (Treatment Advice by Mobile Alerts) that provides treatment support to people living with HIV / AIDS (PLHA) in developing countries, who are on antiretroviral therapy (ART). We deployed TAMA with 54 PLHA in 5 HIV clinics in India for a period of 12 weeks. During the study, we gathered feedback about TAMA's design and usage. Additionally, we conducted detailed qualitative interviews and analysed usage logs. We found that TAMA was usable and viable in the real life settings of PLHA and it had many desirable effects on their treatment adherence. We developed insights that inform the design of TAMA and some of these can be generalised to design of other long-term, frequent-use IVR applications for users in developing countries in the healthcare domain and beyond.

[5] A portable audio/video recorder for longitudinal study of child development Poster session / Vosoughi, Soroush / Goodwin, Matthew S. / Washabaugh, Bill / Roy, Deb Proceedings of the 2012 International Conference on Multimodal Interfaces 2012-10-22 p.193-200
ACM Digital Library Link
Summary: Collection and analysis of ultra-dense, longitudinal observational data of child behavior in natural, ecologically valid, non-laboratory settings holds significant promise for advancing the understanding of child development and developmental disorders such as autism. To this end, we created the Speechome Recorder -- a portable version of the embedded audio/video recording technology originally developed for the Human Speechome Project -- to facilitate swift, cost-effective deployment in home environments. Recording child behavior daily in these settings will enable detailed study of developmental trajectories in children from infancy through early childhood, as well as typical and atypical dynamics of communication and social interaction as they evolve over time. Its portability makes possible potentially large-scale comparative study of developmental milestones in both neurotypical and developmentally delayed children. In brief, the Speechome Recorder was designed to reduce cost, complexity, invasiveness and privacy issues associated with naturalistic, longitudinal recordings of child development.

[6] Design Opportunities for Supporting Treatment of People Living with HIV / AIDS in India Interaction Design for Developing Regions / Joshi, Anirudha / Rane, Mandar / Roy, Debjani / Sali, Shweta / Bharshankar, Neha / Kumarasamy, N. / Pujari, Sanjay / Solomon, Davidson / Sharma, H. Diamond / Saple, D. G. / Rutten, Romain / Ganju, Aakash / Van Dam, Joris Proceedings of IFIP INTERACT'11: Human-Computer Interaction 2011-09-05 v.2 p.315-332
Keywords: HIV/AIDS; healthcare; adherence; user study; design for development
Link to Digital Content at Springer
Summary: We describe a qualitative user study that we conducted with 64 people living with HIV/AIDS (PLHA) in India, recruited from private sector clinics. Our aim was to investigate information gaps, problems, and opportunities for design of relevant technology solutions to support HIV treatment. Our methodology included clinic visits, observations, discussions with doctors and counsellors, contextual interviews with PLHA, diary studies, technology tryouts, and home visits. Analysis identified user statements, observations, breakdowns, insights, and design ideas. We consolidated our findings across users with an affinity diagram. We found that despite several efforts, PLHA have limited access to authentic information. Some know facts and procedures, but lack conceptual understanding of HIV. Challenges include low education, no access to technology, lack of socialisation, less time with doctors and counsellors, high power-distance between PLHA and doctors and counsellors, and information overload. Information solutions based on mobile phones can lead to better communication and improve treatment adherence and effectiveness if they are based on the following: repetition, visualisation, organisation, localisation, and personalisation of information, improved socialisation, and complementing current efforts in clinics.

[7] Grounding spatial language for video search Speech and language / Tellex, Stefanie / Kollar, Thomas / Shaw, George / Roy, Nicholas / Roy, Deb Proceedings of the 2010 International Conference on Multimodal Interfaces 2010-11-08 p.31
ACM Digital Library Link
Summary: The ability to find a video clip that matches a natural language description of an event would enable intuitive search of large databases of surveillance video. We present a mechanism for connecting a spatial language query to a video clip corresponding to the query. The system can retrieve video clips matching millions of potential queries that describe complex events in video such as "people walking from the hallway door, around the island, to the kitchen sink." By breaking down the query into a sequence of independent structured clauses and modeling the meaning of each component of the structure separately, we are able to improve on previous approaches to video retrieval by finding clips that match much longer and more complex queries using a rich set of spatial relations such as "down" and "past." We present a rigorous analysis of the system's performance, based on a large corpus of task-constrained language collected from fourteen subjects. Using this corpus, we show that the system effectively retrieves clips that match natural language descriptions: 58.3% were ranked in the top two of ten in a retrieval task. Furthermore, we show that spatial relations play an important role in the system's performance.

[8] Toward understanding natural language directions Paper session 5: natural language interaction / Kollar, Thomas / Tellex, Stefanie / Roy, Deb / Roy, Nicholas Proceedings of the 5th ACM/IEEE International Conference on Human Robot Interaction 2010-03-02 p.259-266
Keywords: direction understanding, route instructions, spatial language
ACM Digital Library Link
Summary: Speaking using unconstrained natural language is an intuitive and flexible way for humans to interact with robots. Understanding this kind of linguistic input is challenging because diverse words and phrases must be mapped into structures that the robot can understand, and elements in those structures must be grounded in an uncertain environment. We present a system that follows natural language directions by extracting a sequence of spatial description clauses from the linguistic input and then infers the most probable path through the environment given only information about the environmental geometry and detected visible objects. We use a probabilistic graphical model that factors into three key components. The first component grounds landmark phrases such as "the computers" in the perceptual frame of the robot by exploiting co-occurrence statistics from a database of tagged images such as Flickr. Second, a spatial reasoning component judges how well spatial relations such as "past the computers" describe a path. Finally, verb phrases such as "turn right" are modeled according to the amount of change in orientation in the path. Our system follows 60% of the directions in our corpus to within 15 meters of the true destination, significantly outperforming other approaches.
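The three-factor decomposition this abstract describes can be sketched as a product of per-component scores over candidate paths. The factor values, path names, and scoring function below are illustrative assumptions, not the paper's model:

```python
import math

def path_score(landmark_s, spatial_s, verb_s):
    """Log-domain product of the three factors: landmark grounding,
    spatial-relation fit, and verb (orientation-change) fit."""
    return sum(math.log(s) for s in (landmark_s, spatial_s, verb_s))

# Hypothetical per-factor scores in (0, 1] for two candidate paths
paths = {
    "past the computers, then right": (0.8, 0.6, 0.9),
    "into the supply closet":         (0.2, 0.3, 0.5),
}

# Direction following = picking the most probable path under the factored model
best = max(paths, key=lambda p: path_score(*paths[p]))
```

Working in the log domain keeps the product numerically stable when many spatial description clauses are chained along a long route.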

[9] Grounding spatial prepositions for video search Doctoral spotlight oral session / Tellex, Stefanie / Roy, Deb Proceedings of the 2009 International Conference on Multimodal Interfaces 2009-11-02 p.253-260
Keywords: spatial language, video retrieval
ACM Digital Library Link
Summary: Spatial language video retrieval is an important real-world problem that forms a test bed for evaluating semantic structures for natural language descriptions of motion on naturalistic data. Video search by natural language query requires that linguistic input be converted into structures that operate on video in order to find clips that match a query. This paper describes a framework for grounding the meaning of spatial prepositions in video. We present a library of features that can be used to automatically classify a video clip based on whether it matches a natural language query. To evaluate these features, we collected a corpus of natural language descriptions about the motion of people in video clips. We characterize the language used in the corpus, and use it to train and test models for the meanings of the spatial prepositions "to," "across," "through," "out," "along," "towards," and "around." The classifiers can be used to build a spatial language video retrieval system that finds clips matching queries such as "across the kitchen."

[10] Object schemas for responsive robotic language use Technical papers / Hsiao, Kai-yuh / Vosoughi, Soroush / Tellex, Stefanie / Kubat, Rony / Roy, Deb Proceedings of the 3rd ACM/IEEE International Conference on Human Robot Interaction 2008-03-12 p.233-240
Keywords: affordances, behavior-based, language grounding, object schema, robot
ACM Digital Library Link
Summary: The use of natural language should be added to a robot system without sacrificing responsiveness to the environment. In this paper, we present a robot that manipulates objects on a tabletop in response to verbal interaction. Reactivity is maintained by using concurrent interaction processes, such as visual trackers and collision detection processes. The interaction processes and their associated data are organized into object schemas, each representing a physical object in the environment, based on the target of each process. The object schemas then serve as discrete structures of coordination between reactivity, planning, and language use, permitting rapid integration of information from multiple sources.

[11] Spatial routines for a simulated speech-controlled vehicle Assistive robotics / Tellex, Stefanie / Roy, Deb Proceedings of the 1st ACM SIGCHI/SIGART Conference on Human-Robot Interaction 2006-03-02 p.156-163
Keywords: language grounding, situated language processing, spatial language, spatial routines, visual routines, wheelchair
ACM Digital Library Link
Summary: We have defined a lexicon of words in terms of spatial routines, and used that lexicon to build a speech controlled vehicle in a simulator. A spatial routine is a script composed from a set of primitive operations on occupancy grids, analogous to Ullman's visual routines. The vehicle understands the meaning of context-dependent natural language commands such as "Go across the room." When the system receives a command, it combines definitions from the lexicon according to the parse structure of the command, creating a script that selects a goal for the vehicle. Spatial routines may provide the basis for interpreting spatial language in a broad range of physically situated language understanding systems.
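A spatial routine of the kind described, primitive operations composed over an occupancy grid, might be sketched like this. The grid layout, primitive names, and goal-selection rule are assumptions for illustration, not the paper's lexicon:

```python
import numpy as np

# 0 = free, 1 = occupied; a toy 5x5 room with a wall and a doorway (assumed layout)
grid = np.zeros((5, 5), dtype=int)
grid[:, 2] = 1          # a wall down the middle
grid[2, 2] = 0          # with a doorway

def free_cells(g):
    """Primitive: boolean mask of traversable cells."""
    return g == 0

def farthest_free(g, start):
    """Primitive: the free cell farthest (Manhattan distance) from `start`."""
    ys, xs = np.where(free_cells(g))
    dists = np.abs(ys - start[0]) + np.abs(xs - start[1])
    i = int(np.argmax(dists))
    return (int(ys[i]), int(xs[i]))

# "Go across the room": a script composing the primitives to select a goal
# on the far side of the free space.
start = (0, 0)
goal = farthest_free(grid, start)
```

The point of the routine formulation is that a parser can assemble such scripts from lexicon entries according to the command's parse structure, so the same primitives serve many commands.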

[12] Probabilistic grounding of situated speech using plan recognition and reference resolution Semantics and dialog / Gorniak, Peter / Roy, Deb Proceedings of the 2005 International Conference on Multimodal Interfaces 2005-10-04 p.138-143
Keywords: grounding, language, plan recognition, situated, speech, understanding
ACM Digital Library Link
Summary: Situated, spontaneous speech may be ambiguous along acoustic, lexical, grammatical and semantic dimensions. To understand such a seemingly difficult signal, we propose to model the ambiguity inherent in acoustic signals and in lexical and grammatical choices using compact, probabilistic representations of multiple hypotheses. To resolve semantic ambiguities we propose a situation model that captures aspects of the physical context of an utterance as well as the speaker's intentions, in our case represented by recognized plans. In a single, coherent Framework for Understanding Situated Speech (FUSS) we show how these two influences, acting on an ambiguous representation of the speech signal, complement each other to disambiguate form and content of situated speech. This method produces promising results in a game playing environment and leaves room for other types of situation models.

[13] Elvis: situated speech and gesture understanding for a robotic chandelier Multimodal applications / Juster, Joshua / Roy, Deb Proceedings of the 2004 International Conference on Multimodal Interfaces 2004-10-13 p.90-96
Keywords: gesture, grounded, input methods, lighting, multimodal, natural interaction, situated, speech
ACM Digital Library Link
Summary: We describe a home lighting robot that uses directional spotlights to create complex lighting scenes. The robot senses its visual environment using a panoramic camera and attempts to maintain its target goal state by adjusting the positions and intensities of its lights. Users can communicate desired changes in the lighting environment through speech and gesture (e.g., "Make it brighter over there"). Information obtained from these two modalities are combined to form a goal, a desired change in the lighting of the scene. This goal is then incorporated into the system's target goal state. When the target goal state and the world are out of alignment, the system formulates a sensorimotor plan that acts on the world to return the system to homeostasis.

[14] A self-paced approach to hypermedia design for patient education Hypermedia documentation / Roy, Debopriyo ACM 22nd International Conference on Computer Documentation 2004-10-10 p.27-32
ACM Digital Library Link
Summary: Traditional theories on multimedia design have considered the importance of the modality effect to a large extent. The stress on the modality effect has often de-emphasized what information architecture can do to control it when information presentation is self-paced instead of system-paced. We have considered a patient education module as our case study. We propose a conversational, interactive patient education module that responds to individual reader needs during hypermedia interaction. In this article, we take an initial step towards this approach, testing patient education modules with and without narration to support text and static graphics. Our results suggest that reader comprehension and accuracy are similar for modules with and without narration. Readers showed a preference for using narration, online text, and graphics based on the individual task, if the system permits self-paced interaction. Thus, we argue that the modality effect may be influenced by a self-paced system.

[15] Augmenting user interfaces with adaptive speech commands Speech and gaze / Gorniak, Peter / Roy, Deb Proceedings of the 2003 International Conference on Multimodal Interfaces 2003-11-05 p.176-179
Keywords: machine learning, phoneme recognition, robust speech interfaces, user modelling
ACM Digital Library Link
Summary: We present a system that augments any unmodified Java application with an adaptive speech interface. The augmented system learns to associate spoken words and utterances with interface actions such as button clicks. Speech learning is constantly active and searches for correlations between what the user says and does. Training the interface is seamlessly integrated with using the interface. As the user performs normal actions, she may optionally verbally describe what she is doing. By using a phoneme recognizer, the interface is able to quickly learn new speech commands. Speech commands are chosen by the user and can be recognized robustly due to accurate phonetic modelling of the user's utterances and the small size of the vocabulary learned for a single application. After only a few examples, speech commands can replace mouse clicks. In effect, selected interface functions migrate from keyboard and mouse to speech. We demonstrate the usefulness of this approach by augmenting jfig, a drawing application, where speech commands save the user from the distraction of having to use a tool palette.
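The correlation-learning loop this abstract describes could be sketched as a simple counter-based associator. The class, method names, and phoneme strings below are hypothetical; the real system works over a phoneme recognizer and Java interface events:

```python
from collections import defaultdict, Counter

class SpeechActionLearner:
    """Count co-occurrences of recognised phoneme strings and interface
    actions, then predict the most frequently associated action."""
    def __init__(self):
        self.counts = defaultdict(Counter)

    def observe(self, phonemes, action):
        self.counts[phonemes][action] += 1

    def predict(self, phonemes):
        c = self.counts.get(phonemes)
        return c.most_common(1)[0][0] if c else None

learner = SpeechActionLearner()
for _ in range(3):
    learner.observe("d r aa", "draw-tool")   # user says "draw" while clicking the tool
learner.observe("d r aa", "erase-tool")      # one noisy co-occurrence
action = learner.predict("d r aa")           # majority vote picks "draw-tool"
```

Because training events accumulate during normal use, a few consistent examples are enough for the spoken command to start replacing the mouse click, which matches the behaviour the abstract reports.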

[16] A visually grounded natural language interface for reference to spatial scenes Posters / Gorniak, Peter / Roy, Deb Proceedings of the 2003 International Conference on Multimodal Interfaces 2003-11-05 p.219-226
Keywords: cognitive modelling, computational semantics, natural language understanding, vision based semantics
ACM Digital Library Link
Summary: Many user interfaces, from graphic design programs to navigation aids in cars, share a virtual space with the user. Such applications are often ideal candidates for speech interfaces that allow the user to refer to objects in the shared space. We present an analysis of how people describe objects in spatial scenes using natural language. Based on this study, we describe a system that uses synthetic vision to "see" such scenes from the person's point of view, and that understands complex natural language descriptions referring to objects in the scenes. This system is based on a rich notion of semantic compositionality embedded in a grounded language understanding framework. We describe its semantic elements, their compositional behaviour, and their grounding through the synthetic vision system. To conclude, we evaluate the performance of the system on unconstrained input.

[17] Towards Visually-Grounded Spoken Language Acquisition / Roy, Deb Proceedings of the 2002 International Conference on Multimodal Interfaces 2002-10-14 p.105
ACM Digital Library Link
Summary: A characteristic shared by most approaches to natural language understanding and generation is the use of symbolic representations of word and sentence meanings. Frames and semantic nets are examples of symbolic representations. Symbolic methods are inappropriate for applications which require natural language semantics to be linked to perception, as is the case in tasks such as scene description or human-robot interaction. This paper presents two implemented systems, one that learns to generate, and one that learns to understand visually-grounded spoken language. These implementations are part of our ongoing effort to develop a comprehensive model of perceptually-grounded semantics.

[18] Medical Device Requirements: A View from Canada 4: MULTIPLE-SESSION SYMPOSIA: Global Challenges in Science, Technology, Design, and Regulation [Single-Session Symposium] / Roy, Denis Proceedings of the Joint IEA 14th Triennial Congress and Human Factors and Ergonomics Society 44th Annual Meeting 2000-07-30 v.44 n.4 p.530-532
Link to HFES Digital Content
Summary: This paper provides an overview of the Canadian Medical Device Regulations and focuses on those requirements that may impact on human factors and ergonomic issues. Additionally, the paper provides examples of specific incidents with medical devices that occurred in Canada where ergonomic and human factor issues were directly responsible for the problem.

[19] Perceptual Intelligence: learning gestures and words for individualized, adaptive interfaces / Pentland, A. / Roy, D. / Wren, C. Proceedings of the Eighth International Conference on Human-Computer Interaction 1999-08-22 v.1 p.286-290

[20] Stimulating Research into Gestural Human Machine Interaction Round Table / Panayi, Marilyn / Roy, David M. / Richardson, James GW 1999: Gesture Workshop 1999-03-17 p.317-331
Link to Digital Content at Springer
Summary: This is the summary report of the roundtable session held at the end of Gesture Workshop '99. This first roundtable aimed to act as a forum for discussion of issues and concerns relating to the achievements, future development, and potential of the field of gestural and sign-language based human computer interaction.

[21] A Phoneme Probability Display for Individuals with Hearing Disabilities / Roy, Deb / Pentland, Alex Third Annual ACM SIGACCESS Conference on Assistive Technologies 1998-04-15 p.165-168
dkroy.www.media.mit.edu/people/dkroy/Assets98_HTML/speechdisplay.html
Broken Link to ACM Digital Library
www.acm.org/pubs/articles/proceedings/assets/274497/p165-roy/p165-roy.txt
Summary: We are building an aid for individuals with hearing impairments which converts continuous speech into an animated visual display. A speech analysis system continuously estimates phoneme probabilities from the input acoustic stream. Phoneme symbols are displayed graphically with brightness proportional to the estimated phoneme probabilities. An automated layout algorithm groups acoustically confusable phonemes together in the display.
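The probability-to-brightness mapping the abstract describes can be sketched in a few lines; the phoneme set and probability values are toy assumptions:

```python
def brightness(probs, max_level=255):
    """Map per-phoneme probabilities (0..1) to display brightness (0..255),
    proportional to probability as the abstract describes."""
    return {ph: round(p * max_level) for ph, p in probs.items()}

# Toy probability estimates for one analysis frame (assumed values)
frame = {"p": 0.1, "b": 0.7, "m": 0.2}
levels = brightness(frame)
```

Each new frame of phoneme probabilities re-renders the symbol grid, so the most likely phoneme glows brightest while confusable alternatives remain faintly visible.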

[22] NewsComm: A Hand-Held Interface for Interactive Access to Structured Audio PAPERS: News and Mail / Roy, Deb K. / Schmandt, Chris Proceedings of ACM CHI 96 Conference on Human Factors in Computing Systems 1996-04-14 v.1 p.173-180
Keywords: Audio interfaces, Hand-held computers, Structured audio
old.sigchi.org/chi96/proceedings/papers/Roy/paper.html
Summary: The NewsComm system delivers personalized news and other program material as audio to mobile users through a hand-held playback device. This paper focuses on the iterative design and user testing of the hand-held interface. The interface was first designed and tested in a software-only environment and then ported to a custom hardware platform. The hand-held device enables navigation through audio recordings based on structural information which is extracted from the audio using digital signal processing techniques. The interface design addresses the problems of designing a hand-held and primarily non-visual interface for accessing large amounts of structured audio recordings.

[23] Gestural Human-Machine Interaction for People with Severe Speech and Motor Impairment Due to Cerebral Palsy SHORT PAPERS: Enhancing Interaction / Roy, David M. / Panayi, Marilyn / Erenshteyn, Roman / Foulds, Richard / Fawcus, Robert Proceedings of ACM CHI'94 Conference on Human Factors in Computing Systems 1994-04-24 v.2 p.313-314
Keywords: Gesture recognition, Disability, Cerebral palsy, Performance art, Electromyogram, EMG, Artificial neural networks
Broken Link to ACM Digital Library
Summary: The objective of the research is to develop a new method of human-machine interaction that reflects and harnesses the abilities of people with severe speech and motor impairment due to cerebral palsy (SSMICP). Human-human interaction within the framework of drama and mime was used to elicit 120 gestures from twelve students with SSMICP. Twenty-seven dynamic arm gestures were monitored using biomechanical and bioelectric sensors. Neural networks are being used to analyze the data and to realize the gestural human-machine interface. Preliminary results show that two visually similar gestures can be differentiated by neural networks.