Proceedings of the 2013 ACM Conference on Hypertext and Social Media

Fullname:Proceedings of the 24th ACM Conference on Hypertext and Social Media
Editors:Gerd Stumme; Andreas Hotho
Location:Paris, France
Dates:2013-May-01 to 2013-May-03
Publisher:ACM
Standard No:ISBN: 978-1-4503-1967-6; hcibib: HYPER13
Papers:35
Pages:262
A question of complexity: measuring the maturity of online enquiry communities 1-10
  Grégoire Burel; Yulan He
Online enquiry communities such as Question Answering (Q&A) websites allow people to seek answers to all kinds of questions. With the growing popularity of such platforms, it is important for community managers to constantly monitor the performance of their communities. Although different metrics have been proposed for tracking the evolution of such communities, maturity, the process in which communities become more topic proficient over time, has been largely ignored despite its potential to help in identifying robust communities. In this paper, we interpret community maturity as the proportion of complex questions in a community at a given time. We use the Server Fault (SF) community, a Question Answering (Q&A) community of system administrators, as our case study and perform analysis on question complexity, the level of expertise required to answer a question. We show that question complexity depends on both the length of involvement and the level of contributions of the users who post questions within their community. We extract features relating to askers, answerers, questions and answers, and analyse which features are strongly correlated with question complexity. Although our findings highlight the difficulty of automatically identifying question complexity, we found that complexity is more influenced by both the topical focus and the length of community involvement of askers. Following the identification of question complexity, we define a measure of maturity and analyse the evolution of different topical communities. Our results show that different topical communities show different maturity patterns. Some communities show a high maturity at the beginning while others exhibit a slow maturity rate.
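The abstract's definition of maturity, the proportion of complex questions in a community at a given time, can be illustrated with a minimal sketch. It assumes each question already carries a complexity label; the function name `maturity_by_period` and the toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def maturity_by_period(questions):
    """Return {period: share of complex questions}.

    `questions` is an iterable of (period, is_complex) pairs; how a question
    is judged complex is assumed to be decided elsewhere.
    """
    totals = defaultdict(int)
    complex_counts = defaultdict(int)
    for period, is_complex in questions:
        totals[period] += 1
        if is_complex:
            complex_counts[period] += 1
    # Maturity of a period = complex questions / all questions in that period.
    return {p: complex_counts[p] / totals[p] for p in sorted(totals)}


# Hypothetical toy data: (month, whether the question was judged complex).
sample = [
    ("2010-01", False),
    ("2010-01", True),
    ("2010-02", True),
    ("2010-02", True),
]
maturity = maturity_by_period(sample)
```

Plotting such a series per topical community would be one way to compare the maturity patterns the abstract mentions.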
Where's @wally?: a classification approach to geolocating users based on their social ties 11-20
  Dominic Rout; Kalina Bontcheva; Daniel Preotiuc-Pietro; Trevor Cohn
This paper presents an approach to geolocating users of online social networks, based solely on their 'friendship' connections. We observe that users interact more regularly with those closer to themselves and hypothesise that, in many cases, a person's social network is sufficient to reveal their location.
   The geolocation problem is formulated as a classification task, where the most likely city for a user without an explicit location is chosen amongst the known locations of their social ties. Our method uses an SVM classifier and a number of features that reflect different aspects and characteristics of Twitter user networks.
   The SVM classifier is trained and evaluated on a dataset of Twitter users with known locations. Our method outperforms a state-of-the-art method for geolocating users based on their social ties.
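The formulation above, choosing a user's city from the known locations of their social ties, can be sketched with a deliberately simpler stand-in: a majority vote over friends' cities. This is not the SVM classifier the abstract describes; the function name `guess_city` and the example cities are illustrative assumptions.

```python
from collections import Counter


def guess_city(friend_cities):
    """Pick the most common known city among a user's social ties.

    `friend_cities` holds one entry per tie; None marks a tie whose
    location is unknown. Returns None when no tie has a known city.
    """
    known = [city for city in friend_cities if city is not None]
    if not known:
        return None
    # A plain majority vote, used here in place of the paper's SVM.
    return Counter(known).most_common(1)[0][0]


# Hypothetical ties of one user without an explicit location.
prediction = guess_city(["Sheffield", "London", "Sheffield", None])
```

A learned classifier would replace the vote with features of the network, but the candidate set (the ties' known cities) stays the same.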
Microblog-genre noise and impact on semantic annotation accuracy 21-30
  Leon Derczynski; Diana Maynard; Niraj Aswani; Kalina Bontcheva
Using semantic technologies for mining and intelligent information access to microblogs is a challenging, emerging research area. Unlike carefully authored news text and other longer content, tweets pose a number of new challenges, due to their short, noisy, context-dependent, and dynamic nature. Semantic annotation of tweets is typically performed in a pipeline, comprising successive stages of language identification, tokenisation, part-of-speech tagging, named entity recognition and entity disambiguation (e.g. with respect to DBpedia). Consequently, errors are cumulative, and earlier-stage problems can severely reduce the performance of final stages. This paper presents a characterisation of genre-specific problems at each semantic annotation stage and the impact on subsequent stages. Critically, we evaluate impact on two high-level semantic annotation tasks: named entity detection and disambiguation. Our results demonstrate the importance of making approaches specific to the genre, and indicate a diminishing returns effect that reduces the effectiveness of complex text normalisation.
Composite interests' exploration thanks to on-the-fly linked data spreading activation 31-40
  Nicolas Marie; Olivier Corby; Fabien Gandon; Myriam Ribière
Exploratory search systems are built specifically to help users with cognitively demanding search tasks such as learning or topic investigation. Some of these systems are built on top of linked data and use semantics to provide cognitively-optimized search experiences. Thanks to their richness and connected nature, linked data datasets can serve as a ground for advanced exploratory search. We propose to address the case of mixed interests' exploration in the form of composite queries (several unitary interests combined), e.g. exploring results and making discoveries related to both The Beatles and Ken Loach. The main contribution of this paper is a novel method that processes linked data for exploratory search purposes. It makes use of a semantic spreading activation algorithm coupled with a sampling technique. Its particularity is that it does not require any preprocessing of results. Consequently, this method offers a high level of flexibility for querying and allows, among other things, the expression of composite interests' queries on remote linked data sources. This paper also details the analysis of the algorithm's behavior over DBpedia and describes an implementation: the Discovery Hub application. It is an exploratory search engine that notably supports composite queries. Finally, the results of a user evaluation are presented.
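To illustrate the general idea of spreading activation for a composite query, here is a minimal sketch of the textbook form of the technique on a toy graph. It omits the semantic weighting and sampling that the abstract mentions; the function `spread_activation`, the `decay` parameter and the node names are all hypothetical.

```python
def spread_activation(graph, seeds, decay=0.5, steps=1):
    """Propagate activation from seed nodes along outgoing links.

    `graph` maps a node to a list of neighbours. At each step every active
    node passes `decay` times its value, split evenly among its neighbours.
    """
    activation = {node: 1.0 for node in seeds}
    frontier = dict(activation)
    for _ in range(steps):
        next_frontier = {}
        for node, value in frontier.items():
            neighbours = graph.get(node, [])
            if not neighbours:
                continue
            share = decay * value / len(neighbours)
            for neighbour in neighbours:
                next_frontier[neighbour] = next_frontier.get(neighbour, 0.0) + share
        # Accumulate what was received in this step.
        for node, value in next_frontier.items():
            activation[node] = activation.get(node, 0.0) + value
        frontier = next_frontier
    return activation


# Toy graph: two seed interests that share one neighbour.
toy_graph = {
    "interest_a": ["shared", "only_a"],
    "interest_b": ["shared", "only_b"],
}
scores = spread_activation(toy_graph, ["interest_a", "interest_b"])
```

In this toy case the node linked to both seeds ends up with a higher score than nodes linked to only one, which is the behaviour a composite query is meant to surface.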
Harnessing linked knowledge sources for topic classification in social media 41-50
  Amparo E. Cano; Andrea Varga; Matthew Rowe; Fabio Ciravegna; Yulan He
Topic classification (TC) of short text messages offers an effective and fast way to reveal events happening around the world ranging from those related to Disaster (e.g. Sandy hurricane) to those related to Violence (e.g. Egypt revolution). Previous approaches to TC have mostly focused on exploiting individual knowledge sources (KS) (e.g. DBpedia or Freebase) without considering the graph structures that surround concepts present in KSs when detecting the topics of Tweets. In this paper we introduce a novel approach for harnessing such graph structures from multiple linked KSs, by: (i) building a conceptual representation of the KSs, (ii) leveraging contextual information about concepts by exploiting semantic concept graphs, and (iii) providing a principled way for the combination of KSs. Experiments evaluating our TC classifier in the context of Violence detection (VD) and Emergency Responses (ER) show promising results that significantly outperform various baseline models including an approach using a single KS without linked data and an approach using only Tweets.
Structural and cognitive bottlenecks to information access in social networks 51-59
  Jeon-Hyung Kang; Kristina Lerman
Information in networks is non-uniformly distributed, enabling individuals in certain network positions to get preferential access to information. Social scientists have developed influential theories about the role of network structure in information access. These theories were validated through numerous studies, which examined how individuals leverage their social networks for competitive advantage, such as a new job or higher compensation. It is not clear how these theories generalize to online networks, which differ from real-world social networks in important respects, including asymmetry of social links. We address this problem by analyzing how users of the social news aggregator Digg adopt stories recommended by friends, i.e., users they follow. We measure the impact that different factors, such as network position and activity rate, have on access to novel information, which in Digg's case means the set of distinct news stories. We show that a user can improve his information access by linking to active users, though this becomes less effective as the number of friends, or their activity, grows due to structural network constraints. These constraints arise because users in structurally diverse positions within the follower graph have topically diverse interests from their friends. Moreover, though in most cases a user's friends are exposed to almost all the information available in the network, after they make their recommendations, the user sees only a small fraction of the available information. Our study suggests that cognitive and structural bottlenecks limit access to novel information in online social networks.
Challenging information foraging theory: screen reader users are not always driven by information scent 60-68
  Markel Vigo; Simon Harper
Little is known about the navigation tactics employed by screen reader users when they face problematic situations on the Web. Understanding how these tactics are operationalised and knowing the situations that bring about such tactics paves the way towards modeling navigation behaviour. Modeling the navigation of users is of utmost importance as it allows us not only to predict interactive behaviour, but also to assess the appropriateness of the content in a link, the information architecture of a site and the design of a web page. Current navigation models do not consider the extreme adaptations, namely coping tactics, that screen reader users undergo on the Web. Consequently, their prediction power is lessened and coping tactics are mistakenly considered outlying behaviours. We draw from existing navigation models for sighted users to suggest the incorporation of emerging behaviours in navigation models for screen reader users. To do so, we identify the navigation coping tactics screen reader users exhibit on the Web, including deliberately clicking on low-scented links, escaping from useless or inaccessible content and backtracking to a shelter. Our findings suggest that, especially in problematic situations, navigation is not driven by information scent or utility, but by the need to increase autonomy and the need to escape from the current web patch.
Activity fragmentation in the web: empowering users to support their own webflows 69-78
  Oscar Díaz; Josune De Sosa; Salvador Trujillo
The Web is becoming a main conduit for our daily activities. When an activity expands across different websites, the user is left alone in the effort to aggregate the resources and services required in carrying out these cross-site activities. This results in a loss of focus and constant switching among websites. The problem is that these webflows tend to be highly personal and hence difficult to foresee. Therefore, we advocate for users to be empowered to define these roadmaps upon the websphere. This work introduces CORSET, a Firefox plugin that lets users create their own webflows on the browser side. A corset is defined as a state-transition diagram, and results in "layer hyperlinks" being superimposed upon the participating websites. The expressiveness of CORSET is validated against four webflow patterns: the hub-and-spoke pattern, the guided-tour pattern, the parallel pattern and the interruption pattern. The benefits include (1) mitigation of activity fragmentation, (2) consolidation of webflow knowledge that is now amenable to sharing, (3) reduction in the number of clicks, and (4) alleviation of waiting times through page pre-load.
Storyscope: using theme and setting to guide story enrichment from external data sources 79-88
  Annika Wolff; Paul Mulholland; Trevor Collins
Museum narratives, like other forms of narrative, are developed from an underlying conceptualization of events that can be referred to as the story. Storyscope is a web-based environment for constructing and exploring museum narratives and their underlying concepts. Storyscope aligns with a formal model of story and narrative specialized for a museum context called the curate ontology. This paper will explore the plot-reasoning component of Storyscope that provides intelligent support for the selection of events within the story and their interconnection as a coherent structure to be told within the narrative. Plot reasoning uses both internal knowledge and external information sources, such as Freebase and Factforge, to propose events that can be used to incrementally develop storylines and to employ a museum narrative. The approach taken uses the notions of setting and theme to search and rank events in terms of their relevance to the developing storyline. This paces the expansion of the story in each step, ensures that the story develops in a direction that is of interest to the author and helps to maintain narrative cohesion, an important goal of story-building. Plot development is also supported by methods for clustering events into related plot elements and by using information from Freebase to propose different types of influence relations between story events.
Models of human navigation in information networks based on decentralized search 89-98
  Denis Helic; Markus Strohmaier; Michael Granitzer; Reinhold Scherer
Models of human navigation play an important role for understanding and facilitating user behavior in hypertext systems. In this paper, we conduct a series of principled experiments with decentralized search -- an established model of human navigation in social networks -- and study its applicability to information networks. We apply several variations of decentralized search to model human navigation in information networks and we evaluate the outcome in a series of experiments. In these experiments, we study the validity of decentralized search by comparing it with human navigational paths from an actual information network -- Wikipedia. We find that (i) navigation in social networks appears to differ from human navigation in information networks in interesting ways and (ii) in order to apply decentralized search to information networks, stochastic adaptations are required. Our work illuminates a way towards using decentralized search as a valid model for human navigation in information networks in future work. Our results are relevant for scientists who are interested in modeling human behavior in information networks and for engineers who are interested in using models and simulations of human behavior to improve on structural or user interface aspects of hypertextual systems.
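As a rough illustration of decentralized search with a stochastic adaptation, the sketch below walks a graph using only local knowledge: at each node it moves to the neighbour that looks closest to the target, or, with probability `epsilon`, to a random neighbour. The function name, the `distance` callback and the epsilon-greedy choice are assumptions made for illustration; the abstract does not specify which variations the authors evaluated.

```python
import random


def decentralized_search(graph, distance, start, target,
                         epsilon=0.0, max_steps=20, rng=None):
    """Navigate from `start` to `target` using only neighbour information.

    `graph` maps a node to its neighbours; `distance(node, target)` is a
    hypothetical estimate of how far a node is from the target.
    Returns the visited path, or None if the target was not reached.
    """
    rng = rng or random.Random(0)
    path = [start]
    current = start
    for _ in range(max_steps):
        if current == target:
            return path
        neighbours = graph.get(current, [])
        if not neighbours:
            return None
        if rng.random() < epsilon:
            # Stochastic step: explore a random neighbour.
            current = rng.choice(neighbours)
        else:
            # Greedy step: take the neighbour estimated closest to the target.
            current = min(neighbours, key=lambda n: distance(n, target))
        path.append(current)
    return path if current == target else None


# Toy network: a line of nodes 0..6 with one shortcut between 0 and 5.
toy_graph = {0: [1, 5], 1: [0, 2], 2: [1, 3], 3: [2, 4],
             4: [3, 5], 5: [4, 0, 6], 6: [5]}
path = decentralized_search(toy_graph, lambda a, b: abs(a - b), 0, 6)
```

With `epsilon=0.0` the walk is purely greedy and deterministic; raising `epsilon` adds the kind of randomness that a stochastic variant would rely on.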
How big is the crowd?: event and location based population modeling in social media 99-108
  Yuan Liang; James Caverlee; Zhiyuan Cheng; Krishna Y. Kamath
In this paper, we address the challenge of modeling the size, duration, and temporal dynamics of short-lived crowds that manifest in social media. Successful population modeling for crowds is critical for many services including location recommendation, traffic prediction, and advertising. However, crowd modeling is challenging since 1) user-contributed data in social media is noisy and oftentimes incomplete, in the sense that users only reveal when they join a crowd through posts but not when they depart; and 2) the size of short-lived crowds typically changes rapidly, growing and shrinking in sharp bursts. Toward robust population modeling, we first propose a duration model to predict the time users spend in a particular crowd. We propose a time-evolving population model for estimating the number of people departing a crowd, which enables the prediction of the total population remaining in a crowd. Based on these population models, we further describe an approach that allows us to predict the number of posts generated from a crowd. We validate the crowd models through extensive experiments over 22 million geo-location based check-ins and 120,000 event-related tweets.
Canyons, deltas and plains: towards a unified sculptural model of location-based hypertext 109-118
  David E. Millard; Charlie Hargood; Michael O. Jewell; Mark J. Weal
With the growing ubiquity of mobile devices, new ways of sensing context and the emergence of the mobile Web, digital storytelling is escaping the confines of the desktop and intertwinging in new and interesting ways with the physical world. Mobile, location aware, narrative systems are being applied to a range of areas including tour guides, educational tools and interactive fiction. Despite this there is little understanding of how these applications are related or how they link with existing hypertext models and theory.
   We argue that location aware narrative systems tend to follow three patterns (canyons, deltas and plains) and that it is possible to represent all of these patterns in a conceptual sculptural hypertext model. Our model builds on a general sculptural mechanism (of pre-conditions and behaviours) to include locality and narrative transitions as first class elements, opening the possibility of standardised viewers, formats, and hybrid stories. We show how existing structures can be mapped onto this conceptual sculptural model, and how narratives defined in the model can take advantage of open data sources and sensed contextual data. To demonstrate this we present the GeoYarn system, a prototype which implements the model to create interactive, location aware narratives, using all three patterns.
A sentiment-enhanced personalized location recommendation system 119-128
  Dingqi Yang; Daqing Zhang; Zhiyong Yu; Zhu Wang
Although online recommendation systems such as recommendation of movies or music have been systematically studied in the past decade, location recommendation in Location Based Social Networks (LBSNs) has not yet been well investigated. In LBSNs, users can check in and leave tips commenting on a venue. These two heterogeneous data sources both describe users' preference of venues. However, current research considers only users' check-in behavior in the user location preference model; users' tips on venues are seldom investigated. Moreover, while existing work mainly considers social influence in recommendation, we argue that considering venue similarity can further improve the recommendation performance. In this research, we improve location recommendation by enhancing not only the user location preference model but also the recommendation algorithm. First, we propose a hybrid user location preference model by combining the preference extracted from check-ins and text-based tips which are processed using sentiment analysis techniques. Second, we develop a location based social matrix factorization algorithm that takes both user social influence and venue similarity influence into account in location recommendation. Experimental results on two datasets extracted from the location based social network Foursquare demonstrate that the proposed hybrid preference model can better characterize user preference by maintaining the preference consistency, and that the proposed algorithm outperforms the state-of-the-art methods.
Generating contextualized sentiment lexica based on latent topics and user ratings 129-138
  Ralf Krestel; Stefan Siersdorfer
Sentiment lexica are useful for analyzing opinions in Web collections, for domain-dependent sentiment classification, and as sub-components of recommender systems. In this paper, we present a strategy for automatically generating topic-dependent lexica from large corpora of review articles by exploiting accompanying user ratings. Our approach combines text segmentation, discriminative feature analysis techniques, and latent topic extraction to infer the polarity of n-grams in a topical context. Our experiments on rating prediction demonstrate a substantial performance improvement in comparison with existing state-of-the-art sentiment lexica.
Whom should I follow?: identifying relevant users during crises 139-147
  Shamanth Kumar; Fred Morstatter; Reza Zafarani; Huan Liu
Social media is gaining popularity as a medium of communication before, during, and after crises. In several recent disasters, it has become evident that social media sites like Twitter and Facebook are an important source of information, and in some cases they have even assisted in relief efforts. We propose a novel approach to identify a subset of active users during a crisis who can be tracked for fast access to information. Using a Twitter dataset that consists of 12.9 million tweets from 5 countries that are part of the "Arab Spring" movement, we show how instant information access can be achieved by user identification along two dimensions: the user's location and the user's affinity towards topics of discussion. Through evaluations, we demonstrate that users selected by our approach generate more information and that the quality of the information is better than that of users identified using state-of-the-art techniques.
Graph based techniques for tag cloud generation 148-157
  Martin Leginus; Peter Dolog; Ricardo Lage
A tag cloud is a navigation aid for exploring documents. Tag clouds also link documents through user-defined terms. We explore various graph based techniques to improve tag cloud generation. Moreover, we introduce relevance measures based on underlying data such as ratings or citation counts for improved measurement of relevance of tag clouds. We show that, on the given data sets, our approach outperforms the state of the art baseline methods with respect to such relevance by 41% on the Movielens dataset and by 11% on the Bibsonomy data set.
Examining social media use among older adults 158-163
  Caroline Bell; Cara Fausset; Sarah Farmer; Julie Nguyen; Linda Harley; W. Bradley Fain
Social media is a powerful tool that can connect family and friends across long distances as well as link people with similar interests. Social media has been widely adopted by younger adults, but older adults have been less likely to use such applications. A survey of 142 older adults (M_age = 72 years, SD = 11; range: 52-92) living in the metropolitan Atlanta area was conducted to understand the characteristics of older adults who do and do not use Facebook, a popular and widespread social media application. The present study examined the relationship between Facebook use and loneliness, social satisfaction, and confidence with technology. Demographic relationships were also examined, such as gender and age. Fifty-nine participants (42%) identified themselves as current Facebook users; 83 participants (58%) were not Facebook users. Non-Facebook users were significantly older (M_age = 75.3 years) than Facebook users (M_age = 66.5 years). Counter to expectations, there was not a significant difference in loneliness between Facebook users and non-users for this sample. However, Facebook users did score higher on assessments of social satisfaction and confidence with technology than did non-users. These preliminary results suggest that many older adults do use Facebook and they primarily use it to stay connected with family. As adults enter into older adulthood, maintaining social connectedness may become more difficult due to mobility limitations, chronic diseases, and other age-related issues, thus decreasing physical connectedness with friends, family, and community. For these reasons, social media may begin to play a more active role in keeping this population socially connected. Therefore, understanding the factors that influence social media use in older adults is becoming more critical.
Tweeting across hashtags: overlapping users and the importance of language, topics, and politics 164-168
  Marco Toledo Bastos; Cornelius Puschmann; Rodrigo Travitzki
In this paper we investigate the activity of 1 million users tweeting under 455 different hashtags related to a wide range of topics (political activism, health, technology, sports, Twitter-idioms). We find that 70% of users in the sample tweet across multiple information streams, frequently engaging in what could be described as serial activism. We furthermore determined the dominant language in each hashtag to trace which users overlap between the thematic and linguistic communities delineated by different information streams. Although social media is frequently assumed to bring together people of different nationalities and cultures to discuss a wide range of controversial issues, our results indicate that the underlying social network that connects hashtags through overlapping users is heavily limited to linguistic and content-oriented communities. Information streams are clustered around linguistic communities, and hashtags within the same language group are clustered around well-defined topics, such as health, entertainment and politics. The only information streams that transcend language barriers are activism-related hashtags, which cluster information streams in different languages. Contrasting with the assumption that social media acts as the enabler of a globalized public debate, our results indicate a linear relationship between users who are very active in political hashtags and users who tweet across multiple political hashtags. The results suggest that activist campaigns based on social media are driven by a relatively small number of highly-active, politically engaged users.
Reading tweeting minds: real-time analysis of short text for computational social science 169-173
  Zhe Wang; Daniele Quercia; Diarmuid Ó Séaghdha
Twitter status updates (tweets) have great potential for unobtrusive analysis of users' perceptions in real time, providing a way of investigating social patterns at scale. Here we present a tool that performs textual analysis of tweets mentioning a topic of interest and outputs words statistically associated with it in the form of word lists and word graphs. Such a tool could be of value for helping social scientists to navigate the overwhelming amounts of data that are produced on Twitter. To evaluate our tool, we select three concepts of interest to social scientists (i.e., privacy, serendipity, and Occupy Wall Street), build ground truths for each concept using the Grounded Theory approach, and perform a quantitative assessment based on two widely-used information retrieval metrics. To then offer qualitative assessments complementary to the quantitative ones, we run a user study involving 32 individuals. We find that simple information-theoretic association measures are more accurate than frequency-based measures. We also spell out under which conditions these metrics tend to work best.
Mainstream media behavior analysis on Twitter: a case study on UK general election 174-178
  Zhongyu Wei; Yulan He; Wei Gao; Binyang Li; Lanjun Zhou; Kam-fai Wong
With the development of social media tools such as Facebook and Twitter, mainstream media organizations including newspapers and TV media have played an active role in engaging with their audience and strengthening their influence on the recently emerged platforms. In this paper, we analyze the behavior of mainstream media on Twitter and study how they exert their influence to shape public opinion during the UK's 2010 General Election. We first propose an empirical measure to quantify mainstream media bias based on sentiment analysis and show that it correlates better with the actual political bias in the UK media than the pure quantitative measures based on media coverage of various political parties. We then compare the information diffusion patterns from different categories of sources. We found that while mainstream media is good at seeding prominent information cascades, its role in shaping public opinion is being challenged by journalists, since tweets from them are more likely to be retweeted, spread faster and have a longer lifespan compared to tweets from mainstream media. Moreover, the political bias of the journalists is a good indicator of the actual election results.
On commenting behavior of Facebook users 179-183
  Mehwish Nasim; Muhammad U. Ilyas; Aimal Rextin; Nazish Nasim
Facebook treats friends as a single homogeneous group even though people on Facebook are possibly acquainted with a diverse group of individuals and perceive their friends as representatives of different groups. It is a common observation that people tend to select friends with similar characteristics or that individuals are likely to change their attributes to conform to their friends. In this measurement study we quantify the extension of this behavior on Facebook. We measure the probability with which a friend belonging to a particular group of friends will or will not comment on a post that has already received comments from other friends belonging/not belonging to his own circle of friends. To this end we collected an original data set of Facebook profiles of 50 volunteers. Our data analysis shows that Facebook users are influenced in their choice of posting comments on friends' wall posts, based on whether or not they are acquainted with the people that left earlier comments. Identification of such behavioral nuances can be helpful in improving the user interface design of online social networks.
Adaptive hypertext narrative as city planning 184-188
  David Kolb
This essay explores an analogy that might offer new ideas for the construction of adaptive hypertext narrative systems. The analogy is not with the production of a literary work but with city planning, in particular Christopher Alexander's iterative model for gradual change. In this model, there is no overall plan for the city; instead there are multiple local interventions guided by local insufficiencies and a library of spatial patterns. Applied to narrative hypertext this model would stress multiple local additions to a growing landscape of nodes and links. The results should provide more kinds of novelty and surprise than with conventional authoring software, but would have to deal with problems of narrative unity and closure.
TouchStory: combining hyperfiction and multitouch 189-195
  Claus Atzenbeck; Mark Bernstein; Marwa Ali Al-Shafey; Stacey Mason
As multitouch phones and tablets become more popular, multitouch technologies receive increasing attention. The underlying interaction paradigm of such devices is the space on which objects are manipulated by the user's fingertips. It is natural that hypertext narratives find their way from primarily mouse-driven interaction to spatial structures and visually rich presentations. In this article we propose three features for multitouch hypertext narrative applications: (i) native multitouch support and direct manipulation of fictive objects; (ii) using the space as a structuring mechanism rather than a means for presentation; and (iii) supporting presentation of visually rich objects. Our prototype, TouchStory, is a novel tool specialized for authoring and reading hypertext narratives that integrates these features.
Engagement-based user attention distribution on web article pages 196-201
  Oleg Rokhlenko; Nadav Golbandi; Ronny Lempel; Limor Leibovich
The main monetization vehicle of many Web media sites is display ads located on article pages. Those ads are typically displayed either as banners on top of the page, or on the page's side bar. Advertiser ROI depends on the quality of ad targeting, as well as on how noticeable those ads are to users reading the article. Focusing on the latter issue, previous work has studied which ad positions are, on aggregate, more noticed by users.
   This work takes the first step toward the personalized positioning of ads on article pages. We demonstrate a correlation between the level of attention that users devote to a story, and the position of the most noticeable graphic element on the side bar. In particular, we find that the graphic element most noticed by a user is roughly to the side of the point in the article where the user's attention waned. We argue that this finding lays the foundation for increasing display advertising effectiveness by tailoring ad positions on each article page impression to the user viewing it.
Discovering semantic associations from web search interactions BIBAFull-Text 202-207
  Michael Antunovic; Glyn Caon; Mark Truran; Helen Ashman
Semantic associations take many forms, sometimes being explicit, as in visible links, and at other times being implicit: not visible but nevertheless clear to the human reader. Some implicit semantic associations might be calculable as the result of a computation, but in some cases it is difficult for a computation to capture the purpose of a semantic association. For example, the semantic similarities embodied by synonyms and similar word/phrase likenesses are not easily specified in a general rule. It is possible, however, to capture semantic associations made by human searchers. Searchers interact with search results by clicking on one or more resources in a set of results, and this interaction takes two forms: the first being an implicit indication of the relevance of the search term to the chosen resource, and the second being an implicit indication of the mutual relevance of any two or more resources selected from the same search. Both have been proposed as a similarity measure for clustering of resources. In this paper we implement, evaluate and compare three methods for semantic association discovery, mined from Web search logs. The first method is based purely on query analysis, the second is single click-based, and the third is coselection-based. The methods are compared for their effectiveness at detecting semantic similarities.
MeSoOnTV: a media and social-driven ontology-based TV knowledge management system BIBAFull-Text 208-213
  Alessio Antonini; Luca Vignaroli; Claudio Schifanella; Ruggero G. Pensa; Maria Luisa Sapino
Searching, browsing and analyzing web content is a more challenging problem today than in the early days of the Internet. This is because web content is multimedia, social and dynamic. Moreover, concepts referred to by videos, news, comments, and posts are implicitly linked by the fact that people on the Web talk about something, somewhere, at some time, and these connections may change as the perception of users on the Web changes over time. We define a model for the integration of the heterogeneous and dynamic data coming from different knowledge sources (broadcasters' archives, online newspapers, blogs, web encyclopedias, social media platforms, social networks, etc.). We use a knowledge graph to model all the heterogeneous aspects of the information in a homogeneous way. Through a case study on social TV, we provide a nontrivial cross-domain analysis scenario on real data gathered from YouTube and Twitter, and related to an Italian TV talk show on politics, broadcast by RAI, the Italian public-service broadcasting organization.
How annotation styles influence content and preferences BIBAFull-Text 214-218
  Justin Cheng; Dan Cosley
Photo-tagging web sites provide several methods to annotate photographs. In this paper, we study how people use and respond to three different annotation styles: single-word tags, multi-word tags, and comments. We find significant differences in how annotation styles influence the objectivity, descriptiveness, and interestingness of annotations. Although single-word and multi-word tags are not normally differentiated, users prefer multi-word tags for their combination of descriptiveness and succinctness. We also discover that producers and consumers assess annotation styles differently in terms of ease of use, support for different user goals, and amount of effort required, demonstrating that allowing multiple modes of annotation is generally beneficial, as is considering both tag production and consumption.
A general collaborative filtering framework based on matrix bordered block diagonal forms BIBAFull-Text 219-224
  Yongfeng Zhang; Min Zhang; Yiqun Liu; Shaoping Ma
Recommender systems based on Collaborative Filtering (CF) techniques have achieved great success in e-commerce, social networks and various other applications on the Web. However, problems such as data sparsity and scalability are still important issues to be investigated in CF algorithms. In this paper, we present a novel CF framework based on Bordered Block Diagonal Form (BBDF) matrices that attempts to meet the challenges of data sparsity and scalability. In this framework, general and special interests of users are distinguished, which helps to improve prediction accuracy in collaborative filtering tasks. Experimental results on four real-world datasets show that the proposed framework helps many traditional CF algorithms to make more accurate rating predictions. Moreover, by leveraging smaller and denser submatrices to make predictions, this framework contributes to the scalability of recommender systems.
Using personality to adjust diversity in recommender systems BIBAFull-Text 225-229
  Wen Wu; Li Chen; Liang He
Although some approaches have been proposed to enhance the diversity of online recommendations, they neglect the user's spontaneous needs, which might be influenced by her/his personality. Previously, we conducted a user survey that showed that some personality dimensions (such as conscientiousness, one of the personality factors in the big-five factor model) have a significant impact not only on users' diversity preference over items' individual attributes, but also on their overall diversity needs when all attributes are combined. Motivated by these findings, in the current work we propose a strategy that explicitly embeds personality, as a moderating factor, to adjust the diversity degree within multiple recommendations. Moreover, we performed a user evaluation on the developed system. The experimental results demonstrate an effective solution to generate personality-based diversity in recommender systems.
"Tell me what I want to know!": the effect of relationship closeness on the relevance of profile attributes BIBAFull-Text 230-235
  João Guerreiro; Daniel Gonçalves
The growing amount of personal information on the web raises increasing concerns about what and with whom we share information online. Nevertheless, little effort has been made in determining the relevance of the information shared with us or in filtering it accordingly. We conducted a study to identify the most relevant characteristics when seeking information about people and to scrutinize their differences among relationship types. To achieve that, we asked users to describe people (friends, acquaintances and famous people). Afterwards, we asked them to rate the perceived relevance of a carefully pre-determined set of attributes for each type. Results showed that their relevance varied depending on the relationship. As an outcome, we present the most relevant attributes when seeking information about friends, acquaintances and famous people and the major differences among them. We conclude by suggesting how our findings may influence the design of interactive systems where such data is paramount.
From RDF to RSS and atom: content syndication with linked data BIBAFull-Text 236-241
  Alex Stolz; Martin Hepp
For typical Web developers, it is complicated to integrate content from the Semantic Web into an existing Web site. In contrast, most software packages for blogs, content management, and shop applications support the simple syndication of content from external sources via data feed formats, namely RSS and Atom. In this paper, we describe a novel technique for consuming useful data from the Semantic Web in the form of RSS or Atom feeds. Our approach combines (1) the simplicity and broad tooling support of existing feed formats, (2) the precision of queries against structured data built upon common Web vocabularies like schema.org, GoodRelations, FOAF, SIOC, or VCard, and (3) the ease of integrating content from a large number of Web sites and other data sources of RDF in general. We also (4) provide a pattern for embedding RDFa into the feed content in a "viral" way so that the original URIs of entities are included in all Web pages that republish the original content and that those pages will link back to the original content. This helps prevent the proliferation of identifiers for entities and provides a simple means for tracking the document URI at which particular content reappears.
Guided exploration and integration of urban data BIBAFull-Text 242-247
  Vanessa Lopez; Spyros Kotoulas; Marco Luca Sbodio; Raymond Lloyd
Governments and enterprises are interested in the return on investment for exposing their data. This brings forth the problem of making data consumable with minimal effort. Beyond search techniques, there is a need for effective methods to identify heterogeneous datasets that are closely related, as part of data integration or exploration tasks. The large number of datasets demands a new generation of Smarter Systems for data content aggregation that allow users to incrementally liberate, access and integrate information, in a manner that scales in terms of gain for the effort spent. In the context of such a pay-as-you-go system, we present a novel method for exploring and discovering relevant datasets based on semantic relatedness. We demonstrate a system for contextual knowledge mining on hundreds of real-world datasets from Dublin City. We evaluate our semantic approach, using query logs and domain expert judgments, to show that it effectively identifies related datasets and outperforms text-based recommendations.
Exploring temporal proximity and spatial distribution of terms in web-based search of event-related images BIBAFull-Text 248-252
  Massimiliano Ruocco; Heri Ramampiaro
Pictures in media sharing applications are increasingly accompanied by geotags. For this reason, we stress the importance of exploring the possibility of applying the spatial as well as the temporal dimension when searching for event-related pictures. Specifically, we propose extended query expansion models that exploit the information about the temporal neighbourhoods among pictures in a collection and leverage the spatio-temporal distribution of the candidate expansion terms to re-weight and expand the initial query. To evaluate our approach, we conduct extensive experiments on a large dataset consisting of 88 million pictures from Flickr. The results from these experiments demonstrate the viability and effectiveness of our method with respect to retrieval performance, considering both a large dataset and query pictures with a restricted number of terms.
On the topology of the web of data BIBAFull-Text 253-257
  Markus Luczak-Rösch; Robert Tolksdorf
The Web of Data consists of the openly accessible structured data on the Web. This includes the evolving number of Linked Open Data data sets but also the structured data which is embedded in Web pages. In this paper we address questions related to a unified definition of distinct data sets and factors that influence different network representations of structured Web data. The contributions are (1) an algorithm to generate a data set linking structure of the embedded structured data sourced from (a) the Billion Triples Challenge corpus, (b) the Web Data Commons corpus, and (c) the Sindice crawl, (2) a discussion on the issue of identifying distinct data sets in a generic fashion, and (3) a high-level visual abstraction of the current Web of Data topology.
Community detection algorithm based on centrality and node distance in scale-free networks BIBAFull-Text 258-262
  Sorn Jarukasemratana; Tsuyoshi Murata; Xin Liu
In this paper, we present a method for detecting community structures based on centrality value and node distance. Many real-world networks possess a scale-free property, and this property makes community detection difficult, especially for algorithms that are based on modularity optimization. In our algorithm, however, communities are formed from hub nodes. Thus communities with a scale-free property can be identified correctly. The method does not contain any random element, nor does it require any pre-determined value such as the number of communities. Our experiments have shown that our algorithm performs better than those based on modularity optimization on both real-world and computer-generated datasets.