Measurement Quality of Online Collaboration in Presence of Negative Relationships | | BIBA | Full-Text | 3-13 | |
Mikolaj Morzy; Tomasz Bartkowski; Krzysztof Jedrzejewski | |||
Online collaboration services usually focus on positive relationships between their constituent actors. Many environments in which social mechanisms are present harness the positive feedback of social recognition, status visibility, or collective action. Simple mechanisms such as commenting on status updates and up-voting resources attributed to an actor may result in a proverbial karma flow in a socially aware online collaboration environment. On the other hand, many services also allow users to express dislike, irritation and contempt towards resources provided by users. For instance, down-voting mechanics are crucial in online news aggregation services, such as Digg or Reddit, for maintaining a certain level of quality of the presented content. Despite the availability of data, few works have been published on measuring negative network effects in social networks. In this paper we analyze a large body of data harvested from the Polish online news aggregation site Wykop.pl, and we examine the effects of a more considerate approach to negative network construction when measuring the overall parameters and characteristics of the social network derived from the positive (up-voting) and negative (down-voting) behaviors of users. |
What Makes a Good Team of Wikipedia Editors? A Preliminary Statistical Analysis | | BIBAK | Full-Text | 14-28 | |
Leszek Bukowski; Michal Jankowski-Lorek; Szymon Jaroszewicz; Marcin Sydow | |||
The paper studies the quality of teams of Wikipedia authors using a
statistical approach. We report the preparation of a dataset containing numerous
behavioural and structural attributes, its subsequent analysis, and its use to
predict team quality. We have performed an exploratory analysis using partial
regression to remove the influence of attributes not related to the team
itself. The analysis confirmed that the key factor significantly influencing
an article's quality is discussion between team members. The second part of the
paper successfully uses machine learning models to predict good articles based
on features of the teams that created them. Keywords: team quality; Wikipedia; dataset; statistical data mining |
iPoster: A Collaborative Browsing Platform for Presentation Slides Based on Semantic Structure | | BIBA | Full-Text | 29-42 | |
Yuanyuan Wang; Kota Tomoyasu; Kazutoshi Sumiya | |||
Coursera and SlideShare are crucial platforms for improving education; students are able to obtain various educational presentation materials through the Web. Recently, Prezi introduced a zoomable canvas as a substitute for traditional presentations, which allows users to zoom in and out of the presentation media. Teachers then attempt to provide presentations in a non-linear fashion to enhance user interaction with these presentations. However, creating non-linear presentations is time-consuming and poses design challenges. In this paper, in order to support collaborative browsing, we build a novel collaborative browsing platform that generates meaningfully structured presentations, called "iPosters"; these enable users to navigate automatically through slide-based educational materials. The system places elements of presentation slides, such as text and graphics, in a structural layout by semantically analyzing the slide structure. The structural layout can reveal the hierarchy of elements by moving from the overview to a detail using automatic transitions, such as zooms and pans. Through this, the collaborative browsing platform can support multiple students in interactively browsing an iPoster in cyberspace on their tablets. The navigation information maps each student's specific needs by considering the student's operations, and detects other students who have similar learning purposes to help them share their interests with each other. |
Using E-mail Communication Network for Importance Measurement in Collaboration Environments | | BIBA | Full-Text | 43-54 | |
Pawel Lubarski; Mikolaj Morzy | |||
Can we establish the importance of people by simply analyzing the set of sent and received emails, having no access to subject lines or contents of messages? The answer, apparently, is "yes we can". The intrinsic behavior of people reveals simple patterns in choosing which emails to answer next. Our theory is based on two assumptions. First, we assume that people do their email communication in bursts, answering several messages consecutively, and that they can freely choose the order of answers. Second, we believe that people use priority queues to manage their internal task lists, including the list of emails to be answered. Looking at the timing and ordering of responses, we derive individual rankings of the importance of actors, because we posit that people have a tendency to reply to important actors first. These individual subjective rankings are significant because they reflect the relative importance of other actors as perceived by each actor. The individual rankings are aggregated into a global ranking of the importance of all actors. We perform an experimental evaluation of our model by analyzing a dataset consisting of over 600,000 emails sent during a one-year period to 200 employees of our university. Our final ranking closely reflects the "true" importance of employees computed based on surveys. We think that our model is general and can be applied whenever behavioral data is available that includes any choice made by actors from a set of available alternatives, with the alternatives having varying degrees of importance to individual actors. |
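The idea of turning reply order into rankings can be illustrated with a minimal sketch. It is not the authors' model: the abstract does not specify how rankings are scored or aggregated, so the position-based score and the summation used here are assumptions, as are the function names `individual_scores` and `global_ranking` and the toy sender labels.

```python
from collections import defaultdict

def individual_scores(bursts):
    """Score senders from one actor's reply bursts.

    bursts: list of lists of sender ids, each list in the order in which
    the actor replied within a single burst (assumed input format).
    A sender answered first in a burst gets 1.0, the last gets 0.0.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for burst in bursts:
        n = len(burst)
        if n < 2:
            continue  # a single reply carries no ordering information
        for position, sender in enumerate(burst):
            totals[sender] += (n - 1 - position) / (n - 1)
            counts[sender] += 1
    # Average over all bursts in which the sender appeared.
    return {s: totals[s] / counts[s] for s in totals}

def global_ranking(bursts_by_actor):
    """Aggregate individual scores by summation (an assumed rule)."""
    sums = defaultdict(float)
    for bursts in bursts_by_actor.values():
        for sender, score in individual_scores(bursts).items():
            sums[sender] += score
    # Highest total first; ties broken alphabetically for determinism.
    return sorted(sums, key=lambda s: (-sums[s], s))

# Hypothetical toy data: three actors and their reply bursts.
example = {
    "alice": [["dean", "bob", "carol"]],
    "bob": [["dean", "carol"]],
    "carol": [["dean", "bob"]],
}
ranking = global_ranking(example)
```

In this toy input every actor replies to "dean" first, so "dean" ends up at the top of the aggregated ranking.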
Predicting Best Answerers for New Questions: An Approach Leveraging Topic Modeling and Collaborative Voting | | BIBAK | Full-Text | 55-68 | |
Yuan Tian; Pavneet Singh Kochhar; Ee-Peng Lim; Feida Zhu; David Lo | |||
Community Question Answering (CQA) sites are becoming an increasingly important
source of information where users can share knowledge on various topics.
Although these platforms bring new opportunities for users to seek help or
provide solutions, they also pose many challenges with the ever-growing size of
the community. The sheer number of questions posted every day motivates the
problem of routing questions to the appropriate users who can answer them. In
this paper, we propose an approach to predict the best answerer for a new
question on a CQA site. Our approach considers both user interest and user
expertise relevant to the topics of the given question. A user's interests in
various topics are learned by applying topic modeling to previous questions
answered by the user, while the user's expertise is learned by leveraging
the collaborative voting mechanism of CQA sites. We have applied our model to a
dataset extracted from StackOverflow, one of the biggest CQA sites. The results
show that our approach outperforms the TF-IDF based approach. Keywords: CQA; expert recommendation; topic modeling; collaborative voting |
A Digital Humanities Approach to the History of Science | | BIBA | Full-Text | 71-85 | |
Pim Huijnen; Fons Laan; Maarten de Rijke; Toine Pieters | |||
Comparative historical research on the intensity, diversity and fluidity of public discourses has been severely hampered by the extraordinary task of manually gathering and processing large sets of opinionated data in news media in different countries. At most 50,000 documents have been systematically studied in a single comparative historical project in the subject area of heredity and eugenics. Digital techniques, like the text mining tools WAHSP and BILAND that we have developed in two successive demonstrator projects, are able to perform advanced forms of multi-lingual text mining in much larger data sets of newspapers. We describe the development and use of WAHSP and BILAND to support historical discourse analysis in large digitized news media corpora. Furthermore, we argue that text mining techniques overcome a limitation of traditional historical research, namely that only documents explicitly referring to eugenics issues and debates can be incorporated. Our tools are able to provide information on ideas and notions about heredity, genetics and eugenics that circulate in discourses that are not directly related to eugenics (e.g., sport, education and economics). |
Building the Social Graph of the History of European Integration | | BIBAK | Full-Text | 86-99 | |
Lars Wieneke; Marten Düring; Ghislain Silaume; Carine Lallemand; Vincenzo Croce; Marilena Lazzarro; Francesco Nucci; Chiara Pasini; Piero Fraternali; Marco Tagliasacchi; Mark Melenhorst; Jasminko Novak; Isabel Micheel; Erik Harloff; Javier Garcia Moron | |||
The breadth and scale of multimedia archives provide tremendous potential
for historical research that has not been fully tapped until now. In this paper
we discuss the approach taken by the History of Europe application, a
demonstrator for the integration of human and machine computation that combines
the power of face recognition technology with two distinctly different
crowd-sourcing approaches to compute co-occurrences of persons in historical
image sets. These co-occurrences are turned into a social graph that connects
persons with each other and positions them, through information about the date
and location of recording, in time and space. The resulting visualization of
the graph as well as analytical tools can help historians to find new impulses
for research and to unearth previously unknown relationships. As such, the
integration of human expertise and machine computation enables a new class of
applications for the exploration of multimedia archives with significant
potential for the digital humanities. Keywords: Face recognition; Entity linking; User centered design; Data visualization;
Digital Humanities; Human-machine Interaction; History; European Studies |
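The step from co-occurrences to a social graph can be sketched as follows. This is a minimal illustration and not the History of Europe implementation: it assumes that face recognition and crowd-sourcing have already produced, for each image, a list of identified person labels, and the function name `cooccurrence_graph` and the labels "A", "B", "C" are hypothetical.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_graph(photos):
    """Build weighted, undirected edges from per-image person lists.

    photos: iterable of lists of person labels identified in each image
    (assumed input format). Returns a Counter mapping an alphabetically
    ordered pair of labels to the number of images showing both persons.
    """
    edges = Counter()
    for people in photos:
        # De-duplicate and sort so each pair has one canonical key.
        for a, b in combinations(sorted(set(people)), 2):
            edges[(a, b)] += 1
    return edges

# Hypothetical toy data: three images with identified persons.
photos = [["B", "A"], ["A", "B", "C"], ["C"]]
graph = cooccurrence_graph(photos)
```

Edge weights count shared images; date and location metadata, which the abstract uses to position persons in time and space, could be attached to each edge in the same pass.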
From Diagram to Network | | BIBAK | Full-Text | 100-109 | |
Yanan Sun | |||
This paper aims to remove a constraint on applying a network approach to art
history. First, it points out that, although old diagrams of art history did not
use the language of modern network theory, they already show ingenious
network thinking in theorizing the development of the arts. Meanwhile, the indirect
visual devices and the inclusive tradition of these diagrams, which include
entities with various properties, prevent the application of computer-aided
network methods to decipher and re-analyze the contents of this heritage of art
historical research. To break this shackle, this paper suggests a multi-mode
network approach to "translate" the traditional network thinking of art
diagrams into the conceptualization of graph-theoretical network analysis. By
doing so, this paper demonstrates how art historical research could benefit
from the modern sociological approach to network theory. To explain the usefulness
and advantages of this method, the diagrams of Covarrubias and Barr are taken as
examples to be converted into graph-theoretical networks. Keywords: multi-mode network; historic network research; art history; art-history
diagram |
Frame-Based Models of Communities and Their History | | BIBAK | Full-Text | 110-119 | |
Robert B. Allen | |||
Previous models of communities and their history have focused on the
entities in those communities, such as their locations and people. We introduce
models which incorporate behaviors and processes. We propose that approaches
based on object-oriented modeling are particularly useful. Specifically, we
explore the feasibility of developing object-oriented models which employ
linguistic frames adapted from the FrameNet corpus. We apply these models to
relatively straightforward and self-contained historical scenarios. We
implement the models in Java and analyze some of the advantages and challenges of
that approach. Historical newspapers are particularly rich sources of natural
language descriptions of communities, but there are many sources of
non-linguistic information about communities which may also be incorporated. We
consider the possibilities of developing more coherent models of communities
based on modeling processes, partonomies, systems, and situations. Finally, we
consider enabling greater interactivity with the structured models and
alternative architectures for the models. Keywords: Behavior; Descriptive Modeling; Digital Humanities; Events; Functionality;
FrameNet; Indexing; Information Organization; Java; Object-Oriented Modeling;
Processes; Social Modeling |
Documenting Social Unrest: Detecting Strikes in Historical Daily Newspapers | | BIBA | Full-Text | 120-133 | |
Kalliopi Zervanou; Marten Düring; Iris Hendrickx; Antal van den Bosch | |||
The identification of relevant historical sources such as newspapers and letters and the extraction of information from them are an essential part of historical research. In this work, our aim is the detection of relevant primary sources with the goal of supporting researchers working on a specific historical event. We focus on the historical daily Dutch newspaper archive of the National Library of the Netherlands and on strike events that happened in the Netherlands during the 1980s. Using a manually compiled database of strikes in the Netherlands, we first attempt to find reports on those strikes in historical daily newspapers by automatically associating database records with the daily press of the time covering the same strike. Then, we generalise our methodology to detect strike events in the press not currently covered by the strikes database, and in this way support the extension of secondary historical resources. Our methods are evaluated against the manually constructed database of strikes. |
Collective Memory in Poland: A Reflection in Street Names | | BIBAK | Full-Text | 134-142 | |
Radoslaw Nielek; Aleksander Wawer; Adam Wierzbicki | |||
Our article starts with the observation that street names fall into two
general types: generic and historically inspired. We analyse street name
distributions (of the second type) as a window onto nation-level collective
memory in Poland. The process of selecting street names is determined socially,
as the selections reflect the symbols considered important to the nation-level
society, but it also has strong historical motivations and determinants. In the
article, we look for these relationships in the available data sources. We use
Wikipedia articles to match street names with their textual descriptions and
assign them to time points. We then apply selected text mining and
statistical techniques to reach quantitative conclusions. We also present a
case study: the geographical distribution of two particular street names in
Poland, to demonstrate the binding between history and the political orientation of
regions. Keywords: collective memory; Wikipedia; street names |