Proceedings of the 2013 International Symposium on Wikis and Open Collaboration

Fullname:Proceedings of the 9th International Symposium on Open Collaboration
Editors:Ademar Aguiar; Dirk Riehle
Location:Hong Kong, China
Dates:2013-Aug-05 to 2013-Aug-07
Standard No:ISBN: 978-1-4503-1852-5; ACM DL: Table of Contents; hcibib: ISW13
  1. Open collaboration research track
  2. Wikipedia research track
  3. FLOSS research track
  4. Open access research track
  5. Research posters
  6. Research-in-progress presentations
  7. Doctoral symposium
  8. Invited talks
  9. Experience reports
  10. Community demos
  11. Community tutorials

Open collaboration research track

Analyzing multi-dimensional networks within MediaWikis 1
  Brian C. Keegan; Arber Ceni; Marc A. Smith
The MediaWiki platform supports popular socio-technical systems such as Wikipedia as well as thousands of other wikis. This software encodes and records a variety of relationships about the content, history, and editors of its articles such as hyperlinks between articles, discussions among editors, and editing histories. These relationships can be analyzed using standard techniques from social network analysis; however, extracting relational data from Wikipedia has traditionally required specialized knowledge of its API, information retrieval, network analysis, and data visualization, which has inhibited scholarly analysis. We present a software library called the NodeXL MediaWiki Importer that extracts a variety of relationships from the MediaWiki API and integrates with the popular NodeXL network analysis and visualization software. This library allows users to query and extract a variety of multidimensional relationships from any MediaWiki installation with a publicly-accessible API. We present a case study examining the similarities and differences between different relationships for the Wikipedia articles about "Pope Francis" and "Social media." We conclude by discussing the implications this library has for both theoretical and methodological research as well as community management and outline future work to expand the capabilities of the library.
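As a rough illustration of the kind of extraction this abstract describes, the following Python sketch builds a MediaWiki API query for the outgoing hyperlinks of one article and turns a decoded response into network edges. It is not the NodeXL MediaWiki Importer's code; the function names and the sample response are assumptions made for illustration, and result continuation (`plcontinue`) is deliberately left out.

```python
from urllib.parse import urlencode


def build_links_query(api_url, title):
    """Build a MediaWiki API URL asking for the pages that `title` links to."""
    params = {
        "action": "query",
        "prop": "links",
        "titles": title,
        "pllimit": "max",
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


def hyperlink_edges(response):
    """Turn a decoded `prop=links` JSON response into (source, target) edges."""
    edges = []
    for page in response.get("query", {}).get("pages", {}).values():
        # Pages without outgoing links have no "links" key at all.
        for link in page.get("links", []):
            edges.append((page["title"], link["title"]))
    return edges


# Hand-written sample in the shape the API returns (illustrative, not real data).
sample_response = {
    "query": {
        "pages": {
            "1001": {
                "pageid": 1001,
                "ns": 0,
                "title": "Pope Francis",
                "links": [
                    {"ns": 0, "title": "Buenos Aires"},
                    {"ns": 0, "title": "Society of Jesus"},
                ],
            }
        }
    }
}

url = build_links_query("https://en.wikipedia.org/w/api.php", "Pope Francis")
edges = hyperlink_edges(sample_response)
```

The resulting edge list is the usual input format for network analysis tools; other relationships the abstract mentions (discussions, editing histories) would need different `prop` or `list` queries.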
Design and implementation of wiki content transformations and refactorings 2
  Hannes Dohrn; Dirk Riehle
The organic growth of wikis requires constant attention by contributors who are willing to patrol the wiki and improve its content structure. However, most wikis still only offer textual editing, and even wikis that offer WYSIWYG editing do not assist the user in restructuring the wiki. Therefore, "gardening" a wiki is a tedious and error-prone task. One of the main obstacles to assisted restructuring of wikis is the underlying content model, which prohibits automatic transformations of the content. Most wikis use either a purely textual representation of content or rely on the representational HTML format. To allow rigorous definitions of transformations we use and extend a Wiki Object Model. With the Wiki Object Model installed, we present a catalog of transformations and refactorings that helps users to easily and consistently evolve the content and structure of a wiki. Furthermore, we propose XSLT as the language for transformation specification and provide working examples of selected transformations to demonstrate that the Wiki Object Model and the transformation framework are well designed. We believe that our contribution significantly simplifies wiki "gardening" by introducing the means of effortless restructuring of articles and groups of articles. It furthermore provides an easily extensible foundation for wiki content transformations.
Project talk: coordination work and group membership in WikiProjects 3
  Jonathan T. Morgan; Michael Gilbert; David W. McDonald; Mark Zachry
WikiProjects have contributed to Wikipedia's success in important ways, yet the range of work that WikiProjects perform and the way they coordinate that work remains largely unexplored. In this study, we perform a content analysis of 788 work-related discussions from the talk pages of 138 WikiProjects in order to understand the role WikiProjects play in collaborative work on Wikipedia. We find that the editors use WikiProjects to coordinate a wide variety of work activities beyond content production and that non-members play an active role in that work. Our research suggests that WikiProject collaboration is less structured and more open than that of many virtual teams and that WikiProjects may function more like FLOSS projects than traditional groups.
Songrium: a music browsing assistance service based on visualization of massive open collaboration within music content creation community 4
  Masahiro Hamasaki; Masataka Goto
This paper describes a music browsing assistance service, Songrium (http://songrium.jp), that helps a user enjoy songs while seeing a visualization of open collaboration. Songrium focuses on open collaboration for music content creation on the most popular Japanese video-sharing service. Since this open collaboration generates more than half a million video clips with a rich variety of music content, we call it massive open collaboration. To develop a shared understanding of this collaboration, which we have analyzed, we developed Songrium, which visualizes relations among both original songs and derivative works generated from the collaboration. Songrium also features a social annotation framework to verbalize and share various relations among songs, and a flexible ranking mechanism to find interesting songs. Since we launched Songrium in August 2012, more than 7,000 users have used our service, in which over 98,000 songs and 520,000 derivative works have been automatically registered. We hope Songrium will not only encourage creators to create more derivative works, but also attract consumers to participate in the collaboration as creators.
Managing complexity: strategies for group awareness and coordinated action in Wikipedia 5
  Michael Gilbert; Jonathan T. Morgan; David W. McDonald; Mark Zachry
In online groups, increasing explicit coordination can increase group cohesion and member productivity. On Wikipedia, groups called WikiProjects employ a variety of explicit coordination mechanisms to motivate and structure member contribution, with the goal of creating and improving articles related to particular topics. However, while explicit coordination works well for coordinating article-level actions, coordinating group tasks and tracking progress towards group goals that involve tracking hundreds or thousands of articles over time requires different coordination strategies.
   To lower the coordination cost of monitoring and task-routing, WikiProjects centralize coordination activity on WikiProject pages -- "micro-sites" that provide a centralized repository of project tools, tasks and targets, and discussion for explicit group coordination. These tools can facilitate shared awareness of member and non-member editing activity on articles that the project cares about. However, whether these tools are as effective at motivating members as explicit coordination, and whether they elicit the same kind of contributions, has not been studied. In this study, we examine one such tool, Hot Articles, and compare its effect on the editing behavior of WikiProject members with a more common, explicit coordination mechanism: making edit requests on the project talk page.

Wikipedia research track

When the levee breaks: without bots, what happens to Wikipedia's quality control processes? 6
  R. Stuart Geiger; Aaron Halfaker
In the first half of 2011, ClueBot NG -- one of the most prolific counter-vandalism bots in the English-language Wikipedia -- went down for four distinct periods, each period of downtime lasting from days to weeks. In this paper, we use these periods of breakdown as naturalistic experiments to study Wikipedia's heterogeneous quality control network, which we analyze as a multi-tiered system in which distinct classes of reviewers use various reviewing technologies to patrol for different kinds of damage at staggered time periods. Our analysis showed that the overall time-to-revert edits was almost doubled when this software agent was down. Yet while a significantly smaller proportion of edits made during the bot's downtime were reverted, we found that those edits were eventually reverted. This suggests that other agents in Wikipedia took over this quality control work, but performed it at a far slower rate.
A history of newswork on Wikipedia 7
  Brian C. Keegan
Wikipedia's coverage of current events blurs the boundaries of what it means to be an encyclopedia. Drawing on Gieryn's concept of "boundary work", this paper explores how Wikipedia's response to the 9/11 attacks expanded the role of the encyclopedia to include newswork, excluded content like the 9/11 Memorial Wiki that became problematic following this expansion, and legitimized these changes through the adoption of news-related policies and routines like promoting "In the News" content on the homepage. However, a second case exploring WikiNews illustrates the pitfalls of misappropriating professional newswork norms as well as the challenges of sustaining online communities. These cases illuminate the social construction of new technologies as they confront the boundaries of traditional professional identities and also reveal how newswork is changing in response to new forms of organizing enabled by these technologies.
Tell me more: an actionable quality model for Wikipedia 8
  Morten Warncke-Wang; Dan Cosley; John Riedl
In this paper we address the problem of developing actionable quality models for Wikipedia, models whose features directly suggest strategies for improving the quality of a given article. We first survey the literature in order to understand the notion of article quality in the context of Wikipedia and existing approaches to automatically assess article quality. We then develop classification models with varying combinations of more or less actionable features, and find that a model that only contains clearly actionable features delivers solid performance. Lastly we discuss the implications of these results in terms of how they can help improve the quality of articles across Wikipedia.
Getting to the source: where does Wikipedia get its information from? 9
  Heather Ford; Shilad Sen; David R. Musicant; Nathaniel Miller
We ask what kinds of sources Wikipedians value most and compare Wikipedia's stated policy on sources to what we observe in practice. We find that primary data sources developed by alternative publishers are both popular and persistent, despite policies that present such sources as inferior to scholarly secondary sources. We also find that Wikipedians make almost as much use of information produced by associations such as nonprofits as of information from scholarly publishers, with a significant portion coming from government information sources. Our findings suggest the rise of new influential sources of information on the Web but also reinforce the traditional geographic patterns of scholarly publication. This has a significant effect on the goal of Wikipedians to represent "the sum of all human knowledge."
Revision graph extraction in Wikipedia based on supergram decomposition 10
  Jianmin Wu; Mizuho Iwaihara
As one of the popular social media that many people have turned to in recent years, the collaborative encyclopedia Wikipedia provides information in a more "Neutral Point of View" way than others. Towards this core principle, considerable effort has been put into collaborative contribution and editing. The trajectories of how such collaboration unfolds through revisions are valuable for group dynamics and social media research, which suggests that we should precisely extract the underlying derivation relationships among revisions from the chronologically sorted revision history. In this paper, we propose a revision graph extraction method based on supergram decomposition in the document collection of near-duplicates. The plain text of each revision is measured by its frequency distribution of supergrams, where a supergram is a variable-length token sequence that stays the same across revisions. We show that this method performs the task more effectively than existing methods.
The illiterate editor: metadata-driven revert detection in Wikipedia 11
  Jeffrey Segall; Rachel Greenstadt
As the community depends more heavily on Wikipedia as a source of reliable information, the ability to quickly detect and remove detrimental information becomes increasingly important. The longer incorrect or malicious information lingers in a source perceived as reputable, the more likely that information will be accepted as correct and the greater the loss to source reputation. We present The Illiterate Editor (IllEdit), a content-agnostic, metadata-driven classification approach to Wikipedia revert detection. Our primary contribution is in building a metadata-based feature set for detecting edit quality, which is then fed into a Support Vector Machine for edit classification. By analyzing edit histories, the IllEdit system builds a profile of user behavior, estimates expertise and spheres of knowledge, and determines whether or not a given edit is likely to be eventually reverted. The success of the system in revert detection (0.844 F-measure), as well as its disjoint feature set as compared to existing, content-analyzing vandalism detection systems, show promise in the synergistic usage of IllEdit for increasing the reliability of community information.
The role of conflict in determining consensus on quality in Wikipedia articles 12
  Kim Osman
This paper presents research that investigated the role of conflict in the editorial process of the online encyclopedia, Wikipedia. The study used a grounded approach to analyzing 147 conversations about quality from the archived history of the Wikipedia article Australia. It found that conflict in Wikipedia is a generative friction, regulated by references to policy as part of a coordinated effort within the community to improve the quality of articles.
Temporal analysis of activity patterns of editors in collaborative mapping project of OpenStreetMap 13
  Taha Yasseri; Giovanni Quattrone; Afra Mashhadi
In recent years, wikis have become an attractive platform for social studies of human behaviour. Containing millions of records of edits from across the globe, collaborative systems such as Wikipedia have allowed researchers to gain a better understanding of editors' participation and their activity patterns. However, contributions made to geo-wikis -- wiki-based collaborative mapping projects -- differ from systems such as Wikipedia in a fundamental way: the spatial dimension of the content limits the contributors to those who possess local knowledge about a specific area. Cross-platform studies and comparisons are therefore required to build a comprehensive image of online open collaboration phenomena. In this work, we study the temporal behavioural patterns of editors of OpenStreetMap, a successful example of a geo-wiki, for two European capital cities. We categorise different types of temporal patterns and report on the historical trend over a period of 7 years of the project's age. We also draw a comparison with the previously observed editing activity patterns of Wikipedia.
The emergence of Wikipedia as a new media institution 14
  Kim Osman
Wikipedia is an important institution and part of the new media landscape having evolved from the collaborative efforts of millions of distributed users. This poster will present ongoing research that examines how the issues that have been highlighted by conflict within the community have shaped the evolution of Wikipedia from an open wiki experiment to a global knowledge producer. Bringing together the concepts of interpretive flexibility and generative friction with existing theories on the evolution of institutions, the research aims to present possible futures for Wikipedia as part of not only the larger Wikimedia movement, but of an open and accessible web.

FLOSS research track

Security of public continuous integration services 15
  Volker Gruhn; Christoph Hannebauer; Christian John
Continuous Integration (CI) and Free, Libre and Open Source Software (FLOSS) are both associated with agile software development. Paradoxically, FLOSS projects have difficulty using CI, and software forges still lack support for CI. Two factors hamper widespread use of CI in FLOSS development: the cost of the computational resources and the security risks of public CI services. Through a security analysis of public CI services, this paper identifies possible attack vectors. To eliminate one class of attack vectors, the paper describes a concept that encapsulates a part of the CI system via virtualization. The concept is implemented as a proof of concept.
Collaborative development of data curation profiles on a wiki platform: experience from free and open source software projects and communities 16
  Sulayman K. Sowe; Koji Zettsu
Wiki technologies have proven to be versatile and successful in aiding collaborative authoring of web content. A multitude of users can collaboratively add, edit, and revise wiki pages on the fly, with ease. This functionality makes wikis ideal platforms to support research communities in curating data. However, without appropriate customization and a model to support collaborative editing of pages, wikis will fall short in providing the functionalities needed to support collaborative work. In this paper, we present the architecture and design of a wiki platform, as well as a model that allows scientific communities, especially disaster response scientists, to collaboratively edit and append data to their wiki pages. Our experience in the implementation of the platform on MediaWiki demonstrates how wiki technologies can be used to support data curation, and how the dynamics of the FLOSS development process and its user and developer communities are increasingly informing our understanding of supporting collaboration and coordination on wikis.
A case study of the collaborative approaches to sustain open source business models 17
  Shane Coughlan; Tetsuo Noda; Terutaka Tansho
Open source licenses provide everyone with the legal right to use, study, share, and improve the technology they cover from the perspective of copyright law. However, there are occasions when open source software packages or projects primarily governed by copyright licenses come into potential conflict with patent issues, or suffer from other governance concerns regarding third-party Intellectual Property Rights (IPR). From an economic perspective it is interesting how instead of undermining adoption, such challenges have led to an increase of collaborative governance solutions in open source, perhaps inspired by how such collaboration in development and business matters has provided benefit to stakeholders. In this paper, we show this evolution of collaborative solutions in open source business by actual example, and in the process illustrate how this unique approach to dealing with diverse ownership across business sectors works in practice.
The empirical commit frequency distribution of open source projects 18
  Carsten Kolassa; Dirk Riehle; Michel A. Salim
A fundamental unit of work in programming is the code contribution ("commit") that a developer makes to the code base of the project in work. An author's commit frequency describes how often that author commits. Knowing the distribution of all commit frequencies is a fundamental part of understanding software development processes. This paper presents a detailed quantitative analysis of commit frequencies in open-source software development. The analysis is based on a large sample of open source projects, and presents the overall distribution of commit frequencies.
   We analyze the data to show the differences between authors and projects by project size; we also include a comparison of successful and unsuccessful projects, and we derive an activity indicator from these analyses. By measuring a fundamental dimension of programming we help improve software development tools and our understanding of software development. We also validate some fundamental assumptions about software development.
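To make the notion of a commit frequency distribution concrete, here is a minimal Python sketch. It assumes a commit log reduced to (author, commit id) pairs and counts commits per author over the whole sample window; the paper's actual definition (for example, commits per unit of time) and its data set are not given in the abstract, so the names and sample data below are purely illustrative.

```python
from collections import Counter


def commits_per_author(commits):
    """Map each author to the number of commits they made in the sample."""
    return Counter(author for author, _commit_id in commits)


def frequency_distribution(per_author):
    """Map a commit count to the number of authors who have exactly that count."""
    return Counter(per_author.values())


# Hypothetical commit log: (author, commit id) pairs.
commit_log = [
    ("alice", "c1"),
    ("bob", "c2"),
    ("alice", "c3"),
    ("carol", "c4"),
    ("alice", "c5"),
]

per_author = commits_per_author(commit_log)
distribution = frequency_distribution(per_author)
```

Here `distribution` records that two authors made one commit each and one author made three, which is the kind of histogram such an analysis would aggregate over many projects.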
User-evolvable tools in the web 19
  Jens Lincke; Robert Hirschfeld
Self-supporting development environments like Smalltalk and Emacs can be used to directly evolve themselves, making their tools very malleable and adaptable. In Web-based software development environments users can collaborate in creating software without having to install the environment locally. Bringing these two together and making Web-based environments self-supportive is challenging, since users have to take care not to break the system while others might be using it. Environments aimed at end-users usually provide a scripting level above the base system. Instead of providing users with a fixed set of tools, we propose to make the tools user-evolvable by building them as scriptable objects in a shared user-editable repository. In our system, the Lively Kernel, the core system is developed using modules and classes, and on top of it users create active content by directly manipulating and scripting objects. By leveraging the scripting level for the development of tools themselves, we allow users to adapt their tools in a self-supporting way, without the need to invasively change the system's core. In this paper we show how development tools in Lively are collaboratively evolved. Tools can be directly explored, adapted, and published in a shared manner while they are being used.

Open access research track

Seamless sharing in a seemingly divided world: a glimpse of the challenges faced by creative commons 20
  Poorna Mysoor
Broad-based adoption of the Creative Commons licenses is bound to face challenges. Some of the challenges arise from the way the copyright laws in different jurisdictions are designed. Other challenges arise either from the way the Creative Commons licenses are structured or from the underlying policy considerations of the information society as a whole. This paper discusses these challenges and suggests possible responses.
Metadata aggregation at GovData.de: an experience report 21
  Florian Marienfeld; Ina Schieferdecker; Evanela Lapi; Nikolay Tcholtchev
A key challenge for open data portals is the aggregation of metadata from various data catalogs (on different administrative levels or from different application fields), also known as metadata harvesting. This paper describes harvesting at the pilot of the German open government portal GovData.de, which is scheduled to become the data portal for all German public administration levels.
   At the launch of the pilot portal in February, eleven federal, state and local data catalogs were integrated, which produced about 2,000 open data sets. In the meantime, the number of data sets has increased to over 3,100, mainly due to improved harvesting capabilities of the portal. This paper discusses the GovData.de metadata schema and experiences with the different harvesting techniques that are in use at GovData.de: CKAN-Harvest, CKAN-API, CSW-Harvest and JSON-Dump.
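A minimal sketch of what API-based harvesting can look like, in Python: it reads decoded responses in the envelope used by CKAN's Action API (`success` plus `result`, with `package_search` returning its records under `result["results"]`) and merges the records of several catalogs by dataset name. This is not GovData.de's harvester; the catalog names, the sample records, and the "first catalog wins" merge policy are assumptions for illustration only.

```python
def datasets_from_search(response):
    """Extract dataset records from a decoded CKAN `package_search` response."""
    if not response.get("success"):
        raise ValueError("CKAN request reported failure")
    return response["result"]["results"]


def aggregate(catalogs):
    """Merge records from several catalogs, keyed by dataset name.

    The first catalog that supplies a name sets the title; later catalogs
    with the same name are only recorded as additional sources.
    """
    merged = {}
    for catalog_name, response in catalogs.items():
        for record in datasets_from_search(response):
            entry = merged.setdefault(
                record["name"],
                {"title": record.get("title", ""), "sources": []},
            )
            entry["sources"].append(catalog_name)
    return merged


# Hand-written sample responses from two hypothetical catalogs.
catalogs = {
    "federal-catalog": {
        "success": True,
        "result": {
            "count": 2,
            "results": [
                {"name": "air-quality", "title": "Air quality measurements"},
                {"name": "budget-2013", "title": "Budget 2013"},
            ],
        },
    },
    "city-catalog": {
        "success": True,
        "result": {
            "count": 2,
            "results": [
                {"name": "air-quality", "title": "Air quality (city)"},
                {"name": "bike-lanes", "title": "Bike lane network"},
            ],
        },
    },
}

portal = aggregate(catalogs)
```

A real harvester would also have to map each catalog's fields onto a common metadata schema and page through large result sets, which this sketch omits.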

Research posters

Interest classification of Twitter users using Wikipedia 22
  Kwan Hui Lim; Amitava Datta
We present a framework for (automatically) classifying the relative interests of Twitter users using information from Wikipedia. Our proposed framework first uses Wikipedia to automatically classify a user's celebrity followings into various interest categories, followed by determining the relative interests of the user with a weighting compared to his/her other interests. Our preliminary evaluation on Twitter shows that this framework is able to correctly classify users' interests and that these users frequently converse about topics that reflect both their (detected) interest and a related real-life event.
A preliminary study on the effects of barnstars on Wikipedia editing 23
  Kwan Hui Lim; Amitava Datta; Michael Wise
This paper presents a preliminary study into the awarding of barnstars among Wikipedia editors to better understand their motivations in contributing to Wikipedia articles. We crawled the talk pages of all active Wikipedia editors and retrieved 21,299 barnstars that were awarded among 14,074 editors. In particular, we found that editors do not award and receive barnstars in equal (or similar) quantities. Also, editors were more active in editing articles before awarding or receiving barnstars.
A graphical user interface for SILK data link discovery framework 24
  Rina Singh; Jan Hidders; Feng Xia; Jialiang Kang
In the field of linked data, interlinking previously unlinked datasets that are available on the linked open cloud is still a big challenge. Silk is one of the tools that allow one to do interlinking between data items within different linked data sources. The main goal of our work is to simplify the process of specifying linking conditions using Silk. Specifying the correct conditions in the Silk-LSL is a complex task where a helpful interface can make a large difference. In this work, we propose Silk Magic as a useful tool that is capable of guiding users through the process of specifying linking conditions during the creation of a Silk-LSL program and hence simplifies the task of writing Silk-LSL programs. The tool can, for example, display the conditions as an interactive tree, and it offers suggestions for class selection conditions and step expressions in path expressions.

Research-in-progress presentations

Impact of social features implemented in open collaboration platforms on volunteer self-organization: case study of open source software development 25
  Junghong Choi; Jinwoo Kim; Bruce Ferwerda; Jae Yun Moon; Jungpil Hahn
The promise of collective intelligence emerging from voluntary participation, contribution and knowledge sharing brought about by ubiquitous information and communication technologies has recently attracted the attention of academics and practitioners alike. Of many related phenomena, open source software (OSS) development has been touted as one of the leading examples that speak to the potential of collective intelligence. Recently, the advent of novel open collaboration platforms for open source software development, such as GitHub, has prompted researchers to examine the impact of increased work transparency induced by the introduction of social features on voluntary self-organization and allocation of resources to projects. We present both qualitative and quantitative analyses from which we derive some initial propositions regarding the impact of transparency on voluntary self-organization processes and decision mechanisms.
How do Baidu Baike and Chinese Wikipedia filter contribution?: a case study of network gatekeeping 26
  Han-Teng Liao
Though open collaboration websites such as Wikipedia have attracted attention for their more inclusive and participatory potential, it has become increasingly clear that certain information filtering/control or gatekeeping mechanisms are in place to render them manageable. Applying the network gatekeeping theory, this paper presents a case study of Baidu Baike and Chinese Wikipedia, focusing on their editorial policies and practices, which have not been systematically examined. Through a detailed analysis of editorial priorities, power users, and geo-linguistic arrangements governing how, by whom, and for whom which types of information are processed, the findings show different bases and salience components for distinct network gatekeeping processes. In Chinese Wikipedia, filtering copyright-dubious materials and accommodating Chinese geo-linguistic variants are more salient, whereas censoring politically-sensitive content and enforcing a national cultural political framework of the People's Republic of China are more salient in Baidu Baike. The usefulness and limitations of applying network gatekeeping theory to open collaboration websites are discussed.
How does localization influence online visibility of user-generated encyclopedias?: a study on Chinese-language search engine result pages (SERPs) 27
  Han-Teng Liao
Prior empirical and theoretical work has discussed the role that dominant search engines play in information gatekeeping on the Web, and there are reports on the high ranking of the Wikipedia website among search engine result pages (SERPs). However, little research has been conducted on non-Google search engines and non-English versions of user-generated encyclopedias. This paper proposes a method to quantify the "display" gatekeeping differences of the SERP ranking and presents findings based on the Chinese SERP data. Based on 2,500 mainly-Chinese-language search queries, the data set includes the SERP outcomes for four Chinese-speaking regions (mainland China, Singapore, Hong Kong and Taiwan) provided by three major search engines (Baidu, Google and Yahoo), covering over 97% of the search engine market in each region. The findings, analysed and visualized using network analysis techniques, demonstrate the following: major user-generated encyclopedias are among the most visible; localization factors matter (certain search engine variants produce the most divergent outcomes, especially mainland Chinese ones). The indicated strong effects of "network gatekeeping" by search engines also suggest similar dynamics inside user-generated encyclopedias.
Automated decision support for human tasks in a collaborative system: the case of deletion in Wikipedia 28
  Bluma S. Gelley; Torsten Suel
Wikipedia's low barriers to participation have the unintended effect of attracting a large number of articles whose topics do not meet Wikipedia's inclusion standards. Many are quickly deleted, often causing their creators to stop contributing to the site. We collect and make available several datasets of deleted articles, heretofore inaccessible, and use them to create a model that can predict with high precision whether or not an article will be deleted. We report precision of 98.6% and recall of 97.5% in the best case and high precision with lower, but still useful, recall, in the most difficult case. We propose to deploy a system utilizing this model on Wikipedia as a set of decision-support tools to help article creators evaluate and improve their articles before posting, and to help new-article patrollers make more informed decisions about which articles to delete and which to improve.
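For readers who want to compare the best-case figures above with results reported as an F-measure, the standard formula combines precision and recall as their weighted harmonic mean. The Python sketch below applies it to the stated 98.6% and 97.5%; the resulting value of roughly 0.98 is derived arithmetic, not a number reported in the abstract, and the function name is an illustrative choice.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F1 when beta is 1)."""
    if precision == 0 and recall == 0:
        return 0.0
    beta_sq = beta * beta
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Best-case precision and recall as stated in the abstract.
best_case_f1 = f_measure(0.986, 0.975)
```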
Are memory institutions ready for open data and crowdsourcing?: results of a pilot survey from Switzerland 29
  Beat Estermann
Since the advent of the World Wide Web, the cultural heritage sector has undergone a series of changes. In a pilot survey among memory institutions (galleries, libraries, archives, museums) in Switzerland we have focused on two recent trends -- open data and crowdsourcing -- asking to what extent heritage institutions are ready to adopt open data policies and to embrace crowdsourcing strategies. The results suggest that so far, only very few institutions have adopted an open data policy. There are however signs that this may soon change: A majority of the surveyed institutions considers open data as important and believes that the opportunities prevail over the risks. Some obstacles however still need to be overcome, in particular the institutions' reservations with regard to "free" licensing and their fear of losing control. With regard to crowdsourcing the data suggest that the adoption process will be slower than for open data. Although approximately 10% of the responding institutions seem already to experiment with crowdsourcing, there is no general breakthrough in sight, as a majority of respondents remain skeptical with regard to the benefits.

Doctoral symposium

Drawing the big picture: analyzing FLOSS collaboration with temporal social network analysis BIBAFull-Text 30
  Amir Azarbakht
How can we understand FOSS collaboration better? Can social issues that emerge be identified and addressed before it is too late? Can the community heal itself, become more transparent and inclusive, and promote diversity? We propose a technique to address these issues by quantitative analysis of social dynamics in FOSS communities. We propose using social network analysis metrics to identify growth patterns and unhealthy dynamics, giving the community a heads-up while it can still take action to ensure the sustainability of the project.
Cyberactivism and nationalistic communicative actions of publics: framing and agenda-building over Wikipedia in international disputes BIBAFull-Text 31
  Tam Laishan
Unlike other information processing theories, the Situational Theory of Problem Solving (STOPS) proposes that the underlying goal of communication is problem solving rather than decision making. Whether and how individuals become engaged in the processes of information acquisition, selection and transmission depends on whether they find an issue to be problematic (problem recognition), perceive themselves to be involved in the issue (involvement recognition), feel constrained about resolving the issue (constraint recognition) and have the applicable knowledge to deal with the issue (referent criterion). Using the Wikipedia page of "Senkaku Islands dispute" as a case, the present study seeks to examine how individuals become motivated to engage in communicative actions to co-construct an agenda about an ongoing international dispute between the Chinese and Japanese governments. Based on data collected using textual and content analysis of the "article" page, the "talk" page, the "view history" page, the "references" section, the "sources" section and the "external links" section, the present study seeks to redefine both the independent and dependent variables in STOPS and discusses the significance of Wikipedia, as an international platform for the co-construction of agendas, for the expression of nationalistic sentiments towards international disputes.
Wikipedia as a new media institution: issues of diversity, regulation and sustainability in an open encyclopedia BIBAFull-Text 32
  Kim Osman
Wikipedia is an important institution and part of the new media landscape having evolved from the collaborative efforts of millions of distributed users. However it appears to be facing the danger that it has become so rule-bound and entrenched in its processes that it has become increasingly difficult to respond reflexively to change. Evidently it is also facing challenges in creating a culture that attracts and retains new editors to ensure its viability as an open and collaboratively produced encyclopaedia.
   This research aims to bring existing scholarship on Wikipedia as a successful open community together with more critical accounts of diversity in open communities. It will examine how the issues that have been highlighted by conflict within the community have shaped the evolution of Wikipedia from an open wiki experiment to a global knowledge producer. It also aims to develop new theories to account for change in the new media landscape by combining the concepts of interpretive flexibility and generative friction with existing theories on the evolution of institutions.
   Finally, the research will present possible futures for Wikipedia as part of not only the larger Wikimedia movement, but of an open and accessible web.
Wiki development environments BIBAFull-Text 33
  Christoph Hannebauer
The doctoral thesis Wiki Development Environments analyzes contribution barriers to Free, Libre and Open Source Software (FLOSS) projects. Contribution barriers exist between particular subgroups within the community around a FLOSS project. Contribution barriers include social and technical factors. The hurdles that constitute the contribution barrier to becoming a co-developer receive special emphasis.
   The doctoral thesis also describes a pattern language for maintainers of FLOSS projects. The patterns in this pattern language describe practices that lower the contribution barriers in FLOSS projects that employ the patterns. The doctoral thesis includes a novel approach that minimizes contribution barriers. This approach comprises the combination of a wiki system and an Integrated Development Environment (IDE) into a Wiki Development Environment (WikiDE). A WikiDE is a web-based code editor that allows anonymous users to edit source code and contribute it to a FLOSS project. From the pattern language perspective, a WikiDE helps to realize some of the patterns described earlier and amplifies their effect.
   Editing the source code of software differs from editing text in a natural language. WikiDE realizations must take these differences into account. This imposes challenges on WikiDE realization that exceed the requirements of IDEs and of wiki systems for natural language text.
The dynamics of gatekeeping in online collaborative systems BIBAFull-Text 34
  Bluma Gelley
In this thesis, he explores the dynamics of gatekeeping in online peer-production communities, such as Wikipedia, Stack Overflow, and various open-source software projects. How are standards set for content inclusion, and what mechanisms are used to find and remove content deemed inappropriate by the community? The subject of vandalism, or the malicious addition of "bad" content, has been widely studied. Here, he focuses on content added in good faith that has nevertheless been judged inappropriate for the specific community. He also explores the possibility of using various automated tools to assist humans in gatekeeping tasks.
The dual role of conflict in free and open source software development BIBAFull-Text 35
  Anna Filippova
The voluntary and computer mediated nature of FOSS work presents unique challenges and opportunities for effective collaboration. Conflict is one such challenge, magnified by the distributed nature of work and limited communication channels. Though conflict is recognized as an important social process in FOSS development teams, few studies have adequately addressed this issue. Drawing on theoretical frameworks in organizational behavior and social psychology, her dissertation investigates how conflict arises in voluntary distributed virtual teams such as FOSS, and its impact on group function. The work first explores the emergence and experience of conflict during the life cycle of a project. Different types and sources of conflict are identified, as well as their relationship with group outcomes. Various conflict types are expected to affect group function differently: some conflict sources may present a challenge, while others may prove necessary for successful group function. The dissertation expands theory and research on distributed work, in examining on-going processes of conflict in voluntary teams. This work also informs community design, as understanding conflict antecedents in voluntary virtual teams aids in reducing unproductive conflict and facilitates conflict that spurs innovation.

Invited talks

The era of open BIBAFull-Text 36
  Philip E. Bourne
We are definitely in the open era. What does that mean and what does the future hold? I will provide a practitioner's perspective on these questions as someone involved in running widely used biological databases, a producer of open source software and as founding editor in chief of an open access journal from the Public Library of Science (PLOS). In a nutshell it means profound change in the way we educate, collaborate, disseminate and comprehend. I look forward to an open dialog on the details of what that really means, drawing on some examples from my own experiences.
Let's raise kids up! BIBAFull-Text 37
  Pockey Lam
Pockey's work ranges from refurbishing old computers donated by companies and installing GNU/Linux and Free Software on them to developing and deploying open education resources and packaging free educational software for migrant workers' schools in Beijing. She intends to share her experience of doing those things and hopes to get more people to review, improve and deliver this content to children who need it.
Descending Mount Everest: steps towards applied Wikipedia research BIBAFull-Text 38
  Dario Taraborelli
Over the last few years, Wikipedia has seen an explosion of academic interest, as indicated by a steadily increasing volume of scholarly publications. Due to its history, its size and the immediate availability of its data under open licenses, Wikipedia has served over time as a testbed for sociological and psychological theory; as the primary source of data for models of commons-based peer production and computer-supported collaboration; as a body of norms for research on the governance of online communities; or as a large multilingual corpus to mine, or against which to train text analysis algorithms. This explosion of academic interest reveals a gap between Wikipedia as a topic of scholarly research and Wikipedia as a living community in need of actionable solutions, facing real challenges and the first serious growth and sustainability problem in its entire lifecycle. The Wikimedia Foundation and the Wikimedia communities have yet to find a viable model to leverage academic expertise to solve these challenges, in the same way that Wikimedia projects have effectively engaged with a large community of contributors and software developers to produce their contents and support their open source infrastructure. In this talk I will review recent research trends spanning scholarly work and internal research conducted at the Wikimedia Foundation, and how these relate to some of the most urgent needs of the Wikimedia movement and the Wikimedia Foundation's work priorities. I'll discuss models that can support actionable research, as well as open opportunities for researchers and contributors to collaborate on developing joint solutions and identifying new growth opportunities for Wikipedia and its communities.
Implementing open licensing in government open data initiatives: a review of Australian government practice BIBAFull-Text 39
  Anne Fitzgerald; Neale Hooper; John S. Cook
As support grows for greater access to information and data held by governments, so does awareness of the need for appropriate policy, technical and legal frameworks to achieve the desired economic and societal outcomes. Since the late 2000s numerous international organizations, inter-governmental bodies and governments have issued open government data policies, which set out key principles underpinning access to, and the release and reuse of data. These policies reiterate the value of government data and establish the default position that it should be openly accessible to the public under transparent and non-discriminatory conditions, which are conducive to innovative reuse of the data. A key principle stated in open government data policies is that legal rights in government information must be exercised in a manner that is consistent with and supports the open accessibility and reusability of the data. In particular, where government information and data is protected by copyright, access should be provided under licensing terms which clearly permit its reuse and dissemination. This principle has been further developed in the policies issued by Australian Governments into a specific requirement that Government agencies are to apply the Creative Commons Attribution licence (CC BY) as the default licensing position when releasing government information and data. A wide-ranging survey of the practices of Australian Government agencies in managing their information and data, commissioned by the Office of the Australian Information Commissioner in 2012, provides valuable insights into progress towards the achievement of open government policy objectives and the adoption of open licensing practices. The survey results indicate that Australian Government agencies are embracing open access and a proactive disclosure culture and that open licensing under Creative Commons licences is increasingly prevalent. 
   However, the finding that '[t]he default position of open access licensing is not clearly or robustly stated, nor properly reflected in the practice of Government agencies' points to the need to further develop the policy framework and the principles governing information access and reuse, and to provide practical guidance tools on open licensing if the broadest range of government information and data is to be made available for innovative reuse.

Experience reports

A triangulated investigation of using wiki for project-based learning in different undergraduate disciplines BIBAFull-Text 40
  Edwin H. Y. Chu; Michele Notari; Katherine Chen; Chi Keung Chan; Samuel K. W. Chu; Wendy W. Y. Wu
This study investigates the use of wiki to support project-based learning (PBL) in 3 undergraduate courses of different disciplines: English Language Studies, Information Management, and Mechanical Engineering. This study takes a methodological triangulation approach that employs the use of questionnaires, interviews, and wiki activity logs. The level of activities and the types of core actions captured on wiki varied among the three groups of students. Students generally rated positively on the use of wiki to support PBL, while significant differences were found on 9 items (especially in the "Motivation" and "Knowledge Management" dimensions of the questionnaire) among students in the three different disciplines. Interviews revealed that these differences may be attributable to the variations in the natures and scopes of the PBL, as well as in the different emphases that students placed on the work presented on the wiki. This study may provide directions on the use of wiki in PBL in undergraduate courses.
Empowering formative assessment using embedded web widgets in wikis BIBAFull-Text 41
  Michele Notari; Sonja Schär; Samuel Kai Wah Chu; Martin Schellenberg
In this article we describe how we developed and how we use a tool for teachers that enhances inter-group collaboration among learners using wikis in project-based learning settings with over 100 participants, where different groups of students develop similar projects and each project has its own wiki page. To achieve our goal we extended typical wiki functionality by using web widgets, mini applications embedded anywhere in the wiki environment using the iframe tag.
   Two different evaluation widgets (a 'rating' widget and a 'working progress' widget) are placed on each of the project pages. The project groups use the 'working progress' widget to declare the amount of work done. The teacher and the rest of the learning community use the 'rating' widget to rate the ongoing project work. A so-called 'meta widget', showing a summary of the results of the 'rating' and 'working progress' widgets, can be displayed on the start page of the learning community or, if a project is divided into different milestones, on the page describing the goals and timeline for the milestone. The evaluation widgets and the meta widget, which the teacher can easily embed anywhere in the wiki pages, make the quality and degree of completion of a project more visible and so enhance the opportunities for self, tutor and peer review in such large-scale project-based learning settings. The evaluation widgets and meta widgets created have been embedded in the wiki of a three-month curriculum. The evaluation of the utility and usability of the widgets is ongoing. The educational value of rating and reflecting on the working progress of a given task is discussed.
Involve the users to increase their acceptance: an experience report BIBAFull-Text 42
  Angelika Mühlbauer; Kai Nissen
Wikipedia is a top-ten web site providing a community-driven free encyclopedia. Its success depends on the support of its volunteer contributors. Wikipedia is also a research object in several academic fields. Wikimedia Deutschland, the German Wikimedia Chapter, is a partner in the EU-funded international research project "RENDER -- Reflecting knowledge diversity". With this participation we aim to support Wikipedia authors in editing and in understanding the status of articles. This experience report focuses on our interaction with the Wikipedia community, in particular the German-speaking one -- less on the project and its results. We reached out to members of the Wikipedia community in several ways. In addition to the online channels, live meetings are of particular importance in building up an interested and active community. During our project, we learned that it is very important to involve the users at an early stage. That helps to increase their acceptance of the project and their willingness to support it. If Wikipedians can see a benefit of research results and developments for their daily life in Wikipedia or the advancement of the whole project, they will be more willing to give innovations a try.

Community demos

Data twist: an experimental script family to twist open data into new shapes BIBAFull-Text 43
  Shane Coughlan; Kana Fukuma; Tetsuo Noda; Yasumichi Hanagata
Data Twist is a project to help people use Open Data to make directories. It is a project that helps anyone create open versions of Yelp (tm) or TripAdvisor (tm). Data Twist acts as a foundation for open directories by importing OpenStreetMap XML data into Wordpress. Data Twist has a few dependencies. One is Wordpress. Another is Geo Mashup, a plug-in that allows you to store geo-references with each Wordpress post. This demonstration will show how Data Twist works and explain where the development effort will focus next.
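   The import step described above can be illustrated with a minimal sketch. It is not Data Twist's own code: the function name osm_nodes_to_entries, the output field names and the sample data are hypothetical, and the sketch only assumes the standard OpenStreetMap XML layout, in which each node element carries lat/lon attributes and key/value tag children. Pushing the resulting entries into Wordpress is deliberately left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the standard OpenStreetMap XML layout.
SAMPLE_OSM = """<osm version="0.6">
  <node id="1" lat="35.4681" lon="133.0484">
    <tag k="name" v="Example Cafe"/>
    <tag k="amenity" v="cafe"/>
  </node>
  <node id="2" lat="35.4700" lon="133.0500"/>
</osm>
"""


def osm_nodes_to_entries(xml_text):
    """Turn named OSM nodes into directory-style entries (illustrative only)."""
    root = ET.fromstring(xml_text)
    entries = []
    for node in root.iter("node"):
        # Collect the node's key/value tags into a plain dict.
        tags = {tag.get("k"): tag.get("v") for tag in node.findall("tag")}
        if "name" not in tags:
            continue  # unnamed nodes make poor directory entries
        entries.append({
            "title": tags["name"],
            "category": tags.get("amenity", ""),
            "lat": float(node.get("lat")),
            "lon": float(node.get("lon")),
        })
    return entries


if __name__ == "__main__":
    for entry in osm_nodes_to_entries(SAMPLE_OSM):
        print(entry)
```

   Each entry pairs a title with a geo-reference, which is the kind of information a plug-in such as Geo Mashup stores alongside a post.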
Basic techniques in text mining using open-source tools BIBAFull-Text 44
  Jun Iio
There are many text mining tools provided commercially and non-commercially. However, elementary text-based analysis can be done with basic Unix commands, shell scripts, and small programs in scripting languages, instead of using such extensive software. This paper introduces the basic techniques for text mining, using a combination of a set of standard commands, small pieces of code, and generic tools provided as open-source software. The target of the analysis is sixty-seven articles written by one author in a relay column since 1998. Several text-based analyses reveal how the author's interests have shifted over about fifteen years. In addition, at the end of this paper, the results of text-based analysis are compared with those of non-text-based analysis and the efficiency of non-parametric analysis is discussed.
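   As an illustration of how little code elementary text analysis needs, the following minimal sketch counts word frequencies, roughly what a tr | sort | uniq -c | sort -rn shell pipeline would produce. It is written in Python rather than the Unix commands the paper describes, and the function names word_counts and top_terms are hypothetical, not taken from the paper.

```python
import re
from collections import Counter


def word_counts(text):
    """Count lower-cased word occurrences in a piece of text."""
    # Words are runs of letters and apostrophes; everything else separates them.
    return Counter(re.findall(r"[a-z']+", text.lower()))


def top_terms(text, n=3):
    """Return the n most frequent words with their counts."""
    return word_counts(text).most_common(n)


if __name__ == "__main__":
    sample = "Open source, open data. Open!"
    print(top_terms(sample, 1))
```

   Running the same count over the articles of each year and comparing the top terms is one simple way to look for a shift of interest over time.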
R-tools: mediawiki extension for full-scale statistical computing BIBAFull-Text 45
  Juha Villman; Einari Happonen
Wiki systems have proven to be good for producing text and knowledge in a collaborative manner, but they are not designed to handle large amounts of numerical data. We needed a system that is capable of producing text and running calculations on datasets. For this purpose we created Opasnet, which is a Mediawiki with an integrated statistical computing extension and an external database for data. In our demonstration we will show how R (statistical software) can be integrated into Mediawiki as an extension (R-Tools) and how it can be used directly from wiki pages. This extension enables users to write R code, run it and see the results of the calculation on the wiki page. R-Tools can use data from external databases, and this functionality is also demonstrated. The first R-Tools demonstration was held at Wikisym 2012 in Linz. Now we will focus on the new features developed since then.
Automated metrics for wiki-based school assignments BIBAFull-Text 46
  Oren Bochman
Based on the hypothesis that, to function effectively, education in massive CSCW systems should follow the form of a successful crowdsourcing platform, I have integrated the Moodle LMS to work more tightly with the Wikipedia software. Instructors are empowered in this LMS to assign Wikipedia editing tasks to individuals or groups. Students are automatically assessed using metrics for content development, social interaction and communication ability. To further engage students and teachers, the learning environment is gamified using a format derived from a number of briefly surveyed essays currently used in Wikipedia. To support students in a newcomer social role, a course on editing Wikipedia and its social conundrums is provided.

Community tutorials

Video co-creation in collaborative online communities BIBAFull-Text 47
  Andrew Lih
This tutorial session addresses issues of multimedia and video co-creation in a wiki environment, using Wikipedia and Wikimedia Commons as an example.
   Did you know that while English Wikipedia has over 4 million articles, only 0.1% of them have video? In 2013, the Wiki Makes Video project was started by Andrew Lih to encourage more video content creation. It was designed to further the work of earlier efforts, such as Lights Camera Wiki and Kaltura's open source collaborative video editor, that had started before 2010 but have stalled in recent years.
   We demonstrate the challenges and best practices for video creation within a collaborative community, and how this project has used the idea of "video patterns" as an analog to "programming patterns" that originally spurred the development of wikis in the 1990s.
   Topics include:
  • Measuring the quantity and quality of video in Wikipedia's articles
  • How better video can be generated by contributors through teaching visual
       literacy using video patterns
  • Overview of open technical standards being employed in Wikimedia Commons, and
       how to use them
  • Researching and transcoding content from existing video repositories
       (Internet Archive, Library of Congress, et al.)
  • What barriers to participation might be, and how to encourage more visual