Proceedings of the 2014 International Symposium on Open Collaboration

Fullname:Proceedings of the 10th International Symposium on Open Collaboration
Editors:Dirk Riehle; Jesus M. Gonzalez-Barahona; Gregorio Robles; Kathrin M. Möslein; Ina Schieferdecker; Ulrike Cress; Astrid Wichmann; Brent Hecht; Nicolas Jullien
Location:Berlin, Germany
Dates:2014-Aug-27 to 2014-Aug-29
Standard No:ISBN: 978-1-4503-3016-9; ACM DL: Table of Contents; hcibib: ISW14
The Impact of Automatic Crash Reports on Bug Triaging and Development in Mozilla
  Iftekhar Ahmed; Nitin Mohan; Carlos Jensen
Free/Open Source Software projects often rely on users submitting bug reports. However, reports submitted by novice users may lack information critical to developers, and the process may be intimidating and difficult. To gather more and better data, projects deploy automatic crash reporting tools, which capture stack traces and memory dumps when a crash occurs. These systems potentially generate large volumes of data, which may overwhelm developers, and their presence may discourage users from submitting traditional bug reports. In this paper, we examine Mozilla's automatic crash reporting system and how it affects their bug triaging process. We find that fewer than 0.00009% of crash reports end up in a bug report, but as many as 2.33% of bug reports have data from crash reports added. Feedback from developers shows that despite some problems, these systems are valuable. We conclude with a discussion of the pros and cons of automatic crash reporting systems.
Socio-Technical Congruence in the Ruby Ecosystem
  M. M. Mahbubul Syeed; Klaus Marius Hansen; Imed Hammouda; Konstantinos Manikas
Existing studies show that open source projects may enjoy high levels of socio-technical congruence despite their open and distributed character. Such observations are yet to be confirmed in the case of larger open source ecosystems in which developers contribute to different projects within the ecosystem. In this paper, we empirically study the relationships between the developer coordination activities and the project dependency structure in the Ruby ecosystem. Our motivation is to verify whether the ecosystem context maintains the high socio-technical congruence levels observed in many smaller scale FLOSS (Free/Libre Open Source Software) projects. Our study results show that the collaboration pattern among the developers in the Ruby ecosystem is not necessarily shaped by the communication needs indicated by the dependencies among the ecosystem projects.
On influences between software standards and their implementations in open source projects: Experiences from RDFa and its implementation in Drupal
  Björn Lundell; Jonas Gamalielsson; Alexander Grahn; Jonas Feist; Tomas Gustavsson; Henrik Strindberg
It is widely acknowledged that standards implemented in open source software can reduce the risk of lock-in, improve interoperability, and promote competition on the market. However, there is limited knowledge concerning the relationship between standards and their implementations in open source software. This paper reports on an investigation of influences between software standards and open source software implementations of software standards. The study focuses on the RDFa standard and its implementation in the Drupal project. Specifically, issues in the W3C issue trackers for RDFa and the Drupal issue tracker for RDFa have been analysed. Findings show that there is clear evidence of reciprocal action between RDFa and its implementation in Drupal. The study contributes novel insights concerning effective processes for development and long-term maintenance of software standards and their implementations in open source projects.
Utilization and Development Contribution of Open Source Software in Japanese IT Companies: An Exploratory Study of the Effect on Business Growth
  Terutaka Tansho; Tetsuo Noda
The use of Open Source Software (OSS) has become more common in recent years, and OSS is now utilized in a wide range of business fields, not only in the IT industry. Behind this expansion are OSS development communities, where volunteer engineers dedicate their time and effort to improving the software. Considering development engineers in companies as input resources, it is important to investigate the output in terms of business growth. In this study, we conducted a questionnaire survey of Japanese IT companies in 2013, and then analyzed the present state of, and the relation between, OSS utilization and development contribution. Our study revealed that Japanese IT companies are largely free riders of OSS: the volume of development contributions is far smaller than that of utilization. With regard to the effect on business growth, the results of the correlation analysis imply that OSS utilization is related to sales growth in the present term and that development contribution is related to future growth in the number of employees in the company. To explore the direct effect on business growth, we constructed multiple-logistic and logistic models; however, the analyses found no direct and explicit determinants. Our efforts to investigate the effect of OSS on business growth are still in progress, but it is meaningful to present the current state in numbers, and we hope this will lay some foundation for further study in this field.
Older Adults and Free/Open Source Software: A Diary Study of First-Time Contributors
  Jennifer L. Davidson; Umme Ayda Mannan; Rithika Naik; Ishneet Dua; Carlos Jensen
The global population is aging rapidly, and older adults are becoming increasingly technically savvy. This paper explores ways to engage these individuals to contribute to free/open source software (FOSS) projects. We conducted a pilot diary study to explore motivations, barriers, and the contribution processes of first-time contributors in a real time, qualitative manner. In addition, we measured their self-efficacy before and after their participation. We found that what drove participants were intrinsic motivations, altruism, and internal values, which differed from previous work with older adults and with the general FOSS population. We also found that self-efficacy did not change significantly, even when participants encountered significant barriers or setbacks. The top 3 barriers were lack of communication, installation issues, and documentation issues. We found that asking for and receiving help, and avoiding difficult development environments were more likely to lead to success. To verify these results, we encourage a future large-scale diary study that involves multiple demographics. Given our pilot study, we recommend that future outreach efforts involving older adults focus on how to effectively communicate and build community amongst older contributors.
Hackers on Forking
  Linus Nyman
All open source licenses allow the copying of an existing body of code for use as the basis of a separate development project. This practice is commonly known as forking the code. This paper presents the results of a study in which 11 programmers were interviewed about their opinions on the right to fork and the impact of forking on open source software development.
   The results show that there is a general consensus among programmers' views regarding both the favourable and unfavourable aspects that stem from the right to fork. Interestingly, while all programmers noted potential downsides to the right to fork, it was seen by all as an integral component of open source software, and a right that must not be infringed regardless of circumstance or outcome.
Initial Results from the Study of the Open Source Sector in Belgium
  Robert Viseur
The economy of FLOSS (Free and open source software) has been the subject of numerous studies and publications, particularly on the issue of business models. However, there are fewer studies on the local networks of FLOSS providers. This research focuses on the ecosystem of Belgian FLOSS providers and, more specifically, their geographical distribution, the activities, technologies and software they support, their business models, their economic performance and the relationships between companies. The research is based on a directory containing nearly 150 companies. This directory led to the creation of a specialized search engine that helped to improve annotation. The research also uses financial data provided by the Belgian Central Balance Sheet Office. The initial results of this study show a concentration in major economic areas. The businesses are more active in services and are heavily involved in activities such as infrastructure software and Web development, activities which were common in the early years of free software development. Services for the support of business software are also common. A first analysis of the graph of relationships between providers' websites highlights the role that is played by the multinational IT companies, by FLOSS editors, by commercial FLOSS associations and especially by the Walloon centers of competence that offer vast training catalogs that are dedicated to FLOSS. This research opens up many perspectives for improving the automation of the company directory updates, the analysis of the relationship between enterprises, and the automation of the financial analysis of companies.
Filling the Gaps of Development Logs and Bug Issue Data
  Bilyaminu Auwal Romo; Andrea Capiluppi; Tracy Hall
It has been suggested that the data from bug repositories is not always in sync or complete compared to the logs detailing the actions of developers on source code.
   In this paper, we trace two sources of information relative to software bugs: the change logs of the actions of developers and the issues reported as bugs. The aim is to identify and quantify the discrepancies between the two sources in recording and storing the developer logs relative to bugs.
   Focussing on the databases produced by two mining software repository tools, CVSAnalY and Bicho, we use part of the SZZ algorithm to identify bugs and to compare how the "defects-fixing changes" are recorded in the two databases. We use a working example to show how to do so.
   The results indicate that a significant amount of information is not in sync when tracing bugs in the two databases. We therefore propose an automatic approach to re-align the two databases, so that the collected information is mirrored and in sync.
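   To make the comparison of the two sources concrete, the following is a minimal sketch of the linking step commonly associated with SZZ: bug identifiers are extracted from commit log messages and compared with the identifiers known to the issue tracker. The abstract does not say which part of SZZ the authors use, so the regular expression, the function names and the dictionary layout are assumptions for illustration only, not the CVSAnalY or Bicho schema.

```python
import re
from typing import Dict, Set

# Hypothetical pattern: "bug #12", "issue 41", "fixes 40", "fixed #7", ...
BUG_REF = re.compile(r"\b(?:bug|issue|fix(?:es|ed)?)\s*#?\s*(\d+)", re.IGNORECASE)


def referenced_bugs(message: str) -> Set[int]:
    """Return the bug ids mentioned in one commit log message."""
    return {int(number) for number in BUG_REF.findall(message)}


def compare_sources(commits: Dict[str, str], tracker_ids: Set[int]) -> Dict[str, Set[int]]:
    """Compare bug ids found in commit messages with ids in the tracker.

    commits maps a commit id to its log message (illustrative layout).
    """
    linked: Set[int] = set()
    for message in commits.values():
        linked |= referenced_bugs(message)
    return {
        "in_both": linked & tracker_ids,
        "only_in_commits": linked - tracker_ids,   # fix logged, bug not in tracker
        "only_in_tracker": tracker_ids - linked,   # bug tracked, no linked commit
    }
```

   The two "only_in" sets correspond to the kind of discrepancy the abstract describes; a re-alignment step would then decide how to reconcile them.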

The Contribution of Different Online Communities in Open Innovation Projects
  Michael A. Zeng
Online communities used as resource enlargement in open innovation processes are a promising concept. Yet, to date, few comparative studies on the characteristics of different online communities have been done. This paper identifies the cultures of innovation communities and brand communities in the environment of the Web 2.0 and shows how to use and further exploit their potential in different steps of open innovation projects. To analyze these online communities, an exploratory case study design with ten small- and medium-sized enterprises (SMEs) was chosen. All ten enterprises worked with the same innovation intermediary, which implemented an innovation community platform in a social network, and each possesses a brand community in the respective social network.
   The key findings suggest that the potential of both communities should be brought together and used as a harmonized strategy for open innovation and social media. Based on these findings, a conceptual framework was developed which illustrates how to integrate such online communities into each stage of a new product development process as well as to interconnect them.
Understanding Virtual Objects for Knowledge Creation in Communities
  Marc Marheineke; Hagen Habicht
In this paper we investigate the use of virtual objects for knowledge exchange in communities. Information systems provide a wide range of new (virtual) objects for community members which support non-canonical collaboration required for knowledge creation [1,2]. From a sociological perspective these objects are means to cross knowledge boundaries in communities [3]. In our study we extend this aspect by a technical perspective of how virtual objects effectively facilitate activities of knowledge creation. Media Synchronicity Theory [4] proposes how to best accomplish communication performance. It predicts that to achieve effective communication, the two primary communication strategies of conveyance of information and convergence on meaning need to be supported. Building upon this discussion, we examine the use of virtual objects in a dynamic process of knowledge creation. We will draw conclusions on how to appropriately use virtual objects for communication. Our empirical study is based on multiple cases [5] of knowledge communities. Qualitative data has been gathered from the participants of six focused group discussions conducted on a virtual whiteboard which comprises a media choice to interact in real time. The results detail information on the actual use (and non-use) of virtual objects (media) for knowledge creation. Based on our findings we empirically confirm the core propositions of Media Synchronicity Theory. We conclude with managerial recommendations on how to employ virtual objects for increasing the effectiveness of dynamic processes of knowledge creation.
Designing an Integrated Open Innovation System: Towards Organizational Wholeness
  Vasiliki Baka
Increasing use of collaborative technologies has transformed organizational dynamics in novel ways. In this paper, we adopt the principle of wholeness in designing an integrated open innovation system. We provide an overview of existing collaborative technologies and situate the proposed sociotechnical arrangement within the paradigm of open innovation. We explore how effectively technological platforms address emergent collaboration and innovation practices within and across organizations and to what extent existing technologies act as strategic catalysts of open innovation. We argue that in embracing wholeness and in treating technologies as inseparable constitutive parts of organizational architecture, we foster organizational and institutional collaboration and encourage innovative practices. The focus of the paper is on how the design of sociotechnical systems as wholes, that is, systems that are concurrently acting as corporate websites, internal collaboration spaces, extranets and social media aggregators, actively promotes open innovation in practice. We close with a presentation of six cases that are illustrative of how such a system could be applicable within the open innovation paradigm, namely, citizen participation, crowdsourcing and open innovation contests, open source innovation, reviews and social media, social enterprises and open teaching.
Not Only for Ideation, But Also for Signaling: Incorporating User-Profile-Webpages into Virtual Ideas Communities
  Ulrich Bretschneider; Philipp Ebel; Shkodran Zogaj; Jan Marco Leimeister
This research-in-progress paper describes the case of SAPiens, which is a Virtual Ideas Community (VIC). Typically, SAPiens -- and VICs in general -- focuses solely on supporting the ideation interactions among members. There is evidence from a survey that SAPiens members are also interested in actively signaling competences, experiences and skills to third parties. However, SAPiens does not offer IT functionalities that would allow for such signaling. Against this backdrop, we propose to enrich SAPiens through User Profile Webpages allowing SAPiens members to construct a public profile within the community and thereby to signal individual capabilities, skills and experiences. The aim of this action design research is to design such an IT artifact by building on signaling theory. After this initial design, our research constitutes a circular process of constant refinement as well as piloting and evaluation of the IT artifact in the real world setting of the SAPiens VIC.
Cross-fertilization vs. Collaboration in Simulations of Open Innovation
  Albrecht Fritzsche
Evolutionary models allow us to approach innovation by the means of computer simulation with genetic algorithms. Open innovation can be considered in these models in different ways. A popular model by David Goldberg connects re-combinations of elements during evolutionary processes with the exchange of information in cross-fertilization activities. Another possibility is to model the collaboration of contributors with specific skills and experiences through sophisticated change operators that work systematically on improvements with respect to certain aspects of the innovation context. A simulation of this procedure on an instance of the permutation flow shop scheduling problem shows that the usage of these operators can indeed increase the performance of the solution generation, if certain constraints are kept in consideration.
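   As an illustration of the problem setting named above, the sketch below computes the makespan of a job order in a permutation flow shop and applies one very simple change operator (adjacent swaps that are kept only when they shorten the makespan). The operator, the function names and the data layout are assumptions made for this example; the abstract does not describe the author's actual operators or instance.

```python
from typing import List, Sequence, Tuple


def makespan(order: Sequence[int], proc: List[List[int]]) -> int:
    """Completion time of the last job on the last machine.

    proc[j][m] is the processing time of job j on machine m; every job
    visits the machines in the same order (permutation flow shop).
    """
    machines = len(proc[0])
    done = [0] * machines  # completion time of the previous job per machine
    for job in order:
        prev = 0  # completion time of this job on the previous machine
        for m in range(machines):
            prev = max(prev, done[m]) + proc[job][m]
            done[m] = prev
    return done[-1]


def improve_by_swaps(order: Sequence[int], proc: List[List[int]]) -> Tuple[List[int], int]:
    """Hypothetical change operator: keep adjacent swaps that reduce the makespan."""
    best = list(order)
    best_cost = makespan(best, proc)
    improved = True
    while improved:
        improved = False
        for i in range(len(best) - 1):
            candidate = best[:]
            candidate[i], candidate[i + 1] = candidate[i + 1], candidate[i]
            cost = makespan(candidate, proc)
            if cost < best_cost:
                best, best_cost = candidate, cost
                improved = True
    return best, best_cost
```

   In a genetic algorithm, an operator of this kind would be applied to individuals alongside recombination; here it only shows how a targeted operator can improve one aspect of a candidate solution.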
Open Societal Innovation
  Jörn von Lucke
In this paper, the concept of open societal innovation is briefly described. Regarding government, administration and society, early pioneers have gained their first experiences in combining open innovation approaches with information technology. A compact analysis summarizes the strengths, weaknesses, opportunities and threats of this approach experienced so far in the public sector.

From Mashup Applications to Open Data Ecosystems
  Timo Aaltonen; Tommi Mikkonen; Heikki Peltola; Arto Salminen
Web-based software is available all over the world instantly after the online release. Applications can be used and updated without the need to install anything, with natural support for collaboration, which allows users to interact and share the same applications over the Web. In addition, numerous web services allowing users to upload, download, store and modify private and public resources have emerged. However, as the number of web services and devices used to consume as well as generate data has exploded, it is difficult to access and manage relevant data. In this paper, we start from the principles of mashups, relate their use to the concepts of software ecosystems, and finally extend the discussion to open data generated by users themselves. As a technical contribution, we also introduce our proof-of-concept implementation of a mashup system built on wellness data, and discuss the main lessons we have learned in the process.
The Social Shaping of Open Data through Administrative Processes
  Sirko Hunnius; Bernhard Krieger
Many models have been provided in recent years that aim at describing an optimal open data publication process. However, they fail to explain the different outcomes of open data initiatives. Based on qualitative research, this paper conceptualises the open data phenomenon as a set of techno-political arenas in which different interests of a variety of actors potentially and actually collide. The micro-political arena model constitutes an instrument to delineate the social and institutional context of open data that can be employed to explain the successes as well as the failures of individual open data projects.
Open Data for Air Transport Research: Dream or Reality?
  Marc Bourgois; Michael Sfyroeras
The role of open data in air transport research is analyzed by means of a sample of over 300 research articles. The most used (or available) data types, their sources and their access policies are identified, both for the US and the EU. The analyses show that 70% of research in air transport is heavily reliant on data, that 70% of the data sources are curated by governmental bodies and that the US publicizes a wider set of sources, leading to wider usage. Areas for improving accessibility of (mainly European) data sources are outlined and alternative avenues to obtain data are sketched. The fact that Europe is lagging considerably in making its sources readily available to the research community means that Europe is missing out on entrepreneurship, innovation and scientific discovery, the presumed benefits of open data.

Why Do Some Students Become More Engaged in Collaborative Wiki Writing? The Role of Sense of Relatedness
  Wilson W. T. Law; Ronnel B. King; Michele Notari; Eddie W. L. Cheng; Samuel K. W. Chu
This study aims to investigate the role of sense of relatedness in students' engagement in using wikis in collaborative writing. Hong Kong secondary school students (N = 422) participated in the study and answered questionnaires about their sense of relatedness and their level of engagement when using wikis for open collaborative project work. Results from the regression analyses showed that students' sense of relatedness with their teacher and peers facilitated their engagement in the collaborative wiki writing environment. The results were also consistent with the educational psychology research findings in a traditional classroom setting. Most importantly, the result from this study showed the possible linkage between IT in education research and the educational psychology literature. Implications of psychological factors on students' learning in technologically-enriched learning environments are discussed.
Investigating Incentives for Students to Provide Peer Feedback in a Semi-Open Online Course: An Experimental Study
  German Neubaum; Astrid Wichmann; Sabrina C. Eimler; Nicole C. Krämer
In open online learning courses such as MOOCs, peer feedback has been regarded as a powerful method to give elaborated feedback on weekly assignments. Yet motivating students to invest effort in peer feedback on top of their existing workload is difficult. Students might give insufficient feedback or not give feedback at all. Students' hesitation to provide feedback might be related to the lack of visibility of the effort spent during feedback provision. Alternatively, students might provide less feedback due to a lack of perceived benefits. In this study, we investigated the effect of two incentive types on peer feedback provision on weekly assignments. In total, 91 students enrolled in a semi-open online course were told that they would receive either (1) a peer rating on their feedback or (2) open access to assignment solutions or (3) no incentive. Results indicate that the incentive type did not affect feedback provision in general, yet it had an impact on the content of the feedback. Students receiving (1) a rating-feedback incentive wrote longer and more specific feedback in comparison to students receiving (2) an information-access incentive or (3) no incentive. Results contribute to findings from peer assessment research that students are more likely to provide detailed feedback if they feel that the feedback is attended to. Furthermore, results inform teachers and practitioners on how to encourage students to provide peer feedback in open learning environments.
Learning process analytics for a self-study class in a Semantic Mediawiki
  Daniel K. Schneider; Barbara Class; Kalliopi Benetos; Julien Da Costa; Valérie Follonier
We describe a framework and an implementation of learning process analytics for both learners and teachers to enhance a self-study class on psychological and educational theory. The environment is implemented in a Semantic MediaWiki using Semantic Forms and Semantic Result Formats. The design is in early development, but it is deployed and operational.

Cream of the crop: Elite contributors in an online community
  Katherine Panciera; Mikhil Masli; Loren Terveen
In open content communities like Wikipedia and StackOverflow and in open source software projects, a small proportion of users produce a majority of the content and take on much of the required community maintenance work. Understanding this class of users is crucial to creating and sustaining healthy communities. We carried out a mixed-method study of core contributors to the Cyclopath geographic wiki and bicycle routing web site. We present our findings and organize our discussion using concepts from activity theory. We found that the Cyclopath core contributors aren't the dedicated cyclists and that the characteristics of the community shape the site, the rules, and the tools for contributing. Additionally, we found that numerous aspects of the surrounding ecology of related systems and communities may help to shape how the site functions and views itself. We draw implications for future research and design from these findings.
Structured Wikis: Application Oriented Use Cases
  Stefan Voigt; Frank Fuchs-Kittowski; Andreas Gohr
Structured wikis combine the flexibility advantage of traditional wikis with the possibility of presenting structures and relationships in a partly automated fashion. Such wikis can, for example, map process structures and thus support complex processes. Taking the ICKEwiki as an example, this paper examines the differences between traditional and structured wikis by presenting four different real-life sample cases.
What do Chinese-language microblog users do with Baidu Baike and Chinese Wikipedia? A case study of information engagement
  Han-Teng Liao
This paper presents a case study of information engagement based on microblog posts gathered from Sina Weibo and Twitter that mentioned the two major Chinese-language user-generated encyclopaedias. The content analysis shows that microblog users not only engaged in public discussions by using and citing both encyclopaedias, but also shared their perceptions and experiences more generally with various online platforms and China's filtering/censorship regime to which user-generated content and activities are subjected. This exploratory study thus raises several research and practice questions on the links between public discussions and information engagement on user-generated platforms.

Information Evolution in Wikipedia
  Andrea Ceroni; Mihai Georgescu; Ujwal Gadiraju; Kaweh Djafari Naini; Marco Fisichella
The Web of data is constantly evolving based on the dynamics of its content. Current Web search engine technologies consider static collections and do not factor in explicitly or implicitly available temporal information that can be leveraged to gain insights into the dynamics of the data. In this paper, we hypothesize that by employing the temporal aspect as the primary means for capturing the evolution of entities, it is possible to provide entity-based accessibility to Web archives. We empirically show that the edit activity on Wikipedia can be exploited to provide evidence of the evolution of Wikipedia pages over time, both in terms of their content and in terms of their temporally defined relationships, classified in the literature as events. Finally, we present results from our extensive analysis of a dataset consisting of 31,998 Wikipedia pages describing politicians, and observations from in-depth case studies. Our findings reflect the usefulness of leveraging temporal information in order to study the evolution of entities and provide promising grounds for further research.
Bots vs. Wikipedians, Anons vs. Logged-Ins (Redux): A Global Study of Edit Activity on Wikipedia and Wikidata
  Thomas Steiner
Wikipedia is a global crowdsourced encyclopedia that at the time of writing is available in 287 languages. Wikidata is a likewise global crowdsourced knowledge base that provides shared facts to be used by Wikipedias. In the context of this research, we have developed an application and an underlying Application Programming Interface (API) capable of monitoring realtime edit activity of all language versions of Wikipedia and Wikidata. This application allows us to easily analyze edits in order to answer questions such as "Bots vs. Wikipedians, who edits more?", "Which is the most anonymously edited Wikipedia?", or "Who are the bots and what do they edit?". To the best of our knowledge, this is the first time such an analysis was done for Wikidata and for all Wikipedias -- large and small. According to our results, about 50% of the edits to all Wikipedias and Wikidata together are made by bots and about 23% by anonymous users. Wikidata alone accounts for about 48% of the total observed edits. If we do not consider Wikidata, i.e., if we only look at all Wikipedias, about 15% of all edits are made by bots and 26% of all edits are made by anonymous users. Overall, we found a stabilizing number of 274 active bots during our observation period. Our application is available publicly online at the URL http://wikipedia-edits.herokuapp.com/, and its code has been open-sourced under the Apache 2.0 license.
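   The shares reported above can be illustrated with a small tally over edit records. This is only a sketch, not the author's application or API: it assumes each edit is a dictionary with a hypothetical "user" name and "bot" flag, and that anonymous edits are attributed to an IP address in place of a user name.

```python
import ipaddress
from collections import Counter
from typing import Dict, Iterable

CATEGORIES = ("bot", "anonymous", "logged-in")


def is_ip(name: str) -> bool:
    """True if the user name parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(name)
        return True
    except ValueError:
        return False


def classify(edit: Dict) -> str:
    """Classify one edit record (field names are assumptions)."""
    if edit.get("bot"):
        return "bot"
    if is_ip(edit["user"]):
        return "anonymous"
    return "logged-in"


def shares(edits: Iterable[Dict]) -> Dict[str, float]:
    """Fraction of edits per category; all zeros for an empty input."""
    counts = Counter(classify(edit) for edit in edits)
    total = sum(counts.values())
    if total == 0:
        return {category: 0.0 for category in CATEGORIES}
    return {category: counts[category] / total for category in CATEGORIES}
```

   Running the same tally once over all records and once with the Wikidata records filtered out would give the two sets of percentages quoted in the abstract.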
Accept, decline, postpone: How newcomer productivity is reduced in English Wikipedia by pre-publication review
  Jodi Schneider; Bluma S. Gelley; Aaron Halfaker
Wikipedia needs to attract and retain newcomers while also increasing the quality of its content. Yet new Wikipedia users are disproportionately affected by the quality assurance mechanisms designed to thwart spammers and promoters. English Wikipedia's Articles for Creation provides a protected space for drafting new articles, which are reviewed against minimum quality guidelines before they are published. In this study we explore how this drafting process has affected the productivity of newcomers in Wikipedia. Using a mixed qualitative and quantitative approach, we show how the process's pre-publication review, which is intended to improve the success of newcomers, in fact decreases newcomer productivity in English Wikipedia and offer recommendations for system designers.
WikiBrain: Democratizing computation on Wikipedia
  Shilad Sen; Toby Jia-Jun Li; WikiBrain Team; Brent Hecht
Wikipedia is known for serving humans' informational needs. Over the past decade, the encyclopedic knowledge encoded in Wikipedia has also powerfully served computer systems. Leading algorithms in artificial intelligence, natural language processing, data mining, geographic information science, and many other fields analyze the text and structure of articles to build computational models of the world.
   Many software packages extract knowledge from Wikipedia. However, existing tools either (1) provide Wikipedia data, but not well-known Wikipedia-based algorithms or (2) narrowly focus on one such algorithm.
   This paper presents the WikiBrain software framework, an extensible Java-based platform that democratizes access to a range of Wikipedia-based algorithms and technologies. WikiBrain provides simple access to the diverse Wikipedia data needed for semantic algorithms and technologies, ranging from page views to Wikidata. In a few lines of code, a developer can use WikiBrain to access Wikipedia data and state-of-the-art algorithms. WikiBrain also enables researchers to extend Wikipedia-based algorithms and evaluate their extensions. WikiBrain promotes a new vision of the Wikipedia software ecosystem: every researcher and developer should have access to state-of-the-art Wikipedia-based technologies.
Consider the Redirect: A Missing Dimension of Wikipedia Research
  Benjamin Mako Hill; Aaron Shaw
Redirects are special pages in wikis that silently transport visitors to other pages. Although redirects make up a majority of all article pages in English Wikipedia, they have attracted very little attention and are rarely taken into account by researchers. This note describes redirects and illustrates why they play an important role in shaping activity in Wikipedia. We also present a novel longitudinal dataset of redirects for English Wikipedia and the software used to produce it. Using this dataset, we revisit several important published findings about Wikipedia to show that accounting for redirects can have important effects on research.
Chinese-language literature about Wikipedia: a meta-analysis of academic search engine result pages BIBAFull-Text 29
  Han-Teng Liao; Bin Zhang
This paper presents a webometric analysis of the academic search engine result pages (SERPs) for the Chinese-language term for "Wikipedia" across the major Chinese-speaking regions of mainland China, Hong Kong and Taiwan. Because the search results are academic in nature, the findings can also be interpreted for further meta-analysis, or "research about research", of Wikipedia research in Chinese-language literature. The findings cover the results from four major search platforms: CNKI Scholar, Google Scholar China, Google Scholar Hong Kong and Google Scholar Taiwan. Cross tabulation of the results shows the major institutions (journals and academic departments) and scholarly archives for Chinese-language Wikipedia research. The findings suggest that there exists a divide between mainland Chinese academic sources/search results on one hand, and Hong Kong/Taiwanese ones on the other. Meta-analysis based on academic SERPs has implications for identifying the gaps and potentials in the internationalization of Wikipedia research.

Supporting awareness of content-related controversies in a Wiki-based learning environment BIBAFull-Text 30
  Sven Heimbuch; Daniel Bodemer
User-generated content in Wikis is mainly distributed across the article view and its corresponding talk page. The potential of analysing and supporting discussants' knowledge construction processes at the level of talk pages has rarely been researched. The presented experimental study addresses this issue by providing external representations of content-related controversies that arose from contradictory evidence between discussants, in order to foster awareness of socio-cognitive conflicts, which can be beneficial for learning. Its aim is to investigate how increased salience of controversies can guide participants' (N = 81) navigation and learning processes. Three conditions differing in their degree of awareness support were implemented in this study. Results indicate that the implementation of awareness representations helped students to focus on meaningful discussion threads. Findings suggest that Wiki talk page users can benefit from additional structuring aids.
Reliability of User-Generated Data: the Case of Biographical Data in Wikipedia BIBAFull-Text 31
  Robert Viseur
Wikipedia is a collaborative multilingual encyclopedia launched in 2001. We previously conducted initial research on the extraction of biographical data about personalities from Belgium in order to build a large database of biographical data. However, the question of the reliability of the data arises. In particular, in the case of Wikipedia, the data are generated by users and could be subject to errors. Consequently, we wanted to answer the following question: are the data introduced in Wikipedia articles reliable? Our research is organized in four sections. The first section provides a brief state of the art on the reliability of user-generated data. The second section presents the methodology of our research. The third section presents the results. The error rate measured for birthdates is low (0.75%), although it is higher than the 0.21% that we observed for the baseline (reference sources). In the fourth section, the results are discussed.
Rhizome and Wikipedia: A humanities based approach towards a structural explanation of the namespace BIBAFull-Text 32
  Stephan Ligl
This paper describes similarities between two different rhizomes -- the philosophical according to Deleuze and Guattari with their six principles and the botanical, consisting of three principles -- and Wikipedia's main namespace and tries to compare them. As a conclusion it can be said that within Wikipedia's main namespace all the principles of both rhizomes can be observed.
Measuring the Quality of Edits to Wikipedia BIBAFull-Text 33
  Susan Biancani
Wikipedia is unique among reference works both in its scale and in the openness of its editing interface. The question of how it can achieve and maintain high-quality encyclopedic articles is an area of active research. In order to address this question, researchers need to build consensus around a sensible metric to assess the quality of contributions to articles. This measure must not only reflect an intuitive concept of "quality," but must also be scalable and run efficiently. Building on prior work in this area, this paper uses human raters through Amazon Mechanical Turk to validate an efficient, automated quality metric.
Contropedia -- the analysis and visualization of controversies in Wikipedia articles BIBAFull-Text 34
  Erik Borra; Esther Weltevrede; Paolo Ciuccarelli; Andreas Kaltenbrunner; David Laniado; Giovanni Magni; Michele Mauri; Richard Rogers; Tommaso Venturini
Collaborative content creation inevitably reaches situations where different points of view lead to conflict. In Wikipedia, one of the most prominent examples of collaboration online, conflict is mediated by both policy and software, and conflicts often reflect larger societal debates.
   Contropedia is a platform for the analysis and visualization of such controversies in Wikipedia. Controversy metrics are extracted from activity streams generated by edits to, and discussions about, individual articles and groups of related articles. An article's revision history and its corresponding discussion pages constitute two parallel streams of user interactions that, taken together, fully describe the process of the collaborative creation of an article. Our proposed platform, Contropedia, builds on state of the art techniques and extends current metrics for the analysis of both edit and discussion activity and visualizes these both as a layer on top of Wikipedia articles as well as a dashboard view presenting additional analytics. Furthermore, the combination of these two approaches allows for a deeper understanding of the substance, composition, actor alignment, trajectory and liveliness of controversies on Wikipedia.
   Our research aims to provide a better understanding of socio-technical phenomena that take place on the web and to equip citizens with tools to fully deploy the complexity of controversies. Contropedia is useful for the general public as well as user groups with specific interests such as scientists, students, data journalists, decision makers and media communicators.
   Contropedia can be found at http://contropedia.net.
Geographic and linguistic normalization: towards a better understanding of the geolinguistic dynamics of knowledge BIBAFull-Text 35
  Han-Teng Liao; Thomas Petzold
This paper proposes a method of geo-linguistic normalization to advance the existing comparative analysis of open collaborative communities, with multilingual Wikipedia projects as the example. Such normalization requires data regarding the potential users and/or resources of a geolinguistic unit.

Drupal as a Commons-Based Peer Production community: a sociological perspective BIBAFull-Text 36
  David Rozas
This research aims to extract a set of insights related to the dynamics, group decision making procedures, motivations to contribute and mechanisms employed in the coordination of Commons-Based Peer Production communities, using as a case study the community responsible for the development of the Free/Libre Open Source Software Drupal. A sociological perspective is taken for this purpose, and a set of qualitative and quantitative social research methods employed for the study of online communities (virtual ethnography) is being used.
Impact of Collaboration on Structural Software Quality BIBAFull-Text 37
The structural quality of a codebase is a key determining factor in the software's total cost of ownership, yet it is notoriously difficult to measure or predict. In this doctoral research we leverage the power of open source repositories to understand the factors that influence structural quality (and by extension fault-proneness) in the context of the patterns of collaborative behaviour exhibited by contributors. The objective is to further our understanding of how such behaviour impacts structural quality with the end goal being to inform management decision making across the industry in the pursuit of better software engineering practices.
"The Institutionalization of Digital Openness": How NGOs, Hackers and Civil Servants Organize Municipal Open Data Ecosystems BIBAFull-Text 38
  Maximilian Heimstädt
Around the world, national and municipal governments launch open data initiatives with declared goals like increased efficiency, transparency or economic growth. Although few of these effects have been proven, more and more administrations open up their datasets to the public. The dissertation project describes this phenomenon as the ongoing institutionalization of digital openness in the field of public sector information. With empirical evidence from three case studies in large European cities, the research project intends to theorize how NGOs, hackers and certain civil servants turn open data into an institution, which more and more public bodies feel the need to adapt to.
Understanding Coopetition in the Open-Source Arena: The Cases of WebKit and OpenStack BIBAFull-Text 39
  Jose Teixeira
In an era of software crisis, the move of firms towards distributed software development teams is being challenged by emerging collaboration issues. On this matter, the open-source phenomenon may shed some light, as successful cases of distributed collaboration in the open-source community have been recurrently reported. In our research we explore collaboration networks in the WebKit and OpenStack high-networked open-source projects, by mining their source-code version-control-systems data with Social Network Analysis (SNA). Our approach allows us to observe how key events in the industry affect open-source collaboration networks over time. With our findings, we highlight the explanatory power of network visualizations capturing the collaborative dynamics of high-networked software projects over time. Moreover, we argue that competing companies that sell similar products in the same market can collaborate in the open-source community while publicly manifesting intense rivalry (e.g. Apple vs Samsung patent-wars). After integrating our findings with the current body of theoretical knowledge in management strategy, economics, strategic alliances and coopetition, we propose the novel notion of open-coopetition, where rival firms collaborate with competitors in the open-source community. We argue that classical coopetition management theories do not fully explain the competitive and collaborative issues that are simultaneously present and interconnected in the WebKit and OpenStack open-source communities. We propose the development of the novel open-coopetition theory for a better understanding of how rival firms collaborate with competitors in an open-source manner.
Volunteer Attraction and Retention in Open Source Communities BIBAFull-Text 40
  Ann Barcomb
The importance of volunteers in open source has led to the position of community manager becoming more common in foundations and projects. Yet the advice for volunteer management and retention is fragmented, incomplete, contradictory, and has not been empirically examined. Our aim is to fill this gap by creating a comprehensive guidebook of best practices drawing from open source practitioner guides and general literature on volunteering, and to subject a subset of practices to empirical study. A method for evaluating volunteer attrition in terms of value to the organization will also be developed.

Let's Build the Road Network of Civic Tech BIBAFull-Text 41
  Stef van Grieken
Your awesome petition app is like a sports car without a freeway to drive on. Over the past several years we've built amazing civic apps that are improving public service delivery, engaging more citizens in the political process, and making governments more accountable around the world. But we're rapidly approaching a point common to all new public technologies: the need for common infrastructure to enable massive scale. This talk will discuss three tenets of civic technology that will take us towards a common framework, and present research and examples of work doing this today. It's time for developers, governments, corporations, academics, funders and citizens to come together and lay the groundwork for what's next.
How You Run a Meeting Says a Lot About Your Values: Participatory Practices for Open Communities BIBAFull-Text 42
  Michelle Thorne
Live events are some of the best ways to see the power dynamics and philosophical bent of a community. Many communities, open and closed, glorify sitting in a darkened room and being inspired by a sage on the stage. And then there are events about participation: making and learning with fellow participants around shared passions and interests. The session argues for the use of participatory methods at events as a way to manifest open values. We'll unpack some techniques and case studies, as well as practice ourselves.
Wikidata: How We Brought Structured Data to Wikipedia BIBAFull-Text 43
  Daniel Kinzler; Lydia Pintscher
Over the last two years we have been developing Wikidata and building up a community around it. Wikidata is Wikimedia's central repository for structured data. This is the place where data, like the number of inhabitants of a country, is stored and made accessible to humans and computers alike. The data is used across all 287 language editions of Wikipedia and its sister projects as well as in projects outside of Wikimedia. In this talk we will take a look at how we developed Wikidata, what great tools are being built on top of it and what is in store for the future.
Inner Source: Coming to a Company Near You Soon! BIBAFull-Text 44
  Klaas-Jan Stol
The nature of software development has changed significantly over the last decade or so, driven by trends such as an increasing level of software outsourcing, distributed development and collaborative development models. One such model of collaborative and distributed development that has attracted significant attention in both industry and research communities is that of Open Source. Open Source development seems to defy traditional wisdom in software development -- with a seeming absence of a predefined process, open source communities have produced high-quality and successful products. Increasingly, large organizations are looking to reproduce such emerging and collaborative development projects by adopting the open source development paradigm within their organizations. This phenomenon is labelled "Inner Source". This talk will present the results of four years of research into Inner Source. Specifically, the talk will address questions such as why companies would want to adopt Inner Source and what factors are important when adopting Inner Source. The talk will draw from several industry cases of Inner Source.

Collaborative Learning of Translation: The Case of TransWiki in Macao BIBAFull-Text 45
  Hari Venkatesan; Robert P. Biuk-Aghai; Michele Notari
Pedagogy has undergone a paradigm shift since the focus changed from uni-directional transmission to collaborative construction of knowledge. The social constructivist approach calls for pedagogy to facilitate interaction between learners involved in collaborative problem solving of real life tasks. This paper describes a wiki-based implementation of this approach (TransWiki) in the learning of translation. The paper examines issues that arise both from the perspective of the learner/user and the pedagogue and discusses solutions supported by the customization of the wiki system. User surveys and a case study indicate that the platform for collaboration is generally well received, but there is marked ambivalence with regard to the advantages of asynchronous collaboration through TransWiki over real-time face-to-face discussions. From the perspective of the instructor, the platform is seen as enabling scaffolding and providing a wealth of data that could inform pedagogy.
An Open Source Software Directory for Aeronautics and Space BIBAFull-Text 46
  Andreas Schreiber; Roberto Galoppini; Michael Meinel; Tobias Schlauch
In aerospace engineering, as in many other disciplines, many software tools are developed. Often, it is hard to get an overview of existing software. Sometimes this leads to the same software being developed multiple times, because nobody is able to determine whether software for a specific task already exists. Therefore, companies and organizations need a directory of existing software. The German Aerospace Center has built such a directory based on the Open Source software Allura, which is the base software that drives the Open Source hosting platform SourceForge.net. Allura has been customized to the needs of the aerospace domain. The result is a software portal for the aerospace research community that allows users to register and categorize software. It is intended to be used for both Open Source and proprietary software. Employees of the German Aerospace Center as well as the public can search for existing software. This reduces the amount of software developed twice and allows users to get in touch with colleagues who developed similar software.
Opening Lesson Plans to Support Teaching Innovation and Open Educational Resources Adoption BIBAFull-Text 47
  Manuel Caeiro Rodríguez
Edu-AREA is a web 2.X application that aims at supporting teaching during the whole life-cycle of lesson plan development, from design, facilitating the creation and the re-use of previous lesson plans, activities and resources provided by other users, to monitoring and reflection, enabling teachers to register all types of evidence and comments. Edu-AREA also allows users (e.g., other teachers, students, parents) to comment on and provide feedback on OLP. Accounting for these pieces of feedback will contribute to the detection of problems, the adoption of innovations and the implementation of effective improvements. In addition, the development of an appropriate recognition policy (e.g. badges for teachers) and the provision of "curation" facilities will support the identification of valuable educational resources, activities and experiences. In this contribution we show the main ideas and functionalities underlying this application.
Strategies for Promoting OER in Course Development and Course Delivery in ODL Environment BIBAFull-Text 48
  Chung Sheng Hung; Khor Ean Teng
This study discusses the phases involved in the development of OER-based course materials, namely: OER course integration using Wikibooks; evaluation of Quality Assurance (QA) in OER learning content; promoting and exploring OER repositories; CC licensing discussions; and establishment of collective feedback sessions at Wawasan Open University (WOU), Penang, Malaysia. The learning design for the computing courses, together with the learning experiences and feedback of different stakeholders in the Open Distance Learning (ODL) environment, is taken into consideration as one of the major components in the OER-based course development and revision phases. The OER-based computing course, which comprises course units, self-tests, unit practice exercises, assessments, a mini project and activities, is delivered in ODL mode over three consecutive semesters spanning 2013 to 2014. Evaluations and studies are carried out at the end of each semester by the course team members, focusing on the primary aspects of learners' participation rate in OER resources, learners' LMS activities and assessment evaluation. The OER development engagement involved multiple stakeholders (i.e. learners, instructors, course coordinators and External Course Assessors) from different levels, aiming to promote the use and understanding of OER in the ODL environment.

Wikirate: a claims-based system for collaboratively reviewing corporate behaviour BIBAFull-Text 49
  Vishal Kapadia; Ethan McCutchen; Lucia Lu; Philipp Kühl
Wikirate.org is a community effort to review and rate companies' ethical behavior.
   Wikirate.org is built using Wagn, whose atomic data approach allows Wikirate contributors to integrate rich qualitative and quantitative data in innovative, accessible ways.
   In the qualitative realm, data can be browsed by Company or by Topic, and the site's core Articles cover the intersection of the two (e.g. BP+Climate Change). Because source data in the corporate transparency realm is notoriously biased, Wikirate enhances traditional wiki mechanisms for ensuring data quality patterns by organizing Articles around citation of granular Claims, supporting discussion of and voting on these Claims, and requiring that Claims cite Sources.
   This organization also serves to help break down the sizable task of reviewing corporate behavior into very manageable chunks.
   In the quantitative realm, Wikirate will soon be introducing a Ratings system that enables participants to create both simple and formulaic metrics and allows the community to vote on the most valuable of these. The system is designed such that all metrics are created and controlled by the community, and their votes determine both the metrics' prominence on the site and the companies' transparency score.
Strata: Typed Semi-Structured Data in DokuWiki BIBAFull-Text 50
  Brend Wanders; Steven te Brinke
A semantic wiki is a wiki that has a model of the knowledge contained in its pages. Currently, semantic wikis are not adopted by a large user base, because most implementations are research prototypes that implement their own wiki engine. To increase familiarity with semantic wikis and quick adoption of semantic technologies we present Strata, a plugin for the well known wiki DokuWiki. Strata allows the use of semi-structured data in any DokuWiki installation, normalizes values based on their types, and allows extensive data modeling and querying on complex data structures.

Wagn: co-creating advanced web systems with simple wiki building blocks BIBAFull-Text 51
  Ethan McCutchen
Wagn is a free and open-source software platform that enables non-programmers to create sophisticated web systems. Using simple wiki-inspired building blocks called "cards", a built-in query language, and an elegant rules system based on set theory, nontechnical "Wagneers" can collaboratively engineer powerful web systems. A recently introduced "mods" API makes Wagn a complete web-development platform.
   Wagn has been hailed as "advancing the state of the art" for wikis and was described by Ward Cunningham as "one of the freshest contributions to wiki since I coined the term".
   In the 90-minute tutorial slot, we will introduce Wagn and discuss current and potential Wagn applications, its core design principles, the advantages of atomic data and set-based configuration, its simple queries, its architecture (called MoVE), improvements slotted for its rebranding release, Decko 1.0, and future directions.
   The remainder of the half-day class will be participatory. Attendees will collaboratively create a website for tracking OpenSym presentations and other events. We will begin with a fresh Wagn installation (or "deck"), and proceed to build a web system together from scratch.