
Proceedings of the 2015 International Conference on Program Comprehension

Fullname: ICPC'15: 23rd International Conference on Program Comprehension
Editors: Andrea De Lucia; Christian Bird; Rocco Oliveto
Location: Florence, Italy
Dates: 2015-May-18 to 2015-May-19
Publisher: ACM
Standard No: ACM DL: Table of Contents; hcibib: ICPC15
Papers: 39
Pages: 306
Links: Conference Website | Conference Series Website
  1. Keynotes
  2. Mining software repositories -- technical research papers
  3. Learning and sharing program knowledge -- technical research papers
  4. Learning and sharing program knowledge -- early research achievement papers
  5. Users, user interfaces, and feature location -- technical research papers
  6. Users, user interfaces, and feature location -- early research achievement papers
  7. Large scale empirical studies -- technical research papers
  8. Large scale empirical studies -- early research achievement papers
  9. Reading and visualizing -- technical research papers
  10. Reading and visualizing -- early research achievement papers
  11. Industry and experience reports
  12. Tool demos

Keynotes

Test complement exclusion: guarantees from dynamic analysis, pp. 1-2
  Andreas Zeller
Modern test generation techniques make it possible to generate as many executions as needed; combined with dynamic analysis, they allow for understanding program behavior in situations where static analysis is challenged or impossible. However, all these dynamic techniques still suffer from the incompleteness of testing: if some behavior has not been observed so far, there is no guarantee that it will not occur in the future. In this talk, I introduce a method called Test Complement Exclusion that combines test generation and sandboxing to provide such a guarantee. Test Complement Exclusion will have significant impact in the security domain, as it effectively detects and protects against unexpected changes of program behavior; however, its guarantees would also strengthen findings in dynamic software comprehension. First experiments on real-world Android programs demonstrate the feasibility of the approach.
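   A minimal sketch of the idea, not the actual system (which targets Android APIs via test generation and a sandbox): record which sensitive calls occur during generated tests, then exclude the complement at run time.

      # Sketch: calls never observed during testing are denied in production,
      # turning the incompleteness of testing into an enforced guarantee.
      class Sandbox:
          def __init__(self, observed_calls):
              self.allowed = set(observed_calls)

          def check(self, api_call):
              if api_call not in self.allowed:
                  raise PermissionError(f"{api_call} never seen during testing")

      # Phase 1: test generation explores the app and records the APIs it uses.
      observed = {"Camera.open", "File.read"}
      # Phase 2: the complement of the observed behavior is excluded.
      sandbox = Sandbox(observed)
      sandbox.check("File.read")            # allowed
      try:
          sandbox.check("SmsManager.send")  # never tested -> blocked
      except PermissionError as e:
          print("blocked:", e)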
Concise and consistent naming: ten years later, p. 3
  Florian Deissenboeck; Markus Pizka
Approximately 70% of the source code of a software system consists of identifiers. Hence, the names chosen as identifiers are of paramount importance for the readability of computer programs and thereby their comprehensibility. However, virtually every programming language allows programmers to use almost arbitrary sequences of characters as identifiers, which far too often results in more or less meaningless or even misleading naming. Coding style guides address this problem but are usually limited to general and hard-to-enforce rules like "identifiers should be self-describing". At IWPC 2005 we proposed a formal model, based on bijective mappings between concepts and names, that provides a solid foundation for the definition of precise rules for concise and consistent naming. The enforcement of these rules is supported by a tool that incrementally builds and maintains a complete identifier dictionary while the system is being developed. The identifier dictionary explains the language used in the software system, aids in consistent naming, and improves the productivity of programmers by proposing suitable names depending on the current context. In this talk we analyze the first ten years of the model we proposed at IWPC 2005, examining its impact on the program comprehension community as well as its applicability in practice.
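   The bijective concept-name mapping can be illustrated with a small, hypothetical consistency check: synonyms (one concept, several names) and homonyms (one name, several concepts) both violate the bijection.

      # Hypothetical sketch, not the authors' tool: flag violations of a
      # bijective mapping between domain concepts and identifier names.
      def check_naming(bindings):
          """bindings: list of (concept, name) pairs harvested from code."""
          concept_to_names, name_to_concepts = {}, {}
          for concept, name in bindings:
              concept_to_names.setdefault(concept, set()).add(name)
              name_to_concepts.setdefault(name, set()).add(concept)
          synonyms = {c: ns for c, ns in concept_to_names.items() if len(ns) > 1}
          homonyms = {n: cs for n, cs in name_to_concepts.items() if len(cs) > 1}
          return synonyms, homonyms

      pairs = [("customer", "customer"), ("customer", "client"),   # synonym
               ("order count", "num"), ("retry count", "num")]     # homonym
      print(check_naming(pairs))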

Mining software repositories -- technical research papers

Discovering loners and phantoms in commit and issue data, pp. 4-14
  Gerald Schermann; Martin Brandtner; Sebastiano Panichella; Philipp Leitner; Harald Gall
The interlinking of commit and issue data has become a de facto standard in software development. Modern issue tracking systems, such as JIRA, automatically interlink commits and issues by the extraction of identifiers (e.g., issue key) from commit messages. However, the conventions for the use of interlinking methodologies vary between software projects. For example, some projects enforce the use of identifiers for every commit while others have less restrictive conventions. In this work, we introduce a model called PaLiMod to enable the analysis of interlinking characteristics in commit and issue data. We surveyed 15 Apache projects to investigate differences and commonalities between linked and non-linked commits and issues. Based on the gathered information, we created a set of heuristics to interlink the residual of non-linked commits and issues. We present the characteristics of Loners and Phantoms in commit and issue data. The results of our evaluation indicate that the proposed PaLiMod model and heuristics enable automatic interlinking and can indeed reduce the residual of non-linked commits and issues in software projects.
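   The identifier-extraction step mentioned above can be approximated in a few lines; the regex assumes the common JIRA key convention and is an illustration, not the PaLiMod implementation.

      import re

      # JIRA-style issue keys look like "CAMEL-7327": an uppercase project
      # key, a dash, and a number.
      ISSUE_KEY = re.compile(r"\b([A-Z][A-Z0-9]+-\d+)\b")

      def link_commits(commits):
          """commits: list of (sha, message) pairs."""
          linked, loners = {}, []
          for sha, message in commits:
              keys = ISSUE_KEY.findall(message)
              if keys:
                  linked[sha] = keys
              else:
                  loners.append(sha)   # candidates for heuristic linking
          return linked, loners

      linked, loners = link_commits([
          ("a1b2c3", "CAMEL-7327: fix endpoint URI parsing"),
          ("d4e5f6", "minor formatting"),
      ])
      print(linked, loners)   # {'a1b2c3': ['CAMEL-7327']} ['d4e5f6']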
Detection of software evolution phases based on development activities, pp. 15-24
  Omar Benomar; Hani Abdeen; Houari Sahraoui; Pierre Poulin; Mohamed Aymen Saied
Software evolution history is usually represented at fine granularity by commits in software repositories, and at coarse granularity by software releases. To gain insights into development activities and software evolution, the information on releases is too general, whereas the information on commits is prohibitively large to be efficiently processed by a developer. This paper proposes an automatic technique for the identification of distinct phases of evolution. Such software evolution phases are characterized by similar development activities in terms of changes to entities. Our technique therefore decomposes software evolution history to help developers identify periods of different development activities. It performs a search-based optimization to find the best decomposition of commits from the software repository, using heuristics such as the classes changed in each commit and the magnitude/importance of these changes. To validate our technique, we applied it to the evolution history of five case studies covering multiple releases over several years of development. An interesting outcome of the evaluation is that our automatic decomposition of software evolution history recovered the original decomposition into software releases.
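   For illustration only, the search for a good decomposition can be replaced by a greedy segmentation that starts a new phase when a commit's changed classes overlap too little with the classes touched so far in the current phase; the paper's search-based optimization is more elaborate.

      def jaccard(a, b):
          return len(a & b) / len(a | b) if a | b else 0.0

      def split_into_phases(commits, threshold=0.3):
          """commits: chronological list of sets of changed classes."""
          phases, current, touched = [], [], set()
          for changed in commits:
              if current and jaccard(changed, touched) < threshold:
                  phases.append(current)        # activity shifted: new phase
                  current, touched = [], set()
              current.append(changed)
              touched |= changed
          if current:
              phases.append(current)
          return phases

      history = [{"Parser"}, {"Parser", "Lexer"}, {"GUI"}, {"GUI", "Dialog"}]
      print(len(split_into_phases(history)))    # 2: parsing work, then UI work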
I know what you did last summer: an investigation of how developers spend their time, pp. 25-35
  Roberto Minelli; Andrea Mocci; Michele Lanza
Developing software is a complex mental activity, requiring extensive technical knowledge and abstraction capabilities. The tangible part of development is the use of tools to read, inspect, edit, and manipulate source code, usually through an IDE (integrated development environment). Common claims about software development include that program comprehension takes up half of the time of a developer, or that certain UI (user interface) paradigms of IDEs offer insufficient support to developers. Such claims are often based on anecdotal evidence, raising the question of whether they can be corroborated on more solid grounds.
   We present an in-depth analysis of how developers spend their time, based on a fine-grained IDE interaction dataset consisting of ca. 740 development sessions by 18 developers, amounting to 200 hours of development time and 5 million IDE events. We propose an inference model of development activities to precisely measure the time spent in editing, navigating and searching for artifacts, interacting with the UI of the IDE, and performing corollary activities, such as inspection and debugging. We report several interesting findings, which in part confirm and reinforce some common claims, but also disconfirm other beliefs about software development.
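   A toy version of such an inference model, with hypothetical event categories and a simple idle cutoff, shows how interaction logs translate into time-per-activity figures.

      # Hypothetical categories; the paper's model is more elaborate. Time
      # between consecutive events is attributed to the earlier event's
      # activity, capped to discard idle gaps.
      ACTIVITY = {"edit": "editing", "scroll": "navigation",
                  "search": "search", "click-ui": "ui-interaction",
                  "inspect": "inspection", "breakpoint": "debugging"}

      def time_per_activity(events, idle_cutoff=60.0):
          """events: chronological list of (timestamp_seconds, event_type)."""
          totals = {}
          for (t0, kind), (t1, _) in zip(events, events[1:]):
              spent = min(t1 - t0, idle_cutoff)
              activity = ACTIVITY.get(kind, "other")
              totals[activity] = totals.get(activity, 0.0) + spent
          return totals

      session = [(0, "search"), (20, "scroll"), (35, "edit"), (95, "edit")]
      print(time_per_activity(session))
      # {'search': 20.0, 'navigation': 15.0, 'editing': 60.0}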
RCLinker: automated linking of issue reports and commits leveraging rich contextual information, pp. 36-47
  Tien-Duy B. Le; Mario Linares-Vásquez; David Lo; Denys Poshyvanyk
Links between issue reports and their corresponding commits in version control systems are often missing. However, these links are important for measuring the quality of a software system, predicting defects, and many other tasks. Several approaches have been designed to solve this problem by automatically linking bug reports to source code commits via comparison of textual information in commit messages and bug reports. Yet, the effectiveness of these techniques is oftentimes suboptimal when commit messages are empty or contain minimal information; this makes the process of recovering traceability links between commits and bug reports especially challenging. In this work, we aim at improving the effectiveness of existing bug linking techniques by utilizing rich contextual information. We rely on a recently proposed approach, namely ChangeScribe, which generates commit messages containing rich contextual information by using code summarization techniques. Our approach then extracts features from these automatically generated commit messages and bug reports, and inputs them into a classification technique that creates a discriminative model used to predict if a link exists between a commit message and a bug report. We compared our approach, coined RCLinker (Rich Context Linker), to MLink, an existing state-of-the-art bug linking approach. Our experiment results on bug reports from six software projects show that RCLinker outperforms MLink in terms of F-measure by 138.66%.
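   The classification step can be sketched as plain text-pair classification; RCLinker's actual feature set, built on ChangeScribe-generated messages, is much richer than the TF-IDF stand-in below.

      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.linear_model import LogisticRegression

      # Toy training pairs: (commit message, bug report, linked?).
      pairs = [
          ("fix null pointer in parser", "NPE when parsing empty file", 1),
          ("update copyright headers", "NPE when parsing empty file", 0),
          ("guard against empty input", "parser fails on empty input", 1),
          ("bump version to 2.1", "parser fails on empty input", 0),
      ]
      texts = [c + " [SEP] " + b for c, b, _ in pairs]
      labels = [y for _, _, y in pairs]

      vec = TfidfVectorizer()
      clf = LogisticRegression().fit(vec.fit_transform(texts), labels)

      q = vec.transform(["handle empty files [SEP] parser fails on empty input"])
      print(clf.predict_proba(q)[0][1])   # probability of a true link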
Generating reproducible and replayable bug reports from Android application crashes, pp. 48-59
  Martin White; Mario Linares-Vásquez; Peter Johnson; Carlos Bernal-Cárdenas; Denys Poshyvanyk
Manually reproducing bugs is time-consuming and tedious. Software maintainers routinely try to reproduce unconfirmed issues using incomplete or noninformative bug reports. Consequently, while reproducing an issue, the maintainer must augment the report with information -- such as a reliable sequence of descriptive steps to reproduce the bug -- to aid developers with diagnosing the issue. This process encumbers issue resolution from the time the bug is entered in the issue tracking system until it is reproduced. This paper presents crashdroid, an approach for automating the process of reproducing a bug by translating the call stack from a crash report into expressive steps to reproduce the bug and a kernel event trace that can be replayed on demand. crashdroid manages traceability links between scenarios' natural language descriptions, method call traces, and kernel event traces. We evaluated crashdroid on several open-source Android applications infected with errors. Given call stacks from crash reports, crashdroid was able to generate expressive steps to reproduce the bugs and automatically replay the crashes. Moreover, users were able to confirm the crashes faster with crashdroid than manually reproducing the bugs or using a stress-testing tool.
Active semi-supervised defect categorization, pp. 60-70
  Ferdian Thung; Xuan-Bach D. Le; David Lo
Defects are an inseparable part of software development and evolution. To better comprehend problems affecting a software system, developers often store historical defects, and these defects can be categorized into families. IBM proposed Orthogonal Defect Classification (ODC), which includes various classifications of defects along a number of orthogonal dimensions (e.g., symptoms and semantics of defects, root causes of defects, etc.). To help developers categorize defects, several approaches that employ machine learning have been proposed in the literature. Unfortunately, these approaches often require developers to manually label a large number of defect examples. In practice, manually labeling a large number of examples is both time-consuming and labor-intensive. Thus, reducing the onerous burden of manual labeling while still achieving good performance is crucial to the adoption of such approaches. To deal with this challenge, we propose an active semi-supervised defect categorization approach. It works by actively selecting a small subset of diverse and informative defect examples to label (i.e., active learning), and by making use of both labeled and unlabeled defect examples in the model learning process (i.e., semi-supervised learning). Using this principle, our approach is able to learn a good model while minimizing the manual labeling effort.
   To evaluate the effectiveness of our approach, we make use of a benchmark dataset that contains 500 defects from three software systems that have been manually labeled into several families based on ODC. We investigate our approach's ability to achieve good classification performance, measured in terms of weighted precision, recall, F-measure, and AUC, when only a small number of manually labeled defect examples are available. Our experiment results show that our active semi-supervised defect categorization approach achieves a weighted precision, recall, F-measure, and AUC of 0.651, 0.669, 0.623, and 0.710, respectively, when only 50 defects are manually labeled. Furthermore, it outperforms an existing active multi-class classification algorithm, proposed in the machine learning community, by a substantial margin.
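   The two ingredients can be sketched on synthetic data: uncertainty-based active selection, followed by confidence-based pseudo-labeling. The paper's actual selection and learning criteria differ in detail.

      import numpy as np
      from sklearn.datasets import make_classification
      from sklearn.linear_model import LogisticRegression

      X, y = make_classification(n_samples=300, n_features=10, random_state=0)
      labeled = list(range(10))                # pretend 10 defects are labeled
      unlabeled = [i for i in range(300) if i not in labeled]

      for _ in range(5):                       # 5 rounds of active learning
          clf = LogisticRegression().fit(X[labeled], y[labeled])
          proba = clf.predict_proba(X[unlabeled])
          margin = np.abs(proba[:, 0] - proba[:, 1])
          pick = unlabeled[int(np.argmin(margin))]   # most uncertain example
          labeled.append(pick)                 # the human oracle labels it
          unlabeled.remove(pick)

      # Semi-supervised step: trust high-confidence predictions as labels.
      clf = LogisticRegression().fit(X[labeled], y[labeled])
      confident = np.max(clf.predict_proba(X[unlabeled]), axis=1) > 0.95
      print(f"{confident.sum()} unlabeled defects pseudo-labeled")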

Learning and sharing program knowledge -- technical research papers

Could we infer unordered API usage patterns only using the library source code?, pp. 71-81
  Mohamed Aymen Saied; Hani Abdeen; Omar Benomar; Houari Sahraoui
Learning to use existing or new software libraries is a difficult task for software developers, which can impede their productivity. Much existing work provides techniques to mine API usage patterns from client programs in order to help developers understand and use existing libraries. However, considering only client programs to identify API usage patterns is a strong constraint, as the client programs' source code is not always available, or the clients themselves do not yet exist for newly released APIs. In this paper, we propose a technique for mining Non-Client-Based Usage Patterns (NCBUP-miner). We detect unordered API usage patterns as distinct groups of API methods that are structurally and semantically related and thus may contribute together to the implementation of a particular functionality for potential client programs. We evaluated our technique on four APIs. The obtained results are comparable to those of client-based approaches in terms of usage-pattern cohesion.
Searching the state space: a qualitative study of API protocol usability, pp. 82-93
  Joshua Sunshine; James D. Herbsleb; Jonathan Aldrich
Application Programming Interfaces (APIs) often define protocols -- restrictions on the order of client calls to API methods. API protocols are common and difficult to use, which has generated tremendous research effort in alternative specification, implementation, and verification techniques. However, little is understood about the barriers programmers face when using these APIs, and therefore the research effort may be misdirected.
   To understand these barriers better, we perform a two-part qualitative study. First, we study developer forums to identify problems that developers have with protocols. Second, we perform a think-aloud observational study, in which we systematically observe professional programmers struggle with these same problems to get more detail on the nature of their struggles and how they use available resources. In our observations, programmer time was spent primarily on four types of searches of the protocol state space. These observations suggest protocol-targeted tools, languages, and verification techniques will be most effective if they enable programmers to efficiently perform state search.
Synonym suggestion for tags on Stack Overflow, pp. 94-103
  Stefanie Beyer; Martin Pinzger
The number of distinct tags used to classify posts on Stack Overflow has grown in recent years to more than 38,000. Many of these tags have the same or similar meaning. Stack Overflow provides an approach to reduce the number of tags by allowing privileged users to manually create synonyms. However, only 2,765 synonym pairs currently exist on Stack Overflow, which is quite low compared to the total number of tags.
   To comprehend how synonym pairs are built, we manually analyzed the tags and how the synonyms could be created automatically. Based on our findings, we present TSST, a tag synonym suggestion tool that outputs a ranked list of possible synonyms for each input tag.
   We first evaluated TSST with the 2,765 approved synonym pairs of Stack Overflow. For 88.4% of the tags, TSST finds the correct synonym; for 72.2%, the correct synonym is within the top 10 suggestions. In addition, we applied TSST to 10 randomly selected Android-related tags and evaluated the suggested synonyms with 20 Android app developers in an online survey. Overall, in 80% of their ratings, developers found an adequate synonym suggested by TSST.
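   One signal such a tool can exploit is that many approved synonym pairs are spelling variants; a plain string-similarity ranking, sketched below, already captures those cases, although TSST combines more rules than this.

      from difflib import SequenceMatcher

      def suggest_synonyms(tag, all_tags, top_n=10):
          scored = [(SequenceMatcher(None, tag, other).ratio(), other)
                    for other in all_tags if other != tag]
          return [t for _, t in sorted(scored, reverse=True)[:top_n]]

      tags = ["regex", "regular-expressions", "android-studio", "java"]
      print(suggest_synonyms("regexp", tags, top_n=2))
      # ['regex', 'regular-expressions']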
Code, camera, action: how software developers document and share program knowledge using YouTube, pp. 104-114
  Laura MacLeod; Margaret-Anne Storey; Andreas Bergen
Creating documentation is a challenging task in software engineering, and most techniques involve the laborious and sometimes tedious job of writing text. This paper explores an alternative to traditional text-based documentation, the screencast, which captures a developer's screen while they narrate how a program or software tool works. We conducted a study to investigate how developers produce and share developer-focused screencasts using the YouTube social platform. First, we identified and analyzed a set of development screencasts to determine how developers have adapted to the medium to meet development-related documentation needs. We also explored the techniques and strategies used for sharing software knowledge. Second, we interviewed screencast producers to understand their motivations for creating screencasts, and to uncover the perceived benefits and challenges in producing code-focused videos. Our findings reveal that video is a useful medium for communicating program knowledge between developers, and that developers build their online personas and reputation by sharing videos through social channels.
Generating refactoring proposals to remove clones from automated system tests, pp. 115-124
  Benedikt Hauptmann; Elmar Juergens; Volkmar Woinke
Automated system tests often contain many clones, which make them complex to understand and costly to maintain. Unfortunately, removing clones is challenging, as there are numerous possibilities for refactoring them to reuse components such as subroutines. Additionally, clones often overlap partially, which makes it particularly difficult to decide which parts to extract. If done wrongly, reuse potential is not leveraged optimally, and the structures between tests and reuse components become unnecessarily complex. We present a method to support test engineers in extracting overlapping clones. Using grammar inference algorithms, we generate a refactoring proposal that shows test engineers how overlapping clones can be extracted. Furthermore, we visualize the generated refactoring proposal to make it easily understandable for test engineers. An industrial case study demonstrates that our approach helps test engineers gain information about the reuse potential of test suites and guides them in performing refactorings.

Learning and sharing program knowledge -- early research achievement papers

Framework instantiation using cookbooks constructed with static and dynamic analysis, pp. 125-128
  Raquel F. Q. Lafetá; Marcelo A. Maia; David Röthlisberger
Software reuse is one of the major goals in software engineering. Frameworks promote the reuse not only of individual building blocks but also of system design. However, framework instantiation requires a substantial understanding effort. High-quality documentation is essential to minimize this effort, but in most cases appropriate documentation does not exist or is not kept up to date. Our hypothesis is that the framework code itself and existing instantiations can serve as a guide for new instantiations. The challenge is that users still have to read large portions of code, which hinders the understanding process. Our goal is thus to provide relevant information for framework instantiation through static and dynamic analysis of the framework and pre-existing instantiations. The final documentation is presented in a cookbook style, where recipes are composed of programming tasks and information about hotspots related to a feature instantiation. We conducted two preliminary experiments: the first to evaluate the recall of the approach, and the second to study the practical usefulness of the recipe information for developers. Results reveal that our approach discloses accurate and relevant information about the classes and methods used for framework instantiation.

Users, user interfaces, and feature location -- technical research papers

Two user perspectives in program comprehension: end users and developer users, pp. 129-139
  Tobias Roehm
Recent empirical studies identified an interest of software developers in high-level usage information, i.e. why and how end users employ a software application. Furthermore, recent empirical work found that developers of interactive applications put themselves in the role of users by interacting with the user interface during program comprehension.
   This paper presents an exploratory case study investigating these two user perspectives in detail. The study focuses on information needs regarding software usage and developers in the role of users during program comprehension. 21 developers from six software companies were observed during program comprehension tasks and interviewed. The resulting observation protocols and interview minutes were analyzed using coding.
   We found that developers are interested in information about use cases and user behavior, user goals and user needs, failure reproduction steps, and application domain concepts. But such information is rarely available to them during program comprehension. This mismatch indicates a potential to improve program comprehension practices by capturing such information and providing it to developers. Furthermore, we found that developers interact with the user interface of an interactive application to reproduce failures, to find relevant source code, to test changes, to trigger the debugger, and to familiarize themselves with an unknown part of the application. Also, developers conceptually map elements of the user interface to source code, data structures, and algorithms. We call this behavior "UI-based comprehension" and argue that it is part of a broader comprehension strategy together with comprehension activities like reading source code or debugging.
Exploring the use of concern element role information in feature location evaluation, pp. 140-150
  Emily Hill; David Shepherd; Lori Pollock
Before making changes, programmers need to locate and understand source code that corresponds to specific functionality, i.e., perform concern or feature location. Numerous concern and feature location techniques have been proposed, but to the best of our knowledge, no existing techniques or evaluations report information on what role a code element plays in the larger concern. In this paper, we report on two case studies that investigate two hypotheses on how evaluation studies of concern location techniques can be strengthened by utilizing concern role information: (1) by increasing agreement among human annotators for gold set establishment and (2) by providing richer information about the elements ranked as relevant by concern location techniques, which could help further improve the tools.
   We conducted a case study of 6 Java developers annotating 3 concerns with role information. When the developers understood the task description, pairwise agreement increased by 20%, 25%, and 135% for the 3 concerns over a prior concern location study without role information. Our findings also suggest that there may be core element roles that need to be annotated by humans, but that the remaining roles may be automatically derived, which could facilitate more reliable concern location benchmarks in the future. We also conducted an exploratory study of the element roles represented in results returned by a state-of-the-art feature location tool. The results of these two studies suggest that integrating concern element role information into evaluations can help to strengthen both the gold set establishment and the analysis of results returned by various tools.
Rethinking user interfaces for feature location, pp. 151-162
  Fabian Beck; Bogdan Dit; Jaleo Velasco-Madden; Daniel Weiskopf; Denys Poshyvanyk
Locating features in large software systems is a fundamental maintenance task for developers when fixing bugs and extending software. We introduce In Situ Impact Insight (I3), a novel user interface to support feature location. In addition to a list of search results, I3 supports developers while they browse and inspect the retrieved code entities. In situ visualizations augment results and source code with additional information relevant for further exploration. Developers are able to retrieve details on the textual similarity of a source code entity to the search query and to other entities, as well as information on co-changed entities from a project's history. Execution traces recorded during program runs can be used as filters to further refine the search results. We implemented I3 as an Eclipse plug-in and tested it in a user study involving 18 students and professional developers who were asked to perform three feature location tasks chosen from the issue tracking system of jEdit. The results of our study suggest that I3's user interface is intuitive and unobtrusively supports developers with the required information when and where they need it.
Detecting clones in Android applications through analyzing user interfaces, pp. 163-173
  Charlie Soh; Hee Beng Kuan Tan; Yauhen Leanidavich Arnatovich; Lipo Wang
The booming smartphone industry has attracted a large number of application developers. However, due to the availability of reverse engineering tools for Android applications, it has also caught the attention of plagiarists and malware writers. In recent years, application cloning has become a serious threat to the Android market. Previous work on mobile application clone detection mainly focuses on code-based analysis. Such an approach lacks resilience to advanced obfuscation techniques. Its efficiency is also questionable, as billions of opcodes need to be processed for cross-market clone detection. In this paper, we propose a novel technique for detecting Android application clones based on the analysis of user interface (UI) information collected at runtime. By leveraging the multiple entry points feature of Android applications, the UI information can be collected easily without the need to generate relevant inputs and execute the entire application. Another advantage of our technique is obfuscation resilience, since semantics-preserving obfuscation techniques do not affect runtime behavior. We evaluated our approach on a real-world dataset, and it achieves low false positive and false negative rates. Furthermore, the results also show that our approach is effective in detecting different types of repackaging attacks.
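   The core of the technique can be sketched by fingerprinting each runtime screen's widget tree and comparing fingerprint sets across apps; the paper's actual UI features and matching are more involved than this.

      import hashlib

      def fingerprint(node):
          """node: (widget_type, [children]) collected at runtime."""
          kind, children = node
          serial = kind + "(" + ",".join(fingerprint(c) for c in children) + ")"
          return hashlib.sha1(serial.encode()).hexdigest()

      def similarity(screens_a, screens_b):
          fa = {fingerprint(s) for s in screens_a}
          fb = {fingerprint(s) for s in screens_b}
          return len(fa & fb) / max(len(fa | fb), 1)

      login = ("LinearLayout",
               [("EditText", []), ("EditText", []), ("Button", [])])
      repackaged = ("LinearLayout",
                    [("EditText", []), ("EditText", []), ("Button", [])])
      print(similarity([login], [repackaged]))   # 1.0 -> clone candidate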

Users, user interfaces, and feature location -- early research achievement papers

Manually locating features in industrial source code: the search actions of software nomads, pp. 174-177
  Howell Jordan; Jacek Rosik; Sebastian Herold; Goetz Botterweck; Jim Buckley
Expert software engineers working on large systems often need to perform feature location when moving to work in unfamiliar areas. We hypothesise that leveraging the system-specific knowledge of these software nomads may help to improve semi-automated feature location techniques. In order to assess and understand how software nomads perform manual feature location searches, two expert professional software engineers were observed in-vivo following a think-aloud protocol while performing manual feature location on a large-scale heterogeneous system. The nomads' search actions were found to be around twice as effective as those reported in previous studies. This cannot be explained by sophisticated use of tools or complex queries. We conclude that system rules and conventions are frequently used by experts when constructing feature location search terms.
From obfuscation to comprehension, pp. 178-181
  Eran Avidan; Dror G. Feitelson
Code obfuscation techniques are widely used in industry to increase protection of source code and intellectual property. The idea is that even if attackers gain hold of source code, it will be hard for them to understand what it does and how. Thus obfuscation techniques are specifically targeted at human comprehension of code. We suggest that the ideas and experience embedded in obfuscations can be used to learn about comprehension. In particular, we survey known obfuscation techniques and use them in an attempt to derive metrics for code (in)comprehensibility. This leads to emphasis on issues such as identifier naming, which are typically left on the sidelines in discussions of code comprehension, and motivates increased efforts to measure their effect.
The plague doctor: a promising cure for the window plague, pp. 182-185
  Roberto Minelli; Andrea Mocci; Michele Lanza
Modern Integrated Development Environments (IDEs) are often affected by the "window plague", an overly crowded workspace with many open windows and tabs. The main cause is the lack of navigation support in IDEs, also due to the many -- and not always obvious -- complex relationships that exist between program entities.
   Researchers have shown that it is possible to mitigate the window plague by exploiting the data obtained by monitoring how developers interact with the user interface of the IDE. However, despite initial results, the approach was never fully integrated into an IDE.
   In our previous work, we implemented DFlow, an automatic interaction profiler that monitors all the fine-grained interactions of the developer with the IDE. Here we present a first prototype of the Plague Doctor, a tool that seamlessly detects the windows that are less likely to be used in the future and automatically closes them. We discuss our long term vision on how to fully exploit the interaction data recorded by DFlow to provide a more effective cure for the window plague.

Large scale empirical studies -- technical research papers

Polymorphism in the spotlight: studying its prevalence in Java and Smalltalk, pp. 186-195
  Nevena Milojkovic; Andrea Caracciolo; Mircea Filip Lungu; Oscar Nierstrasz; David Röthlisberger; Romain Robbes
Subtype polymorphism is a cornerstone of object-oriented programming. By hiding variability in behavior behind a uniform interface, polymorphism decouples clients from providers and thus enables genericity, modularity, and extensibility. At the same time, however, it scatters the implementation of the behavior over multiple classes, thus potentially hampering program comprehension.
   The extent to which polymorphism is used in real programs and the impact of polymorphism on program comprehension are not very well understood. We report on a preliminary study of the prevalence of polymorphism in several hundred open source software systems written in Smalltalk, one of the oldest object-oriented programming languages, and in Java, one of the most widespread ones.
   Although a large portion of the call sites in these systems are polymorphic, a majority have a small number of potential candidates. Smalltalk uses polymorphism to a much greater extent than Java. We discuss how these findings can be used as input for more detailed studies in program comprehension and for better developer support in the IDE.
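   What a "small number of potential candidates" means can be illustrated with a toy hierarchy: count how many classes supply their own body for the method invoked at a call site. Real measurements, as in the paper, require static analysis of the subject programs rather than this kind of introspection.

      def candidates(root, method):
          own = 1 if method in vars(root) else 0
          return own + sum(candidates(s, method) for s in root.__subclasses__())

      class Shape:
          def area(self): raise NotImplementedError

      class Circle(Shape):
          def area(self): return 3.14159 * self.r ** 2

      class Square(Shape):
          def area(self): return self.side ** 2

      class Unit(Square):
          pass   # inherits Square.area, adds no new candidate

      # A call site `s.area()` whose receiver is statically a Shape has
      # three potential targets:
      print(candidates(Shape, "area"))   # 3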
A survey of the forms of Java reference names, pp. 196-206
  Simon Butler; Michel Wermelinger; Yijun Yu
The readability of identifiers is a major factor in program comprehension and an aim of naming convention guidelines. Due to their semantic content, identifiers are also used in feature and bug location, among other software maintenance tasks. Looking at how names are used in practice may lead to insights into potential problems for comprehension and for programming support tools that process identifiers.
   Class and method names are already well represented in the literature. This paper presents an investigation of Java field, formal argument and local variable names, which we collectively call reference names. These names cannot be ignored because they constitute over half the unique names and almost 70% of the name declarations in the corpus investigated.
   We analysed the forms of 3.5 million reference name declarations in 60 well known Java projects, examining the phrasal structure of names composed of known words and acronyms. The structures found in practice were evaluated against those given in the literature. The use of unknown abbreviations and words, which may pose a problem for program comprehension, was also identified. Based on our observations of the rich diversity of reference names, we suggest issues to be taken into account for future academic research and for improving tools that rely on names as sources of information.
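   Analyses like this start by splitting identifiers into words; a minimal splitter for the common Java forms (camel case, acronyms, underscores, digits) is shown below, though real name analysis must also cope with unknown abbreviations.

      import re

      def split_identifier(name):
          words = []
          for part in re.split(r"[_$]+", name):
              # break "XMLHttpRequest" into "XML", "Http", "Request"
              words += re.findall(
                  r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+", part)
          return [w.lower() for w in words]

      print(split_identifier("maxRetryCount"))    # ['max', 'retry', 'count']
      print(split_identifier("XMLHttpRequest2"))  # ['xml', 'http', 'request', '2']
      print(split_identifier("DEFAULT_TIMEOUT"))  # ['default', 'timeout']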
Make it simple: an empirical analysis of GNU make feature use in open source projects, pp. 207-217
  Douglas H. Martin; James R. Cordy; Bram Adams; Giulio Antoniol
Make is one of the oldest build technologies and is still widely used today, whether by manually writing Makefiles, or by generating them using tools like Autotools and CMake. Despite its conceptual simplicity, modern Make implementations such as GNU Make have become very complex languages, featuring functions, macros, lazy variable assignments and (in GNU Make 4.0) the Guile embedded scripting language. Since we are interested in understanding how widespread such complex language features are, this paper studies the use of Make features in almost 20,000 Makefiles, comprising over 8.4 million lines, from more than 350 different open source projects. We look at the popularity of features and the difference between hand-written Makefiles and those generated using various tools. We find that generated Makefiles use only a core set of features and that more advanced features (such as function calls) are used very little, and almost exclusively in hand-written Makefiles.
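   As a rough illustration, advanced-feature use can be approximated by scanning for GNU Make function calls; a real study needs a proper Make parser to handle conditionals, includes, and nested expansions.

      import re
      from collections import Counter

      FUNCTION = re.compile(r"\$\((call|foreach|shell|eval|wildcard|patsubst)\b")

      def count_features(makefile_text):
          return Counter(FUNCTION.findall(makefile_text))

      sample = """
      SRCS := $(wildcard src/*.c)
      OBJS := $(patsubst %.c,%.o,$(SRCS))
      all:
      \t$(CC) -o app $(OBJS)
      """
      print(count_features(sample))   # Counter({'wildcard': 1, 'patsubst': 1})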
License usage and changes: a large-scale study of Java projects on GitHub, pp. 218-228
  Christopher Vendome; Mario Linares-Vásquez; Gabriele Bavota; Massimiliano Di Penta; Daniel German; Denys Poshyvanyk
Software licenses determine, from a legal point of view, under which conditions software can be integrated, used, and above all, redistributed. Licenses evolve over time to meet the needs of development communities and to cope with emerging legal issues and new development paradigms. Such evolution of licenses is likely to be accompanied by changes in the way software uses such licenses, resulting in some licenses being adopted while others are abandoned. This paper reports a large empirical study aimed at quantitatively and qualitatively investigating when and why developers change software licenses. Specifically, we first identify license changes in 1,731,828 commits, representing the entire history of 16,221 Java projects hosted on GitHub. Then, to understand the rationale for license changes, we perform a qualitative analysis -- following a grounded theory approach -- of commit notes and issue tracker discussions concerning licensing topics and, whenever possible, try to build traceability links between discussions and changes. Our results point out a lack of traceability of when and why licensing changes are made. This can be a major concern, because a change in the license of a system can negatively impact those who reuse it.
Unsupervised software categorization using bytecode, pp. 229-239
  Javier Escobar-Avila; Mario Linares-Vásquez; Sonia Haiduc
Automatic software categorization is the task of assigning software systems or libraries to categories based on their functionality. Correctly assigning these categories is essential to ensure that relevant software can be easily retrieved by developers from large repositories. State-of-the-art approaches either rely on the availability of the source code or use supervised machine learning approaches, which require a set of already labeled software as training data. These restrictions make current approaches fail when such information is not available. We propose a novel approach that overcomes these limitations by using semantic information recovered from bytecode and an unsupervised algorithm to assign categories to software systems. We evaluated our approach in a study on the Apache Foundation repository of Java libraries, and the results indicate that our approach is able to identify a correct category for 86% of the libraries.

Large scale empirical studies -- early research achievement papers

The last line effect, pp. 240-243
  Moritz Beller; Andy Zaidman; Andrey Karpov
Micro-clones are tiny duplicated pieces of code; they typically comprise only a few statements or lines. In this paper, we expose the "last line effect," the phenomenon that the last line or statement in a micro-clone is much more likely to contain an error than the previous lines or statements. We do this by analyzing 208 open source projects and reporting on 202 faulty micro-clones.

Reading and visualizing -- technical research papers

How programmers read regular code: a controlled experiment using eye tracking, pp. 244-254
  Ahmad Jbara; Dror G. Feitelson
Regular code, which includes repetitions of the same basic pattern, has been shown to have an effect on code comprehension: a regular function can be just as easy to comprehend as an irregular one with the same functionality, despite being longer and including more control constructs. It has been speculated that this effect is due to leveraging the understanding of the first instances to ease the understanding of repeated instances of the pattern. To verify and quantify this effect, we use eye tracking to measure the time and effort spent reading and understanding regular code. The results are that time and effort invested in the initial code segments are indeed much larger than those spent on the later ones, and the decay in effort can be modeled by an exponential or cubic model. This shows that syntactic code complexity metrics (such as LOC and MCC) need to be made context-sensitive, e.g. by giving reduced weight to repeated segments according to their place in the sequence.
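   For illustration, the exponential variant of the reported decay model can be fitted to per-segment reading times; the numbers below are invented, not the study's data.

      import numpy as np
      from scipy.optimize import curve_fit

      def decay(i, a, b, c):              # t_i = a*exp(-b*i) + c
          return a * np.exp(-b * i) + c

      segments = np.arange(8)             # 8 repetitions of the same pattern
      seconds = np.array([30.0, 18.0, 11.0, 8.0, 6.5, 6.0, 5.5, 5.2])
      (a, b, c), _ = curve_fit(decay, segments, seconds, p0=(25.0, 0.5, 5.0))
      print(f"t_i = {a:.1f}*exp(-{b:.2f}*i) + {c:.1f}")

      # A context-sensitive LOC metric could weight segment i by
      # decay(i, a, b, c) / decay(0, a, b, c) instead of counting each
      # repetition fully.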
Eye movements in code reading: relaxing the linear order, pp. 255-265
  Teresa Busjahn; Roman Bednarik; Andrew Begel; Martha Crosby; James H. Paterson; Carsten Schulte; Bonita Sharif; Sascha Tamm
Code reading is an important skill in programming. Inspired by the linearity that people exhibit while reading natural language text, we designed local and global gaze-based measures to characterize linearity (left-to-right and top-to-bottom) in reading source code. Unlike natural language text, source code is executable and requires a specific reading approach. To validate these measures, we compared the eye movements of novice and expert programmers who were asked to read and comprehend short snippets of natural language text and Java programs.
   Our results show that novices read source code less linearly than natural language text. Moreover, experts read code less linearly than novices. These findings indicate that there are specific differences between reading natural language and source code, and suggest that non-linear reading skills increase with expertise. We discuss the implications for practitioners and educators.
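   One simple global linearity measure, illustrative rather than the paper's exact definition, is the fraction of gaze transitions that move forward in reading order.

      def forward_fraction(fixations):
          """fixations: chronological list of (line, column) positions."""
          forward = sum(
              1 for (l0, c0), (l1, c1) in zip(fixations, fixations[1:])
              if l1 > l0 or (l1 == l0 and c1 >= c0))
          return forward / max(len(fixations) - 1, 1)

      prose = [(1, 1), (1, 8), (2, 1), (2, 9), (3, 1)]            # forward
      code = [(1, 1), (5, 3), (2, 1), (5, 3), (3, 1), (6, 2)]     # jumps
      print(forward_fraction(prose), forward_fraction(code))      # 1.0 0.6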
Comparing trace visualizations for program comprehension through controlled experiments, pp. 266-276
  Florian Fittkau; Santje Finke; Wilhelm Hasselbring; Jan Waller
For efficient and effective program comprehension, it is essential to provide software engineers with appropriate visualizations of the program's execution traces. Empirical studies, such as controlled experiments, are required to assess the effectiveness and efficiency of proposed visualization techniques.
   We present controlled experiments to compare the trace visualization tools Extravis and ExplorViz on typical program comprehension tasks. We replicate the first controlled experiment with a second one targeting a differently sized software system. In addition to a thorough analysis of the strategies chosen by the participants, we report on common challenges in comparing trace visualization techniques. Besides our own replication of the first experiment, we provide a package containing all our experimental data to facilitate the verifiability, reproducibility, and further extensibility of our presented results.
   Although subjects spent similar time on program comprehension tasks with both tools for a small-sized system, analyzing a larger software system resulted in a significant efficiency advantage for ExplorViz: 28 percent less time spent. Concerning effectiveness (correct solutions for program comprehension tasks), we observed significant improvements in correctness of 39 and 61 percent with ExplorViz for the two system sizes.

Reading and visualizing -- early research achievement papers

Towards visual reflexion models, pp. 277-280
  Marcello Romanelli; Andrea Mocci; Michele Lanza
Source code and models of a software system, like architectural views, tend to evolve separately and drift apart over time. Previous research has shown that it is possible to effectively relate them through a reflexion model, defined as a "summarization of a software system from the viewpoint of a particular high-level model". While effective, the process of constructing and analyzing reflexion models was supported by text-based tools with limited visual representation. With the original approach, it was relatively hard to understand which parts of the system were represented, and which parts of the system contributed to specific relations in the reflexion model.
   We present our vision on augmenting the construction and analysis of reflexion models with visual support, effectively providing the basis for visual reflexion models. We describe our approach, implemented as a web-based application, and two promising case studies involving two open-source projects.
Understanding web applications using component based visual patterns, pp. 281-284
  Dan C. Cosma; Petru F. Mihancea
This paper introduces our approach for high-level system understanding that uses software visualization to analyze the presentation layer of Web applications. The technique is driven by static analysis, relies on state-of-the-art concepts, and is technology-aware, so that it focuses on the precise particularities of the application's presentation layer that define its Web presence. By combining an approach initially developed for software testing with visualization, the essential structural dependencies between and within the Web components are extracted and reviewed. Initial evaluation shows that the technique provides a comprehensive view that is very useful in spotting new and interesting visual patterns, giving significant insight for software comprehension.

Industry and experience reports

Fault localization during system testing, pp. 285-286
  Pavan Kumar Chittimalli; Vipul Shah
Functional testing of business applications in the enterprise is carried out by independent test teams. Test scripts are generated manually or automatically from requirements, treating the IT systems as a black box. For every release, when test scripts fail to execute, the test teams need to ascertain the cause of failure, which could be a mismatch between the requirements and the test models and test scripts, faults in the test scripts, or faults in the source code. The process is cumbersome and time-consuming. While several techniques have been developed to localize source code faults, these target testing carried out by the developer.
   To help test teams localize faults, we propose the novel idea of applying source-code-based fault localization techniques to process models that represent the system functionality. Experimental results show that the techniques, when applied to models, were able to localize both test script and source code faults.
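   The report does not name a specific technique; spectrum-based fault localization with the Ochiai formula is one standard source-code technique of the kind that can be lifted to process-model elements, as sketched here.

      import math

      def ochiai(failed_cov, passed_cov, total_failed):
          """failed_cov/passed_cov: failing/passing runs covering the element."""
          denom = math.sqrt(total_failed * (failed_cov + passed_cov))
          return failed_cov / denom if denom else 0.0

      # Coverage of 4 model elements over 3 failing and 5 passing runs.
      cov = {"validate": (3, 1), "route": (1, 5),
             "approve": (0, 4), "log": (3, 5)}
      ranking = sorted(cov, key=lambda e: ochiai(*cov[e], 3), reverse=True)
      print(ranking)   # ['validate', 'log', 'route', 'approve']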
Recovering workflows from functional tests, pp. 287-288
  Chetan Khadke; Sunjit Rana; Vipul Shah
When enterprises outsource maintenance of IT systems to service providers, thorough knowledge acquisition is critical to the success of the engagement. Program comprehension contributes significantly to acquiring knowledge of the IT systems. It is a common practice to execute test scripts to identify critical scenarios in the system and then trace these as flows in the programs.
   Instead of executing test scripts, we propose the novel idea of mining workflows from test scripts to construct formal process models. The global view provided by the mined model can not only help transition teams gain a high-level understanding of the system but also help identify critical flows. We also suggest categorizing test cases using supervised learning to improve comprehension.
Reordering results of keyword-based code search for supporting simultaneous code changes, pp. 289-290
  Yusuke Sabi; Hiroaki Murakami; Yoshiki Higo; Shinji Kusumoto
Many research studies have been conducted to support simultaneous code changes to multiple code fragments. Code clones and logical couplings are often utilized in such studies. However, most of them have been evaluated only on open source projects or students' software. In this paper, we report on our academic-industrial collaboration with a software company. The collaboration is intended to suggest multiple code fragments to be changed simultaneously when a developer specifies a keyword, such as a variable name, in source code. In the collaboration, we propose to use code clone and logical coupling information to reorder the code fragments. We confirmed that code clones and logical couplings worked well in helping simultaneous code changes on three projects being developed in the company.

Tool demos

VerXCombo: an interactive data visualization of popular library version combinations, pp. 291-294
  Yuki Yano; Raula Gaikovina Kula; Takashi Ishio; Katsuro Inoue
In large software systems, it is common practice to adopt third-party libraries. Decisions by system maintainers to either update or introduce new third-party libraries can range from trivial to complex. For instance, incompatibility between internal library dependencies may complicate adoption. Therefore, system maintainers especially need adequate assurance about any candidate library release. Using the 'wisdom of the crowd', VerXCombo aims to assist system maintainers by mining popular library dependency patterns of similar systems. Through data interactions, VerXCombo leverages parallel sets to break down a large and complex dataset into distinguishable patterns of (1) popular and (2) latest library dependency release combinations. Populating our tool with a Maven library dependency dataset from over 4,000 Java open source projects, we demonstrate navigation and best-fit combinations in a case scenario. A video highlighting the main features of the tool can be found at: http://goo.gl/wWPylL
ITMViz: interactive topic modeling for source code analysis, pp. 295-298
  Amir M. Saeidi; Jurriaan Hage; Ravi Khadka; Slinger Jansen
Topic modeling has seen a surge in use for software comprehension. Although the models inferred from source code are a great source of knowledge, they fail to fully capture the conceptual relationships between the topics. Here we investigate the use of interactive topic modeling for source code analysis by feeding in information from end-users, including developers and architects, to refine the inferred topic models. We have implemented a web-based toolkit called ITMViz to support interpreting the topic models, and we use the results to cluster modules together. A medium-sized Java project is used to evaluate our approach in understanding the software system.
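   A baseline, non-interactive topic inference over per-module identifier text can be sketched with standard tooling; the refinement loop that ITMViz adds, where users adjust the inferred topics, is not shown.

      from sklearn.decomposition import LatentDirichletAllocation
      from sklearn.feature_extraction.text import CountVectorizer

      modules = [
          "connection socket open close timeout retry",
          "parse token grammar syntax tree node",
          "socket send receive buffer stream connection",
          "ast node visitor parse expression statement",
      ]
      vec = CountVectorizer()
      X = vec.fit_transform(modules)
      lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

      words = vec.get_feature_names_out()
      for k, weights in enumerate(lda.components_):
          top = [words[i] for i in weights.argsort()[::-1][:4]]
          print(f"topic {k}: {top}")   # e.g. a networking and a parsing topic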
ExceptionTracer: a solution recommender for exceptions in an integrated development environment, pp. 299-302
  Vahid Amintabar; Abbas Heydarnoori; Mohammad Ghafari
Exceptions are an indispensable part of the software development process. However, developers usually rely on imprecise results from a web search to resolve exceptions. More specifically, they must personally take into account the context of an exception, and then choose and adapt a relevant solution to solve the problem. In this paper, we present ExceptionTracer, an Eclipse plugin that helps developers resolve exceptions with respect to the stack trace in Java programs. In particular, ExceptionTracer automatically provides candidate solutions to an exception by mining software systems on SourceForge, as well as listing relevant discussions about the problem from Stack Overflow.
Limpio: LIghtweight MPI instrumentatiOn, pp. 303-306
  Milan Pavlovic; Milan Radulovic; Alex Ramirez; Petar Radojkovic
Characterization of high-performance computing applications often has to be done without access to the source code. Computer architects, therefore, have a narrowed choice of instrumentation tools. Moreover, the potentially large amount of collected data can prohibit creating a full timestamped event trace and analyzing it post mortem. This paper describes Limpio, a LIghtweight MPI instrumentatiOn framework that allows dynamic instrumentation of user-selected MPI calls, and customization of data gathering, analysis, and visualization.