All posts by Maximilian Capraro

Final Thesis: A Quality Model for Inner Source

Abstract: Inner Source (IS) is the use of open source (OS) development practices within an organization. Some organizations run IS projects or even IS programs. Although OS quality models have been published, no quality model specifically for IS is known. This paper presents a quality model for IS programs and projects. We conduct five interviews with IS experts and analyze them using thematic analysis. Based on the resulting insights, we develop a hierarchical quality model for IS programs and IS projects, which we then distinguish from the OS quality models we surveyed.

Keywords: Inner source, inner source quality, inner source metrics

PDFs: Master Thesis, Thesis Description

Reference: Bernd Grillenberger. A Quality Model for Inner Source. Master Thesis, Friedrich-Alexander-Universität Erlangen-Nürnberg: 2017.

Final Thesis: Managing Organization Data for Patch-Flow Measurement

Abstract: The use of open source practices and the establishment of an open-source-like culture within organizations is called Inner Source (IS). The existing Collaboration Metric Suite (CMSuite) software provides metrics about collaboration between software projects. These metrics can validate the application of IS in organizations. However, the underlying model of the CMSuite currently supports only simple hierarchical organizational structures. Organizations with a more complex structure cannot be mapped correctly. In this thesis, a model that fulfills the requirements of a complex organizational structure was designed and integrated into the CMSuite. For this purpose, two case studies that show the weaknesses of the current model were studied. Finally, it was shown that complex organizational structures are no longer a problem for the CMSuite.

Keywords: Software engineering, mining software repositories, inner source

PDFs: Master Thesis, Thesis Description

Reference: Andreas Bauer. Managing Organization Data for Patch-Flow Measurement. Master Thesis, Friedrich-Alexander-Universität Erlangen-Nürnberg: 2017.

Final Thesis: Extending an Inner Source Patch-Flow Crawler for Gitlab and Github Enterprise

Abstract: Because of its advantages, source code management (SCM) is used almost everywhere in software development. Often, however, access to organizational repositories is restricted to individual projects or groups. In contrast, Riehle, Capraro, Kips, and Horn (2016) describe the application of techniques established in open source development, such as the organization-internal accumulation and publication of knowledge, as an important element of inner source. In this context, SCM features such as tracing the author of a repository’s commit are crucial for measuring patches between organizational units. The Professorship for Open Source Software developed a crawler that gathers and saves patch-flow data by automatically processing a repository’s metadata. Extending the patch-flow crawler with an interface for GitLab and GitHub Enterprise allows the implemented functionality to be used standalone or in combination with existing features. In this way, the crawler’s possible applications and accuracy are enhanced.

Keywords: Software engineering, mining software repositories, inner source, patch-flow

PDFs: Bachelor Thesis, Thesis Description

Reference: Benjamin Mach. Extending an Inner Source Patch-Flow Crawler for Gitlab and Github Enterprise. Bachelor Thesis, Friedrich-Alexander-Universität Erlangen-Nürnberg: 2017.

Final Thesis: Measuring the Patch Review Process in Open and Inner Source

Abstract: Inner source development is the application of open source practices to a company’s internal software development. One of these practices is the review process, which separates a code contribution from its integration. In inner source, the review process has not yet been researched. Therefore, suitable software for measuring this process is required for research purposes. The measuring instruments used for inner source development today are not capable of examining the review process. This thesis develops an extension of an existing application for analyzing review processes in inner source. To evaluate the functionality of this application, it is applied to selected projects. The collected data is used to demonstrate that it is suitable for answering typical questions about review processes. For further research, the extension allows the review processes in inner source projects to be measured.

Keywords: Software engineering, mining software repositories, inner source, patch-flow

PDFs: Master Thesis, Work Description

Reference: Johannes Pfann. Measuring the Patch Review Process in Open and Inner Source. Master Thesis, Friedrich-Alexander-Universität Erlangen-Nürnberg: 2017.

Final Thesis: Bewertung von Fehlerfreiheit und Vollständigkeit gemessener Patch-Flow Daten

Abstract: Patch-flow analysis offers companies the possibility to analyze the collaboration of their own organizational units with company-internal reference sources. Due to the diversity of the required data sources, data can sometimes only be collected by hand. Monitoring of completeness and accuracy has not been established so far. This thesis investigates which characteristics of data quality are of interest and how manually collected data influences completeness and accuracy. A goal-question-metric model is developed in order to evaluate patch-flow data with regard to completeness and accuracy. On the basis of precisely measured values, the model is evaluated and discussed.

Keywords: Inner source, patch-flow, data quality

PDFs: Work Description

Reference: Jörn Rechenburg. Bewertung von Fehlerfreiheit und Vollständigkeit gemessener Patch-Flow Daten. Bachelor Thesis, Friedrich-Alexander-Universität Erlangen-Nürnberg: 2016.

Final Thesis: A Tool for Visualizing Patch-Flow

Abstract: Inner source is the use of open source software development practices and the establishment of an open-source-like culture within an organization, which helps to improve code reuse and share knowledge where product line engineering fails (Capraro et al. 2016). We measure inner source collaboration by measuring the code contributions (Patch-Flow) across project boundaries or between organizational units. The management of software development organizations that want to adopt the inner source strategy in the enterprise needs a tool for visualizing Patch-Flow that helps it analyze the collaborative process and make the software development process within companies more effective. Currently, no product on the market meets this demand. This thesis develops the software design and implementation of a tool that represents the various Patch-Flow-based metrics for quantifying code-level collaboration and the participants’ stake in it. The presented tool is the first product that allows quantitative visualization of Patch-Flow. It enables managers to evaluate and make decisions about code-level collaboration and supports them in the inner source context.

Keywords: Inner source, open source, patch-flow, collaboration, software metrics

PDFs: Bachelor Thesis, Work Description

Reference: Oleksandr Iefimenko. A Tool for Visualizing Patch-Flow. Bachelor Thesis, Friedrich-Alexander-Universität Erlangen-Nürnberg: 2016.

Full House for FLOSS – Our Course on Free/Libre, and Open Source Software

For a long time, we have been planning to hold our course on Free/Libre, and Open Source Software (FLOSS). Today, we kicked off FLOSS with the first lecture in a series of eleven. We were thrilled by the number of interested students. The lecture hall was full to bursting.

If you want to learn more, please check out our FLOSS course or join us on StudOn.

Final Thesis: Collaboration Networks in Open Source Projects’ Issue Trackers

Abstract: Open source software is increasingly used outside its own ecosystem. It is used by governmental departments and companies, for commercial projects, and as part of critical infrastructure such as encryption libraries. This usually makes it necessary to increase collaboration with open source communities, but also to assess new risks that arise from using software maintained mainly by volunteers. To achieve this, we need a better understanding of open source communities: how they organize and structure themselves, whether they depend on key personalities, and whether they form subcommunities. How do such communities change their behavior over time and during growth, and which form of structure is most efficient for handling reported issues? We collect issue tracker data and use it to create sequences of social networks for 6007 projects on GitHub.com and for 120 projects from the Apache Software Foundation. Based on metrics that quantify the strength of subcommunities and centralization, we study the general structure of open source communities, how these metrics correlate, and how the communities behave over time. We compare the communities using an efficiency metric to gain information about preferable structures. Our results show that most open source communities avoid organizing themselves in subcommunities, while they are highly dependent on a few key personalities. The results of both metrics do indeed correlate, which means that a community with strongly distinct groups is unlikely to be highly centralized. But neither the few projects organized in subcommunities nor any other organizational type shows significantly higher efficiency. Although projects grow over time, they show only little internal structural change.
The stability of open source communities and the strong avoidance of subcommunities are unexpected results, since they strongly contradict recent related studies; more research is therefore required to better predict the development of projects. For the assessment of projects, the dependency of stability on strong key members may prove helpful for indicating and analyzing major changes.

Keywords: Open Source, Issue Tracker, Bug Tracker, Social Network Analysis

PDFs: Master Thesis, Work Description

Reference: Björn Meier. Collaboration Networks in Open Source Projects’ Issue Trackers. Master Thesis, Friedrich-Alexander-Universität Erlangen-Nürnberg: 2015.

Final Thesis: Measuring Patch-Flow at Google

Abstract: In the industrial domain, software development is highly collaborative work involving different contributing teams. But there is not yet a way to quantify the collaboration between organizational units within a software-developing company. Information about this collaboration is latent in software repositories, however, and has not been defined yet.

We mined Google’s software repository and identified all commits that are assigned to projects of organizational units the patch author does not belong to. We call this phenomenon of collaboration beyond organizational borders patch-flow. This work introduces a graph-based metric to quantify this patch-flow. We developed a tool that is able to crawl Google’s repository and collected patches of 2,500 Google developers in the years 2007, 2009, 2011, and 2013. Because historical information about the organizational unit membership of developers is missing, we provided a clustering approach to assign all developers to orgunits. Because the Google-internal data has not been released so far, we crawled and analyzed the Chromium project.

Using the Chromium data, we were able to apply the patch-flow metric and quantify collaboration across organizational unit boundaries, although the data source used is suitable only to a limited extent. The clustering approach still has to be validated.

Keywords: collaboration, mining software repositories, google, orgunit, patch, flow

PDFs: Work Description

Reference: Michael Dorner. Measuring Patch-Flow at Google. Master Thesis, Friedrich-Alexander-Universität Erlangen-Nürnberg: 2015.
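
The patch-flow definition in the abstract above (commits assigned to projects of organizational units the patch author does not belong to) can be illustrated with a minimal sketch. This is not the metric or tool from the thesis: the `Commit` type, the `patch_flow` function, the two lookup tables, and all example data are hypothetical, and the sketch simply counts boundary-crossing commits as weighted edges of a directed graph between organizational units.

```python
from collections import Counter
from typing import NamedTuple


class Commit(NamedTuple):
    """Hypothetical minimal commit record: who wrote it, where it landed."""
    author: str
    project: str


def patch_flow(commits, author_unit, project_unit):
    """Count commits per (author's unit, receiving project's unit) pair,
    keeping only commits that cross an organizational unit boundary."""
    flow = Counter()
    for commit in commits:
        source = author_unit[commit.author]
        target = project_unit[commit.project]
        if source != target:
            flow[(source, target)] += 1
    return flow


# Hypothetical example data, invented for illustration only.
author_unit = {"alice": "unit-a", "bob": "unit-b"}
project_unit = {"search": "unit-a", "maps": "unit-b"}
commits = [
    Commit("alice", "search"),  # stays inside unit-a, not patch-flow
    Commit("alice", "maps"),    # unit-a -> unit-b
    Commit("bob", "search"),    # unit-b -> unit-a
    Commit("bob", "search"),    # unit-b -> unit-a
]

flow = patch_flow(commits, author_unit, project_unit)
for (source, target), count in sorted(flow.items()):
    print(f"{source} -> {target}: {count}")
```

Each key of `flow` is a directed edge between two organizational units and each value is the number of patches that flowed along it. The sketch assumes unit membership is already known, whereas the thesis had to infer it with a clustering approach.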

Final Thesis: Measuring Patch-Flow on Github

Abstract: For Open Source Software (OS) projects, collaboration is a key to success, as less collaboration between projects leads to projects with less progress. Patches from other OS projects provide the projects with higher code quality or more functionality. In the literature, several papers examine the extent of collaboration in OS projects. Yet most of these studies do not cover the collaboration between different projects. To understand the collaboration between OS projects, Source Code Management (SCM) repositories are an essential source. Repositories are connected by patches, which can be identified by mining the projects’ repositories. Measuring this connection is very difficult, because the information about where patches go and where they come from is not stored within a repository. Collaboration between OS projects can be expressed as so-called Patch Flow. As an example of the OS world, I use GitHub.com as a data source. I show to what extent Patch Flow exists between repositories and which circumstances influence Patch Flow. Further, I introduce a model that represents Patch Flow in detail. Based on this model, I developed a crawler to collect data from the GitHub.com repositories. The analysis of the gathered data shows that Patch Flow between OS projects exists. The numbers suggest that collaboration among projects is common in OS projects.

Keywords: Measuring collaboration, mining software repositories, software analytics, open source

PDFs: Master Thesis, Work Description

Reference: Manuel Frederic Zerpies. Measuring Patch-Flow on Github. Master Thesis, Friedrich-Alexander-Universität Erlangen-Nürnberg: 2015.