By Dirk Riehle, on May 23rd, 2011
Abstract: Wikipedia aims to create a globally available, free source of information in the form of an online encyclopedia. Volunteers from around the world collaboratively create and categorize new articles and review, update, and improve existing ones. These changes also require revising other content to keep Wikipedia consistent. Given Wikipedia’s size, finding the affected passages is a particular challenge for authors. The system developed in this thesis can automatically search Wikipedia for similar articles and sections and, building on that, compile category suggestions. Based on full-text search, it scales to the entire textual content of Wikipedia and delivers results within a very short time. It thus improves on other approaches, which are either limited to Wikipedia’s link structure or were tested only on a subset of Wikipedia. The ability to work at the section level is new. The system is evaluated qualitatively using the Wikipedia categories.
PDFs: Final thesis (in German)
Reference: Guido Leisker. Abschnittsbasierte Textklassifikation in der Wikipedia. Magisterarbeit, Friedrich-Alexander University of Erlangen-Nürnberg: 2011.
By Dirk Riehle, on May 2nd, 2011
Abstract: When using mailing lists as a collaboration tool, (open source) software developers follow various usage patterns. To improve the efficiency of open source collaboration, this thesis identifies these existing patterns by analyzing the mailing lists of popular open source projects and then proposes an annotation schema to codify them. It also implements a mailing list archiver application that applies these codifications when handling email messages, providing tool support for the improvement.
Keywords: Open Source Software Development, Collaboration, Mailing List, Conversation Action, Usage Pattern, Email Message, JavaMail API, Google Web Toolkit (GWT), Hibernate, PostgreSQL
PDFs: Final thesis (in English), original work description
Reference: Ke Chang. Open Source Collaboration Codified. Diplomarbeit, Friedrich-Alexander University of Erlangen-Nürnberg: 2011.
By Dirk Riehle, on May 1st, 2011
We are happy to announce the general availability of the first public release of the Sweble Wikitext parser, available from http://sweble.org.
The Sweble Wikitext parser
- can parse all complex Wikitext, incl. tables and templates
- produces a real abstract syntax tree (AST); a DOM will follow soon
- is open source made available under the Apache Software License 2.0
- is written in Java utilizing only permissively licensed libraries
You can find all relevant information and code at http://sweble.org – this also includes demos, in particular the CrystalBall demo, which lets you query a Wikipedia snapshot using XQuery. (The underlying storage mechanism is not particularly well-performing, so you may have to wait a little if load is high.)
Continue reading Announcing the Open Source Sweble Wikitext Parser v1.0
By Dirk Riehle, on January 3rd, 2011
Table of Contents
- Year-end Summary
- Mini Symposium
- More Information
1. Year-end Summary
The Open Source Research (OSR) Group was founded in Sept 2009, so it has been 16 months since inception. We hope to be writing a year-end summary every year, available to anyone interested. FAU is the university, CS is the computer science department, “we” is the group, and “I” is Dirk Riehle.
Continue reading 2010 Year-End Letter to Stakeholders
By Dirk Riehle, on December 30th, 2010
Some upcoming research conferences with submission deadlines in 2011.
Software Engineering
Collaborative Work
By Dirk Riehle, on August 2nd, 2010
Summary: Continuous deployment is the name of an engineering practice where a commit to a project’s code repository is put into production without any intermediate human intervention. It is the next step after continuous integration and it is all the rage in agile methods circles and for web applications. This (Studien/Diplom/Bachelor/Master) thesis reviews the current practice of continuous deployment and applies it to the Open Source Research Group’s software projects.
Read more on /fun or contact Prof. Riehle
By Dirk Riehle, on June 22nd, 2010
The Bavaria California Technology Center has awarded our group some funds to move forward with our “Reengineering Wikitext” project. Below please find a short research project summary (in German).
Continue reading Funding for Research Project “Reengineering Wikitext”
By Dirk Riehle, on June 8th, 2010
The German Ph.D. System Explained in One Page
In Germany, today (2010), primary and secondary education takes 12 school years and, when finished successfully, allows you to enter college. College provides three main degrees you can achieve: a Bachelor, a Master, and a Ph.D. At the University of Erlangen, the Bachelor degree typically takes three years and the Master degree takes an additional two years. You can achieve them only one after another, and they replace what used to be the German “Diplom.”
The Ph.D. title is also called the doctor title and, upon successful completion, you’ll be allowed to carry the prefix “Dr.” in front of your name (rather than a trailing “Ph.D.”). If you see a “Dr.” in front of a name, it does not imply a medical degree but could be a degree in any science. My department, the technical faculty of the University of Erlangen, provides a traditional “Dr. Ing.”, which is a doctor of engineering.
Continue reading The German Ph.D. System Explained in One Post
By Dirk Riehle, on March 26th, 2010
Please join us for the Erlangen Computer Science Day (Tag der Informatik). More information (in German) can be found on the Tag der Informatik website. Prof. Riehle will give a main lecture, his “Antrittsvorlesung” (inaugural lecture), that day.
Continue reading Erlangen Computer Science Day / Tag der Informatik
By Dirk Riehle, on March 5th, 2010
Update 2010-03-31: The company in question is WeWebU; read more about its open source strategy in their recent press release. Also, the company would like to hire the person described below directly, as an employee. A potential dissertation has thus become a separate matter. If you are interested, please send your resume directly to the CEO of WeWebU, Stefan Waldhauser, at stefan.waldhauser@wewebu.de.
Continue reading Dissertation on “Going Open Source” Case Study