Author Archives: hannes

HD-Diff released as part of Sweble 2.0

HD-Diff is a tree-based algorithm for computing the differences between two documents. The algorithm was presented in a paper at the DocEng 2014 conference. Unlike other tree-based differencing algorithms, HD-Diff can look into text nodes, splits them when necessary and …


Fine-grained Change Detection in Structured Text Documents (DocEng 2014)

Abstract: Detecting and understanding changes between document revisions is an important task. The acquired knowledge can be used to classify the nature of a new document revision or to support a human editor in the review process. While purely textual change …


Sweble 2.0 released!

Two years after our first public release of the Google-Sponsored Sweble 2.0 Alpha, we are happy to announce the release of Sweble 2.0! The most important innovation in the alpha release was the introduction of the engine component, which allowed …

Posted in Wikitext Parser

Design and Implementation of Wiki Content Transformations and Refactorings

Abstract: The organic growth of wikis requires constant attention by contributors who are willing to patrol the wiki and improve its content structure. However, most wikis still only offer textual editing, and even wikis that offer WYSIWYG editing do not …


Sweble on GitHub and Ohloh

The Sweble Project can now be found on GitHub and Ohloh. The GitHub repositories mirror the primary repositories hosted on our servers. Commits pushed to our repositories will be pushed to GitHub after a short delay. Please visit us on …


Google-Sponsored Sweble 2.0 Alpha Released

We released an early 2.0 (alpha) version of the Sweble Wikitext parser and related libraries in our Git repository and as Maven artifacts. The Sweble Wikitext parser aims to provide a MediaWiki-compliant Wikitext parser implementation in Java. This includes full MediaWiki template expansion but does not …


Sweble 1.1.0 released

Sweble 1.1.0 fixes some bugs and introduces a couple of new features and modules. For a full list of changes, please refer to the changes reports of the individual modules. The release can be found on Maven Central. JARs with dependencies will soon be available …


Sweble is available on Maven Central

We are finally deploying releases of Sweble and related software to Maven Central. This has many advantages for users of our software, among others: You don’t have to refer to our Maven repositories any more in your own POMs (if …


Design and Implementation of the Sweble Wikitext Parser: Unlocking the Structured Data of Wikipedia

We will be presenting our paper on the design and implementation of the Sweble Wikitext Parser at the WikiSym 2011 conference! The conference will take place in Mountain View, CA, in October. For those of you who want to take a …


WOM: An object model for MediaWiki’s Wikitext

Wikipedia is a rich encyclopedia that is of great use not only to its contributors and readers but also to researchers and providers of third-party software around Wikipedia. However, Wikipedia’s content is only available as Wikitext, the markup language …
