Suzan Verberne at the eHumanities workshop

Posted on 11-06-2013 by Maarten Marx | Political Mashup, results

Suzan Verberne presents her work on learning to classify election manifestos at the Soeterbeeck eHumanities workshop on 13 and 14 June 2013.



Isaac Lipschits (1930–2008) was a Dutch historian and political scientist. One of his works is an annotated collection of election manifestos (party programmes) for the Dutch elections between 1977 and 1998 (Lipschits, 1977). For each election year he compiled a book with the manifestos published by all parties that participated in that year’s elections. Lipschits manually labelled the manifestos with themes: he segmented the manifestos into coherent text fragments, numbered them, and added an index of themes in the back of the book referring to these text numbers.
In the Political Mashup project (Marx, 2009), Dutch political data from 1814 onwards is being digitized and indexed. The data are not only digitized and integrated but also disclosed to the public. The aims of the work presented in the current paper are: (1) to digitize the 1977–1998 Lipschits collections and (2) to build an automatic classifier for more recent, unclassified election manifestos. The starting points for our work are the Lipschits books, scanned as PDF files.
We took the following approach. We first converted the scanned PDFs to XML data in which each text fragment is annotated with the Lipschits themes. We then used these data from the 1980s and 1990s to train a classifier suited to classifying election manifestos from 2002 onwards. We evaluated the results by having a domain expert manually judge a sample of the classified data.
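The train-on-older, classify-newer step described above can be illustrated with a small sketch. The abstract does not say which learning algorithm or features were used, so this is purely an assumption: a multinomial naive Bayes classifier over word counts, written in plain Python. The function names (`tokenize`, `train`, `classify`), the example fragments, and the theme labels are all hypothetical, and the sketch assigns a single theme per fragment, whereas fragments in the Lipschits index may well carry several themes.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Naive whitespace tokenizer; a real pipeline would need proper Dutch tokenization.
    return text.lower().split()


def train(fragments):
    """Collect counts from (text, theme) pairs, e.g. labelled older manifestos."""
    theme_counts = Counter()            # number of fragments per theme
    word_counts = defaultdict(Counter)  # word frequencies per theme
    vocab = set()
    for text, theme in fragments:
        theme_counts[theme] += 1
        for word in tokenize(text):
            word_counts[theme][word] += 1
            vocab.add(word)
    return theme_counts, word_counts, vocab


def classify(model, text):
    """Return the most probable theme for an unlabelled fragment."""
    theme_counts, word_counts, vocab = model
    total = sum(theme_counts.values())
    best_theme, best_score = None, float("-inf")
    for theme, n in theme_counts.items():
        score = math.log(n / total)  # log prior
        denom = sum(word_counts[theme].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words unseen in training are ignored
                # Add-one (Laplace) smoothing
                score += math.log((word_counts[theme][word] + 1) / denom)
        if score > best_score:
            best_theme, best_score = theme, score
    return best_theme


# Hypothetical labelled fragments standing in for the annotated older manifestos.
training = [
    ("lower taxes for small businesses", "economy"),
    ("invest in jobs and economic growth", "economy"),
    ("more teachers and smaller school classes", "education"),
    ("improve schools and university funding", "education"),
]
model = train(training)

# Hypothetical fragment standing in for a more recent, unlabelled manifesto.
predicted = classify(model, "funding for schools and teachers")
```

In this setup the labelled fragments play the role of the digitized Lipschits data, and `classify` is applied to fragments from later manifestos. The predicted themes would then be spot-checked by a domain expert, as described above.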


