Job opening at KRDB centre, Bolzano, Italy

Posted on 28-04-2014 by Maarten Marx | Uncategorized

The Faculty of Computer Science in Bozen-Bolzano (Italy) has two openings at the level of research fellow / assistant professor (RTD-a), associated with the KRDB research centre.

PoliticalMashup and Parliamentary Data

Posted on 16-04-2014 by Maarten Marx | DiLiPaD, parliament

The PoliticalMashup project at the University of Amsterdam started collecting parliamentary proceedings in 2008. We started with Dutch proceedings and have since moved to collecting proceedings from other European states as well.
Our aim is to transform all proceedings into a rich common XML format and store them in a single XML database system. With this database system we facilitate comparative diachronic research for historians, political scientists, linguists and communication scientists.
Currently we collect data from The Netherlands, the UK, Flanders, Germany, Denmark, Sweden and Norway.
The main problem in keeping the collection up to date and going back as far as possible is changing data formats. The content and layout of the proceedings are in general very stable over time, but the technical formats vary considerably, especially in the “digital era” (starting roughly around 1995). For older material, OCR errors and badly placed scans are major challenges. Besides these challenges with the texts, much work is needed to recognize, disambiguate and link political entities (speakers, parties, constituencies, ministerial functions, etc.) to existing databases.

Aims

After consultation with a panel of users consisting of scientists, journalists and archivists, we decided to focus on the following aims:

  • create a complete copy of the proceedings of the meetings in parliament;
  • add metadata that records, for each word spoken in parliament, when it was said, who said it, in what role, on behalf of which party, and in which context. If possible, also indicate the type of speech act (e.g., speech from the central lectern, interruption of a speech, shout from the benches, etc.);
  • give each entity a unique identifier which is resolvable by a Handle system comparable to the DOI system; do this for real entities (persons, parties) and textual objects (proceedings, topics, speeches, paragraphs, votes, etc);
  • use these identifiers to link data to existing databases and link the parliamentary data to the Linked Open Data Cloud.
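
The metadata aims above can be pictured as a small record type. The following is only an illustrative sketch in Python, not the project's actual XML schema: the class name, every field name and the identifier strings are hypothetical, chosen to mirror the list (when, who, role, party, context, speech act).

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Speech:
    """One contiguous stretch of speech; every word in `text` inherits
    the metadata of the record. All field names are hypothetical."""
    speech_id: str             # identifier of the textual object
    speaker_id: str            # identifier of the person speaking
    role: str                  # e.g. "mp", "minister", "chair"
    party_id: Optional[str]    # None if the speaker represents no party
    said_on: date              # date of the meeting
    context_id: str            # identifier of the topic or debate
    speech_act: Optional[str]  # e.g. "speech", "interruption"; None if unknown
    text: str


def words_with_metadata(speech: Speech):
    """Yield (word, speech) pairs, so each word can be traced to its metadata."""
    for word in speech.text.split():
        yield word, speech


# Placeholder identifiers, made up for this example only.
example = Speech(
    speech_id="example:speech-1",
    speaker_id="example:person-1",
    role="mp",
    party_id="example:party-1",
    said_on=date(2014, 4, 15),
    context_id="example:topic-1",
    speech_act="interruption",
    text="Voorzitter, een korte vraag.",
)
```

Keeping the metadata at the level of a speech rather than per word avoids repeating the same fields for every token, while still letting each word be resolved to its speaker, date and context.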

Available tools

Proceedings of the UK and the Netherlands are actively collected “until yesterday”. The collections start in 1935 (UK) and 1814 (NL) respectively. They can be downloaded and accessed through a Search Interface.

Cooperation

DiLiPaD’s Dutch principal investigators Jaap Kamps and Maarten Marx collaborate with the Information Office of the Dutch House of Commons, the Dutch Royal Library, the Dutch National Archive, the Dutch Documentation Centre for Political Parties, and scientists from the Humanities, Social and Computer Sciences.

Linking Hansards to related news articles

Posted on 15-04-2014 by Maarten Marx | DiLiPaD, ExPoSe, ODE, parliament

We describe a simple technique with which to link news articles to debates in Parliament.
The technique uses the news search engine EMM Newsexplorer.
As search strings we use

  • the date of the debate
  • the speakers
  • the first ten words from a unigram parsimonious language model created from the debate
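
As a rough illustration of the three ingredients above, here is a minimal Python sketch. It is not the code from the post: it assumes the standard EM formulation of a parsimonious language model, a background collection given as plain term counts, and a query formed by simply joining the parts with spaces. The function names, the mixing weight `lam` and the toy data are all assumptions, and no call to EMM Newsexplorer is made.

```python
from collections import Counter


def parsimonious_lm(doc_tokens, background_counts, lam=0.1, iterations=50):
    """Estimate a unigram parsimonious language model P(t|D) with EM.

    lam is the weight of the document model; (1 - lam) goes to the
    background model. Both the default value and the fixed number of
    iterations are assumptions made for this sketch.
    """
    tf = Counter(doc_tokens)
    bg_total = sum(background_counts.values()) or 1
    p_bg = {t: background_counts.get(t, 0) / bg_total for t in tf}
    total = sum(tf.values())
    p_doc = {t: c / total for t, c in tf.items()}  # maximum-likelihood start
    for _ in range(iterations):
        # E-step: expected count of each term attributed to the document model
        expected = {}
        for t, c in tf.items():
            num = lam * p_doc[t]
            den = (1 - lam) * p_bg[t] + num
            expected[t] = c * num / den if den > 0 else 0.0
        # M-step: renormalise the expected counts into probabilities
        norm = sum(expected.values())
        if norm == 0:
            break
        p_doc = {t: v / norm for t, v in expected.items()}
    return p_doc


def build_query(debate_date, speakers, p_doc, k=10):
    """Join date, speaker names and the k most probable terms into one string."""
    ranked = sorted(p_doc, key=lambda t: (-p_doc[t], t))  # ties: alphabetical
    return " ".join([debate_date, *speakers, *ranked[:k]])


# Toy data, made up for illustration only.
background = {"de": 1000, "het": 800, "vraag": 30, "minister": 20, "pensioen": 5}
debate = ["de", "de", "de", "het", "pensioen", "pensioen", "minister"]
model = parsimonious_lm(debate, background)
query = build_query("2014-04-15", ["Jansen"], model, k=2)
```

The point of the parsimonious model is that words common in the background collection (here "de" and "het") lose probability mass to words that are specific to the debate, so the top-ranked terms make more selective search terms than raw frequencies would.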

Results on oral questions are promising. In this post we explain how we find the relevant news articles and evaluate the results. Code is provided.

War in Parliament: What a Digital Approach Can Add to the Study of Parliamentary History

Posted on 15-04-2014 by Maarten Marx | results

The article War in Parliament: What a Digital Approach Can Add to the Study of Parliamentary History by Hinke Piersma, Ismee Tames (both NIOD), Lars Buitinck, Johan van Doornik and Maarten Marx (all Informatics Institute, UvA) has been published in Digital Humanities Quarterly.

Learning to classify election manifestos

Posted on 15-04-2014 by Maarten Marx | DiLiPaD, Political Mashup, results

The article Automatic thematic classification of election manifestos by Suzan Verberne, Eva D’hondt, Antal van den Bosch and Maarten Marx has been published in Information Processing & Management (Volume 50, Issue 4, July 2014, Pages 554–567).

DiLiPaD on Twitter

Posted on 04-04-2014 by Maarten Marx | DiLiPaD

The DiLiPaD project is on Twitter, https://twitter.com/parl_data.
DiLiPaD blogs at http://dilipad.history.ac.uk/.

UK Hansards in PoliticalMashup format

Posted on 03-04-2014 by Maarten Marx | data, DiLiPaD, parliament, Political Mashup

Debates of the House of Lords and House of Commons from 1935 until “yesterday” are available in the XML format developed within the PoliticalMashup project. The debates are available as one dump of XML files and through a rudimentary search interface.
All debates are available in XML, RDF and HTML formats, via a simple parameter:
