Political Phrase Viewer

Purpose

Show how the salience of phrases spoken in Parliament changes:

  • over time
  • across the political spectrum

Applications

Comparative diachronic research (history, social science, linguistics).

  • Agenda setting
  • Framing
  • Language change

Dimensionality reduction

Four dimensions:

  • time
  • political spectrum
  • salience
  • phrases

Data sets

Complete proceedings of the Dutch Parliament, 1814-2012.

Converted to uniform XML from different sources and formats (scans, HTML, XML).

Structural elements

Data element       Count
Proceedings       50,071
Topics           101,127
Speeches       2,457,664
Paragraphs    11,851,653

Size of the vocabulary

Phrases   Occurring more than once             All
1-gram                     992,291       2,773,826
2-gram                  12,852,501      38,811,679
3-gram                  38,648,440     170,314,738
4-gram                  48,621,948     358,360,166
5-gram                  36,838,184     498,848,849
6-gram                  22,737,318     573,197,917
7-gram                  13,655,460     606,867,133
Total                  174,346,142   2,249,174,308
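
As an illustration of how vocabulary sizes like these can be computed, the sketch below counts distinct n-grams in a toy corpus. It assumes both columns count distinct phrases (with and without hapaxes) and that tokens are split on whitespace. The function names and the two sample sentences are hypothetical and are not taken from the project.

```python
from collections import Counter


def ngrams(tokens, n):
    """Yield every contiguous n-word phrase in a token list."""
    for i in range(len(tokens) - n + 1):
        yield tuple(tokens[i:i + n])


def vocabulary_sizes(paragraphs, max_n=7):
    """Return {n: (distinct phrases occurring more than once, all distinct phrases)}."""
    sizes = {}
    for n in range(1, max_n + 1):
        counts = Counter()
        for text in paragraphs:
            # Assumption: lowercase + whitespace split stands in for real tokenisation.
            counts.update(ngrams(text.lower().split(), n))
        repeated = sum(1 for c in counts.values() if c > 1)
        sizes[n] = (repeated, len(counts))
    return sizes


# Hypothetical two-paragraph corpus, for illustration only.
toy = ["de minister zegt ja", "de minister zegt nee"]
sizes = vocabulary_sizes(toy, max_n=3)
for n, (repeated, total) in sizes.items():
    print(f"{n}-gram  repeated={repeated}  all={total}")
```

On a corpus of this size a single in-memory counter is enough; for the full proceedings the counting would have to be done out of core or inside the index itself.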

Metadata

For every phrase, we know:

  • when it was spoken
  • by whom, in what function, on behalf of what party
  • to whom
  • in which context

Search engine technique

  • Inverted index from phrases of at most 7 words to triples (date, speaker, phrase frequency)
  • Stored in Lucene
  • In the current visualization, data are aggregated to the year-party level
  • Every n-gram is stored in the index, including hapaxes (phrases that occur only once)
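
The index structure and the year-party aggregation described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's implementation: a plain Python dictionary stands in for Lucene, tokens are split on whitespace, and the speaker names, party names, speeches, and the `party_of` lookup are all hypothetical.

```python
from collections import defaultdict
from datetime import date

MAX_N = 7  # phrases of at most 7 words, as in the list above


def phrases(tokens, max_n=MAX_N):
    """Yield every n-gram (1 <= n <= max_n) as a space-joined string."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def build_index(speeches):
    """Map each phrase to a list of (date, speaker, frequency) triples."""
    index = defaultdict(list)
    for day, speaker, text in speeches:
        counts = defaultdict(int)
        for phrase in phrases(text.lower().split()):
            counts[phrase] += 1
        # Every phrase is stored, including those that occur only once.
        for phrase, freq in counts.items():
            index[phrase].append((day, speaker, freq))
    return index


def aggregate_year_party(postings, party_of):
    """Sum phrase frequencies per (year, party) pair."""
    totals = defaultdict(int)
    for day, speaker, freq in postings:
        totals[(day.year, party_of[speaker])] += freq
    return dict(totals)


# Hypothetical speeches and speaker-to-party lookup, for illustration only.
speeches = [
    (date(2010, 3, 1), "speaker_a", "de vrije markt werkt"),
    (date(2010, 9, 7), "speaker_b", "de vrije markt faalt"),
    (date(2011, 2, 2), "speaker_a", "vrije markt vrije markt"),
]
party_of = {"speaker_a": "party_x", "speaker_b": "party_y"}

index = build_index(speeches)
result = aggregate_year_party(index["vrije markt"], party_of)
print(result)
```

Keeping the triples at speech level and aggregating only at query time leaves room for other groupings later, for example by month or by individual speaker.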


Last modified on 17 April 2013 by Maarten Marx
