Paper published at the ACM Hypertext 2011 conference.
Abstract
This paper addresses the following research aim: provide a useful but succinct summary of long narrative events involving the interaction of several speakers. The summary should enable users to navigate to specific parts of the event using hyperlinks.
Our solution is based on a representation of the main actors of the event and their interactions as a social network. The solution is applicable to events in which these interactions are more or less formally structured and detectable. This includes theatre and radio plays, recordings of a scientific workshop, proceedings of parliament and meeting notes in general.
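To make the idea of a social network of speakers concrete, here is a minimal sketch, not the method from the paper. It assumes the event is already available as a list of speaker names in speaking order (a hypothetical input format) and counts how often one speaker takes the floor directly after another. The function name `interaction_network` is made up for this illustration.

```python
from collections import Counter

def interaction_network(turns):
    """Count directed speaker-to-speaker transitions in a sequence of turns.

    `turns` is a list of speaker names in speaking order (an assumed input
    format). An edge (a, b) with weight n means that b took the floor
    directly after a on n occasions.
    """
    edges = Counter()
    for previous, current in zip(turns, turns[1:]):
        if previous != current:  # ignore a speaker continuing their own turn
            edges[(previous, current)] += 1
    return edges

# Invented example data, for illustration only.
turns = ["Chair", "Alice", "Bob", "Alice", "Bob", "Chair", "Carol"]
network = interaction_network(turns)
for (source, target), weight in sorted(network.items()):
    print(f"{source} -> {target}: {weight}")
```

The weighted edges can then be drawn as a graph, with each edge linking back to the turns it was counted from.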
Reference
Bart de Goede, Maarten Marx, Arjan Nusselder, and Justin van Wees. 2011. Succinct summaries of narrative events using social networks. In Proceedings of the 22nd ACM conference on Hypertext and hypermedia (HT ‘11). ACM, New York, NY, USA, 299-304. DOI=10.1145/1995966.1996005 http://doi.acm.org/10.1145/1995966.1996005
The interruption graph that PoliticalMashup makes every year of the Algemene Beschouwingen (the annual general political debate) has been picked up by several media outlets. Trouw and NRC published it on Saturday 25 September. Marco Visser of Trouw wrote a clarifying analysis to go with it. The political weblog Sargasso.nl had the finest interruption graph, made by PoliticalMashup and designed and programmed by the makers behind politiekinzicht.com. This is an interactive visualisation with a number of extras on top of the interruption graph itself:
This visualisation is based on the work of Kaptein, Marx and Kamps [SIGIR 2009] and was made by Jurrian Tromp, Reinier van der Plank and Thomas Moeskops.
How the data are processed is explained in the blog post about the interruption graph.
The data were collected using software developed in the PoliticalMashup project of the University of Amsterdam.
The following people contributed to it: Lars Buitinck, Johan van Doornik, Bart de Goede, Steven Grijzenhout, Maarten Marx, Rob Mokken, Arjan Nusselder, Anne Schut and Justin van Wees.
The thesis of Gilles den Hollander was turned into a paper which was presented at CSSE 2011.
In this study parsimonious language models were used to construct word clouds of the proceedings of the European Parliament. Multiple design choices had to be made and are discussed. Important features are stemming during tokenization, including bigrams in the word cloud and multilingualism. Also, the original parsimonious language models were extended with an additional term dampening unigrams that already occurred in the word cloud. This algorithm was tested in a small user study, using proceedings of the University of Amsterdam Science faculty’s student council. Members of this council had to give their preference for multiple word clouds constructed using either parsimonious language models or simple Term Frequencies (TF) with stop words. 68% over 29% (p < 0.05, two-tailed paired t-test) preferred the word clouds constructed using parsimonious language models. Besides the system design, further technical findings, the social significance of applying word clouds to political data and possibilities for future work are discussed.
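For readers unfamiliar with parsimonious language models, the sketch below shows the basic EM estimation in its usual textbook form. It is an illustration under assumptions, not the extended algorithm from the paper: it leaves out stemming, bigrams and the extra dampening term. The function name, the mixing weight `lam`, the pruning `threshold` and the toy data are all chosen for this example.

```python
from collections import Counter

def parsimonious_model(doc_tokens, corpus_tokens, lam=0.1, iterations=20, threshold=1e-4):
    """Estimate P(t|D) so that terms well explained by the corpus lose mass.

    Assumes `corpus_tokens` includes the document itself. `lam` and
    `threshold` are illustrative defaults, not values from the paper.
    """
    tf = Counter(doc_tokens)
    corpus_tf = Counter(corpus_tokens)
    corpus_total = sum(corpus_tf.values())
    # Background model P(t|C), only needed for terms in the document.
    p_corpus = {t: corpus_tf[t] / corpus_total for t in tf}
    # Start from the maximum likelihood estimate of the document model.
    doc_total = sum(tf.values())
    p_doc = {t: n / doc_total for t, n in tf.items()}
    for _ in range(iterations):
        # E-step: expected count of each term attributed to the document model.
        expected = {
            t: tf[t] * lam * p / ((1 - lam) * p_corpus[t] + lam * p)
            for t, p in p_doc.items()
        }
        total = sum(expected.values())
        # M-step: renormalise and drop terms whose probability became tiny.
        p_doc = {t: e / total for t, e in expected.items() if e / total >= threshold}
    norm = sum(p_doc.values())
    return {t: p / norm for t, p in p_doc.items()}

# Invented toy data: "the" and "of" are frequent in the background corpus.
doc = "budget budget budget the the of europe".split()
corpus = doc + ["the"] * 50 + ["of"] * 40 + ["europe"] * 2 + ["budget"]
model = parsimonious_model(doc, corpus)
for term, prob in sorted(model.items(), key=lambda item: -item[1]):
    print(f"{term}: {prob:.3f}")
```

Terms that survive with a high probability are the candidates for the word cloud; common words are pushed out without needing a stop word list.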
A paper on the quality of the XML files found on the web will be published in the proceedings of the 2011 ACM Conference on Information and Knowledge Management (CIKM).
Abstract
We collect evidence to answer the following question: Is the quality of the XML documents found on the web sufficient to apply XML technology like XQuery, XPath and XSLT? XML collections from the web have been previously studied statistically, but no detailed information about the quality of the XML documents on the web is available to date. We address this shortcoming in this study. We gathered 180K XML documents from the web. Their quality is surprisingly good; 85.4% is well-formed and 99.5% of all specified encodings is correct. Validity needs serious attention. Only 25% of all files contain a reference to a DTD or XSD, of which just one third is actually valid. Errors are studied in detail. Automatic error repair seems promising. Our study is well documented and easily repeatable. This paves the way for a periodic quality assessment of the XML web.
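As an illustration of what a well-formedness check involves, here is a small sketch using the parser from Python's standard library. It is not the tooling used in the study, and it only tests well-formedness, not validity against a DTD or XSD. The sample documents are invented for this example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(document: bytes) -> bool:
    """Return True if the parser accepts the document, False on a parse error."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True

# Invented sample documents, for illustration only.
samples = [
    b"<a><b/></a>",                                            # well-formed
    b"<?xml version='1.0' encoding='UTF-8'?><root attr='1'/>",  # well-formed
    b"<a><b></a>",                                             # mismatched tag
    b"<a>text",                                                # unclosed element
]
results = [is_well_formed(sample) for sample in samples]
share = sum(results) / len(results)
print(f"{share:.1%} of the samples are well-formed")
```

Run over a crawled collection, the same loop gives the kind of percentage reported in the abstract.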
The full paper and all data are publicly available at http://data.politicalmashup.nl/xmlweb.
A paper on Evaluation Methods for Rankings of Facetvalues for Faceted Search was accepted at the Conference on Multilingual and Multimodal Information Access Evaluation 2011. Below is the abstract:
We introduce two metrics aimed at evaluating systems that select facetvalues for a faceted search interface. Facetvalues are the values of meta-data fields in semi-structured data and are commonly used to refine queries. It is often the case that there are more facetvalues than can be displayed to a user and thus a selection has to be made. Our metrics evaluate these selections based on binary relevance assessments for the documents in a collection. Both our metrics are based on Normalized Discounted Cumulated Gain, an often used Information Retrieval metric.
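As background for the abstract above, this is a minimal sketch of plain Normalized Discounted Cumulated Gain with binary gains. It is not one of the two facetvalue metrics from the paper. It assumes the common log2(rank + 1) discount, and it takes the ideal ordering from the judged list itself rather than from the whole collection.

```python
import math

def dcg(gains):
    """Discounted cumulated gain with a log2(rank + 1) discount (an assumption)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))

def ndcg(gains, k=None):
    """DCG of the ranking divided by the DCG of its ideal reordering, cut at k."""
    k = len(gains) if k is None else k
    ideal = sorted(gains, reverse=True)
    ideal_dcg = dcg(ideal[:k])
    return dcg(gains[:k]) / ideal_dcg if ideal_dcg > 0 else 0.0

# Binary relevance assessments in ranked order: 1 = relevant, 0 = not relevant.
print(ndcg([1, 1, 0, 0]))  # relevant items first
print(ndcg([0, 1]))        # relevant item ranked second
```

A ranking that places all relevant items first scores 1.0, and the score drops as relevant items move down the list.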
A PDF version of the paper can be found here. There is also a longer version with experiments available.
@inproceedings{schuth_evaluation_2011,
title = {Evaluation Methods for Rankings of Facetvalues for Faceted Search},
booktitle = {Proceedings of the Conference on Multilingual and Multimodal Information Access Evaluation 2011},
year = {2011},
publisher = {Springer},
author = {Schuth, A. and Marx, M.J.}
}
PoliticalMashup has started a collaboration with the Hague bureau of NRC Handelsblad.
The first article, on the activities of the old and new members of parliament in the first year of the Rutte cabinet, appeared on Saturday 2 July. All the facts can be found on a dedicated website: http://nrc.nl/denhaag/.
Three first-year Information Science students at the UvA have won third prize in the visualisation track of the Open Data Challenge with their site politiekinzicht.com. They built an application that very quickly shows what each politician in the Tweede Kamer (the Dutch House of Representatives) speaks about.
The prize was presented by Neelie Smit Kroes. The jury included, among others, Sir Tim Berners-Lee (W3C), Tom Lee (Sunlight Foundation) and Rufus Pollock (Open Knowledge Foundation).
The application is based on the Handelingen der Staten-Generaal (the proceedings of the Dutch parliament), which were made available as Open Data in the PoliticalMashup project of the Informatics Institute of the UvA. Further information.
About the prize.
The Open Data Challenge was Europe’s biggest open data competition to date. There were 20,000 euros in prizes to win, and a total of 430 entries from 24 EU Member States. It was open for 60 days - from early April to early June 2011.
The winners were selected by an all star cast of open data gurus, and announced by Vice President of the European Commission Neelie Kroes at the Digital Agenda Assembly in Brussels.
From the jury report
We need better ways to understand our politicians, ways that go beyond catching a single quote to illuminate all of their commitments, interests and actions. That is why I really like this app.
My favorite data visualisation was the Dutch entry called “Politiek Inzicht”, which shows what members of parliament talk about, by visualising tag clouds for all individual speeches, reports and so on given by members of parliament. This is not only done in a way which is very fun - it also provides valuable insight into the real political focus of each politician, allows for comparison between individuals within parties or across parties. When I explored this app, I immediately thought - “That's what we need in Germany too!”.
Steven Grijzenhout made a collection of XML files crawled from the web available for research purposes.
The collection is available at http://data.politicalmashup.nl/sgrijzen/xmlweb/. The data are described and analysed in the paper The Quality of the XML Web.
PoliticalMashup has two talks at this year's Politicologenetmaal, one on the alleged left-wing slant of Dutch TV and one on voting advice applications on the web.
Here are the accompanying slides:
Bart de Goede's talk won the best presentation award in his session. A fine achievement for a Bachelor student.
Voting Advice via Direct Access to the Relevant Data (Maarten Marx)
Slant on Dutch TV. Is TV language use really dominated by left? (Bart de Goede and Maarten Marx)