US presidential debates analyzed with VOSViewer

Over the last few weeks the US was treated to three presidential debates and one vice-presidential debate. These are the most watched and most tweeted-about events of the year. Still, the year is not over yet. As a way to get partial insight into the debates, the maps below show the words used in each debate: the closer two words are in the Euclidean space, the more often they are used in the same line. The software can be downloaded from vosviewer.com. The manual is short but contains relevant references on the algorithms and on how to use the software. The transcripts were downloaded from Debates.org. Before the texts were imported into VOSViewer, they were stripped of annotations such as who is speaking and whether there is applause, laughter or crosstalk.
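The cleaning step and the "same line" idea can be illustrated with a minimal Python sketch. It is not how VOSviewer works internally (the tool does its own term identification and mapping); it only shows one way to strip annotations and count how often two words share a line. The regular expressions are assumptions about the transcript layout (upper-case speaker labels ending in a colon, stage notes in parentheses or brackets), and the function names `clean_line` and `cooccurrence_counts` are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# Assumed transcript conventions (not taken from the original post):
# a line may start with an upper-case speaker label such as "OBAMA:" and
# may contain stage notes such as "(APPLAUSE)" or "[crosstalk]".
SPEAKER_LABEL = re.compile(r"^\s*[A-Z][A-Z .'\-]*:\s*")
STAGE_NOTE = re.compile(
    r"[\(\[]\s*(?:applause|laughter|crosstalk)\s*[\)\]]", re.IGNORECASE
)


def clean_line(line: str) -> str:
    """Remove the speaker label and stage notes, then normalise whitespace."""
    line = SPEAKER_LABEL.sub("", line)
    line = STAGE_NOTE.sub(" ", line)
    return " ".join(line.split())


def cooccurrence_counts(lines):
    """Count, for every pair of distinct words, the lines in which both occur."""
    counts = Counter()
    for line in lines:
        # Lower-case, keep alphabetic tokens, and deduplicate within the line.
        words = sorted(set(re.findall(r"[a-z']+", clean_line(line).lower())))
        # Each unordered pair is stored once, in alphabetical order.
        counts.update(combinations(words, 2))
    return counts


if __name__ == "__main__":
    sample = [
        "OBAMA: Small business taxes matter. (APPLAUSE)",
        "ROMNEY: Taxes on small business are too high. [crosstalk]",
    ]
    for pair, n in cooccurrence_counts(sample).most_common(3):
        print(pair, n)
```

Pairs with high counts are the ones a co-occurrence map would be expected to place close together. A real analysis would also drop stop words such as "on" and "are", which this sketch leaves in.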

Below are the maps showing the results for each of the debates. The first map, representing the first presidential debate, shows four clusters. The first cluster (green) indicates that the debate focused on the tax system as it affects small businesses. Because “governor” is part of this cluster, it appears that Obama or the interviewer directly addressed Romney on these issues. The second cluster (purple) shows that the debate also focused strongly on Obamacare, insurance and the elderly. The third cluster (red) shows healthcare issues, education and the Dodd-Frank reform. The fourth cluster (yellow) is difficult to interpret.

First debate on national politics

The map of the second presidential debate (the “townhall” format) shows that the first cluster (green) focuses on businesses, small deductions, the economy and women. The second cluster (red) focuses on the younger citizens of the US, judging from the words school, kid, candy, college and chance. The third cluster (blue) is difficult to interpret from its words: day, time, question, lot, mr. president, governor. The fourth and final cluster (brownish) focuses on the US-China relationship.

Debate on national issues. The “Townhall” meeting

The map of the third presidential debate shows that the debate is clustered around four topics. The first cluster (red) revolves around the relation between the government and American businesses. The second cluster (green) is about the Middle East and the recent unrest there (Syria and Libya). The third cluster (blue) is about the American economy and the role of China as the culprit taking away American jobs. The final cluster (yellow) deals with another part of Asia: Iraq, Pakistan and Afghanistan, and the American troops that stay there to prevent war.

Third debate on foreign politics


The transformation from data journalism to computational journalism

For a long time journalism was a traditional profession of people investigating issues, talking to sources, writing it down and publishing it. But then the Internet came and journalists went onto the Net, emailing and surfing the Web as a new way to contact sources, gather information and disseminate the news.
Ultimately, this evolved into a new type of journalist, one who collects data freely available on the Net and aggregates it in such a way that it reveals new insights. This is called data journalism. Now yet another type of journalist is appearing, one who computes: computational journalism. I would suggest that computational journalism is an extension of data(-driven) journalism. Of course data as such are meaningless; only through some filtering (aggregation, comparisons etc.) can sense be made of the large amounts of data, possibly made easier through the use of visualizations.
Data and computational journalism, especially as used in investigative journalism, have been around for quite some time already. However, they received a great push from the use of APIs and the increased accessibility of databases through the Internet in general. The data repository of the Guardian is a good example of the latter. Still, analyzing data and visualizing the findings to convey the journalist's message can be quite tricky. Examples of creative data visualisation, and of visualisations gone wrong, can be found at Flowing Data.
The video below is a lecture on computational journalism’s agenda from the Journalism and Media Studies Centre of Hong Kong University.

Media Research Seminar: Computational Journalism: Mapping the Research Agenda from JMSC HKU on Vimeo.
