Rumble on YouTube: Psy vs Justin Bieber

On Saturday, November 24th, 2012, history was written: the most-viewed YouTube video up until then, Canadian Justin Bieber’s Baby with 805,914,820 views, was surpassed by South Korean Psy’s Gangnam Style with 833,499,683 views.
It’s interesting to look at the viewer statistics of both videos and the way YouTube presents them. First of all, look at the shape of Bieber’s viewer stats over time. In the early stages after the video was released we see a steep incline, which then levels off towards a horizontal line. Looking at Psy’s graph, we see that the number of views is still on the increase. There are no signs yet that a maximum has been reached.
At first, after the new record was set, I expected there would be a competition between Bieber fans and Psy fans over the new target: views!!! Then again, there is no visual sign that Bieber’s curve is rising again. This means that Bieber’s video increasingly lags behind Psy’s video, probably to the extent that it will never be able to overtake it. If Psy’s video reaches more than a billion views, it’ll be the record holder for a long time. Or will it? Bieber’s record lasted less than three years, and Psy broke it in record time. It took Psy only 134 days (which equals an average of 6,220,147 views per day). Compare this to a measly average of 804,306 daily views over 1002 days for Bieber’s Baby. So now we wait for the next video to break Psy’s record. It’ll come within the next five years. Mark my words!!! 🙂
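
The daily averages are simply total views divided by days online. A minimal Python sketch of that arithmetic, using only the numbers quoted in this post (the function name is made up for illustration):

```python
def average_daily_views(total_views: int, days_online: int) -> float:
    """Return the mean number of views per day over the video's lifetime."""
    if days_online <= 0:
        raise ValueError("days_online must be positive")
    return total_views / days_online


# View counts and days online as quoted in this post (24 November 2012).
psy_rate = average_daily_views(833_499_683, 134)
bieber_rate = average_daily_views(805_914_820, 1002)

print(f"Psy:    {psy_rate:,.0f} views per day")     # Psy:    6,220,147 views per day
print(f"Bieber: {bieber_rate:,.0f} views per day")  # Bieber: 804,306 views per day
print(f"Psy's average is {psy_rate / bieber_rate:.1f} times Bieber's")
```

Keep in mind that these are lifetime averages. They say nothing about how the views were spread over time, which is exactly where the two curves differ.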

Interestingly, YouTube doesn’t seem to be that interested in Justin Bieber’s video. First of all, YouTube is very slow in updating Bieber’s viewer stats. Whereas Psy’s stats are updated daily, Bieber’s stats lag behind by about a week on average. Also, the vertical axis of Bieber’s graph needs to be updated because the video has clearly surpassed the 800,000,000 mark.

Psy’s YouTube stats

Justin Bieber’s YouTube stats

A further notable difference is the steep climb in the number of views at the beginning for the Bieber video. Compare this to the slow start of Psy’s video. A possible explanation is the date these videos were posted: Bieber’s video was posted in February, one of the coldest months in the northern hemisphere, and Psy’s video was posted in mid-July, the hottest period in the northern hemisphere. That got me thinking that in the coldest months people stay indoors and have YouTube readily available, whereas in the summer people are often outdoors or on vacation, limiting their YouTube access. Cautionary note: this analysis is based on visual inspection of the graphs. It’d be better to use the actual longitudinal data. On July 28, 2012, Robbie Williams linked to Psy’s video on his website, probably aiding its quick dissemination through the Web, particularly the English-speaking parts of the Web.

Justin Bieber’s YouTube interaction stats

Psy’s YouTube interaction stats

Below are the interaction stats directly compared between Bieber and Psy. They show that, again, Psy has the most views, but Bieber has the most reactions. Psy has the most “Thumbs Up”, whereas Bieber has the most “Thumbs Down”.

Viewer stats compared

As for the ratios between the different stats, we see that the audience of Bieber’s video is more responsive than Psy’s audience. The rates for “Thumbs Up” and “Thumbs Down” show much the same pattern as the raw counts above, because the numbers of views for Psy and Bieber are at a similar level. Still, the quite small fraction of people reacting to these videos, which reaches only 1.1 percent, shows that social media are not always that interactive. This percentage is probably somewhat understated: it would be higher if multiple views by the same person were accounted for. At the same time, a single person can post multiple responses to the video, which pushes it in the other direction. This shows that using off-the-rack stats comes with limitations.
Rates between stats compared

Finishing this blog post on the 12th of December and checking the latest numbers on the Psy video, I wouldn’t be surprised if it reaches a billion views before the end of the year. Only some 67 million views to go!
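
A rough back-of-the-envelope check of that expectation, as a Python sketch. It assumes the video keeps collecting views at the lifetime average of 6,220,147 per day quoted earlier, which is only an assumption: the actual daily rate in December may well be higher or lower. The figure of 67 million remaining views is also approximate, and the function name is made up for illustration.

```python
import math
from datetime import date, timedelta


def days_to_target(views_remaining: int, views_per_day: float) -> int:
    """Return the number of whole days needed to collect the remaining views."""
    if views_per_day <= 0:
        raise ValueError("views_per_day must be positive")
    return math.ceil(views_remaining / views_per_day)


# Approximate remaining views on 12 December 2012, and the lifetime
# average daily rate computed from the 24 November figures (an assumption).
days_needed = days_to_target(67_000_000, 6_220_147)
projected_date = date(2012, 12, 12) + timedelta(days=days_needed)

print(days_needed)                 # 11
print(projected_date.isoformat())  # 2012-12-23
```

Under that assumption the billion would be reached about a week before the end of the year.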

OK, to make this blog post complete here are both videos:

Presentation at RC33 conference on social media

Last week I attended the RC33 conference of the ISA in Sydney, Australia. I participated in two events: a session on social media and a panel session on computational social science, both organized and chaired by Robert Ackland. In the session on social media I talked about my experiences working with Twitter data and the problems, solutions and opportunities involved in using these data. Below are the slides.

In the panel session on computational social science we discussed what it is, what we can do with it, and what computational social science looks like in the era of Big Data. As for the latter part, I do think we gain from using Big Data, although we must acknowledge their limitations. However, I also think that Small Data is still preferable for now. Whereas the use of Big Data particularly involves the analysis of large systems, but still results in fairly descriptive analysis, Small Data allows for the analysis of specific cases. The benefit of using specific cases is that social media data in particular, which are limited when downloaded from SNSs’ APIs, can be augmented or enriched by adding additional data. If you read our work on social media and web campaigning in general, these analyses always use additional data about parties and their candidates. This way we can move beyond descriptive analyses of social media.
Of course, computational social science is much more than using large amounts of data. Simulating behavior according to specific rules is also part of it. Still, computer simulations have been around for some decades already and, from my perspective, they are still not widely used, or at least not widely published in academic communication journals. An exception in the field I am interested in (political campaigning) is Gulati et al. on modelling voting behavior.

The transformation from data journalism to computational journalism

For a long time, journalism was a traditional profession of people investigating issues, talking to sources, writing it all down and publishing it. But then the Internet came and journalists went onto the Net, emailing and surfing the Web as a new way to contact sources, gather information and disseminate the news.
Ultimately, this evolved into a new type of journalist, one who collects data freely available on the Net and aggregates them in such a way that they reveal new insights. This is called data journalism. Now yet another type of journalist is appearing, one who computes: computational journalism. I would suggest that computational journalism is an extension of data(-driven) journalism. Of course, data as such are meaningless, and only through some filtering (aggregation, comparisons, etc.) can sense be made of the large amounts of data, possibly made easier through the use of visualizations.
Data and computational journalism, especially as used in investigative journalism, have been around for quite some time already. However, they received a great push through the use of APIs and the increased accessibility of databases through the Internet in general. The data repository of the Guardian is a good example of the latter. Still, analyzing data and visualizing the findings to convey the journalist’s message can be quite tricky. Examples of creative data visualisations, and of visualisations gone wrong, can be found at Flowing Data.
The video below is a lecture on computational journalism’s research agenda from the Journalism and Media Studies Centre (JMSC) of Hong Kong University.

Media Research Seminar: Computational Journalism: Mapping the Research Agenda from JMSC HKU on Vimeo.