mapping superbowl tweets in the nytimes

this post has been retitled from “it’s a twitter-happy go-go springsteen nation.”

Via @courtneybird who retweeted @nickbilton and an email message from my father (who refuses to Twitter but sends me all sorts of Twitter-related news articles), the New York Times has semantically and geographically represented on the continental United States an interactive tag cloud of popular terms Twittered during the Super Bowl:

I am fascinated by how information changes as we move it from one medium to the next. Here, the graphic artists have remediated words found in linear tweets composed across the nation (excluding Alaska and Hawaii) by locating them across several data points: term count, time, geography, size of font, color of font, and along 6 categories. This allows for an outstanding range of comparative data points. Adding to the graphic, users can control the time, scroll over the terms, and switch among the 6 categories. As a result, if we move the timeline to 8:13pm eastern time, we get the following semantic celebration of the halftime show:

Of course, as with all data we must ask what is missing from this representation, namely the tweets that referred to Springsteen as “Bruce” or “Bruuuuuuuuuuuuuuuuuuuuuuce!” or with any number of uuus in between. For example, looking at my tweets about Springsteen during the Super Bowl reveals that none of my tweets (my location in orange) about him found their way into the semantic representation:

Similarly, we must also ask what is missing visually when all categories are combined. Here via the “All Tweets” option at 9:50pm, just after the Cardinals took a 23-20 lead:

This mapping suggests that the primary terms being tweeted at 9:50pm were Cardinals and Steelers. There are several problems with how this data has been represented. First, the designers have decided to use the same font color in the All Tweets option even though in the “Steelers vs. Cardinals” option the teams are represented by their team colors: Steelers in black, Cardinals in red:

The change in font color brings forward the fact that the term Cardinals was being tweeted at a much higher rate than the term Steelers. The black on black font in the All Tweets option is not nearly as effective because the uniform font color reduces the comparative nature of the graphic.

Continuing with this line of thought, the All Tweets graphic for 9:50pm suggests that there were only small pockets in Kansas, Kentucky, western Texas, and Idaho where fans were tweeting the word “go.” However, selecting the “People Saying ‘Go’” option at 9:50pm reveals a nation shouting the word “go”:

According to my viewing of the Flash animation, 9:50pm is when the word “go” reached its tweet apex, and yet the words are hidden in the All Tweets option. This remediation raises the question: who were the fans saying “go” for? The answer lies in the comparison of data, and because that comparison has not been provided in the original I have merged the two above graphics using Fireworks:

The result is a nation of fans tweeting “go” and “cardinals.” Everyone loves an underdog.
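
The gap noted earlier, where tweets saying “Bruce” or “Bruuuuce!” never surface under the term “Springsteen,” can be made concrete with a small sketch. I do not know how the Times actually matched terms; the code below simply assumes a literal match on “springsteen” and compares it with a hypothetical pattern that also accepts “Bruce” with any number of u's. The sample tweets are invented for illustration.

```python
import re
from collections import Counter

# Assumption: the visualization counted only the literal term "springsteen".
SPRINGSTEEN_PATTERN = re.compile(r"\bspringsteen\b", re.IGNORECASE)

# Hypothetical variant pattern: "Bruce" spelled with one or more u's.
BRUCE_PATTERN = re.compile(r"\bbru+ce\b", re.IGNORECASE)


def count_springsteen_mentions(tweets):
    """Count tweets by literal match and by literal-or-variant match."""
    counts = Counter()
    for text in tweets:
        literal = bool(SPRINGSTEEN_PATTERN.search(text))
        variant = bool(BRUCE_PATTERN.search(text))
        if literal:
            counts["literal"] += 1
        if literal or variant:
            counts["with_variants"] += 1
    return counts


# Invented sample tweets, not real data from the game.
sample_tweets = [
    "Springsteen is on!",
    "Bruce!",
    "Bruuuuuuuuce!",
    "Go Cardinals",
]

mentions = count_springsteen_mentions(sample_tweets)
print(mentions["literal"], mentions["with_variants"])
```

With these four invented tweets, the literal match finds one mention while the variant-aware match finds three. The variant pattern has its own blind spot, of course: not every “Bruce” tweeted during the game would be Springsteen.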

This entry was posted in academia, classification, IT, viz rhet.

8 Responses to mapping superbowl tweets in the nytimes

  1. Bill Wolff says:

    new blog post: it’s a twitter-happy go-go springsteen nation http://is.gd/ieBg

  2. Just read @billwolff’s interesting blog post on nytimes superbowl twitter viz & what is hidden in data http://is.gd/ieBg

  3. John Jones says:

    RT @briancroxall: Just read @billwolff’s interesting blog post on nytimes superbowl twitter viz & what is hidden in data http://is.gd/ieBg

  4. Billie says:

    This is FASCINATING!!! Thanks for posting these images.

  5. John Jones says:

    It is _really_ difficult to export comprehensive data from Twitter. The examples you point out seem like they could have been fixed by the engineers at NYTimes, but even if they were, there is no guarantee that those engineers would actually have all the tweets that were sent during the Super Bowl at their disposal.

    I guess my point is that it is likely some of the problems you identify with this data originated on Twitter’s end.

    Cheers,

    John

  6. Chuck says:

    Not to upset your thesis too much, but isn’t this also around the time that the truly wretched GoDaddy.com ad came on? I’m wondering how many of those tweets were referring to the ad and how many were cheering on one team or another? And I could be completely off the mark in this speculation.

    I love this analysis, though, and the words that oddly crept into the picture. There were a few references to “faith” at the beginning of the game, which I’m assuming refer to Faith Hill’s performance of the national anthem.

  7. Bill says:

    Billie, John, and Chuck, thanks for your comments.

    I completely understand what you mean, John, when you talk about the difficulty of exporting Twitter data. My point was not to critique the NY Times engineers, but to illustrate how we must question what is made invisible in visual representations of data. The engineers made choices about which font colors to use and which words to make visible at certain points in the visualization. As I look more closely at the map, I see that the sizes of the words are not represented on the same scale. That is, at 9:50pm the word “Cardinals” has 73 references in the area near New York. At the same time the word “go” received 49 references near Pittsburgh. Yet, “go” is displayed in a font size significantly larger than “Cardinals.” Visual data meant for comparison must be represented on the same scale in order for it to be accurate, or, as Tufte would say, for it to function effectively as “beautiful evidence.” The question of scale is one most certainly in the hands of the engineers.

    As Bowker and Star and Lakoff and Johnson show us, decisions to make certain items visible mean that certain things were necessarily made invisible. The invisible is what I am interested in thinking more about.

    Chuck, no, not a hole in my theory at all, but a revelation as to what might be going on with the word Go. As you suspect, at 9:50pm the term “GoDaddy” shows up strongly when the map is viewed with the “Talking about Ads” category selected. This raises a whole host of questions, including: how many of the “go” references were referring to GoDaddy but were written as “Go Daddy,” that is, with a space between Go and Daddy? As with the Bruce/Springsteen example, the map is most likely only representing a portion of references to GoDaddy. Of course, we would need to know more about how many times the word “Daddy” appeared following “Go_” to be fully sure.

    But it also raises a host of questions about the “people saying ‘go’” category, which I believe is clearly meant to refer to the cheer often heard at sporting events: “Go Steelers!” “Go Cardinals!” “Let’s GO Rangers!” and so on. How many of those “go” references actually refer to GoDaddy and are not part of a cheer? This question makes the whole “people saying ‘go’” category suspect because the data might actually represent something other than the implied meaning of the category.

    Oh, and John, if you have any idea how they might have been able to extract the data, I’d be quite interested in learning about that.

  8. Bill Wolff says:

    .@lwaltzer @jimgroom And an analysis of said Superbowl tweets mapping: http://j.mp/aK4Wq2 :-) #ds106
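
The scale point raised in the replies above (73 references to “Cardinals” near New York, 49 references to “go” near Pittsburgh, yet “go” drawn larger) can be worked through in a few lines. This is only a sketch of what a shared scale would look like: the function name, the linear mapping, and the 8pt to 48pt range are my assumptions, not anything known about how the Times sized its type.

```python
def font_size(count, max_count, min_pt=8.0, max_pt=48.0):
    """Map a term count to a font size on a linear scale.

    Every term is measured against the same max_count, so a larger
    count can never be drawn smaller than a smaller count.
    The point sizes are illustrative assumptions.
    """
    if max_count <= 0:
        raise ValueError("max_count must be positive")
    share = count / max_count
    return min_pt + share * (max_pt - min_pt)


# The two counts reported for 9:50pm in the discussion above.
counts = {"Cardinals": 73, "go": 49}

# One shared maximum across all terms, regardless of category.
shared_max = max(counts.values())
sizes = {term: font_size(n, shared_max) for term, n in counts.items()}

for term, size in sizes.items():
    print(f"{term}: {counts[term]} references -> {size:.1f}pt")
```

On a shared scale, “Cardinals” at 73 references is necessarily drawn larger than “go” at 49. One way the reverse could happen (a guess on my part, not something the map confirms) is if each category were scaled against its own maximum, which would let the top term in a small category outsize a more frequent term in a large one.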
