NYTimes Exposes 2.8 Million Articles in New API

Via ReadWriteWeb, the New York Times has announced the release of 2.8 million articles with 28 searchable fields in a new API (Application Programming Interface). Marshall Kirkpatrick at ReadWriteWeb writes:

What do you do when your industry is shifting under your feet? Taking the lead with radical steps is one strategy. The New York Times did just that this afternoon when it announced that it has released a new Application Programming Interface (API) offering every article the paper has written since 1981, 2.8 million articles. The API includes 28 searchable fields and updated content every hour.

This is a big deal. A strong press organ with open data is to the rest of the web what basic newspaper delivery was to otherwise remote communities in another period of history. It’s a transformation moment towards interconnectedness and away from isolation. A quality API could throw the doors wide open to a future where “newspapers” are important again.

What does that mean? It means that sites around the web will be able to add dynamic links to New York Times articles, or excerpts from those articles, to pages on their own sites. The ability to enrich other content with high quality Times supplementary content is a powerful prospect.
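As a rough illustration of what that could look like in practice, here is a minimal Python sketch that turns article records into an HTML list of links with excerpts. The field names (`headline`, `url`, `snippet`) and the sample record are hypothetical placeholders, not the actual response format of the Times API; a real program would fetch the records from the API instead of hard-coding them.

```python
from html import escape


def render_article_links(articles):
    """Build an HTML list of links with excerpts from article records.

    Each record is assumed (hypothetically) to be a dict with
    'headline', 'url', and 'snippet' keys.
    """
    items = []
    for article in articles:
        # Escape every value so article text cannot break the page markup.
        link = '<a href="{}">{}</a>'.format(
            escape(article["url"], quote=True), escape(article["headline"])
        )
        items.append("<li>{}: {}</li>".format(link, escape(article["snippet"])))
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


# Made-up sample record standing in for an API response.
sample = [
    {
        "headline": "Example Headline",
        "url": "https://example.com/article",
        "snippet": "A short excerpt & summary.",
    }
]
html_list = render_article_links(sample)
print(html_list)
```

The point of the sketch is only that once articles arrive as structured records, dropping them into another site's pages is a few lines of templating.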

This is a pivotal moment in the history of news organizations and the dissemination, consumption, and analysis of information. If and when it succeeds, I suspect other news organizations (as well as record companies) will begin to release their own APIs and provide information for free. Free, as Chris Anderson argues, is the future of business.

This is also another argument for why I need to learn more about APIs and how to create programs that interact with them.

Cross-posted at IAOC Blog.

Posted in technews

mapping superbowl tweets in the nytimes

this post has been retitled from “it’s a twitter-happy go-go springsteen nation.”

Via @courtneybird, who retweeted @nickbilton, and an email message from my father (who refuses to Twitter but sends me all sorts of Twitter-related news articles), the New York Times has represented, semantically and geographically, the popular terms Twittered during the Super Bowl as an interactive tag cloud laid over a map of the continental United States:

I am fascinated by how information changes as we move it from one medium to the next. Here, the graphic artists have remediated words found in linear tweets composed across the nation (excluding Alaska and Hawaii) by locating them across several data points: term count, time, geography, font size, font color, and six categories. This allows for an outstanding range of comparisons. The graphic is also interactive: users control the time, can scroll over the terms, and can switch among the six categories. As a result, if we move the timeline to 8:13pm Eastern time, we get the following semantic celebration of the halftime show:

Of course, as with all data, we must ask what is missing from this representation, namely the tweets that referred to Springsteen as “Bruce” or “Bruuuuuuuuuuuuuuuuuuuuuuce!” or with any number of uuus in between. For example, looking at my tweets about Springsteen during the Super Bowl reveals that none of my tweets (my location in orange) about him found their way into the semantic representation:
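To make the gap concrete, here is a small Python sketch of how a term counter could catch those variants with a regular expression. This is my own illustration, not how the Times built its graphic, and the sample tweets are made up.

```python
import re

# Matches "Springsteen", or "Bruce" spelled with one or more u's,
# regardless of capitalization.
SPRINGSTEEN = re.compile(r"\b(?:springsteen|bru+ce)\b", re.IGNORECASE)


def mentions_springsteen(tweet):
    """Return True if the tweet names Springsteen in any of the forms above."""
    return SPRINGSTEEN.search(tweet) is not None


# Made-up sample tweets for illustration only.
tweets = [
    "Springsteen is killing it",
    "Bruce!",
    "Bruuuuuuuuce!",
    "go cardinals",
]
matches = [t for t in tweets if mentions_springsteen(t)]
print(len(matches))
```

A counter that looks only for the literal word “Springsteen” would find one of these tweets; the pattern finds three. It would also count tweets about any other Bruce, which is its own kind of missing (or extra) data.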

Similarly, we must also ask what is missing visually when all categories are combined. Here via the “All Tweets” option at 9:50pm, just after the Cardinals took a 23-20 lead:

This mapping suggests that the primary terms being tweeted at 9:50pm were Cardinals and Steelers. There are several problems with how this data has been represented. First, the designers have decided to use the same font color in the All Tweets option even though in the “Steelers vs. Cardinals” option the teams are represented by their team colors: Steelers in black, Cardinals in red:

The change in font color brings forward the fact that the term Cardinals was being tweeted at a much higher rate than the term Steelers. The black on black font in the All Tweets option is not nearly as effective because the uniform font color reduces the comparative nature of the graphic.

Continuing with this line of thought, the All Tweets graphic for 9:50pm suggests that there were only small pockets in Kansas, Kentucky, western Texas, and Idaho where fans were tweeting the word “go.” However, selecting the “People Saying ‘Go’” option at 9:50pm reveals a nation shouting the word “go”:

According to my viewing of the Flash animation, 9:50pm is when the word “go” reached its tweet apex, and yet the words are hidden in the All Tweets option. This remediation raises the question: who were the fans saying “go” for? The answer lies in the comparison of data, and because that comparison has not been provided in the original, I have merged the two above graphics using Fireworks:

The result is a nation of fans tweeting “go” and “cardinals.” Everyone loves an underdog.
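With the underlying tweets in hand, the same comparison could be made on the data itself and not only on the images. The Python sketch below counts how often “go” appears alongside each team name. The sample tweets are invented for illustration and are not the Times data.

```python
import re
from collections import Counter

TEAMS = ("cardinals", "steelers")


def count_go_by_team(tweets):
    """Count tweets that contain the word 'go' alongside each team name."""
    counts = Counter()
    for tweet in tweets:
        # Lowercase and split into words so "GO" and "Go" count as "go".
        words = set(re.findall(r"[a-z]+", tweet.lower()))
        if "go" in words:
            for team in TEAMS:
                if team in words:
                    counts[team] += 1
    return counts


# Invented sample tweets, for illustration only.
sample_tweets = [
    "GO CARDINALS!",
    "go cardinals go",
    "Go Steelers",
    "Cardinals lead 23-20",
    "go go go",
]
go_counts = count_go_by_team(sample_tweets)
print(go_counts["cardinals"], go_counts["steelers"])
```

Note what this count leaves out as well: the tweet that says only “go go go” is cheering for someone, but the data cannot say whom.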

Posted in academia, classification, IT, viz rhet

the freedom riders: then and now

The February 2009 issue of Smithsonian has a wonderful article, “The Freedom Riders,” which discusses a photography book created in part as a tribute to the 80 heroic men and women who in 1961 boarded buses and headed south to protest illegal segregation at interstate highway facilities. The book by writer and aspiring photographer Eric Etheridge, Breach of Peace: Portraits of the 1961 Mississippi Freedom Riders, juxtaposes mug shots taken when the riders were arrested with current portraits taken by the photographer. The Smithsonian article showcases several of the portraits, and also excerpts the stories of some of the men and women featured in the book:

As riders poured into the South, National Guardsmen were assigned to some buses to prevent violence. When activists arrived at the Jackson bus depot, police arrested blacks who refused to heed orders to stay out of white restrooms or vacate the white waiting room. And whites were arrested if they used “colored” facilities. Officials charged the riders with breach of peace, rather than breaking segregation laws. Freedom Riders responded with a strategy they called “jail, no bail”—a deliberate effort to clog the penal facilities. Most of the 300 riders in Jackson would endure six weeks in sweltering jail or prison cells rife with mice, insects, soiled mattresses and open toilets.

“The dehumanizing process started as soon as we got there,” said Hank Thomas, a Marriott hotel franchise owner in Atlanta, who was then a sophomore at Howard University in Washington, D.C. “We were told to strip naked and then walked down this long corridor…. I’ll never forget [Congress of Racial Equality (CORE) director] Jim Farmer, a very dignified man …walking down this long corridor naked…that is dehumanizing. And that was the whole point.”

Jean Thompson, then a 19-year-old CORE worker, said she was one of the riders slapped by a penal official for failing to call him “sir.” An FBI investigation into the incident concluded that “no one was beaten,” she told Etheridge. “That said a lot to me about what actually happens in this country. It was eye-opening.” When prisoners were transferred from one facility to another, unexplained stops on remote dirt roads or the sight of curious onlookers peering into the transport trucks heightened fears. “We imagined every horror including an ambush by the KKK,” rider Carol Silver told Etheridge. To keep up their spirits, the prisoners sang freedom songs.

More pictures from the book, as well as news coverage, videos, and a sample of the book can be found at breachofpeace.com.

Posted in photography