On the evening of May 1, 2011, when it was announced that Osama bin Laden had been killed, we started running repeated fetches against the Twitter API for the terms “osama” and “bin laden”. On May 3, we posted more than 1.2 million tweets in XML format. Since then, the live feed collection on DiscoverText has kept rolling along.
The Twitter API serves a maximum of 1,500 items per fetch, and the DiscoverText live feed scheduler can fetch as often as every five minutes. During the peak of the tweet storm, a single repeated fetch could not capture 100% of the tweets. The workaround that produced these two large collections was to set up several repeating fetches in DiscoverText that all fed the same archive. The result is, frankly, more tweets than anyone might ever need to understand this slice of the micro-blogging public sphere during a critical juncture in world history.
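To make the arithmetic behind that workaround concrete, here is a rough Python sketch. It is not DiscoverText's actual code. The function names and the tweet dictionaries are hypothetical, and it assumes the shared archive drops duplicates by tweet ID, which is a guess about how overlapping fetches would need to be handled rather than something stated above.

```python
from typing import Dict, Iterable

MAX_ITEMS_PER_FETCH = 1500   # per-fetch cap mentioned in the post
MIN_INTERVAL_MINUTES = 5     # fastest schedule mentioned in the post


def max_tweets_per_hour(num_fetches: int) -> int:
    """Upper bound on tweets collected per hour by several repeating fetches.

    Assumes every fetch returns the full cap and no two fetches overlap,
    so real-world totals would be lower.
    """
    fetches_per_hour = 60 // MIN_INTERVAL_MINUTES
    return num_fetches * fetches_per_hour * MAX_ITEMS_PER_FETCH


def merge_into_archive(archive: Dict[str, dict], batch: Iterable[dict]) -> int:
    """Add one fetch's tweets to the shared archive, skipping known IDs.

    Returns the number of tweets that were new to the archive.
    """
    added = 0
    for tweet in batch:
        tweet_id = tweet["id"]  # hypothetical field name
        if tweet_id not in archive:
            archive[tweet_id] = tweet
            added += 1
    return added


# Two overlapping fetches feeding the same archive (made-up sample data).
archive: Dict[str, dict] = {}
fetch_a = [{"id": "1", "text": "first"}, {"id": "2", "text": "second"}]
fetch_b = [{"id": "2", "text": "second"}, {"id": "3", "text": "third"}]

new_from_a = merge_into_archive(archive, fetch_a)
new_from_b = merge_into_archive(archive, fetch_b)
```

Under these assumptions a single repeating fetch tops out at 12 × 1,500 = 18,000 tweets per hour, so any busier stream loses tweets unless more fetches feed the same archive.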
1:46 PM UPDATE: Approaching 1 million tweets per archive