Text Analysis during the 2011 State of the Union Address

As part of Texifter’s ongoing research on sentiment and topic analysis, we collected data from several Twitter and Facebook feeds, from 48 hours before to 48 hours after the 2011 State of the Union address on Tuesday, January 25th, 2011. Texifter set up live feed captures from the official White House and Huffington Post pages on Facebook, and also captured tweets pertaining to “#SOTU” and “Obama”.

From these collections (which fetched new data once every 5 minutes), we gathered close to 700,000 individual comments and tweets. From these individual datasets (available at http://discovertext.com/sotu.aspx), we extracted the text and ran each comment through uClassify’s mood and news topic classifiers to track the sentiment of the comments over time and to identify topic trends over time. Below we share a few examples of what we’ve found.

Sentiment and topics over time for comments on the official White House Facebook page:

Ranked order of topics found for the official White House Facebook page:

Sentiment and topics over time for tweets with the #SOTU hashtag:

Ranked order of topics found for tweets with the #SOTU hashtag:

Sentiment and topics over time for #SOTU tweets that include the term “Palin”:

Ranked order of topics found for #SOTU tweets that include the term “Palin”:

This is the type of functionality we are integrating into DiscoverText within the next couple of months. When live and interactive, you will be able to drill down into the time-segmented slices and topics to further analyze and classify your own documents and comments. Look for our new and improved classification and sentiment analysis interfaces coming in 2011!
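
As a rough illustration of the time-sliced sentiment and ranked topic views described above, here is a minimal Python sketch. It assumes each comment has already been classified and is represented as a (timestamp, sentiment, topic) tuple; the sample records, the label names, and the helper functions slice_start, sentiment_over_time and ranked_topics are all hypothetical and are not taken from DiscoverText or uClassify. The 5-minute slice width simply mirrors the collection interval mentioned earlier.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta

# Hypothetical, already-classified comments: (timestamp, sentiment, topic).
# The labels are illustrative stand-ins for classifier output.
comments = [
    (datetime(2011, 1, 25, 21, 2), "happy", "Politics"),
    (datetime(2011, 1, 25, 21, 4), "upset", "Politics"),
    (datetime(2011, 1, 25, 21, 7), "happy", "Business"),
    (datetime(2011, 1, 25, 21, 9), "happy", "Politics"),
    (datetime(2011, 1, 25, 21, 13), "upset", "Health"),
]


def slice_start(ts, minutes=5):
    """Round a timestamp down to the start of its time slice.

    Assumes `minutes` divides 60 evenly, so slices never straddle an hour.
    """
    return ts - timedelta(
        minutes=ts.minute % minutes,
        seconds=ts.second,
        microseconds=ts.microsecond,
    )


def sentiment_over_time(rows, minutes=5):
    """Count sentiment labels per time slice, ordered by slice start."""
    slices = defaultdict(Counter)
    for ts, sentiment, _topic in rows:
        slices[slice_start(ts, minutes)][sentiment] += 1
    return dict(sorted(slices.items()))


def ranked_topics(rows):
    """Return (topic, count) pairs, most frequent topic first."""
    return Counter(topic for _ts, _sentiment, topic in rows).most_common()


if __name__ == "__main__":
    for start, counts in sentiment_over_time(comments).items():
        print(start.strftime("%H:%M"), dict(counts))
    print(ranked_topics(comments))
```

Filtering the rows before aggregating (for example, keeping only comments whose text contains a given term) would give the kind of term-specific view shown for “Palin” above.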

About Mark J. Hoy

Texifter CTO Mark J. Hoy brings over 18 years of professional IT experience and over 28 years of programming experience across a wide range of areas, including academic research, R&D for the defense industry, and serving as Director of Programming for a graphic design and multimedia firm. Mark graduated from Carnegie Mellon University in 2000 with a B.S. in Information and Decision Systems and in 2007 with a Master’s in Information Technology, Software Design and Management. When not poking around with the internals of DiscoverText, Mark keeps himself busy as an amateur guitarist of over 20 years and a tinkerer of all things.