Monitor Middle East Protests

Dear faithful users and intrigued future users of DiscoverText,

My name’s Josh and I’m one of three user support specialists at Texifter LLC. For my first Texifter blog entry, I’m going to demonstrate how I’ve been using DiscoverText to capture minute-to-minute protest tweets from the Middle East ever since the beginning of the Arab Spring. Then I’m going to show off some of the awesome things that DiscoverText lets you do with that data.

To get started, go to your dashboard and click “Start a new project.” Then name your project, and you’re ready to go (see below).

Now you’ve got a completely empty project that you want to fill with lots of data. In DiscoverText you can import your own data from your computer, or you can pull it in from Facebook, Twitter, YouTube, and other places online. For this blog entry, I’ll be sticking with Twitter, only because that’s where so much exciting stuff is happening.

So to import a Twitter feed, click “Import data” under Project Options (see below).

Next, you’ll see a whole bunch of ways to bring data into DiscoverText. Click the Twitter icon.

Next, you’ll need to name the archive where your tweets are going to be stored. Below, you can see that I named this archive after what I’m initially interested in getting some information about: the city of Hama.

Now, type in your Twitter search term, click the Twitter sign-in button, and then click next.

The last step before you import tweets is something called the “Live Feed Scheduler.” This feature allows you to continuously or periodically pull tweets into your account, even when you are not online. If you’d like to get as many tweets as you can, as fast as you can (there is a maximum of 1,500 for each import), leave almost everything as is, but click the drop-down menu where it says 1 hour and change it to 5 minutes. Never fear, you can always run multiple feeds if need be. (And you might want to if each import is hitting the 1,500-tweet maximum every 5 minutes. Using this technique, some users have searched MILLIONS of tweets at one time! Are you up for the task?)
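To get a feel for what the scheduler setting means in practice, here is a back-of-the-envelope sketch in Python. It only assumes the 1,500-tweet cap per import mentioned above. The function names are made up for illustration and are not part of DiscoverText, and the feed estimate assumes each feed covers a different search term so the feeds are not pulling the same tweets.

```python
import math

MAX_TWEETS_PER_IMPORT = 1500  # per-import cap, as described in this post


def max_tweets_per_hour(interval_minutes, cap=MAX_TWEETS_PER_IMPORT):
    """Upper bound on the tweets one feed can collect in an hour."""
    imports_per_hour = 60 // interval_minutes
    return imports_per_hour * cap


def feeds_needed(expected_tweets_per_hour, interval_minutes):
    """Hypothetical helper: parallel feeds needed to keep up with a given volume."""
    per_feed = max_tweets_per_hour(interval_minutes)
    return max(1, math.ceil(expected_tweets_per_hour / per_feed))


print(max_tweets_per_hour(60))  # default 1-hour interval: 1500
print(max_tweets_per_hour(5))   # 5-minute interval: 18000
print(feeds_needed(40000, 5))   # 3
```

In other words, switching from 1 hour to 5 minutes raises the ceiling for a single feed from 1,500 to 18,000 tweets per hour, which is why the shorter interval matters for busy search terms.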

At last, your tweets are coming soon. Grab a quick cup of coffee and your data will be ready in a couple minutes. You can also follow the progress of the data import beneath the notifications link if you’re feeling impatient. (see below)

At last, the cool part has arrived and you should now see the name of your archive in the navigation tree on the left side.

Usually the first thing I’ll do before I start playing with a Twitter archive is have a quick glance at the tweets. So I click the name of the new archive in the navigation tree and then click the listing options button at the center-top. Select 100 items per page and click save.

Now you can browse the tweets easily.

Now let’s say you want to start organizing the tweets according to content. In the example above, we can see every mention of the City of Hama, but now you would like to see every mention of – say – the army, the police, and the secret service within Hama. DiscoverText makes it super easy to do this. At the top of your document list, type your first search term and press enter.

If you’d like to keep your new search results, select the checkboxes of the tweets that you want to sort and then click add to bucket: “Selected.” (Buckets are your saved searches.) Create a bucket name and perform the remainder of your searches (see below).

Here are the results of a search for “Secret Service”…

and the search results for the Army…

Just like that, you can analyze what tweets are saying about the government’s behavior in one particular city. (For example, it took me just a few minutes to figure out that police officers in Hama have (supposedly) been walking around in civilian clothing!)
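If it helps to think of buckets in programming terms, the search-then-save workflow above behaves roughly like the Python sketch below. This is only an analogy, not how DiscoverText is implemented: the `search` function, the dictionary layout, and the sample tweets are all made up for illustration, and it assumes a simple case-insensitive substring match.

```python
def search(tweets, term):
    """Return the tweets whose text contains the term, ignoring case."""
    needle = term.lower()
    return [t for t in tweets if needle in t["text"].lower()]


# Placeholder tweets, invented purely for this example.
archive = [
    {"id": 1, "text": "Sample tweet mentioning the army"},
    {"id": 2, "text": "Sample tweet mentioning police"},
    {"id": 3, "text": "Sample tweet about something else"},
    {"id": 4, "text": "Another sample tweet about the ARMY"},
]

# One "bucket" (saved search result) per search term.
buckets = {term: search(archive, term) for term in ["army", "police", "secret service"]}

for name, items in buckets.items():
    print(name, [t["id"] for t in items])
```

Each bucket is simply the saved subset of tweets that matched one search, so you can come back to it later without re-running the search.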

Now, let’s expand the search! Instead of just pulling in tweets about Hama, let’s also pull in tweets about Aleppo, Homs, Damascus, and Deir el-Zur. All you have to do is right-click the name of the project, click import data, and repeat the process above.

Now, I let DiscoverText import several rounds of tweets, and as you can see from the picture below, I’m now looking at 5 different archives and over 19,000 tweets! (To learn how to remove duplicate tweets, click here)

To search all of those archives at once, click the name of the project (in large letters) at the top of the navigation tree.

Now if we search for “Police,” we can monitor police behavior in all five cities at once.

We can see 72 mentions of police…

and 153 mentions of the Army…

and 44 mentions of the secret service.
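For readers who like to see the idea spelled out, searching a whole project amounts to pooling the archives and counting matches per term. The Python sketch below shows one plausible way to do that, including dropping tweets that appear in more than one archive. It is an illustration under my own assumptions (tweets carry a unique `id`, matching is a case-insensitive substring test), not a description of DiscoverText internals, and the sample tweets are invented.

```python
from collections import Counter


def merge_archives(archives):
    """Pool all archives into one list, keeping the first copy of each tweet id."""
    seen = set()
    merged = []
    for tweets in archives.values():
        for tweet in tweets:
            if tweet["id"] not in seen:
                seen.add(tweet["id"])
                merged.append(tweet)
    return merged


def count_mentions(tweets, terms):
    """Count how many tweets mention each term, ignoring case."""
    counts = Counter()
    for tweet in tweets:
        text = tweet["text"].lower()
        for term in terms:
            if term.lower() in text:
                counts[term] += 1
    return counts


# Placeholder archives, invented for this example; tweet 102 appears twice.
archives = {
    "Hama": [
        {"id": 101, "text": "Sample tweet mentioning the army"},
        {"id": 102, "text": "Sample tweet mentioning police"},
    ],
    "Homs": [
        {"id": 102, "text": "Sample tweet mentioning police"},
        {"id": 103, "text": "Sample tweet about the Army and the police"},
    ],
}

terms = ["police", "army", "secret service"]
project = merge_archives(archives)
counts = count_mentions(project, terms)

print(len(project))  # 3 unique tweets
for term in terms:
    print(term, counts[term])
```

Deduplicating before counting matters here because the same tweet can mention two cities and land in two archives, which would otherwise inflate the totals.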

Clearly, there is a lot of chatter on Twitter about the Syrian army in those five cities. Now let’s say you want to organize, categorize, and/or code what is being said about the Army.

The first thing you’ll want to do is create a new bucket, just like you did before.

Next, right-click that bucket in the navigation tree and click “create dataset.”

On the next page, click create dataset (or click here to learn more about different kinds of datasets you can design). Next, pick the categories and coding scheme that you’d like to use. As you can see below, I used three different codes, but you can use as few or as many as you want.  When you’re all set, click finished.

The next thing to do is decide who you want to code the dataset you just created. You can assign it to yourself or any of your peers in DiscoverText. (For more on Peers, click here.) When you’re finished assigning coders, click “set chosen coders.”

To start coding right away, click “Code Dataset.” (see above).

This is what coding might look like for you:

When you’re done coding, click the stop icon.

To get a full coding report, all you have to do is click the “Analytics / Export” button on the left, and click “reports”.

Click “Dataset Summary Report,” customize your report, and a minute later you will be looking at a full report of everything that has been coded, with great visuals that will look something like this:


That’s about all for now. This has been just a glimpse at some of the ways I’ve been playing with DiscoverText. If you like what you’ve seen, sign up now and “like” us on Facebook and LinkedIn. And, of course, if you have any questions, feel free to e-mail me anytime. I’m always happy to help!


About Josh Sowalsky

Josh Sowalsky is the Director of User Support at Texifter, where he has worked since September 2010. He holds two degrees in Political Science and Middle Eastern Studies from UMASS Amherst, where he minored in History, Arabic, and International Relations. While at UMASS Josh designed and taught an advanced course that examined the intersection of technological development and national identity formation. Serving also as a research assistant in the UMASS Political Science department, he researched and published articles on electoral politics and political dissent in Jordan. Josh has conducted and presented multilingual field research on civil society development, democratization, and national identity formation throughout the Middle East - namely in Israel, Lebanon, and Syria. His honors thesis was entitled, "The Role of Women's Rights NGOs in Syrian Democratization." When not managing projects in QDAP or harvesting Arabic protest tweets in DiscoverText, Josh can be found strumming a ukulele, exploring Netflix, or swinging aimlessly at tennis balls.
  • MADESCLAIRE Isabelle

    Hello Josh, thanks for the complete information. May I ask: is it possible to find the tweets from my profile (my own tweets and my RTs) for these dates: from 20 to 25 May 2011?
    Can Texifter help me get hold of these archives from my Twitter account?
    Twitter has a limit on screen, which I didn’t know of, so for this period I did do the savings on time.
    I’m saving in txt to keep memory but . My project is to write on basis of twitter’s data (+ prss and others of course).
    Thank you very much for your attention.

  • Mark J. Hoy

    Unfortunately, via the Twitter API, DiscoverText can only go back as far as Twitter allows us, which is either 1,500 tweets or approximately 1 to 2 weeks’ worth (whichever one is greater). DiscoverText will not help with getting these older tweets; however, it will be able to keep track of your wall and gather tweets going forward (if you set up a scheduled fetch within the system).

    This is a limit imposed by Twitter that we cannot get around. I’m not even certain (and have not found any references indicating) whether Twitter itself keeps archives of older data.
