Facebook Graph Mysteries

During my recent visit to the Digital Methods Initiative (DMI) summer school, hosted by my good friend Richard Rogers, I had the pleasure of spending two days teaching and working with 35 exceptionally bright students who were new to the tools and techniques that are part of DiscoverText.

They were an excellent group, highly motivated and digitally fluent. As part of the class, students put forward project ideas and formed small teams to hack out a solution to some research problem. Many of these ideas involved scraping content off Facebook via the Graph API. I watched eagerly as teams of students furiously tested out many of the “shiny new toy” functionalities they found in DiscoverText. Very quickly, they helped to articulate some of the key mysteries of the permissions managed via the social Graph.

Several data collection questions came up immediately. For example:

  • Why is there a numerical discrepancy between what appears on the actual public Facebook pages and groups and what is delivered via the Graph?
  • By what combination of criteria do different users get slightly (or vastly) different results for the same query?
  • Why is there often a substantial gap between the number of items the API delivers and the number of items a user of DiscoverText actually gets in the downloaded archive?
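One simple way to start documenting these discrepancies is to compare the item IDs that two collections contain for the same query. The following is only a minimal sketch: the function name, the report fields, and the sample IDs are all hypothetical, and it assumes you have already exported the item IDs from each collection as plain lists. It does not call the Graph API or DiscoverText.

```python
def compare_collections(ids_a, ids_b):
    """Summarize how two collections of item IDs overlap.

    Hypothetical helper for illustration; not part of DiscoverText
    or the Facebook Graph API.
    """
    a, b = set(ids_a), set(ids_b)
    shared = a & b
    union = a | b
    return {
        "only_a": sorted(a - b),   # items only the first collection has
        "only_b": sorted(b - a),   # items only the second collection has
        "shared": len(shared),     # items both collections have
        # Share of all distinct items that appear in both collections.
        "overlap": len(shared) / len(union) if union else 1.0,
    }


# Made-up post IDs gathered by two users running the same query.
user_one = ["101", "102", "103", "104"]
user_two = ["102", "103", "105"]

report = compare_collections(user_one, user_two)
print(report)
```

Keeping the lists of unmatched IDs, rather than just the counts, makes it possible to go back to the public page and look at which specific items one user received and another did not.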

As the experiments at the DMI continue, and as DiscoverText users all over the world start asking some of the same questions, we hope to document here on the blog more precisely how your credentials, and the settings of diverse Facebook users, affect the data collection made possible by the DiscoverText-Facebook API.

In the meantime, I am home, but the DMI students are still pounding away on the Graph and DiscoverText, raising excellent questions and generating new feature ideas we will surely use.

About Stuart Shulman

Stuart Shulman is a political science professor, software inventor, entrepreneur, and garlic growing enthusiast who coaches U13 boys club soccer and in the Olympic Development Program with a national D-license. He is Founder & CEO of Texifter, LLC, Director of QDAP-UMass, and Editor Emeritus of the Journal of Information Technology & Politics. Stu is the proud owner of a Bernese/Shepherd named "Colbert" who is much better known as 'Bert. You can follow his exploits @stuartwshulman.