Texifter Announces Strategic Partnership with SurveyMonkey

Texifter Announces Strategic Partnership with SurveyMonkey
to Improve Survey Data Analytics

Combining the power and versatility of Texifter’s DiscoverText analytics with the reach of the world’s largest survey website.

AMHERST, MA, May 27, 2014 – Texifter, a developer of social data and text analysis tools, today announced a new strategic partnership with SurveyMonkey, the world’s largest survey website, to provide advanced text analytics capabilities to SurveyMonkey users through its cloud-based platform, DiscoverText.

SurveyMonkey is known for intuitive interfaces and communications features that allow researchers to collect millions of survey responses every day. When surveys produce very large numbers of responses to open-ended questions, it can be a challenge to analyze all of the verbatim data. This is especially true for those relying on spreadsheet software as their primary text analytics tool.

DiscoverText provides an accessible “point and click” solution for these and other analytic challenges. Starting today, all DiscoverText users will be able to log in to SurveyMonkey to easily import existing survey data. Researchers can use a 30-day free trial to apply the full range of DiscoverText’s powerful software tools to both the open-ended answers and the structured survey metadata. Texifter’s “five pillars of text analytics” approach combines search, filtering, clustering, human coding, and machine learning.

Once registered on DiscoverText, newcomers have access to a wide spectrum of online data feeds. Facebook, Tumblr, YouTube, WordPress, Disqus, and Twitter data can be gathered, managed, and analyzed in DiscoverText alongside SurveyMonkey responses, email, and other forms of text data.

“The Texifter team is excited to be introducing SurveyMonkey users to the powerful and flexible text analytics tools in DiscoverText,” said founder and CEO Stuart Shulman. “We are confident that once people try out features like clustering and custom machine-learning, they’ll begin to see new possibilities for generating insights from bigger and more diverse collections of unstructured free text.”

This strategic partnership signals the latest phase in the evolution of DiscoverText. Originally built for federal agencies sorting large-scale public comment collections, the four-year-old collaborative research platform now serves a wide variety of public and private sector clients, as well as the academic research community.

Texifter is a spin-out company based on information science research by Dr. Stuart W. Shulman, who directed the development of numerous human language tools for reviewing large numbers of public comments.

Texifter Contact
Stuart Shulman

Posted in DiscoverText, product

School Bullying Research Using DiscoverText

Our Vanderbilt University team uses DiscoverText (DT) to support qualitative text analysis of 8,531 high school students’ responses about their in-school experiences of bullying. DiscoverText has offered us powerful ways to perform key steps throughout our coding process. Fundamentally, DT supports parsing our large data set into archives, buckets, and datasets. Thus, we are able to focus on key portions of our large data set to hone our initial hierarchical coding structure while retaining the ability to return to an untouched dataset for final coding. We use the diverse annotation tools in DiscoverText to mark singular problematic items for discussion at meetings. Our team was able to develop a complex coding structure with 58 codes (at one point we had 128), and begin coding in a month and a half. Undoubtedly, DiscoverText’s robust organizational and annotation tools, within an easy-to-use user interface, supported expediency.

Following the development of our coding structure, we employed DiscoverText’s analytic tools to better understand and improve our team’s inter-coder reliability. DT’s real-time coding analytics support decision-making in meetings. Through the use of these tools, we raised our coding reliability from a Kappa value of .2 to a Kappa value of .82 after five training rounds. Given that four coders are using 58 hierarchical codes to code over 8,000 free-response items, the numbers represent a phenomenal increase in reliability. Presently, we are halfway through coding the 8,531 items using overlapping coding patterns to ensure reliability. Our team members share their experiences below:

“I am currently working with a research team that must code students’ responses about their bullying experiences. I had never coded before and was introduced to DiscoverText only a few months ago. Fortunately, I have found DiscoverText to be very user-friendly and easy to navigate. Despite my lack of formal coding experience, I have found the program to run smoothly and have already learned a great deal in such a short period of time. My favorite feature thus far would have to be the code-by-code comparisons. This allows us to discuss any discrepancies among the research team and to increase our reliability. I have enjoyed exploring the features of this program and look forward to discovering what more it can do.” – Abbie, undergraduate, Human and Organizational Development, honors track.

“My team is using DiscoverText to code thousands of brief responses to a survey question about bullying. As someone who is new to qualitative research and coding programs, I have found DiscoverText easy to use. The coding process was very easy for me to learn, and I quickly became efficient at coding responses. Our initial looks at code comparisons have been fairly straightforward for me to figure out as well. As we move forward with more analysis, I anticipate other functions and features of DiscoverText will be similarly straightforward, and I will see more of the power of the program.” – Brian, master’s student, Human Development Counseling.

“I’m working with DiscoverText as part of an academic research team analyzing high school students’ qualitative responses to questions about bullying. As we have been coding responses, we have found the coding process fairly smooth, although not without a few features that we would have done differently. Still, the process of coding is similar to that of other qualitative coding software (I’ve used NVivo). We haven’t yet gotten into any sophisticated filtering or analysis, but I’m expecting that it will be really useful. The biggest impression I’m left with after my three months of using DiscoverText is that it’s a powerful tool, and we’ve only scratched the surface of what it can do.” – Ben, doctoral student, Community Research and Action.
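For readers new to the Kappa statistic mentioned earlier in this post, here is a minimal sketch of how Cohen's kappa can be computed for two coders. It is an illustration only: the post does not say which kappa variant DiscoverText reports, and with four coders a multi-rater statistic (such as Fleiss' kappa) or an average of pairwise values may be what is actually used. The function name and the toy labels are hypothetical and are not drawn from the study.

```python
from collections import Counter


def cohen_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders who each assigned one code per item."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must label the same non-empty item list")
    n = len(codes_a)
    # Observed agreement: share of items where both coders chose the same code.
    p_observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Chance agreement, estimated from each coder's own code frequencies.
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    p_expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both coders used a single identical code throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical toy data: six responses, three possible codes.
coder_1 = ["physical", "verbal", "verbal", "none", "verbal", "none"]
coder_2 = ["physical", "verbal", "none", "none", "verbal", "verbal"]
kappa = cohen_kappa(coder_1, coder_2)
print(round(kappa, 2))
```

Kappa corrects raw agreement for the agreement two coders would reach by chance, which is why it is a stricter measure than simple percent agreement when many codes are in play.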

Overall, DiscoverText enabled our team’s timely progress through a complex research process. Following coding, we intend to make use of DT’s metadata “tagging” capabilities so that we can meaningfully export coded response summaries to their respective “tagged” schools. Finally, we intend to continue to explore the useful capabilities of DT in our research. We find DiscoverText easy to use and helpful – our questions have been kindly answered by the Texifter support team or solved by consulting the helpful material on DT’s support site!

Thanks a lot, DiscoverText!
Joseph H. Gardella

Posted in Coding, research

Five Pillars of Text Analytics

Document relevance is a key challenge for social media research. The specific problem of “word sense disambiguation” is widespread. If I am interested in “banks” where money is stored, I want to exclude mentions of river banks. If I am “Delta” airlines, I do not want to see social data about Delta faucets, Delta Force, or those pesky river deltas. If I run a sports team like the Pittsburgh Penguins, the massive numbers of Facebook posts and Tweets about flightless but adorable birds are equally problematic. There are very few social media analytics projects that can easily avoid the challenge of sorting relevant and irrelevant documents.

At Texifter, we have refined a powerful set of tools and techniques for doing word sense disambiguation. This 5-minute video uses the example of Governor Chris Christie to illustrate how the five pillars of text analytics can help anyone to identify and remove irrelevant documents from an ambiguous social data collection. The principles are very similar to spam filtering in email; we use the same mathematics. Using DiscoverText, we argue an individual or small collaborative team can create a custom machine classifier for the task in just a few hours. Someday, we hope to get this down to a few minutes.
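To make the spam-filter analogy concrete, here is a minimal sketch of a relevance classifier trained on a handful of hand-coded items. It assumes a multinomial naive Bayes model with add-one smoothing, which is a common choice for email spam filters; the post only says the mathematics is shared with spam filtering, so this is not a description of the DiscoverText implementation. The class name, labels, and training sentences are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesRelevance:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of token frequencies
        self.doc_counts = Counter()  # label -> number of training documents
        self.vocab = set()

    def train(self, text, label):
        tokens = tokenize(text)
        self.word_counts.setdefault(label, Counter()).update(tokens)
        self.doc_counts[label] += 1
        self.vocab.update(tokens)

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        tokens = tokenize(text)
        scores = {}
        for label, counts in self.word_counts.items():
            # Start from the log prior, then add a log likelihood per token.
            score = math.log(self.doc_counts[label] / total_docs)
            denominator = sum(counts.values()) + len(self.vocab)
            for token in tokens:
                score += math.log((counts[token] + 1) / denominator)
            scores[label] = score
        return max(scores, key=scores.get)


# Hypothetical human-coded examples for the "Penguins" ambiguity.
model = NaiveBayesRelevance()
model.train("Penguins win in overtime at the arena tonight", "relevant")
model.train("Crosby scores twice as the Penguins beat the Flyers", "relevant")
model.train("Penguins power play goal in the third period", "relevant")
model.train("Emperor penguins huddle on the Antarctic ice", "irrelevant")
model.train("Baby penguins at the zoo are adorable birds", "irrelevant")
model.train("Penguins waddle and swim to catch fish", "irrelevant")

hockey_label = model.classify("Penguins goal in overtime")
bird_label = model.classify("Adorable penguins swim on the ice")
print(hockey_label, bird_label)
```

In practice the training set would be the items a team has already coded by hand, and a few examples per class, as here, would be far too few for reliable results.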

Posted in DiscoverText, general, product, research, Social Media

Big Data TechCon


Posted in general

DiscoverText: A Vital Research Tool for Social Media

Longtime DiscoverText User Jacob Groshek

I’ve been using DiscoverText for several years, primarily in an academic research capacity but also working with journalists to help them reach broader audiences through social media. From an academic standpoint, DiscoverText was the backbone of collecting Facebook and Twitter data for a study on the 2012 Presidential election that was published in Social Science Computer Review. When working with the New England Center for Investigative Reporting, we use DiscoverText to collect social data and mine it to find users interested in topics being covered by the center and to share stories with them. Raw data can be exported for use in third-party software, as in the case of this work on co-mentions about flooding.

Altogether, DT is a vital tool not only for collecting data but also for coding and analyzing it. It is simply the best place to begin with social data, and it offers utilities many other entities do not, including the ability to clean data and minimize redundancies such as those created by bots. DiscoverText and Texifter personnel have my highest endorsement. It is a model enterprise for users at all levels who are looking to engage in a rich and thorough analysis of social media data.

Posted in DiscoverText, Facebook, product, research, Social Media, Twitter

DiscoverText as a Teaching and Research Tool

Conducting research on the impact of large projects and events is difficult as each undertaking is unique. Traditional quantitative techniques face limitations of internal validity while qualitative research faces challenges of external validity. However, projects and events generate a massive amount of social media traffic that can be used to understand stakeholder interactions before, during and after delivery. In addition to research, they also provide an avenue to enhance teaching and learning activities as students can collect social media data to apply new research techniques such as text mining. At Bournemouth University, we’ve launched a project called Festim that aims to develop research and teaching using data from social media networks.

For research, the initial objective is to enable the evaluation of social impacts, an area that is difficult to assess using conventional qualitative and quantitative approaches. In the teaching domain, we wish to develop Reusable Learning Objects that can guide future graduate researchers seeking to apply social media data. We also wish to widen the range of research options available to undergraduate students to include social media analysis.

We were fortunate to get a trial enterprise subscription to DiscoverText, which we used to support all of these activities. For research, DiscoverText enables us to understand the online narratives around events on Facebook, Google+, and Twitter. So far, we have been able to create a taxonomy that compares festivals by online stakeholder engagement. Our team is also exploring the nature of discussions that generate engagement across multiple platforms. We’ve used DiscoverText to uncover the nature of the temporary communities of interest that are created on social media from the discussions around festivals.

Undergraduate researchers have also deployed DiscoverText. One student has used the platform to compare the impact of music events while another has explored how social media is used to recruit volunteers. For teaching, our students have been using DiscoverText to understand the content of discussions on Facebook pages of case study companies as a way of illuminating current issues.

Posted in DiscoverText, Facebook, general, research, Twitter

Tools for Text – Lecture at Northeastern University Monday March 10, 2014

Tools for Text

Dr. Stuart W. Shulman
Founder & CEO of Texifter
Research Associate Professor of Political Science
University of Massachusetts Amherst

12pm – 1:15pm, Monday, March 10
Center for Complex Network Research
5th floor Dana Building, Northeastern University (take elevator on left)

Tools for reviewing, coding, and retrieving text found in qualitative data analysis packages carry with them no particular attributes for ensuring the reliability or accuracy of the recorded observations. Based on 13 years of multidisciplinary experience, this presentation guides researchers through key aspects of measuring coder validity and reliability as part of building custom machine classifiers. The presentation demonstrates how text mining and related analytic tools focus attention on unexpected or difficult-to-code concepts, which in many cases will constitute the most interesting terrain for deeper investigation.

Posted in general

Texifter News: Migration to Azure and the Big Boulder Initiative

A brief follow-up on Texifter. We successfully migrated “DiscoverText” (http://discovertext.com) to Microsoft’s Azure. It was very smooth, though we are going through a period of diminished search and filtering capabilities while the data re-indexes. The other capabilities appear stable.

We also launched a new beta product on Azure to allow users to get free estimates (and buy the data) self-serve from the full history of Twitter. The live prototype is “Sifter” (http://sifter.texifter.com).

Finally, I have been elected a board member and Treasurer for the Big Boulder Initiative (http://bigboulderconf.com/about/). In that capacity, I will be playing a role helping to organize the social data industry association that will launch in June at Big Boulder.

2014 is looking good for Texifter. On January 31, 2014, the company re-acquired all assets and intellectual property related to DiscoverText, including the Sifter stack of language technologies for de-duplication, clustering, coding, and machine learning, as well as the “CoderRank” patent. Going forward, we believe these tools can make a significant impact on the history of information.

Posted in general, Texifter

Collecting Facebook & Twitter Data

This is an updated 4-minute tutorial on how to collect public Facebook data via the Open Graph API using DiscoverText.

This is an even shorter 75-second tutorial on how to collect Twitter data via the public API.

Posted in API, Facebook, Social Media, Twitter

New Product Testing – TwitterSifter.com

Update 2.12.2014
The beta has been renamed Sifter and moved to http://sifter.texifter.com.


Posted in general

Digital Methods Initiative Winter 2014 Slides

It was a great joy to return to the University of Amsterdam and give this talk to my old friend Richard Rogers and his 100+ attentive workshop attendees.

Posted in DiscoverText, general, product, Social Media, Texifter, Twitter

Free Gnip-enabled Historical Twitter Estimates

Use search and powerful @Gnip Power Track operators to find the exact slice of Twitter history that you need.

Search every tweet in history via the Gnip-enabled Power Track for Twitter

Posted in general

Win Historical Twitter Datasets

Just about six hours left to win valuable historical Twitter datasets and powerful text analytics software. This is by far our best Facebook raffle yet. To enter:

  1. Log in to Facebook
  2. Visit this URL: http://bit.ly/1421tWP
  3. Tweet about the raffle, follow DiscoverText on Twitter, or like on Facebook.
  4. Do all three to increase your chances.
  5. Refer friends to do better still.

The winner will get three 10-day historical Twitter datasets, with Power Track search operators enabled by our friends @gnip, as well as gratis use of the DiscoverText software platform. Runners-up will also get valuable software prizes for a full year.

Posted in DiscoverText, general, product, Social Media, Twitter