Search, Filter, Cluster, Code & Classify Text
Texifter Blog

Gnip Geo Enhancements for Twitter Data

 Posted on February 2, 2015 by Stuart Shulman

                            
In a continuing effort to create the best possible methods to sample Twitter data, we are testing a number of Gnip geographical enhancements. For a limited time, all of the “profile” PowerTrack rules are live. You can add them to your free estimate queries in Sifter or use them in your day-forward PowerTrack rules on DiscoverText to greatly increase the geographical specificity and granularity of the results.

The Geo and Profile Enhancement Rules
has:profile_geo
Matches tweets that have any Profile Geo metadata, regardless of the actual value. Here are two sample estimates, one for a day with no keyword and another for a month with the keyword Nike:

Rule Text: has:profile_geo
Start Date: 01/01/2015
End Date: 01/01/2015
Estimated Activities: 92,450,000

Rule Text: Nike has:profile_geo
Start Date: 01/01/2015
End Date: 01/31/2015
Estimated Activities: 1,300,000

has:profile_geo_locality
Matches all activities that have a profileLocations.address.locality value present in the payload. Here is a one month sample estimate:

Rule Text: profile_country_code:us has:profile_geo_locality
Start Date: 01/01/2015
End Date: 01/31/2015
Estimated Activities: 413,851,000
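
To make the "value present in the payload" idea concrete, here is a minimal Python sketch of a presence check. It is not Gnip's implementation. It assumes the activity has already been parsed into a dictionary with a top-level "profileLocations" list (the real payload may nest this differently), and the function name and sample payload are hypothetical.

```python
from typing import Any, Dict


def has_profile_geo_field(activity: Dict[str, Any], field: str) -> bool:
    """Return True if any profileLocations entry has a non-empty address[field]."""
    for location in activity.get("profileLocations", []):
        value = location.get("address", {}).get(field)
        if value:
            return True
    return False


# Hypothetical, heavily trimmed payload for illustration only.
activity = {
    "profileLocations": [
        {"address": {"countryCode": "US", "region": "Colorado", "locality": "Boulder"}}
    ]
}

print(has_profile_geo_field(activity, "locality"))   # True
print(has_profile_geo_field(activity, "subRegion"))  # False
```

The same check would apply to any other address field by changing the field name.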

has:profile_geo_subregion
Matches all activities that have a profileLocations.address.subRegion value present in the payload. Here is a sample estimate for one month of every Tweet with a geo subregion.

Rule Text: has:profile_geo_subregion
Start Date: 01/01/2015
End Date: 01/31/2015
Estimated Activities: 400,634,000 

has:profile_geo_region
Matches all activities that have a profileLocations.address.region value present in the payload. Here is a 10% sample estimate for 1 day:

Rule Text: sample:10 has:profile_geo_region
Start Date: 01/01/2015
End Date: 01/01/2015
Estimated Activities: 6,864,000
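
A sampled estimate can be scaled back up to a rough full-volume figure. The sketch below assumes that sample:10 returns an approximately uniform 10% of the matching activities and that the estimate itself is rounded, so the result is only a ballpark. The function name is made up for this example.

```python
def extrapolate_from_sample(estimated_activities: int, sample_percent: int) -> int:
    """Scale a sampled estimate up to an approximate 100% volume."""
    if not 1 <= sample_percent <= 100:
        raise ValueError("sample_percent must be between 1 and 100")
    return round(estimated_activities * 100 / sample_percent)


# The 10% one-day estimate above, scaled to an approximate full-day volume.
print(extrapolate_from_sample(6_864_000, 10))  # 68640000
```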

profile_bounding_box:[west_long south_lat east_long north_lat]
Uses latitude and longitude to create a geographical bounding box. Here is a one-month estimate for a bounding box around Boulder, CO:

Rule Text: profile_bounding_box:[-105.301758 39.964069 -105.178505 40.09455]
Start Date: 01/01/2015
End Date: 01/31/2015
Estimated Activities: 412,000
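
Because the coordinate order (west, south, east, north) is easy to get wrong, a small helper can assemble the rule text and catch swapped values. This is a sketch only: the function name is hypothetical, and it rejects boxes that cross the antimeridian as a simplification. It does not check any size limits the service may impose.

```python
def profile_bounding_box(west_long: float, south_lat: float,
                         east_long: float, north_lat: float) -> str:
    """Build a profile_bounding_box clause after basic sanity checks."""
    if not (-180 <= west_long < east_long <= 180):
        raise ValueError("need -180 <= west_long < east_long <= 180")
    if not (-90 <= south_lat < north_lat <= 90):
        raise ValueError("need -90 <= south_lat < north_lat <= 90")
    return f"profile_bounding_box:[{west_long} {south_lat} {east_long} {north_lat}]"


# Rebuilds the Boulder, CO rule shown above.
rule = profile_bounding_box(-105.301758, 39.964069, -105.178505, 40.09455)
print(rule)
```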

profile_country_code:
Exact match on the “countryCode” field from the “address” object in the Profile Geo enrichment. Uses a normalized set of two-letter country codes, based on the ISO 3166-1 alpha-2 specification. To keep rules concise, this operator is provided in lieu of an operator for the “country” field from the “address” object. Here is an example of one week of Twitter with the country code for Brazil (BR):

Rule Text: profile_country_code:BR
Start Date: 01/01/2015
End Date: 01/07/2015
Estimated Activities: 42,694,000
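
A quick shape check can catch a full country name passed where a two-letter code is expected. The helper below is a hypothetical sketch. It only verifies that the input is two letters and does not confirm that the code is actually assigned in ISO 3166-1. Upper-casing the output is a stylistic choice here, not a documented requirement.

```python
import re


def profile_country_code(code: str) -> str:
    """Build a profile_country_code clause from a two-letter code."""
    if not re.fullmatch(r"[A-Za-z]{2}", code):
        raise ValueError(f"expected a two-letter country code, got {code!r}")
    return f"profile_country_code:{code.upper()}"


print(profile_country_code("br"))  # profile_country_code:BR
```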

profile_region:
Matches on the “region” field from the “address” object in the Profile Geo enrichment. This is an exact full string match. It is not necessary to escape characters with a backslash. For example, if matching something with a slash, use “one/two”, not “one\/two”. Use double quotes to match substrings that contain whitespace or punctuation.
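
The quoting guidance can be captured in a small helper that wraps a value in double quotes only when it contains whitespace or punctuation, and never adds backslashes. This is an illustrative sketch with a made-up function name. Since the text above does not say how to handle a value that itself contains a double quote, the sketch simply refuses such input.

```python
import re


def operator_clause(operator: str, value: str) -> str:
    """Build operator:value, quoting values with whitespace or punctuation."""
    if not value or '"' in value:
        raise ValueError("value must be non-empty and contain no double quotes")
    # Anything other than letters, digits, or underscore triggers quoting.
    needs_quotes = re.search(r"[^\w]", value) is not None
    return f'{operator}:"{value}"' if needs_quotes else f"{operator}:{value}"


print(operator_clause("profile_region", "Colorado"))     # profile_region:Colorado
print(operator_clause("profile_region", "New England"))  # profile_region:"New England"
print(operator_clause("profile_region", "one/two"))      # profile_region:"one/two"
```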

profile_region_contains:
Matches on the “region” field from the “address” object in the Profile Geo enrichment. This is a substring match for activities whose region field contains the given substring, regardless of tokenization. Use double quotes to match substrings that contain whitespace or punctuation. Here is a one-week example for regions containing Seattle or New England:

Rule Text: profile_region_contains:seattle OR profile_region_contains:new england
Start Date: 01/01/2015
End Date: 01/07/2015
Estimated Activities: 7,000

profile_locality:
Matches on the “locality” field from the “address” object in the Profile Geo enrichment. This is an exact full string match. It is not necessary to escape characters with a backslash. For example, if matching something with a slash, use “one/two”, not “one\/two”. Use double quotes to match substrings that contain whitespace or punctuation. Here is an example of 1 month for avon:

Rule Text: profile_locality:avon
Start Date: 01/01/2015
End Date: 01/31/2015
Estimated Activities: 74,000

profile_locality_contains:
Matches on the “locality” field from the “address” object in the Profile Geo enrichment. This is a substring match for activities whose locality field contains the given substring, regardless of tokenization. Use double quotes to match substrings that contain whitespace or punctuation. Here is an example of one week for york:

Rule Text: profile_locality_contains:york
Start Date: 01/01/2015
End Date: 01/07/2015
Estimated Activities: 3,660,000
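
The difference between the exact operators and their _contains variants can be illustrated locally. The sketch below is not how the matching is implemented on the service side. It assumes case-insensitive comparison, and the function names and sample locality values are invented for the example.

```python
def matches_exact(field_value: str, term: str) -> bool:
    """Full-string match, as described for profile_locality."""
    return field_value.casefold() == term.casefold()


def matches_contains(field_value: str, term: str) -> bool:
    """Substring match, as described for profile_locality_contains."""
    return term.casefold() in field_value.casefold()


# Invented locality values for illustration.
localities = ["Avon", "Avon Lake", "York", "New York City"]

print([name for name in localities if matches_exact(name, "avon")])
# ['Avon']
print([name for name in localities if matches_contains(name, "york")])
# ['York', 'New York City']
```

Under these assumptions, the wider reach of the substring form is one plausible reason a _contains rule returns a much larger estimate than an exact rule.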

profile_subregion:
Matches on the “subRegion” field from the “address” object in the Profile Geo enrichment. In addition to targeting specific counties, these operators can be helpful to filter on a metro area without defining filters for every city and town within the region. This is an exact full string match. It is not necessary to escape characters with a backslash. For example, if matching something with a slash, use “one/two”, not “one\/two”. Use double quotes to match substrings that contain whitespace or punctuation. Here is an example of one week for San Francisco County:

Rule Text: profile_subregion:"San Francisco County"
Start Date: 01/01/2015
End Date: 01/07/2015
Estimated Activities: 1,170,000

profile_subregion_contains:
Matches on the “subRegion” field from the “address” object in the Profile Geo enrichment. In addition to targeting specific counties, these operators can be helpful to filter on a metro area without defining filters for every city and town within the region. This is a substring match for activities whose subRegion field contains the given substring, regardless of tokenization. Use double quotes to match substrings that contain whitespace or punctuation.
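
As a rough illustration of the metro-area idea, a handful of county-level clauses can be joined with OR into a single rule. The helper name and the choice of counties below are hypothetical. The sketch quotes every value, on the assumption that quoting a value without whitespace is harmless.

```python
from typing import Iterable


def metro_rule(subregions: Iterable[str]) -> str:
    """Join exact-match profile_subregion clauses with OR."""
    clauses = [f'profile_subregion:"{name}"' for name in subregions]
    if not clauses:
        raise ValueError("at least one subregion is required")
    return " OR ".join(clauses)


# Illustrative county list, not a definition of any particular metro area.
print(metro_rule(["San Francisco County", "Alameda County"]))
# profile_subregion:"San Francisco County" OR profile_subregion:"Alameda County"
```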


About Stuart Shulman

Stuart Shulman is a political science professor, software inventor, entrepreneur, and garlic growing enthusiast who coaches U13 boys club soccer and in the Olympic Development Program with a national D-license. He is Founder & CEO of Texifter, LLC, Director of QDAP-UMass, and Editor Emeritus of the Journal of Information Technology & Politics. Stu is the proud owner of a Bernese/Shepherd named "Colbert" who is much better known as 'Bert. You can follow his exploits @stuartwshulman.
  • MB

    Very helpful for my analysis today! Thanks
