In a continuing effort to create the best possible methods to sample Twitter data, we testing out a number of Gnip geographical enhancements. For a limited time, all of the “profile” PowerTrack rules are live. You can add them to your free estimate queries in Sifter or use them in your day-forward PowerTrack rules on DiscoverText to greatly increase the amount and granularity of the geographical specificity in the results.
The Geo and Profile Enhancement Rules
has:profile_geo
Matches tweets that have any Profile Geo metadata, regardless of the actual value. Here are two sample estimates, one for a day with no keyword and another for a month with the keyword Nike:
Rule Text: has:profile_geo
Start Date: 01/01/2015
End Date: 01/01/2015
Estimated Activities: 92,450,000Rule Text: Nike has:profile_geo
Start Date: 01/01/2015
End Date: 01/31/2015
Estimated Activities: 1,300,000
has:profile_geo_locality
Matches all activities that have a profileLocations.address.locality value present in the payload. Here is a one month sample estimate:
Rule Text: profile_country_code:us has:profile_geo_locality
Start Date: 01/01/2015
End Date: 01/31/2015
Estimated Activities: 413,851,000
has:profile_geo_subregion
Matches all activities that have a profileLocations.address.subRegion value present in the payload. Here is a sample estimate for one month of every Tweet with a geo subregion.
Rule Text: has:profile_geo_subregion
Start Date: 01/01/2015
End Date: 01/31/2015
Estimated Activities: 400,634,000
has:profile_geo_region
Matches all activities that have a profileLocations.address.region value present in the payload. Here is a 10% sample estimate for 1 day:
Rule Text: sample:10 has:profile_geo_region
Start Date: 01/01/2015
End Date: 01/01/2015
Estimated Activities: 6,864,000
profile_bounding_box:[west_long south_lat east_long north_lat]
Uses latitude and longitude to create a geographical bounding box. Here is an example of a one month estimate for the bounding box for Boulder, CO.
Rule Text: profile_bounding_box:[-105.
301758 39.964069 -105.178505 40.09455]
Start Date: 01/01/2015
End Date: 01/31/2015
Estimated Activities: 412,000
profile_country_code:
Exact match on the “countryCode” field from the “address” object in the Profile Geo enrichment. Uses a normalized set of two-letter country codes, based on ISO-3166-1-alpha-2 specification. This operator is provided in lieu of an operator for “country” field from the “address” object to be concise. Here is an example of one week of Twitter with the country code Brazil:
Rule Text: profile_country_code:BR
Start Date: 01/01/2015
End Date: 01/07/2015
Estimated Activities: 42,694,000
profile_region:
Matches on the “region” field from the “address” object in the Profile Geo enrichment. This is an exact full string match. It is not necessary to escape characters with a backslash. For example, if matching something with a slash, use “one/two”, not “one\/two”. Use double quotes to match substrings that contain whitespace or punctuation.
profile_region_contains:
Matches on the “region” field from the “address” object in the Profile Geo enrichment. This is a substring match for activities that have the given substring in the body, regardless of tokenization. Use double quotes to match substrings that contain whitespace or punctuation. Here is an example of one week for region contains Seattle or New England.
Rule Text: profile_region_contains:
seattle OR profile_region_contains:new england
Start Date: 01/01/2015
End Date: 01/07/2015
Estimated Activities: 7,000
profile_locality:
Matches on the “locality” field from the “address” object in the Profile Geo enrichment. This is an exact full string match. It is not necessary to escape characters with a backslash. For example, if matching something with a slash, use “one/two”, not “one\/two”. Use double quotes to match substrings that contain whitespace or punctuation. Here is an example of 1 month for avon:
Rule Text: profile_locality:avon
Start Date: 01/01/2015
End Date: 01/31/2015
Estimated Activities: 74,000
profile_locality_contains:
Matches on the “locality” field from the “address” object in the Profile Geo enrichment. This is a substring match for activities that have the given substring in the body, regardless of tokenization. Use double quotes to match substrings that contain whitespace or punctuation. Here is an example of one week for york:
Rule Text: profile_locality_contains:york
Start Date: 01/01/2015
End Date: 01/07/2015
Estimated Activities: 3,660,000
profile_subregion:
Matches on the “subRegion” field from the “address” object in the Profile Geo enrichment. In addition to targeting specific counties, these operators can be helpful to filter on a metro area without defining filters for every city and town within the region. This is an exact full string match. It is not necessary to escape characters with a backslash. For example, if matching something with a slash, use “one/two”, not “one\/two”. Use double quotes to match substrings that contain whitespace or punctuation. Here is an example of one week for San Francisco County:
Rule Text: profile_subregion:”San Francisco County”
Start Date: 01/01/2015
End Date: 01/07/2015
Estimated Activities: 1,170,000
profile_subregion_contains:
Matches on the “subRegion” field from the “address” object in the Profile Geo enrichment. In addition to targeting specific counties, these operators can be helpful to filter on a metro area without defining filters for every city and town within the region. This is a substring match for activities that have the given substring in the body, regardless of tokenization. Use double quotes to match substrings that contain whitespace or punctuation.
Pingback: Gnip Geo Enhancements for Twitter Data Live in ...