
Thought this was cool: Soda vs. Pop with Twitter


One of the great things about Twitter is that it’s a global conversation anyone can join anytime. And it gets even better when you can mine this chatter to study the way humans live and interact.

For example, how do people in New York City differ from those in Silicon Valley?

Or, how does language vary as you travel across different regions? Recall the classic soda vs. pop vs. coke question: some people use the word “soda” to describe their soft drinks, others use “pop”, and still others use “coke”. Who says what where?

Let’s take a look.

United States

To make this map, I sampled geo-tagged tweets containing the words “soda”, “pop”, or “coke”, applied some state-of-the-art NLP technology to ensure the tweets were soft-drink related (e.g., the tweets had to contain “drink soda” or “drink a pop”), and filtered out coke tweets that were specifically about the Coke brand (e.g., Coke Zero).
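For the curious, the filtering step might look roughly like the sketch below. This is not the code behind the map, just a minimal illustration: the regular expressions and the `soft_drink_term` helper are assumptions built only from the examples mentioned above (“drink soda”, “drink a pop”, Coke Zero).

```python
import re

# Hypothetical patterns: the post only gives "drink soda" / "drink a pop"
# and "Coke Zero" as examples, so everything else here is a guess.
DRINK_PATTERN = re.compile(
    r"\bdrink(?:s|ing)?\s+(?:a\s+|some\s+)?(soda|pop|coke)\b",
    re.IGNORECASE,
)
BRAND_PATTERN = re.compile(r"\bcoke\s+zero\b", re.IGNORECASE)


def soft_drink_term(text):
    """Return 'soda', 'pop', or 'coke' if the tweet looks soft-drink related, else None."""
    match = DRINK_PATTERN.search(text)
    if match is None:
        return None
    term = match.group(1).lower()
    # Drop "coke" tweets that are about the brand rather than the generic term.
    if term == "coke" and BRAND_PATTERN.search(text):
        return None
    return term
```

Calling `soft_drink_term` on each sampled tweet and keeping only the non-`None` results would give one term per tweet, ready to plot.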

It’s a little cluttered, though, so let’s clean it up by aggregating nearby tweets.

United States Binned

Here, I bucketed all tweets within a radius of 0.333 degrees of latitude/longitude, calculated the term distribution within each bucket, and colored each bucket by the word whose share in that bucket was furthest from that word’s overall mean. I also sized each point according to the (log-transformed) number of tweets in the bucket.
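Here is a rough sketch of that bucketing, under a few loudly stated assumptions: it snaps tweets to a square grid of 0.333 degrees rather than using a true radius, it takes a word’s “overall mean” to be its share across all tweets, and it reads “furthest from the mean” as the largest positive deviation. The function names `bin_tweets` and `summarize` are made up for illustration.

```python
import math
from collections import Counter, defaultdict

TERMS = ("soda", "pop", "coke")


def bin_tweets(tweets, cell=0.333):
    """Group (lat, lon, term) tuples into square grid cells of `cell` degrees."""
    buckets = defaultdict(Counter)
    for lat, lon, term in tweets:
        # Assumption: a square grid stands in for the "radius" in the post.
        key = (math.floor(lat / cell), math.floor(lon / cell))
        buckets[key][term] += 1
    return buckets


def summarize(buckets):
    """Map each bucket to (label, size): the most over-represented term and log(1 + n)."""
    totals = Counter()
    for counts in buckets.values():
        totals.update(counts)
    grand_total = sum(totals.values())
    # Assumption: the "overall mean" is each term's share across all tweets.
    overall = {t: totals[t] / grand_total for t in TERMS}

    summary = {}
    for key, counts in buckets.items():
        n = sum(counts.values())
        # Assumption: "furthest from the mean" means largest positive deviation.
        label = max(TERMS, key=lambda t: counts[t] / n - overall[t])
        summary[key] = (label, math.log1p(n))
    return summary
```

Comparing each bucket against the overall share, rather than just picking the most common word, keeps a term that dominates everywhere from painting the whole map one color.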

We can see that:

  • The South is pretty Coke-heavy.
  • Soda belongs to the Northeast and far West.
  • Pop gets the Midwest, except for some interesting spots of blue around Wisconsin and the Illinois-Missouri border.

For comparison, here’s another map based on a survey at

Pop vs. Soda Map

We can see similar patterns, though interestingly, our map has less Coke in the Southeast and less pop in the Northwest.

Finally, here’s a world map of the terms, bucketed again. Notice that “pop” seems to be prevalent only in parts of the United States and Canada.


I’ve been getting a lot of questions lately about interesting things you can do with the Twitter API, so this was just one small project I’ve worked on to illustrate. This paper contains another awesome application of Twitter data to geographic language variation, and just for fun, here are two other cute mini-projects I did a while ago:

What do people eat during the Super Bowl? (wings and beer, apparently)

Superbowl Snacks

What do people want for Christmas, compared to what they actually get?


from Edwin Chen’s Blog:


Written by cwyalpha

July 7, 2012 at 4:53 am

Posted in Uncategorized

