An overview of the AAS221 tweet stream
Here are a few features of the tweets from the 221st meeting of the American Astronomical Society.
- The most retweeted tweets
- A look at what's popular
- How chatty were people?
- How popular is retweeting?
- How long did it take for retweets to occur?
- This is not a popularity contest
The data collection has now stopped.
Time range (PST): to
There have been tweets and re-tweets made by people, of which only sent retweets ().
The most retweeted tweets
The retweet numbers are close to the values that Twitter has for the tweets, which means that I can't have messed up too much (I wouldn't expect perfect agreement since Twitter does not guarantee that any search will return all matching tweets).
A look at what's popular
Twitter account | Number of re-tweets |
---|
Twitter account | Number of mentions |
---|
It is not really that surprising that the most re-tweeted accounts also appear in the most-mentioned table.
Twitter account | Number of replies received |
---|
Twitter account | Number of replies made |
---|
HashTag | Number of mentions |
---|
Note that #aas221 and #hackaas have an advantage here since the search I used was for 'aas221', 'aas 221', or 'hackaas'.
URL | Number of mentions |
---|
The counts in this table jumped significantly, at least for the top URL, on February 12th, when I updated my code to calculate the "true" URL, i.e. the one reached after following through all the link-shorteners you find on Twitter. Since some query strings commonly seen in URLs on Twitter can lead to "missing" counts, I removed the following query terms for most URLs: utm_source, utm_medium, utm_campaign, goback and cid. For YouTube links I dropped the feature term, which is why the "Fund Me, Maybe" video just sneaks in at number 6. The choice was made after reviewing the URLs.
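As an illustration of the query-term clean-up described above, here is a minimal Python sketch. It is not the code used for this page: the function name `clean_url` and the list of YouTube host names are assumptions made for the example, it applies the rule to every URL rather than to "most", and it does not follow link-shorteners (that step needs network access).

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Query terms named in the text above.
TRACKING_TERMS = {"utm_source", "utm_medium", "utm_campaign", "goback", "cid"}

# Assumed host names; the text only says "YouTube links".
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com"}


def clean_url(url):
    """Return url with the tracking query terms removed (hypothetical helper)."""
    parts = urlsplit(url)
    drop = set(TRACKING_TERMS)
    if parts.netloc.lower() in YOUTUBE_HOSTS:
        # For YouTube links the 'feature' term is dropped as well.
        drop.add("feature")
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in drop
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )
```

With this, two links to the same page that differ only in their `utm_*` terms collapse to one URL, so their mentions are counted together.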
Publisher | Number of tweets |
---|
Note that this table is created from the tweets () that included this information.
Number of programs | Number of tweets |
---|
I have not (yet?) looked into the "multi-program" cases to see if it is people with multiple devices - perhaps a smart phone during the conference and a browser/desktop application back in the hotel room - or something else.
How chatty were people?
Below I show the distribution of the number of times an account tweeted (this excludes retweets made by the account). The graph has been cut off to focus on the low numbers, since there are some users with upwards of 100 tweets. The number of tweets used in this analysis is and the graph accounts for of this total.
This plot shows the number of retweets by a user; as can be seen most people made 1 retweet, although there is a significant fraction () who made none. There are users that have been excluded by the cut off at ; the maximum number of retweets by a user is .
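The tweets-per-account distribution in this section could be built along the following lines. This is only a sketch under stated assumptions: the input layout (pairs of account name and a retweet flag), the function name, and the reading of "the graph accounts for" as the fraction of tweets that fall at or below the cut-off are all choices made for the example.

```python
from collections import Counter


def tweets_per_account_histogram(tweets, cutoff):
    """Histogram of tweets per account, truncated at cutoff.

    tweets: iterable of (account, is_retweet) pairs (assumed layout).
    Returns (shown, fraction): shown maps a tweet count to the number of
    accounts with that count, and fraction is the share of all counted
    tweets that belongs to the accounts shown.
    """
    # Count original tweets only; retweets are excluded.
    per_account = Counter(account for account, is_retweet in tweets if not is_retweet)
    # Number of tweets -> number of accounts with that many tweets.
    hist = Counter(per_account.values())
    shown = {n: count for n, count in sorted(hist.items()) if n <= cutoff}
    total = sum(per_account.values())
    covered = sum(n * count for n, count in shown.items())
    fraction = covered / total if total else 0.0
    return shown, fraction
```

The same function gives the retweets-per-user distribution if it is fed only the retweets with the flag inverted, although accounts with zero retweets would have to be added separately since they never appear in that input.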
How popular is retweeting?
Below I show the distribution of the number of times a tweet was re-tweeted. The graph has been cut off to focus on the low-numbers; as shown above there are tweets that have more than 70 retweets, but it is only of the population. The number of tweets used in this analysis is .
How long did it take for retweets to occur?
Here we look at the distribution of "retweet times" - i.e. the time between the original tweet and when it was re-tweeted. There are two graphs; the first is limited to one day, and so excludes of the distribution. The second graph shows all the tweets, but uses a much-larger bin size. The number of retweets used in this analysis is .
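A retweet-time histogram of this kind can be sketched as below. The input layout (pairs of original and retweet timestamps), the function name and the parameter names are assumptions for the example; the bin sizes used for the two graphs are not stated in the text, so they are left as arguments.

```python
from collections import Counter
from datetime import datetime, timedelta


def retweet_delay_histogram(pairs, bin_size, limit=None):
    """Bin the delay between each original tweet and its retweet.

    pairs: iterable of (original_time, retweet_time) datetime pairs (assumed layout).
    bin_size: timedelta giving the width of each bin.
    limit: optional timedelta; longer delays are left out of the histogram.
    Returns (hist, excluded): hist maps bin index to count, and excluded is
    the fraction of retweets dropped by the limit.
    """
    delays = [retweeted - original for original, retweeted in pairs]
    kept = [d for d in delays if limit is None or d <= limit]
    # Dividing one timedelta by another with // gives an integer bin index.
    hist = Counter(d // bin_size for d in kept)
    excluded = 1 - len(kept) / len(delays) if delays else 0.0
    return dict(sorted(hist.items())), excluded
```

Calling it once with `limit=timedelta(days=1)` and a small bin, and once with no limit and a much larger bin, mirrors the two graphs described above.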
Would people be interested in seeing the same distribution but for replies to tweets (i.e. conversations)?
This is not a popularity contest
I am interested in seeing whether we can split up the population, so I wonder if the distribution of number of followers vs followed for each person may tell us something. Given the Twitter etiquette of reciprocity, I expected there to be a trend along the y=x line, here shown as the diagonal line. I have excluded the accounts that have either no follows or no followers. Since the number of follows and followers for an account varies with time, I have used the maximum value for each account reported by the Twitter search.
The soft limit of 2000 follows can be seen as the upper bound on most of the users; I have added a blue rectangle showing the area where the number of follows is more than 2000 but the number of followers is less than 2000 to highlight this area. The limit is explained in the Twitter documentation on "Are there additional limits if you are following 2000+ accounts?".
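The selection of points for this plot can be sketched as follows. The record layout, the function names, and the decision to take the maximum of follows and of followers independently are assumptions for the example; the text only says that the maximum reported value was used for each account.

```python
def follow_points(records):
    """Collapse per-tweet records to one (follows, followers) point per account.

    records: iterable of (account, follows, followers) tuples as reported
    with each tweet (assumed layout).
    """
    best = {}
    for account, follows, followers in records:
        prev_follows, prev_followers = best.get(account, (0, 0))
        # Keep the largest value seen for each quantity (assumed interpretation).
        best[account] = (max(prev_follows, follows), max(prev_followers, followers))
    # Drop accounts that have no follows or no followers.
    return {a: point for a, point in best.items() if point[0] > 0 and point[1] > 0}


def in_highlighted_area(follows, followers, limit=2000):
    """True for the rectangle described above: follows above the limit, followers below it."""
    return follows > limit and followers < limit
```

Counting how many accounts satisfy `in_highlighted_area` gives a quick check of how sparsely populated the rectangle is compared with the rest of the plot.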
Credits
This visualization was created using the d3.js JavaScript library.