UPDATE: Added a new graph below
After my previous look into /r/AskScience, I wanted to do a similar analysis for another interesting reddit community: /r/Gentlemanboners. It’s a place where people submit (mostly) SFW pictures of female celebrities, usually candids from red carpet events or glamour shots from magazines. It’s a fairly popular subreddit that usually gets a post to /r/all each day.
I thought it would be interesting to analyze the submissions to /r/Gentlemanboners for a couple of reasons. First, it’s very easy to tell who’s in each picture because there’s a strict rule requiring that the full name be included in the title. Second, certain girls seem to be much more popular than others, e.g. Emma Watson. To see if this was true, I downloaded the JSON files for all submissions between April 5, 2011 and June 22, 2016 using this script, for a total of 89,842 files. Below are some simple analyses, but I might get motivated to look more closely at this dataset in the future.
The first thing I looked at was ranking the most popular girls. You can see below the number of submissions for the 15 girls with the most submissions. I ignored incorrect/alternate spellings of names, and a single submission could be counted twice for two different girls if they were both in the picture. Also, this post was my first attempt to try D3.js, so you can see the exact numbers by placing the mouse cursor over each bar.
Emma Watson is by far the most popular girl on /r/Gentlemanboners, with 40% more submissions than Taylor Swift, Anna Kendrick, or Scarlett Johansson, who are essentially tied for 2nd place. It’s interesting that no gap this large appears between any of the lower ranks, meaning that Emma Watson’s popularity surpasses the power law trend that describes the rest of the data fairly well.
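The counting rule described above (a submission counts once for every person named in its title, so a picture with two people is counted twice) can be sketched roughly as follows. This is a minimal sketch, not the exact code behind the graph: the function name `count_submissions`, the sample titles, and the `known_names` list are all made up for illustration, and it does nothing about misspelled or alternate names.

```python
from collections import Counter


def count_submissions(titles, known_names):
    """Count submissions per name.

    A title that mentions two known names is counted once for each of them.
    Matching is a plain case-insensitive substring check (an assumption).
    """
    counts = Counter()
    for title in titles:
        lowered = title.lower()
        for name in known_names:
            if name.lower() in lowered:
                counts[name] += 1
    return counts


# Made-up titles for illustration only, not taken from the dataset.
titles = [
    "Emma Watson",
    "Emma Watson and Anna Kendrick",
    "Taylor Swift at an awards show",
]
known_names = ["Emma Watson", "Anna Kendrick", "Taylor Swift"]

counts = count_submissions(titles, known_names)

# Print the 15 names with the most submissions, like the bar graph.
for name, n in counts.most_common(15):
    print(name, n)
```

A substring check keeps the sketch short, but it would miss any title that spells a name differently, which matches the choice above to ignore incorrect or alternate spellings.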
Popularity vs. Rating
The number of submissions is probably not the best measure for which girl is the most well-liked, so I also wanted to look at the scores for each submission. Below I have a scatter plot of the number of submissions vs. the average score for those submissions. The plot has the 500 girls with the most submissions. You can mouse over each dot to see the name associated with it.
There’s an overall positive correlation between the average score and the number of submissions, with Emma Watson’s dot located to the far right. However, you can also see that many girls have higher average scores despite fewer submissions.
I think the most interesting things here are the names with very high average scores but few posts. At the top is Milana Vayntrub, with only 62 submissions but an average score of 985, much higher than the average score for anyone else in this group.
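Each dot in the scatter plot is just two numbers per name: how many submissions there were and the mean of their scores. Here is a minimal sketch of that calculation, assuming the data has already been reduced to (name, score) pairs. The helper name `summarize` and the placeholder names and scores are made up for illustration and are not values from the dataset.

```python
from collections import defaultdict


def summarize(submissions):
    """Return {name: (number_of_submissions, average_score)}."""
    scores = defaultdict(list)
    for name, score in submissions:
        scores[name].append(score)
    return {name: (len(s), sum(s) / len(s)) for name, s in scores.items()}


# Made-up (name, score) pairs, one per submission.
submissions = [
    ("Person A", 100),
    ("Person A", 300),
    ("Person B", 900),
]

summary = summarize(submissions)

# x = number of submissions, y = average score for each dot.
for name, (n, avg) in summary.items():
    print(name, n, avg)
```

In this toy data, Person B has fewer submissions but a higher average score than Person A, which is the same pattern as the names near the top of the plot.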
So who is Reddit’s favorite?
I can’t really say who deserves to be called Reddit’s favorite, since it depends on whether you consider the score or the number of submissions more important. Besides Emma Watson and Milana Vayntrub, there are also girls like Alison Brie to consider. She was ranked 13th in the bar graph above, but her average score is higher than that of anyone else among the 89 most popular girls. The next most popular girl with a higher average score is Katherine McNamara (who has over 500 fewer submissions). So who is Reddit’s favorite? I’ll leave that for you to decide.
Update: Total Karma
By request, I made an additional plot of the total karma, i.e. the sum of the scores for each post. Below are the top 15 girls ranked by total karma. Emma Watson is still at the top, probably thanks to her large lead in overall submissions, but the difference between 1st and 2nd is definitely smaller. Taylor Swift was probably the hardest hit, dropping down to 11th. The stars of Black Swan (Natalie Portman and Mila Kunis) were knocked out of the top 15, replaced by Natalie Dormer and Blake Lively.
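Total karma is the sum of the scores over all of a person's posts, which is the same as the number of submissions times the average score. That is why a large lead in submissions can still carry the top spot. A minimal sketch of the ranking, again assuming (name, score) pairs and using made-up placeholder names and scores:

```python
from collections import Counter


def total_karma(submissions):
    """Sum the scores of all submissions for each name."""
    totals = Counter()
    for name, score in submissions:
        totals[name] += score
    return totals


# Made-up (name, score) pairs, one per submission.
submissions = [
    ("Person A", 100),
    ("Person A", 300),
    ("Person B", 900),
    ("Person C", 50),
]

# Top 15 names by total karma, highest first.
ranking = total_karma(submissions).most_common(15)
for name, karma in ranking:
    print(name, karma)
```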