This article is a continuation of the previous post, Reddit text mining and visualization with R Shiny.
In this article, we will explore the most popular music styles of 2016 based on Reddit's Music board.
Take a look at the data
We first load the data. Select the Music board, set the date range to 2016-01-01 through 2016-12-31, and we get the following outputs.
In 2016, there were 23,774 posts in total. The statistics and boxplot show that the median post has 1 point and 0 comments, while the most popular quarter of posts (at or above the 3rd Qu.) has at least 3 points or at least 2 comments. The histogram shows that the number of posts stays consistent over the year.
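For readers who want to reproduce the quartile figures outside the Shiny app, here is a minimal sketch in Python. The function name `five_number_summary` and the toy data are hypothetical; the "inclusive" method of `statistics.quantiles` uses linear interpolation, which should correspond to the default used by R's `summary()`.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return min, 1st Qu., median, 3rd Qu. and max of a list of numbers."""
    data = sorted(values)
    # n=4 splits the data into quartiles and returns the three cut points.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    return {"min": data[0], "q1": q1, "median": median, "q3": q3, "max": data[-1]}


# Hypothetical points for eight posts (not taken from the real data set).
toy_points = [0, 1, 1, 1, 2, 3, 5, 120]
print(five_number_summary(toy_points))
```

A single very popular post (120 points here) moves the maximum but barely touches the quartiles, which is why the median and quartiles are used throughout this article.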
We draw an author-post bar chart to see how those posts are distributed among authors.
The vertical axis shows author IDs (too many to list), and the horizontal axis shows the number of posts. As circled in red, a few people made a lot of posts. As circled in orange, most authors made only one post during the year.
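The counts behind such a chart can be sketched as follows. This is only an illustration with made-up author IDs; the function names are hypothetical and not part of the Shiny app.

```python
from collections import Counter


def posts_per_author(authors):
    """Count how many posts each author made; `authors` has one entry per post."""
    return Counter(authors)


def single_post_authors(counts):
    """Number of authors who posted exactly once."""
    return sum(1 for n in counts.values() if n == 1)


# Hypothetical author IDs, one per post.
toy_authors = ["u1", "u2", "u1", "u3", "u1", "u4"]
counts = posts_per_author(toy_authors)
print(counts.most_common(1))        # the most active author and their post count
print(single_post_authors(counts))  # how many authors posted only once
```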
Identify frequent music types
Now let's move on to identifying popular music types. The term-frequency bar chart shows the words with high tf-idf.
Here we list the top 20. Rock is at the top. Looking at the plot, we can write down the music styles that appear: Rock, Pop, Hip-Hop, Metal, and Rap.
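To make the tf-idf ranking concrete, here is a minimal sketch of one common variant: term frequency within a post times the log of the inverse document frequency, summed over all posts. The article does not state which variant or tokenizer the app uses, so the weighting, the `tokenize` helper, and the toy titles below are all assumptions.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase words; hyphenated words such as 'hip-hop' stay in one piece."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())


def tfidf_totals(docs):
    """Sum tf * idf per term over all documents (an assumed aggregation)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    totals = defaultdict(float)
    for tokens in tokenized:
        if not tokens:
            continue
        for term, count in Counter(tokens).items():
            tf = count / len(tokens)
            idf = math.log(n_docs / doc_freq[term])
            totals[term] += tf * idf
    return dict(totals)


def top_terms(docs, k=20):
    """The k terms with the highest summed tf-idf, best first."""
    totals = tfidf_totals(docs)
    return sorted(totals, key=lambda term: (-totals[term], term))[:k]


# Hypothetical post titles.
toy_docs = ["classic rock anthem", "rock ballad", "pop anthem", "jazz"]
print(top_terms(toy_docs, k=3))
```

Note that with this weighting a word that appears in every post gets an idf of zero, so very generic words fall to the bottom of the ranking.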
So far, we've been dealing with the whole year's data. Now let's extract the posts with high points (points >= 100) and plot again.
In this subset, there are 657 posts. Rock is still at the top, but Metal moves up to second and Punk is third. Rap drops out of the top 5.
Next, we focus on the posts with the most discussion (comment number >= 10).
In this subset, we have 1,546 posts. The top 5 are the same as in the high-point group, but Punk ranks lower than Pop.
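The two subsets are simple threshold filters. A minimal sketch, assuming each post is a record with `points` and `comments` fields (these field names and the toy posts are hypothetical):

```python
def filter_posts(posts, min_points=None, min_comments=None):
    """Keep posts that meet every threshold that was given."""
    selected = []
    for post in posts:
        if min_points is not None and post["points"] < min_points:
            continue
        if min_comments is not None and post["comments"] < min_comments:
            continue
        selected.append(post)
    return selected


# Hypothetical posts.
toy_posts = [
    {"title": "post A", "points": 250, "comments": 4},
    {"title": "post B", "points": 12, "comments": 31},
    {"title": "post C", "points": 1, "comments": 0},
]
high_point = filter_posts(toy_posts, min_points=100)        # points >= 100
high_discussion = filter_posts(toy_posts, min_comments=10)  # comments >= 10
print(len(high_point), len(high_discussion))
```

The term-frequency plot is then redrawn on each subset instead of on the full year.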
Brief summary
1. In all three groups (total, high-point, and high-discussion), most posts are about Rock.
2. Punk is not in the top 5 overall, but it is in the top 5 of both the high-point and high-discussion groups.
3. Taking the union of the three groups' top 20 keywords, we can build a pocket list for further analysis:
Rock, Rap, Pop, Hip-Hop, Metal, Punk, Folk, Soul, Electronic.
Look into each music style
Now we search for posts related to each music style in our pocket list.
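The article does not say how the app matches a post to a style, so the following is only one plausible sketch: a case-insensitive whole-word search on the post title, where "Hip-Hop" also matches "hip hop" and "hiphop". The function names and toy titles are hypothetical.

```python
import re


def style_pattern(style):
    """Whole-word, case-insensitive pattern; hyphens and spaces are interchangeable."""
    parts = [re.escape(part) for part in re.split(r"[-\s]+", style)]
    return re.compile(r"\b" + r"[-\s]?".join(parts) + r"\b", re.IGNORECASE)


def posts_matching(titles, style):
    """Titles that mention the given style as a whole word."""
    pattern = style_pattern(style)
    return [title for title in titles if pattern.search(title)]


# Hypothetical post titles.
toy_titles = [
    "Some band - Some song [Rock]",
    "Some producer - Some beat [Trap]",
    "Some MC - Some track [Hip Hop]",
    "Another MC - Another track [rap]",
]
print(posts_matching(toy_titles, "Rap"))      # "Trap" is not counted as "Rap"
print(posts_matching(toy_titles, "Hip-Hop"))
```

The word boundary matters for short style names: without it, every "Trap" post would be counted as "Rap".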
Looking at the boxplots and statistics, we write down the following table.
| Style | Points 1st Qu. | Points median | Points 3rd Qu. | Comments 1st Qu. | Comments median | Comments 3rd Qu. | Posts |
|---|---|---|---|---|---|---|---|
| All | 0 | 1 | 3 | 0 | 0 | 2 | 23774 |
| Rock | 1 | 2 | 5 | 0 | 0 | 1 | 5948 |
| Pop | 1 | 1 | 3 | 0 | 0 | 1 | 2182 |
| Hip-Hop | 1 | 1 | 2 | 0 | 0 | 1 | 1542 |
| Metal | 1 | 2 | 6 | 0 | 0 | 2 | 1066 |
| Punk | 1 | 2 | 6 | 0 | 0 | 2 | 853 |
| Rap | 1 | 1 | 2 | 0 | 0 | 1 | 1138 |
| Folk | 1 | 2 | 3 | 0 | 0 | 1 | 965 |
| Soul | 1 | 1.5 | 3 | 0 | 0 | 1 | 484 |
| Electronic | 1 | 1 | 2 | 0 | 0 | 1 | 957 |
For comment numbers, all music styles have a median of 0. Note that the range between the 1st and 3rd Qu. shows the variation: the smaller the range, the more representative the median is, so we should consider it when reading medians. Metal and Punk have a 3rd Qu. of 6 for points and 2 for comments, both the highest among all styles, while their 1st Qu. of points and comments is the same as the others'. This means their distributions are widely spread. Folk has a high median in points (2) and a small quartile range (1-3), which indicates a distribution concentrated around the center.
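The quartile range can be read off the table directly. The sketch below copies the points quartiles from the table and computes the interquartile range (3rd Qu. minus 1st Qu.) for each style; the variable and function names are ours.

```python
# (1st Qu., median, 3rd Qu.) of points per style, copied from the table.
points_quartiles = {
    "Rock": (1, 2, 5),
    "Pop": (1, 1, 3),
    "Hip-Hop": (1, 1, 2),
    "Metal": (1, 2, 6),
    "Punk": (1, 2, 6),
    "Rap": (1, 1, 2),
    "Folk": (1, 2, 3),
    "Soul": (1, 1.5, 3),
    "Electronic": (1, 1, 2),
}


def interquartile_range(quartiles):
    """3rd Qu. minus 1st Qu.: the spread of the middle half of the posts."""
    first, _median, third = quartiles
    return third - first


spread = {style: interquartile_range(q) for style, q in points_quartiles.items()}
widest = max(spread.values())
print(sorted(style for style, iqr in spread.items() if iqr == widest))
print(spread["Folk"])
```

Metal and Punk share the widest range (5), while Folk reaches the same median of 2 with a range of only 2.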
Brief summary
1. Rock has the most posts and a high median (2) in points.
2. Metal and Punk have widely spread (polarized) distributions of points and comment numbers.
3. Folk has a centered distribution of points.
More on music styles
In the previous analyses, we focused on individual styles. But look closer at the data and you'll see that the music styles are not mutually exclusive! There are posts tagged "Electronic Rock" or "Rock/Country". Treating the styles separately drops the information about their co-occurrence. In the following analysis, we'll focus on pairs of music styles.
Here, we plot the bigrams that occur more than 30 times in the year.
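Counting bigrams and applying a frequency cutoff can be sketched as below. The tokenizer is an assumption (plain lowercase words, so "lo-fi" becomes the pair "lo fi"), and the toy texts use a cutoff of 1 instead of 30 so the effect is visible.

```python
import re
from collections import Counter


def bigram_counts(texts):
    """Count adjacent word pairs within each text (pairs never span two texts)."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(zip(tokens, tokens[1:]))
    return counts


def frequent_bigrams(texts, min_count):
    """Bigrams that occur more than `min_count` times."""
    return {pair: n for pair, n in bigram_counts(texts).items() if n > min_count}


# Hypothetical genre tags.
toy_texts = ["alternative rock", "Alternative Rock band", "death metal"]
print(frequent_bigrams(toy_texts, min_count=1))
```

On the real data the call would use `min_count=30`, matching the "more than 30 times" cutoff used for the plot.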
Looking at the bigram cloud, we can see centers with members radiating from them. Surrounding Rock, there are "alternative", "country", "punk", "electronic"... Rap and Hip-Hop also form a cluster. Metal is another center, with "death", "heavy", and "power" as members. There are also bigrams outside these clusters, like "lo fi" and "jazz fusion".
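One way to find such centers without reading the plot is to treat each bigram as a link between two words and count how many distinct partners each word has. This is our own sketch, not necessarily how the plot was built; the pairs below are the ones named in this article, with the word order inside each pair assumed.

```python
from collections import defaultdict


def build_neighbors(bigrams):
    """Map each word to the set of words it is paired with (order ignored)."""
    neighbors = defaultdict(set)
    for first, second in bigrams:
        neighbors[first].add(second)
        neighbors[second].add(first)
    return neighbors


def family_centers(neighbors, min_members=2):
    """Words linked to at least `min_members` others, largest family first."""
    found = [(word, len(members)) for word, members in neighbors.items()
             if len(members) >= min_members]
    return sorted(found, key=lambda item: (-item[1], item[0]))


# Pairs mentioned in the text above; word order within a pair is assumed.
named_bigrams = [
    ("alternative", "rock"), ("country", "rock"), ("punk", "rock"),
    ("electronic", "rock"), ("death", "metal"), ("heavy", "metal"),
    ("power", "metal"), ("lo", "fi"), ("jazz", "fusion"),
]
print(family_centers(build_neighbors(named_bigrams)))
```

Isolated pairs such as "lo fi" and "jazz fusion" have only one partner each, so they never show up as centers.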
Brief summary
1. Music styles are families with a center style and member styles.
2. Rock is the biggest family center.
Conclusion
1. Rock is the most liked (high points) and most discussed music style.
2. Rock, Hip-Hop, Metal, and Pop are the most popular families.