Are you a fan of the works of composer Philip Glass? Are you a fan of data representation and open source statistical software packages? Then good news, we’ve found something that combines the two!
Plot your output
Interested to know how an artist changes over time? Do they start off predominantly making symphonic ballads, move on to industrial cowbell and then discover a love of shoegaze calypso?
Well, one fan of Philip Glass was interested, so they looked into his work, classifying it by genre and noting the year. This was then all plotted using the software package R – now at version 3.3.2 (Sincere Pumpkin Patch) – and displayed using a cracking little ‘exploding box plot’ package that reveals each composition individually.
Doing your own project?
If you’re currently in the middle of a research project and aren’t familiar with R, it really is worth a look, although the more graphically-friendly RStudio might be a good place to start. Let’s do a little bit of our own research RIGHT NOW and see what we can reveal. Shall we look at how many weeks the UK chart-topping singles spent in the top slot from the sixties onwards by decade?
Do singles spend less time at number one these days?
Yes, let’s do that because it’s really easy to get the data. Taking the number of weeks at number one from Wikipedia’s UK chart number ones pages, we build a two-column spreadsheet with the decade (60s to 2000s) in the first column and the number of weeks each single spent at number one in the second column. This is the dataframe we save as a .csv file and import into RStudio using the “Import Dataset” tool in the top right. Our file is called ‘OnesDecade’, with the decade column headed “Decade”, and the weeks at number one column headed “Weeks”.
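If you'd rather import by typing than by clicking, something like the sketch below should do the same job as the “Import Dataset” tool. The file name 'OnesDecade.csv' and the handful of rows are our own placeholders so the example runs on its own; they are not the real chart figures.

```r
# Placeholder rows only: these are NOT real chart figures.
csv_path <- file.path(tempdir(), "OnesDecade.csv")
writeLines(c("Decade,Weeks", "60s,3", "60s,5", "70s,2", "2000s,1"), csv_path)

# Read the two-column file into a data frame, one row per number-one single
OnesDecade <- read.csv(csv_path, stringsAsFactors = FALSE)

# Quick check of the column names and types
str(OnesDecade)
```

With your own file, you would swap csv_path for the path to wherever you saved the spreadsheet.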
Now, this might be bad form, but we’re starting with a fresh sheet and we’re not going to be doing much else, so we’ll save some typing later by attaching the data:
attach(OnesDecade)
Now, let’s have a quick look and see what we’ve got with:
boxplot (Weeks ~ Decade)
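If attaching feels like too much bad form, one alternative is to tell boxplot where to find the columns with its data argument. Here's a minimal sketch; the data frame below is made up purely so the example runs by itself and stands in for the imported OnesDecade.

```r
# Placeholder stand-in for the imported data (NOT real chart figures)
OnesDecade <- data.frame(
  Decade = c("60s", "60s", "70s", "70s", "2000s", "2000s"),
  Weeks  = c(3, 5, 2, 4, 1, 2)
)

# Same plot as before, but the columns are looked up inside OnesDecade,
# so nothing needs to be attached
boxplot(Weeks ~ Decade, data = OnesDecade)
```

The trade-off is a little more typing on every call in exchange for never wondering which “Weeks” R is actually using.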
It works, but our boxes aren’t in chronological order. We can fix that by creating a new version of “Decade” which we’ll call “Dec” with the order of the decades forced:
Dec <- factor(Decade, levels = c("60s", "70s", "80s", "90s", "2000s"))
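Why were the boxes out of order in the first place? By default R sorts the labels as text, so “2000s” jumps the queue. The sketch below, using a few placeholder labels rather than the real data, shows the default order next to the forced one, plus a quick check that no label was mistyped.

```r
# Placeholder labels standing in for the "Decade" column
Decade <- c("2000s", "60s", "90s", "70s", "80s", "60s")

# Force the chronological order of the levels
Dec <- factor(Decade, levels = c("60s", "70s", "80s", "90s", "2000s"))

levels(factor(Decade))  # default order: sorted as text, "2000s" comes first
levels(Dec)             # forced order: 60s through to 2000s

# Any label not listed in levels silently becomes NA, so it is worth a look
anyNA(Dec)
```

If anyNA gives TRUE on your own data, one of the labels in the spreadsheet doesn't match the spellings in levels.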
Now, when we plot the boxplot using “Dec” instead of “Decade”, everything runs nicely by decade from left to right. It still doesn’t look great though, does it? So let’s get it labelled appropriately:
boxplot(Weeks ~ Dec, main = "Number of Weeks at UK Number One by Decade", xlab = "Decade", ylab = "Number of Weeks")
and prettied up with a bit of colour:
par(bg = "aliceblue")
boxplot(Weeks ~ Dec, col = c("gold", "darkgreen", "cornflowerblue", "chocolate2", "bisque"), main = "Number of Weeks at UK Number One by Decade", xlab = "Decade", ylab = "Number of Weeks")
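One thing to watch: a background set with par sticks around for any later plots on the same device. Here's a minimal sketch of saving the old setting and putting it back afterwards, again with a made-up data frame standing in for the real one.

```r
# Placeholder stand-in for the imported data (NOT real chart figures)
OnesDecade <- data.frame(
  Decade = c("60s", "60s", "70s", "70s", "2000s", "2000s"),
  Weeks  = c(3, 5, 2, 4, 1, 2)
)
OnesDecade$Dec <- factor(OnesDecade$Decade, levels = c("60s", "70s", "2000s"))

# par() hands back the previous value of whatever it changes
old_par <- par(bg = "aliceblue")

boxplot(Weeks ~ Dec, data = OnesDecade,
        col = c("gold", "darkgreen", "bisque"),
        main = "Number of Weeks at UK Number One by Decade",
        xlab = "Decade", ylab = "Number of Weeks")

# Restore the background so later plots are unaffected
par(old_par)
```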
And so we have a nice-to-look-at first overview of the data, but what does it mean…? I think that might be for another day…