Who’s inside that Mechanical Turk?

Andy Baio, otherwise known as Waxy (I don’t know why), is an independent journalist and programmer who lives in Oregon. In addition to maintaining one of the most interesting link blogs on the planet, he periodically takes on research projects, including an exhaustive investigation of all 300 or so samples used in the new Girl Talk album. To compile that data he used Amazon’s “crowd-sourcing” engine, the Mechanical Turk, and became fascinated by the idea that hundreds of people were anonymously spending their time doing small research jobs for him through the service. So he posted a request that Turkers take a photo of themselves holding a piece of paper stating why they like to Turk. The results? Photos of 30 people, 10 women and 20 men, mostly young and white. Some Turk for the money, some for the “lulz” (or laughs), some just because they are bored. Thanks, Waxy.

Waxy digs into Girl Talk data

If you are the kind of data geek who loves to accumulate numbers about things and then slice and dice them to see what appears, then Andy “Waxy” Baio is your kind of guy. An independent journalist and programmer whose blog at Waxy.org is a treasure trove of such things, Andy recently analyzed the latest album from mashup DJ Girl Talk (which I wrote about here). Using data from Wikipedia, as well as some he gathered through Amazon’s “crowd-sourcing” engine, Mechanical Turk, he came up with a spreadsheet listing all the samples that Gregg “Girl Talk” Gillis used on the album (264 in all) and how many samples each song contained.

Then he created a visual timeline of where the samples appear in each song, and a bar graph that shows the age of each song used as a sample (median age: 13 years), as well as the same data laid out in a different way to show that Gillis uses a lot of recent hits and a lot of ’80s tunes, but not that many in between. What does any of this mean? Who knows. But it’s a tour de force of data porn. As always, Waxy gives a full breakdown of his methodology, and all the data can be downloaded as a CSV file if you want to run your own analysis.
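If you do want to run your own numbers, here is a minimal sketch of the two figures mentioned above: samples per song and the median age of the sampled tracks. I haven't checked the layout of Waxy's actual file, so the column names (`track`, `sample_year`) and the helper names are my own assumptions, and this is not his method, just one way to get started.

```python
import csv
import io
from collections import Counter
from statistics import median


def load_samples(csv_text):
    """Parse CSV text (with a header row) into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def samples_per_track(rows, track_col="track"):
    """Count how many samples each album track contains.

    `track_col` is an assumed column name; change it to match the real file.
    """
    return Counter(row[track_col] for row in rows)


def median_sample_age(rows, album_year, year_col="sample_year"):
    """Median age, in years, of the sampled songs relative to `album_year`.

    `year_col` is an assumed column name. Rows with a blank or missing
    year are skipped rather than guessed at.
    """
    ages = [
        album_year - int(row[year_col])
        for row in rows
        if (row.get(year_col) or "").strip()
    ]
    return median(ages)
```

To use it, read the downloaded file into a string, pass it to `load_samples`, and hand the rows to the other two functions along with the album's release year. If the real headers differ, pass the right names via `track_col` and `year_col`.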