
Thought this was cool: Interesting dataset: Million Song Dataset


As you probably all know, we are always looking for additional free, high-quality datasets to try some of our techniques on. I got the Million Song Dataset link from Clive Cox, Chief Scientist at Rummble Labs, our man in London.

Here is some information from their website:

The core of the dataset is the feature analysis and metadata for one million songs, provided by The Echo Nest. The dataset does not include any audio, only the derived features. Note, however, that sample audio can be fetched from services like 7digital, using code we provide.
The Million Song Dataset is also a cluster of complementary datasets contributed by the community:

Here is information on getting the dataset. Kaggle hosted a contest for rating music items drawn from this dataset. To evaluate performance, they used the MAP@500 metric (mean average precision, truncated at the top 500 recommendations per user), described here. Anyway, I am soon going to try out our GraphChi CF (collaborative filtering) toolbox on this dataset. Stay tuned for some results!
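To make the MAP@500 idea concrete, here is a minimal Python sketch of truncated mean average precision. It is not the contest's official scoring code. The function names, the dictionary-based inputs, and the normalization by `min(len(relevant), k)` are assumptions based on a common convention, so check the contest's own definition before comparing scores.

```python
def average_precision_at_k(recommended, relevant, k=500):
    """Truncated average precision for one user.

    recommended: ranked list of item ids (best first).
    relevant: set of item ids the user actually liked or listened to.
    Normalizing by min(len(relevant), k) is an assumed convention.
    """
    if not relevant:
        return 0.0
    hits = 0
    score = 0.0
    seen = set()
    for rank, item in enumerate(recommended[:k], start=1):
        # Count each relevant item once, even if it is recommended twice.
        if item in relevant and item not in seen:
            hits += 1
            score += hits / rank  # precision at this rank
        seen.add(item)
    return score / min(len(relevant), k)


def mean_average_precision_at_k(recommendations, ground_truth, k=500):
    """Average the per-user truncated AP over every user in ground_truth.

    recommendations: dict mapping user id -> ranked list of item ids.
    ground_truth: dict mapping user id -> set of relevant item ids.
    Users with no recommendations contribute a score of 0.
    """
    users = list(ground_truth)
    if not users:
        return 0.0
    total = sum(
        average_precision_at_k(recommendations.get(u, []), ground_truth[u], k)
        for u in users
    )
    return total / len(users)
```

For example, with the made-up ranking `["a", "b", "c"]` and relevant set `{"a", "c"}`, the hits land at ranks 1 and 3, so the per-user score is (1/1 + 2/3) / 2. Hits near the top of the list count more than hits further down, which is the point of the metric.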
from Large Scale Machine Learning and Other Animals:


Written by cwyalpha

January 16, 2013 at 10:23 am

Posted in Uncategorized

