Music Datasets for MIR
Publicly available music datasets are like gold mines for anyone working in MIR. And such datasets become diamonds if their audio data comes annotated with some sort of metadata (the more the better).
One of the most recent newcomers in this field is the great MagnaTagATune dataset, but many others exist (many coming from the MIREX initiatives over the years); check this site for a quite extensive compilation of the datasets available and used in MIR.
Anyway, I was browsing some of the music uploading sites available nowadays, mainly the ones under a blanket CC license, and I found at least four that may be usable as sources of audio data+metadata for MIR evaluations.
Free Music Archive (FMA) seems like a nice source of music for MIR research, where song uploads are selected by a limited number of “curators” (should we expect a “higher quality” selection of the songs?!). Songs only seem to have a genre/subgenre tag (as far as I could understand) and I have not found any API to retrieve the songs and their genre tags in a programmatic way…
CCMixter is another source of music data, associated with the Creative Commons initiative/licensing, but it seems to lack any type of information about genres/subgenres or other musical metadata. However, some of the tracks provide (or point to) the individual source tracks (i.e. isolated vocals, isolated drum track, etc.), which could be really nice for source separation evaluations.
Palco Principal is a Portuguese website for bands and musicians to upload their musical creations. Upon upload, each song can be tagged with a genre (from an ID3 tags list), as well as with a list of instruments used, whether there is a male/female singer, and a list of “mainstream” influences (aka well-known, commercially released bands). Some of the songs can be downloaded for free after registration (which is also free), but it is not clear which license applies to these downloads… Not sure if there is currently an API for access to the songs/metadata, though.
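Just to make the annotation fields above a bit more concrete, here is a minimal Python sketch of how such per-song metadata could be stored once collected by hand. Everything in it is an assumption for illustration only: the `TrackAnnotation` class, its field names and the `group_by_genre` helper are hypothetical and do not correspond to any actual Palco Principal schema or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TrackAnnotation:
    """Hypothetical record mirroring the tags described above."""
    title: str
    genre: str                                             # one entry from an ID3-style genre list
    instruments: List[str] = field(default_factory=list)   # instruments used in the song
    singer: Optional[str] = None                            # "male", "female", or None if unknown/instrumental
    influences: List[str] = field(default_factory=list)    # well-known bands named as influences


def group_by_genre(tracks: List[TrackAnnotation]) -> Dict[str, List[TrackAnnotation]]:
    """Group tracks by genre tag, e.g. to build a genre classification ground truth."""
    groups: Dict[str, List[TrackAnnotation]] = {}
    for track in tracks:
        groups.setdefault(track.genre, []).append(track)
    return groups
```

With records like these, building a ground truth for genre classification or instrument recognition would mostly be a matter of grouping or filtering on the relevant field.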
Finally, I just learned about the Libre.fm initiative, but I’m still waiting for an invite, so no comments till then.