Anyway, I would like to suggest that any other crazy people like me and Manhattan Glutton who want to build Song Fight!-related apps consider using the data that is already available from my "Jukebox" site before they go off and write more scraping code. The jukebox is really two apps: a database of the fight-related data with a simple RESTful web service on top that makes the data available in easy-to-consume formats, and a jukebox built on top of that database. I'd love to grow the database side, and maybe even split it out from the jukebox entirely.
Getting the data out of my site is very easy. Some examples:
All the fight data for fights started after July 1st last year, sorted with most recent fights first, in JSON:
http://sfjukebox.org/fights.json?minSta ... ding=false
All the artist data (minus the extended profile info, which I don't have yet and may never import since it's mostly stale) for artists who first entered after July 1st last year, sorted by artist name, in JSON:
http://sfjukebox.org/artists.json?minFi ... nding=true
I've limited these examples to data from the last year, but you can get the whole dataset by removing the date restriction. It's just as easy to get the info for an individual artist or fight in easy-to-consume formats.
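For anyone who wants to try this from a script, here is a minimal Python sketch of pulling one of these JSON feeds. The host and the fights.json / artists.json paths come from the links above; the helper names build_url and fetch_json are made up for illustration, and because the query parameter names are cut off in the links, the sketch takes them as a caller-supplied dictionary instead of guessing them. The shape of the returned JSON is also not assumed.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Host taken from the example links in the post.
BASE_URL = "http://sfjukebox.org"


def build_url(resource, params=None, base_url=BASE_URL):
    """Build a feed URL such as http://sfjukebox.org/fights.json.

    `resource` is "fights" or "artists" (the two feeds shown in the post).
    `params` is an optional dict of query parameters; the real parameter
    names are truncated in the post, so the caller has to supply them.
    """
    url = f"{base_url}/{resource}.json"
    if params:
        url += "?" + urlencode(params)
    return url


def fetch_json(url, opener=urlopen):
    """Download `url` and decode the response body as JSON.

    `opener` is only a hook so the function can be exercised without
    touching the network; by default it is urllib's urlopen.
    """
    with opener(url) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling fetch_json(build_url("fights")) with no parameters would request the unrestricted feed, which per the above should be the whole dataset; pass a dict of the site's filter and sort parameters to narrow it down.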
I would love to work with anybody who wants to build a Song Fight! app to make the data available in whatever format works for them, and to help add any data that is missing.
Maybe some day we could have something so good that we could flip things around: make it the real system of record, drive songfight.org off of it, and stop all the scraping and importing.