Archive data fun
- Lunkhead
- Rosselli
- Posts: 8567
- Joined: Sat Sep 25, 2004 12:14 pm
- Instruments: many
- Recording Method: cubase/mac/tascam4x4
- Submitting as: Berkeley Social Scene
- Pronouns: he/him
- Location: Central Oregon
- Contact:
Re: Archive data fun
It's already available as a CSV:
http://sfjukebox.org/songs.csv?fightTit ... ply+Filter
A shared Google spreadsheet would save people the trouble of downloading the data and importing it into a spreadsheet program and then exporting charts and graphs and uploading them somewhere.
- Lunkhead
- Rosselli
- Posts: 8567
- Joined: Sat Sep 25, 2004 12:14 pm
- Instruments: many
- Recording Method: cubase/mac/tascam4x4
- Submitting as: Berkeley Social Scene
- Pronouns: he/him
- Location: Central Oregon
- Contact:
Re: Archive data fun
Well, here's something. It seems like a 9000+ row spreadsheet is a bit more than Google Docs is meant to handle, as it feels a bit sluggish.
https://docs.google.com/spreadsheet/ccc ... n_US#gid=0
- Lunkhead
- Rosselli
- Posts: 8567
- Joined: Sat Sep 25, 2004 12:14 pm
- Instruments: many
- Recording Method: cubase/mac/tascam4x4
- Submitting as: Berkeley Social Scene
- Pronouns: he/him
- Location: Central Oregon
- Contact:
Re: Archive data fun
Oh, King Arthur, I remembered what the limitation is with the Google Chart API. I can only provide a value along the vertical axis, and a label to associate with that value on the horizontal axis. It then plots those values and labels in the order it gets them. So if I wanted it to space the fights out correctly over time, I'd have to add 0 values for every day in between the days where I'm plotting the fights. I'm probably not explaining it well. Basically it's not like you give it two coordinates and it plants a dot, at least not for the bar/column/line charts. It looks like scatter plots do work more that way.
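To make the zero-padding workaround concrete, here is a minimal Java sketch. The class and method names (`DailySeriesPadder`, `padWithZeros`) and the sample dates are made up for illustration and are not from the sfjukebox code. It assumes at most one value per date.

```java
import java.time.LocalDate;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

class DailySeriesPadder {
    /**
     * Returns one entry per calendar day from the first to the last fight date.
     * Days without a fight get 0.0, so a chart that plots values in the order
     * received ends up spacing the fights out correctly over time.
     */
    static Map<LocalDate, Double> padWithZeros(TreeMap<LocalDate, Double> votesByFightDate) {
        Map<LocalDate, Double> padded = new LinkedHashMap<>();
        if (votesByFightDate.isEmpty()) {
            return padded;
        }
        LocalDate last = votesByFightDate.lastKey();
        for (LocalDate day = votesByFightDate.firstKey(); !day.isAfter(last); day = day.plusDays(1)) {
            padded.put(day, votesByFightDate.getOrDefault(day, 0.0));
        }
        return padded;
    }

    public static void main(String[] args) {
        TreeMap<LocalDate, Double> fights = new TreeMap<>();
        fights.put(LocalDate.of(2012, 1, 2), 40.0); // hypothetical sample data
        fights.put(LocalDate.of(2012, 1, 5), 25.0);
        // One label/value pair per day, in order, ready for a column or line chart.
        padWithZeros(fights).forEach((day, value) -> System.out.println(day + " " + value));
    }
}
```

The obvious downside is that a line chart would drop to zero between fights, which is probably why a scatter plot is the better fit for date-spaced data.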
- Manhattan Glutton
- Niemöller
- Posts: 1530
- Joined: Tue Feb 15, 2005 12:10 pm
- Instruments: Angst
- Recording Method: REAPER
- Location: Madison, WI
- Contact:
Re: Archive data fun
Lunkhead wrote: 9000+ row spreadsheet maybe is a bit more than Google Docs is meant to handle
If I had a dollar for every one of my songs j$ has called a 90s pastiche, I'd have $1 for every song I've written.
Nur Ein Archives | The New Ugly Podcast
- Manhattan Glutton
- Niemöller
- Posts: 1530
- Joined: Tue Feb 15, 2005 12:10 pm
- Instruments: Angst
- Recording Method: REAPER
- Location: Madison, WI
- Contact:
Re: Archive data fun
So this is kind of a weird request...
but I think it'd be really cool to have a graph of the rolling average of %vote for an artist.
I can't explain the details since I'd probably do it wrong, but something that shows an artist's improvement over time rather than specific fights as plots. Such that the influence of previous fight results decay over time along the plot. I'm not a statistician, but I think there must be something like that.
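One standard technique that matches this description is an exponentially weighted moving average, where each new fight result is blended with the running value and older results fade out geometrically. The sketch below is only an illustration of that idea; the class name, the method name, and the `alpha` parameter are assumptions, not anything from the site's code.

```java
import java.util.ArrayList;
import java.util.List;

class DecayingAverage {
    /**
     * Exponentially weighted moving average of vote percentages, oldest first.
     * alpha in (0, 1] is the weight given to the newest fight; the influence of
     * every earlier fight shrinks by a factor of (1 - alpha) with each new fight.
     */
    static List<Double> ewma(List<Double> votePercents, double alpha) {
        List<Double> smoothed = new ArrayList<>();
        double current = 0.0;
        for (int i = 0; i < votePercents.size(); i++) {
            double v = votePercents.get(i);
            // The first fight seeds the average; later fights are blended in.
            current = (i == 0) ? v : alpha * v + (1 - alpha) * current;
            smoothed.add(current);
        }
        return smoothed;
    }

    public static void main(String[] args) {
        // Hypothetical vote percentages for three fights.
        System.out.println(ewma(List.of(10.0, 20.0, 40.0), 0.5)); // prints [10.0, 15.0, 27.5]
    }
}
```

A smaller `alpha` gives a smoother line that reacts slowly to a single good or bad fight; a larger one tracks the raw results more closely.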
- Lunkhead
- Rosselli
- Posts: 8567
- Joined: Sat Sep 25, 2004 12:14 pm
- Instruments: many
- Recording Method: cubase/mac/tascam4x4
- Submitting as: Berkeley Social Scene
- Pronouns: he/him
- Location: Central Oregon
- Contact:
Re: Archive data fun
Unless I am completely misunderstanding you (which is possible) that sounds kind of like an audio upsampling algorithm, which I guess is sort of akin to curve fitting. You make up data points between the actual data points based on weighted averages of the points on either side, or something like that. (I wrote some upsampling code only once so I'm not really an expert on that and I've never written curve fitting code.) Anyway, I'm not sure what you envision being the end result of that but I suspect it would just be a curve that connected the % votes data points. I could pretty easily switch the straight lines on the graphs to curves. I could also try to figure out a way to get the time scale along the horizontal axis to work properly (by that I mean, have each tick correspond to a fixed time unit, like a day, or a week).
Of course, if you really want "something that shows an artist's improvement over time" then maybe the Song Fight! data isn't the place to look?
- Manhattan Glutton
- Niemöller
- Posts: 1530
- Joined: Tue Feb 15, 2005 12:10 pm
- Instruments: Angst
- Recording Method: REAPER
- Location: Madison, WI
- Contact:
Re: Archive data fun
It doesn't have to be interpolated, but the points would be connected with bezier curves? A graph of the average % would be fine as well.
This is kind of what's going on in my head. I'm sure it makes no sense.
- Lunkhead
- Rosselli
- Posts: 8567
- Joined: Sat Sep 25, 2004 12:14 pm
- Instruments: many
- Recording Method: cubase/mac/tascam4x4
- Submitting as: Berkeley Social Scene
- Pronouns: he/him
- Location: Central Oregon
- Contact:
Re: Archive data fun
Ah, I think I am beginning to get what you mean, thanks for the visual aid. That looks pretty straightforward. I guess the variable there is the rate of decay of influence of past values?
- Manhattan Glutton
- Niemöller
- Posts: 1530
- Joined: Tue Feb 15, 2005 12:10 pm
- Instruments: Angst
- Recording Method: REAPER
- Location: Madison, WI
- Contact:
Re: Archive data fun
Yes, and whether it decays per unit of time or per fight. I'm not sure which is better, or whether the sum or the average is better.
I'm not a statistician, but it seems like something like that might be cool to show... something.
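For comparison, here is a rough Java sketch of the time-based option, with both the sum and the average computed from the same weights. Everything in it is an assumption for illustration: the `FightResult` holder, the half-life parameter, and the choice of halving the weight every `halfLifeDays` days.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.List;

class TimeDecaySketch {
    /** Hypothetical holder for one fight result. */
    static final class FightResult {
        final LocalDate date;
        final double votePercent;

        FightResult(LocalDate date, double votePercent) {
            this.date = date;
            this.votePercent = votePercent;
        }
    }

    /** Weight of a past fight: 1.0 on the as-of date, halved every halfLifeDays days. */
    static double weight(LocalDate fightDate, LocalDate asOf, double halfLifeDays) {
        long daysAgo = ChronoUnit.DAYS.between(fightDate, asOf);
        return Math.pow(0.5, daysAgo / halfLifeDays);
    }

    /** Decayed sum: grows with every entry, so it rewards entering often. */
    static double decayedSum(List<FightResult> results, LocalDate asOf, double halfLifeDays) {
        double sum = 0.0;
        for (FightResult r : results) {
            sum += r.votePercent * weight(r.date, asOf, halfLifeDays);
        }
        return sum;
    }

    /** Decayed average: the same sum divided by the total weight, so it stays on the vote-percent scale. */
    static double decayedAverage(List<FightResult> results, LocalDate asOf, double halfLifeDays) {
        double totalWeight = 0.0;
        for (FightResult r : results) {
            totalWeight += weight(r.date, asOf, halfLifeDays);
        }
        return totalWeight == 0.0 ? 0.0 : decayedSum(results, asOf, halfLifeDays) / totalWeight;
    }

    public static void main(String[] args) {
        LocalDate asOf = LocalDate.of(2012, 1, 31);
        List<FightResult> results = List.of(
                new FightResult(LocalDate.of(2012, 1, 1), 40.0),   // 30 days old, weight 0.5
                new FightResult(LocalDate.of(2012, 1, 31), 20.0)); // today, weight 1.0
        System.out.println(decayedSum(results, asOf, 30.0)); // prints 40.0
        System.out.println(decayedAverage(results, asOf, 30.0));
    }
}
```

Decaying per fight instead would only mean replacing `daysAgo` with the number of fights since that entry; the rest stays the same.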
- Lunkhead
- Rosselli
- Posts: 8567
- Joined: Sat Sep 25, 2004 12:14 pm
- Instruments: many
- Recording Method: cubase/mac/tascam4x4
- Submitting as: Berkeley Social Scene
- Pronouns: he/him
- Location: Central Oregon
- Contact:
Re: Archive data fun
I'm just going to keep doing things by fight for now, rather than by time, so I can avoid dealing with multiple fights on the same date and avoid having to make the horizontal scale work correctly.
- Manhattan Glutton
- Niemöller
- Posts: 1530
- Joined: Tue Feb 15, 2005 12:10 pm
- Instruments: Angst
- Recording Method: REAPER
- Location: Madison, WI
- Contact:
Re: Archive data fun
So I think... the sum would measure influence in the community (and should decay with every Song Fight title), and the average would measure consistency in a way that is more forgiving than just the plain average at each point (and should decay with each fight entered by that artist).
I probably drew the pink line incorrectly. It should be jagged.
- Lunkhead
- Rosselli
- Posts: 8567
- Joined: Sat Sep 25, 2004 12:14 pm
- Instruments: many
- Recording Method: cubase/mac/tascam4x4
- Submitting as: Berkeley Social Scene
- Pronouns: he/him
- Location: Central Oregon
- Contact:
Re: Archive data fun
So would you expect to see the "song fight power" line start high then curve down steeply and have a "long tail" of low values? I think I have it coded up and that's what I'm seeing. I'm not sure if that's what I'm supposed to see though. Should I be incorporating the percentile rank for every previous fight when calculating the "power" for a fight, or just a fixed number of previous fights?
Also I guess I'm not really sure what that value is meant to indicate either. The artist's "influence in the community" over time? What does that mean and how is that supposed to correlate to this graph?
- Lunkhead
- Rosselli
- Posts: 8567
- Joined: Sat Sep 25, 2004 12:14 pm
- Instruments: many
- Recording Method: cubase/mac/tascam4x4
- Submitting as: Berkeley Social Scene
- Pronouns: he/him
- Location: Central Oregon
- Contact:
Re: Archive data fun
I went and updated my server with the new chart, but I'm not sure it's right, or that I'll keep it. Here are some examples:
http://sfjukebox.org/artists/chart/Berk ... e?type=sfm
http://sfjukebox.org/artists/chart/Caravan+Ray?type=sfm
http://sfjukebox.org/artists/chart/Ross+Durand?type=sfm
http://sfjukebox.org/artists/chart/Host ... s?type=sfm
- Lunkhead
- Rosselli
- Posts: 8567
- Joined: Sat Sep 25, 2004 12:14 pm
- Instruments: many
- Recording Method: cubase/mac/tascam4x4
- Submitting as: Berkeley Social Scene
- Pronouns: he/him
- Location: Central Oregon
- Contact:
Re: Archive data fun
Yeah, me neither.
But it was fun to make! I don't know why I am enjoying this stuff so much, especially now that I've gotten things to the point where I can copy and paste some code and kludge out new charts pretty easily.
I think so far the percentile rank distribution/histogram seems to maybe be the most informative chart. You can see at a glance how often someone ranks highly vs in the lower percentiles. That seems like it could possibly correlate to the quality of an artist's entries.
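For anyone who wants to build the same kind of distribution from the CSV, the binning step might look roughly like this in Java. The ten-bucket split and the names `PercentileHistogram` and `deciles` are assumptions for illustration; the site may bucket its chart differently.

```java
class PercentileHistogram {
    /**
     * Counts percentile ranks (0 to 100) into ten buckets:
     * [0,10), [10,20), ..., [80,90), [90,100].
     * Out-of-range values are clamped into the first or last bucket.
     */
    static int[] deciles(double[] percentileRanks) {
        int[] counts = new int[10];
        for (double rank : percentileRanks) {
            int bucket = (int) Math.min(9, Math.max(0, Math.floor(rank / 10.0)));
            counts[bucket]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical ranks for one artist.
        int[] counts = deciles(new double[] {5, 15, 15, 95, 100});
        for (int i = 0; i < counts.length; i++) {
            System.out.println((i * 10) + "s: " + counts[i]);
        }
    }
}
```

An artist whose counts pile up in the last few buckets ranks highly most of the time; a flat spread means the results are all over the place.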
- Manhattan Glutton
- Niemöller
- Posts: 1530
- Joined: Tue Feb 15, 2005 12:10 pm
- Instruments: Angst
- Recording Method: REAPER
- Location: Madison, WI
- Contact:
Re: Archive data fun
Why are the charts going down? They should be going up!
- Lunkhead
- Rosselli
- Posts: 8567
- Joined: Sat Sep 25, 2004 12:14 pm
- Instruments: many
- Recording Method: cubase/mac/tascam4x4
- Submitting as: Berkeley Social Scene
- Pronouns: he/him
- Location: Central Oregon
- Contact:
Re: Archive data fun
They are going down because the "sum of song fight mojo" is being divided by a bigger number (the # of fights up to that fight) every time. Is that not correct? Here's my code:
// iterate through songs/fights, oldest to newest
// get artist's percentile rank for current song/fight (clamped so it is never negative)
Double percentileRank = Math.max(fightIdPercentileRankMap.get(song.getFightId()) * 100, 0d);
// add percentile rank to list of percentile ranks
rankValues.add(percentileRank);
// start the decayed sum for this fight with its own percentile rank
rankSumValues.add(percentileRank);
for (int j = (mojoIndex - 1); j >= 0; j--) {
    Double rankI = rankSumValues.get(mojoIndex);
    // add previous song/fight percentile rank, scaled down more the farther back you go
    rankI += rankValues.get(j) / (RATE_OF_MOJO_DECAY * (mojoIndex - j));
    rankSumValues.set(mojoIndex, rankI);
}
// calculate average: divide the decayed sum by the number of fights so far
mojoValues.add(rankSumValues.get(mojoIndex) / rankSumValues.size());
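A self-contained version of the same calculation makes the downward trend easy to reproduce. This is a sketch and not the site's actual code: the class and method names are invented, and the value of `RATE_OF_MOJO_DECAY` is an assumption since the post does not give it.

```java
import java.util.ArrayList;
import java.util.List;

class MojoSketch {
    // Assumed value; the post does not say what the real constant is.
    static final double RATE_OF_MOJO_DECAY = 2.0;

    /** Decayed sum at each fight: its own rank plus each earlier rank divided by (rate * fights ago). */
    static List<Double> decayedSums(List<Double> percentileRanks) {
        List<Double> sums = new ArrayList<>();
        for (int i = 0; i < percentileRanks.size(); i++) {
            double sum = percentileRanks.get(i);
            for (int j = i - 1; j >= 0; j--) {
                sum += percentileRanks.get(j) / (RATE_OF_MOJO_DECAY * (i - j));
            }
            sums.add(sum);
        }
        return sums;
    }

    /** The posted variant: the decayed sum divided by the number of fights so far. */
    static List<Double> averaged(List<Double> percentileRanks) {
        List<Double> sums = decayedSums(percentileRanks);
        List<Double> averages = new ArrayList<>();
        for (int i = 0; i < sums.size(); i++) {
            averages.add(sums.get(i) / (i + 1));
        }
        return averages;
    }

    public static void main(String[] args) {
        // A perfectly steady artist: 50th percentile in every fight.
        List<Double> ranks = List.of(50.0, 50.0, 50.0);
        System.out.println(decayedSums(ranks)); // prints [50.0, 75.0, 87.5]
        System.out.println(averaged(ranks));    // falls with every fight
    }
}
```

Even with identical results in every fight, the averaged line falls. The weights 1/(rate * k) add up like a harmonic series, which grows roughly like the logarithm of the number of fights, while the divisor grows linearly. The plain sum, by contrast, keeps creeping upward for as long as the artist keeps entering.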
- Manhattan Glutton
- Niemöller
- Posts: 1530
- Joined: Tue Feb 15, 2005 12:10 pm
- Instruments: Angst
- Recording Method: REAPER
- Location: Madison, WI
- Contact:
Re: Archive data fun
Maybe make it sum instead of average?
- Lunkhead
- Rosselli
- Posts: 8567
- Joined: Sat Sep 25, 2004 12:14 pm
- Instruments: many
- Recording Method: cubase/mac/tascam4x4
- Submitting as: Berkeley Social Scene
- Pronouns: he/him
- Location: Central Oregon
- Contact:
Re: Archive data fun
Then the lines go up. But again, I'm still not clear on exactly what it's supposed to mean?
EDIT: Site updated to use sum rather than average, and I changed the rate of decay of the influence of previous fights a bit.
- Manhattan Glutton
- Niemöller
- Posts: 1530
- Joined: Tue Feb 15, 2005 12:10 pm
- Instruments: Angst
- Recording Method: REAPER
- Location: Madison, WI
- Contact:
Re: Archive data fun
I like that better! Thanks for indulging me. I feel like it's probably a better indicator of whether the next song from an artist will be "good" than a simple average. Maybe. Like I said, I'm not a statistician. 
- Billy's Little Trip
- Odie
- Posts: 12090
- Joined: Mon Nov 13, 2006 2:56 pm
- Instruments: Guitar, Bass, Vocals, Drums, Skin Flute
- Recording Method: analog to digital via Presonus FireBox, Cubase and a porn machine
- Submitting as: Billy's Little Trip, Billy and the Psychotics
- Location: Cali fucking ornia
Re: Archive data fun
Who here likes kitties?
....me too.
random pirate.


- Lunkhead
- Rosselli
- Posts: 8567
- Joined: Sat Sep 25, 2004 12:14 pm
- Instruments: many
- Recording Method: cubase/mac/tascam4x4
- Submitting as: Berkeley Social Scene
- Pronouns: he/him
- Location: Central Oregon
- Contact:
Re: Archive data fun
Thanks for interjecting BLT. No, that does not have anything to do with masturbation. Or ... does it... ?
Anyway, I decided to nerd out in a slightly different way and dig into Yahoo Pipes a bit. Here's a pipe that, given an artist's key in the archive and full name, spits out an RSS feed for that artist's songs in the archive:
http://pipes.yahoo.com/pipes/pipe.run?_ ... cial+Scene
Pipes are cool! No, that is not a drug reference, BLT. Or ... is it...?!!