Archive data fun
- Lunkhead
- Rosselli
- Posts: 8567
- Joined: Sat Sep 25, 2004 12:14 pm
- Instruments: many
- Recording Method: cubase/mac/tascam4x4
- Submitting as: Berkeley Social Scene
- Pronouns: he/him
- Location: Central Oregon
- Contact:
Re: Archive data fun
Does anybody remember exactly which fight was the first fight that allowed voting for multiple songs? I don't remember unfortunately.
- Manhattan Glutton
- Niemöller
- Posts: 1530
- Joined: Tue Feb 15, 2005 12:10 pm
- Instruments: Angst
- Recording Method: REAPER
- Location: Madison, WI
- Contact:
Re: Archive data fun
I don't, but I have a hunch it's the one where the average # of votes per song went up significantly... graph that too!
If I had a dollar for every one of my songs j$ has called a 90s pastiche, I'd have $1 for every song I've written.
Nur Ein Archives | The New Ugly Podcast
- Lunkhead
- Rosselli
- Posts: 8567
- Joined: Sat Sep 25, 2004 12:14 pm
- Instruments: many
- Recording Method: cubase/mac/tascam4x4
- Submitting as: Berkeley Social Scene
- Pronouns: he/him
- Location: Central Oregon
- Contact:
Re: Archive data fun
A quickly thrown together chart of average # of votes per song over time is now available here:
http://sfjukebox.org/songs/charts/avgVotesDate
There actually was a quite noticeable uptick in the average number of votes per song at one point in time (July 2nd, 2008), which is when the "All We Could See At The Window" fight ended. With a little poking around on the boards I was able to determine that was in fact when Spud released the new multi-voting functionality. Very interesting.
Also interesting is that it looks to me like (and this is just a subjective thing from looking at the chart, not anything based on real analysis) the chart has three sections: 2003-2006, sustained avg. # of votes per song; 2006-2008, steadily declining avg. # of votes per song; 2008-present sustained avg. # of votes per song, at a level higher than the 2003-2006 period.
This old discussion was interesting, too. I'd forgotten about it, so it's kind of neat to look at it and also look at the data a few years on.
http://songfight.net/forums/viewtopic.php?f=12&t=5519
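To go beyond eyeballing the chart, one rough check is to average the per-fight figure on each side of a suspected change date and compare the two. This is only a sketch: the tuple layout, the function names, the cutoff date, and the sample numbers are all invented for illustration and are not taken from the jukebox data.

```python
from datetime import date
from statistics import mean

def votes_per_song(total_votes, song_count):
    """Average number of votes each song received in one fight."""
    return total_votes / song_count

def mean_before_after(fights, cutoff):
    """Split fights at `cutoff` (by end date) and average each side.

    `fights` is a list of (end_date, total_votes, song_count) tuples,
    which is a hypothetical layout, not the jukebox's real schema.
    Returns (mean_before, mean_from_cutoff_on); an empty side is None.
    """
    before = [votes_per_song(v, s) for d, v, s in fights if d < cutoff]
    after = [votes_per_song(v, s) for d, v, s in fights if d >= cutoff]
    return (mean(before) if before else None,
            mean(after) if after else None)

# Made-up numbers purely to show the call; not real Song Fight! data.
sample = [
    (date(2008, 5, 1), 60, 20),
    (date(2008, 6, 1), 40, 20),
    (date(2008, 7, 10), 120, 20),
    (date(2008, 8, 1), 100, 20),
]
before, after = mean_before_after(sample, date(2008, 7, 1))
```

The same function could be called with several candidate cutoff dates to see which split gives the largest jump.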
- Manhattan Glutton
- Niemöller
- Posts: 1530
- Joined: Tue Feb 15, 2005 12:10 pm
- Instruments: Angst
- Recording Method: REAPER
- Location: Madison, WI
- Contact:
Re: Archive data fun
Lunkhead wrote: 2006-2008, steadily declining avg.
My hiatus years. Coincidence? I think not.
If I had a dollar for every one of my songs j$ has called a 90s pastiche, I'd have $1 for every song I've written.
Nur Ein Archives | The New Ugly Podcast
- JonPorobil
- Ibárruri
- Posts: 5682
- Joined: Sat Sep 25, 2004 11:45 am
- Instruments: Piano, Guitar, Harmonica, Mandolin, Accordion, Bass, lots of VSTs
- Recording Method: Cubase 10.5
- Submitting as: Jon Eric, Jon Porobil, others
- Pronouns: He/Him
- Location: Pittsburgh, PA
- Contact:
Re: Archive data fun
The first multi-vote fight was "Walking the Border."
"Warren Zevon would be proud." -Reve Mosquito
Stages, an album about dealing with loss, anxiety, and grieving a difficult year, now available on Bandcamp and all streaming platforms! https://jonporobil.bandcamp.com/album/stages
- Lunkhead
- Rosselli
- Posts: 8567
- Joined: Sat Sep 25, 2004 12:14 pm
- Instruments: many
- Recording Method: cubase/mac/tascam4x4
- Submitting as: Berkeley Social Scene
- Pronouns: he/him
- Location: Central Oregon
- Contact:
Re: Archive data fun
Ah yes, you're correct, according to the review thread for that fight. Also, on the chart June 20th is actually the first data point of the elevated 2008-present section, so higher average votes per song still correlates with multi-voting. For some reason my mouse was only hovering over the points right before or after that one until I really tried to get it to show up just now, knowing it had to be there somewhere.
Any other graph ideas?
- Billy's Little Trip
- Odie
- Posts: 12090
- Joined: Mon Nov 13, 2006 2:56 pm
- Instruments: Guitar, Bass, Vocals, Drums, Skin Flute
- Recording Method: analog to digital via Presonus FireBox, Cubase and a porn machine
- Submitting as: Billy's Little Trip, Billy and the Psychotics
- Location: Cali fucking ornia
Re: Archive data fun
Can I say retarded stuff yet and be a dickhead? I've waited a long time, Lunk. 2 pages? You need some BLT. This thread is like watching paint dry. 
edit:
Oh wait, I just posted in here. My work is done. No need to answer the above, Lunkhard.
edit2: sorry, Linkhead?
edit3: Forgive me, lamp...........heard?
Wait! I know this!......HumpBed!......right?
Fine, I can't say it right, but you know who you are!
- fluffy
- Eisenhower
- Posts: 11267
- Joined: Sat Sep 25, 2004 10:56 am
- Instruments: sometimes
- Recording Method: Logic Pro X
- Submitting as: Sockpuppet
- Pronouns: she/they
- Location: Seattle-ish
- Contact:
Re: Archive data fun
I just saw the percentile charts. Pretty neat, but it seems like a histogram (showing the distribution of the number of fights in each percentile rank) would be a bit more useful, especially for the huge-number-of-entry artists. Also, a line chart isn't the right presentation for the historical percentile data, since it implies an interpolation between data points that is meaningless. Really it should be an x,y scatterplot. Maybe the size of the dot could be used to indicate the total number of votes or entry count or something, too.
- Lunkhead
- Rosselli
- Posts: 8567
- Joined: Sat Sep 25, 2004 12:14 pm
- Instruments: many
- Recording Method: cubase/mac/tascam4x4
- Submitting as: Berkeley Social Scene
- Pronouns: he/him
- Location: Central Oregon
- Contact:
Re: Archive data fun
Cool, I will try all that stuff when I have time, maybe tomorrow. Thanks fluffy!
- king_arthur
- Niemöller
- Posts: 1763
- Joined: Sun Sep 26, 2004 6:56 am
- Instruments: guitar, vocals, bass, BIAB, keyboards (synth anything)
- Recording Method: Tascam DP-24SD
- Submitting as: King Arthur
- Pronouns: he/him
- Location: Phoenix, AZ
- Contact:
Re: Archive data fun
While you're poking at stuff... although the X-axis is labeled with dates, it is not scaled by date - if somebody didn't enter a fight for a whole year and then entered six fights in a row, those fights are all evenly spaced. If you look at my chart, you can see where there are a couple points on the X-axis that are VERY close, from when we had that ten-title enter-all-you-want fight.
Charles (KA)
"...one does not write in dactylic hexameter purely by accident..." - poetic designs
- Lunkhead
- Rosselli
- Posts: 8567
- Joined: Sat Sep 25, 2004 12:14 pm
- Instruments: many
- Recording Method: cubase/mac/tascam4x4
- Submitting as: Berkeley Social Scene
- Pronouns: he/him
- Location: Central Oregon
- Contact:
Re: Archive data fun
I didn't realize you entered every one of those "SONG FIGHT!" fights. Way to mess up my chart! 
Seriously, though, I'm using Google's Chart API and as far as I can tell it doesn't really provide fine grained enough control to fix that issue, KA.
http://code.google.com/apis/chart/
What I would like to do is tell the API that the horizontal axis is time, specify the left and right boundary dates, and have it place the data points accordingly. As far as I could tell, that is not possible. I will probably switch over to this API at some point, which should be more configurable:
http://code.google.com/p/flot/
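If the charting library accepts explicit x coordinates, the date scaling described above can be done by hand before the data is sent to it. A minimal sketch, assuming the chart wants x values in a 0-100 range; the function name `scaled_x_positions`, the `width` parameter, and the sample dates are made up for illustration and are not part of either API linked above.

```python
from datetime import date

def scaled_x_positions(dates, left, right, width=100.0):
    """Map each date linearly onto [0, width] between `left` and `right`.

    Two fights a week apart end up close together, and a year-long gap
    shows up as a wide gap, instead of every point being evenly spaced.
    """
    span = (right - left).days
    if span <= 0:
        raise ValueError("right boundary must be after left boundary")
    return [width * (d - left).days / span for d in dates]

# Illustrative dates only: two entries ten days apart, then a long gap.
fight_dates = [date(2008, 1, 1), date(2008, 1, 11), date(2008, 4, 10)]
xs = scaled_x_positions(fight_dates, date(2008, 1, 1), date(2008, 4, 10))
```

Passing the first and last fight dates as the boundaries makes the data fill the whole axis; wider boundaries leave margins on either side.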
- Lunkhead
- Rosselli
- Posts: 8567
- Joined: Sat Sep 25, 2004 12:14 pm
- Instruments: many
- Recording Method: cubase/mac/tascam4x4
- Submitting as: Berkeley Social Scene
- Pronouns: he/him
- Location: Central Oregon
- Contact:
Re: Archive data fun
fluffy wrote: I just saw the percentile charts. Pretty neat, but it seems like a histogram (showing the distribution of the number of fights in each percentile rank) would be a bit more useful, especially for the huge-number-of-entry artists. Also, a line chart isn't the right presentation for the historical percentile data, since it implies an interpolation between data points that is meaningless. Really it should be an x,y scatterplot. Maybe the size of the dot could be used to indicate the total number of votes or entry count or something, too.
I'm not sure the histogram is all that useful. I've got it coded but not on my server yet. Looking at it, the "number of fights in each percentile rank" is almost always 1 for every song for every artist. It doesn't look like grouping it into deciles would make it more informative, either. I'll try the scatter plot, maybe that will work better.
- fluffy
- Eisenhower
- Posts: 11267
- Joined: Sat Sep 25, 2004 10:56 am
- Instruments: sometimes
- Recording Method: Logic Pro X
- Submitting as: Sockpuppet
- Pronouns: she/they
- Location: Seattle-ish
- Contact:
Re: Archive data fun
Well, I meant grouped into buckets, yeah. Maybe adjust the bucket size based on the number of fights or something (log_2 of fight count, maybe). Obviously a raw percentile histogram isn't very meaningful for such a small number of data points.
- Lunkhead
- Rosselli
- Posts: 8567
- Joined: Sat Sep 25, 2004 12:14 pm
- Instruments: many
- Recording Method: cubase/mac/tascam4x4
- Submitting as: Berkeley Social Scene
- Pronouns: he/him
- Location: Central Oregon
- Contact:
Re: Archive data fun
As I said, I'm pretty ignorant about statistics and data presentation. I don't mind learning stuff by trial and error, but I also don't mind if you want to spell things out for me. 
Anyway, grouping into deciles seemed to produce more interesting results. I've put that new chart on the site as the new default chart for artists. Here are some examples:
http://sfjukebox.org/artists/chart/Paco+del+Stinko
http://sfjukebox.org/artists/chart/Melvin
http://sfjukebox.org/artists/chart/MC%20Frontalot
I'll probably try the scatter plot thing tomorrow. Whee!
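For reference, the decile grouping can be sketched in a few lines. This assumes each entry already has a percentile rank between 0 and 100; the function name and the sample ranks are invented for illustration, and the actual jukebox code may well do it differently.

```python
def decile_histogram(percentiles):
    """Count how many percentile ranks (0-100) fall in each of ten buckets.

    Bucket 0 covers [0, 10), bucket 1 covers [10, 20), and so on; bucket 9
    covers [90, 100] so that a rank of exactly 100 is not dropped.
    """
    counts = [0] * 10
    for p in percentiles:
        if not 0 <= p <= 100:
            raise ValueError(f"percentile out of range: {p}")
        counts[min(int(p // 10), 9)] += 1
    return counts

# Made-up ranks for one hypothetical artist.
ranks = [5, 12, 18, 55, 90, 100]
hist = decile_histogram(ranks)
```

Because every artist gets the same ten buckets, the resulting columns can be compared directly from one artist to the next.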
- fluffy
- Eisenhower
- Posts: 11267
- Joined: Sat Sep 25, 2004 10:56 am
- Instruments: sometimes
- Recording Method: Logic Pro X
- Submitting as: Sockpuppet
- Pronouns: she/they
- Location: Seattle-ish
- Contact:
Re: Archive data fun
unsurprising but :(
(and yes that is much more useful)
what I mean by using log_2 is that for example you'd need 4 fights to have 2 buckets, 8 fights to have 3 buckets, 16 fights to have 4 buckets, etc.
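Read literally, that rule is the floor of log_2 of the fight count. A small sketch of it follows; the function names are made up, and falling back to a single bucket for artists with fewer than 4 fights is an assumption, since the rule above does not say what happens there.

```python
def bucket_count(fight_count):
    """floor(log2(fight_count)), but never fewer than one bucket.

    The one-bucket floor is an assumption for artists with 1-3 fights.
    """
    if fight_count < 1:
        raise ValueError("need at least one fight")
    # For a positive int n, n.bit_length() - 1 equals floor(log2(n)).
    return max(1, fight_count.bit_length() - 1)

def bucketed_histogram(percentiles):
    """Histogram of 0-100 percentile ranks using equal-width buckets,
    with the number of buckets chosen from the number of entries."""
    n = bucket_count(len(percentiles))
    width = 100 / n
    counts = [0] * n
    for p in percentiles:
        # Clamp so that a rank of exactly 100 lands in the last bucket.
        counts[min(int(p // width), n - 1)] += 1
    return counts
```

With this rule an artist needs 256 fights before getting an eighth bucket, so the bucket count grows very slowly with the number of entries.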
- fluffy
- Eisenhower
- Posts: 11267
- Joined: Sat Sep 25, 2004 10:56 am
- Instruments: sometimes
- Recording Method: Logic Pro X
- Submitting as: Sockpuppet
- Pronouns: she/they
- Location: Seattle-ish
- Contact:
Re: Archive data fun
Oh! Another neat thing would be if you could stack multiple artists together, and have each one's contribution color-coded, like this only less stupid:
- king_arthur
- Niemöller
- Posts: 1763
- Joined: Sun Sep 26, 2004 6:56 am
- Instruments: guitar, vocals, bass, BIAB, keyboards (synth anything)
- Recording Method: Tascam DP-24SD
- Submitting as: King Arthur
- Pronouns: he/him
- Location: Phoenix, AZ
- Contact:
Re: Archive data fun
Lunkhead wrote: I didn't realize you entered every one of those "SONG FIGHT!" fights. Way to mess up my chart!
I think there were three or four of us who entered all ten fights. Actually, I entered all ten, but my "Terror In Tiny Town" never seemed to make it to the fightmasters, who were probably going completely nuts trying to get all the songs posted. So I only show up in nine.
The chart as it is is still very cool, thanks for all the work you're doing!
Charles (KA)
"...one does not write in dactylic hexameter purely by accident..." - poetic designs
- Lunkhead
- Rosselli
- Posts: 8567
- Joined: Sat Sep 25, 2004 12:14 pm
- Instruments: many
- Recording Method: cubase/mac/tascam4x4
- Submitting as: Berkeley Social Scene
- Pronouns: he/him
- Location: Central Oregon
- Contact:
Re: Archive data fun
log_2... I think I can switch it to that pretty easily, may give it a shot tonight, or more likely tomorrow.
Stacked column charts with multiple artists would be neat. Unfortunately the way I'm doing things right now is all very kludgy and hacky and not very flexible. Maybe what I could also do is just load the raw data into a public Google spreadsheet...? I could even write code to update it when I update the jukebox. I wonder if people could then go to town making their own charts and graphs with the data. The only totally public spreadsheet I've seen was not editable, so maybe that wouldn't be very useful.
- Lunkhead
- Rosselli
- Posts: 8567
- Joined: Sat Sep 25, 2004 12:14 pm
- Instruments: many
- Recording Method: cubase/mac/tascam4x4
- Submitting as: Berkeley Social Scene
- Pronouns: he/him
- Location: Central Oregon
- Contact:
Re: Archive data fun
fluffy wrote: what I mean by using log_2 is that for example you'd need 4 fights to have 2 buckets, 8 fights to have 3 buckets, 16 fights to have 4 buckets, etc.
Actually, I still don't think I get it. If somebody had one song, their song would go in the 0-100 percentile bucket...? If they had four songs, they'd be split into the 0-X and Y-100 buckets...? Plus then no one would ever have over 7 buckets. I'm not sure that sounds useful, so I'm probably not understanding it correctly. It seems like having the same buckets for everybody gives a common reference point for comparing artists. Having different buckets for different artists would seem to remove that common reference point.
- fluffy
- Eisenhower
- Posts: 11267
- Joined: Sat Sep 25, 2004 10:56 am
- Instruments: sometimes
- Recording Method: Logic Pro X
- Submitting as: Sockpuppet
- Pronouns: she/they
- Location: Seattle-ish
- Contact:
Re: Archive data fun
If they only have one song then it's not that useful to know what their percentile distribution is to begin with. Anyway it's something tunable. Maybe it should be something based on confidence intervals, or something. I am not a statistician.
- Lunkhead
- Rosselli
- Posts: 8567
- Joined: Sat Sep 25, 2004 12:14 pm
- Instruments: many
- Recording Method: cubase/mac/tascam4x4
- Submitting as: Berkeley Social Scene
- Pronouns: he/him
- Location: Central Oregon
- Contact:
Re: Archive data fun
Anybody have any thoughts about the Google spreadsheet idea?
- fluffy
- Eisenhower
- Posts: 11267
- Joined: Sat Sep 25, 2004 10:56 am
- Instruments: sometimes
- Recording Method: Logic Pro X
- Submitting as: Sockpuppet
- Pronouns: she/they
- Location: Seattle-ish
- Contact:
Re: Archive data fun
Well, the point to having the stacked data was to aggregate multiple artist names for a single participant (and to see how much better each name did than the others). There isn't much point to having an overall stacked histogram for EVERY artist because it will by definition just be a flat line without much to differentiate individual artists within the columns.
If you just want to export all the raw data, why not provide it as a CSV?
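A CSV dump is only a few lines with Python's standard `csv` module. This is a sketch under assumptions: the column names and the example row are hypothetical, since the thread never shows the jukebox's actual schema.

```python
import csv
import io

# Hypothetical column names; the real data may be organized differently.
FIELDS = ["fight", "artist", "song", "votes"]

def write_csv(rows, out):
    """Write a header line plus one line per row dict to the file-like `out`."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)

# Placeholder row, not real data. The comma in the artist name shows that
# the csv module quotes fields when it needs to.
rows = [
    {"fight": "Example Title", "artist": "Example, Artist",
     "song": "Example Title", "votes": 3},
]
buf = io.StringIO()
write_csv(rows, buf)
csv_text = buf.getvalue()
```

To write a real file instead of a string, open it with `open(path, "w", newline="")` and pass the file object as `out`. Most spreadsheet programs can import the result directly.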