The Archive Project

Post lyrics to your song fight entries. If you have lyrics in search of music, post those here in the Lyric Marte thread.
Spud
Hot for Teacher
Posts: 4770
Joined: Fri Sep 24, 2004 10:25 am
Instruments: Bass, Keyboards, eHorn
Submitting as: Octothorpe
Location: Seattle
Contact:

The Archive Project

Post by Spud »

With the move away from dumbrella, many of you have foreseen the need to create a more stable archive of both song lyrics and reviews.

Some of you have been involved in independent projects working toward that goal, and I thought it would be good if we could combine our efforts and come up with something official.

If you have been involved with such a project, please post a synopsis here, with links to your interface if any, and we can begin this discussion.

I know that most of the work has been done in harvesting the lyric archive, but has anyone made any progress in harvesting the review threads?

SPUD
fluffy
Eruption
Posts: 11028
Joined: Sat Sep 25, 2004 10:56 am
Instruments: sometimes
Recording Method: Logic Pro X
Submitting as: Sockpuppet
Pronouns: she/they
Location: Seattle-ish
Contact:

Post by fluffy »

Are the majority of review threads even available for harvesting anymore? Last I checked, all of the archived threads on UBB were never properly re-indexed, and it looks like rstevens didn't migrate the archived threads over to phpBB. :/
fluffy
Eruption
Posts: 11028
Joined: Sat Sep 25, 2004 10:56 am
Instruments: sometimes
Recording Method: Logic Pro X
Submitting as: Sockpuppet
Pronouns: she/they
Location: Seattle-ish
Contact:

Post by fluffy »

crapcrapcrapcrapcrap

If we want to harvest the archived threads we have to do it NOW:
Holy Cow! The new Dumbrella boards are open! This messageboard shuts down this weekend. Your account is waiting for you in the new world!
Plat
Push Comes to Shove
Posts: 441
Joined: Sat Sep 25, 2004 5:54 pm
Instruments: teeth and other bones
Recording Method: cubase, native instruments, waves, izotope, ears
Submitting as: The Cow Exchange, Eat It 'n' Mattress
Location: Green Bay, WI
Contact:

Re: The Archive Project

Post by Plat »

Spud wrote: I know that most of the work has been done in harvesting the lyric archive, but has anyone made any progress in harvesting the review threads?
Here's what I've done. It's not finished, and I have a history of leaving things half-complete, so you're welcome to steal anything you see.

1) Wrote a Perl script to parse songfight's "archive.txt" file (containing vote counts, song names, mp3 URLs, art URLs, etc.), and build database tables from it (a song table and an entries table).

2) Wrote a Web interface to this database, so you can explore the songs. The newer songs are colored, newest being red. You can also sort results by vote count, or by name. Ignore the "Dillfrog" header/footer branding, that's just to make it easier for me to code. It's not linked from anywhere else on the Dillfrog site. http://www.dillfrog.com/tools/songfight_explorer/

3) Wrote a Perl script that downloads messages from the *OLD* messageboard (not the new one), for each known song review URL and song lyric URL. Note that the current archive.txt file has many dead links to old (non-archived) reviews. The reviews seem to exist; it's just a matter of finding the corresponding archive URL and updating this file to point to it. I can provide a list of these dead links if any group of people wants to resolve them manually. I tried, but got bored real fast. :)

4) Wrote a Perl script that parses the messageboard HTML downloaded in item 3. From this, we get clean forum data, where for each post we know the message poster's username, ID#, and of course their message.

5) Configured the parser in #4 to compare the review posts to the list of known artists/songs, and automagically decide which band/song each one belongs to. This gets roughly 50-66% of posts right, but could be better and should probably be replaced with a manual process (imagine a Web page where you see the user's messageboard post, and you have to choose whether it's a lyric at all, and if so, pick the song and band name from a dropdown list). That page wouldn't take much longer than 30-60 minutes to code.
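
For anyone picking this up, the matching in item 5 could work along these lines. This is only a rough Python sketch, not the Perl script described here: normalize and guess_song are made-up names, and the sample titles are borrowed from this thread. It looks for a known title inside the post and returns nothing when unsure, so those posts can go to a human.

```python
import re

def normalize(text):
    """Lowercase, drop apostrophes, and collapse other punctuation to spaces."""
    text = text.lower().replace("'", "")
    return " ".join(re.sub(r"[^a-z0-9]+", " ", text).split())

def guess_song(post_text, known_titles):
    """Return the longest known title found in the post, or None.

    Hypothetical helper: None means "no confident match", so the post
    should be classified by hand.
    """
    padded = f" {normalize(post_text)} "
    matches = [t for t in known_titles if f" {normalize(t)} " in padded]
    return max(matches, key=len) if matches else None

titles = ["red robot", "red zero", "dont dance", "driving"]
print(guess_song("Lyrics for DON'T DANCE, by my band", titles))
print(guess_song("Great fight this week!", titles))
```

Short titles such as "driving" will also match ordinary sentences, which is one reason a manual check still makes sense.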

So the end result is that you can see some lyrics in the URL from #2.
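
To make the song table and entries table from item 1 a bit more concrete, here is one possible layout using Python's built-in sqlite3 module. The column names, the band names, and the vote counts are all assumptions for illustration; the real tables may look quite different.

```python
import sqlite3

# Hypothetical schema: the post only says there is a song table and an
# entries table, so every column name here is a guess.
SCHEMA = """
CREATE TABLE songs (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL UNIQUE
);
CREATE TABLE entries (
    id      INTEGER PRIMARY KEY,
    song_id INTEGER NOT NULL REFERENCES songs(id),
    band    TEXT NOT NULL,
    votes   INTEGER NOT NULL DEFAULT 0,
    mp3_url TEXT,
    lyrics  TEXT
);
"""

def open_archive(path=":memory:"):
    """Create the two tables in a new database and return the connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn

def entries_by_votes(conn, title):
    """List (band, votes) for one song, highest vote count first."""
    rows = conn.execute(
        "SELECT e.band, e.votes FROM entries e "
        "JOIN songs s ON s.id = e.song_id "
        "WHERE s.title = ? ORDER BY e.votes DESC, e.band",
        (title,),
    )
    return rows.fetchall()

conn = open_archive()
conn.execute("INSERT INTO songs (id, title) VALUES (1, 'red robot')")
conn.executemany(
    "INSERT INTO entries (song_id, band, votes) VALUES (1, ?, ?)",
    [("Band A", 12), ("Band B", 30)],  # placeholder bands and counts
)
print(entries_by_votes(conn, "red robot"))
```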


I haven't added review information to the database yet. I was planning on writing some code to automagically parse the reviews and split them up by band, though since I haven't tried that yet, I'm not sure how successful that'll be. It might be another semi-automatic process.

But I don't really want to do much before hearing what everyone else is doing, to avoid rework. If you want to use any of the parsers or tables or whatever, I'm happy to share. It's not all well commented right now, though, since I'm doing it mainly as a hack.

The other thing I'd like to do is write a client-side or mixed client-server application that cleans up the MP3s and ID3 tags (theoretically this could be done server-side, but that's risky) to reflect the band's name and song title, so they show up prettier on my MP3 player.
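
As a rough idea of what the tag cleanup could involve, here is a small Python sketch that writes the old ID3v1 tag, which is just a fixed 128-byte block at the end of the file. The function names are invented for this example, it works on a copy of the data in memory rather than the file itself, and it ignores ID3v2 tags entirely, so treat it as a starting point only.

```python
def id3v1_tag(title, artist, album="", year="", comment="", genre=255):
    """Build a 128-byte ID3v1 tag; genre 255 is commonly used for 'unspecified'."""
    def field(text, size):
        # Fields are fixed width: truncate, then pad with NUL bytes.
        raw = text.encode("latin-1", errors="replace")
        return raw[:size].ljust(size, b"\x00")

    return (b"TAG" + field(title, 30) + field(artist, 30) + field(album, 30)
            + field(year, 4) + field(comment, 30) + bytes([genre]))

def retag(mp3_bytes, title, artist):
    """Return a copy of the MP3 data with its ID3v1 trailer replaced or added."""
    if len(mp3_bytes) >= 128 and mp3_bytes[-128:-125] == b"TAG":
        mp3_bytes = mp3_bytes[:-128]  # drop the old trailer first
    return mp3_bytes + id3v1_tag(title, artist)
```

Because retag returns new bytes instead of touching the original, the caller can write the result to a separate file and keep the untouched MP3 around.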

It also would be cute to have a scripted review system (e.g. a checkbox to label the song as a "keeper", a textarea box to write your review, and the option to rate songs from 1-10 on production, performance and the song itself (or similar)). But anyway, I think I'm digressing (that's more of a system to enter new reviews than to archive/categorize existing ones).
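
A scripted review like that boils down to a small record with a few checked fields. Here is a hedged Python sketch; the class and field names are made up for illustration, and the three 1-10 ratings follow the production/performance/song split suggested here.

```python
from dataclasses import dataclass

@dataclass
class Review:
    """One review of one entry (hypothetical structure)."""
    song: str
    band: str
    text: str
    keeper: bool = False
    production: int = 5
    performance: int = 5
    songwriting: int = 5

    def __post_init__(self):
        # Reject ratings outside the 1-10 scale.
        for name in ("production", "performance", "songwriting"):
            value = getattr(self, name)
            if not 1 <= value <= 10:
                raise ValueError(f"{name} must be between 1 and 10, got {value}")

    def average(self):
        """Mean of the three ratings."""
        return (self.production + self.performance + self.songwriting) / 3
```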

Anyway, food for thought. What have others done? Does anyone want to try resolving the remaining review thread URLs?
fluffy
Eruption
Posts: 11028
Joined: Sat Sep 25, 2004 10:56 am
Instruments: sometimes
Recording Method: Logic Pro X
Submitting as: Sockpuppet
Pronouns: she/they
Location: Seattle-ish
Contact:

Post by fluffy »

Once upon a time, I had that kind of free time on my hands... *sigh*
Spud
Hot for Teacher
Posts: 4770
Joined: Fri Sep 24, 2004 10:25 am
Instruments: Bass, Keyboards, eHorn
Submitting as: Octothorpe
Location: Seattle
Contact:

Post by Spud »

I have saved approximately one half of the review threads into text files to be parsed into a database. However, the links are still broken to the following review threads:

120 mph
a day in the life
a promise is a promise
a spring of teal
a very unlikely occurrence
across the dusty plains
adam and steve
adaptation
alarm in the graduate school
alright alright
angry all over
back from brazil
back to the airplanes
been to china
between the rain
blueberry hassle
boarding call
brother in law
brown boxes
cellar door
crinkle binkle
dad im changing
danger bus
deuce
dinga da donga
direct to helmet
dirty chucks
dizzy spells
do it for captain
dog without warning
dont dance
dont forget to come to my house on wednesday
down to the atom
driving
elvis in space
eternity without you
experimental fashion
fear is free
feathers
fight the sea
fire bomb
friendly fire
funny enough for you
gettin all sweaty
goodbye monster
here comes my dragon
hey ruth
hold this for me
husky youth
i love you
i want to get better
i was only joking
in full effect
keep in touch
last date
lavender splendor
less of you
let it be
lizard wizard
lucky man
martians are going to eat us
merry christmas
midnight rendezvous
minister and man
mission accomplished
more than soup
morescience
moscow idaho
new planet
nicely toasted
on my block
paper cuts
paperback writer
pass fail
please the pig
poison pill
polaris
puppets dream
rabies
race to seven
red robot
red zero
repair my heart
romantic cheapskate
run free
save a pony
say the word
shes on my mind
shipwreck
silent pipe
sincerity machine
snooter
so kind stacey
spalding gray is missing
straw man
take a pill
talk about your feelings
texas
thanks for coming
thats not what i need
the chair we share
the spirit world
there are so many possibilities
third string
top drawer
troublemaker
twelve monkeys
under the horse
under the wagon
upcoming downtime
violet wants it her way
wtf
zombie son

If anyone wants to grab a chunk of these and dig up their URLs, it would be greatly appreciated. Thanks to 15-16 puzzle for doing a bunch of them already. Most of them will be found in the archived messages at the bottom of the OLD dumbrella board: http://www.dumbrella.com//ubb/ultimateb ... forum;f=36

Post any thread URLs you find here, and I will continue the downloading process.
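
To keep track of what is still missing as URLs come in, something like this small Python sketch could help. It assumes titles may be posted in a different case or with apostrophes (for example "DON'T DANCE" for "dont dance"); the helper names are made up, and the URLs are placeholders.

```python
import re

def key(title):
    """Fold case and strip punctuation so "DON'T DANCE" matches "dont dance"."""
    return " ".join(re.sub(r"[^a-z0-9 ]", "", title.lower()).split())

def still_missing(wanted, found):
    """Return the wanted titles that have no URL in `found` yet."""
    have = {key(title) for title in found}
    return [title for title in wanted if key(title) not in have]

wanted = ["dont dance", "run free", "deuce"]
found = {
    "DON'T DANCE": "url-goes-here",  # placeholder, not a real link
    "RUN FREE": "url-goes-here",
}
print(still_missing(wanted, found))
```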
j$
Beat It
Posts: 5348
Joined: Sat Sep 25, 2004 11:33 am
Instruments: Bass, keyboards, singin', guitar
Submitting as: Johnny Cashpoint
Location: London, Engerllaaannnddd
Contact:

Post by j$ »

Spud wrote: I have saved approximately one half of the review threads into text files to be parsed into a database. However, the links are still broken to the following review threads:

minister and man
.
I'll help with this as I can, Spud - to start with, the minister and man review thread got obliterated. There is no record of all the good reviews I got ;p

j$
blindmime
A New Player
Posts: 23
Joined: Sat Sep 25, 2004 1:57 pm
Recording Method: Logic Pro
Submitting as: blind mime ensemble
Pronouns: he/him
Location: salt lake city
Contact:

Violet Wants It Her Way

Post by blindmime »

Violet Wants It Her Way

http://www.dumbrella.com//ubb/ultimateb ... 841#000006

Is this what you want? You don't need to know how many pages the thread is 'cos it's there in the first URL, right?

EDIT: go johnny go
Last edited by blindmime on Tue Sep 28, 2004 2:07 am, edited 1 time in total.
j$
Beat It
Posts: 5348
Joined: Sat Sep 25, 2004 11:33 am
Instruments: Bass, keyboards, singin', guitar
Submitting as: Johnny Cashpoint
Location: London, Engerllaaannnddd
Contact:

Post by j$ »

Originally posted by 15-16 Puzzle and A Lurker (haven't checked them all)

RUN FREE
http://www.dumbrella.com/cgi-bin/ultima ... 6;t=000059

120 MPH
http://www.dumbrella.com/cgi-bin/ultima ... 6;t=000076

NEW PLANET
http://www.dumbrella.com/cgi-bin/ultima ... 6;t=000086

DAD IM CHANGING
http://www.dumbrella.com/cgi-bin/ultima ... 6;t=000093

ALRIGHT ALRIGHT
http://www.dumbrella.com/cgi-bin/ultima ... 6;t=000199

ADAPTATION
http://www.dumbrella.com/cgi-bin/ultima ... 6;t=000270

ON MY BLOCK
http://www.dumbrella.com/cgi-bin/ultima ... 6;t=000376

LAST DATE
http://www.dumbrella.com/cgi-bin/ultima ... 6;t=000439

BACK TO THE AIRPLANES
http://www.dumbrella.com/cgi-bin/ultima ... 6;t=000500

I WAS ONLY JOKING
http://www.dumbrella.com/cgi-bin/ultima ... 6;t=000542

STRAW MAN
http://www.dumbrella.com/cgi-bin/ultima ... 6;t=000596

VIOLET WANTS IT HER WAY
http://www.dumbrella.com/cgi-bin/ultima ... 6;t=000841

DRIVING
http://www.dumbrella.com/cgi-bin/ultima ... 6;t=000822

DIRTY CHUCKS
http://www.dumbrella.com//ubb/ultimateb ... 6;t=000868

FUNNY ENOUGH FOR YOU
http://www.dumbrella.com/cgi-bin/ultima ... 6;t=000482

ALARM IN THE GRADUATE SCHOOL
http://www.dumbrella.com/cgi-bin/ultima ... 6;t=000309

HUSKY YOUTH
http://www.dumbrella.com//ubb/ultimateb ... 6;t=000326

I LOVE YOU
http://www.dumbrella.com//ubb/ultimateb ... 6;t=000014

BEEN TO CHINA
http://www.dumbrella.com//ubb/ultimateb ... 6;t=000104

DOWN TO THE ATOM
http://www.dumbrella.com//ubb/ultimateb ... 6;t=000148

TOP DRAWER
http://www.dumbrella.com//ubb/ultimateb ... 6;t=000415

POISON PILL
http://www.dumbrella.com//ubb/ultimateb ... 6;t=000429

POLARIS
http://www.dumbrella.com//ubb/ultimateb ... 6;t=000551

ANGRY ALL OVER
http://www.dumbrella.com//ubb/ultimateb ... 6;t=000798

DON'T DANCE
http://www.dumbrella.com//ubb/ultimateb ... 6;t=000045


Don't exist in review threads (I'm pretty sure, anyway):
Eternity without you
Repair My Heart
Midnight Rendezvous



There was some debate about Shipwreck -
Lurker wrote:

I chose Shipwreck's review thread, at least the one I have listed above, simply because that was the old thread on the Shipwreck page. But then I looked at it, and I realized there really wasn't anything on the thread except people bragging, talking about pedals, and Drew yelling at people. It doesn't seem to have a topic. Not to say that it isn't like the current review threads, but I think it's closer to a prefight-ish thing, and the review thread is over here:
http://www.dumbrella.com//ubb/ultimateb ... 6;t=000048

The problem is that thread number two has exactly one review, 6 replies, and Drew yelling at people (and vice versa). The important fact is that it has a review. So...I'm not sure which to put down, so I'll put down both and I guess Spud can pick.

SHIPWRECK:

ORIGINAL FLAVOR (possible prefight)
http://www.dumbrella.com//ubb/ultimateb ... 6;t=000042

NEW AND IMPROVED (not really a thread)
http://www.dumbrella.com//ubb/ultimateb ... 6;t=000048
j$
Beat It
Posts: 5348
Joined: Sat Sep 25, 2004 11:33 am
Instruments: Bass, keyboards, singin', guitar
Submitting as: Johnny Cashpoint
Location: London, Engerllaaannnddd
Contact:

Post by j$ »

Last edited by j$ on Tue Sep 28, 2004 2:56 am, edited 1 time in total.
j$
Beat It
Posts: 5348
Joined: Sat Sep 25, 2004 11:33 am
Instruments: Bass, keyboards, singin', guitar
Submitting as: Johnny Cashpoint
Location: London, Engerllaaannnddd
Contact:

Post by j$ »

Last edited by j$ on Tue Sep 28, 2004 6:05 am, edited 1 time in total.
j$
Beat It
Posts: 5348
Joined: Sat Sep 25, 2004 11:33 am
Instruments: Bass, keyboards, singin', guitar
Submitting as: Johnny Cashpoint
Location: London, Engerllaaannnddd
Contact:

Re: The Archive Project

Post by j$ »

That's very cool btw. Nice work.
Eric Y.
Ice Cream Man
Posts: 1797
Joined: Sun Sep 26, 2004 12:36 pm

Post by Eric Y. »

j$ wrote: I can't find DIRECT TO HELMET
http://www.dumbrella.com//ubb/ultimateb ... 6;t=000981
j$
Beat It
Posts: 5348
Joined: Sat Sep 25, 2004 11:33 am
Instruments: Bass, keyboards, singin', guitar
Submitting as: Johnny Cashpoint
Location: London, Engerllaaannnddd
Contact:

Post by j$ »

tviyh wrote:
j$ wrote: I can't find DIRECT TO HELMET
http://www.dumbrella.com//ubb/ultimateb ... 6;t=000981
Ah yes. Thanks TVIYH! Damn JB and his pesky cool thread titles :)

So that should be everything available from the review threads? Well, those that exist anyway. And the Gift of Music, two coverfight and two skongskirmish reviews. I didn't get the third SK cos I couldn't remember what the song was called / it wasn't obvious. And I left the Gearfights out because there never seemed to be a lot of reviews in those. And we have to draw the line somewhere I guess.

J$
Eric Y.
Ice Cream Man
Posts: 1797
Joined: Sun Sep 26, 2004 12:36 pm

Post by Eric Y. »

j$ wrote: I didn't get the third SK cos I couldn't remember what the song was called
http://www.dumbrella.com//ubb/ultimateb ... 6;t=001224
Spud
Hot for Teacher
Posts: 4770
Joined: Fri Sep 24, 2004 10:25 am
Instruments: Bass, Keyboards, eHorn
Submitting as: Octothorpe
Location: Seattle
Contact:

Post by Spud »

Excellent work, Johnny.

I have downloaded all of these threads for the project, and will be updating the archive when I get a chance, in the hopes that dumbrella deigns to leave the archives online.

Two remaining issues related to the main fights:

The link to the Deuce thread is incorrect
There is no link for Do It For Captain

If someone could dig these up, I would be grateful.
j$
Beat It
Posts: 5348
Joined: Sat Sep 25, 2004 11:33 am
Instruments: Bass, keyboards, singin', guitar
Submitting as: Johnny Cashpoint
Location: London, Engerllaaannnddd
Contact:

Post by j$ »
