eyecreate Posted July 25, 2008
I am trying to create a program that uses remixes downloaded from ocremix.org, but I needed a way to index all the remixes, so I created a Python script that indexed the whole site (as of now). I now present an XML file with the name of every song, its "song number" on the website, the game the source was from, and the name of the mp3 (which can differ from the song name). I hope that someone else may be able to find a use for it. Once the software is done, I will release it here. http://www.box.net/shared/x55a1bvokw
Eyecreate
Dafydd Posted July 25, 2008
You know, if you added the date each song was posted to the info you already have, one could make some really cool graphs showing how the number of remixes, games, and remixers has grown over time since day one. I think that would be really cool.
eyecreate Posted July 25, 2008
Great idea! I have updated the XML file link to include that info.
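For anyone who wants to try the growth-over-time idea, here is a rough sketch of how the per-year running totals could be computed from a dated XML index. The layout of the actual file is not described in this thread, so the `remixes`/`remix` element names, the attribute names, and the `YYYY-MM-DD` date format below are all assumptions made up for illustration, as is the sample data.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from itertools import accumulate

# Hypothetical sample data: the real file's element names, attribute names,
# and date format are unknown, so adjust these to match it.
SAMPLE_XML = """
<remixes>
  <remix number="1" name="Song A" game="Game X" mp3="song_a.mp3" date="2000-01-15"/>
  <remix number="2" name="Song B" game="Game Y" mp3="song_b.mp3" date="2000-06-02"/>
  <remix number="3" name="Song C" game="Game X" mp3="song_c.mp3" date="2001-03-20"/>
</remixes>
"""

def cumulative_by_year(xml_text):
    """Return [(year, running total of remixes posted up to and including that year)]."""
    root = ET.fromstring(xml_text.strip())
    # Count remixes per posting year (first four characters of the assumed date).
    per_year = Counter(int(remix.get("date")[:4]) for remix in root.iter("remix"))
    years = sorted(per_year)
    # Turn the per-year counts into a running total.
    totals = accumulate(per_year[year] for year in years)
    return list(zip(years, totals))

if __name__ == "__main__":
    for year, total in cumulative_by_year(SAMPLE_XML):
        print(year, total)
```

The resulting (year, total) pairs can be handed to whatever plotting tool you like. The same approach should work for games or remixers if you count distinct values instead of rows.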
atmuh Posted July 25, 2008
Someone comes along and does something exactly like this just about once a year.
djpretzel Posted July 25, 2008
While I appreciate the effort & enthusiasm involved here, automated tools that gather all this data from the site by hitting page after page after page are actually BAD for the site - they cause a performance hit that's really unnecessary. The most efficient way of doing this is by pulling the data directly from the database, then gzipping the result and offering that for download. I'll look into something like that, and we'll be doing statistics and reports of our own pretty soon, but in the meantime please don't spider the site yourself for this information using scripts, etc. If I see too much of that, I'd have to put anti-leech measures in place to prevent it, which could adversely affect human users, which I don't want to do. Again, I appreciate the effort & enthusiasm, but the repercussions aren't worth it; I'll try to expedite coming up with our own, official solution.
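To illustrate the "pull from the database, then gzip the result" approach described above, here is a minimal sketch. Nothing in this thread says what database the site runs or what its tables look like, so this uses Python's built-in `sqlite3` with an in-memory database as a stand-in, and the `remixes` table, its columns, and the sample rows are all hypothetical. It is only meant to show the general shape of a one-query export, not the site's actual setup.

```python
import csv
import gzip
import os
import sqlite3
import tempfile

def export_gzipped_csv(conn, query, out_path):
    """Run one query and write its rows to a gzip-compressed CSV file.

    Returns the number of data rows written (header excluded).
    """
    cursor = conn.execute(query)
    header = [column[0] for column in cursor.description]
    rows_written = 0
    with gzip.open(out_path, "wt", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(header)
        for row in cursor:
            writer.writerow(row)
            rows_written += 1
    return rows_written

def demo():
    """Build a tiny stand-in database (hypothetical schema) and export it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE remixes (number INTEGER, name TEXT, game TEXT, mp3 TEXT, posted TEXT)"
    )
    conn.executemany(
        "INSERT INTO remixes VALUES (?, ?, ?, ?, ?)",
        [
            (1, "Song A", "Game X", "song_a.mp3", "2000-01-15"),
            (2, "Song B", "Game Y", "song_b.mp3", "2000-06-02"),
        ],
    )
    out_path = os.path.join(tempfile.mkdtemp(), "remixes.csv.gz")
    rows = export_gzipped_csv(
        conn, "SELECT number, name, game, mp3, posted FROM remixes ORDER BY number", out_path
    )
    conn.close()
    # Read the file back to show what a downloader would see.
    with gzip.open(out_path, "rt", newline="", encoding="utf-8") as fh:
        return rows, list(csv.reader(fh))

if __name__ == "__main__":
    print(demo())
```

The point of doing it this way is that the whole index comes from a single query and is served as one static compressed file, instead of a script requesting every page on the site.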