
Danbooru Archives

Posted under General

I don't think that's gonna happen. We have a strained relationship with copyright as it is--selling other people's work is crossing a significant line that I for one certainly won't be a part of.

A freely downloadable archive would be more feasible, but I don't think it's practical given the bandwidth costs. A torrent might work...

Super-seeding doesn't take a lot of bandwidth, so a torrent could work. I'd recommend splitting it into segments covering fixed-length ranges of post IDs, to keep the torrents all roughly the same size.
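Roughly what I have in mind, sketched in Python (the 10,000-posts-per-segment size is just a guess at something reasonable, and the filename layout is made up):

# Sketch: group post IDs into fixed-length segments so each
# segment can become its own torrent. Segment size is arbitrary.
SEGMENT_SIZE = 10_000  # posts per torrent (assumed)

def segment_for(post_id: int) -> tuple[int, int]:
    """Return the (start, end) post-ID range the post falls into."""
    start = (post_id // SEGMENT_SIZE) * SEGMENT_SIZE
    return start, start + SEGMENT_SIZE - 1

def segment_name(post_id: int) -> str:
    start, end = segment_for(post_id)
    return f"danbooru-{start:07d}-{end:07d}"  # e.g. danbooru-0390000-0399999

if __name__ == "__main__":
    for pid in (1, 9_999, 123_456, 400_000):
        print(pid, "->", segment_name(pid))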

But what are you going to do with a huge image dump? Without the tags and translation notes etc., the danbooru collection is a veritable ocean. Setting up a portable tag storage and retrieval system would require some programming effort.
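To be fair, the bare minimum isn't rocket science; a local SQLite index along these lines would handle simple tag lookups (the two-table layout here is made up for illustration, not danbooru's actual schema). It's everything beyond that--search syntax, translation notes, and so on--that takes real effort:

import sqlite3

# Sketch: minimal offline tag index. Table layout is assumed,
# not danbooru's real schema.
db = sqlite3.connect("danbooru-tags.db")
db.executescript("""
CREATE TABLE IF NOT EXISTS posts     (id INTEGER PRIMARY KEY, md5 TEXT);
CREATE TABLE IF NOT EXISTS post_tags (post_id INTEGER, tag TEXT);
CREATE INDEX IF NOT EXISTS idx_tag ON post_tags(tag);
""")

def find(tag: str) -> list[int]:
    """Return IDs of all local posts carrying the given tag."""
    rows = db.execute(
        "SELECT post_id FROM post_tags WHERE tag = ?", (tag,))
    return [r[0] for r in rows]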

EDIT: Yeah, and what Suiseiseki said. That torrent is, IIRC, from before danbooru's nervous breakdown a couple years ago.

400,000 images are useless without some sort of search function. I have no idea what people do with those Danbooru torrents. They probably download all 20 gigabytes and then look at 50 images and then forget about it.

If you can give me a better reason than "I want 400,000 images on my hard drive" then I will provide a dump. Examples include data mining, building a client app, building a mirror, etc.

I want a dump; I was actually going to request one at some point anyway. The main reason I haven't is that I don't have enough free space at the moment. I could prepare for it, though, if I knew exactly what is needed for a current snapshot.

By dump I mean image data and as much of the DB as possible (ideally everything but the users' private settings). I want it both for an offline client and for experimenting with various features that could be useful for the site. Not to mention having a backup of all the data in case danbooru keels over and dies again. I really wouldn't want to lose my favs.


I don't especially need the images, but I'd very much like a database dump. It'd make it easier to write patches for the site if I could test against real data. I'd also like to be able to do some statistics on the site, to find out things like average scores and fav counts.
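For example, with a dump loaded locally, the kind of numbers I mean would be one query away (assuming the posts table has score and fav_count columns, which I haven't verified against the real schema):

import sqlite3  # would be a psycopg2/MySQL connection against a real dump

# Sketch: site-wide averages from a local dump. Column names
# (score, fav_count) are assumed, not confirmed.
db = sqlite3.connect("danbooru.db")
avg_score, avg_favs = db.execute(
    "SELECT AVG(score), AVG(fav_count) FROM posts").fetchone()
print(f"average score: {avg_score:.2f}, average favs: {avg_favs:.2f}")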

At some point I was thinking of asking for a dump like this, but I never got around to it. I was hoping to locally mirror danbooru so I could run fancier queries than danbooru allows and/or would time out on.

No aliases or implications in the tags table, although it wouldn't take long to grab those as well. It's stored in MySQL, with field names taken from the XML API output. Moving it into a Postgres DB loaded with Danbooru's schema could be done with a bit of scripting; I just don't have the time or motivation to do so.
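Something along these lines would probably do for the Postgres move, assuming the pymysql and psycopg2 drivers and guessing at the tags columns (the real column list would come from the dump's schema):

import pymysql
import psycopg2

# Sketch: copy the tags table from the MySQL dump into a Postgres
# DB already loaded with Danbooru's schema. Column list is assumed.
src = pymysql.connect(host="localhost", user="root", db="booru_dump")
dst = psycopg2.connect("dbname=danbooru")

with src.cursor() as read, dst.cursor() as write:
    read.execute("SELECT id, name, post_count FROM tags")
    for row in read:
        write.execute(
            "INSERT INTO tags (id, name, post_count) VALUES (%s, %s, %s)",
            row)
dst.commit()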

Added aliases and implications tables. Archive contains perl scripts for retrieving each table, an .sql file containing statements necessary to create the MySQL tables, and a readme containing a short description of each script.

http://danboorubackup.googlecode.com/files/danboorubackup.rar

I have a full dump of the images as well, which currently consumes about 130GB of disk space. If several people want a copy of this and don't want to leech off the site, I can throw it all in a rar file and torrent the whole thing.

yosome said:
I have a full dump of the images as well, which currently consumes about 130GB of disk space. If several people want a copy of this and don't want to leech off the site, I can throw it all in a rar file and torrent the whole thing.

This might be a nice idea. Does anyone know if it's possible to update a torrent while it's running? It would be useful to be able to keep it up to date as new posts come in.

This might be a useful and low-impact way of providing content for people who want to experiment with mirror danboorus.

Edit: I'd also suggest against RARing it. It wouldn't help much, given most of the content is already compressed, and it would make it harder to download specific content. If archiving is necessary, it should be done by time period, perhaps one archive per day's uploads or something like that.
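Something like this, say (using uncompressed tar since the images are already compressed, and assuming some local posts.csv with md5 and created_at columns; the exact metadata layout would depend on the dump):

import csv
import tarfile
from collections import defaultdict

# Sketch: bundle images into one uncompressed tar per upload day.
# Assumes a local posts.csv with md5,created_at columns; the real
# dump's metadata may be laid out differently.
by_day = defaultdict(list)
with open("posts.csv", newline="") as f:
    for row in csv.DictReader(f):
        day = row["created_at"][:10]          # e.g. "2008-05-14"
        by_day[day].append(f"images/{row['md5']}.jpg")

for day, files in by_day.items():
    with tarfile.open(f"danbooru-{day}.tar", "w") as tar:  # "w" = no compression
        for path in files:
            tar.add(path)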

