Danbooru

Danbooru2019 mirror/dataset

Posted under General

Hi everyone. I'm releasing the third edition of my Danbooru mirror/dataset, Danbooru2019:

https://www.gwern.net/Danbooru2019

This updates my previous Danbooru2018 (topic #15864) through 31 December 2019. It can be downloaded via torrent or rsync. All formats are as before.

The dataset now contains 3.69m images (~2.7TB) with 108m tags.

Notable uses of Danbooru2018:
https://waifulabs.com
https://www.thiswaifudoesnotexist.net/
https://selfie2anime.com/
https://www.reddit.com/r/MachineLearning/comments/edzqp8/p_deep_danbooru_girl_image_tag_estimation_source/

While updating the summary statistics, I noticed that the 's'/'q' tags grew more than 'e', so the share of 'e' shrank by a fraction of a percent. Was a specific uploader or rule change responsible for that trend, possibly someone uploading a lot of scenic images?

There haven't been any rule changes that would affect the types of content being uploaded. It's not unusual, though, for certain uploaders to be responsible for a lot of the content posted under a tag they have a strong interest in.
