Danbooru

How do you know what posts are well-tagged? +downloading many images

Posted under General

Hi, I've got a couple questions.

1. Can anyone think of a way to identify posts which are "well-tagged"? I don't need 100% accuracy, but it would be nice if I could identify a sizable subset of posts that are more or less on the money as far as tagging is concerned.

2. I need to download a few hundred thousand images, with their tags and ratings. Is there any tool that could make that easier? Also, how much should I throttle my requests so that I don't put undue strain on the site?

1. You could try searching by general tag count, using the metasearch "gentags:", like this: gentags:>30. A threshold of around 30 general tags is a fairly good cutoff for finding well-tagged posts.

2. You can easily do that with the API. Unfortunately, I don't think there's an automatic tool that also gets tags and ratings. There's Hydrus Network, but it requires you to keep using that program, so I wouldn't recommend it unless you plan on keeping the images there.

Common courtesy is generally 1 request per second, but Danbooru allows more, especially since you're Platinum level (see help:api). Write requests are limited to 4 per second for Platinum+ users, so I'd assume the same applies to reads (like downloads, which is what you're asking about).

Edit: an example of API usage would be fetching https://danbooru.donmai.us/posts.json?tags=gentags:%3E30. As you can see, it returns JSON with the tag list, rating, and a direct link to the full image, along with other info about each post. It's not hard to fetch as many posts as you want if you have even basic programming knowledge.
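For what it's worth, here is a rough sketch of that kind of fetch in Python, using only the standard library. The field names `tag_string`, `rating` and `file_url`, and the `page` and `limit` parameters, are my assumptions about the JSON layout, so check them against help:api before relying on them. `build_posts_url`, `extract_records` and `fetch_page` are just illustrative helper names, and authentication and custom headers are left out.

```python
import json
import time
import urllib.parse
import urllib.request

BASE_URL = "https://danbooru.donmai.us/posts.json"


def build_posts_url(tags, page=1, limit=100):
    """Build a search URL. `page` and `limit` are assumed parameter names."""
    # urlencode percent-escapes ':' and '>' (the latter as %3E, as in the example URL).
    query = urllib.parse.urlencode({"tags": tags, "page": page, "limit": limit})
    return f"{BASE_URL}?{query}"


def extract_records(posts):
    """Keep only id, tags, rating and image URL from a list of post dicts."""
    records = []
    for post in posts:
        url = post.get("file_url")  # assumed field name
        if not url:
            # Skip posts that expose no direct image link.
            continue
        records.append({
            "id": post.get("id"),
            "tags": post.get("tag_string", "").split(),  # assumed: space-separated tags
            "rating": post.get("rating"),
            "url": url,
        })
    return records


def fetch_page(tags, page, delay=1.0):
    """Fetch one page of results, sleeping first to stay at about 1 request per second."""
    time.sleep(delay)
    with urllib.request.urlopen(build_posts_url(tags, page)) as resp:
        return extract_records(json.load(resp))
```

Calling `fetch_page("gentags:>30", page)` in a loop over increasing page numbers would collect the metadata; the images themselves would then be downloaded from each record's `url`.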

Fheadguy said:

2. I need to download a few hundred thousand images, with their tags and ratings. Is there any tool that could make that easier? Also, how much should I throttle my requests so that I don't put undue strain on the site?

Stick to one connection at a time. nonamethanks already mentioned your API limit of four requests per second. For downloading images, just pull everything over one connection with keep-alive and the server will throttle you accordingly. When I tried it, I was limited to one full-size image per second. My guess is that thumbnails aren’t limited. If you have a fast connection, have time to spare, and want to be nice, you could voluntarily set a speed limit, especially if you download many large images.
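As a sketch of what "one connection plus a voluntary limit" could look like in Python (standard library only): the `Throttle` class and `download_all` function are hypothetical names, the one-second interval is just the courtesy figure mentioned in this thread, and the image host is left as a parameter because I'm not assuming which hostname serves the files.

```python
import http.client
import time


class Throttle:
    """Enforce a minimum interval between successive calls to wait()."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first one

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                # Too soon since the previous call: sleep off the difference.
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def download_all(host, paths, throttle):
    """Fetch each path in turn over a single HTTPS connection.

    Reading each response fully before the next request lets http.client
    reuse the same connection, provided the server keeps it alive.
    """
    conn = http.client.HTTPSConnection(host)
    try:
        for path in paths:
            throttle.wait()
            conn.request("GET", path)
            resp = conn.getresponse()
            yield path, resp.status, resp.read()
    finally:
        conn.close()
```

Since `download_all` is a generator, nothing is requested until you iterate over it, e.g. `for path, status, body in download_all(host, paths, Throttle(1.0)): ...`.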
