Batch download a tag set

qqeor

I only want to batch download *one* collection of images for myself (I'm not going to repost them or abuse Danbooru's bandwidth). Are there any good tools for doing this?

I tried this:
http://code.google.com/p/danbooru-robot/

But it seems to be broken with the current version of Danbooru...

(If asking this is against some rule feel free to delete this)

Updated by a moderator about 12 years ago

15312134215123141

over 16 years ago

There is a nice command-line one written in Perl that you can automate.

RaisingK

over 16 years ago

Here's a roundabout way:

Get the Greasemonkey Firefox extension and install the Endless Pages script.
Run the query, and put something on the Page Down key until you've loaded all of the results.
Dump this in the address bar and hit enter: javascript: var imageLink = document.getElementsByTagName("a"); for (i = 0; i < imageLink.length; i++) if (/\/post\/show\//.test(imageLink[i].href)) imageLink[i].href = imageLink[i].firstChild.src.replace(/preview\//,''); alert('done');
Use the DownThemAll extension to download all of the images linked on the page (thumbs now all point directly to the image thanks to the above).
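
For anyone who wants to read the address-bar snippet rather than paste it blind, here is a minimal sketch of the same logic spread over several lines. The function names thumbToFullUrl and rewritePostLinks are made up for illustration. It assumes, as the snippet does, that post links contain "/post/show/" and that thumbnail URLs contain a "preview/" path segment; neither is guaranteed to match the current site HTML.

```javascript
// Hypothetical helper: turn a thumbnail URL into the presumed full-size URL
// by dropping the "preview/" path segment (same replace as the one-liner).
function thumbToFullUrl(thumbSrc) {
  return thumbSrc.replace(/preview\//, "");
}

// Hypothetical helper: walk a list of <a> elements and point every post
// link at the image behind its thumbnail. Returns how many links changed.
function rewritePostLinks(links) {
  let changed = 0;
  for (let i = 0; i < links.length; i++) {
    const link = links[i];
    const thumb = link.firstChild; // assumed to be the thumbnail <img>
    if (/\/post\/show\//.test(link.href) && thumb && thumb.src) {
      link.href = thumbToFullUrl(thumb.src);
      changed++;
    }
  }
  return changed;
}

// In a browser console this would be called as:
// rewritePostLinks(document.getElementsByTagName("a"));
```

The only behavioural difference from the one-liner is the extra check that the link actually has a thumbnail child, so links without one are skipped instead of throwing an error.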

Updated by RaisingK over 16 years ago

DschingisKhan

over 16 years ago

qqeor said:
But it seems to be broken with the current version of Danbooru...

Change the address on line 82 and it works just fine, actually.

This script could use some improvement, but it's a decent starting point at least.

15312134215123141

over 16 years ago

If the mods say it's okay, I can post a MediaFire link to a program I've seen work extremely well.

albert

over 16 years ago

It's okay.

15312134215123141

over 16 years ago

I'm not exactly a programmer, so if I give a direction that doesn't really make sense, forgive me. I did not write this and I don't really know who did, but whoever they are, they're pretty smart.

First, you must get Perl. I downloaded Strawberry Perl, and it works pretty well.

Unzip the tools you will need into one folder: http://www.mediafire.com/?nwui2mqj2mk

I put these commands into .bat files so that I can download many tags automatically, but you can type them into cmd.exe as well.

1. Navigate into the folder containing the tools using command prompt. (cd C:\booru is my location)
2. The syntax you will use is "perl booru.pl [booru location you will be using, in this case danbooru] [tag, can be more than one]"

Take a look at the enclosed example. The line "perl booru.pl danbooru hakurei_reimu -f={md5}" will download all images with the hakurei_reimu tag and name them with their md5 hashes. There are more modifiers you can add to make your life easier.

-f= (filename template; I just leave this as {md5})
-l=NUMBER (limits downloads to a certain number)
-s=NUMBER (starts downloading from the given post number onward)
-r=NUMBER (downloads based on rating: 1=safe, 2=questionable, 3=explicit)
-p= (sets the path to save the files in instead of the current directory; a real lifesaver when you start to put this stuff in batch files. Remember to use quotes if you have any spaces in your path)
-u=username -w=password (your login; put these in to avoid the "couldn't get at line xxx" error or whatever)

example:
perl booru.pl danbooru ibuki_fuuko -r=3 -f=${md5} -p="C:\Pictures\Fuuko"

This will download all images tagged ibuki_fuuko that are rated explicit from danbooru, name them with their md5 hashes, and place them in the C:\Pictures\Fuuko directory.

There is some documentation included with this, although it's not very user friendly. I don't know everything about this program, but if you want to ask me questions you can PM me.

Updated by 15312134215123141 over 15 years ago

RaisingK

over 16 years ago

My way seems simpler after all. ^^ Though maybe that perl code has more flexibility with all those parameters.

15312134215123141

over 16 years ago

I like it because you can automate it to download an entire batch of tags.

I have like 70 folders they all need to go into, and I don't want to run a program 70 times.

I've used both, and while my explanation seems complicated, it is really very simple once you put a few minutes into it. Using DownThemAll doesn't put them in directories, and it's hard to update.
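
As a rough illustration of the "one batch file for many tags" idea, here is a minimal sketch that builds one booru.pl command line per tag and folder pair. The function name buildBatchLines, the job list, and the folder paths are all hypothetical; only the command syntax and the -f and -p options come from the earlier post. Since the examples in this thread show the filename template both as {md5} and as ${md5}, the template is passed in rather than assumed.

```javascript
// Hypothetical helper: build one "perl booru.pl ..." line per job.
// site     - booru name as booru.pl expects it (e.g. "danbooru")
// jobs     - array of { tag, folder } pairs (made-up example data below)
// template - filename template to pass to -f= (check the bundled docs)
function buildBatchLines(site, jobs, template) {
  return jobs.map(
    ({ tag, folder }) =>
      `perl booru.pl ${site} ${tag} -f=${template} -p="${folder}"`
  );
}

// Example data; replace with your own tags and folders.
const exampleJobs = [
  { tag: "hakurei_reimu", folder: "C:\\Pictures\\Reimu" },
  { tag: "ibuki_fuuko", folder: "C:\\Pictures\\Fuuko" },
];

// Print the lines so they can be pasted or redirected into a .bat file.
console.log(buildBatchLines("danbooru", exampleJobs, "{md5}").join("\r\n"));
```

Run with Node and redirect the output into a .bat file in the tools folder; adding a tag later means adding one entry to the job list.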

Updated by 15312134215123141 over 16 years ago

DschingisKhan

over 16 years ago

Ahh, see in this case neither is greater or less: I can do the additional scripting on the fly as-needed with a one-liner in Bash. I feel sort of bad for Windows not having a proper shell.

qqeor

over 16 years ago

Thanks for the help, guys!

wanchan

over 16 years ago

DschingisKhan said:
I feel sort of bad for Windows not having a proper shell.

Powershell?

0xCCBA696

over 16 years ago

Or cygwin, if you're so inclined.

homeless homo

about 16 years ago

RaisingK said:
Here's a roundabout way:
Get the Greasemonkey firefox extension, install the Endless Pages script.
Run the query, and put something on the Page Down key until you've loaded all of the results.
Dump this in the address bar and hit enter: javascript: var imageLink = document.getElementsByTagName("a"); for (i = 0; i < imageLink.length; i++) if (/\/post\/show\//.test(imageLink[i].href)) imageLink[i].href = imageLink[i].firstChild.src.replace(/preview\//,''); alert('done');
Use the DownThemAll extension to download all of the images linked on the page (thumbs now all point directly to the image thanks to the above).

I just wanted to say that this method only works with files that have a .jpg extension. If an image is in .png format or anything else, it will show up as 404 not found because in the direct image link, the real extension is replaced with .jpg

Or maybe I'm just doing something wrong.

RaisingK

about 16 years ago

Ah, you're right, thumbnails are always jpg. Oh well, everyone settled on the alternative anyway.

AndarielHalo

about 16 years ago

"javascript: var imageLink = document.getElementsByTagName("a"); for (i = 0; i < imageLink.length; i++) if (/\/post\/show\//.test(imageLink[i].href)) imageLink[i].href = imageLink[i].firstChild.src.replace(/preview\//,''); alert('done');" This is not working on my Firefox.

RaisingK

about 16 years ago

AndarielHalo said:
"javascript: var imageLink = document.getElementsByTagName("a"); for (i = 0; i < imageLink.length; i++) if (/\/post\/show\//.test(imageLink[i].href)) imageLink[i].href = imageLink[i].firstChild.src.replace(/preview\//,''); alert('done');" This is not working on my Firefox.

FFS, stop bumping this old thread already and use the other tool. Homo already pointed out the problem with my quickfix. I could throw in a few more lines to generate different extension variations (png, gif, swf) for every thumbnail to cover all the bases, but it isn't worth the effort.

If it still matters to you, the site HTML has changed since then for whatever reason. Replacing "firstChild" with "childNodes[1]" fixes your problem.
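
For the curious, the "different extension variations" idea could look roughly like the sketch below. It is not code from this thread's author: candidateUrls is a hypothetical name, and the extension list is just the one mentioned above (jpg, png, gif, swf). Since the thumbnail is always a .jpg, the sketch strips the thumbnail's extension and produces one candidate per extension; which candidate actually exists still has to be found by trying them, so the wrong ones will come back as 404s.

```javascript
// Extensions to try, taken from the post above; adjust as needed.
const EXTENSIONS = ["jpg", "png", "gif", "swf"];

// Hypothetical helper: from a thumbnail URL, build the list of possible
// full-size URLs, one per extension. Assumes the "preview/" path segment
// used by the address-bar snippet.
function candidateUrls(thumbSrc) {
  const base = thumbSrc
    .replace(/preview\//, "")   // same rewrite as the original snippet
    .replace(/\.[^./]+$/, "");  // drop the thumbnail's own extension
  return EXTENSIONS.map((ext) => `${base}.${ext}`);
}
```

Feeding all candidates to a download manager is wasteful (three out of four requests fail), which is a fair reason to prefer the Perl tool.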

15312134215123141

about 16 years ago

After a user prompted me about this, I realized this program can access more than one type of danbooru style site. Examples include gelbooru, sankakucomplex, 3d booru, etc.

Check in the list.pl file for a list of stuff that is supported. You should see something like this.

gelbooru=>sub{
use Booru::Gel;
Booru::Gel->new();
},

The name before the "=>" (here, gelbooru) is what you type in when you choose the type of booru to use. So it would look something like this.

perl booru.pl gelbooru ibuki_fuuko -r=3 -f=${md5} -p="C:\Pictures\Fuuko"

If the site you want is not there, you can probably edit the file and the URL to add it, as long as you know which booru code the site is running.

I wish there was a post without bumping option.

AndarielHalo

almost 16 years ago

I don't mean to bump it, but this is the best and most reliable function for me to use. I tried the Strawberry Perl, but it wouldn't function; the bat would close as soon as I opened it, without doing anything.

I found that by "dumping" the javascript with "childNodes[1]" then doing it again with "firstChild" works with all pictures from any time.

I'd greatly appreciate the code for PNGs and the like, but even without it I can still manage: I will get a 404 for wrong links, and can just replace the extension with the proper JPEG/PNG/GIF one.

15312134215123141

almost 16 years ago

Sometimes it doesn't work (something about not being able to use a string as a hash ref). I suspect that this is due to amateur programming, and if someone who knows Perl can take a look at it, I'd appreciate it.

Try restarting Windows as well while using it.

EDIT: I keep restarting it and then suddenly it works.

A very odd program.

Updated by 15312134215123141 almost 16 years ago
