Danbooru

Cloudflare issues with JDownloader

Posted under Bugs & Features

Hello.

I don't know if this is the right place to ask, since I'm not used to the forums, so feel free to move this if it's not.
For the past few days, my JDownloader plugin has been failing its API requests. I asked the JDownloader team what's wrong, and they said everything is fine on their side and that I'm having Cloudflare issues.

Except that I can browse the website just fine. When I check my API logs, everything was fine on December 27th, with a nice and proper HTTP/1.1 200 OK. But today I see an HTTP/1.1 403 Forbidden, along with something that looks like the document I'd get from a Cloudflare check in my browser.

Is there anything that can be done? Using a VPN doesn't seem to cut it.

The 403 Forbidden error means your user agent or IP range was put through a Cloudflare challenge for excessive scraping. We had some severe site performance degradation a few days ago due to scrapers going wild, so some measures were put in place to limit the damage that the worst ones were causing.
Unless you made tens of thousands of requests in a very small amount of time, the most likely case is that you're using a user agent that was also used by bad actors at the time the Cloudflare rules were adjusted.

If JDownloader allows you to change the "User-Agent" header it uses to talk to Danbooru, that should be the first thing to try. Make sure to set a sane rate limit, however, to avoid getting blocked again.
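
For reference, here's a minimal sketch (in Python with the requests library, not JDownloader itself) of what "custom User-Agent plus a sane rate limit" means in practice. The User-Agent string, tag, and 2-second delay are illustrative placeholders, not official values:

```python
# Minimal sketch: identify yourself with a custom User-Agent and wait
# between requests. Values below are placeholders, not recommendations.
import time
import requests

HEADERS = {"User-Agent": "my-downloader/1.0 (danbooru user: your_username)"}

def fetch_page(page):
    resp = requests.get(
        "https://danbooru.donmai.us/posts.json",
        params={"tags": "some_tag", "page": page, "limit": 200},
        headers=HEADERS,
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

for page in range(1, 6):
    posts = fetch_page(page)
    print(f"page {page}: {len(posts)} posts")
    time.sleep(2)  # crude rate limit between API calls
```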

Since I've tried from my own IP, from another country, and from yet another one... I'll go with the User-Agent used by JDownloader.
And my wild guess is that while I may use JDownloader myself to grab batches of 1-3k pictures here and there for personal use, it wouldn't be far-fetched to say that some of our dear website-sucking AI people also use JDownloader for their deeds, or at least a tool built on the same core.

Just great... As if I wasn't having enough trouble.

It'd be a good idea to log repeated massive requests made through the API, if that's not already done, to identify the culprits.

At any given time at least 15% of the site's traffic is bots, search engine crawlers, and scrapers. Whenever the site is too slow I go through and ban everyone who is scraping enough to show up on top of my leaderboards. If your bot or scraper doesn't use an API key or doesn't have a custom User-Agent identifying who you are, then you may be banned without warning. Set a custom User-Agent or use an API key so I know who you are and can yell at you if you're causing problems. Otherwise I have no choice but to ban you. Either your IP, your entire network, or whatever software you're using, which may be used by multiple people.
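
For anyone wondering what that looks like in practice, here's a minimal sketch (Python with requests) of an API call that identifies itself with both an API key and a custom User-Agent. The username, key, and User-Agent string below are placeholders; Danbooru accepts login/api_key as query parameters (HTTP Basic auth also works):

```python
# Minimal sketch: an identifiable, authenticated Danbooru API request.
# Username, key, and User-Agent below are placeholders.
import requests

resp = requests.get(
    "https://danbooru.donmai.us/posts.json",
    params={
        "tags": "some_tag",
        "limit": 200,
        "login": "your_username",   # placeholder
        "api_key": "your_api_key",  # placeholder
    },
    headers={"User-Agent": "my-scraper/1.0 (contact: you@example.com)"},
    timeout=30,
)
resp.raise_for_status()
print(len(resp.json()), "posts")
```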

evazion said:

At any given time at least 15% of the site's traffic is bots, search engine crawlers, and scrapers. Whenever the site is too slow I go through and ban everyone who is scraping enough to show up on top of my leaderboards. If your bot or scraper doesn't use an API key or doesn't have a custom User-Agent identifying who you are, then you may be banned without warning. Set a custom User-Agent or use an API key so I know who you are and can yell at you if you're causing problems. Otherwise I have no choice but to ban you. Either your IP, your entire network, or whatever software you're using, which may be used by multiple people.

I'm trying to get something done for the User-Agent used by JDownloader right now.
Still sad that I got caught up in all that mess (despite using my API key), because from what I can see, the client *is* respectful toward the servers when it comes to queries, and I wasn't using it when you had to deal with those overload issues. For instance, once it has the list of pictures that match a request, it fetches their links in batches of 1200, once every 2-3 seconds. If that's still too much, please let me know.
