Danbooru

On the topic of replacing non-samples

Posted under General

Chiera said:

To be honest, that might have been written here, but there is also such a thing as custom.
I think that should apply here, even if Evazion and several other people wrote against it in that topic.
The action of replacing Twitter with Pixiv wasn't done by Randeel alone but by a number of approvers and sometimes even by the admins.
So I would definitely take into account that this was an ongoing practice, namely that Twitter versions were being replaced by Pixiv versions.
I still agree that we shouldn't do that, but as far as I know that's a different layer of the issue.

Just because a number of users do it doesn't make it acceptable. In fact, they could've been inspired by his actions. As I remember it, albert did it once on a post in the first place, and that basically served as the justification and impetus to start doing this sort of thing.

That is, until people started making mistakes that I literally had to point out to them. That's the part that's not okay.

Just as a heads-up, I've only ever done this once (which I undid just now by uploading post #2898077). If I have ever replaced one of your posts on request and you'd like to see that undone, just let me know. As I recall, though, I have already undone many replacements just so the original uploader gets credit for the 1up, as is the case with users like nonamethanks and DanbooruBlacklist.

Randeel said:

Anyone other than Mikaeri is welcome to answer that.

Great work fixing the problem you caused in the first place, but really though. Your open antipathy is unneeded. It's better if I let someone else explain to you anyway.

Randeel said:

So I have begun to undo a lot of my non-sample replacements and upload them separately instead.
I'd still like to know why I can't replace visually identical images like post #2832935 or post #2844429 for example.
Why do we need duplicates of that?

Might as well add my two cents on this. Exact duplicates (as per Pixiv/Seiga images, same MD5 or something, I don't know) can't be uploaded anyway, so it's not like we can have two of the literal, exact same thing.
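
As a rough illustration of the "same MD5" point, here is a minimal Python sketch of an exact-duplicate check. How the site actually performs this check isn't described in this thread, so treat everything here as an assumption: the helper name `is_exact_duplicate` and the sample byte strings are made up for the example and are not real image data.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the hex MD5 digest of the raw file bytes."""
    return hashlib.md5(data).hexdigest()


def is_exact_duplicate(a: bytes, b: bytes) -> bool:
    """Hypothetical helper: files are exact duplicates only if their bytes hash the same."""
    return md5_hex(a) == md5_hex(b)


# Made-up stand-ins for file contents (not real image files).
pixiv_file = b"pixel-data" + b"|metadata:pixiv"
pixiv_copy = b"pixel-data" + b"|metadata:pixiv"
twitter_file = b"pixel-data" + b"|metadata:twitter"

print(is_exact_duplicate(pixiv_file, pixiv_copy))    # identical bytes
print(is_exact_duplicate(pixiv_file, twitter_file))  # same "pixels", different bytes
```

The takeaway is only that any byte-level difference, even one a viewer would never see, gives a different hash, which is why a Twitter copy and a Pixiv copy of the same picture can both be uploaded.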

And, for example, having both the Twitter and Pixiv version of an image doesn't really hurt anyone; plus it gives a reason for parent/child posts to exist. And if we're talking about b-but it's a waste of space, I think we're already oh-so wasting enough space with things like half of the photo tag or the volumes of shit that should rather be in either Gelbooru or 3DBooru, for example.

If you want to understand that, fine. If you don't, that's fine too. It's just my opinion, do whatever you want.

Anyone other than Mikaeri is welcome to answer that.

I'm on no one's side and I don't care about the drama but wow.
That was a very nice thing to say. I totally don't understand why he's decided to quit.

Kilias said:

Might as well add my two cents on this. Exact duplicates (as per Pixiv/Seiga images, same MD5 or something, I don't know) can't be uploaded anyway, so it's not like we can have two of the literal, exact same thing.

And, for example, having both the Twitter and Pixiv version of an image doesn't really hurt anyone; plus it gives a reason for parent/child posts to exist. And if we're talking about b-but it's a waste of space, I think we're already oh-so wasting enough space with things like half of the photo tag or the volumes of shit that should rather be in either Gelbooru or 3DBooru, for example.

I don't buy this "space" argument anyway.
We are an ever-expanding website that gets at least 1,000 new posts each day. Whether you have 1,001 or 999 because of one duplicate really doesn't hurt all that much. It only starts hurting if we start mass uploading western art, furry art, photos of cosplay, or guro art, which is why those are restricted to a very low amount each day.

As a matter of fact, those upload pairs that @Type-kun mentioned only differ in metadata. I stand by my stance that metadata is mostly unimportant, because you have to search for differences with a dedicated tool and frankly, this isn't all that interesting, not for Danbooru and not for the user either. I don't really see the reasoning behind this whole metadata thing. So yeah, the metadata might vary, but if the image is still the same from a visual point of view (same resolution, same coloring, identical cropping) I see no harm there.

I wouldn't replace an image that differs in any visual regard. It doesn't matter if it's better or worse: the Twitter version might have a lower resolution than the Pixiv version, but they should exist together because then we have two separate images.
If the two images are visually identical, then there is also no loss when one version gets deleted. This only matters when we have revisions, but those aren't subject to replacement anyway.

Well, I'm not entirely against Randeel's way of doing things, but one should first and foremost check whether there are differences in the visual aspects of an image, because when you replace something you also replace the favorites, and maybe that user doesn't like a higher resolution.

Chiera said:

I don't buy this "space" argument anyway.

That was just an example, since more than one person back then has already said that versions from different sources shouldn't be uploaded because they're a waste of space; let's not get heated, please.

Randeel said:

So I have begun to undo a lot of my non-sample replacements and upload them separately instead.
I'd still like to know why I can't replace visually identical images like post #2832935 or post #2844429 for example.
Why do we need duplicates of that?

In that specific situation, I personally wouldn't have issues with those replacements, but in terms of how we operate as a site, I think the practice shouldn't be allowed with the image replacement tool. As stated by others, it's very easy for users to misidentify a superior post, which can readily result in replacing images with either inferior versions or with revisions. It's easy enough as it is for approvers to misidentify a revision as a duplicate, and allowing a user to change a post without good oversight or quality checks isn't a good idea. It is by far better to follow a system that errs on the side of caution and retains these images than it is to follow a system where we risk users replacing uploads that shouldn't have been replaced, and where the mistake may be overlooked until the original becomes irretrievable from other sources.

The current system needs to change, and while your examples display a situation where it probably should be allowed, as a rule for the site it would be far simpler and less confusing to simply not allow it at all. It's possible that when the system changes we could enable such replacements, but it seems like a lot of extra work for no real gain over simply retaining both copies of the image.

Yeah... it'd be really great if there was a site or group of sites that were backing up everything that Danbooru has, just so that we don't potentially lose an image forever... oh wait, there totally are a couple of sites. Gelbooru + SankakuComplex

Granted, they're not backing up the images that get replaced through the image replacement function though... yet.

BrokenEagle98 said:

Yeah... it'd be really great if there was a site or group of sites that were backing up everything that Danbooru has, just so that we don't potentially lose an image forever... oh wait, there totally are a couple of sites. Gelbooru + SankakuComplex

Granted, they're not backing up the images that get replaced through the image replacement function though... yet.

Yeah, sans the fact that those repost bots can't grab anything that's replaced using the post replacement feature. As a result they have a number of samples or unwanted images that currently are fixed on our gallery but not theirs.

That's one major crux of this whole thing. Take post #2681490 for example.

https://gelbooru.com/index.php?page=post&s=view&id=3630909
https://chan.sankakucomplex.com/post/show/5966914

Note how the full-size is only on our gallery and not theirs. This applies to ALL of the replacements that never had their posts properly 1upped.

It is unintended, sure, and we don't have to be concerned about boards other than ours, but it would help if they knew. If other users on other boards knew "Hey, we have the best version of everything and want to share it with you but this feature you guys don't know about prevents that from happening!"

Mikaeri said:

Yeah, sans the fact that those repost bots can't grab anything that's replaced using the post replacement feature. As a result they have a number of samples or unwanted images that currently are fixed on our gallery but not theirs.

That's one major crux of this whole thing. Take post #2681490 for example.

https://gelbooru.com/index.php?page=post&s=view&id=3630909
https://chan.sankakucomplex.com/post/show/5966914

Note how the full-size is only on our gallery and not theirs. This applies to ALL of the replacements that never had their posts properly 1upped.

I ask again: What does this have to do with us?
This looks more like their problem.

Chiera said:

I ask again: What does this have to do with us?
This looks more like their problem.

It's our problem too if you consider the fact that some people want to see shit that's expunged from our server but not theirs.

Take Ricegnat's recently expunged patreon rewards. They are still live on gelbooru and sankaku. Now imagine if Ricegnat's images were instead replaced on totally fine images from his pixiv/blog/whatever. Then when they get expunged (and this shouldn't happen to begin with), then what happens?

Unintended effects aren't inherently wrong; still, it should be our concern that we're screwing it up for them too. We're a site for users by users, and to my knowledge users don't like being unable to find content, even when it's not on our board.

EDIT: Just to reinforce that, I see our booru as only one part of a moving puzzle that seeks to archive all the good work in the gallery. If something happens here, everyone should know about it. I also mentioned in the OP that it fucks with the IQDB and sauceNAO lookup servers.

Frankly, as an uploader, I feel it's much quicker and easier just doing things the usual old way with Twitter->Pixiv uploads. I accidentally uploaded post #2847819 without checking that there was a Pixiv version. I can just tell at a quick glance that the Pixiv version is not as compressed and go ahead and upload it. If I had thought of replacing it (and due to my feelings about the thread I mentioned earlier, I've felt that has always been a no-go, so it didn't even occur to me at all), I would have to spend extra time fretting over whether there might be some revision/difference that is getting lost. There probably isn't in this case: but this goes to show it makes things a little more complicated for careful replacers while also potentially causing lost content in the case of careless replacers. It seems like a lose-lose situation.

EB said:

Frankly, as an uploader, I feel it's much quicker and easier just doing things the usual old way with Twitter->Pixiv uploads. I accidentally uploaded post #2847819 without checking that there was a Pixiv version. I can just tell at a quick glance that the Pixiv version is not as compressed and go ahead and upload it. If I had thought of replacing it (and due to my feelings about the thread I mentioned earlier, I've felt that has always been a no-go, so it didn't even occur to me at all), I would have to spend extra time fretting over whether there might be some revision/difference that is getting lost. There probably isn't in this case: but this goes to show it makes things a little more complicated for careful replacers while also potentially causing lost content in the case of careless replacers. It seems like a lose-lose situation.

Pretty much... I don't think duplicate images bother the users as much as we may think they do. Some users want fewer, granted, but the fact is that there's zero harm in having duplicates, while there is harm in trying to prevent them.

We should always delete and remove duplicates that we definitely don't want. That's a given. That's our job as approvers: to prevent the gallery from being flooded with lower quality finds. However, it isn't our job to retroactively "delete" duplicates just because we want fewer. A user might find a new post on a given day because it was posted to the index when they were active, when they would never have found it had it just been replaced onto an older "inferior" post in the gallery. Again I will mention that this is a similar case to rantuyetmai's -- we simply don't delete duplicates as policy (unless they're samples/corrupted/unsourced/indubitably worse by considerable margin/whatever).

While I appreciate users that can spot the differences thoroughly and deliberately, we can't trust all of the approver staff to do the same consistently, especially given the examples I've provided. A two-person check (or other preventative measure) still isn't in place yet, so we should tread very cautiously.

bump

Seems to me the current consensus is to not replace images from different sources unless one can indubitably claim that one is a sample of another. Say, an artist uploads an image sample to pixiv, which is not as rare an exception as one might think.

If this is "resolved" (to the extent we can say resolved), then I imagine it would be better to just edit the replacement notice and then warn everyone for now just not to do that sort of stuff. The need for explicit replacement is already rare as is nowadays, and the users that do it quickly learn from their mistakes. We can always take further action if they make it a regular habit to upload samples (since again, it's not necessarily our job to regularly fix the mistakes we allow them to make).

Of course, I still think it's fine for other cases (such as when a user uploads the wrong image, or seeks to reorder their uploads quickly after uploading), yet this action is indeed marred by the fact that it messes up stuff for repost bots and IQDB/SauceNAO if not done quickly enough.

By the way... what about replacing scans?
We don't get them from 1st party sources anyway, so what if we upload a scan that is heavily artifacted, but later a better scan of the same image appears?
I'd say they don't fall under the same policy as posts we get directly from the artists, but since repost bots are a thing we might treat them the same..?

Chiera said:

By the way... what about replacing scans?
We don't get them from 1st party sources anyway, so what if we upload a scan that is heavily artifacted, but later a better scan of the same image appears?
I'd say they don't fall under the same policy as posts we get directly from the artists, but since repost bots are a thing we might treat them the same..?

Scans are a strange thing, I admit. CodeKyuubi is actually one of the users that fixes scans regularly here, and sometimes I offer to replace his scan fixes out of wanting to preserve the original post info while removing an indisputably unwanted dupe (say, a scan artifact/dust fleck/crease/etc). I also fix errors on my own scan uploads sometimes (post #2861791).

But exactly where it is appropriate is something I still have to sort of 'clean up' and elaborate on. I don't necessarily feel very strongly about repost bots grabbing scans (since scans are open to variation in what could be considered the most ideal for preservation, and I have plenty I optimize and modify that I don't really consider upworthy). Nova Genesis, for example, regularly uploads scans that I'd consider less than preferable to keep, if only, I suppose, to at least have those scans somewhere searchable.

It doesn't help that the scanning community for anime-style artbooks and tankoubons is getting smaller with recent developments in sadpanda and the like. While I do understand that there is perhaps a demand for it, it's a bit much to keep making something like that part of the gallery's focus. My interest in booru has always been in digital art, with artbook scans and finds I submit on a less than regular basis.

EDIT: I'm also sorta tired but I did a thing to help:replacement notice for now. Also to account for one of kisaragi nana's supa crazy revision raves (in which one uploader wasn't fast enough to catch a pre-revision that he intended to upload, I fixed that for him).

EDIT 2: A thing


Well, here's a case study on scan replacement, sufficiently difficult that it should raise good discussion:
post #2629371
https://files.yande.re/image/d09c3f985dc471cf8c46a2e8dc960bd0/yande.re%20386614%20animal_ears%20halloween%20horns%20mishima_kurone%20nekomimi%20no_bra%20pantyhose%20rumia_tingel%20ryiel_rayford%20sistina_fibel%20stockings%20tail%20thighhighs%20wings.png

The one currently here is ridiculously artifacted. I'm sure it's been recompressed multiple times - if you can't tell by looking at it, consider that it's at JPEG level 81 and look at it again. Taking the replacement I propose and upscaling it, level 81 from that is twice the size, and level 33 is about the same size and has way less artifacting.

yande.re has done the same replacement, but they don't have this feature so they just deleted the old one. The old image isn't linked to on their site, even if you view the deleted post (at least without logging in), but is still available on their servers.

This is questionable as a replacement because technically the one we have here is higher resolution. However, that resolution holds artifacts rather than actual information. Given that it's been recompressed, I wouldn't be surprised if it's been upscaled as well, so that resolution may be fake anyway, but we can't know for sure. In any case, it looks worse than an upscale of the better source. You could make the argument that the replacement has a 10 times larger filesize. However, as stated in my first paragraph, you can make a much better JPEG at the same size. Seriously, the image here is pretty useless.

I've already uploaded a different scan (post #2906666). That one is visually distinct so it definitely needs to be separate. But for this one, it feels like a waste to have 3 copies when two of them are the same scan and one of those is utter crap - if someone did upload the replacement separately, I'd flag the old one.

So what should we do here?

☆♪ said:

Well, here's a case study on scan replacement, sufficiently difficult that it should raise good discussion:
post #2629371
https://files.yande.re/image/d09c3f985dc471cf8c46a2e8dc960bd0/yande.re%20386614%20animal_ears%20halloween%20horns%20mishima_kurone%20nekomimi%20no_bra%20pantyhose%20rumia_tingel%20ryiel_rayford%20sistina_fibel%20stockings%20tail%20thighhighs%20wings.png

The one currently here is ridiculously artifacted. I'm sure it's been recompressed multiple times - if you can't tell by looking at it, consider that it's at JPEG level 81 and look at it again. Taking the replacement I propose and upscaling it, level 81 from that is twice the size, and level 33 is about the same size and has way less artifacting.

yande.re has done the same replacement, but they don't have this feature so they just deleted the old one. The old image isn't linked to on their site, even if you view the deleted post (at least without logging in), but is still available on their servers.

This is questionable as a replacement because technically the one we have here is higher resolution. However, that resolution holds artifacts rather than actual information. Given that it's been recompressed, I wouldn't be surprised if it's been upscaled as well, so that resolution may be fake anyway, but we can't know for sure. In any case, it looks worse than an upscale of the better source. You could make the argument that the replacement has a 10 times larger filesize. However, as stated in my first paragraph, you can make a much better JPEG at the same size. Seriously, the image here is pretty useless.

I've already uploaded a different scan (post #2906666). That one is visually distinct so it definitely needs to be separate. But for this one, it feels like a waste to have 3 copies when two of them are the same scan and one of those is utter crap - if someone did upload the replacement separately, I'd flag the old one.

So what should we do here?

Just wanted to mention first off that I like questions like these, since we should always explore different use cases (and this is a good extrapolation of the original posit in forum #138356).

If you ask me, personally I think scan quality should be a factor in a post's approval/deletion, so post #2629371 should be flagged (and/or perhaps moved to immediate deletion as per quality standards). However, it would serve as a poor subject for replacement given the original uploader had absolutely zero say in the new and better image.

Here's one thing to note: There will be some debate over this, but to some extent all scans are 3rd-party edits unless provided as a scan by the very artist or publisher itself. This is simply the nature of scans, since you're taking a print, an analog medium, and turning it into a digital medium.

My course of action would be the same as yours (as I've mentioned before). Someone should upload the replacement separately (as bad scans/upscales aren't really considered "samples", more so third-party edits), and we should keep the better image even if it has a lower resolution (because qualitatively it is, in fact, better).

EDIT: Sorry if it took a few days to respond. My opinion was already formulated when I read your post, I just didn't care to post it yet (as I was simply busy with other things, coding projects and other self-focused stuff like that at the moment).
