Danbooru

Copying English plaintext to translated commentary fields

Posted under General

@WRS and I are having a disagreement on how to deal with commentary in which one field needs translation but the other doesn’t. I’ve been copying the English plaintext in these cases because if only one Translated field has content, the other Original field will be greyed out, making it harder to read. Correct me if I’m wrong, but I’m under the impression that the greyed out text is meant to allow the user to focus on the translated parts. If there’s nothing that needs to be translated, then greying the text out is worse than useless. Unfortunately, the site has no way to tell that no translation is needed, so it will grey the text out regardless. Duplicating the plaintext is the only way to prevent this without forcing users to use CSS scripts and the like.

The only downside I can think of is that it will cost a few dozen bytes per post, which is several orders of magnitude lower than the average upload on this site.

However, WRS thinks we should not copy the plaintext because since it was in English to begin with, it isn’t translated text. I don’t think we need to be that literal about this considering the intended purpose of the greyed-out text. Does anyone else have any thoughts about this?

tl;dr: Figuring out who's right here.

For reference and to restate/clarify, there are three things relevant to how I translate commentary on my own posts. While Blank User was doing a commentary run, several of my posts were also edited - I undid some of the commentary/tag changes to keep them consistent with how I handle translations.

1. When presented with a noun and its honorific, I would translate only the name and drop the honorific. I later learned this was wrong and started adding it on (or in places where I removed it, added it back).

2. Removed language commentary tags if the only untranslated text was "untitled", a noun or a preposition. While I still translate those at all times, it doesn't seem right to me to consider these a language commentary. I don't normally add these, and when Blank User was going over some of my posts, I removed them. post #8455553 isn't mixed-language commentary to me when the actual commentary is English and only the name is untranslated. For the same reason, post #8869776 (rating Q) isn't mixed when the title is "Untitled" (i.e. the absence of a title, which Danbooru's commentary fetcher renders as the word Untitled). I always translate this but never add a language commentary tag for it. Relevant: topic #28907

3. Adding already-English text to the translated section if the full text is not English. The small case in question which primarily drove this topic's creation was post #8862306 (rating Q) where the character's name was duplicated in the title section (Evelyn イヴリン). I added only the English name to the translated section because it would look silly to repeat the same text in the translated section (Evelyn Evelyn), which Blank User removed so there were no translated sections and I reverted that. Because I reverted it, Blank opted to instead duplicate the already-English text into the translated section, which I also removed. It felt like Blank was trying to force an all-or-nothing so there would be no dim-grey text, which meant either leaving the duplicated name alone or adding already-English text as "translated" commentary.

I realise I was definitely wrong on point 1 because it was already policy to retain honorifics. However, 2 and 3 make no sense to me, especially 3. I don't consider English text translated which is why I didn't feel it appropriate to duplicate that into the translated section. Blank User later sent me a DMail asking me about its removal, saying that we "need" to do it because of "accessibility".

We don't have a policy on this, so it felt to me like Blank User was trying to assert their own way of doing commentary as the way it needs to be done, with what felt like considerably flimsy justification. Using language commentary for untranslated nouns or for an actually blank field which we render with a word feels like taking language commentary tags to the extreme. Neither of us can really be proven right about the third point. It really comes down to common sense, but it's not really clear what that would be. It seems like either direction could be fine - I have no real reason to remove it, and similarly asserting that we need to duplicate text to prevent greyed-out sections sounded to me like trying to treat a personal preference/accessibility pain point as policy.

It's not as big a deal as the topic or DMail exchange may make it seem. We're just mainly looking for a second opinion to settle our differences here (except for point 1, that was fully my wrong, and this wasn't highlighted in our exchange).


Blank_User said:

@WRS and I are having a disagreement on how to deal with commentary in which one field needs translation but the other doesn’t. I’ve been copying the English plaintext in these cases because if only one Translated field has content, the other Original field will be greyed out, making it harder to read. Correct me if I’m wrong, but I’m under the impression that the greyed out text is meant to allow the user to focus on the translated parts. If there’s nothing that needs to be translated, then greying the text out is worse than useless. Unfortunately, the site has no way to tell that no translation is needed, so it will grey the text out regardless. Duplicating the plaintext is the only way to prevent this without forcing users to use CSS scripts and the like.

This is correct and should be done for the reasons you've described (readability, mainly)

The only downside I can think of is that it will cost a few dozen bytes per post, which is several orders of magnitude lower than the average upload on this site.

Not really our concern unless told otherwise (redtails dumping 20tb into image assets may have more of an effect).

WRS said:

2. Removed language commentary tags if the only untranslated text was "untitled", a noun or a preposition. While I still translate those at all times, it doesn't seem right to me to consider these a language commentary. I don't normally add these, and when Blank User was going over some of my posts, I removed them. post #8455553 isn't mixed-language commentary to me when the actual commentary is English and only the name is untranslated. For the same reason, post #8869776 (rating Q) isn't mixed when the title is "Untitled" (i.e. the absence of a title, which Danbooru's commentary fetcher renders as the word Untitled). I always translate this but never add a language commentary tag for it. Relevant: topic #28907

Those are definitely mixed-language: Japanese name written in Japanese script, English info regarding paid rewards, or a Korean bit about the post being "Untitled". We're translating, not deciding what the author's intent was. There's a few cases where the "untitled" part gets auto-added through some mechanism, but even then it's not a decision to make, and in the end that text is still in a non-English language, so it should be tagged.

3. Adding already-English text to the translated section if the full text is not English. The small case in question which primarily drove this topic's creation was post #8862306 (rating Q) where the character's name was duplicated in the title section (Evelyn イヴリン). I added only the English name to the translated section because it would look silly to repeat the same text in the translated section (Evelyn Evelyn), which Blank User removed so there were no translated sections and I reverted that. Because I reverted it, Blank opted to instead duplicate the already-English text into the translated section, which I also removed. It felt like Blank was trying to force an all-or-nothing so there would be no dim-grey text, which meant either leaving the duplicated name alone or adding already-English text as "translated" commentary.

When text is duplicated like that it gets a bit more difficult. Though, we do have bilingual commentary now to indicate that a commentary says the same thing twice. In this example, adding that should probably do enough to signal readers who can't read the Japanese that it says the same thing. This is a bit cleaner than removing part of the text, since that involves removing part of the text (wow).

This is correct and should be done for the reasons you've described (readability, mainly)

I can concede this point based on this just being a logical conclusion. The point of translating, be it the content of a post via notes or the commentary, is to make it readable, and you can count accessibility as a factor of this. It's not necessarily that I have an issue with it, just feel it pointless since it's not real translated text.

Not really our concern unless told otherwise (redtails dumping 20tb into image assets may have more of an effect).

I didn't quote this in my initial reply, but agreed. This is as much a non-concern as someone suggesting that posting duplicates wastes storage space when the only thing that would increase is just a few bytes of metadata - the image itself and its sample versions for thumbnails and the like are already in media assets (duplicates are entirely irrelevant, but just a slightly relevant scenario that small things like a few text additions are not worth much consideration).

We're translating, not deciding what the author's intent was.

This is where I disagree. Author intent is not being considered here. Mostly in the case of Pixiv, if the author doesn't leave a title, it literally doesn't show. The only reason we have text that shows is because that's what the API fetches, and it's weird to consider this a form of commentary. If Danbooru code opted to ignore placeholder text like this, the conversation would take a different turn - and that turn is how I base my application of the tags. Same goes with only applying it when it's non-noun/preposition text (sentences as titles don't count, and I use language commentary tags for those cases).

The latter thing segues into the above. When we have two duplicated languages, I'm game (e.g. post #8603529, post #8652228).

Edit: Wrote this before I saw the other replies.

Regarding the language tags, I think it only makes sense to avoid using english_commentary on proper nouns that are usually the same throughout most languages. But there’s no way we can expect users to avoid using other language tags on them or anything else. Most users wouldn’t be able to distinguish a Japanese name from a Japanese phrase, for example. I think if the text is exclusive to its own language, then it should have the language tag.

I also don’t remember seeing any preposition-only commentaries, but they’d have the same problem.

My removal of the translated title of post #8862306 was not an attempt to fix the greyed-out text issue. It was more of a two birds one stone kind of deal. I did that because I thought the Japanese name should not have been removed as it was a loss of information, but had second thoughts afterward, so decided to leave it alone for now. But that meant there would be greyed-out text, so I copied the plaintext from the Original section.

I chose my words poorly in my initial DMail and unintentionally made it sound like I was claiming it as site policy. I apologize for that. I was not trying to impose my will on the site.

I don't disagree that they're in another language - that's not so much my issue as finding it ugly to load up on those metatags for things that are just purely nouns and/or prepositions, and feeling like that's just not what our commentary metatags were designed for, rather for things that are... actual commentaries.

WRS said:

This is where I disagree. Author intent is not being considered here. Mostly in the case of Pixiv, if the author doesn't leave a title, it literally doesn't show. The only reason we have text that shows is because that's what the API fetches, and it's weird to consider this a form of commentary. If Danbooru code opted to ignore placeholder text like this, the conversation would take a different turn - and that turn is how I base my application of the tags. Same goes with only applying it when it's non-noun/preposition text (sentences as titles don't count, and I use language commentary tags for those cases).

The latter thing segues into the above. When we have two duplicated languages, I'm game (e.g. post #8603529, post #8652228).

No but it still is in that case. Works being formally "Untitled" isn't uncommon, and it makes sense to just play it safe. This also leads into...

gzb said:

@WRS I have to disagree with your 2nd point there. The "キアラちゃん" in post #8455553 is still sufficiently another language. So is the "Untitled" in post #8869776. You have to consider that for someone with little or no knowledge of Japanese or Korean, those still look like alien script to them and if they have to ask what those words mean, it's still definitely mixed-language commentary.

... this being a very important part. You're not translating and categorizing for people who can already read it, but specifically for people who can't. It's pointless if they have to put the text into google translate to figure out why it wasn't tagged as something specific. I can't read Korean, but I can recognize Korean, so if I see "무제" I'll wonder why it's not tagged with Korean commentary. This applies broadly.

I mean, I guess so. I do see that angle. It just, ugh, feels horribly unwieldy that we would load up on commentary metatags for something so small. That's why I opt to leave them out when there's nothing more than just a name of something or untitled. And with 23K commentary changes operating under this exact methodology, it's not really easy to up and change based on another person's interpretations and disagreements when we lack a policy or official comment on it. And it wouldn't be right to me to end off at "I won't change my ways, other people should pick up my slack because I'm stubborn about it" - it looks to be that I should change the way I do it, but I'm not sure I'm ready or convinced to commit to that change.

The tags aren't being "loaded up for something so small", this is literally what those tags are for. At the point where you're not regarding "キアラちゃん" as Japanese commentary, the commentary tags are useless since they are leaving out a lot of Japanese commentary.

It's not so much that I regard non-blank commentary boxes as non-commentary, but I don't add language tags when it's purely a noun or preposition, because it doesn't feel like the author leaving a comment, just putting their name in their native language. I add it in every other case. Commentary always applies when the translated section has to be used and there's nothing left to translate, but languages? It doesn't really seem like their intended use to me.

At least it's not hard to fix them, just tedious, because that means scrawling back over all 23K edits and adding, which at least is less destructive than if somehow it were to be wrong and having to remove all of those instances. But I would really like to figure out the hard intention behind those tags and if I'm doing it right or wrong, officially.

Based on several discussions and clarifications of other such tags, now it's sort of apparent that I can't just take a small side of opinions at face value versus others. We've experienced this a few times already now where there's an unspoken and unwritten "intent" behind the reason why a tag exists or how it should apply, and there have been cases where I've taken a side that ended up being wrong, such as when it came to the application of ratings (e.g. topic #28443, forum #333449). It's more or less the same thing happening here - because there's no written method on how to handle these, one side is saying "this must be the way it's done", and there's another (which is mine) that's saying "I don't really think so". At worst, the only thing lost here is that I'm - not maliciously - leaving out a tag that I don't think was the intended way to use it.

Again, I'm not fully disagreeing with your stance - only when it comes to using these for untitled, nouns and prepositions, is where I'm incredibly iffy on if they really apply. On a loose-but-related subject, I question intention based on the use of (and editing of) hashtag-only commentary and symbol-only commentary, where we once had explicit clauses on how to handle it when meta was involved. So it may be simple to you, but not to me. It doesn't feel as straightforward, and more that a certain group's way of applying them is asserting this is the way. There's always some exception, or extreme, that these tags are taken to.


WRS said:

tl;dr: Figuring out who's right here.

I prefer Blank User's approach with handling of the already existing English commentary, it is also what I do to my posts, simply because it looks better not having grayed out texts. For me grayed out text in commentary section is for texts that are still in need of translation in a partially translated commentary.

Yeah I guess that's fair enough. I figured that it's not real translated text so it doesn't make much sense to apply it back over, and I normally don't. I did say I found that part, with or without policy or official comment, considerably shaky, and it's not like there's anything lost or bad from doing it to begin with. Removing it when someone else did it to my posts though is definitely a first time, and basically why this thread exists, the language tags being a very secondary thing. I have no stake of defense here besides that it doesn't make sense to me, but I also don't have a problem adopting that way as well. Consider the matter of duplicated text resolved, and I'll reinstate the text I removed on the post in question.

Ajisaiii said:

I prefer Blank User's approach with handling of the already existing English commentary, it is also what I do to my posts, simply because it looks better not having grayed out texts. For me grayed out text in commentary section is for texts that are still in need of translation in a partially translated commentary.

Same, I will copy over the text on any post where I see it not done, even on those with only URLs where you can't even spot the difference, because it triggers my OCD. Though I'm also the person that always adds missing commentary tags on posts where I see it.

To be honest I don't feel that strongly about this, despite my stance and/or disagreements. Point 1 is on me, and I can see the other angle (and brain scratchy feeling) of 2, where I mentioned that I'd give in and reinstated the text (though I only did that on one; I forgot I linked more than just one example). I don't wholly agree with 3, but if that's just what the majority of comment gardeners do, then I'll make an effort to change over. Not a big deal to spend less than half a second to type out a few more things and choke down my disgust at having like 4 commentary-related tags just because a word or single character was not English.
