normalizing facebook links ig

Posted under General

Probably something that could be submitted to GitHub somehow if I cared enough to sign into that account. If someone else wants to snipe this, go ahead.

Anyways, Facebook sources on here are often gross links like https://www.facebook.com/DateALiveFC/photos/rpp.142604822590116/776432789207313/?type=3&theater, which has a whole lotta junk data that isn't super relevant. You can, however, hop over to the post and go through some boring back-and-forth pagination wizardry to get the URL https://www.facebook.com/photo/?fbid=776432789207313&set=rpp.142604822590116, which is still kinda gross but can then get normalized by Danbooru to just https://www.facebook.com/photo?fbid=776432789207313. As you may have noticed, all the data needed to build the second and third versions is already there in the first link.

So, hypothetically, when given a URL formatted
https://www.facebook.com/PAGENAME/photos/bunchanumbers/photoid/?unnecessarytrackingdata=anddisplayformatstuff
you can pretty easily get to
https://www.facebook.com/photo?fbid=photoid&set=bunchanumbers
or just https://www.facebook.com/photo?fbid=photoid
without even having to potentially get jumpscared by AI shrimp jesus posts.
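For anyone who wants to try this in a script or bot, here's a rough Python sketch of the rewrite described above. It assumes the path always looks like /PAGENAME/photos/bunchanumbers/photoid/, which is only based on the one example in this post, and the function name is made up for illustration. It is not how Danbooru itself does anything.

```python
import re
from urllib.parse import urlsplit

# Path shape assumed from the example in this thread:
# /PAGENAME/photos/<bunchanumbers>/<photoid>/
_PHOTO_PATH = re.compile(r"^/[^/]+/photos/(?P<set>[^/]+)/(?P<photoid>\d+)/?$")


def shorten_facebook_photo_url(url, keep_set=False):
    """Return the short photo URL, or None if the URL has a different shape."""
    parts = urlsplit(url)
    if parts.netloc.lower() not in ("facebook.com", "www.facebook.com"):
        return None
    match = _PHOTO_PATH.match(parts.path)
    if match is None:
        return None
    # The query string (?type=3&theater and friends) is simply dropped.
    short = "https://www.facebook.com/photo?fbid=" + match.group("photoid")
    if keep_set:
        short += "&set=" + match.group("set")
    return short


example = "https://www.facebook.com/DateALiveFC/photos/rpp.142604822590116/776432789207313/?type=3&theater"
print(shorten_facebook_photo_url(example))
# -> https://www.facebook.com/photo?fbid=776432789207313
print(shorten_facebook_photo_url(example, keep_set=True))
# -> https://www.facebook.com/photo?fbid=776432789207313&set=rpp.142604822590116
```

Anything that doesn't match the assumed shape comes back as None, so a bot could skip those instead of guessing.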

I don't know what the bunchanumbers there is meant to indicate tbqh. It doesn't seem to match the internal numerical ID of the Facebook page or anything.

I've just been doing this manually by going through the posts using my old Facebook account from high school (gross) for like 20 posts or something, but considering there's a way to finagle it out of the URLs as-is, I'm gonna pause and let Daniel Booru decide if anything on the backend wants to happen in this regard. Or if some other random user feels like messing around with a mass-updating bot, idk.

I guess my counterargument to that is that instagram links are normalized to not include their usernames even when present in the URL (post #10094263) and unlike facebook links danbooru doesn’t even attempt to find the associated artist when the artist is included right there in the URL lmao.

Damian0358 said:

Regardless of anything, if we update Facebook/Instagram/etc. normalization, someone has to fucking update the examples given for them in bad link, because the one for Facebook we have right now runs into valid normalized sources.

That should be fixed now, I think. Maybe. It's not getting any false positives now in the query there (source:*fbcdn.net/* -source:*/hphotos* -source:*/scontent*), but there are still some false negatives (source:*fbcdn.net/* ~source:*/hphotos* ~source:*/scontent* bad_link)

I think the normalization of direct image URLs might not be working properly for more recent uploads... post #10085149 is public at https://www.facebook.com/photo.php?fbid=10223614231102757, and the directly uploaded link in the source is still valid, but the URL it converts to (https://www.facebook.com/photo?fbid=10223614231142758) is not correct.

Older uploads like post #548614 are accurate, though.

So I guess this means I have to spend more time on the AI jesus shrimp website to try and resolve some of these. Yippee

Updated by Ylimegirl

Following up here because it seems like it makes sense to me, but as I mentioned in forum #426215, a bunch of my Facebook uploads got switched (sometimes the same post multiple times) between bad id, bad link and a different source that was nearly identical to the one I uploaded it with. None of the posts (especially official art ones) are dead though. It still doesn't make sense to me how Facebook uploads are supposed to be sourced. Leaving the normalised link is apparently wrong, and so is changing it to a /photo?fbid=photoid link. What exactly am I supposed to do?

The proposed bad facebook link tag also has a wiki that says "use the uploader" but that results in a 400 code so that's wrong too.

WRS said in forum #426959:

Following up here because it seems like it makes sense to me, but as I mentioned in forum #426215, a bunch of my Facebook uploads got switched (sometimes the same post multiple times) between bad id, bad link and a different source that was nearly identical to the one I uploaded it with. None of the posts (especially official art ones) are dead though. It still doesn't make sense to me how Facebook uploads are supposed to be sourced. Leaving the normalised link is apparently wrong, and so is changing it to a /photo?fbid=photoid link. What exactly am I supposed to do?

Mildly confused by your confusion, but I'll go ahead and explain my logic. Let's use post #10081157 as an example.

I have no doubt that at the time of upload, the direct image link you provided at <https://scontent-yyz1-1.xx.fbcdn.net/v/t39.30808-6/480874746_1351545329310274_3378236370973032964_n.jpg?_nc_cat=100&ccb=1-7&_nc_sid=127cfc&_nc_ohc=CFD5sRlPFRYQ7kNvwFb90CC&_nc_oc=AdliW4B0ontAN3PSr84SnhH3bSAKzYGbOEG2Od-k0leh2NRXYv83WoabTT4PBvLa3lQ&_nc_zt=23&_nc_ht=scontent-yyz1-1.xx&_nc_gid=W0n4DqfmZnNzOd3cuJD7aw&oh=00_AfeXRj6WuHaqomlAZK83AfqXFr1NQNpIGc5z-bdHrAGeKw&oe=68EADE1F> was correct and active. However, as you can see by going to that link now, that is no longer the case. The source it normalizes to, <https://www.facebook.com/photo.php?fbid=1351545329310274>, doesn't exist at all.

I have since changed the source to https://www.facebook.com/photo.php?fbid=1303612194103588&set=pb.100033644588999.-2207520000&type=3, which as you can see is active and okie dokie. I can't check the direct image URL as I'm on mobile right now, but I'm sure that too will expire just as the first one did.

Basically—the direct links to *fbcdn* assets are a ticking clock before the associated hash expires, and the URLs they normalize to rarely match the actual source (unless it's an older Facebook post like post #6467019, where the direct image hash is actually expired but the normalized source is still active).
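To make the post #10081157 example concrete: the fbid in the normalized source is the middle number of the direct image's filename. Here's a small Python sketch of that guess. The filename shape (number_number_number_n.jpg) is assumed from that one example, the function name is made up, and this is not Danbooru's actual code. As said above, the URL it produces often doesn't point at a live post.

```python
import re
from urllib.parse import urlsplit

# Filename shape assumed from the example in this thread:
# <number>_<number>_<number>_<letter>.<extension>
_FBCDN_FILE = re.compile(r"^\d+_(?P<middle>\d+)_\d+_[a-z]\.\w+$")


def guess_photo_page(image_url):
    """Guess a photo.php URL from a direct fbcdn image link, or return None."""
    parts = urlsplit(image_url)
    if not parts.netloc.lower().endswith(".fbcdn.net"):
        return None
    filename = parts.path.rsplit("/", 1)[-1]
    match = _FBCDN_FILE.match(filename)
    if match is None:
        return None
    return "https://www.facebook.com/photo.php?fbid=" + match.group("middle")


# Direct image link from post #10081157, with the query string left off for brevity.
image = "https://scontent-yyz1-1.xx.fbcdn.net/v/t39.30808-6/480874746_1351545329310274_3378236370973032964_n.jpg"
print(guess_photo_page(image))
# -> https://www.facebook.com/photo.php?fbid=1351545329310274
```

That output is the same dead URL mentioned above, which is the whole problem: the number comes straight out of the filename, but nothing guarantees it is the ID of the actual post.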

I can't claim to understand the inner workings of Facebook's code, but I do know the actual IDs are usually pretty close numerically to their direct URLs, which makes me think it's something like the Twitter snowflake ID that has the date/time embedded into it somewhere with potential delays based on how long it takes between uploading the images and posting them, or just a case of everyone waiting in line and it just generally being close together.

Again if you want to find the not-dead sources for your Facebook uploads please share, but as it stands everything under user:WRS source:*fbcdn* doesn't currently point anywhere useful to me or other users who aren't signed into Facebook.

The proposed bad facebook link tag also has a wiki that says "use the uploader" but that results in a 400 code so that's wrong too.

you can barely tell i copypasted the wiki from the bad twitter link wiki

My confusion is that I don't really get how you're determining sources for Facebook, because whether a link I provided is correct and points to a live post seems arbitrary, both when I leave the direct image link as the source to be normalised and when I replace it with the page the post comes from.

I am thinking more along the lines of post #9371154 as an example. You changed this to the exact same initial link I had before I changed it to the direct image link, except with an additional URL parameter in it. What you haven't explained well in the earlier posts is how the source should be set so that it won't normalise to a "dead" post.

But I guess this is telling:

Basically—the direct links to *fbcdn* assets are a ticking clock before the associated hash expires, and the URLs they normalize to rarely match the actual source

So it looks like I have to change the direct image link to the post page every time I upload from Facebook and ignore the normalisation, the same way you use the post page as the source for Twitter (not that Twitter has normalisation, because it can't).

ETA: Well, either way, would've been nice to know earlier that Facebook normalisation is all wrong. I'm checking now and all the remaining ones are actually deleted off the profile I got them from so no recovering the original links.

Updated by WRS
