Title: Improving the auto-linker (SMF patch)
Post by: PowerGlove on September 07, 2023, 03:26:56 PM

There was a recent Meta thread (https://bitcointalk.org/index.php?topic=5465210) about the auto-linker sometimes failing to properly recognize URLs, and my name came up, so I decided to poke around and see if I could make sense of this bug.

As a recap, the auto-linker can sometimes be confused by leading spaces (particularly after a post has been edited, or quoted). For example, if you post the following (a sequence of URLs with an increasing amount of leading space, meant to showcase the problem):

Code:
www.thefarside.com

Then it'll (initially) render correctly, like this:

https://talkimg.com/images/2023/09/07/mIF0I.png

But after an edit (even one that doesn't change anything), it'll render incorrectly, like this (i.e. links with 2/4/6 leading spaces no longer recognized):

https://talkimg.com/images/2023/09/07/mIIJ5.png

If the original post is quoted, then it'll render like this (i.e. links with 3/5/7 leading spaces no longer recognized):

https://talkimg.com/images/2023/09/07/mIspz.png

(And if the quoted post were edited, it would revert to links with 2/4/6 leading spaces no longer being recognized.)

Pretty weird, huh?

Now, I know there are a few places in SMF where whitespace conversions happen (that's part of the reason I did the [nbsp] patch (https://bitcointalk.org/index.php?topic=5440501), so that non-breaking spaces could be used in a way that wouldn't be undone by those conversions). So, I don't find this bug that perplexing (though, I was surprised that the bug persisted even after bypassing preparsecode() and un_preparsecode(); I had figured that something in one of those two functions was behind spacing not "round-tripping" correctly on SMF).

Anyway, regardless of the ultimate source(s) of spacing getting silently messed with when you edit (or quote) a post, this particular bug is caused by the URL regexes in the auto-linker not properly taking this state of affairs into account (which is odd, because the e-mail regexes do). Specifically, the positive lookbehind assertions aren't aware of non-breaking spaces (and the second regex, the one for schemeless URLs, needs an additional tweak in order to prevent this bug from sometimes presenting during post preview).
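
As a rough illustration of the lookbehind problem described above, here is a minimal Python sketch. It is not the SMF code: the two patterns are hypothetical, heavily simplified stand-ins for the real regexes in Subs.php, and the stored form of the post (a space followed by an `&nbsp;` entity directly before the URL) is an assumption about what the whitespace conversion produces.

```python
import re

# Hypothetical, simplified URL pattern; NOT the regex from Subs.php.
URL = r"(https?://[^\s<]+)"

# Lookbehind that only accepts start-of-text or plain whitespace before a URL.
NAIVE = re.compile(r"(?:^|(?<=\s))" + URL)

# Same idea, but a preceding "&nbsp;" entity is also accepted.
NBSP_AWARE = re.compile(r"(?:^|(?<=\s)|(?<=&nbsp;))" + URL)


def find_urls(pattern, text):
    """Return every URL that the given pattern recognizes in text."""
    return pattern.findall(text)


# As typed: two plain leading spaces.
fresh = "  https://example.com"

# Assumed stored form after an edit: the second space became "&nbsp;".
stored = " &nbsp;https://example.com"

if __name__ == "__main__":
    print(find_urls(NAIVE, fresh))        # URL found
    print(find_urls(NAIVE, stored))       # URL missed: preceded by ";"
    print(find_urls(NBSP_AWARE, stored))  # URL found again
```

The point of the sketch is only that a lookbehind which checks for ordinary whitespace sees the `;` at the end of `&nbsp;` and rejects the match, whereas one that also allows the entity accepts it.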

Here's the diff for @theymos:

Code:
--- baseline/Sources/Subs.php 2011-09-17 21:59:55.000000000 +0000

(Because this patch amounts to adjusting a pair of regexes in the BBCode parser, it will both fix this bug moving forward, and retroactively fix old posts that have unclickable links in them due to this issue, like this one (https://bitcointalk.org/index.php?topic=2934774.msg30174356#msg30174356).)

Title: Re: Improving the auto-linker (SMF patch)
Post by: theymos on September 08, 2023, 07:48:01 PM

Done, thanks! What a monstrous regex...

I'm 95% sure that this change is correct, but if anyone notices this breaking any posts, let me know.

Title: Re: Improving the auto-linker (SMF patch)
Post by: Pmalek on September 09, 2023, 08:57:43 AM

I am the one who opened that thread in Meta you are talking about OP. For testing purposes, I am going to link to it here with different spaces before the link to see if the bug has been fixed. I will also edit my post once without making any changes and then a second time with just a minor change to see if that affects anything.

Edit 2:
Testing if the bug is gone https://bitcointalk.org/index.php?topic=5465210.0
Testing if the bug is gone https://bitcointalk.org/index.php?topic=5465210.0
Testing if the bug is gone https://bitcointalk.org/index.php?topic=5465210.0
Testing if the bug is gone https://bitcointalk.org/index.php?topic=5465210.0

Edit 3 and 4: It works

Title: Re: Improving the auto-linker (SMF patch)
Post by: cafter on September 09, 2023, 09:56:04 AM

I came across this bug through this thread (https://bitcointalk.org/index.php?topic=5465210). After reading some replies, I replied to this thread in the Bitcoin Discussion board (https://bitcointalk.org/index.php?topic=5457166.msg62818451#msg62818451), and the URL is not clickable.

Is this issue not resolved yet?

https://i.ibb.co/ygFxhwy/byui.png

Title: Re: Improving the auto-linker (SMF patch)
Post by: joker_josue on September 09, 2023, 01:33:47 PM

Another great job @PowerGlove. Thanks!

Quote from: cafter
I came across this bug through this thread (https://bitcointalk.org/index.php?topic=5465210). After reading some replies, I replied to this thread in the Bitcoin Discussion board (https://bitcointalk.org/index.php?topic=5457166.msg62818451#msg62818451), and the URL is not clickable.
Is this issue not resolved yet?
https://i.ibb.co/ygFxhwy/byui.png

This is not a code issue or bug. The forum only recognizes links that start with http:// or www. That is, if I write bitcointalk.org, it does not create a link. But if you write https://bitcointalk.org or www.bitcointalk.org, it creates the link.

Title: Re: Improving the auto-linker (SMF patch)
Post by: cafter on September 09, 2023, 01:41:38 PM

<snip>

Now I added "www." at the beginning of the link and it became a nice clickable link. It was confusing to understand what the exact problem was and what PowerGlove solved, because I am not a coder and don't know much about technical things. Thanks for clearing that up :) :)

https://i.ibb.co/kB5MVYk/bvhm.png