XMB Forum Software

Smilies Defect?

miqrogroove - 8-28-2025 at 10:15 AM

There is something unexpected here:

ini_set('error_log', 'error_log');

Why is ...

Code:
')


... being processed as a smilie with the new database message encoding?

And why does an extra semicolon appear after the smilie?

flushedpancake - 8-28-2025 at 06:51 PM

') is being turned into &#039;) it seems?

the smiley filter doesn't check things like spaces and it runs after everything else is processed.

hence, wink face with an extra semicolon after because your message already had a semicolon in it
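
For illustration, here is a minimal PHP sketch of the behaviour described above. It assumes the message is entity-encoded first and the smilie code is then swapped in with a plain string replacement. `naive_smilies()` and the image file name are hypothetical stand-ins, not XMB's actual code.

```php
<?php
// Hypothetical stand-in for a smilie filter that runs after encoding.
function naive_smilies(string $encoded): string
{
    // Plain substring replacement: no check for spaces or entity boundaries.
    return str_replace(';)', '<img src="wink.gif" alt=";)" />', $encoded);
}

$raw = "ini_set('error_log', 'error_log');";

// ENT_QUOTES turns each apostrophe into &#039; so "')" becomes "&#039;)".
$encoded = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

// The trailing ";" of the last entity plus ")" now reads as the ";)" smilie,
// and the message's own ";" is left over after the image.
echo naive_smilies($encoded), "\n";
```

With this input the last entity loses its semicolon to the smilie, which would explain both the wink face and the stray semicolon after it.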

---

what I don't get is why it explicitly seems to trigger with a single escape and a closing bracket. ending in a semicolon is not exclusive to &apos;.

miqrogroove - 8-28-2025 at 10:00 PM

Oh, you're right. The message was:

Code:
');


Probably I mismatched the smilie input or forgot to update it somewhere along the line. It's in the bug tracker now for further study.

miqrogroove - 8-28-2025 at 10:22 PM

Quote: Originally posted by flushedpancake  
what I don't get is why it explicitly seems to trigger with a single escape and a closing bracket. ending in a semicolon is not exclusive to &apos;.


This part is correct. One of the default smilie codes is set up like emoticon substitution...

Code:
;)


We just need to make sure encoded and non-encoded strings aren't getting mixed up internally.

miqrogroove - 8-28-2025 at 10:47 PM

Need to test on 1.9.12 what happens if a non-latin-1 character is followed by a ). Then we will know if this is a new bug or an old bug made worse by apostrophe encoding.

miqrogroove - 8-29-2025 at 03:54 PM

Confirmed on v1.9.12, the message

Code:
>)

becomes &gt followed by a smilie image


In 1.10 we get the same >)

So this is an old bug.

The issue with non-latin-1 chars and other user-provided entity references is that they aren't currently processed or even reversible with the variety of charsets in use. I will have to parse around these, similar to how we treat HTML elements, to fix this bug.
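
One way "parsing around" entity references could look is sketched below. This is only an illustration under stated assumptions, not the fix that went into XMB: `smilies_outside_entities()`, the entity pattern, and the image file name are all hypothetical.

```php
<?php
// Hypothetical helper: substitute smilies only in the text between entity references.
function smilies_outside_entities(string $encoded, array $smilies): string
{
    // Split on named, decimal, and hex entity references, keeping them as tokens.
    $pattern = '/(&(?:[a-zA-Z][a-zA-Z0-9]*|#[0-9]+|#[xX][0-9a-fA-F]+);)/';
    $parts = preg_split($pattern, $encoded, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as $i => $part) {
        // With a single capture group, odd indexes hold the entities; leave them intact.
        if ($i % 2 === 0) {
            $parts[$i] = strtr($part, $smilies);
        }
    }
    return implode('', $parts);
}

$smilies = [';)' => '<img src="wink.gif" alt="wink" />'];

echo smilies_outside_entities('&#039;);', $smilies), "\n";     // entity stays whole, no smilie
echo smilies_outside_entities('&gt;) and ;)', $smilies), "\n"; // only the typed ;) is replaced
```

Because each entity is held out as its own token, its closing semicolon can never pair with a following bracket, while a `;)` the user actually typed is still replaced.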

miqrogroove - 8-29-2025 at 05:36 PM

Fixed this old bug with some regex wizardry. Everything else was fine.

flushedpancake - 8-30-2025 at 06:39 PM

Speaking of utf8 shenanigans and MySQL: https://www.coderedcorp.com/blog/guide-to-mysql-charsets-col... -- I've always used utf8mb4_unicode_520_ci, but perhaps there's significant reasoning to allow different charset/collation selections for some technical reason or another. (The lack of support for Celtic/Brittonic languages in MySQL is a sad state of affairs.)

Also, the site itself should declare its language and encoding in the HTML (lang attributes and a charset meta tag).

miqrogroove - 9-1-2025 at 12:59 PM

I agree the lang attributes are mostly absent.

https://bugs.xmbforum2.com/view.php?id=847

flushedpancake - 9-2-2025 at 08:27 PM

We should also probably set this in some places like the registration page.

<meta name="robots" content="noindex">

In case robots.txt gets deleted/overwritten by accident.
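
A rough sketch of how a page could emit that tag conditionally, assuming a per-script allow list. `robots_meta_tag()` and the `register.php` file name are hypothetical and do not reflect XMB's real file layout.

```php
<?php
// Hypothetical helper: return a noindex tag for scripts that should stay out of search results.
function robots_meta_tag(string $script, array $noindexScripts): string
{
    return in_array(basename($script), $noindexScripts, true)
        ? '<meta name="robots" content="noindex">'
        : '';
}

// Example: print the tag inside <head> when the current script is on the list.
echo robots_meta_tag($_SERVER['SCRIPT_NAME'] ?? '', ['register.php']);
```

Sending an `X-Robots-Tag: noindex` response header with `header()` would be another option that does not depend on the page template.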

Some pages that should not be indexed, like the registration pages, are indexed anyway (search Google for "powered by XMB" in double quotes). These are really easy targets for bots to do ****ed up things.