Auditing your own word censors.

Found an issue with the phpBB system here at NESdev? Use this forum to report problems.

Moderator: Moderators

Pokun
Posts: 1485
Joined: Tue May 28, 2013 5:49 am
Location: Hokkaido, Japan

Auditing your own word censors.

Post by Pokun » Fri Jan 10, 2020 3:45 am

[Split from this topic, from which a spam reply was deleted]

What the...?? What kind of filter would replace things by "my mom"!?

Anyway, you can record a checksum of your ROM (and of the compressed archive, if any other files are included) to make sure nothing is altered at any point. To make sure the cartridge was made correctly, I guess you'd have to dump a copy and compare it against the ROM checksum you recorded.
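
The checksum comparison described above could be sketched roughly like this. This is only an illustration, not anyone's actual release process: the choice of SHA-256, the function names, and the file names in the usage note are all assumptions, and it presumes the dump is saved in the same format as the release file (an extra header on one of them would change the hash).

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def dump_matches_release(release_path, dump_path):
    """True if the dumped image hashes to the same value as the release ROM."""
    return sha256_of_file(release_path) == sha256_of_file(dump_path)
```

For example, `dump_matches_release("game_final.nes", "cart_dump.nes")` (hypothetical file names) would return True only if the two files are byte-for-byte identical.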

As for cleaning up unused code, I guess that's fine if the game is tested thoroughly after that. After final testing, probably no more changes should be made to the ROM, or you have to redo the testing.

nocash
Posts: 1211
Joined: Fri Feb 24, 2012 12:09 pm

Re: Auditing your own code.

Post by nocash » Fri Jan 10, 2020 6:10 pm

The filter does replace "via-gra" with "my mom", and maybe other words, too. That might explain some confusing spam posts that have occurred over the past few years, like people offering to "buy (my mom) online"... it might also shed new light on posts from people saying that they were "building a desktop PC for (my mom)"?
homepage - patreon - you can think of a bit as a bottle that is either half full or half empty

rainwarrior
Posts: 7822
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada

Re: Auditing your own code.

Post by rainwarrior » Fri Jan 10, 2020 8:34 pm

cruddy 3V bootleg carts █RX SPAM█?

wow, that kind of silent replacement is pretty fucking disturbing. I do not like this at all.

It'd be one thing if it replaced blocked words with **** or whatever, but making substitutions like that erodes my confidence in all communication.

tokumaru
Posts: 11744
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: Auditing your own code.

Post by tokumaru » Fri Jan 10, 2020 9:23 pm

I always chuckle when I see spammers talking about their moms here! I love it! I'm pretty sure this filter has been in place for years, but I still don't know which specific words get replaced (in addition to the one that was just mentioned).

Memblers
Site Admin
Posts: 3855
Joined: Mon Sep 20, 2004 6:04 am
Location: Indianapolis

Re: Auditing your own code.

Post by Memblers » Fri Jan 10, 2020 11:10 pm

That word filter is my fault; it's been in there for ages. It wrecks the spam URLs and was a funny/dumb way of bullying the spammers, in the style of "why do you keep hitting yourself?". The only ones replaced with their mom were v_iagra, l_evitra, and c_ialis (I remember now that we did have a problem with it affecting "specialist" at one time, so now it has the wildcard characters removed). There are 14 in total, mostly related to pharmaceuticals and Warcraft currencies.
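
The "specialist" problem mentioned above is what happens when a censor matches anywhere inside a word rather than only on whole words. Here is a minimal sketch of the difference, using Python's `re` module; it does not reproduce phpBB's actual matching rules, and the one-entry `CENSORS` table and function names are made up for illustration.

```python
import re

# Illustrative one-entry censor table (not the forum's real list).
CENSORS = {"cialis": "my mom"}


def censor_substring(text):
    """Replace the censored word wherever it appears, even inside other words."""
    for word, replacement in CENSORS.items():
        text = re.sub(re.escape(word), replacement, text, flags=re.IGNORECASE)
    return text


def censor_whole_word(text):
    """Replace the censored word only when it stands alone (\\b = word boundary)."""
    for word, replacement in CENSORS.items():
        pattern = r"\b" + re.escape(word) + r"\b"
        text = re.sub(pattern, replacement, text, flags=re.IGNORECASE)
    return text
```

With substring matching, "ask a specialist" gets mangled because the censored word sits in the middle of "specialist"; with whole-word matching, that sentence is left alone while the word on its own is still replaced.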

Guess I could change the v_word at least, obviously it was never supposed to affect actual user posts, sorry about that.

tepples
Posts: 22014
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: Auditing your own code.

Post by tepples » Fri Jan 10, 2020 11:18 pm

Without giving too much away, I'll explain all current word censors in general terms.

- Test patterns used by spambots
- Prescription drugs
- Services to gain an advantage in an MMORPG
- A couple specific reddit posts copied and pasted here, replaced with a sandwich ad
- The URL of one of blargg's old sites, replaced with a newer URL

nocash
Posts: 1211
Joined: Fri Feb 24, 2012 12:09 pm

Re: Auditing your own code.

Post by nocash » Sat Jan 11, 2020 2:38 am

I like the replacement; it's quite funny, and it hardly ever hits non-spam posts. The post about building a PC for my mom was probably meant as written... although some people probably really do build PCs to get paid in special goods ; )
If anything, it's possibly a bit unfair for real moms. Something that isn't a person might work better. My nostrils, half-eaten bananas, used chewing gums, my intimate thoughts about useless things, whatever.
Uhm, but please not ******, that makes me feel uneasy and reminds me of scary people who say "f***ing cool sh**" instead of "fucking cool shit".
homepage - patreon - you can think of a bit as a bottle that is either half full or half empty

Pokun
Posts: 1485
Joined: Tue May 28, 2013 5:49 am
Location: Hokkaido, Japan

Re: Auditing your own code.

Post by Pokun » Sat Jan 11, 2020 3:34 am

OK that explains everything, and makes sense considering how popular this place is among spammers. I do remember one time when a spammer was seemingly trying to sell his mom. It was very funny. :)

rainwarrior
Posts: 7822
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada

Re: Auditing your own code.

Post by rainwarrior » Sat Jan 11, 2020 10:58 am

Mainly I just want to be able to believe that what someone typed is what I'm looking at. If you wanna tell me that the substitutions hit 0% of real posts, then it's not a problem, but I just have to trust you on it.

tepples: I don't want to know the list of words to try and reverse engineer original messages from substituted ones. :P That just adds another layer of puzzle on top of the comprehension and trust problem. Also shouldn't those words be kept secret anyway, as per their function as an anti-spam factor?

Like whatever secret stuff you gotta do to fight spam is fine. If there's a reason that my mom is more suitable than **** then go ahead and stick with it, but if it's all the same to you I'd much rather know when the text I'm looking at has been altered than not know. I don't know if that's worse for spam security though, and I presume that's best discussed in private anyway.

tepples
Posts: 22014
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: Auditing your own code.

Post by tepples » Sat Jan 11, 2020 2:12 pm

That's why I mentioned categories, to keep the actual words secret. The only category likely to show up in remotely on-topic posts is the substitution for one of blargg's old sites. If it'd help, I could add special punctuation to all substitutions.

Memblers
Site Admin
Posts: 3855
Joined: Mon Sep 20, 2004 6:04 am
Location: Indianapolis

Re: Auditing your own code.

Post by Memblers » Sun Jan 12, 2020 3:13 am

I went in and added a note to most of the substitutions, at least the ones that have a non-zero chance of being used in a normal post. Now, instead of spammers looking like they're engaging in some bizarre form of human trafficking, it just looks like they're selling cruddy 3V bootleg carts █RX SPAM█. Which might be almost as offensive around here, heheh.
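
The idea of attaching a visible note to each substitution, so readers can tell when text was altered, could look something like the sketch below. This is not the board's actual configuration: the function names are invented, the censored word in the usage note is a made-up stand-in, and only the replacement text and the █RX SPAM█ marker come from this thread.

```python
import re

MARK = "\u2588"  # FULL BLOCK, the character used around the note in this thread


def censor_with_note(text, censors, note="RX SPAM"):
    """Replace whole-word matches and append a visible marker to each replacement.

    `censors` maps lowercase words to their replacement text.
    """
    if not censors:
        return text
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in censors) + r")\b",
        re.IGNORECASE,
    )

    def replace(match):
        replacement = censors[match.group(0).lower()]
        return f"{replacement} {MARK}{note}{MARK}"

    return pattern.sub(replace, text)


def looks_altered(text):
    """True if the text carries the marker that substitutions add."""
    return MARK in text
```

For instance, `censor_with_note("buy widgetol online", {"widgetol": "cruddy 3V bootleg carts"})` marks the replacement, and `looks_altered` lets a reader (or a script) spot that the post no longer says what was typed.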

I almost forgot about that "building a PC for my mom" thread. Thinking about it, I do remember doing a double-take when I first saw that. Fun coincidence.

Bregalad
Posts: 7889
Joined: Fri Nov 12, 2004 2:49 pm
Location: Chexbres, VD, Switzerland

Re: Auditing your own code.

Post by Bregalad » Sun Jan 12, 2020 12:12 pm

rainwarrior wrote:
Fri Jan 10, 2020 8:34 pm
!cruddy 3V bootleg carts (word replaced by spam filter)?

wow, that kind of silent replacement is pretty fucking disturbing. I do not like this at all.

I can see where you're coming from, but on the other hand, I like this funny, home-made, NESdev-specific spam fight. It's not creepy like Facebook or Google doing massive private (non-state-controlled) and possibly politically or economically biased censorship.

rainwarrior
Posts: 7822
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada

Re: Auditing your own code.

Post by rainwarrior » Sun Jan 12, 2020 6:36 pm

That post you quoted has said 3 different things over the course of this conversation without me editing the post. ;)

tepples
Posts: 22014
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: Auditing your own code.

Post by tepples » Sun Jan 12, 2020 7:41 pm

Fighting spam would be a lot easier with a non-ancient version of MySQL that accepts non-BMP characters like 🐖

EDIT: Apparently non-BMP characters can be used in posts, but not in word censors. I was thinking of surrounding each replacement with a pig emoji on each side, but that won't work in the current setup.
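
For anyone wondering what "non-BMP" means in practice, here is a small sketch. It assumes the limit comes from MySQL's older `utf8` character set, which stores at most three bytes per character; the thread doesn't confirm that this is the exact cause on this board, and the function names are made up.

```python
def non_bmp_chars(text):
    """Return the characters that lie outside the Basic Multilingual Plane."""
    # BMP code points run from U+0000 to U+FFFF; anything above is non-BMP.
    return [ch for ch in text if ord(ch) > 0xFFFF]


def fits_three_byte_utf8(text):
    """True if every character encodes to at most three bytes in UTF-8."""
    return all(len(ch.encode("utf-8")) <= 3 for ch in text)
```

The pig emoji (U+1F416) needs four bytes in UTF-8, so it fails the three-byte check, while the █ block character (U+2588) is inside the BMP and passes.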

Gilbert
Posts: 384
Joined: Sun Dec 12, 2010 10:27 pm
Location: Hong Kong

Re: Auditing your own code.

Post by Gilbert » Sun Jan 12, 2020 8:09 pm

This thread is really off-topic now, as everyone is talking about your mom forum spam filters. Maybe it's a good time to split the topic?
