
Thread 107162569

9 posts 8 images /g/
Anonymous No.107162569 [Report] >>107162853 >>107162876 >>107162893 >>107162895 >>107162913
AI moderation
How realistic is the idea of an AI moderator that has to approve every post before it goes live?
Anonymous No.107162829 [Report] >>107162915
why throw ideas like this into the universe, op
Anonymous No.107162853 [Report] >>107162893
>>107162569 (OP)
Stupid idea. AI moderators objectively suck, they get false positives frequently
Anonymous No.107162876 [Report]
>>107162569 (OP)
They are good judges in an extremely specific, well-defined context.

Using them to judge random posts is a bad idea.
Anonymous No.107162893 [Report] >>107162925
>>107162569 (OP)
In my setup, every single post not made by a moderator is sent to a Llama Guard model
>>107162853
>they get false positives frequently
I noticed this with emojis: if a user posts nothing but emojis, it *sometimes* gets flagged as unsafe. I'm currently working on replacing streamer.bot with nothing but Python, and thankfully the Twitch IRC on_pubmsg tags include emoji character length and start/end location, so I can effectively strip them from chat prior to moderation
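A rough sketch of the stripping step described above, not the poster's actual code. It assumes the tag value follows the Twitch IRC `emotes` format (`id:start-end,start-end/id:start-end`, with inclusive character positions) and that the value has already been pulled out of the event tags. The `strip_emotes` name is made up for illustration.

```python
from typing import Optional


def strip_emotes(message: str, emotes_tag: Optional[str]) -> str:
    """Remove the spans listed in an `emotes`-style tag value from a chat message.

    Assumed tag format: "25:0-4,12-16/1902:6-10", where each start-end pair is
    an inclusive character range into `message`.
    """
    if not emotes_tag:
        return " ".join(message.split())

    spans = []
    for group in emotes_tag.split("/"):
        _emote_id, _, positions = group.partition(":")
        for position in positions.split(","):
            start, _, end = position.partition("-")
            spans.append((int(start), int(end)))

    chars = list(message)
    # Delete from the rightmost span first so earlier indices stay valid.
    for start, end in sorted(spans, reverse=True):
        del chars[start:end + 1]

    # Collapse the whitespace left behind by the removed emotes.
    return " ".join("".join(chars).split())


if __name__ == "__main__":
    print(strip_emotes("Kappa hello Kappa", "25:0-4,12-16"))
```

If the result is an empty string, the message was emotes only and could skip the classifier entirely, which would sidestep the false positive mentioned above.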
Anonymous No.107162895 [Report]
>>107162569 (OP)
it's inevitable and is already happening everywhere
all these models will ultimately start pruning posts that deviate even slightly from the norm and thus will converge to a complete "zero discussion possible" mode.
Anonymous No.107162913 [Report] >>107162925
>>107162569 (OP)
>How realistic is the idea of an AI moderator that has to approve every post before it goes live?
they already have that it's called llama guard (Meta) and omni-moderation-latest (OpenAI)
Anonymous No.107162915 [Report]
>>107162829
4chan already blocks content on upload
>System thinks your post is spam
>md5 blocks on certain images/webms
etc. etc.
Using AI would just make it more sophisticated.
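For reference, an md5 block of the kind mentioned above can be sketched in a few lines. This is a generic illustration, not 4chan's actual implementation. The blocklist contents and the `is_blocked_upload` name are hypothetical.

```python
import hashlib

# Hypothetical blocklist: hex MD5 digests of files already banned.
BLOCKED_MD5 = {hashlib.md5(b"known bad file bytes").hexdigest()}


def is_blocked_upload(data: bytes, blocked=BLOCKED_MD5) -> bool:
    """Return True if the upload's MD5 digest is on the blocklist."""
    return hashlib.md5(data).hexdigest() in blocked


if __name__ == "__main__":
    print(is_blocked_upload(b"known bad file bytes"))   # exact copy of a banned file
    print(is_blocked_upload(b"known bad file bytes."))  # same file with one extra byte
```

An exact-hash check only catches byte-identical re-uploads, so changing a single byte gets past it. That gap is what a classifier-based filter would be trying to close.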
Anonymous No.107162925 [Report]
>>107162913
>they already have that it's called llama guard
yep. i just mentioned it and i use it for live content filtering >>107162893
i don't punish the sender if a message is marked as unsafe though, since if it's unsafe nobody actually sees the message but me
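A minimal sketch of the approve-before-publish flow discussed in this thread, under stated assumptions: the classifier is injected as a plain callable that returns text starting with "safe" or "unsafe" (roughly the shape of a Llama Guard style verdict), and moderators bypass it. The `ApprovalGate` class and its fields are hypothetical names, and the fake classifier stands in for a real model call.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class ApprovalGate:
    # Hypothetical classifier hook: takes message text, returns a verdict string.
    classify: Callable[[str], str]
    # Messages held back for the owner to review; the sender is not penalized.
    held: List[Tuple[str, str]] = field(default_factory=list)

    def submit(self, user: str, text: str, is_moderator: bool = False) -> bool:
        """Return True if the message may go live, False if it is held."""
        if is_moderator:
            return True
        verdict = self.classify(text).strip().lower()
        # Fail closed: anything other than an explicit "safe" verdict is held.
        if verdict.startswith("safe"):
            return True
        self.held.append((user, text))
        return False


def fake_classifier(text: str) -> str:
    """Stand-in for a real model call, used only for this illustration."""
    return "unsafe\nS1" if "badword" in text else "safe"


if __name__ == "__main__":
    gate = ApprovalGate(classify=fake_classifier)
    print(gate.submit("anon", "hello chat"))
    print(gate.submit("anon", "badword here"))
    print(gate.held)
```

Holding the message instead of timing out or banning the sender keeps a false positive cheap: the worst case is one post that only the owner sees.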