Case Study: AI Moderation Effectiveness on Reddit

Over the past few months, we've worked with many subreddits across different niches to deploy our AI moderation bot in their communities. Here, we highlight stories from a few contrasting subreddits.

Interested in bringing our tech to your subreddit and drastically reducing toxicity? Learn more about our free program.


r/TrueOffMyChest is a fork of the more popular r/offmychest. When users are banned from r/offmychest, they often seek refuge in r/TrueOffMyChest to start anew. Naturally, this influx, combined with the confession-style posts that r/TrueOffMyChest accepts, leads to a significant volume of hateful comments and moderation overload.

r/TrueOffMyChest receives thousands of comments a day, placing it amongst Reddit's most active subreddits. To cut back on the hateful criticism, insults, and arguments, a lengthy series of AutoMod rules is used in conjunction with a team of 40+ moderators (which also places it among the subreddits with the largest moderation teams).

<4% False Positive Rate
92% Hate Content Removed

Overflow remained an issue though, so, working with their team, we implemented our system to automatically remove and flag hateful content, 24/7, in addition to tracking and identifying repeat offenders for further action. ModerateHatespeech has proved highly effective for r/TrueOffMyChest, boasting a false positive rate of less than 4% and removing ~92% of all hateful/uncivil content on the subreddit, most of which had previously gone uncaught by its moderators (based on retrospective analysis).

“This bot is now outperforming the top 5 most active mods on the sub combined. In October 2022 it found and removed over 10,000 comments in a single month. This has led to a much less toxic subreddit and is helping the mod team foster a more nurturing and positive environment”

– u/I_Am_A_Real_Hacker, moderator of r/TrueOffMyChest


Dead by Daylight is a popular horror-style survival video game. Its community-run subreddit, r/deadbydaylight, serves as a hub for asking strategy questions, sharing gameplay experiences (and memes), and everything else related to Dead by Daylight.

Moderating the ~4,000 comments the subreddit receives each day can be a challenge though, given how heated conversations about gameplay can quickly grow toxic. So, turning to ModerateHatespeech’s SATI (Subreddit Anti-Toxicity Initiative), they’ve found great success in both cutting down moderator workload and finding previously unreported comments.

The challenge with r/deadbydaylight is the presence of game-specific terms, like “killer,” for example, which refers to a core character in the game. In the context of Dead by Daylight, certain comments that would otherwise be rule-breaking – like “kill yourself on hook and move on” (where “kill yourself on hook” is a strategy) – are actually harmless.

Working with r/deadbydaylight, though, we were able to mitigate the effects of this ambiguity on flagged comments, simultaneously ensuring a low rate of false positives and detection of genuine harassment. In fact, of the comments we flag to the moderation team at r/deadbydaylight, fewer than 10% are also reported by other users. Because our bot scans every comment submitted to the subreddit, we catch many comments that moderators and vigilant users never discover, and instantly report them before they spread.

“I have been moderating r/DeadByDaylight for almost two years now, and one of the biggest issues I have seen is ensuring the comments of posts remained civil and courteous; people are not very good at reporting vitriolic comments, and so a lot of really uncalled-for stuff flew under the radar.

ToxicityModBot has been an absolute boon for this team: it has managed to catch so many comments that we are able to action on quickly (frequently before the user who got the hateful reply even notices), and has made our jobs so much easier in the process. I cannot recommend this bot enough for your sub.”

– moderator from /r/deadbydaylight


Reddit has a lot of politically oriented subreddits, but r/PoliticalDiscussion places a special emphasis on respectful, thought-provoking discourse. With chains of comments that span hundreds of words each, enforcing constructive, civil debate is extremely time-consuming, even though the sub averages only hundreds of comments a day. And, as its well-known subject matter shows, politics can be extremely polarizing!

By integrating our moderation system, though, r/PoliticalDiscussion has been able to automatically remove rule-breaking content seconds after it's submitted (before users even see it) and flag more questionable content for moderators to review further.

<4% False Positive Rate
94% Previously Unreported

The results have been nothing short of stellar – with under 4% of removed comments being false positives – and successful detection of more than 80% of all the hateful, rule-breaking content submitted to the subreddit (based on retrospective analysis), only ~6% of which was reported by other users. Naturally, this has translated into significant time savings for r/PoliticalDiscussion's moderator team, totaling dozens of hours a month.

“ToxicityModBot has been very helpful in catching incivility preemptively, with the removal feature being quite strong with a low false positive rate when set to 95%. Reporting items between 90-95% has also been helpful. There's no perfect substitute in language processing for evaluating context, but in terms of time and effort saved, ToxicityModBot has been equal to an additional human moderator.”

– u/The_Egalitarian, moderator of r/PoliticalDiscussion
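The two-tier setup u/The_Egalitarian describes – auto-removal above one confidence threshold, reporting for human review in a band just below it – can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `triage` and the specific thresholds are hypothetical, not part of the bot's actual API.

```python
def triage(score: float,
           remove_threshold: float = 0.95,
           report_threshold: float = 0.90) -> str:
    """Map a model's toxicity confidence score (0.0-1.0) to an action.

    - score >= remove_threshold: remove the comment automatically
    - report_threshold <= score < remove_threshold: report for human review
    - otherwise: take no action
    """
    if score >= remove_threshold:
        return "remove"
    if score >= report_threshold:
        return "report"
    return "none"
```

Raising `remove_threshold` trades recall for a lower false positive rate on automatic removals, while the report band lets moderators catch borderline cases without risking bad auto-removals.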