Researchers Investigate Challenges in Using Artificial Intelligence to Detect Online Hate Speech

June 18, 2026 • Al Jazeera

The United Nations has marked the International Day for Countering Hate Speech on June 18. The event comes as social media platforms are increasingly relying on artificial intelligence (AI) to detect and remove hate speech online.

According to the UN, hate speech encompasses any communication that discriminates against or incites violence towards a person or group based on their identity, race, ethnicity, religion, gender, sexual orientation, or disability. The term also includes non-verbal forms of expression such as images, cartoons, gestures, and objects.

A 2023 joint survey conducted by Ipsos and the UN Educational, Scientific and Cultural Organization (UNESCO) found that over two-thirds of internet users encountered hate speech online. The survey also revealed that LGBTQI individuals, ethnic and racial minorities, and women were among those most frequently targeted by hate speech.

Major social media companies have reported varying levels of success in removing hateful content from their platforms. Meta, which owns Facebook and Instagram, reported removing 1.3 million posts from Instagram and 1.3 million from Facebook in the fourth quarter of 2025, down from 7.4 million on Instagram and 5.8 million on Facebook in the fourth quarter of 2024.

TikTok reported that it removed 96.3% of hate speech content in the fourth quarter of 2025 before it was reported.

Social media companies have increasingly turned to AI-powered content moderation systems to detect and combat hate speech online. These systems use large language models to analyze vast numbers of messages and apply rules or score thresholds to determine whether content is hateful.

A 2025 study by researchers at the University of Pennsylvania found that these AI moderation systems vary widely in their ability to identify and classify hate speech, with significant inconsistencies across different categories. The study evaluated seven AI moderation systems and found major differences in how they scored hate speech targeting various groups.