Users trust AI as much as humans for flagging problematic content

UNIVERSITY PARK, PA. – Social media users may trust artificial intelligence (AI) as much as human editors to flag hate speech and harmful content, according to Penn State researchers.

The researchers said that users show more trust in AI when they think of positive characteristics of machines, such as their accuracy and objectivity. However, when users are reminded of machines’ inability to make subjective decisions, their trust is lower.

The results could help developers design better AI-powered content curation systems that can handle the large amounts of information currently being generated while avoiding the perception that the material has been censored or inaccurately classified, said S. Shyam Sundar, James P. Jimirro Professor of Media Effects in the Donald P. Bellisario College of Communications and co-director of the Media Effects Research Laboratory.

“There’s this urgent need for content moderation in social media and online media more generally,” said Sundar, who is also a fellow of Penn State’s Institute for Computational and Data Sciences. “In traditional media, we have news editors who act as gatekeepers. But online the gates are so wide open and gatekeeping isn’t necessarily feasible for humans, especially with the amount of information being generated. As the industry increasingly moves towards automated solutions, this study examines the difference between human and automated content moderators in terms of how people respond to them.”

Both human and AI editors have advantages and disadvantages. Humans tend to be more accurate in assessing whether content is harmful, such as when it is racist or could potentially provoke self-harm, according to Maria D. Molina, an assistant professor of advertising and public relations at Michigan State University and the study’s first author. However, humans cannot process the massive amount of content that is now being generated and shared online.

AI editors, on the other hand, can analyze content quickly, but people often distrust these algorithms to make accurate recommendations and fear that the information could be censored.

“When we think about automated content moderation, the question arises whether artificial intelligence editors infringe on a person’s freedom of expression,” Molina said. “This creates a dichotomy between the fact that we need content moderation, because people are sharing all of this problematic content, and, at the same time, people’s concerns about AI’s ability to moderate content. So, ultimately, we want to know how we can build AI content moderators that people can trust without compromising that freedom of expression.”

Transparency and interactive transparency

According to Molina, bringing humans and AI together in the moderation process may be one way to build a trusted moderation system. She added that transparency – or signaling to users that a machine is involved in moderation – is one approach to increasing trust in AI. However, allowing users to offer suggestions to the AI, which the researchers call “interactive transparency,” appears to boost user trust even further.

To examine transparency and interactive transparency, among other factors, the researchers recruited 676 participants to interact with a content classification system. Participants were randomly assigned to one of 18 experimental conditions designed to test how the source of moderation (AI, human, or both) and transparency (regular, interactive, or no transparency) affect participants’ trust in AI content editors. The researchers tested classification decisions – whether the content was labeled “flagged” or “not flagged” for being harmful or hateful. The “harmful” test content dealt with suicidal ideation, while the “hateful” test content included hate speech.
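The 18 conditions follow from crossing the factors described above. As a rough illustrative sketch only (the factor labels below are assumptions inferred from this description, not the study’s own materials), the combinations can be enumerated like this:

    # Sketch of the implied 3 x 3 x 2 design; labels are assumed from the
    # description above, not taken from the study's materials.
    from itertools import product

    sources = ["AI", "human", "both"]                  # source of moderation
    transparency = ["none", "regular", "interactive"]  # transparency condition
    content_types = ["harmful (suicidal ideation)", "hateful (hate speech)"]

    conditions = list(product(sources, transparency, content_types))
    print(len(conditions))  # 3 * 3 * 2 = 18 experimental conditions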

Among other findings, the researchers reported that user trust depends on whether the presence of an AI content moderator invokes positive attributes of machines, such as their accuracy and objectivity, or negative attributes, such as their inability to make subjective judgments about the nuances of human language.

Allowing users to help the AI system decide whether online information is harmful or not can also boost their trust. The researchers said that when study participants added their own terms to the results of an AI-selected list of words used to classify posts, they trusted the AI editor just as much as they would a human editor.

Ethical concerns

Sundar said that relieving humans of reviewing content goes beyond giving workers a break from a tedious chore. Hiring human editors for the job means that those workers are exposed to hateful and violent images and content for hours on end, he said.

“There is an ethical imperative for automated content moderation,” said Sundar, who is also director of the Penn State Center for Socially Responsible Artificial Intelligence. “There is a need to protect human content moderators – who are performing a social good when they do this work – from constant exposure to harmful content day after day.”

According to Molina, future work could look at how to help people not only trust AI, but also understand it. Interactive transparency could also be a key element in understanding AI, she added.

“Something that is really important is not just trusting systems, but also engaging people in a way that they actually understand AI,” Molina said. “How can we use this concept of interactive transparency and other methods to help people understand AI better? How can we best present AI so that it strikes the right balance between appreciating machine capabilities and being skeptical of their weaknesses? These questions are worth exploring.”

The researchers present their results in the current issue of the Journal of Computer-Mediated Communication.