Modeling online polarization: why shrinking acceptable speech can backfire and what might help
This paper uses a psychology-based computer model to test how different platform policies might prevent or reverse political polarization online. The main result is that simply shrinking the range of opinions that people see or consider acceptable (often called narrowing the Overton window) has little effect and can even trigger polarization. In contrast, nudging attention toward neglected topics, enforcing existing social norms, and boosting visible, non-polarized public figures tend to work better at preventing or reducing polarization. Still, the authors find that once polarization is established, interventions often leave behind a latent tendency toward extremism when people have complex, multi-faceted identities.
The researchers built a simulation of online conversations grounded in ideas from psychology. In the model, people do not hold a single fixed opinion. Instead, they have multiple identity-related attitudes and a short memory of recent exchanges. The team ran many large-scale simulations of the thread-like conversations typical of platforms such as Reddit, Facebook, or X. They then implemented realistic interventions at the level of platform rules or algorithms to see which approaches reduce or reverse polarized outcomes.
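To make the idea of an agent with several attitudes and a short memory more concrete, here is a minimal Python sketch. It is an illustration only, not the authors' implementation: the class name `Agent`, the topic labels, the [-1, 1] attitude scale, and the default memory length are all assumptions introduced for this example.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical agent: several attitudes plus a bounded memory."""
    # topic -> position on an assumed [-1, 1] scale
    attitudes: dict
    # how many recent exchanges the agent remembers (assumed value)
    memory_size: int = 5
    memory: deque = field(init=False)

    def __post_init__(self):
        # A deque with maxlen silently drops the oldest exchange
        # once the memory is full.
        self.memory = deque(maxlen=self.memory_size)

    def remember(self, speaker_id, topic, position):
        """Store one observed exchange."""
        self.memory.append((speaker_id, topic, position))

    def recent_positions(self, topic):
        """Positions on one topic that are still in memory."""
        return [p for _, t, p in self.memory if t == topic]


agent = Agent({"economy": 0.2, "climate": -0.4}, memory_size=2)
agent.remember(1, "economy", 0.5)
agent.remember(2, "climate", -0.1)
agent.remember(3, "economy", 0.9)   # pushes out the first exchange
print(agent.recent_positions("economy"))
```

Because the memory is bounded, the agent's picture of others depends only on the latest few exchanges, which is the property the summary describes.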
At a high level, the model captures two common social forces. People want to fit in with a group (in-group pull) and they also push away from groups they see as different (out-group push). Agents use recent interaction history to judge whether others are similar or not. The model also includes two ways of thinking about acceptable speech: a soft Overton window that makes people less likely to voice views far outside what they recently encountered, and a hard Overton window that represents the broader range a platform allows.
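The two social forces and the two Overton windows can be sketched in a few lines of Python. This is a simplified, one-dimensional illustration under stated assumptions, not the update rule used in the paper: the function names, the similarity threshold, the `pull` and `push` rates, and the Gaussian-shaped soft window are all hypothetical choices made for this example.

```python
import math


def clip(x, lo=-1.0, hi=1.0):
    """Keep an attitude inside the assumed [-1, 1] scale."""
    return max(lo, min(hi, x))


def update_attitude(own, other_recent, pull=0.1, push=0.05, threshold=0.5):
    """Move toward a speaker judged similar, away from one judged different.

    other_recent: positions the speaker voiced in recent exchanges.
    All parameter values are illustrative assumptions.
    """
    perceived = sum(other_recent) / len(other_recent)
    gap = perceived - own
    if abs(gap) < threshold:
        new = own + pull * gap   # in-group pull: close part of the gap
    else:
        new = own - push * gap   # out-group push: widen the gap
    return clip(new)


def voice_probability(view, encountered, soft_width=0.5,
                      hard_window=(-1.0, 1.0)):
    """Chance that an agent voices a view, under both windows.

    Hard window: the platform never shows views outside it.
    Soft window: views far from what was recently encountered
    are less likely to be voiced (Gaussian falloff, an assumption).
    """
    lo, hi = hard_window
    if not lo <= view <= hi:
        return 0.0
    if not encountered:
        return 1.0
    centre = sum(encountered) / len(encountered)
    return math.exp(-((view - centre) / soft_width) ** 2)


print(update_attitude(0.0, [0.2, 0.2]))   # similar speaker: drifts closer
print(update_attitude(0.0, [0.8]))        # dissimilar speaker: drifts away
print(voice_probability(0.9, [0.0], hard_window=(-0.5, 0.5)))
```

In this sketch, narrowing `hard_window` only removes views from circulation, while the soft window shifts with whatever the agent has recently encountered. That distinction is what the interventions on acceptable speech manipulate.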
When the authors tested interventions, they found clear differences. Adjusting what the public considers acceptable (the Overton window) rarely prevented polarization and could even trigger it if done to "optimize" acceptability. Shifting attention toward under-discussed topics and increasing the social or reputational cost of breaking norms often helped prevent polarization, but were less able to undo it once it had taken hold. The most effective strategy in the model was increasing the visibility of influential individuals who model balanced, non-polarized conversation; this helped both prevent and reverse polarization in many simulations. Despite these successes, the model often produced latent extremism after polarization, especially when people had complex internal identities.