Experiment Finished, Deep in Analysis
Results are promising but we aren't ready to release them
I haven’t posted since December which is… eight months! But the project is still alive and very well.
Endline survey
First, we ran the endline survey in mid-January, collecting 3,739 responses. This is far fewer than the 5,700 responses we got for the midline, but about the expected attrition. Note, however, that the number of people using the extension is a few thousand higher than this: once the extension is installed, it keeps returning data even if the user doesn't check their email and answer our survey.
We closed the survey on the morning of January 20th to ensure responses weren't contaminated by what Trump did right after being sworn in. Extension data collection, however, continued until February 10th, at which point the extension automatically uninstalled itself.
This marked the end of the operational phase of the experiment. I immediately went on vacation.
Data analysis
In March we began data analysis in earnest. This involved things that normally aren't discussed much, like building out our team to handle the amount of work involved, which in turn required ethics protocol amendments and inter-university data-sharing agreements so that our new collaborators could access this very private data. Meanwhile, merely extracting clean data files and creating data dictionaries took a month of work.
In addition to our relatively small amount of survey data, we ended up with a huge data set of almost 200 million events: posts seen, posts engaged with, posts written, and more.
We're now deep in analysis of this treasure trove of data — everything that ~6,000 social media users saw and did on three platforms for roughly three months before and three months after the Nov 2024 election.
Part of this is careful statistical programming to do the difference-in-difference estimations that we preregistered. That is painstaking manual work, and the draft appendix of detailed results is growing. Another part is running nearly 200 million text posts through a suite of classifiers, so that we can create quantitative summaries of what people actually saw — and ultimately how it affected them. None of this is conceptually complicated, but it’s a lot of data wrangling and it all has to be done to the highest standards of scientific integrity, privacy and security.
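For readers unfamiliar with the method, the core of a difference-in-differences estimate is simple: compare how the treated group changed over time against how the control group changed over the same period. Here is a minimal sketch with entirely hypothetical numbers (the function name and the example values are illustrative, not the study's actual code or data):

```python
# Illustrative 2x2 difference-in-differences estimate.
# All numbers below are hypothetical, not results from the experiment.
def did_estimate(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """DiD = (change in treatment group) - (change in control group)."""
    return (treat_post - treat_pre) - (ctrl_post - ctrl_pre)

# Hypothetical mean affective-polarization scores, pre- and post-treatment:
effect = did_estimate(treat_pre=42.0, treat_post=39.5,
                      ctrl_pre=41.0, ctrl_post=40.5)
print(effect)  # -2.0: polarization fell 2 points more in the treatment group
```

The real preregistered analysis is of course far richer (covariates, clustered standard errors, multiple algorithms and platforms), but every variant reduces to this treated-change-minus-control-change comparison.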
Results
Unfortunately we're not ready to publicly announce any results yet, in part because we are still making absolutely sure our calculations are correct, and in part because doing so would prevent us from publishing in major journals such as Nature or Science.
But I will say that what we have seen so far is promising! We saw reductions in affective polarization for all of the algorithms tested, and two of these reached statistical significance. We are not ready to publish numbers yet, but the effect size is in line with previous research: small, but we would expect greater effects if such algorithms were deployed more widely (for example because of network effects, and because we weren't able to deploy on mobile).
The other big result is that while we saw the expected decreases in engagement for most algorithms and platforms, it looks like we robustly increased engagement in some cases. This mixed result is very significant from a policy perspective: it is not necessarily true that doing the right thing for society will have negative business effects.
The experiment worked. This is the first time anyone has directly shown that algorithmic changes on real platforms with real users can decrease political polarization.
We hope to have a draft paper done in the next few months.
A huge team brought us to this result! Besides the core research group, we are grateful for the contributions of the ranker teams (both the ones we ran and the ones we couldn’t), the judges, our cracked engineering team, administrative and operational support from several universities and non-profits, and of course our funders. Thank you all for your help — just a little more patience, and we’ll be there.