How Wikipedia dodged public outcry plaguing social media platforms

Pete Forsyth on 2018-08-24

Image from Max Pixel, available for unrestricted reuse (CC0).

Everybody has an opinion about how to govern social media platforms, mostly because the platforms have shown they’re not too good at governing themselves. We see headlines about which famous trolls are banned from what sites. Tech company executives are getting called before Congress, and the topic of how to regulate social media is getting play all over the news.

Wikipedia has problematic users and its share of controversies, but as web platforms have taken center stage in recent months, Wikipedia hasn’t been drawn into the fray. Why aren’t we hearing more about the site’s governance model, or its approach to harassment and bullying? Why isn’t there a clamor for Wikipedia to ease up on data collection? At the core, Wikipedia’s design and governance are rooted in carefully articulated values and policies, which underlie all decisions. Two specific aspects of Wikipedia inoculate it from some of the sharpest critiques endured by other platforms.

Wikipedia exists to battle fake news. That’s the whole point.

Wikipedia’s fundamental purpose is to present facts, verified by respected sources. That’s different from social media platforms, which have a more complex project: they need to maximize engagement, and get people to give up personal information and spend money with advertisers. Wikipedia’s core purpose involves battling things like propaganda and “fake news.” Other platforms are finding they need to retrofit their products to address misinformation, but battling fake news has been a central principle of Wikipedia since the early days.

1. Wikipedia lacks “personalization algorithms” that get other kids in trouble.

The “news feed” or “timeline” of sites like Facebook, Twitter, or YouTube is the source of much controversy, and of much talk of regulation. These platforms feed their users content based on…well, based on something. Any effort to anticipate what users will find interesting can be tainted by political spin or advertising interests. And the site operators keep their algorithms private: each social media company closely guards its algorithm as valuable intellectual property, even as it tinkers with and tests new versions.

That’s not how Wikipedia works. Wikipedia’s front page is the same for all users. Wikipedia’s volunteer editors openly deliberate about what content to feature. Controversies sometimes spring up, but even when they do, the decisions leading to them are transparent and open to public commentary.

Search within Wikipedia is governed by an algorithm. But relative to a Twitter feed, it’s fairly innocuous; when you search for something, there are probably only a handful of relevant Wikipedia articles, and they will generally come up in the search results. Much of the work that guides Wikipedia search is open, and is generated by Wikipedia’s user community: redirects, disambiguation pages, and “see also” links. And the MediaWiki software that drives the site, including the search function, is open source.

The “Knowledge Engine” was an aspiration for Wikipedia’s future that would have put an algorithmic approach at the site’s core. See the Signpost’s excellent coverage.

But even so, an ambitious Wikimedia Foundation executive tried to take bold action around the search algorithm a few years ago. The “Knowledge Engine” was conceived as a new central component of Wikipedia; artificial intelligence and machine learning would have taken a central role in the user experience. The plan was hatched with little regard for the values that drive the Wikipedia community, and was ultimately scuttled by a full-blown revolt by Wikipedia’s users and the Foundation’s staff. Would an algorithm-based approach to driving reader experience have exposed Wikipedia to the kind of aggressive scrutiny Twitter and Facebook now face? Perhaps the problems Wikipedia dodged in that tumultuous time were even bigger than imagined.

The Wikimedia Foundation’s fund-raising banners are driven by algorithms, too. These spark frequent debates, but even the design of those algorithms is somewhat transparent, and candid discussion about them is not unusual. Those of us who care deeply about Wikipedia’s reputation for honesty sometimes find significant problems with the fund-raising messages; but the impact of problems like these is limited to Wikipedia’s reputation, not the public’s understanding of major issues.

2. Wikipedia isn’t conspiring to track your every move.

The browser extension “Ghostery” tracks the trackers. Note the difference between Wikipedia and the Wall Street Journal.

Most web sites collect, use, and sell a tremendous amount of data about their users. They’ve gotten really sophisticated, and can surmise an incredible amount of information about us. But that’s a game that Wikipedia simply doesn’t play.

In 2010, the Wall Street Journal ran a series on how web sites use various technologies and business partnerships to track all kinds of information about their users. Journalists Julia Angwin and Ashkan Soltani were nominated for a Pulitzer Prize, and won the Loeb Award for Online Enterprise. The series is still relevant in 2018.

Even back then, coverage of the issue managed to neglect one vital fact: Wikipedia, unlike all the other top web sites, does not track your browsing history. The site barely captures any such information to begin with, and its operators don’t share it unless legally compelled. When the Electronic Frontier Foundation evaluated it in their “Who Has Your Back” report (and I’ll claim a little credit for their considering Wikipedia to begin with), the Wikimedia Foundation earned stellar marks.

Why Wikipedia’s principled design matters

Wikipedia is avoiding scandal thanks to two core aspects of how it functions: it doesn’t try to predict and guide what you encounter online, and it doesn’t capture and analyze user data.

It might be possible for social media platforms to constrain their approach to those activities enough to satisfy their critics. Just like it might be possible for a heroin addict to limit their use enough to function in society, or for a cabbie to minimize the possibility of a car wreck through attentive driving.

But it would have been safer for the heroin addict to avoid using heroin to begin with, or for the cabbie to have taken a desk job. That’s how it is with Wikipedia. The site has relentlessly kept its focus on its main goal of providing information, even to the exclusion of chasing money from advertisers or reselling user data.

One benefit of that clarity of vision among the designers and maintainers of Wikipedia is that we’ve been able to govern ourselves reasonably well. Which means the government and media pundits aren’t trying to do it for us.

— —

Followup article here: “Open” everything, and minimal financial needs: Wikipedia’s strengths

Originally published on August 24, 2018.