I was referring to the translations, which, while silly, don't seem like that much of an issue. After all, he provided the content in multiple languages (I know, I know).
'a user from the Tumbuka Wikipedia reported that they had initially felt "hope and joy that a small community had then gained another native editor", before finding out that this account had been a promotional sockpuppet.'
It looks like a user in the HN thread noticed the irregularities on the Italian Wikipedia [0] and started the deletion discussion [1] that the article credits with kickstarting this investigation.
[0]: https://news.ycombinator.com/item?id=44035222
[1]: https://it.wikipedia.org/wiki/Wikipedia:Pagine_da_cancellare...
Props to whatever HackerNewsian (YCombinist?) took the time to chase all this down and do this fascinating writeup! You will be remembered in /r/TodayILearned posts every few months for many decades to come, no doubt.
But for other subjects, for example science and mathematics, it does a huge disservice to non-English readers: it means that their Wikipedia is second-rate, or worse.
Wikipedia should, in science, mathematics, and other subjects that do not have cultural inflection, use machine translation so that all articles in all languages are translations of the same underlying semantic content.
It would still be written by humans. But ML / LLMs would be involved in the editing pipeline so that people lacking a common language can edit the same text.
This is the biggest mistake Wikipedia's made IMO: it privileges English readers since the English content is highest quality in most areas that are not culturally specific, and I do not think that it's an organization that wants to privilege English readers.
https://zh.wikipedia.org/wiki/%E7%BA%BF%E6%80%A7%E4%BB%A3%E6...
The problem, as I pointed out, is that readers of non-English pages are accessing a second-rate Wikipedia. For example: https://zh.wikipedia.org/wiki/%E8%B2%9D%E7%A5%96%E5%AE%9A%E7...
I prefer my Wikipedia to remain 100% human-generated quality information over garbage AI slop content, which is already abundant enough on the internet.
https://zh.wikipedia.org/wiki/%E8%B2%9D%E7%A5%96%E5%AE%9A%E7...
It reads "Bézout's theorem is a theorem in algebraic geometry that describes the number of intersections of two algebraic curves . The theorem states that the number of intersections of two coprime curves X and Y is equal to the product of their degrees."
which is perfectly good English. The problem is that that is the entire page! It is thus woefully inadequate in comparison to the English page:
?
The point is that the quality of a Wikipedia page is positively correlated with the number of editors working on it.
"This comment is well-meaning, but it is both naive and technically flawed in several key ways. Let’s unpack why it's wrong and even counterproductive, especially when it comes to topics like science and mathematics." ...snip snip... "TL;DR: The comment is naive because it overestimates the capabilities of machine translation for precise scientific knowledge, underestimates cultural context in science/math, and proposes a solution that would undermine Wikipedia’s decentralized, community-driven model. It wrongly frames linguistic diversity as a weakness instead of a strength."
1: https://gist.github.com/numpad0/2fcf3e61d57f07d8e3a65743a43b...
I find it interesting that the whole scheme might not have been noticed had he been more modest and not tried to translate the pages into rare languages. We don't know the motive, but if it was self-promotion, these additional languages were presumably of negligible value yet risked the scheme.
Sadly because it's 2025, he has a lot of competition for the award of "most insufferable douchebag".
It's quite unlikely for anybody to stumble upon any given English-language Wikipedia article by chance, given that there are millions of them now. Therefore, the promotional value of having a Wikipedia article on something, even in a popular language, is negligible. However, by spamming all the Wikipedias, and having this "scheme" discovered, Woodard created a situation where he is widely reported on as the artist that spammed Wikipedia, and has therefore received the five minutes of fame that he so desperately wanted.
If he had stuck to spamming the English Wikipedia, would he have ended up on the frontpage of HN?
Quietly having all these articles might be personally satisfying in some way, but his obvious appetite for fame or notoriety points toward him wanting the scheme to be exposed. In fact I would not be entirely surprised if he somehow instigated the discovery of his activities.
After a full month of coordinated, decentralised action, the number of articles about Mr. Woodard was reduced from 335 articles to 20. A full decade of dedicated self-promotion by an individual network has been undone in only a few weeks by our community.
from enum import IntEnum


class Namespace(IntEnum):
    # Virtual namespaces (negative IDs, no talk pages)
    MEDIA = -2
    SPECIAL = -1
    # Subject namespaces have even IDs; each talk namespace is subject + 1
    ARTICLE = 0
    TALK = 1
    USER = 2
    USER_TALK = 3
    WIKIPEDIA = 4
    WIKIPEDIA_TALK = 5
    FILE = 6
    FILE_TALK = 7
    MEDIAWIKI = 8
    MEDIAWIKI_TALK = 9
    TEMPLATE = 10
    TEMPLATE_TALK = 11
    HELP = 12
    HELP_TALK = 13
    CATEGORY = 14
    CATEGORY_TALK = 15
    PORTAL = 100
    PORTAL_TALK = 101
    DRAFT = 118
    DRAFT_TALK = 119
    MOS = 126
    MOS_TALK = 127
    TIMEDTEXT = 710
    TIMEDTEXT_TALK = 711
    MODULE = 828
    MODULE_TALK = 829
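A quick sketch of why the numbering above is handy: in MediaWiki, subject namespaces have even IDs and the matching talk namespace is the next odd number, so you can hop between a page and its talk page with arithmetic. The helper names here (`is_talk`, `talk_for`, `subject_for`) are my own for illustration, not part of any MediaWiki library.

```python
def is_talk(ns_id: int) -> bool:
    """Talk namespaces have odd, positive IDs."""
    return ns_id > 0 and ns_id % 2 == 1


def talk_for(ns_id: int) -> int:
    """Return the talk namespace ID paired with ns_id."""
    if ns_id < 0:
        # Media (-2) and Special (-1) are virtual and have no talk pages.
        raise ValueError("virtual namespaces have no talk namespace")
    return ns_id if is_talk(ns_id) else ns_id + 1


def subject_for(ns_id: int) -> int:
    """Return the subject namespace ID paired with ns_id."""
    return ns_id - 1 if is_talk(ns_id) else ns_id
```

So `talk_for(10)` (Template) gives 11 (Template talk), and `subject_for(829)` (Module talk) gives 828 (Module).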
Wikipedia is a downright fascinating technical environment once you find the rabbit hole. Shoutout to their purpose-built version control site[1] and their brand-new SWE-focused project "WikiFunctions"[2], the first new Wikimedia project in a decade! Which, while we're at it, brings the total to 18: wikipedia, wikibooks, wikinews, wikisource, wiktionary, wikiquote, wikiversity, wikivoyage, wikidata, wikifunctions, mediawiki, commons, species, foundation, meta, incubator, and phabricator. Ok I'm done with fun facts, I swear!
I don't use LinkedIn but when I stumble upon someone's page, I often see testimonies from their work colleagues about them.
Often when I search for startups and their founders I can't find information about them on Wikipedia but I find it on Golden.
If it is without permission, then it is illegal and people can sue; otherwise, web scraping is legal.
Wikipedia is supposed to be an encyclopaedia, which means it’s intended to come with some expectation of neutrality.
If you could edit your own page, do you really think it’d stay as factual and as neutral as possible?
Just make yourself a website.
And it isn't because of the self promoting described, but because of the response to it.
Deletionists are evil.
For example, the French article about David Hockney has a lovely Francophone twist in that the first few lines point out that he lived in Normandy for a few years, whereas English Wikipedia buries the fact deep in the page. The page for VLC has a photo of the lead dev in the French page but no discussion of the plugin architecture. And so on. It doesn't seem unreasonable to me to assume that the pages in some languages might be particularly strong if the topic plays a bigger role in the culture than in the English-speaking world.
In Italian, Spanish, and Tagalog it's the scientific name of the animal.
This makes sense in languages (like Spanish) where an animal may have a lot of different names depending on the country, region, or dialect. If you look at the article for Pig[2], you'll see at least fifteen names listed.
[1] https://en.wikipedia.org/wiki/African_bush_elephant
[2] https://es.wikipedia.org/wiki/Sus_scrofa_domestica
Previously, the ones trained on a thousand or more languages by Meta and Wycliffe used the Bible, since it's the only complex, rich message translated into most human languages. Which God said would happen to His authentic message. :)
https://interestingengineering.com/innovation/meta-used-bibl...
latexr•6mo ago
> I discovered what I think might have been the single largest self-promotion operation in Wikipedia’s history, spanning over a decade and covering as many as 200 accounts and even more proxy IP addresses.
decimalenough•5mo ago
If you want even more gruesome details, the story of how this all unraveled, plus all sorts of info about Woodard, a positively creepy white supremacist, can be found on the English article's talk page:
https://en.wikipedia.org/wiki/Talk:David_Woodard/Archive_1
https://en.wikipedia.org/wiki/Talk:David_Woodard
And with this anomaly removed, the list of articles in the most languages is back to what you'd expect: the top 10 is all large countries and Wikipedia itself.
https://en.wikipedia.org/w/index.php?title=Wikipedia:Wikiped...
brabel•5mo ago
drdeca•5mo ago
opan•5mo ago
latexr•5mo ago
I did, yes, that was a typo. I did notice it after the edit window was closed but the submission hadn’t had any traction so it felt silly to reply to my own comment to correct it.
Glad the submission was resurrected, I think it deserves it. My original comment was precisely to convince people to give it a read.
ViscountPenguin•5mo ago
madcaptenor•5mo ago
Another suspicious one on that list: the city of Kurów in Poland, population 2,725.