And which are these universal human values and preferences? Or are we talking about Silicon Valley executives' values?
Depends how you look at it. I think a brand like Google should vet even just one level down the supply chain.
to ensure the AI models are more aligned with Google's values and preferences.
FTFY
> "Massive privacy invasion: The core of modern adtech runs on tracking your behavior across different websites and apps. It collects vast amounts of personal data to build a detailed profile about your interests, habits, location, and more, often without your full understanding or consent."
It does not have to have anything to do with cyberpunk. Corporations are not people, but if they were people, they would be powerful sociopaths. Their interests and anyone else's interests are not the same.
"AI raters at GlobalLogic are paid more than their data-labeling counterparts in Africa and South America, with wages starting at $16 an hour for generalist raters and $21 an hour for super raters, according to workers. Some are simply thankful to have a gig as the US job market sours, but others say that trying to make Google’s AI products better has come at a personal cost."
There are lots of jobs out there that suck and people do them anyway, because the freedom they supposedly have is not as free as you imagine.
How this industry managed to not grasp that meaning exists entirely separate from words is altogether bizarre.
I used to point to their reporting as models that my nation’s newspapers should seek to emulate.
How is this not a straight up lie? For this to be true they would have to throw away labeled training data.
It does so indirectly, so it's a true albeit misleading statement.
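A minimal sketch of what "indirectly" can mean here, assuming a standard reward-model / best-of-n setup (the thread doesn't say what pipeline Google actually uses, so every name and number below is illustrative): rater judgments train a separate reward model, and the generating model only ever sees that reward model's scores, never the raw labels.

    # Toy illustration (assumption: RLHF-style indirection), not anyone's actual pipeline.
    import numpy as np

    rng = np.random.default_rng(0)

    # 1. Human raters compare pairs of candidate responses and pick the better one.
    #    The 8-dim features are stand-ins for whatever representation a lab would use.
    def make_rater_data(n=500, dim=8):
        hidden_pref = rng.normal(size=dim)                          # what raters "actually" reward
        a, b = rng.normal(size=(n, dim)), rng.normal(size=(n, dim))
        label = (a @ hidden_pref > b @ hidden_pref).astype(float)   # 1 => a preferred over b
        return a, b, label

    # 2. Those labels train a *reward model* (pairwise logistic regression,
    #    Bradley-Terry style), not the language model itself.
    def train_reward_model(a, b, label, steps=2000, lr=0.1):
        w = np.zeros(a.shape[1])
        for _ in range(steps):
            p = 1.0 / (1.0 + np.exp(-((a - b) @ w)))
            w -= lr * (a - b).T @ (p - label) / len(label)
        return w

    # 3. The generating model is then steered by the reward model's scores
    #    (here via best-of-n selection), so rater labels reach it only indirectly.
    def best_of_n(candidates, reward_w):
        return candidates[np.argmax(candidates @ reward_w)]

    a, b, label = make_rater_data()
    reward_w = train_reward_model(a, b, label)
    candidates = rng.normal(size=(16, 8))                           # toy "sampled responses"
    print("reward of chosen response:", best_of_n(candidates, reward_w) @ reward_w)

Under that reading, a statement like "raters' labels don't directly train the model" can be literally true while still misleading: the labels shape the outputs through the reward signal.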
This doesn't sound as bad to me as the Facebook moderator job or even a call center job, but it does sound pretty tedious.
That’s sort of what I expect the Guardian’s UK online non-sub readers to make of it.
Perhaps GlobalLogic should open a subsidiary in the UK?
There are a whole lot of organizations training competent LLMs these days in addition to the big three (OpenAI, Google, Anthropic).
What about Mistral and Moonshot and Qwen and DeepSeek and Meta and Microsoft (Phi) and Hugging Face and Ai2 and MBZUAI? Do they all have their own (potentially outsourced) teams of human labelers?
I always look out for notes about this in model cards and papers but it's pretty rare to see any transparency about how this is done.
Given the number of labs that are competing these days on "open weights" and "transparency" I'd be very interested to read details of how some of them are handling the human side of their model training.
I'm puzzled at how little information I've been able to find.
To counter your question: what makes you think that's not the case? Do you think Mistral/Moonshot/Qwen/etc. are all employing their own data labelers? Why would you expect this kind of transparency from for-profit companies valued in the billions?
https://www.theverge.com/features/23764584/ai-artificial-int...
It explores the world of outsourced labeling work. Unfortunately, hard numbers on how many people are involved are difficult to come by because, as the article notes:
"This tangled supply chain is deliberately hard to map. According to people in the industry, the companies buying the data demand strict confidentiality. (This is the reason Scale cited to explain why Remotasks has a different name.) Annotation reveals too much about the systems being developed, and the huge number of workers required makes leaks difficult to prevent. Annotators are warned repeatedly not to tell anyone about their jobs, not even their friends and co-workers, but corporate aliases, project code names, and, crucially, the extreme division of labor ensure they don’t have enough information about them to talk even if they wanted to. (Most workers requested pseudonyms for fear of being booted from the platforms.) Consequently, there are no granular estimates of the number of people who work in annotation, but it is a lot, and it is growing. A recent Google Research paper gave an order-of-magnitude figure of “millions” with the potential to become “billions.” "
I too would love to know more about how much human effort goes into labeling and feedback for each of these models.
In that case, how is the notion of truthiness (what the model accepts as right or wrong) affected at this stage? That is, how much of it is shaped by the human raters versus sealed into the base model itself, with truthiness deduced by the training method as part of its world model?
Congratulations, you just described most jobs. And many backbreaking laborers make about the same or less, even in the U.S., not to mention the rest of the world.
kerblang•1h ago
jkkola•1h ago
[0] https://youtu.be/0bF_AQvHs1M?si=rpMG2CY3TxnG3EYQ
thepryz•1h ago
This was one of the first links I found re: Scale’s labor practices https://techcrunch.com/2025/01/22/scale-ai-is-facing-a-third...
Here’s another: https://relationaldemocracy.medium.com/an-authoritarian-work...
lawgimenez•1h ago
benreesman•1h ago
But the next paradigm breakthrough is hard to forecast, and the current paradigm's asymptote is just as hard to predict, so it's +EV to say "tomorrow" and "forever".
When the second becomes clear before the first, you turk and expert-label like it's 1988 and pray the next paradigm breakthrough comes soon: you bridge the gap with expert labeling and compute until it works, or until you run out of money and the DoD guy stops taking your calls. AI Winter is cold.
And just like Game of Thrones, no one, and I mean no one, not Altman, not Amodei, not Allah Most Blessed, knows when the seasons in A Song of Math and Grift will change.
jhbadger•1h ago