China Will Win at AI Because of Elsevier

3•markallenbattey•3h ago
AI models don’t just need raw text—they need deep, structured, peer-reviewed knowledge to reason about science, medicine, engineering, and more. But most of that knowledge in the West is locked behind paywalls run by publishers like Elsevier.

Elsevier doesn’t just sell access to human readers. It aggressively enforces licenses that prohibit text and data mining for machine learning. Even universities that pay for journal access often find their AI research groups barred from using that content to train models. The terms are clear: you can read the paper—but your model can’t.

Meanwhile, China ignores these restrictions. Its researchers operate with centralized access to nearly every major Western journal. In many cases, they use institutional mirrors, semi-legal repositories, or just direct scraping. Tools like Sci-Hub are quietly tolerated or integrated into internal systems. Whether legal or not, the outcome is clear: China’s models are learning from the full scientific corpus.

In the West, researchers are stuck paying Elsevier for access, and are still told they can't use it for machine learning unless they strike special deals, which are expensive, limited, or flatly denied.

Everyone talks about compute. But the real long-term advantage lies in training data. If China is feeding its models every scientific paper ever published, and Western models are trained on Reddit, Wikipedia, and scraped blogs—who's really ahead?

We’ve put up massive walls around our most valuable content and then told our own researchers to innovate with scraps. Elsevier’s copyright model was designed for print-era publishing—but it now acts as a national AI tax.

If AI is the new electricity, Elsevier is the dam. And China built a bypass.

P.S. I changed the text after seeing how the formatting here gets stripped.

Comments

incomingpain•3h ago
I can't say I'm that familiar with Mandarin, but I bet tokenization of their language, and understanding the language with its far more complex grammar, is going to make their LLMs much more challenging to produce.

English-speaking countries are going to have a mega advantage here.

VK538FY•13m ago
Chinese grammar, Mandarin or whatever, is surprisingly simple. It's the characters that are complex.
