[DeepSeek-OCR Rebuttal] Optical Context Compression Is Just (Bad) Autoencoding

https://arxiv.org/abs/2512.03643
1•atbhtunnm•1h ago

Comments

atbhtunnm•1h ago
Author here. I started this project after reading the earlier threads on DeepSeek-OCR [1][2]. I got really excited about "vision for context compression," but after reading their paper, a couple of things were bugging me.

They show good OCR results (image → text), but the pitch is context compression (text → image → text). They never actually test that pipeline. So I implemented it: render text, compress to vision tokens, reconstruct. Then I compared against just compressing the text embeddings directly. Mean pooling (averaging embeddings in a sliding window) nearly matched DeepSeek-OCR. A small conv encoder crushed both.
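For readers unfamiliar with the mean-pooling baseline mentioned above, here is a minimal sketch of averaging embeddings in a sliding window. It is an illustration only, not the paper's code: the function name `mean_pool`, the `window` and `stride` parameters, and the default of non-overlapping windows are all assumptions made for this example.

```python
from typing import List, Optional

Vector = List[float]


def mean_pool(embeddings: List[Vector], window: int,
              stride: Optional[int] = None) -> List[Vector]:
    """Compress a sequence of token embeddings by averaging each window.

    Hypothetical helper: `window` and `stride` are illustrative parameters,
    not values taken from the paper. With stride == window (the default),
    the sequence shrinks by roughly a factor of `window`.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    stride = window if stride is None else stride
    if stride <= 0:
        raise ValueError("stride must be positive")

    pooled: List[Vector] = []
    for start in range(0, len(embeddings), stride):
        # The final window may be shorter than `window`; average what is there.
        chunk = embeddings[start:start + window]
        dim = len(chunk[0])
        pooled.append([sum(vec[d] for vec in chunk) / len(chunk)
                       for d in range(dim)])
    return pooled


# Five 2-dimensional "token embeddings" compressed with a window of 2.
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]
compressed = mean_pool(tokens, window=2)
print(len(tokens), "->", len(compressed), "vectors")
```

The pooling step here has no learned parameters of its own, which is what makes it a useful reference point: any gain a vision encoder shows over it at the same compression ratio has to come from something beyond plain averaging.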

OK, so fine, maybe vision isn't special for reconstruction. But maybe the path matters more than the destination. Do the representations learned through vision work better for language modeling? I fine-tuned the checkpoints from the reconstruction experiments for next-token prediction. Vision and mean pooling couldn't beat truncation, but the conv encoder could. I didn't do any architecture search. It just worked.

That said, this is preliminary work. I just wanted to answer the obvious next questions. So far, the findings don't support the "vision for context compression" narrative.

Happy to answer questions.

[1] https://news.ycombinator.com/item?id=45640594
[2] https://news.ycombinator.com/item?id=45658928
