frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Open in hackernews

Ask HN: Idea) Autoregressive joint embedding predictor model

2•LarsDu88•4w ago
Originally posted on reddit (https://www.reddit.com/r/deeplearning/comments/1q8yfgw/idea_feedback_using_joint_embeddings_lejepa_to/), but in lieu of good forums for this kind of stuff, am reposting on HN for some feedback.

I've been brainstorming ideas recently, and one paper that caught my attention was Yann LeCunn's leJEPA paper. It claims to solve a large host of problems with joint embedding model training, and it had me thinking...

What if you simply replace the discrete tokenizer used by LLMs with joint embeddings, and make your autoregressive language model, a "predict the next latent embedding"

For example:

- Write some software to convert text to images where every 8x8 block (or maybe 16x16?) contains a character or whitespace. Can incorporate augmentations like jitter and font changes. - Train a leJEPA VIT model on generated text "images" using SSL to create embeddings from these "images"

- Freeze the leJEPA trained VIT embedding model, and use it as a frozen embedding layer for an autoregressive transformer based model that "predicts the next embedding"

- With the embedding model and the autoregressive latent predictor frozen, train a decoder that translates embeddings into discrete tokenized text.

I can see the following benefits:

- No discrete tokenizer for input

- Autoregressive latent predictor model quickly outputs full image scale concepts rather than individual discrete tokens and can be run asynchronously very quickly compared to the embedding -> discrete text model

- Cohesive multimodality built in... text-free images are still images that can result in latents, perhaps with finetuning on pure image datasets.

In my mind this would be more akin to how humans think - with far superior image recall than text sequence recall and thinking abstractly before speaking or typing language.

Πfs – The Data-Free Filesystem

https://github.com/philipl/pifs
1•ravenical•2m ago•0 comments

Go-busybox: A sandboxable port of busybox for AI agents

https://github.com/rcarmo/go-busybox
1•rcarmo•3m ago•0 comments

Quantization-Aware Distillation for NVFP4 Inference Accuracy Recovery [pdf]

https://research.nvidia.com/labs/nemotron/files/NVFP4-QAD-Report.pdf
1•gmays•4m ago•0 comments

xAI Merger Poses Bigger Threat to OpenAI, Anthropic

https://www.bloomberg.com/news/newsletters/2026-02-03/musk-s-xai-merger-poses-bigger-threat-to-op...
1•andsoitis•4m ago•0 comments

Atlas Airborne (Boston Dynamics and RAI Institute) [video]

https://www.youtube.com/watch?v=UNorxwlZlFk
1•lysace•5m ago•0 comments

Zen Tools

http://postmake.io/zen-list
1•Malfunction92•8m ago•0 comments

Is the Detachment in the Room? – Agents, Cruelty, and Empathy

https://hailey.at/posts/3mear2n7v3k2r
1•carnevalem•8m ago•0 comments

The purpose of Continuous Integration is to fail

https://blog.nix-ci.com/post/2026-02-05_the-purpose-of-ci-is-to-fail
1•zdw•10m ago•0 comments

Apfelstrudel: Live coding music environment with AI agent chat

https://github.com/rcarmo/apfelstrudel
1•rcarmo•11m ago•0 comments

What Is Stoicism?

https://stoacentral.com/guides/what-is-stoicism
3•0xmattf•12m ago•0 comments

What happens when a neighborhood is built around a farm

https://grist.org/cities/what-happens-when-a-neighborhood-is-built-around-a-farm/
1•Brajeshwar•12m ago•0 comments

Every major galaxy is speeding away from the Milky Way, except one

https://www.livescience.com/space/cosmology/every-major-galaxy-is-speeding-away-from-the-milky-wa...
2•Brajeshwar•12m ago•0 comments

Extreme Inequality Presages the Revolt Against It

https://www.noemamag.com/extreme-inequality-presages-the-revolt-against-it/
2•Brajeshwar•12m ago•0 comments

There's no such thing as "tech" (Ten years later)

1•dtjb•13m ago•0 comments

What Really Killed Flash Player: A Six-Year Campaign of Deliberate Platform Work

https://medium.com/@aglaforge/what-really-killed-flash-player-a-six-year-campaign-of-deliberate-p...
1•jbegley•13m ago•0 comments

Ask HN: Anyone orchestrating multiple AI coding agents in parallel?

1•buildingwdavid•15m ago•0 comments

Show HN: Knowledge-Bank

https://github.com/gabrywu-public/knowledge-bank
1•gabrywu•20m ago•0 comments

Show HN: The Codeverse Hub Linux

https://github.com/TheCodeVerseHub/CodeVerseLinuxDistro
3•sinisterMage•21m ago•2 comments

Take a trip to Japan's Dododo Land, the most irritating place on Earth

https://soranews24.com/2026/02/07/take-a-trip-to-japans-dododo-land-the-most-irritating-place-on-...
2•zdw•21m ago•0 comments

British drivers over 70 to face eye tests every three years

https://www.bbc.com/news/articles/c205nxy0p31o
25•bookofjoe•22m ago•9 comments

BookTalk: A Reading Companion That Captures Your Voice

https://github.com/bramses/BookTalk
1•_bramses•23m ago•0 comments

Is AI "good" yet? – tracking HN's sentiment on AI coding

https://www.is-ai-good-yet.com/#home
3•ilyaizen•24m ago•1 comments

Show HN: Amdb – Tree-sitter based memory for AI agents (Rust)

https://github.com/BETAER-08/amdb
1•try_betaer•24m ago•0 comments

OpenClaw Partners with VirusTotal for Skill Security

https://openclaw.ai/blog/virustotal-partnership
2•anhxuan•24m ago•0 comments

Show HN: Seedance 2.0 Release

https://seedancy2.com/
2•funnycoding•25m ago•0 comments

Leisure Suit Larry's Al Lowe on model trains, funny deaths and Disney

https://spillhistorie.no/2026/02/06/interview-with-sierra-veteran-al-lowe/
1•thelok•25m ago•0 comments

Towards Self-Driving Codebases

https://cursor.com/blog/self-driving-codebases
1•edwinarbus•25m ago•0 comments

VCF West: Whirlwind Software Restoration – Guy Fedorkow [video]

https://www.youtube.com/watch?v=YLoXodz1N9A
1•stmw•26m ago•1 comments

Show HN: COGext – A minimalist, open-source system monitor for Chrome (<550KB)

https://github.com/tchoa91/cog-ext
1•tchoa91•27m ago•1 comments

FOSDEM 26 – My Hallway Track Takeaways

https://sluongng.substack.com/p/fosdem-26-my-hallway-track-takeaways
1•birdculture•28m ago•0 comments