frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

OpenCiv3: Open-source, cross-platform reimagining of Civilization III

https://openciv3.org/
424•klaussilveira•5h ago•97 comments

Hello world does not compile

https://github.com/anthropics/claudes-c-compiler/issues/1
19•mfiguiere•40m ago•7 comments

The Waymo World Model

https://waymo.com/blog/2026/02/the-waymo-world-model-a-new-frontier-for-autonomous-driving-simula...
774•xnx•11h ago•472 comments

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

https://github.com/valdanylchuk/breezydemo
141•isitcontent•6h ago•15 comments

Monty: A minimal, secure Python interpreter written in Rust for use by AI

https://github.com/pydantic/monty
134•dmpetrov•6h ago•57 comments

Dark Alley Mathematics

https://blog.szczepan.org/blog/three-points/
41•quibono•4d ago•3 comments

A century of hair samples proves leaded gas ban worked

https://arstechnica.com/science/2026/02/a-century-of-hair-samples-proves-leaded-gas-ban-worked/
68•jnord•3d ago•4 comments

Show HN: I spent 4 years building a UI design tool with only the features I use

https://vecti.com
246•vecti•8h ago•117 comments

Microsoft open-sources LiteBox, a security-focused library OS

https://github.com/microsoft/litebox
313•aktau•12h ago•153 comments

Show HN: If you lose your memory, how to regain access to your computer?

https://eljojo.github.io/rememory/
177•eljojo•8h ago•124 comments

How we made geo joins 400× faster with H3 indexes

https://floedb.ai/blog/how-we-made-geo-joins-400-faster-with-h3-indexes
11•matheusalmeida•1d ago•0 comments

Sheldon Brown's Bicycle Technical Info

https://www.sheldonbrown.com/
311•ostacke•12h ago•85 comments

Hackers (1995) Animated Experience

https://hackers-1995.vercel.app/
396•todsacerdoti•13h ago•217 comments

An Update on Heroku

https://www.heroku.com/blog/an-update-on-heroku/
322•lstoll•12h ago•233 comments

PC Floppy Copy Protection: Vault Prolok

https://martypc.blogspot.com/2024/09/pc-floppy-copy-protection-vault-prolok.html
11•kmm•4d ago•0 comments

Show HN: R3forth, a ColorForth-inspired language with a tiny VM

https://github.com/phreda4/r3
48•phreda4•5h ago•8 comments

I spent 5 years in DevOps – Solutions engineering gave me what I was missing

https://infisical.com/blog/devops-to-solutions-engineering
109•vmatsiiako•11h ago•34 comments

How to effectively write quality code with AI

https://heidenstedt.org/posts/2026/how-to-effectively-write-quality-code-with-ai/
186•i5heu•8h ago•129 comments

Understanding Neural Network, Visually

https://visualrambling.space/neural-network/
236•surprisetalk•3d ago•31 comments

I now assume that all ads on Apple news are scams

https://kirkville.com/i-now-assume-that-all-ads-on-apple-news-are-scams/
975•cdrnsf•15h ago•415 comments

Learning from context is harder than we thought

https://hy.tencent.com/research/100025?langVersion=en
144•limoce•3d ago•79 comments

Introducing the Developer Knowledge API and MCP Server

https://developers.googleblog.com/introducing-the-developer-knowledge-api-and-mcp-server/
17•gfortaine•3h ago•2 comments

FORTH? Really!?

https://rescrv.net/w/2026/02/06/associative
41•rescrv•13h ago•17 comments

I'm going to cure my girlfriend's brain tumor

https://andrewjrod.substack.com/p/im-going-to-cure-my-girlfriends-brain
48•ray__•2h ago•11 comments

Evaluating and mitigating the growing risk of LLM-discovered 0-days

https://red.anthropic.com/2026/zero-days/
35•lebovic•1d ago•11 comments

Show HN: Smooth CLI – Token-efficient browser for AI agents

https://docs.smooth.sh/cli/overview
77•antves•1d ago•57 comments

Why I Joined OpenAI

https://www.brendangregg.com/blog/2026-02-07/why-i-joined-openai.html
50•SerCe•2h ago•41 comments

The Oklahoma Architect Who Turned Kitsch into Art

https://www.bloomberg.com/news/features/2026-01-31/oklahoma-architect-bruce-goff-s-wild-home-desi...
18•MarlonPro•3d ago•4 comments

Claude Composer

https://www.josh.ing/blog/claude-composer
108•coloneltcb•2d ago•70 comments

Show HN: Slack CLI for Agents

https://github.com/stablyai/agent-slack
39•nwparker•1d ago•10 comments
Open in hackernews

Embedding Text Documents with Qwen3

https://www.daft.ai/blog/embedding-millions-of-text-documents-with-qwen3
18•kiyanwang•5mo ago

Comments

xfalcox•5mo ago
Just migrated all embeddings to this same model a few weeks ago in my company, and it's a game changer. Having 32k context is a 64x increase when compared with our previous used model. Plus being natively multilingual and producing very standard 1024 long arrays made it a seamless transition even with millions of embeddings across thousands of databases.

I do recommend using https://github.com/huggingface/text-embeddings-inference for fast inference.

ipsum2•5mo ago
What does it mean to generate 1000 float16 array size on a 32k context? Surely the embedding you get is no longer representative of the text.
xfalcox•5mo ago
Depends on your needs. You surely don't want 32k long chunks for doing the standard RAG pipeline, that's for sure.

My use case is basically a recommendation engine, where retrieve a list of similar forum topics based on the current read one. As with dynamic user generated content, it can vary from 10 to 100k tokens. Ideally I would generate embeddings from an LLM generated summary, but that would increase inference costs considerably at the scale I'm applying it.

Having a larger possible context out of the box just made a simple swap of embeddeding models increase quality of recommendations greatly.

markerz•5mo ago
Why would you use sentence level chunking?

I’ve generated embedding for “objects” or whole documents to get similarity scores. Helps with “relevant articles” type features.

I’ve also made embeddings for paragraphs or fixed sized chunks for RAG lookups. Good for semantic search.

I don’t understand why you would want embeddings on sentences.

> Chunking Strategies

> Sentence-level chunking works well for most use cases, especially when the document structure is unclear or inconsistent.