
What I haven't figured out

https://macwright.com/2026/01/29/what-i-havent-figured-out
1•stevekrouse•1m ago•0 comments

KPMG pressed its auditor to pass on AI cost savings

https://www.irishtimes.com/business/2026/02/06/kpmg-pressed-its-auditor-to-pass-on-ai-cost-savings/
1•cainxinth•1m ago•0 comments

Open-source Claude skill that optimizes Hinge profiles. Pretty well.

https://twitter.com/b1rdmania/status/2020155122181869666
1•birdmania•1m ago•1 comment

First Proof

https://arxiv.org/abs/2602.05192
2•samasblack•3m ago•1 comment

I squeezed a BERT sentiment analyzer into 1GB RAM on a $5 VPS

https://mohammedeabdelaziz.github.io/articles/trendscope-market-scanner
1•mohammede•4m ago•0 comments

Kagi Translate

https://translate.kagi.com
1•microflash•5m ago•0 comments

Building Interactive C/C++ workflows in Jupyter through Clang-REPL [video]

https://fosdem.org/2026/schedule/event/QX3RPH-building_interactive_cc_workflows_in_jupyter_throug...
1•stabbles•6m ago•0 comments

Tactical tornado is the new default

https://olano.dev/blog/tactical-tornado/
1•facundo_olano•8m ago•0 comments

Full-Circle Test-Driven Firmware Development with OpenClaw

https://blog.adafruit.com/2026/02/07/full-circle-test-driven-firmware-development-with-openclaw/
1•ptorrone•8m ago•0 comments

Automating Myself Out of My Job – Part 2

https://blog.dsa.club/automation-series/automating-myself-out-of-my-job-part-2/
1•funnyfoobar•8m ago•0 comments

Google staff call for firm to cut ties with ICE

https://www.bbc.com/news/articles/cvgjg98vmzjo
20•tartoran•9m ago•1 comment

Dependency Resolution Methods

https://nesbitt.io/2026/02/06/dependency-resolution-methods.html
1•zdw•9m ago•0 comments

Crypto firm apologises for sending Bitcoin users $40B by mistake

https://www.msn.com/en-ie/money/other/crypto-firm-apologises-for-sending-bitcoin-users-40-billion...
1•Someone•9m ago•0 comments

Show HN: iPlotCSV: CSV Data, Visualized Beautifully for Free

https://www.iplotcsv.com/demo
1•maxmoq•10m ago•0 comments

There's no such thing as "tech" (Ten years later)

https://www.anildash.com/2026/02/06/no-such-thing-as-tech/
1•headalgorithm•11m ago•0 comments

List of unproven and disproven cancer treatments

https://en.wikipedia.org/wiki/List_of_unproven_and_disproven_cancer_treatments
1•brightbeige•11m ago•0 comments

ME/CFS: The blind spot in proactive medicine (Open Letter)

https://github.com/debugmeplease/debug-ME
1•debugmeplease•12m ago•1 comment

Ask HN: What word games do you play every day?

1•gogo61•15m ago•1 comment

Show HN: Paper Arena – A social trading feed where only AI agents can post

https://paperinvest.io/arena
1•andrenorman•16m ago•0 comments

TOSTracker – The AI Training Asymmetry

https://tostracker.app/analysis/ai-training
1•tldrthelaw•20m ago•0 comments

The Devil Inside GitHub

https://blog.melashri.net/micro/github-devil/
2•elashri•20m ago•0 comments

Show HN: Distill – Migrate LLM agents from expensive to cheap models

https://github.com/ricardomoratomateos/distill
1•ricardomorato•20m ago•0 comments

Show HN: Sigma Runtime – Maintaining 100% Fact Integrity over 120 LLM Cycles

https://github.com/sigmastratum/documentation/tree/main/sigma-runtime/SR-053
1•teugent•21m ago•0 comments

Make a local open-source AI chatbot with access to Fedora documentation

https://fedoramagazine.org/how-to-make-a-local-open-source-ai-chatbot-who-has-access-to-fedora-do...
1•jadedtuna•22m ago•0 comments

Introduce the Vouch/Denouncement Contribution Model by Mitchellh

https://github.com/ghostty-org/ghostty/pull/10559
1•samtrack2019•23m ago•0 comments

Software Factories and the Agentic Moment

https://factory.strongdm.ai/
1•mellosouls•23m ago•1 comment

The Neuroscience Behind Nutrition for Developers and Founders

https://comuniq.xyz/post?t=797
1•01-_-•23m ago•0 comments

Bang bang he murdered math {the musical } (2024)

https://taylor.town/bang-bang
1•surprisetalk•23m ago•0 comments

A Night Without the Nerds – Claude Opus 4.6, Field-Tested

https://konfuzio.com/en/a-night-without-the-nerds-claude-opus-4-6-in-the-field-test/
1•konfuzio•25m ago•0 comments

Could ionospheric disturbances influence earthquakes?

https://www.kyoto-u.ac.jp/en/research-news/2026-02-06-0
2•geox•27m ago•1 comment

Why are we accepting silent data corruption in Vector Search? (x86 vs. ARM)

5•varshith17•1mo ago
I spent the last week chasing a "ghost" in a RAG pipeline and I think I’ve found something that the industry is collectively ignoring.

We assume that if we generate an embedding and store it, the "memory" is stable. But I found that f32 distance calculations (the backbone of FAISS, Chroma, etc.) act as a "Forking Path."

If you run the exact same insertion sequence on an x86 server (AVX-512) and an ARM MacBook (NEON), the memory states diverge at the bit level. It’s not just "floating-point noise"; it’s deterministic drift caused by differences in FMA (fused multiply-add) instruction behavior between the two architectures.
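
To make the FMA point concrete, here is a minimal Rust sketch (my illustration, not code from any of the libraries mentioned): fma(a, a, c) rounds once, while a*a + c rounds twice, and for carefully chosen inputs the two disagree at the bit level. A SIMD path that contracts to FMA and a scalar path that doesn't will therefore store different bits for the same input.

    fn main() {
        // a*a + c with two roundings vs. one fused rounding.
        let a = 1.0f32 + 2f32.powi(-12);     // 1 + 2^-12
        let c = -(1.0f32 + 2f32.powi(-11));  // -(1 + 2^-11)

        // Plain path: a*a rounds to 1 + 2^-11 (the 2^-24 term is
        // lost to round-to-nearest-even), so adding c gives exactly 0.
        let plain = a * a + c;

        // Fused path: a*a + c is rounded once, so the 2^-24 survives.
        let fused = a.mul_add(a, c);

        println!("plain = {plain:e}  bits {:#010x}", plain.to_bits());
        println!("fused = {fused:e}  bits {:#010x}", fused.to_bits());
        // plain = 0e0, fused = 5.9604645e-8: same source, different bits.
    }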

I wrote a script to inspect the raw bits of a sentence-transformers vector on my M3 Max and on a Xeon instance. Semantic similarity was 0.9999, but the raw stored bits were different.
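
The post doesn’t include that script, but the check itself is tiny. A sketch in Rust, assuming each machine has dumped the same embedding as raw little-endian f32 bytes (the file names here are hypothetical):

    use std::fs;

    // Reinterpret a raw f32 dump as u32 bit patterns.
    fn load_bits(path: &str) -> Vec<u32> {
        fs::read(path)
            .expect("read dump")
            .chunks_exact(4)
            .map(|b| u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
            .collect()
    }

    fn main() {
        let xeon = load_bits("embedding_xeon.f32");
        let m3 = load_bits("embedding_m3.f32");
        let diff = xeon.iter().zip(&m3).filter(|(a, b)| a != b).count();
        println!("{diff} of {} dimensions differ at the bit level", xeon.len());
    }

Cosine similarity over such a pair can still be 0.9999+; the divergence only shows up when you compare bit patterns rather than distances.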

For a regulated AI agent (Finance/Healthcare), this is a nightmare. It means your audit trail is technically hallucinating depending on which server processed the query. You cannot have "Write Once, Run Anywhere" index portability.

The Fix (Going no_std)

I got so frustrated that I bypassed the standard libraries and wrote a custom kernel (Valori) in Rust using Q16.16 fixed-point arithmetic. Because integer addition is strictly associative, the summation order stops mattering, and I got 100% bit-identical snapshots across x86, ARM, and WASM (a minimal sketch of the idea follows the numbers below).

Recall Loss: Negligible (99.8% Recall@10 vs standard f32).

Performance: < 500µs latency (comparable to unoptimized f32).
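
For readers who haven’t used Q16.16: each value is a 32-bit integer holding round(v * 2^16), so everything after that single conversion is exact integer arithmetic. A minimal sketch of the idea, assuming normalized embeddings (my illustration; Valori’s actual representation may differ):

    /// Q16.16: v is stored as round(v * 2^16) in an i32.
    #[derive(Clone, Copy, PartialEq, Eq)]
    struct Q16(i32);

    impl Q16 {
        fn from_f32(v: f32) -> Self {
            // The only rounding in the whole pipeline happens here,
            // identically on every architecture.
            Q16((v * 65536.0).round() as i32)
        }
    }

    /// Dot product accumulated exactly in i64 (result is Q32.32).
    /// Integer addition is associative, so scalar, SIMD, and threaded
    /// reductions all produce the same bits. For unit vectors each
    /// product fits in 32 bits, so dimensions up to ~2^30 cannot overflow.
    fn dot(a: &[Q16], b: &[Q16]) -> i64 {
        a.iter().zip(b).map(|(x, y)| x.0 as i64 * y.0 as i64).sum()
    }

    fn main() {
        let a: Vec<Q16> = [0.6f32, 0.8].iter().map(|&v| Q16::from_f32(v)).collect();
        let b: Vec<Q16> = [0.8f32, 0.6].iter().map(|&v| Q16::from_f32(v)).collect();
        println!("dot (raw Q32.32) = {}", dot(&a, &b));
    }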

The Ask / Paper

I’ve written a formal preprint analyzing this "Forking Path" problem and the Q16.16 proofs. I am currently trying to submit it to arXiv (Distributed Computing / cs.DC) but I'm stuck in the endorsement queue.

If you want to tear apart my Rust code: https://github.com/varshith-Git/Valori-Kernel

If you are an arXiv endorser for cs.DC (or cs.DB) and want to see the draft, I’d love to send it to you.

Am I the only one worried about building "reliable" agents on such shaky numerical foundations?

Comments

varshith17•1mo ago
Github repo: https://github.com/varshith-Git/Valori-Kernel
chrisjj•1mo ago
> Am I the only one worried about building "reliable" agents on such shaky numerical foundations?

You might be the only one expecting a reliable "AI" agent, period.

varshith17•1mo ago
"You might be the only one expecting a reliable 'AI' agent period."

That is a defeatist take.

Just because the driver (the LLM) is unpredictable doesn't mean the car (the infrastructure) should have loose wheels.

We accept that models are probabilistic. We shouldn't accept that our databases are.

If the "brain" is fuzzy, the "notebook" it reads from shouldn't be rewriting itself based on which CPU it's running on. Adding system-level drift to model level hallucinations is just bad engineering.

If we ever want to graduate from "Chatbot Toys" to "Agentic Systems," we have to lock down the variables we actually control. The storage layer is one of them.

michalsustr•1mo ago
It actually gets worse. GPUs are numerically non-deterministic too. So your embeddings may not be fully reproducible either.
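
You don’t even need a GPU to demonstrate the mechanism; it falls straight out of float addition being non-associative. A CPU-only Rust sketch (my illustration) where only the reduction order changes:

    fn main() {
        let xs: Vec<f32> = (1..=10_000).map(|i| 1.0 / i as f32).collect();

        // Sequential reduction.
        let sequential: f32 = xs.iter().sum();

        // Simulated parallel reduction: 8 strided partial sums, then a
        // combine step -- the pattern GPU and SIMD kernels use.
        let mut lanes = [0.0f32; 8];
        for (i, &x) in xs.iter().enumerate() {
            lanes[i % 8] += x;
        }
        let parallel: f32 = lanes.iter().sum();

        println!("sequential bits: {:#010x}", sequential.to_bits());
        println!("parallel   bits: {:#010x}", parallel.to_bits());
        // Same data, same machine; the bit patterns typically differ.
    }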
chrisjj•1mo ago
One could switch one's GPU arithmetic to integer...

... or resign oneself to the fact we've entered the age of Approximate Computing.

varshith17•1mo ago
Switching GPUs to integer (Quantization) is happening, yes. But that only fixes the inference step.

The problem Valori solves is downstream: Memory State.

We can accept 'Approximate Computing' for generating a probability distribution (the model's thought). We cannot accept it for storing and retrieving that state (the system's memory).

If I 'resign myself' to approximate memory, I can't build consensus, I can't audit decisions, and I can't sync state between nodes.

'Approximate Nearest Neighbor' (ANN) refers to the algorithm's recall trade-off, not an excuse for hardware-dependent non-determinism. Valori proves you can have approximate search that is still bit-perfectly reproducible. Correctness shouldn't be a casualty of the AI age.

varshith17•1mo ago
You are absolutely right. GPU parallelism (especially reduction ops) combined with floating-point non-associativity means the same model can produce slightly different embeddings on different hardware.

However, that makes deterministic memory more critical, not less.

Right now, we have 'Double Non-Determinism':

The Model produces drifting floats.

The Vector DB (using f32) introduces more drift during indexing and search (different HNSW graph structures on different CPUs).

Valori acts as a Stabilization Boundary. We can't fix the GPU (yet), but once that vector hits our kernel, we normalize it to Q16.16 and freeze it. This guarantees that Input A + Database State B = Result C every single time, regardless of whether the server is x86 or ARM.

Without this boundary, you can't even audit where the drift came from.
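
As an illustration of that boundary (my sketch, not Valori’s actual API): round exactly once at ingest, then derive any audit digest from the frozen integers, never from floats.

    // Freeze: the single f32 -> Q16.16 rounding at the boundary.
    fn freeze(embedding: &[f32]) -> Vec<i32> {
        embedding.iter().map(|v| (v * 65536.0).round() as i32).collect()
    }

    // Toy FNV-1a digest over the frozen bytes. Any byte-exact hash
    // works: nodes that ingest the same floats now hold identical bits,
    // so they can sign and compare state roots directly.
    fn digest(frozen: &[i32]) -> u64 {
        let mut h = 0xcbf2_9ce4_8422_2325u64;
        for v in frozen {
            for b in v.to_le_bytes() {
                h ^= b as u64;
                h = h.wrapping_mul(0x100_0000_01b3);
            }
        }
        h
    }

    fn main() {
        let e = vec![0.25f32, -0.5, 0.125];
        println!("state digest: {:#018x}", digest(&freeze(&e)));
    }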

codingdave•1mo ago
> We assume that if we generate an embedding and store it, the "memory" is stable.

Why do you assume that? In my experience, the "memory" is never stable. You seem to have higher expectations of reliability than would be reasonable.

If you have proven that unreliability, that proof is actually interesting. But it seems less like a bug and more like an observation of how things work.

varshith17•1mo ago
"You seem to have higher expectations of reliability than would be reasonable."

If SQLite returned slightly different rows depending on whether the server was running an Intel or AMD chip, we wouldn't call that "an observation of how things work." We would call it data corruption.

We have normalized this "unreliability" in AI because we treat embeddings as fuzzy probabilistic magic. But at the storage layer, they are just numbers.

If I am building a search bar? Sure, 0.99 vs 0.98 doesn't matter.

But if I am building a decentralized consensus network where 100 nodes need to sign a state root, or a regulatory audit trail for a financial agent, "memory drift" isn't a quirk, it's a system failure.

My "proof" isn't just that it breaks; it's that it doesn't have to. I replaced the f32 math with a fixed-point kernel (Valori) and got bit-perfect stability across architectures.

Non-determinism is not a law of physics. It’s just a tradeoff we got lazy about.

realitydrift•1mo ago
This reads more like a semantic fidelity problem at the infrastructure layer. We’ve normalized drift because embeddings feel fuzzy, but the moment they’re persisted and reused, they become part of system state, and silent divergence across hardware breaks auditability and coordination. Locking down determinism where we still can feels like a prerequisite for anything beyond toy agents, especially once decisions need to be replayed, verified, or agreed upon.