O(1) Context Retrieval for Agents Using Weightless Neural Networks

7•aperi•1mo ago

Comments

aperi•1mo ago

Hi HN, I am Anil and I am building Rice (https://tryrice.com), a low latency context orchestration layer for AI agents.

Rice replaces the standard HNSW vector search with Weightless Neural Networks (WNNs) to enable O(1) retrieval. It is designed specifically for real-time voice agents and high-frequency multi-agent workflows.

The problem we ran into while building voice agents was simple: Latency kills immersion.

Between STT (Speech-to-Text), the LLM inference, and TTS (Text-to-Speech), we had a strict latency budget. Spending 200ms+ on a Vector DB lookup (plus reranking) was eating up too much of that budget. On top of that, we found that stateless RAG meant our agents were constantly hallucinating permissions and accessing data they shouldn't, or failing to remember a constraint set by another agent 10 seconds ago.

The industry standard is to throw everything into Pinecone or pgvector and handle the logic in the application layer. That works for chatbots, but for autonomous agents that need mutable memory (read/write state 50 times a minute), standard vector indexes are too heavy and slow to update.

Rice is our attempt to fix the Working Memory problem.

Under the hood:

Rice is an indexing and state management engine that sits between your LLM and your data. Instead of using HNSW graphs (where search cost grows roughly as O(log N) with the number of items), we rely on Weightless Neural Networks (similar to WiSARD architectures).
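For readers who have not met WiSARD before, here is a minimal sketch of a single WiSARD-style discriminator. It is not Rice's code: the class name, the `tuple_size` parameter, and the use of Python sets as RAM nodes are all assumptions made for illustration. The idea is that a binary input is split into small tuples of bit positions, each tuple addresses its own RAM node, training just marks the addressed cells, and scoring counts how many RAM nodes recognise their address.

```python
import random


class Discriminator:
    """Toy WiSARD-style discriminator (illustrative only, hypothetical names)."""

    def __init__(self, input_bits, tuple_size, seed=0):
        # Randomly assign input bit positions to fixed-size tuples.
        order = list(range(input_bits))
        random.Random(seed).shuffle(order)
        self.tuples = [
            order[i:i + tuple_size] for i in range(0, input_bits, tuple_size)
        ]
        # One "RAM node" per tuple; a set stands in for the addressable cells.
        self.rams = [set() for _ in self.tuples]

    def _addresses(self, bits):
        # Each tuple of positions selects the bits that form one RAM address.
        for positions in self.tuples:
            yield tuple(bits[p] for p in positions)

    def train(self, bits):
        # Training has no weights: it only marks the addressed cell in each RAM.
        for ram, addr in zip(self.rams, self._addresses(bits)):
            ram.add(addr)

    def score(self, bits):
        # Count the RAM nodes that have seen this address before.
        return sum(
            addr in ram for ram, addr in zip(self.rams, self._addresses(bits))
        )
```

Training and scoring each touch a fixed number of RAM nodes, so their cost depends on the input width and not on how many patterns have been stored, which is the property the post is leaning on.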

- Deep Semantic Hashing: We train a lightweight model to compress dense embeddings into sparse binary codes while preserving semantic relationships.
- O(1) Lookup: These binary codes are mapped directly to memory addresses. This effectively turns "Search" into a hash table lookup.
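The two steps above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not Rice's implementation: random hyperplane projections (a classic locality-sensitive hashing trick) stand in for the trained hashing model, the codes are dense and not sparse, and `HashIndex`, `binary_code`, and `make_hyperplanes` are hypothetical names.

```python
import random
from collections import defaultdict


def make_hyperplanes(dim, bits, seed=0):
    # Assumption: random Gaussian hyperplanes replace the trained model.
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]


def binary_code(vec, planes):
    # One bit per hyperplane: which side of the plane the embedding falls on.
    code = 0
    for plane in planes:
        dot = sum(v * p for v, p in zip(vec, plane))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


class HashIndex:
    """Binary codes used directly as hash-table keys."""

    def __init__(self, dim, bits=16, seed=0):
        self.planes = make_hyperplanes(dim, bits, seed)
        self.buckets = defaultdict(list)

    def add(self, item_id, vec):
        # Insert is a single bucket append; nothing is rebuilt.
        self.buckets[binary_code(vec, self.planes)].append(item_id)

    def lookup(self, vec):
        # Cost depends on code length and bucket size, not on total items.
        return list(self.buckets.get(binary_code(vec, self.planes), []))


index = HashIndex(dim=3)
index.add("note-1", [1.0, 2.0, -0.5])
same_direction = index.lookup([2.0, 4.0, -1.0])
opposite_direction = index.lookup([-1.0, -2.0, 0.5])
```

Here `same_direction` finds `note-1` because scaling a vector does not change which side of each hyperplane it lies on, while `opposite_direction` lands in a different bucket. An exact-bucket lookup like this misses neighbours whose code differs by even one bit, so a real system would need something extra (multiple tables, probing nearby codes, or a learned code that clusters better); the post does not say which approach Rice takes.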

The Result: Retrieval latency stays flat (<50ms) even as your context grows to millions of items, and updates to the memory state are instant (no reindexing penalty).

We wrap this WNN core in a State Machine that handles Access Control (ACLs). When an Agent requests context, Rice checks the identity and state before the retrieval, ensuring you don't leak data between users or agents. Think of it as "Supabase for Agent Context", a managed backend that handles the memory graph and security policies so you don't have to write raw SQL RLS queries for every RAG call.
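As a rough picture of "check identity before retrieval", here is a minimal sketch in which the policy check runs before the store is touched at all. Everything in it is hypothetical: `GuardedMemory`, the role names, and the namespace-to-roles policy format are invented for illustration and say nothing about Rice's actual API or policy model.

```python
class GuardedMemory:
    """Toy shared memory where access is checked before any read or write."""

    def __init__(self, policy):
        # policy maps a namespace to the set of roles allowed to use it.
        self._policy = policy
        self._data = {}

    def _check(self, role, namespace):
        # Deny by default: unknown namespaces have no allowed roles.
        if role not in self._policy.get(namespace, set()):
            raise PermissionError(f"{role!r} may not access {namespace!r}")

    def write(self, role, namespace, key, value):
        self._check(role, namespace)
        self._data[(namespace, key)] = value

    def read(self, role, namespace, key):
        # The check happens first, so a denied caller never reaches the data.
        self._check(role, namespace)
        return self._data.get((namespace, key))


memory = GuardedMemory({
    "customer_state": {"sales_agent", "support_agent"},
    "salaries": {"hr_agent"},
})
memory.write("sales_agent", "customer_state", "mood", "frustrated")
mood_seen_by_support = memory.read("support_agent", "customer_state", "mood")
```

Because both agents read and write the same store, the support agent sees the sales agent's update on its next read, and a role that is not listed for a namespace gets a `PermissionError` instead of data.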

Where we are now

Rice is currently in closed beta/alpha. We are working with a few design partners in the voice and support automation space who need that sub-100ms retrieval speed.

We know using WNNs for semantic search is a contrarian bet compared to the massive investment in Vector DBs. We are specifically optimizing for "Hot State" (short term, high velocity memory) rather than "Cold Storage" (archival knowledge), though the lines are blurring.

Use Cases we are seeing:

- Voice Agents: Shaving 200ms off RAG latency to make conversation feel natural.
- Multi-Agent Hand-offs: Agent A (Sales) updates a "Customer Mood" state, and Agent B (Support) sees it instantly without hallucinating.
- Internal Tools: Enforcing strict ACLs (e.g., "Junior Devs can't query the Salary Table") at the infrastructure layer.

We are looking for engineers who are pushing the limits of agent latency or struggling with state management to try it out and tell us where it breaks.

I’m especially interested in hearing your skepticism on the WNN approach - we know it’s weird, but for our specific constraints, the speed tradeoff has been worth it.

ob_mobly•1mo ago

Interesting take on the matter. Joined the waitlist, would like to see it in action.
