80.1% on LoCoMo Long-Term Memory Benchmark with a pure open-source RAG pipeline

1•ViktorKuz•2h ago
I just pushed the current SOTA on the LoCoMo long-term memory benchmark for agents: 80.1% accuracy, using only:

-BGE-large-en-v1.5 (1024d) + FAISS

-Custom “MCA” gravitational ranking (keyword coverage + importance + frequency)

-BM25 sparse retrieval

-Direct Cross-Encoder reranking (bge-reranker-v2-m3) on the full union (~120-150 docs)

-GPT-4o-mini only for final answer generation and judging (everything else is open weights or classic)
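To make the retrieval stage concrete, here is a minimal sketch of how a dense candidate list and a BM25 candidate list could be merged into one union before reranking. It is not the code from the repo: the tokenizer, the `k1`/`b` defaults, and the helper names (`bm25_scores`, `top_k`, `candidate_union`) are assumptions for illustration, and the dense scores are passed in as plain numbers rather than computed with BGE + FAISS.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; a real system does more)."""
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in set(tokenize(query)):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def top_k(scores, k):
    """Indices of the k highest scores, best first (ties keep input order)."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def candidate_union(dense_scores, sparse_scores, k):
    """Union of the top-k dense and top-k sparse hits, de-duplicated."""
    merged = top_k(dense_scores, k) + top_k(sparse_scores, k)
    return list(dict.fromkeys(merged))  # dict keeps first-seen order


docs = [
    "alice moved to paris in 2021",
    "bob likes green tea",
    "alice adopted a cat named milo",
]
sparse = bm25_scores("where did alice move", docs)
dense = [0.2, 0.9, 0.1]  # hypothetical similarity scores, not real embeddings
union = candidate_union(dense, sparse, k=1)
```

In this toy run the dense side and the sparse side each contribute a different document, which is the point of taking a union: the reranker then sees candidates that either retriever alone would have missed.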

Repo: https://github.com/vac-architector/VAC-Memory-System

Key tricks that finally broke 80%:

-MCA-first filter (coverage ≥ 0.1 → top-30) — catches exact-keyword questions early

-Feeding the entire union straight into Cross-Encoder (112–135 documents) instead of pre-filtering

-Proper query instruction for BGE-large (the classic “Represent this sentence for searching relevant passages”)
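A rough sketch of two of the tricks above, the coverage-first filter and the BGE query instruction. The post does not give the full MCA formula (it also mixes in importance and frequency), so this models only the keyword-coverage part; the function names and the whitespace tokenization are assumptions, not the repo's API. The instruction string is the one BGE v1.5 documents for queries; passages are embedded without it.

```python
QUERY_INSTRUCTION = "Represent this sentence for searching relevant passages: "


def with_query_instruction(query):
    """Prefix the query only; passages are embedded without an instruction."""
    return QUERY_INSTRUCTION + query


def keyword_coverage(query_keywords, doc_text):
    """Fraction of distinct query keywords that occur in the document."""
    keywords = {k.lower() for k in query_keywords}
    if not keywords:
        return 0.0
    doc_tokens = set(doc_text.lower().split())
    return len(keywords & doc_tokens) / len(keywords)


def coverage_first_filter(query_keywords, docs, min_coverage=0.1, limit=30):
    """Keep docs with coverage >= min_coverage, best first, at most `limit`."""
    scored = [(keyword_coverage(query_keywords, d), i) for i, d in enumerate(docs)]
    kept = [(c, i) for c, i in scored if c >= min_coverage]
    # Sort by coverage descending, then by original position for stable ties.
    kept.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in kept[:limit]]


docs = [
    "alice moved to paris",
    "bob likes tea",
    "alice likes paris",
]
keywords = ["alice", "paris", "moved"]
survivors = coverage_first_filter(keywords, docs)  # doc indices, best first
embed_input = with_query_instruction("where did alice move")
```

A threshold as low as 0.1 mostly drops documents that share no keywords with the question at all, so the filter acts as a cheap exact-match pass rather than a strict ranking step.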

The whole pipeline runs in under 3 s per query on a single RTX 4090. LoCoMo is currently the hardest public long-term memory benchmark (5,880 real human–agent conversations, multi-hop, temporal, negation, etc.).

Beating the official Mem0 baseline by ~12–14 pp with fully open components feels pretty good. I would love feedback, especially from people who are also grinding on agent memory systems.

My background: My path didn't start in an IT office, but in Columbus, Ohio, where I worked as a handyman after leaving my job on the cell towers. The decision came from necessity: I bought a powerful PC on installments and resolved to create something that would change my life.

I had no experience, but I had an idea. Using Claude CLI as my sole mentor, I focused on architecture, not syntax.

Over 4.5 months, I designed and built the VAC Memory System. To prove its value, I tested it on the toughest RAG benchmark, LoCoMo. Today, my system scores 80.1% overall and a phenomenal 87.78% in the "Commonsense" category.

This is more than just code; it is the result of faith in an idea. I showed that with modern tools it is possible to reach SOTA-level performance and build serious technology, regardless of your starting point. I look forward to your feedback.
