80.1% on LoCoMo Long-Term Memory Benchmark with a pure open-source RAG pipeline

2•ViktorKuz•18m ago
I just pushed the current SOTA on the LoCoMo long-term memory benchmark for agents: 80.1% accuracy using only:

- BGE-large-en-v1.5 (1024d) + FAISS

- Custom “MCA” gravitational ranking (keyword coverage + importance + frequency)

- BM25 sparse retrieval

- Direct Cross-Encoder reranking (bge-reranker-v2-m3) on the full union (~120–150 docs)

- GPT-4o-mini only for final answer generation and judging (everything else is open weights or classic)
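
To make the "union, then rerank" step concrete, here is a minimal sketch. It is not the code from the repo. The function names, the toy corpus, and `toy_overlap_score` are all hypothetical; the scorer is only a stand-in for a real cross-encoder such as bge-reranker-v2-m3, so the example runs without any model downloads.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def union_candidates(*ranked_lists: Iterable[str]) -> List[str]:
    """Merge ID lists from several retrievers, dropping duplicates.

    Order of first appearance is preserved so the result is deterministic.
    """
    seen = set()
    merged: List[str] = []
    for ranked in ranked_lists:
        for doc_id in ranked:
            if doc_id not in seen:
                seen.add(doc_id)
                merged.append(doc_id)
    return merged


def rerank(
    query: str,
    doc_ids: List[str],
    corpus: Dict[str, str],
    score_pair: Callable[[str, str], float],
    top_k: int = 10,
) -> List[Tuple[str, float]]:
    """Score every (query, passage) pair and return the top_k by score."""
    scored = [(doc_id, score_pair(query, corpus[doc_id])) for doc_id in doc_ids]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


def toy_overlap_score(query: str, passage: str) -> float:
    """Hypothetical stand-in for a cross-encoder: share of query words found in the passage."""
    q = set(query.lower().split())
    p = set(passage.lower().split())
    return len(q & p) / len(q) if q else 0.0


# Toy data (assumed for illustration only).
corpus = {
    "d1": "Alice adopted a dog in March",
    "d2": "Bob moved to Denver last year",
    "d3": "Alice named her dog Biscuit",
}
dense_hits = ["d3", "d1"]   # e.g. from a FAISS index over BGE embeddings
sparse_hits = ["d1", "d2"]  # e.g. from BM25
mca_hits = ["d3"]           # e.g. from the keyword-coverage ranker

candidates = union_candidates(mca_hits, dense_hits, sparse_hits)
top = rerank("what did alice name her dog", candidates, corpus, toy_overlap_score, top_k=2)
print(top)
```

In a real pipeline `score_pair` would wrap the reranker model, and the whole union would be scored in batches rather than one pair at a time.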

Repo: https://github.com/vac-architector/VAC-Memory-System

Key tricks that finally broke 80%:

- MCA-first filter (coverage ≥ 0.1 → top-30): catches exact-keyword questions early

- Feeding the entire union straight into the Cross-Encoder (112–135 documents) instead of pre-filtering

- Proper query instruction for BGE-large (the classic “Represent this sentence for searching relevant passages”)
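
The first and third tricks can be sketched roughly as below. This is an illustration under assumptions, not the repo's implementation: it covers only the keyword-coverage part of MCA (the importance and frequency terms are left out), the stopword list and function names are hypothetical, and the trailing ": " on the instruction string is my assumption about how the prefix is joined to the query.

```python
import re
from typing import Dict, List, Set, Tuple

# Instruction quoted in the post; the trailing ": " separator is an assumption.
BGE_QUERY_INSTRUCTION = "Represent this sentence for searching relevant passages: "

# Hypothetical stopword list, kept tiny for the example.
STOPWORDS = {"a", "an", "the", "did", "do", "does", "in", "of", "to",
             "what", "when", "who", "is", "was"}


def keywords(text: str) -> Set[str]:
    """Lowercase alphanumeric tokens with stopwords removed."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS}


def coverage(query: str, passage: str) -> float:
    """Fraction of the query's keywords that also appear in the passage."""
    q = keywords(query)
    if not q:
        return 0.0
    return len(q & keywords(passage)) / len(q)


def coverage_filter(
    query: str,
    corpus: Dict[str, str],
    min_coverage: float = 0.1,
    top_n: int = 30,
) -> List[Tuple[str, float]]:
    """Keep passages with coverage >= min_coverage, best first, capped at top_n."""
    scored = [(doc_id, coverage(query, text)) for doc_id, text in corpus.items()]
    kept = [item for item in scored if item[1] >= min_coverage]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:top_n]


def build_dense_query(query: str) -> str:
    """Prefix the query with the retrieval instruction before embedding it."""
    return BGE_QUERY_INSTRUCTION + query


# Toy data (assumed for illustration only).
corpus = {
    "d1": "Alice adopted a dog in March",
    "d2": "Bob moved to Denver last year",
    "d3": "Alice named her dog Biscuit",
}
question = "What did Alice name her dog?"
kept = coverage_filter(question, corpus)
dense_query = build_dense_query(question)
print(kept)
print(dense_query)
```

Only the query gets the instruction prefix; passages are embedded as-is.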

The whole pipeline runs in under 3 s per query on a single RTX 4090. LoCoMo is currently the hardest public long-term memory benchmark (5,880 real human–agent conversations, multi-hop, temporal, negation, etc.).

Beating the official Mem0 baseline by ~12–14 pp with fully open components feels pretty good. Would love feedback, especially from people who are also grinding on agent memory systems.

My background: my path didn't start in an IT office, but in Columbus, Ohio, where I worked as a handyman after leaving my job on the cell towers. The decision came from necessity: I bought a powerful PC on installments and resolved to create something that would change my life.

I had no experience, but I had an idea. Using Claude CLI as my sole mentor, I focused on architecture, not syntax.

Over 4.5 months of work, I designed and built the VAC Memory System. To prove its value, I tested it on the toughest RAG benchmark, LoCoMo. Today, my system shows an overall result of 80.1% and 87.78% in the "Commonsense" category.

This is more than just code; it is the result of faith in an idea. I showed that with modern tools it is possible to reach SOTA-level performance and build serious technology, regardless of your starting point. I look forward to your feedback.
