I built a memory architecture that gives AI actual persistence across sessions. Not prompt engineering, not RAG retrieval - proper memory with episodic, semantic, and procedural layers.
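To make the layered-persistence idea concrete, here is a minimal sketch of a three-layer memory store that survives across sessions by writing to disk. The layer names follow the post; everything else (JSON storage, the `MemoryStore` class, its methods) is a hypothetical illustration, not CASCADE's actual implementation.

```python
import json
import os
import tempfile

# Hypothetical sketch: three memory layers persisted to disk so state
# survives across sessions. The JSON format is an assumption for
# illustration only.
class MemoryStore:
    LAYERS = ("episodic", "semantic", "procedural")

    def __init__(self, path):
        self.path = path
        if os.path.exists(path):
            with open(path) as f:
                self.layers = json.load(f)
        else:
            self.layers = {name: [] for name in self.LAYERS}

    def remember(self, layer, item):
        self.layers[layer].append(item)

    def save(self):
        with open(self.path, "w") as f:
            json.dump(self.layers, f)

# Session 1: store a memory and persist it.
path = os.path.join(tempfile.mkdtemp(), "memories.json")
store = MemoryStore(path)
store.remember("episodic", {"event": "user asked about Faiss", "t": 0})
store.save()

# Session 2: a fresh load sees the same state.
restored = MemoryStore(path)
print(restored.layers["episodic"][0]["event"])  # -> user asked about Faiss
```

The point of the split is that each layer can have its own retention policy: episodic entries can decay, semantic entries are deduplicated facts, procedural entries are learned how-to steps.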
The problem: Current AI "memory" is either expensive context window tricks or slow retrieval systems. Neither feels natural, and nothing persists across sessions in a meaningful way.
The solution: CASCADE - a 6-layer memory architecture running on consumer GPUs. Key specs:
- Sub-2ms semantic search across 11,000+ memories (Faiss + GPU acceleration)
- 9.68x computational amplification through optimization
- 95% GPU utilization (up from 8% baseline)
- True session-to-session persistence
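For readers unfamiliar with what the search layer actually computes, here is a toy, pure-Python cosine-similarity search. This is only the principle: the post's sub-2ms figure comes from Faiss with GPU acceleration, not from a brute-force loop like this, and the 3-d vectors stand in for real embeddings with hundreds of dimensions.

```python
import math

# Brute-force cosine similarity over toy vectors. Illustrative only;
# a real system would use Faiss (or similar) over learned embeddings.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def search(query, memories, k=1):
    """Return the k stored (text, vector) memories closest to the query."""
    ranked = sorted(memories, key=lambda m: cosine(query, m[1]), reverse=True)
    return ranked[:k]

# Toy "embeddings"; real embeddings are typically 384-1536 dimensions.
memories = [
    ("likes coffee",   [0.9, 0.1, 0.0]),
    ("owns a dog",     [0.0, 0.8, 0.2]),
    ("prefers Python", [0.1, 0.0, 0.9]),
]
best = search([1.0, 0.0, 0.1], memories, k=1)
print(best[0][0])  # -> likes coffee
```

Faiss replaces the O(n) scan with an index (flat, IVF, or HNSW) and batches queries on the GPU, which is where the latency numbers come from.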
Dual-tier release philosophy:
- Research Edition: Unrestricted access for power users who accept responsibility
- Enterprise Edition: Production-ready with input validation, SQL injection protection, rate limiting
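As a sketch of what two of those Enterprise-tier safeguards look like in practice, here are the standard patterns: parameterized queries for SQL injection protection and a token-bucket rate limiter. These are generic textbook versions under my own naming, not CASCADE's actual code.

```python
import sqlite3
import time

# 1. SQL injection protection: pass user input as a bound parameter so
#    it is treated as data, never as SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memories (id INTEGER PRIMARY KEY, text TEXT)")
user_input = "x'); DROP TABLE memories; --"
conn.execute("INSERT INTO memories (text) VALUES (?)", (user_input,))  # safe

# 2. Rate limiting: a minimal token bucket. Tokens refill at `rate` per
#    second up to `capacity`; each allowed request spends one token.
class TokenBucket:
    def __init__(self, rate, capacity):
        self.rate, self.capacity = rate, capacity
        self.tokens, self.last = float(capacity), time.monotonic()

    def allow(self):
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=5, capacity=2)  # 5 req/s steady, burst of 2
results = [bucket.allow() for _ in range(3)]
print(results)  # burst of 2 allowed, third call throttled
```

Input validation would sit in front of both: reject or normalize payloads before they ever reach the query layer.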
Why unrestricted matters: Researchers need tools without artificial limits. Enterprises need compliance-ready security. Both should exist.
Built in a basement lab on a single RTX 3090. All research, protocols, and code are open source (MIT).
Repo: https://github.com/For-Sunny/nova-mcp-research
Technical docs: See NOVA_MEMORY_ARCHITECTURE.md in the repo
We're not a product company - this is a research lab that produces practical tools. No VC funding, no customers, just genuine exploration into making AI memory work properly.
Trade-offs we document honestly:
- Research Edition requires trust (no guardrails)
- GPU acceleration needs decent hardware (RTX 3060 or better recommended)
- Windows-focused initially (Linux/macOS support coming)
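The hardware bar is lower than it might sound, and a back-of-envelope check shows why: the raw vector index for 11,000 memories is tiny. The 768-dim float32 embedding size below is my assumption (typical for sentence-transformer models), not a stated CASCADE spec.

```python
# Rough VRAM footprint of the flat vector index alone.
n_memories = 11_000
dims = 768           # assumed embedding dimension (not confirmed by the post)
bytes_per_float = 4  # float32

index_bytes = n_memories * dims * bytes_per_float
index_mb = index_bytes / 1024**2
print(f"{index_mb:.1f} MB")  # ~32 MB: a rounding error on a 12 GB RTX 3060
```

The real VRAM pressure comes from the embedding model and batch buffers, not the index itself, which is why a single consumer card can hold the whole working set.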
Questions welcome. Looking for feedback on the architecture and anyone interested in reproducing the results.