After mentioning "abel" 4 times with emotional patterns, schema evolution triggers:
```
════════════════════════════════════════
 SCHEMA EVOLUTION TRIGGERED
════════════════════════════════════════
[SHADOW]   Creating emotional_love_evolved_shadow...
[BACKFILL] Moving records mentioning 'abel'...
[BACKFILL] Moved 8 records
[SWAP]     Executing atomic transaction...
[COMPLETE] Schema evolved successfully
```
Before: `SELECT * FROM generic_memories WHERE content LIKE '%abel%'` (O(N) scan)
After: `SELECT * FROM emotional_love_evolved WHERE entity = 'abel'` (O(log N) index)
Query performance:
• Before evolution: 1.25ms (vector scan)
• After evolution: 0.57ms (indexed lookup)
Try it yourself:

```bash
cd core && npm install && npm run dev
```
Example session:

```
add I live in Texas
add I love abel
add abel lives with me
add abel loves Texas
add I love abel        # ← Evolution triggers here
query who is abel?
```
Watch the system detect patterns, track entities, and evolve the schema in real-time. The O(log N) indexed retrieval kicks in automatically after evolution.
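The mention-counting trigger the demo illustrates can be sketched as follows. This is a minimal illustration, not the actual Mutatis API: the `EntityTracker` name and the threshold constant are assumptions taken from the demo (evolution fires on abel's fourth mention), and the real system also weighs emotional patterns, not raw counts alone.

```typescript
// Illustrative mention counter; EVOLUTION_THRESHOLD is inferred from the
// demo session, where "abel" evolves on its 4th mention.
const EVOLUTION_THRESHOLD = 4;

class EntityTracker {
  private counts = new Map<string, number>();

  /** Record a mention; returns true when the entity crosses the threshold. */
  track(entity: string): boolean {
    const n = (this.counts.get(entity) ?? 0) + 1;
    this.counts.set(entity, n);
    return n === EVOLUTION_THRESHOLD;
  }
}

// Replaying the example session's four mentions of "abel":
const tracker = new EntityTracker();
const triggered = ["abel", "abel", "abel", "abel"].map((e) => tracker.track(e));
// triggered[3] is true: evolution fires on the fourth mention
```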
Mutatis•2h ago
The core problem: RAG systems treat all memories equally. "I'm allergic to peanuts" gets the same storage priority as "I like jazz music." Eventually, critical facts get buried under noise.
Mutatis solves this through semantic-triggered schema evolution. When patterns emerge (e.g., "sara" mentioned 7+ times with family_spouse pattern), the system automatically creates a dedicated indexed table and migrates relevant records. Zero downtime via shadow table + atomic swap.
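The shadow-table + atomic-swap migration can be sketched roughly like this. The `Db` mock, table names, and `evolveSchema` signature are all illustrative stand-ins (the real system issues SQL DDL against SQLite inside a single transaction), but the three-phase shape — shadow build, backfill, atomic swap — mirrors the description above.

```typescript
// In-memory stand-in for the SQLite store, for illustration only.
type Row = { id: number; content: string };

class Db {
  tables = new Map<string, Row[]>([["generic_memories", []]]);
  transaction(fn: () => void): void {
    // A real SQLite driver would BEGIN/COMMIT here; the mock just runs fn.
    fn();
  }
}

function evolveSchema(db: Db, entity: string, evolvedTable: string): number {
  const shadow = `${evolvedTable}_shadow`;
  const generic = db.tables.get("generic_memories")!;

  // [SHADOW] + [BACKFILL]: copy matching records into the shadow table
  const moved = generic.filter((r) => r.content.includes(entity));
  db.tables.set(shadow, moved);

  // [SWAP]: in one transaction, remove moved rows from the generic table
  // and promote the shadow table to its final name.
  db.transaction(() => {
    db.tables.set(
      "generic_memories",
      generic.filter((r) => !r.content.includes(entity)),
    );
    db.tables.set(evolvedTable, db.tables.get(shadow)!);
    db.tables.delete(shadow);
  });
  return moved.length; // number of records migrated
}
```

Because readers and writers only ever see either the old layout or the new one, the migration is zero-downtime by construction.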
The performance improvement is superlinear:
• 1,000 records: 18.9× faster (6.04ms → 0.32ms)
• 10,000 records: 62.1× faster (59.03ms → 0.95ms)
• 500,000 records: 213.4× faster (4.26s → 19.97ms)
Query complexity improves from O(N) full table scans to O(log N) index lookups. The mutation literally changes SQL from:
```sql
-- Before: generic table, full scan
SELECT * FROM generic_memories WHERE content LIKE '%sara%';

-- After: dedicated table, indexed
SELECT * FROM family_spouse_sara WHERE entity = 'sara';
```
The system uses √2 gravity weighting to ensure foundational memories outrank transient ones. Even when episodic memories have higher raw similarity (0.696 vs 0.670), the √2 boost ensures foundational facts rank first (0.947 final score).
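The √2 boost reduces to a one-line scoring rule; the numbers below reproduce the worked example (0.670 × √2 ≈ 0.947). The `MemoryTier` type and `gravityScore` name are illustrative, and how a memory gets classified as foundational vs. episodic is assumed, not shown.

```typescript
// √2 gravity weighting: foundational memories get a Math.SQRT2 boost.
type MemoryTier = "foundational" | "episodic";

function gravityScore(similarity: number, tier: MemoryTier): number {
  return tier === "foundational" ? similarity * Math.SQRT2 : similarity;
}

// The worked example from the post:
const episodic = gravityScore(0.696, "episodic");         // 0.696
const foundational = gravityScore(0.670, "foundational"); // ≈ 0.947
// Despite lower raw similarity, the foundational memory ranks first.
```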
What triggers evolution:
• Medical conditions ("I'm allergic to penicillin")
• Identity statements ("I am vegetarian")
• Strong preferences ("I hate coffee")
• Pattern matching + confidence scoring + entity tracking
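For concreteness, the trigger categories above might look something like this as a pattern table. The regexes and confidence values here are placeholders of my own, not the shipped rules — the post notes that detection combines pattern matching with confidence scoring and entity tracking, which this sketch only hints at.

```typescript
// Hypothetical trigger table; categories come from the post, the
// regexes and confidence numbers are illustrative placeholders.
type Trigger = { category: string; pattern: RegExp; confidence: number };

const TRIGGERS: Trigger[] = [
  { category: "medical_condition", pattern: /\bI(?:'m| am) allergic to (\w+)/i, confidence: 0.95 },
  { category: "identity", pattern: /\bI am (?:a |an )?(\w+)/i, confidence: 0.8 },
  { category: "strong_preference", pattern: /\bI (?:hate|love) (\w+)/i, confidence: 0.7 },
];

function detectTrigger(
  text: string,
): { category: string; entity: string; confidence: number } | null {
  for (const t of TRIGGERS) {
    const m = t.pattern.exec(text);
    if (m) return { category: t.category, entity: m[1].toLowerCase(), confidence: t.confidence };
  }
  return null; // no foundational pattern: stays in generic_memories
}
```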
Built with TypeScript + SQLite. Uses mock embeddings for the POC (no API keys needed—just clone and run). Patent pending (US 63/949,136).
Interactive demo: Clone the repo and run `cd core && npm run dev` to watch schema evolution happen live.
Repo: https://github.com/ScooterMageee/mutatis-public
Looking for feedback on:
1. What other semantic patterns should trigger schema evolution?
2. Edge cases where automatic schema mutation could create inconsistencies?
3. How do you currently handle memory drift in RAG systems?
regnodon•2h ago
One edge case I’m curious about is how the system handles modal logic or intent vs. fact. If a user says 'I live in Texas' and then 'I wish I lived in Florida,' a regex-heavy approach might struggle to differentiate between current state and aspiration.
In a 'neuroplastic' database, how do you handle schema deprecation or 'forgetting' when the foundational patterns drift (e.g., a user moves cities or changes a diet)? Do you have a mechanism for the schema to 'de-evolve' or merge back into a generic table if a specific entity's mention-frequency drops below a certain threshold?
Mutatis•1h ago
I’ve seen a lot of discussion about "Memory Bloat" in RAG systems. In Mutatis, we solve this by treating the database schema as a fluid organism that evolves (and de-evolves) based on a combination of Semantic Pattern Detection and Confidence Decay.
As the data scales, the system shadow-builds specialized tables for high-confidence entities, shifting query complexity from O(N) to O(log N).
How we handle the lifecycle of a memory from "Generic" to "Optimized" and back again:
1. SEMANTIC LOGIC VS. REGEX

We don't trigger schema changes on keyword frequency alone. We use an LLM-driven classifier to distinguish modal logic (intent) from foundational facts.
- Intent: "I wish I lived in Florida" → stored as a preference in a generic table.
- Fact: "I live in Florida" → triggers the evolution pipeline.

This prevents schema "pollution" from noise or aspirational intent.
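As a toy stand-in for that classifier, a marker-word heuristic shows the shape of the intent/fact split. The real classifier is zero-shot and LLM-driven, not a word list; the `MODAL_MARKERS` regex below is purely illustrative.

```typescript
// Toy modal tagger: aspirational phrasing is routed away from the
// evolution pipeline. A word list is a crude proxy for the zero-shot
// classifier described in the thread.
const MODAL_MARKERS = /\b(wish|want to|hope to|would like|used to|plan to)\b/i;

function classifyModality(text: string): "intent" | "fact" {
  return MODAL_MARKERS.test(text) ? "intent" : "fact";
}

classifyModality("I wish I lived in Florida"); // "intent" → generic table
classifyModality("I live in Florida");         // "fact"   → evolution pipeline
```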
2. MENTIONS, DECAY, AND "DE-EVOLUTION"

Schema evolution is a reward for frequently referenced data; deprecation is the penalty for irrelevance.
- Confidence decay: when contradictory statements are detected (e.g., "I moved to Texas"), the confidence score for the "Florida" schema decays.
- Frequency thresholds: if an optimized table isn't hit within a specific window, it is flagged for de-evolution.
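The decay-plus-threshold lifecycle above can be sketched as a small state check. The decay factor, confidence floor, and idle window here are invented for illustration; the thread doesn't state the real values.

```typescript
// Per-schema lifecycle state; field names are illustrative.
interface SchemaState {
  confidence: number; // drops when contradictions are detected
  lastHit: number;    // ms timestamp of the last query against this table
}

const DECAY_FACTOR = 0.5;                       // assumed: halve on each contradiction
const DE_EVOLVE_CONFIDENCE = 0.2;               // assumed confidence floor
const IDLE_WINDOW_MS = 30 * 24 * 3600 * 1000;   // assumed 30-day idle window

function onContradiction(s: SchemaState): void {
  s.confidence *= DECAY_FACTOR;
}

function shouldDeEvolve(s: SchemaState, now: number): boolean {
  return s.confidence < DE_EVOLVE_CONFIDENCE || now - s.lastHit > IDLE_WINDOW_MS;
}
```

A schema flagged by `shouldDeEvolve` would then go through the shadow-merge path described in the next point.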
3. MECHANISM: SHADOW TABLES & ATOMIC SWAPS

To ensure zero downtime, we use a shadow-table migration pattern:
- Selection: a schema is flagged for merging via periodic hygiene checks.
- Shadow merge: a background transaction copies data from the specialized table back into the generic_memories table.
- Atomic swap: we drop the specialized table and update the query router in a single atomic transaction.
MANAGED MEMORY LIFECYCLE SUMMARY:

| Mechanism | Purpose | Implementation |
|---|---|---|
| Mention decay | Identifies stale data | Rolling counters on hits |
| Confidence scoring | Handles contradictions | Drift via √2 weighting |
| Hygiene checks | Prevents schema bloat | Periodic TTL-driven merges |
| Atomic swaps | Safe transitions | Transactions + shadow tables |
| Modal tagging | Filters intent vs. fact | Zero-shot categorization |
THE BOTTOM LINE: By allowing the schema to "de-evolve" back into generic tables, we maintain O(log N) performance for relevant data without the overhead of maintaining thousands of stale indices.