After mentioning "abel" 4 times with emotional patterns, schema evolution triggers:
```
════════════════════════════════════════
 SCHEMA EVOLUTION TRIGGERED
════════════════════════════════════════
[SHADOW]   Creating emotional_love_evolved_shadow...
[BACKFILL] Moving records mentioning 'abel'...
[BACKFILL] Moved 8 records
[SWAP]     Executing atomic transaction...
[COMPLETE] Schema evolved successfully
```
Before: `SELECT * FROM generic_memories WHERE content LIKE '%abel%'` (O(N) scan)
After: `SELECT * FROM emotional_love_evolved WHERE entity = 'abel'` (O(log N) index)
Query performance:
• Before evolution: 1.25ms (vector scan)
• After evolution: 0.57ms (indexed lookup)
Try it yourself:

```bash
cd core && npm install && npm run dev
```
Example session:

```
add I live in Texas
add I love abel
add abel lives with me
add abel loves Texas
add I love abel        # ← Evolution triggers here
query who is abel?
```
Watch the system detect patterns, track entities, and evolve the schema in real-time. The O(log N) indexed retrieval kicks in automatically after evolution.
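The mention-counting trigger the demo illustrates can be sketched as follows. This is a minimal illustration, not the actual Mutatis API: the `EntityTracker` name and the threshold constant are assumptions taken from the demo (evolution fires on abel's fourth mention), and the real system also weighs emotional patterns, not raw counts alone.

```typescript
// Illustrative mention counter; EVOLUTION_THRESHOLD is inferred from the
// demo session, where "abel" evolves on its 4th mention.
const EVOLUTION_THRESHOLD = 4;

class EntityTracker {
  private counts = new Map<string, number>();

  /** Record a mention; returns true when the entity crosses the threshold. */
  track(entity: string): boolean {
    const n = (this.counts.get(entity) ?? 0) + 1;
    this.counts.set(entity, n);
    return n === EVOLUTION_THRESHOLD;
  }
}

// Replaying the example session's four mentions of "abel":
const tracker = new EntityTracker();
const triggered = ["abel", "abel", "abel", "abel"].map((e) => tracker.track(e));
// triggered[3] is true: evolution fires on the fourth mention
```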
Mutatis•2h ago
The core problem: RAG systems treat all memories equally. "I'm allergic to peanuts" gets the same storage priority as "I like jazz music." Eventually, critical facts get buried under noise.
Mutatis solves this through semantic-triggered schema evolution. When patterns emerge (e.g., "sara" mentioned 7+ times with family_spouse pattern), the system automatically creates a dedicated indexed table and migrates relevant records. Zero downtime via shadow table + atomic swap.
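The shadow-table + atomic-swap migration can be sketched roughly like this. The `Db` mock, table names, and `evolveSchema` signature are all illustrative stand-ins (the real system issues SQL DDL against SQLite inside a single transaction), but the three-phase shape — shadow build, backfill, atomic swap — mirrors the description above.

```typescript
// In-memory stand-in for the SQLite store, for illustration only.
type Row = { id: number; content: string };

class Db {
  tables = new Map<string, Row[]>([["generic_memories", []]]);
  transaction(fn: () => void): void {
    // A real SQLite driver would BEGIN/COMMIT here; the mock just runs fn.
    fn();
  }
}

function evolveSchema(db: Db, entity: string, evolvedTable: string): number {
  const shadow = `${evolvedTable}_shadow`;
  const generic = db.tables.get("generic_memories")!;

  // [SHADOW] + [BACKFILL]: copy matching records into the shadow table
  const moved = generic.filter((r) => r.content.includes(entity));
  db.tables.set(shadow, moved);

  // [SWAP]: in one transaction, remove moved rows from the generic table
  // and promote the shadow table to its final name.
  db.transaction(() => {
    db.tables.set(
      "generic_memories",
      generic.filter((r) => !r.content.includes(entity)),
    );
    db.tables.set(evolvedTable, db.tables.get(shadow)!);
    db.tables.delete(shadow);
  });
  return moved.length; // number of records migrated
}
```

Because readers and writers only ever see either the old layout or the new one, the migration is zero-downtime by construction.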
The performance improvement is superlinear:
• 1,000 records: 18.9× faster (6.04ms → 0.32ms)
• 10,000 records: 62.1× faster (59.03ms → 0.95ms)
• 500,000 records: 213.4× faster (4.26s → 19.97ms)
Query complexity improves from O(N) full table scans to O(log N) index lookups. The mutation literally changes SQL from:
```sql
-- Before: generic table, full scan
SELECT * FROM generic_memories WHERE content LIKE '%sara%';

-- After: dedicated table, indexed
SELECT * FROM family_spouse_sara WHERE entity = 'sara';
```
The system uses √2 gravity weighting to ensure foundational memories outrank transient ones. Even when episodic memories have higher raw similarity (0.696 vs 0.670), the √2 boost ensures foundational facts rank first (0.947 final score).
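The √2 boost reduces to a one-line scoring rule; the numbers below reproduce the worked example (0.670 × √2 ≈ 0.947). The `MemoryTier` type and `gravityScore` name are illustrative, and how a memory gets classified as foundational vs. episodic is assumed, not shown.

```typescript
// √2 gravity weighting: foundational memories get a Math.SQRT2 boost.
type MemoryTier = "foundational" | "episodic";

function gravityScore(similarity: number, tier: MemoryTier): number {
  return tier === "foundational" ? similarity * Math.SQRT2 : similarity;
}

// The worked example from the post:
const episodic = gravityScore(0.696, "episodic");         // 0.696
const foundational = gravityScore(0.670, "foundational"); // ≈ 0.947
// Despite lower raw similarity, the foundational memory ranks first.
```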
What triggers evolution:
• Medical conditions ("I'm allergic to penicillin")
• Identity statements ("I am vegetarian")
• Strong preferences ("I hate coffee")
• Pattern matching + confidence scoring + entity tracking
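For concreteness, the trigger categories above might look something like this as a pattern table. The regexes and confidence values here are placeholders of my own, not the shipped rules — the post notes that detection combines pattern matching with confidence scoring and entity tracking, which this sketch only hints at.

```typescript
// Hypothetical trigger table; categories come from the post, the
// regexes and confidence numbers are illustrative placeholders.
type Trigger = { category: string; pattern: RegExp; confidence: number };

const TRIGGERS: Trigger[] = [
  { category: "medical_condition", pattern: /\bI(?:'m| am) allergic to (\w+)/i, confidence: 0.95 },
  { category: "identity", pattern: /\bI am (?:a |an )?(\w+)/i, confidence: 0.8 },
  { category: "strong_preference", pattern: /\bI (?:hate|love) (\w+)/i, confidence: 0.7 },
];

function detectTrigger(
  text: string,
): { category: string; entity: string; confidence: number } | null {
  for (const t of TRIGGERS) {
    const m = t.pattern.exec(text);
    if (m) return { category: t.category, entity: m[1].toLowerCase(), confidence: t.confidence };
  }
  return null; // no foundational pattern: stays in generic_memories
}
```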
Built with TypeScript + SQLite. Uses mock embeddings for the POC (no API keys needed—just clone and run). Patent pending (US 63/949,136).
Interactive demo: Clone the repo and run `cd core && npm run dev` to watch schema evolution happen live.
Repo: https://github.com/ScooterMageee/mutatis-public
Looking for feedback on:
1. What other semantic patterns should trigger schema evolution?
2. Edge cases where automatic schema mutation could create inconsistencies?
3. How do you currently handle memory drift in RAG systems?
regnodon•2h ago
One edge case I’m curious about is how the system handles modal logic or intent vs. fact. If a user says 'I live in Texas' and then 'I wish I lived in Florida,' a regex-heavy approach might struggle to differentiate between current state and aspiration.
In a 'neuroplastic' database, how do you handle schema deprecation or 'forgetting' when the foundational patterns drift (e.g., a user moves cities or changes a diet)? Do you have a mechanism for the schema to 'de-evolve' or merge back into a generic table if a specific entity's mention-frequency drops below a certain threshold?
Mutatis•1h ago
I’ve seen a lot of discussion about "Memory Bloat" in RAG systems. In Mutatis, we solve this by treating the database schema as a fluid organism that evolves (and de-evolves) based on a combination of Semantic Pattern Detection and Confidence Decay.
As the data scales, the system shadow-builds specialized tables for high-confidence entities, shifting query complexity from O(N) to O(log N).
How we handle the lifecycle of a memory from "Generic" to "Optimized" and back again:
1. SEMANTIC LOGIC VS. REGEX

We don't trigger schema changes on keyword frequency alone. We use an LLM-driven classifier to distinguish modal logic (intent) from foundational facts.
- Intent: "I wish I lived in Florida" → stored as a preference in a generic table.
- Fact: "I live in Florida" → triggers the evolution pipeline.

This prevents schema "pollution" from noise or aspirational intent.
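As a toy stand-in for that classifier, a marker-word heuristic shows the shape of the intent/fact split. The real classifier is zero-shot and LLM-driven, not a word list; the `MODAL_MARKERS` regex below is purely illustrative.

```typescript
// Toy modal tagger: aspirational phrasing is routed away from the
// evolution pipeline. A word list is a crude proxy for the zero-shot
// classifier described in the thread.
const MODAL_MARKERS = /\b(wish|want to|hope to|would like|used to|plan to)\b/i;

function classifyModality(text: string): "intent" | "fact" {
  return MODAL_MARKERS.test(text) ? "intent" : "fact";
}

classifyModality("I wish I lived in Florida"); // "intent" → generic table
classifyModality("I live in Florida");         // "fact"   → evolution pipeline
```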
2. MENTIONS, DECAY, AND "DE-EVOLUTION"

Schema evolution is a reward for frequently referenced data; deprecation is the penalty for irrelevance.
- Confidence decay: when contradictory statements are detected (e.g., "I moved to Texas"), the confidence score for the "Florida" schema decays.
- Frequency thresholds: if an optimized table isn't hit within a specific window, it is flagged for de-evolution.
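The decay-plus-threshold lifecycle above can be sketched as a small state check. The decay factor, confidence floor, and idle window here are invented for illustration; the thread doesn't state the real values.

```typescript
// Per-schema lifecycle state; field names are illustrative.
interface SchemaState {
  confidence: number; // drops when contradictions are detected
  lastHit: number;    // ms timestamp of the last query against this table
}

const DECAY_FACTOR = 0.5;                       // assumed: halve on each contradiction
const DE_EVOLVE_CONFIDENCE = 0.2;               // assumed confidence floor
const IDLE_WINDOW_MS = 30 * 24 * 3600 * 1000;   // assumed 30-day idle window

function onContradiction(s: SchemaState): void {
  s.confidence *= DECAY_FACTOR;
}

function shouldDeEvolve(s: SchemaState, now: number): boolean {
  return s.confidence < DE_EVOLVE_CONFIDENCE || now - s.lastHit > IDLE_WINDOW_MS;
}
```

A schema flagged by `shouldDeEvolve` would then go through the shadow-merge path described in the next point.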
3. MECHANISM: SHADOW TABLES & ATOMIC SWAPS

To ensure zero downtime, we use a shadow-table migration pattern:
- Selection: a schema is flagged for merging via periodic hygiene checks.
- Shadow merge: a background transaction copies data from the specialized table back into the generic_memories table.
- Atomic swap: we drop the specialized table and update the query router in a single atomic transaction.
MANAGED MEMORY LIFECYCLE SUMMARY:

| Mechanism | Purpose | Implementation |
|---|---|---|
| Mention decay | Identifies stale data | Rolling counters on hits |
| Confidence scoring | Handles contradictions | Drift via √2 weighting |
| Hygiene checks | Prevents schema bloat | Periodic TTL-driven merges |
| Atomic swaps | Safe transitions | Transactions + shadow tables |
| Modal tagging | Filters intent vs. fact | Zero-shot categorization |
THE BOTTOM LINE: By allowing the schema to "de-evolve" back into generic tables, we maintain O(log N) performance for relevant data without the overhead of maintaining thousands of stale indices.