We started building an AI project manager. Users needed two things: (1) search for context about their projects, and (2) discover insights, like the open tasks holding up a launch.
Vector search was terrible at #1 (it couldn't connect code, marketing, and PR work that belonged to the same project). Knowledge graphs were too slow for #1, but perfect for #2: structured relationships, great for UIs.
Then we started talking to other teams building AI agents and realized everyone was hitting the exact same two problems.
So we pivoted to build Papr, a unified memory layer that combines:
- Intent vectors: fast, goal-oriented search for conversational AI
- Knowledge graph: structured insights for analytics and dashboard generation
- One API: add unstructured content once, then query for search or discover insights
And we just open-sourced it.
How intent vectors work (search problem)

The problem with vector search: it's fast but context-blind. It returns semantically similar content but misses goal-oriented connections.
Take a code fix, a marketing email, and a PR draft that all serve the same product launch: they're far apart in vector space (different keywords, different topics). Traditional vector search returns fragments, and you miss the complete picture.
Our solution: group memories by user intent and goals, stored as a new vector embedding (also known as associative memory, per Google's latest research).
When you add a memory:
1. Detect the user's goal (using an LLM + context)
2. Find the top 3 related memories serving that goal
3. Combine all 4 → generate a NEW embedding
4. Store it at a different position in vector space (near "product launch" goals, not individual topics)

Now a query like "What's the status of the mobile launch?" finds the goal-group instantly (one query, sub-100ms) and returns all four memories, even though they're semantically far apart.
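Here's a minimal sketch of those steps in Python. `detect_goal`, `embed`, and `vector_index` are hypothetical stand-ins for illustration, not Papr's actual API:

```python
# Sketch of intent-vector grouping. detect_goal, embed, and vector_index
# are hypothetical stand-ins, NOT Papr's actual API.
import numpy as np

def add_memory(content, vector_index, embed, detect_goal):
    goal = detect_goal(content)                      # 1. LLM + context infer the goal
    related = vector_index.search(embed(goal), k=3)  # 2. top 3 memories serving that goal
    texts = [content] + [m.text for m in related]
    # 3. combine all 4 into ONE new embedding (simple mean here)
    group_vec = np.mean([embed(t) for t in texts], axis=0)
    # 4. store at a new position in vector space, near the goal
    vector_index.add(vector=group_vec, payload={"goal": goal, "memories": texts})
```

The key design point: the stored vector represents the goal-group, so one lookup retrieves all four memories at once instead of four separate topic-level hits.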
This is what got us #1 on Stanford's STaRK benchmark (91%+ retrieval accuracy). The benchmark tests multi-hop reasoning: queries that need information from multiple semantically different sources. Pure vector search scores ~60%; Papr scores 91%+.
Automatic knowledge graphs (structured insights)

Intent vectors solve search. But production AI agents also need structured insights for dashboards and analytics. The problem with knowledge graphs:
- Hard to get unstructured data IN (entity extraction, relationship mapping)
- Hard to query with natural language (slow multi-hop traversal)
- Fast for static UIs (predefined queries), slow for dynamic assistants
Our solution:
- Automatically extract entities and relationships from unstructured content
- Cache common graph patterns and match them to queries (speeds up retrieval)
- Expose a GraphQL API so LLMs can directly query structured data
- Support both predefined queries (fast, for static UIs) and natural language (for dynamic assistants)
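For a sense of what an LLM-issued structured query looks like, here's an illustrative sketch. The schema (`project`, `tasks`, `blockedBy`) and endpoint are made up for the example, not Papr's actual GraphQL schema:

```python
# Illustrative only: schema and endpoint are hypothetical,
# not Papr's actual GraphQL API.
import requests

query = """
{
  project(name: "mobile-launch") {
    tasks(status: OPEN) {
      title
      blockedBy { title }
    }
  }
}
"""

resp = requests.post(
    "https://memory.example.com/graphql",  # hypothetical endpoint
    json={"query": query},
)
print(resp.json())
```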
We combined both of these solutions in one API.
What I'd Love Feedback On
1. Evaluation - We chose Stanford's STaRK benchmark because it requires multi-hop search, but it only captures search, not the insights we generate. Are there better evals we should be looking at?
2. Graph pattern caching - We cache unique and common graph patterns stored in the knowledge graph (e.g. node -> edge -> node), then match queries to them. Which patterns should we prioritize caching? How do you decide which patterns are worth the storage/compute trade-off?
3. Embedding weights - When combining 4 memories into one group embedding, how should we weight them? Equal weights? Weight the newest memory higher? Let the model learn optimal weights? (A minimal sketch of the first two options follows this list.)
4. GraphQL vs Natural Language - Should LLMs always use GraphQL for structured queries (faster, more precise), or keep natural language as an option (easier for prototyping)? What are the trade-offs you've seen?
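Here's the trivial version of the weighting choices in #3, as a Python sketch. The recency parameterization is just one illustrative option, not what we've settled on:

```python
# Sketch of the weighting options in #3: equal weights vs. weighting
# the newest memory higher. All numbers are illustrative.
import numpy as np

def group_embedding(vectors, recency_bias=0.0):
    """Combine memory embeddings (ordered oldest -> newest) into one vector.

    recency_bias=0.0 gives equal weights; larger values favor newer
    memories. A learned alternative would replace these fixed weights
    with trained parameters.
    """
    n = len(vectors)
    weights = 1.0 + recency_bias * np.arange(n)
    weights = weights / weights.sum()
    combined = np.average(np.stack(vectors), axis=0, weights=weights)
    return combined / np.linalg.norm(combined)  # renormalize for cosine search
```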
---
Try it:
- Developer dashboard: platform.papr.ai (free tier)
- Open source: https://github.com/Papr-ai/memory-opensource
- SDK: npm install papr/memory or pip install papr_memory