GraFlo is a declarative framework that handles this once and for all. You define your graph structure in a database-agnostic schema - vertices, edges, properties, and how they map to your source data (CSV, SQL, JSON, XML). GraFlo then generates the ingestion code for your target database. The key insight: while Neo4j, ArangoDB, and TigerGraph each have idiosyncratic query languages and loaders, the underlying property graph model is universal. We crystallized that into a single abstraction layer.
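To make the idea concrete, here's a rough sketch of what such a database-agnostic schema might look like, expressed as a Python dict. The key names, mapping syntax, and structure here are my assumptions for illustration, not GraFlo's actual config format:

```python
# Hypothetical, illustrative only: GraFlo's real schema format may differ.
schema = {
    "target": "neo4j",  # retargeting ingestion would mean changing this value
    "vertices": [
        {"name": "Author", "key": "author_id", "properties": ["name", "affiliation"]},
        {"name": "Paper",  "key": "doi",       "properties": ["title", "year"]},
    ],
    "edges": [
        {"name": "WROTE", "from": "Author", "to": "Paper"},
    ],
    "sources": [
        {"file": "papers.csv", "maps_to": ["Paper", "Author", "WROTE"]},
    ],
}

edge_names = [e["name"] for e in schema["edges"]]
```

The point is that nothing in the schema is specific to any one database; the target is a single field, and everything else describes the graph itself.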
What GraFlo handles automatically:
- Consistent ID generation across vertices and edges
- Type coercion (strings to dates, numbers, etc.)
- Vertex and edge deduplication
- Generating database-specific ingestion scripts
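The first three items on that list are generic techniques rather than anything database-specific. A minimal sketch of the usual approach — deterministic IDs from a label plus natural key, schema-declared type coercion, and set-based deduplication — assuming nothing about GraFlo's internals:

```python
import hashlib
from datetime import date

def vertex_id(label: str, natural_key: str) -> str:
    # Deterministic ID: the same source row always maps to the same vertex,
    # across runs and across edge endpoints. (Generic technique; GraFlo's
    # actual ID scheme is not documented here and may differ.)
    return hashlib.sha1(f"{label}:{natural_key}".encode()).hexdigest()[:16]

def coerce(value: str, kind: str):
    # Minimal type coercion: strings to ints/dates as declared in the schema.
    if kind == "int":
        return int(value)
    if kind == "date":
        return date.fromisoformat(value)
    return value

# Deduplication falls out of deterministic IDs: rows sharing a natural key
# collapse to one vertex.
rows = [("Paper", "10.1000/xyz"), ("Paper", "10.1000/xyz"), ("Paper", "10.1000/abc")]
unique_vertices = {vertex_id(label, key) for label, key in rows}
```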
It's plug-and-play in the sense that swapping from Neo4j to ArangoDB requires no code changes: just change the target database type in your config (Docker Compose examples provided).
We've used it to build knowledge graphs from academic publications, financial datasets, and package dependencies. Instead of maintaining N × M scripts (N datasets, M databases), we maintain N schemas.
On the roadmap: SQL/API integration (e.g., automatically generating GraFlo configs from SQL schemas).
Would love feedback from anyone working with graph databases or building knowledge graphs.
acrostoic•1h ago
But the nice thing is: if you have your source data and GraFlo schema, regenerating your graph in a different DB is trivial. GraFlo handles indexes and constraints for each target database. It's like having the recipe instead of trying to reverse-engineer the cake.
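As a sketch of what "handles indexes for each target" means in practice: the statement templates below use real Neo4j (Cypher) and ArangoDB (arangosh) syntax, but the generator function itself is a hypothetical stand-in for whatever GraFlo actually emits:

```python
def index_statement(target: str, collection: str, field: str) -> str:
    # Emit an index-creation statement for the chosen backend.
    # (Illustrative generator, not GraFlo's actual code.)
    if target == "neo4j":
        return (f"CREATE INDEX {collection.lower()}_{field} IF NOT EXISTS "
                f"FOR (n:{collection}) ON (n.{field})")
    if target == "arangodb":
        return (f'db.{collection}.ensureIndex('
                f'{{type: "persistent", fields: ["{field}"]}})')
    raise ValueError(f"unsupported target: {target}")
```

One schema field declared as indexed, two very different DDL dialects generated from it — which is exactly why maintaining the recipe beats maintaining per-database scripts.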