I'm excited to announce a new release of OntoCast, an open-source framework for extracting semantic triples and building knowledge graphs (KGs) from unstructured documents (PDF, JSON, Markdown, and more).
Before extracting facts, OntoCast automatically selects or creates a relevant ontology and iteratively refines it, leading to much more accurate and context-aware fact extraction. This is especially valuable for cross-domain or complex documents where a static ontology falls short.
- Agentic workflow: Uses LLMs (OpenAI/Ollama) to drive the extraction and ontology refinement process.
- MCP-compatible API server: Easy to integrate into your stack.
- Flexible storage: Works with Jena Fuseki and Neo4j for knowledge graph storage.
- Open source: Apache licensed.
Use cases include extracting structured knowledge from scientific papers, financial reports, or clinical trial documents, even when they span multiple domains.
Repo: https://github.com/growgraph/ontocast
Docs: https://growgraph.github.io/ontocast
Would love feedback, questions, or suggestions!
x0xa•4h ago
acrostoic•4h ago
At current OpenAI pricing for GPT-4.1 mini ($0.4 per 1M tokens), we expect processing 100 pages of text to cost roughly $0.02-0.08.
NB: in our experience, small models (< 14B parameters) available on Ollama struggle with structured output.
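For anyone who wants to check the arithmetic, here is a rough sketch of what that estimate implies. It assumes a single flat price of $0.4 per 1M tokens (ignoring any difference between input and output pricing), and the function names are made up for illustration; they are not part of OntoCast.

```python
# Back-of-the-envelope check of the quoted cost range.
# Assumption: one flat rate of $0.40 per 1M tokens, as quoted above.
PRICE_PER_MILLION_TOKENS = 0.40  # USD
PAGES = 100


def cost_usd(total_tokens: int, price_per_million: float = PRICE_PER_MILLION_TOKENS) -> float:
    """Cost in USD of processing `total_tokens` at a flat per-token price."""
    return total_tokens * price_per_million / 1_000_000


def implied_tokens(cost: float, price_per_million: float = PRICE_PER_MILLION_TOKENS) -> int:
    """Total token count that a given cost corresponds to at a flat price."""
    return round(cost * 1_000_000 / price_per_million)


if __name__ == "__main__":
    for cost in (0.02, 0.08):
        tokens = implied_tokens(cost)
        print(f"${cost:.2f} -> {tokens:,} tokens total, ~{tokens // PAGES:,} per page")
```

So the $0.02-0.08 range corresponds to about 50,000 to 200,000 tokens for 100 pages, or roughly 500 to 2,000 tokens per page across all LLM calls. Actual usage will depend on page density and on how many refinement iterations a document needs.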
x0xa•4h ago