> For developers using local LLMs, LangExtract offers built-in support for Ollama and can be extended to other third-party APIs by updating the inference endpoints.
If you look in the code, they currently have inference classes for Gemini and Ollama: https://github.com/google/langextract/blob/main/langextract/...
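For a sense of what "extending to other third-party APIs" involves, here's a minimal sketch of talking to a local Ollama server's structured-output support directly, without any wrapper library. The endpoint and payload shape follow Ollama's `/api/chat` API (recent releases accept a JSON schema as the `format` field); the model name is just an example, and this is not LangExtract's own integration code:

```python
import json
import urllib.request

# Default local Ollama endpoint; adjust if your server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/chat"


def build_request(text: str, schema: dict, model: str = "llama3.2") -> dict:
    """Build an /api/chat payload that constrains output to `schema`."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": text}],
        # Passing a JSON schema as `format` asks Ollama to emit
        # conforming JSON instead of free-form text.
        "format": schema,
        "stream": False,
    }


def extract(text: str, schema: dict) -> dict:
    """POST the request and parse the model's JSON reply."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_request(text, schema)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return json.loads(body["message"]["content"])


if __name__ == "__main__":
    schema = {
        "type": "object",
        "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
        "required": ["name", "age"],
    }
    # Requires a running Ollama server with the model pulled:
    # print(extract("Alice is 30 years old.", schema))
```

Swapping in a different third-party API is mostly a matter of changing the URL and payload builder, which is why a pluggable inference layer pays off.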
If you want to do structured data extraction with a wider variety of libraries I'm going to promote my LLM library and tool, which supports dozens of models for this via the plugins mechanism: https://llm.datasette.io/en/stable/schemas.html
My version works with Pydantic models or JSON schema in Python code, or with JSON schema or a weird DSL I invented on the command-line:
```shell
curl https://news.ycombinator.com/ | \
  llm --schema-multi 'headline,url,votes int' \
  -m gpt-4.1 --system 'all links'
```
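To make the mini-DSL in that command concrete, here's a toy parser showing how `'headline,url,votes int'` could expand into a JSON schema. This is my own illustrative sketch, not the llm library's actual parser, and it only handles the simple comma-separated form (the `--schema-multi` array wrapping shown here is likewise an assumption about the output shape):

```python
# Map the DSL's short type names onto JSON schema types;
# untyped fields default to "string".
TYPE_MAP = {"int": "integer", "float": "number", "str": "string", "bool": "boolean"}


def parse_mini_schema(spec: str, multi: bool = False) -> dict:
    """Expand e.g. 'headline,url,votes int' into a JSON schema dict."""
    properties = {}
    for field in spec.split(","):
        parts = field.strip().split()
        name = parts[0]
        type_name = TYPE_MAP.get(parts[1], "string") if len(parts) > 1 else "string"
        properties[name] = {"type": type_name}
    schema = {
        "type": "object",
        "properties": properties,
        "required": list(properties),
    }
    if multi:
        # Multi mode asks for an array of matching objects.
        schema = {
            "type": "object",
            "properties": {"items": {"type": "array", "items": schema}},
            "required": ["items"],
        }
    return schema
```

The same expanded schema could be handed to any model or library that accepts JSON schema for structured output.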
Result: https://gist.github.com/simonw/f8143836cae0f058f059e1b8fc2d9...

Does this proposed approach complement NER / knowledge-graph extraction, or supersede the need for it? Just wondering aloud; I'd appreciate any insights here.
constantinum•6mo ago
1. Unstract has a pre-processing (OCR) layer that converts documents into LLM-readable formats, which helps improve accuracy and control costs.
2. Unstract also connects to your existing data sources, making it an out-of-the-box ETL tool.
https://github.com/Zipstack/unstract