I do it that way, and I hooked it up to the Telegram API. I’m able to ask things like “What’s my passport number?” and it just works.
Combine it with git and you have a Datomic-esque way of seeing facts getting added and retracted simply by traversing the commits.
I arrived at this solution after trying a more complex triplet-based approach and seeing that plain text files + HTTP calls work just as well and are human (and AI) friendly.
The main disadvantage is having unstructured data, but for content that fits inside the LLM context window, it doesn’t matter practically speaking. And even then, when context starts being the limiting factor, you can start segmenting by categories or start using embeddings.
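To make the "segment by categories" fallback concrete, here is a minimal sketch. It assumes the notes are already loaded as a mapping from category name (say, a file name) to plain text; `build_prompt`, `max_chars`, and the sample notes are all made up for illustration, and the keyword match is a stand-in for whatever selection (or embeddings) you would really use.

```python
def build_prompt(notes: dict[str, str], question: str, max_chars: int = 8000) -> str:
    """Build an LLM prompt from plain-text notes (hypothetical helper).

    `notes` maps a category name to that category's text. If everything fits
    within `max_chars`, all of it is sent; otherwise only the categories whose
    name appears in the question are kept.
    """
    def render(selected: dict[str, str]) -> str:
        return "\n\n".join(f"## {name}\n{text}" for name, text in selected.items())

    combined = render(notes)
    if len(combined) > max_chars:
        # Context is the limiting factor: segment by category.
        q = question.lower()
        selected = {name: text for name, text in notes.items() if name.lower() in q}
        # If no category matched, fall back to all notes, truncated to the budget.
        combined = render(selected or notes)[:max_chars]

    return f"Answer using only these notes.\n\n{combined}\n\nQuestion: {question}"


# Sample data is invented for the example.
notes = {
    "passport": "Passport number: X123",
    "travel": "Flight details. " * 100,
}
prompt = build_prompt(notes, "What's my passport number?", max_chars=100)
```

With the small budget above, only the `passport` category ends up in the prompt; with the default budget both categories would be sent as-is.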
Like the example "CocoIndex supports Incremental Processing" becomes the subject/predicate/object triple (CocoIndex, supports, Incremental Processing)... so what? Are you going to look up "Incremental Processing" and get a list of related entities? That's not a term that is well enough defined to be meaningful across a variety of subjects. I can incrementally process my sandwich by taking small bites.
I guess you could actually expand "Incremental Processing" to some full definition. But then it's not really a knowledge graph because the only entity ever associated with that new definition will be CocoIndex, and you are back to a single sentence that contains the information, you've just pretended it's structured. ("Supports" is hardly a well-defined term either!)
I can _kind of_ see how knowledge graphs can be used for limited relationships. If you want to map companies to board members, and board members to family members, etc. Very clearly and formally defined entities (like a person or company), with clearly defined relationships (board member, brother, etc). I still don't know how _useful_ the result is, but at least I can understand the validity of the model. But for everything else... am I missing something?
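A toy version of the contrast, as a sketch only: a tiny in-memory triple store (the `TripleStore` class, the predicate names, and all people and companies except the CocoIndex triple are invented for illustration). Looking up the vague object returns just the one subject, while the formally defined relationships can actually be traversed.

```python
from collections import defaultdict


class TripleStore:
    """Minimal (subject, predicate, object) store with two lookup indexes."""

    def __init__(self) -> None:
        self._objects = defaultdict(set)   # (subject, predicate) -> objects
        self._subjects = defaultdict(set)  # (predicate, object) -> subjects

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._objects[(subject, predicate)].add(obj)
        self._subjects[(predicate, obj)].add(subject)

    def objects(self, subject: str, predicate: str) -> set[str]:
        return set(self._objects.get((subject, predicate), ()))

    def subjects(self, predicate: str, obj: str) -> set[str]:
        return set(self._subjects.get((predicate, obj), ()))


def companies_linked_by_family(store: TripleStore, company: str) -> set[str]:
    """Companies whose board includes a brother of one of `company`'s board members."""
    linked: set[str] = set()
    for member in store.subjects("board_member_of", company):
        for relative in store.subjects("brother_of", member):
            linked |= store.objects(relative, "board_member_of")
    return linked


store = TripleStore()
store.add("CocoIndex", "supports", "Incremental Processing")
# Hypothetical people and companies:
store.add("Alice", "board_member_of", "Acme")
store.add("Bob", "brother_of", "Alice")
store.add("Bob", "board_member_of", "Initech")

only_subject = store.subjects("supports", "Incremental Processing")
family_links = companies_linked_by_family(store, "Acme")
```

Here `only_subject` contains nothing but CocoIndex, which is the "single sentence pretending to be structured" problem, whereas `family_links` is a two-hop answer that only exists because "board member" and "brother" mean the same thing on every edge.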
Or, I have a Docker container image that is built from multiple base images owned by different teams in my organization. Who is responsible for fixing security vulnerabilities introduced by each layer?
We really could model these as tables but getting into all those joins makes things so cumbersome. Plus visualizing these things in a graph map is very compelling for presentation and persuading stakeholders to make security decisions.
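As a rough sketch of why the graph shape is convenient here: with "built from" and "owned by" edges, finding who owns each layer is a plain traversal rather than a chain of self-joins. The image names, team names, and the `responsible_teams` helper are all hypothetical, and a real setup would presumably keep these edges in a graph database instead of dicts.

```python
# Hypothetical edges: image -> base images it is built from.
built_from = {
    "payments-api": ["python-runtime", "internal-certs"],
    "python-runtime": ["hardened-os"],
    "internal-certs": ["hardened-os"],
    "hardened-os": [],
}

# Hypothetical edges: image -> owning team.
owned_by = {
    "payments-api": "payments-team",
    "python-runtime": "platform-team",
    "internal-certs": "security-team",
    "hardened-os": "infra-team",
}


def responsible_teams(image: str,
                      built_from: dict[str, list[str]],
                      owned_by: dict[str, str]) -> dict[str, str]:
    """Walk the 'built from' edges and map every image in the chain to its owner."""
    owners: dict[str, str] = {}
    stack = [image]
    while stack:
        current = stack.pop()
        if current in owners:
            continue  # shared base image already visited
        owners[current] = owned_by.get(current, "unknown")
        stack.extend(built_from.get(current, []))
    return owners


owners = responsible_teams("payments-api", built_from, owned_by)
```

`owners` maps each image in the chain (including the shared `hardened-os` base, visited once) to the team that would need to patch it.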
- Structured data - this is probably closer to the use case you mention
- Unstructured data - extracting relationships and building a KG with natural language understanding, which is what this article is trying to explore. Here is a paper discussing this: https://arxiv.org/abs/2409.13731
In general it is an alternative way to easily establish connections between entities. And these relationships could help with discovery, recommendation and retrieval. Thanks @alexchantavy for sharing use cases in security.
Would love to learn more from the community :)
gorpy7•6h ago
gorpy7•6h ago
cipehr•5h ago
If so, that's crazy, and I would love pointers on how to prompt it to suggest this?
Onawa•1h ago
marviel•5h ago