Build real-time knowledge graph for documents with LLM

https://cocoindex.io/blogs/knowledge-graph-for-docs/
181•badmonster•1mo ago

Comments

gorpy7•1mo ago
idk if it’s precisely the same, but o3 recently offered to create one for me in (was it markdown?), suggesting it was something it was willing to maintain for me.
gorpy7•1mo ago
i think it offered a few formats, but i specifically remember it would do it in Obsidian to use the concept map ability within.
cipehr•1mo ago
sorry, what is `o3`? I am not familiar with it... unless you're talking about the OpenAI ChatGPT model?

If so, that's crazy, and I would love pointers on how to prompt it to suggest this.

Onawa•1mo ago
o3 is one of the myriad models offered by OpenAI. You can see some metrics and comparisons with other models here: https://artificialanalysis.ai/models/o3/providers
marviel•1mo ago
mermaid probably.
dvrp•1mo ago
I feel like you can do the same using a single markdown file and an LLM (e.g. Claude Code).

I do it that way and then I hooked it up with the Telegram API. I’m able to ask things like “What’s my passport number?” and it just works.

Combine it with git and you have a Datomic-esque way of seeing facts getting added and retracted simply by traversing the commits.

I arrived at this solution after trying a more complex triplet-based approach and seeing that plain text files + HTTP calls work just as well and are human (and AI) friendly.

The main disadvantage is having unstructured data, but for content that fits inside the LLM context window, it doesn’t matter practically speaking. And even then, when context starts being the limiting factor, you can start segmenting by categories or start using embeddings.

yard2010•1mo ago
I'm curious, how do you find your passport number in Telegram? Do you embed every message and then do cosine similarity to find the message that is relevant to the question? Please write about your system more :)
barrenko•1mo ago
Not OP, but I think he literally supplies the "vanilla" .md file to the LLM and prompts.
dvrp•1mo ago
Yes! This in essence.

Specifically, it’s a file that contains a list of Entity-Attribute-Value assertions in triplets.

It’s called “FACTS.md” and each line represents a fact. Such as “<OP>, PASSPORT_NUMBER, <VALUE>”

Then put it in context, ask question, and then use Telegram API and suddenly I have a “Private ChatGPT” that’s aware of my filesystem, can run my own binaries/tools, and has access to a private document store.

It gets cool once you add function calling to open images on demand (or any type of file) with vision capabilities/OCR and you start running shell commands and combining that with many media types from Telegram.

Funny enough, I called the project “COO” initially. Been thinking of writing up something about it.

I think it’s a no brainer and I’m confident OpenAI, Claude, and Notion will go there.

In the meantime, I have good-ol’ vi, .md/.txt, and HTTP/SMTP!
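The FACTS.md approach described above can be sketched roughly as follows. This is a minimal illustration, not the commenter's actual code: the `Fact` type, `parse_facts`, `build_prompt`, and the sample file contents are all hypothetical, and the step that sends the prompt to an LLM or to Telegram is left out.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    entity: str
    attribute: str
    value: str


def parse_facts(text: str) -> list[Fact]:
    """Parse one 'ENTITY, ATTRIBUTE, VALUE' assertion per non-empty line."""
    facts = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and markdown headings
        # Split on the first two commas only, so a value may contain commas.
        parts = [p.strip() for p in line.split(",", 2)]
        if len(parts) == 3:
            facts.append(Fact(*parts))
    return facts


def build_prompt(facts: list[Fact], question: str) -> str:
    """Place every fact in the context, followed by the question."""
    lines = [f"{f.entity}, {f.attribute}, {f.value}" for f in facts]
    return "Known facts:\n" + "\n".join(lines) + f"\n\nQuestion: {question}"


# Hypothetical file contents; real values would live in FACTS.md.
sample = """# FACTS.md
alice, PASSPORT_NUMBER, X0000000
alice, HOME_CITY, Lisbon, Portugal
"""
facts = parse_facts(sample)
prompt = build_prompt(facts, "What's my passport number?")
print(prompt)
```

Because the whole file goes into the context, this only works while the facts fit inside the model's context window, which matches the limitation mentioned earlier in the thread.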

unshavedyak•1mo ago
I really miss the ease of Telegram bots. It's so fun to write stuff like this.
badmonster•1mo ago
This is such a cool idea, would love to hack a project sometime with multi-media types and map to knowledge graphs and feed that to agents :)
hailruda•1mo ago
I’d appreciate a writeup! I’d like to implement this myself, maybe add reminders.
sgt101•1mo ago
Maybe ask the LLM to extract facts from the documents as datalog assertions and then use a reasoner/llm tool to answer the question?
th0ma5•1mo ago
People probably don't discuss the problems of an open-world knowledge graph enough. They are essentially the same class of problems as spam filters. Using an open language model to produce a graph doesn't create a closed-world graph by definition. This confusion, as well as a general avoidance of measuring actual productivity outcomes, seems like an insurmountable problem in the knowledge world now, and I feel language itself is failing at times to educate on these issues.
lyu07282•1mo ago
They don't even do any entity disambiguation, so the resulting graph won't be very useful. I also saw people then use a different prompt to generate a Cypher query from user input for RAG; I can't imagine that actually works well. It would make a little more sense if they then used knowledge graph embeddings, but I'm not sure if Neo4j supports that.
badmonster•1mo ago
Entity resolution is a great topic and there’s lots of research in this area. This is something I’m looking into next. I’m thinking about:

- Metadata-based matching (I’ve done that with search systems in the past)

- Embedding-based matching (false positives are a definite consideration)

- Using the knowledge graph itself to do entity resolution before feeding the entities to the graph

Add a human in the loop to guide entity resolution.

What do you think? Would love to learn your thoughts:)
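As a rough illustration of the matching step in the options above, here is a small sketch that merges entity mentions by similarity. Note the swap: it uses plain string similarity from Python's `difflib` as a stand-in for embedding similarity, and the `normalize`, `similarity`, and `resolve` helpers and the 0.85 threshold are assumptions for illustration only.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, drop simple punctuation, and collapse whitespace."""
    return " ".join(name.lower().replace(".", "").replace(",", "").split())


def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1]; a stand-in for cosine similarity of embeddings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def resolve(names: list[str], threshold: float = 0.85) -> dict[str, str]:
    """Map each mention to the first earlier mention that is similar enough."""
    canonical: list[str] = []
    mapping: dict[str, str] = {}
    for name in names:
        match = next((c for c in canonical if similarity(name, c) >= threshold), None)
        if match is None:
            canonical.append(name)  # no close match: treat as a new entity
            match = name
        mapping[name] = match
    return mapping


mentions = ["CocoIndex", "cocoindex", "Coco Index", "Neo4j", "Neo4J"]
mapping = resolve(mentions)
print(mapping)
```

The threshold is where false positives come from: set it too low and distinct entities get merged, which is one place a human in the loop could review borderline matches.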

badmonster•1mo ago
btw, Neo4j supports vector properties and building vector index out of it. It's supported in cocoindex to configure a vector index: https://cocoindex.io/docs/ops/storages#neo4j

Neo4j also supports building embedding leveraging more information in the graph in addition to single node's property: https://neo4j.com/docs/graph-data-science/current/machine-le... (It's hard to incrementally compute them, but users can still compute them after the graph is built)

looking forward to learning your thoughts :)

Frummy•1mo ago
Now imagine it with theorems as entities and lean proofs as relationships
manishsharan•1mo ago
Why not merely upload all relevant documents into Gemini? Split the knowledge into smaller knowledge domains and have agents ( backed by Gemini) for each domain?
badmonster•1mo ago
could you clarify a bit how to "split the knowledge into smaller knowledge domains" in this case? does that involve some semantic extraction, or is it done more manually?
8thcross•1mo ago
Building knowledge graphs (GraphRAG) is obsolete from an academic and technical point of view. LLMs are getting better with built-in graph-network-capable algorithms like SONAR and knowledge embeddings. Like someone said, just use NotebookLM instead. But they are useful in a corporate setup when the infrastructure, teams and skills are lagging by years.
timfrazer•1mo ago
Could you provide some academic proof? From what I've read this isn’t true, so I’d be interested to see what you’re referring to.
phren0logy•1mo ago
My use case is for documents related to a legal issue, where a foundation model has no knowledge of any of the participants or particular issues. There are many, many such situations. Your statement is ignorant and overly broad.
esafak•1mo ago
That's only true if you can train or fine tune the LLM. If you are merely a user augmenting it with RAG then GraphRAG is perfectly viable.
ianbicking•1mo ago
I feel like I should understand the purpose of knowledge graphs, but I just... don't.

Like the example "CocoIndex supports Incremental Processing" becomes the subject/predicate/object triple (CocoIndex, supports, Incremental Processing)... so what? Are you going to look up "Incremental Processing" and get a list of related entities? That's not a term that is well enough defined to be meaningful across a variety of subjects. I can incrementally process my sandwich by taking small bites.

I guess you could actually expand "Incremental Processing" to some full definition. But then it's not really a knowledge graph because the only entity ever associated with that new definition will be CocoIndex, and you are back to a single sentence that contains the information, you've just pretended it's structured. ("Supports" is hardly a well-defined term either!)

I can _kind of_ see how knowledge graphs can be used for limited relationships. If you want to map companies to board members, and board members to family members, etc. Very clearly and formally defined entities (like a person or company), with clearly defined relationships (board member, brother, etc). I still don't know how _useful_ the result is, but at least I can understand the validity of the model. But for everything else... am I missing something?

alexchantavy•1mo ago
IMO knowledge graphs are a must have for security use-cases because of how well they handle many-to-many relationships. Who has access to read each storage bucket? Via which IAM policies? Who owns each bucket? What is the shortest possible role-assumption path available from internet-exposed compute instances to read this bucket? What is the effective blast radius from a vulnerability that allows remote code execution on an internet exposed compute instance?

Or, I have a docker container image that is built from multiple base images owned by different teams in my organization. Who is responsible for fixing security vulnerabilities introduced by each layer?

We really could model these as tables but getting into all those joins makes things so cumbersome. Plus visualizing these things in a graph map is very compelling for presentation and persuading stakeholders to make security decisions.
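The "shortest role-assumption path" question above is a plain graph search once the relationships are stored as edges. A minimal sketch using breadth-first search follows; the node names, relation labels, and the `shortest_path` helper are hypothetical and not taken from any particular tool.

```python
from collections import deque


def shortest_path(edges, start, goal):
    """Breadth-first search over directed (source, relation, target) edges."""
    adjacency = {}
    for source, relation, target in edges:
        adjacency.setdefault(source, []).append((relation, target))

    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for _relation, target in adjacency.get(node, []):
            if target not in seen:
                seen.add(target)
                queue.append(path + [target])
    return None  # goal is not reachable from start


# Hypothetical access graph: an internet-exposed VM, roles, and buckets.
edges = [
    ("vm-public", "CAN_ASSUME", "role-build"),
    ("role-build", "CAN_ASSUME", "role-admin"),
    ("role-admin", "CAN_READ", "bucket-secrets"),
    ("vm-public", "CAN_ASSUME", "role-logs"),
    ("role-logs", "CAN_READ", "bucket-logs"),
]
path = shortest_path(edges, "vm-public", "bucket-secrets")
print(path)
```

The same traversal written against relational tables would need one self-join per hop, which is the cumbersome part the comment refers to.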

nightfly•1mo ago
Are there existing tools that model security stuff like this? For a few years I've wanted to build a model like this and search for vulnerabilities using something like GOAP (Goal-Oriented Action Planning)
alexchantavy•1mo ago
I built an open source one (https://github.com/cartography-cncf/cartography) and am building commercial support around it (https://subimage.io)
badmonster•1mo ago
In my understanding, there are two kinds of use cases that can potentially be explored with knowledge graphs.

- Structured data - this is probably closer to the use case you mention

- Unstructured data, where you extract relationships and build a KG with natural language understanding - which is what this article is trying to explore. Here is a paper discussing this: https://arxiv.org/abs/2409.13731

In general it is an alternative way to establish connections between entities easily. And these relationships could help with discovery, recommendation and retrieval. Thanks @alexchantavy for sharing use-cases in security.

Would love to learn more from the community :)

vintermann•1mo ago
Well, in my hobby of genealogy it's all about building a knowledge graph. Not just the obvious of who are children to which parents etc, but also where great-grandpa was in 1920.

People reach for a database, and of course you need that, but for one thing the data certainly doesn't always come in a nice tabular format, and moreover you often don't know which piece of knowledge will become relevant for a question you care about - maybe two people worked together at the Kings Bay Mining Company, and then there was the accident in 1962, but uncle Hans was inspector at Wilhelmsen etc. Often you make progress because you remember niche geographical or historical information.

visarga•1mo ago
> I feel like I should understand the purpose of knowledge graphs, but I just... don't.

Like RAG, it decouples KG size from context size, but unlike RAG, a KG offers deduplication and relational traversal. Some searches based on just similarity or keywords fail when the relation is functional. Both KG and RAG work better when the LLM is planning the search process, doing multiple searches, basing each one off the previous one. In the last few months LLMs have gotten great at exploration with search tools.

I implemented my own KG recently and I put both search and node generation in the hands of the LLM, as MCP tools. The cool trick is that when I instruct the LLM to generate a node it links to previous nodes using inline references (like @45). So I get the graph structure for free. I think coupling RAG with a KG allows for both breadth and precise control. The RAG is assimilating unstructured chunks, the KG is mapping the corpus. All done with human in the loop to guide the process.
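The inline-reference trick above (nodes that mention earlier nodes as `@45`) can be sketched like this. It is only an assumption about how such a store might look: the `nodes` dictionary, the `build_edges` helper, and the rule of ignoring dangling references are all hypothetical.

```python
import re

# Matches inline references such as "@45".
REF = re.compile(r"@(\d+)")


def build_edges(nodes: dict[int, str]) -> list[tuple[int, int]]:
    """Derive (from_id, to_id) edges from inline @id references in node text."""
    edges = []
    for node_id, text in nodes.items():
        for ref in REF.findall(text):
            target = int(ref)
            # Ignore self-references and references to nodes that don't exist.
            if target in nodes and target != node_id:
                edges.append((node_id, target))
    return edges


# Hypothetical node texts, as an LLM might generate them.
nodes = {
    45: "CocoIndex supports incremental processing.",
    46: "Incremental processing (see @45) avoids recomputing unchanged documents.",
    47: "Combining @45 and @46 keeps the graph fresh; @99 is a dangling reference.",
}
edges = build_edges(nodes)
print(edges)
```

The graph structure falls out of the text itself, so no separate edge-creation step is needed when a node is written.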

zackmorris•1mo ago
For context, the subject-predicate-object pattern is known as a semantic triple or Resource Description Framework (RDF) triple:

https://en.wikipedia.org/wiki/Semantic_triple

They're useful for storing social network graph data, for example, and can be expressed using standards like Open Graph and JSONAPI:

https://ogp.me

https://jsonapi.org

I've stored RDF triples in database tables and experimented with query concepts from neo4j:

https://neo4j.com/docs/getting-started/data-modeling/tutoria...

These are straightforward to translate to SQL but the syntax can get messy due to not always having foreign keys available and hitting limitations with polymorphic relationships. Some object-relational mapping (ORM) frameworks help with this:

https://laravel.com/docs/12.x/eloquent-relationships#polymor...

I feel that document-oriented databases like MongoDB jumped the gun a bit, and would have preferred to have had graph-oriented or key-value-oriented databases providing row/column/document oriented queries and views. Going the other way feels a bit kludgy to me:

https://www.mongodb.com/resources/basics/databases/mongodb-g...

Basically Set Theory internally with multiple query languages externally and indexed by default.

Oh and have all writes generate an event stream like Firebase does so we can easily build reactive apps.
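Storing triples in a database table and translating a graph traversal to SQL, as described above, can be sketched with Python's built-in `sqlite3` module. The table layout, the sample people and companies, and the two-hop query are illustrative assumptions, not the commenter's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
conn.execute("CREATE INDEX idx_spo ON triples (subject, predicate, object)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("alice", "board_member_of", "Acme"),
        ("bob", "board_member_of", "Acme"),
        ("bob", "brother_of", "carol"),
        ("carol", "board_member_of", "Globex"),
    ],
)

# Two-hop traversal: for each Acme board member, find a sibling and the
# company that sibling sits on the board of. Each hop is one self-join.
rows = conn.execute(
    """
    SELECT m.subject, s.object, c.object
    FROM triples AS m
    JOIN triples AS s
      ON s.subject = m.subject AND s.predicate = 'brother_of'
    JOIN triples AS c
      ON c.subject = s.object AND c.predicate = 'board_member_of'
    WHERE m.predicate = 'board_member_of' AND m.object = 'Acme'
    """
).fetchall()
print(rows)
```

Every extra hop adds another self-join on the same table, which is where the syntax starts to get messy compared with a path pattern in a graph query language.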

krallistic•1mo ago
It's quite funny to see that LLMs revived interest in knowledge graphs, reasoning, triple stores, etc., since (at a high level) both are often pitched at the same goal (e.g. ask an AI about a topic...).
StopDisinfo910•1mo ago
If you think about it, it makes a lot of sense. The main impediment to the usefulness of knowledge graphs was always how to build them, as turning unstructured data into structured data at scale is difficult. Now that's something LLMs are pretty good at.
Xmd5a•1mo ago

    You have no graphs, no concepts, no nothing
    [...]
    You never understood the meaning of concept
    Lyrics are full of depth and ideas connect
    [...]
    You can only dream to write like I write, I might
    Ignite, confuse and leave you blinded by the light
    'Cause I been working on graphs, concepts and all of that
    Making it difficult for those who might try to follow that

Mark B & Blade – The Way It Has To Be

https://www.youtube.com/watch?v=l-rbtCM0g6c

gitroom•1mo ago
[flagged]
kk58•1mo ago
What's the angle you're thinking?