
Use the Accept Header to Serve Markdown Instead of HTML to LLMs

https://www.skeptrune.com/posts/use-the-accept-header-to-serve-markdown-instead-of-html-to-llms/
74•hahnbee•4mo ago

Comments

skeptrune•4mo ago
There was a lot of conversation about this on X over the last couple of days. Sending an `Accept` request header that includes "text/markdown, text/plain" has emerged as kind of a new standard for AI agents requesting content, so that they don't burn unnecessary inference compute processing HTML attributes and CSS.

- https://x.com/bunjavascript/status/1971934734940098971

- https://x.com/thdxr/status/1972421466953273392

- https://x.com/mintlify/status/1972315377599447390

hahnbee•4mo ago
keep us posted on how this change impacts your GEO!
burcs•4mo ago
Really cool idea

Humans get HTML, bots get markdown. Two tiny tweaks I’d make...

Send Vary: Accept so caches don’t mix Markdown and HTML.

Expose it with a Link: …; rel="alternate"; type="text/markdown" so it’s easy to discover.
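
A minimal sketch of those two tweaks in Python, assuming the server already knows the URL of the Markdown variant. The function name `negotiated_headers` and the example path are hypothetical and not taken from any framework:

```python
def negotiated_headers(content_type: str, markdown_url: str) -> dict:
    """Build response headers for a page that exists as both HTML and Markdown."""
    return {
        "Content-Type": content_type,
        # Tell caches that the body depends on the request's Accept header,
        # so a cached Markdown response is never served to an HTML client.
        "Vary": "Accept",
        # Advertise the Markdown variant so clients can discover it.
        "Link": f'<{markdown_url}>; rel="alternate"; type="text/markdown"',
    }


# Hypothetical page path, for illustration only.
headers = negotiated_headers("text/html; charset=utf-8", "/posts/example.md")
for name, value in headers.items():
    print(f"{name}: {value}")
```

Any framework that lets you set response headers should be able to take the returned dict as-is.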

yawaramin•4mo ago
This person hypermedias
Rohansi•4mo ago
Would be nice for humans to get the markdown version too. Once it's rendered you get a clean page.
captn3m0•4mo ago
I’ve been asking for browser-native markdown support for years now. A clean web is not that far, if browsers support more than just HTML.
lelanthran•4mo ago
> I’ve been asking for browser-native markdown support for years now. A clean web is not that far, if browsers support more than just HTML.

You can always do the markdown -> DOM conversion on the client. Sure, there's a bit of latency there, but it means easier deployment (no build step involving pandoc or similar).

Browser-native markdown support would be better though; you'd get the ability to do proper contenteditable divs with bold, italic, etc. done via markdown.

captn3m0•4mo ago
To get broad support from the server side, you’ll need to showcase high browser support. We need Wordpress and Wikipedia and Ghost to support this, and that won’t happen without native browser support.
lelanthran•4mo ago
> We need Wordpress and Wikipedia and Ghost to support this, and that won’t happen without native browser support.

It can. Unlikely but possible. A good first step would be to have a well-written web component to be used like this: `<markdown>...</markdown>`, with no support at all for a build-step. The .js file implementing this should be included directly in the `<head>`.

If that gets traction (unlikely, but possible) then the standards would sooner or later introduce a tag native to the browser that does the same thing.

xigoi•4mo ago
Markdown is not standardized, so every browser would render the page differently and you’d get the same problems as with pre-standard HTML.
captn3m0•4mo ago
Browsers can take standards positions on CommonMark extensions and decide on a baseline that goes into the W3C spec? It will just converge on the lowest common denominator, and that's good enough for the vast majority of content-reading use cases.
foxfired•4mo ago
I think there is a problem of incentive here. When we made our websites Search Engine Optimized, the incentive was for google to understand our content, and bring traffic our way. When you make your content optimized for LLM, it only improves their product, and you get nothing in return.
skeptrune•4mo ago
This isn't true. ChatGPT and Gemini link to sites in a similar way to how search engines have always done it. You can see the traffic show up in ahrefs or semrush.
nozzlegear•4mo ago
I had a call with a new user for a SaaS product that I sell recently. During the call he mentioned that he found it by typing what he was looking for into Gemini, and it recommended my app. I don't do anything special for llms, and the public-facing part of the website has been neglected for longer than I like to admit, so I was delighted. I had never considered that AI could send new users to me rather than pull them away. It felt like I'd hacked the system somehow, skipped through all the SEO best practices of yesteryear and had this benevolent bullshit machine bestow a new user on me at the cost of nothing.
skeptrune•4mo ago
Exactly! It can actually be a positive thing, might as well make it easy for LLMs to read.
hahnbee•4mo ago
> benevolent bullshit machine bestow a new user on me at the cost of nothing

that's awesome. i love this line.

gl-prod•4mo ago
How many users actually visit these links?
bastawhiz•4mo ago
I usually do, as a data point.
Larrikin•4mo ago
I've found it's extremely important to, because you will get some results that are already AI slop optimized to show in LLM research searches.
foxfired•4mo ago
Yes, they show a tiny link behind a collapsed menu that very few people bother clicking. For example, my blog used to take the first spot on Google for some queries. Now, with AI overviews, there is a sharp drop in traffic. However, it still showed higher impressions than ever. This means I'm appearing in search, even in the AI overview; it's just that very few people click.

As of last week, impressions have also dropped. Maybe people not clicking on my links anymore is the result?

ako•4mo ago
Maybe it's about adding knowledge to LLMs, and not how many people read your website? I would be very happy if I had a simple way to get my insights, knowledge and best practices into the next version of an LLM so I have a way to improve it.
skeeter2020•4mo ago
and like Google - but much, much worse - they bring back enough content to keep users in the chat interface; they never visit your site.
naet•4mo ago
I do dev work for the marketing dept of a large company and there is a lot of talk about optimizing for LLMs/AI. ChatGPT can drive sales in the same way a blog post indexed by Google can.

If a customer asks the AI what product can solve their problem and it replies with our product that is a huge win.

If your business is SEO spam with online ads, ChatGPT might eat it. But if your business is selling some product, ChatGPT might help you sell it.

krainboltgreene•4mo ago
Neat up until the "customer ask" is "What, in X space, is the worst product you can purchase?" Something you have no ability to manipulate.
charcircuit•4mo ago
> Something you have no ability to manipulate.

What makes you think this?

krainboltgreene•4mo ago
Because I built an LLM, I know how they work.
NaomiLehman•4mo ago
just add an arbiter layer on top for the possibility of advertising and modifying the output. not rocket science
yawaramin•4mo ago
Why would a customer ask that? If I'm looking for something, why would I waste time with the worst version of it? I'd just go straight for the best.
Vespasian•4mo ago
That is at most temporary. I expect that within the next 5 years "partner products" and "LLM-optimized content" will take the place of SEO.

The economic dynamics did not change and the methods will adapt.

Why wouldn't Google sell advertisers a prominent spot in the AI summary? That's their whole deal. Why wouldn't OpenAI do the same with (free) users?

krainboltgreene•4mo ago
Because that’s not how LLMs work.
fouc•4mo ago
They have many ways to manipulate the LLM's results, for example they can use a lot of the same mechanisms that are used to block or filter out inappropriate material.
krainboltgreene•4mo ago
Given that there are entire forums devoted to successfully doing just that (easily) my point stands.
monkeyelite•4mo ago
And what that means is the usefulness of LLMs in recommending products is about to jump off a cliff.
whatevaa•4mo ago
This is what everybody should have expected.
monkeyelite•4mo ago
I think it’s going to be even worse - companies are going to go to ChatGPT with lawyers and say you are making false/unfair claims about our product. We should be able to give it this copy with correct information to consume.
userbinator•4mo ago
And neither of those two ultimately helps the humans who are actually looking for something. You have a finite amount of time to spend on optimising for humans, or for search engines (and now LLMs), and unfortunately many chose the latter and it's just led to plenty of spam in the search results.

Yes, SEO can bring traffic to your site, but if your visitors see nothing of value, they'll quickly leave.

CGamesPlay•4mo ago
But software documentation is a prime example of when the incentives don't have any problems. I want my docs to be more accessible to LLMs, so more people use my software, so my software gets more mindshare, so I get more paying customers on my enterprise support plan.
skeptrune•4mo ago
Oh hey, I work at Mintlify! We shipped this as a default feature for all of our customers.
foxyv•4mo ago
If you are selling advertising, then I agree. However, if you are selling a product to consumers then no. Ask an LLM "What is the best refrigerator on the market." You will get various answers like:

> The best refrigerator on the market varies based on individual needs, but top brands like LG and Samsung are highly recommended for their innovative features, reliability, and energy efficiency. For specific models, consider LG's Smart Standard-Depth MAX™ French Door Refrigerator or Samsung's smart refrigerators with internal cameras.

Optimizing your site for LLM means that you can direct their gestalt thinking towards your brand.

shpx•4mo ago
You get to live in a world where other people are slightly more productive.
anabis•4mo ago
The OpenAI cookbook says LLMs understand XML better than Markdown text, so maybe serve that as well? It would need to be something more specified and structured, though, and not HTML.
onion2k•4mo ago
> OpenAI cookbook says LLMs understand XML better than Markdown text.

Yes, for prompts. Given how little XML is out on the public internet it'd be surprising if it also applies to data ingestion from web scraping functions. It'd be odd if Markdown works better than HTML to be honest, but maybe Markdown also changes the content being served e.g. there's no menu, header, or footer sent with the body content.

Kimitri•4mo ago
The concept is called content negotiation. We used to do this when we wanted to serve our content as XHTML to clients preferring that over HTML. It's nice to see it return as I always thought it was quite cool.
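
For anyone who hasn't done content negotiation by hand, here is a rough sketch of the server-side choice in Python. It is a simplification, not a full RFC 9110 implementation: it ranks entries by q-value only and ignores media-type specificity. The names `parse_accept` and `choose_variant` are made up, and falling back to HTML is an assumption.

```python
def parse_accept(header: str) -> list:
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        entries.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their header order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def choose_variant(header: str, available=("text/html", "text/markdown")) -> str:
    """Pick the variant to serve; the first item in `available` is the default."""
    for media_type, q in parse_accept(header):
        if q <= 0:
            continue
        if media_type in available:
            return media_type
        if media_type in ("*/*", "text/*"):
            return available[0]
    return available[0]


print(choose_variant("text/markdown, text/plain"))
print(choose_variant("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"))
```

With this ordering, a typical browser header still resolves to HTML, and a client that lists `text/markdown` first gets Markdown.
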
skeptrune•4mo ago
Agreed! I love that such a tried and true web standard is making a comeback because of AI.
pabs3•4mo ago
Content negotiation is also good for choosing human languages, unfortunately the browser interfaces for it are terrible.
klodolph•4mo ago
I don’t understand why the agents requesting HTML can’t extract text from HTML themselves. You don’t have to feed the entire HTML document to your LLM. If that’s wasteful, why not have a little bit of glue that does some conversion?
skeptrune•4mo ago
It's always better for the agent to have fewer tools and this approach means you get to avoid adding a "convert HTML to markdown" one which improves efficiency.

Also, I doubt most large-scale scrapers are running in agent loops with tool calls, so this is probably necessary for those at a minimum.

klodolph•4mo ago
This does not make any sense to me. Can you elaborate on this?

It seems “obvious” to me that if you have a tool which can request a web page, you can make it so that this tool extracts the main content from the page’s HTML. Maybe there is something I’m missing here that makes this more difficult for LLMs, because before we had LLMs, this was considered an easy problem. It is surprising to me that the addition of LLMs has made this previously easy, efficient solution somehow unviable or inefficient.

I think we should also assume here that the web site is designed to be scraped this way—if you don’t, then “Accept: text/markdown” won’t work.

hahnbee•4mo ago
If you have a website and you're optimizing it for GEO, you can't assume that the agents are going to have the glue. So as the person maintaining the website you implement as much of the glue as possible.
klodolph•4mo ago
That sounds completely backwards. It seems, again, obvious to me that it would be easier to add HTML->markdown converters to agents, given that there are orders of magnitude more websites out there compared to agents.

If your agent sucks so bad that it isn’t capable of consuming HTML without tokenizing the whole damn thing, wouldn’t you just use an agent that isn’t such a mess?

This whole thing kinda sounds crazy inefficient to me.

xg15•4mo ago
I don't think it's about including this as a tool, just as general preprocessing before the agent even gets the text.
skeptrune•4mo ago
Well that's what I implemented. There are markdown docs for every HTML file and the proxy decides to serve either markdown or HTML based on the Accept header.
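
A sketch of what such a decision could look like in Python. It is not the code from the post: the file layout (a `.md` file next to each `.html` file) and the name `variant_path` are assumptions for illustration, and the Accept check is a plain substring test, not full q-value parsing.

```python
def variant_path(request_path: str, accept: str) -> str:
    """Map a request path to the HTML or Markdown file to serve."""
    wants_markdown = "text/markdown" in accept.lower()
    # Normalise "/posts/hello/", "/posts/hello.html" and "/" to a bare page name.
    page = request_path.strip("/") or "index"
    for extension in (".html", ".md"):
        if page.endswith(extension):
            page = page[: -len(extension)]
    return page + (".md" if wants_markdown else ".html")


print(variant_path("/posts/hello", "text/markdown, text/plain"))
print(variant_path("/posts/hello", "text/html,*/*;q=0.8"))
```
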
xg15•4mo ago
I think GP meant on the client, i.e. agent side. As in, you could deploy this kind of proxy in a forward/non-reverse way inside the agent system, so the LLM always gets markdown, regardless of what the site supports.

There is no real reason to pass HTML with tags and all to the LLM - you can just strip the tags beforehand.
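
A sketch of that kind of preprocessing using only Python's standard library `html.parser`. It drops tags, attributes, scripts, and styles and keeps the visible text. `TextExtractor` and `strip_tags` are hypothetical names, and real pages would likely need more care (navigation removal, block spacing).

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring markup and non-content elements."""

    SKIPPED = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0  # > 0 while inside a skipped element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse runs of whitespace
        if self._skip_depth == 0 and text:
            self._chunks.append(text)

    def text(self) -> str:
        return " ".join(self._chunks)


def strip_tags(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


page = (
    "<html><head><style>p{color:red}</style></head><body>"
    '<nav class="flex p-4">Home</nav>'
    '<p class="text-lg">Hello <b>world</b></p>'
    "<script>alert(1)</script></body></html>"
)
print(strip_tags(page))  # prints "Home Hello world"
```

Chunks are joined with single spaces, so spacing around inline tags and punctuation is only approximate.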

simonw•4mo ago
Converting HTML into Markdown isn't particularly hard. Two methods I use:

1. The Jina reader API - https://jina.ai/reader/ - add r.jina.ai to any URL to run it through their hosted conversion proxy, eg https://r.jina.ai/www.skeptrune.com/posts/use-the-accept-hea...

2. Applying Readability.js and Turndown via Playwright. Here's a shell script that does that using my https://shot-scraper.datasette.io tool: https://gist.github.com/simonw/82e9c5da3f288a8cf83fb53b39bb4...
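
For method 1, building the proxied URL takes one line. The sketch below mirrors the shape of the example URL above (the reader prefix followed by the address without its scheme). `reader_url` is a hypothetical helper that only builds the string and does not fetch anything.

```python
READER_PREFIX = "https://r.jina.ai/"


def reader_url(url: str) -> str:
    """Prefix a page address with the reader proxy, dropping any scheme."""
    address = url.split("://", 1)[-1]
    return READER_PREFIX + address


# Hypothetical page address, for illustration only.
print(reader_url("https://www.example.com/posts/hello/"))
```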

skeptrune•4mo ago
I learned that the golang CLI[1] is the best through my work simplifying Firecrawl[2]. However, in this case I used one available through npmjs such that it would work with `npx` for the CF worker builds.

[1]: https://github.com/JohannesKaufmann/html-to-markdown

[2]: https://github.com/devflowinc/firecrawl-simple

osener•4mo ago
A lightweight alternative to Playwright, which starts a browser instance, is using an HTML parser and DOM implementation like linkedom.

This is much cheaper to run on a server. For example: https://github.com/ozanmakes/scrapedown

NathanFlurry•4mo ago
We’re doing this on https://rivet.dev now. I did not realize how much context bloat we had since we were using Tailwind.
skeptrune•4mo ago
It is crazy how badly Tailwind bloats HTML. Tradeoffs!
stebalien•4mo ago
Or one can just use semantic HTML; it's easy enough to convert semantic HTML into markdown with a tool like pandoc. That would also help screen readers, browser "reader modes", text-based web browsers, etc.
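
To show how little is needed when the markup is semantic, here is a toy converter built on Python's standard library `html.parser`, which stands in for pandoc here. It only understands headings (h1 to h3), paragraphs, list items, links, and bold/italic/code, and it silently drops text outside those blocks. `MarkdownSketch` and `to_markdown` are hypothetical names.

```python
from html.parser import HTMLParser


class MarkdownSketch(HTMLParser):
    """Toy converter for a small subset of semantic HTML."""

    BLOCK_PREFIX = {"h1": "# ", "h2": "## ", "h3": "### ", "p": "", "li": "- "}
    INLINE_MARK = {"strong": "**", "b": "**", "em": "*", "i": "*", "code": "`"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.blocks = []     # finished Markdown blocks
        self.current = None  # pieces of the block being built, or None
        self.href = ""       # target of the link being built

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_PREFIX:
            self.current = [self.BLOCK_PREFIX[tag]]
        elif self.current is None:
            return  # ignore anything outside a known block
        elif tag in self.INLINE_MARK:
            self.current.append(self.INLINE_MARK[tag])
        elif tag == "a":
            self.href = dict(attrs).get("href") or ""
            self.current.append("[")

    def handle_endtag(self, tag):
        if self.current is None:
            return
        if tag in self.BLOCK_PREFIX:
            # Close the block and collapse whitespace inside it.
            text = " ".join("".join(self.current).split())
            if text:
                self.blocks.append(text)
            self.current = None
        elif tag in self.INLINE_MARK:
            self.current.append(self.INLINE_MARK[tag])
        elif tag == "a":
            self.current.append(f"]({self.href})")

    def handle_data(self, data):
        if self.current is not None:
            self.current.append(data)


def to_markdown(html: str) -> str:
    parser = MarkdownSketch()
    parser.feed(html)
    parser.close()
    return "\n\n".join(parser.blocks)


sample = (
    "<h1>Docs</h1>"
    '<p>See the <a href="/api">API <em>guide</em></a>.</p>'
    "<ul><li>one</li><li>two</li></ul>"
)
print(to_markdown(sample))
```

Every block is separated by a blank line, so lists come out as loose lists; a real tool would also handle nesting, tables, and code blocks.
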
jauntywundrkind•4mo ago
Maybe adopt the existing Gemini Protocol instead? It's already a nice, very simple markdown-like format.

https://toffelblog.xyz/blog/gemini-overview/ https://news.ycombinator.com/item?id=23730408

https://gemini.circumlunar.space/ https://news.ycombinator.com/item?id=23042424

troyvit•4mo ago
FYI both the toffelblog and circumlunar.space links are broken with SSL errors.