frontpage.
Temporal: A nine-year journey to fix time in JavaScript

https://bloomberg.github.io/js-blog/post/temporal/
468•robpalmer•8h ago•160 comments

Many SWE-bench-Passing PRs would not be merged

https://metr.org/notes/2026-03-10-many-swe-bench-passing-prs-would-not-be-merged-into-main/
86•mustaphah•2h ago•9 comments

Don't post generated/AI-edited comments. HN is for conversation between humans.

https://news.ycombinator.com/newsguidelines.html#generated
2403•usefulposter•4h ago•904 comments

Making WebAssembly a first-class language on the Web

https://hacks.mozilla.org/2026/02/making-webassembly-a-first-class-language-on-the-web/
358•mikece•19h ago•136 comments

Personal Computer by Perplexity

https://www.perplexity.ai/personal-computer-waitlist
91•josephwegner•5h ago•63 comments

Show HN: I built a tool that watches webpages and exposes changes as RSS

https://sitespy.app
137•vkuprin•7h ago•39 comments

Show HN: Autoresearch_at_home – SETI_at_home but for LLM training

https://www.ensue-network.ai/autoresearch
8•austinbaggio•23m ago•3 comments

Google closes deal to acquire Wiz

https://www.wiz.io/blog/google-closes-deal-to-acquire-wiz
208•aldarisbm•8h ago•139 comments

Britain is ejecting hereditary nobles from Parliament after 700 years

https://apnews.com/article/uk-house-of-lords-hereditary-peers-expelled-535df8781dd01e8970acda1dca...
125•divbzero•2h ago•102 comments

The MacBook Neo

https://daringfireball.net/2026/03/the_macbook_neo
348•etothet•12h ago•589 comments

I was interviewed by an AI bot for a job

https://www.theverge.com/featured-video/892850/i-was-interviewed-by-an-ai-bot-for-a-job
93•speckx•5h ago•99 comments

Meticulous (YC S21) is hiring to redefine software dev

https://jobs.ashbyhq.com/meticulous/3197ae3d-bb26-4750-9ed7-b830f640515e
1•Gabriel_h•2h ago

BitNet: 100B Param 1-Bit model for local CPUs

https://github.com/microsoft/BitNet
287•redm•11h ago•146 comments

Preliminary data from a longitudinal AI impact study

https://newsletter.getdx.com/p/ai-productivity-gains-are-10-not
20•donutshop•2h ago•6 comments

Show HN: Klaus – OpenClaw on a VM, batteries included

https://klausai.com/
110•robthompson2018•7h ago•63 comments

Entities enabling scientific fraud at scale (2025)

https://doi.org/10.1073/pnas.2420092122
248•peyton•10h ago•179 comments

5,200 holes carved into a Peruvian mountain left by an ancient economy

https://newatlas.com/environment/5-200-holes-peruvian-mountain/
83•defrost•1d ago•44 comments

Building Better Country Selects

https://talysto.com/blog/building-better-country-selects/
6•dlrush•1h ago•2 comments

Against vibes: When is a generative model useful

https://www.williamjbowman.com/blog/2026/03/05/against-vibes-when-is-a-generative-model-useful/
35•takira•1d ago•2 comments

How we hacked McKinsey's AI platform

https://codewall.ai/blog/how-we-hacked-mckinseys-ai-platform
375•mycroft_4221•13h ago•150 comments

Physicist Astrid Eichhorn is a leader in the field of asymptotic safety

https://www.quantamagazine.org/where-some-see-strings-she-sees-a-space-time-made-of-fractals-2026...
103•tzury•8h ago•15 comments

Swiss e-voting pilot can't count 2,048 ballots after decryption failure

https://www.theregister.com/2026/03/11/swiss_evote_usb_snafu/
140•jjgreen•10h ago•319 comments

Show HN: Open-source browser for AI agents

https://github.com/theredsix/agent-browser-protocol
95•theredsix•9h ago•29 comments

Launch HN: Prism (YC X25) – Workspace and API to generate and edit videos

https://www.prismvideos.com
30•aliu327•7h ago•16 comments

Can the Dictionary Keep Up?

https://www.thenation.com/article/culture/stefan-fatsis-dictionary-history/
8•pepys•1d ago•3 comments

Launch HN: Sentrial (YC W26) – Catch AI agent failures before your users do

https://www.sentrial.com/
22•anayrshukla•7h ago•10 comments

Show HN: Satellite imagery object detection using text prompts

https://www.useful-ai-tools.com/tools/satellite-analysis-demo/
34•eyasu6464•2d ago•13 comments

What Is a Tort?

https://harvardlawreview.org/print/vol-139/what-is-a-tort/
22•bookofjoe•3h ago•24 comments

Fungal Electronics (2021)

https://arxiv.org/abs/2111.11231
57•byt3h3ad•6h ago•6 comments

Building a TB-303 from Scratch

https://loopmaster.xyz/tutorials/tb303-from-scratch
207•stagas•4d ago•82 comments

Many SWE-bench-Passing PRs would not be merged

https://metr.org/notes/2026-03-10-many-swe-bench-passing-prs-would-not-be-merged-into-main/
84•mustaphah•2h ago

Comments

love2read•1h ago
Edit: Never mind
refulgentis•1h ago
Well, no: one of the first things it says is that reviewers were blind to human vs. AI.
yorwba•57m ago
The comment you're replying to is talking about a hypothetical scenario.

In any case, the blinding didn't stop Reviewer #2 from calling out obvious AI slop. (Figure 5)

collabs•41m ago
I feel like I don't have the context for this conversation. If slop is obvious as slop, I feel like we should block it.

If you look at the comment, it just restates what the code following it does. It doesn't matter whether a human or a machine wrote it: it is useless. It is actually worse than useless, because anyone who needs to change the code now has to change two things. In that sense, you have doubled the work for anyone who touches the code after you, and for what benefit?

zozbot234•29m ago
The point is that AI models do these kinds of things all the time. They're not really all that smart or intelligent; they just replicate patterns or boilerplate and then iterate until it sort of appears to work properly.
spartanatreyu•23m ago
> appears to work

That "appears" is doing a lot of heavy lifting.

The code working isn't what's being selected for.

The code looking convincing IS what is being selected for.

That distinction is massive.

nubg•41m ago
> mid-2024 agents

Is this a post about AI archeology?

varispeed•24m ago
Do these benchmarks make any sense? I tried a few local models that seem to score well on SWE-bench, but the results were pure rubbish. (For instance, MiniMax-M2.5 at 128GB from unslothed was completely unusable.)
languid-photic•6m ago
makes sense! we wrote something yesterday about the weaknesses of test-based evals like swe-bench [1]

they are definitely useful but they miss the things that are hard to encode in tests, like spec/intent alignment, scope creep, adherence to codebase patterns, team preferences (risk tolerance, etc)

and those factors are really important. which means that test-evals should be relied upon more as weak/directional priors than as definitive measures of real-world usefulness

[1] https://voratiq.com/blog/test-evals-are-not-enough/