frontpage.

Why most AI evals would miss the Linear sales email failure

https://tenureai.dev/writing/why-most-ai-evals-would-miss-the-linear-sales-email-failure/
4•jflynt76•1h ago

Comments

evil-olive•51m ago
> GroundEval is built around that question...

> This is the same distinction GroundEval makes for question answering agents.

> GroundEval treats agent behavior as something that can be tested against a state contract.

> That is the class of failure GroundEval is designed to catch.

this is an ad shaped like a blog post

jflynt76•37m ago
For what it's worth, I didn't describe it as anything; just posted the link. It's a paper with open code, no product behind it.

We are witnessing the slow death of the prestige career

https://www.theguardian.com/commentisfree/2026/jun/22/consulting-ai-prestige-careers
1•bookofjoe•1m ago•0 comments

"ChatGPT is I presume broken"

https://old.reddit.com/r/ChatGPT/comments/1ucs6ni/chatgpt_is_i_presume_broken/
1•beatthatflight•2m ago•0 comments

Humanoid Robot Begs for Electricity Money on China Streets with QR Code

https://yipzap.com/humanoid-robot-begs-for-electricity-money-on-china-streets-with-qr-code-the-vi...
3•noida•3m ago•0 comments

GLM-5.2 is above GPT-5.5 in new agentic knowledge work eval

https://artificialanalysis.ai/articles/aa-briefcase
2•declanjackson•4m ago•0 comments

AI Doesn't Replace On-Call Judgment

https://blog.sntxrr.dev/ai-doesnt-replace-on-call-judgment/
1•mooreds•7m ago•0 comments

Not a Task Manager

https://columns.app/
1•inloopwetrust•13m ago•1 comment

Why eval startups fail (2025)

https://thomasliao.com/eval-startups
1•jxmorris12•14m ago•0 comments

The PostgreSQL C Dialect

https://wiki.postgresql.org/wiki/The_PostgreSQL_C_Dialect
1•plaur782•17m ago•0 comments

Trump signs executive orders to drive development of quantum computer by 2028

https://finance.yahoo.com/markets/article/trump-signs-executive-orders-to-drive-development-of-co...
3•wslh•18m ago•0 comments

Server Survival

https://github.com/pshenok/server-survival
1•skogstokig•24m ago•0 comments

Ask HN: How do you make AI writing usable?

1•david_shi•24m ago•1 comment

People around the world see a winner on AI – and it's not the US

https://www.politico.com/news/2026/06/15/people-around-the-world-see-a-winner-on-ai-and-its-not-t...
1•yogthos•24m ago•1 comment

The Value of Standards-Compliant Authentication

https://fusionauth.io/articles/oauth/value-standards-compliant-authentication
1•mooreds•27m ago•0 comments

ContextMaestro – Curated Engineering Feed

https://www.contextmaestro.com/
1•jazzboss•31m ago•0 comments

Knowledge Catalog – universal context engine for agents

https://github.com/GoogleCloudPlatform/knowledge-catalog
1•modinfo•33m ago•0 comments

Google Investing in 'Backrooms' Studio A24 in AI research partnership

https://www.wsj.com/tech/ai/google-investing-in-backrooms-studio-a24-e7585ebe
1•jaredwiener•36m ago•0 comments

Citroën Ami Is an Ultra Affordable EV (2020)

https://insideevs.com/news/401218/citroen-ami-deliveries-june/
1•rawgabbit•39m ago•0 comments

AI's PR Problem

https://blog.dshr.org/2026/05/ais-pr-problem.html
4•linsomniac•41m ago•0 comments

Frozen Reformer

https://arunc.dev/essays/frozen-reformer/
1•arunc•41m ago•0 comments

Ask HN: How do you make the LLM generate good code?

1•bjourne•44m ago•1 comment

Why AI Is a Bubble

https://federicozebele.substack.com/p/this-is-why-ai-is-a-bubble-and-what
4•stanislavb•44m ago•2 comments

Europe must choose between AI and climate goals, data center lobby says

https://www.politico.eu/article/europe-choose-ai-climate-goals-data-center-chief-warns/
4•cdrnsf•46m ago•1 comment

AI Is Not a Tool

https://theconvivialsociety.substack.com/p/your-ai-is-not-a-tool
4•longdefeat•46m ago•0 comments

Robots will replace 700k delivery workers 'sooner or later' warns JD.com boss

https://www.ft.com/content/465635e2-633b-4311-afe5-9b3bff8c9240
3•momentmaker•48m ago•2 comments

The AI shift in cyber risk: why leaders must act now

https://www.cyber.gov.au/about-us/view-all-content/news/five-eyes-cyber-security-agencies-statement
2•Khaine•49m ago•0 comments

Kya is hiring an AI/ML Engineer

https://www.kyahq.com/careers/software-engineer-ai-ml
1•Johnall_n•49m ago•1 comment

Bipartite Matching Is in NC

https://scottaaronson.blog/?p=9851
2•amichail•51m ago•0 comments

Q.js: modern front-end framework for 2026. No build scripts unlike React et al.

https://www.npmjs.com/package/@qbix/q.js
1•EGreg•52m ago•1 comment

Hyperbolic Discounting

https://en.wikipedia.org/wiki/Hyperbolic_discounting
2•rzk•52m ago•0 comments

Report: Kennedy Space Center not ready for era of super heavy rockets

https://arstechnica.com/space/2026/06/report-kennedy-space-center-not-ready-for-era-of-super-heav...
1•voxadam•54m ago•0 comments