Ask HN: Why would we care about "extended time horizons" and LLMs?

2•ozozozd•1h ago
Is it more impressive to take longer to answer 2 + 2? It’s not. The longer one takes, the less intelligent we would rate that person.

Somehow, for AI agents, taking longer is getting praise under the framing “maintaining attention over long time horizons.”

Have we collectively gone down to room-temperature IQs with COVID?

Why would the time dimension matter for a tool that is limited by its context window? It doesn’t matter whether you fill up the window in 1 second or 60 minutes. Also, it’s super easy to game: insert random lags or reduce tokens/sec, and there you have a model that maintains attention over “long time horizons.”

Maybe more importantly, how do people in this field buy into these easily gameable non-indicators so readily? How did they not develop the instinct to call out metrics like lines of code, number of tokens burned, or time taken to process a task as BS the instant they hear them?

How do they benchmark their code? The longer running the better? Number of CPU cycles spent?

Comments

ben_w•1h ago
You have a common misunderstanding of what is meant by "time horizon".

This is not "how long does the AI take to do ${thing}"; it is "how long does a *human* take to do ${thing}, where ${thing} is from the set of things that the AI has probability = n of getting right", where n happens to be 50% or 80% in the METR studies.

At least, that's the short answer. Here's a video with more depth: https://www.youtube.com/watch?v=evSFeqTZdqs

In my experience, the AI actually completes the task in a few minutes when it is a 2-ish hour task and the AI has a time horizon of 2 hours at P(correct) = 0.8. It is I, the human, not the AI I am using, who would have taken 2 hours.
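To make the definition concrete, here is a rough sketch of how such a horizon could be computed. It is not METR's actual code: the task data is invented for illustration, and the assumption is that success probability is modeled as a logistic function of log2(human minutes). The point to notice is that the AI's own wall-clock time never appears as an input, so adding lag or lowering tokens/sec would not move the number.

```python
import math

# Hypothetical data: (minutes a human needs for the task, did the AI succeed?).
# These numbers are made up purely for illustration.
tasks = [
    (1, 1), (2, 1), (4, 1), (8, 1), (15, 1), (30, 1), (30, 0),
    (60, 1), (60, 0), (120, 0), (120, 1), (240, 0), (480, 0),
]

def fit_logistic(xs, ys, steps=5000, lr=0.1):
    """Fit P(success) = sigmoid(a + b*x) by gradient ascent on the log-likelihood."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            grad_a += y - p
            grad_b += (y - p) * x
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b

def time_horizon(a, b, p):
    """Human task length (minutes) at which the fitted success probability equals p."""
    logit = math.log(p / (1.0 - p))
    return 2.0 ** ((logit - a) / b)

# x is log2 of the *human* completion time; the AI's runtime is not used anywhere.
xs = [math.log2(minutes) for minutes, _ in tasks]
ys = [success for _, success in tasks]

a, b = fit_logistic(xs, ys)
h50 = time_horizon(a, b, 0.5)
h80 = time_horizon(a, b, 0.8)
print(f"50% horizon: {h50:.0f} human-minutes, 80% horizon: {h80:.0f} human-minutes")
```

Because success gets less likely as the human time grows, the fitted slope b is negative, and the 80% horizon comes out shorter than the 50% horizon: demanding higher reliability restricts you to tasks a human would finish sooner.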
