frontpage.

Gaia2 and ARE: Empowering the Community to Evaluate Agents

https://huggingface.co/blog/gaia2
4•mortimerp9•1h ago

Comments

mortimerp9•1h ago
Meta AI is releasing two new resources for AI agents research:

- GAIA 2 Benchmark: An updated approach to agents evaluation

• 800 dynamic scenarios across ten realistic universes

• Tests adaptability, robustness to failure, and time sensitivity

• Moves beyond static benchmarks to evaluate real-world agent capabilities

- Agents Research Environments (ARE): A simulation platform for agents research

• Dynamic, evolving environments that mirror real-world complexity

• Built-in reward signals and comprehensive evaluation tools

• Realistic apps (email, calendar, file system, messaging) with realistic data

• Event-driven architecture that creates dynamic scenarios for multi-turn tasks

Cap'n Web: A JavaScript-native RPC system

https://github.com/cloudflare/capnweb
1•anhldbk•53s ago•0 comments

Why random lines of video game dialogue get stuck in our heads

https://www.theguardian.com/games/2025/sep/17/video-game-dialogue-pushing-buttons
1•andsoitis•3m ago•0 comments

Show HN: Handwritten Letter Generator

https://claude.ai/public/artifacts/1f1af45e-f5b0-4831-ade3-e5538ae2d145
1•paperplaneflyr•3m ago•0 comments

A Lookback at Workers Launchpad

https://blog.cloudflare.com/workers-launchpad-006
1•sarreph•4m ago•0 comments

Why Johnny Can't Add Anymore: Low standards and ideological fights don't help

https://www.wsj.com/opinion/u-s-math-education-crpe-report-naep-scores-648adf94
1•Bostonian•5m ago•1 comment

I was going to start a VC fund, but instead got hooked on vibe coding

https://fredbenenson.com/blog/2025/09/22/getting-caught-on-the-inside/
1•mecredis•5m ago•0 comments

Weaviate's Query Agent with Charles Pierse [podcast]

1•CShorten•7m ago•0 comments

Documents offer rare insight on Ice's close relationship with Palantir

https://www.theguardian.com/us-news/ng-interactive/2025/sep/22/ice-palantir-data
2•mitchbob•9m ago•0 comments

The role of canines in health myth and fact (2018)

https://hekint.org/2018/12/06/from-the-goddess-of-healing-to-hair-of-the-dog-the-role-of-canines-...
1•debo_•13m ago•0 comments

Now Is the Time to Start Planning for the Post-Android World (2018)

https://www.linuxjournal.com/content/now-time-start-planning-post-android-world
1•Waraqa•16m ago•2 comments

Congressman Calmly Explains "Entities" Coming from "Five or Six Deepwater Areas"

https://futurism.com/congressman-burchett-aliens-water
1•DocFeind•17m ago•1 comment

Tips for Working with Legacy Code

https://www.esveo.com/en/blog/tips-for-working-with-legacy-code/
2•Bogdanp•17m ago•0 comments

Show HN: E-E-A-T Checker for SEO Content – QuickCreator Chrome Extension

https://chromewebstore.google.com/detail/quickcreator-extension-al/behnnpfjjmnpcmclpbcgfmidjmmpffhh
1•yanzt•18m ago•0 comments

Tool Calls Are Expensive and Finite

https://www.reillywood.com/blog/tool-calls-are-expensive-and-finite/
1•ripley12•19m ago•0 comments

Hammerspoon: A tool for powerful automation of OS X

https://github.com/Hammerspoon/hammerspoon
2•thunderbong•20m ago•0 comments

Taking a Look at Compression Algorithms

https://cefboud.com/posts/compression/
1•GarethX•20m ago•1 comment

Compass to acquire Anywhere for $1.6B

https://www.wsj.com/real-estate/brokerage-giant-compass-agrees-to-acquire-rival-anywhere-for-1-6-...
1•uptown•21m ago•0 comments

Ask HN: Share Your Pocket Bookmarks

2•earlyriser•21m ago•1 comment

Railway partial outage (unless you pay more)

https://railway.instatus.com
3•jfaat•23m ago•1 comment

Identity Types

https://bartoszmilewski.com/2025/09/22/identity-types/
1•ibobev•23m ago•0 comments

SQLite 3.51.0 supports 64-bit browser-side WASM

https://sqlite.org/forum/forumpost/f91fe5095f
2•sgbeal•24m ago•1 comment

AI Spreadsheet Benchmark [pdf]

https://huggingface.co/datasets/rowshq/aispreadsheetbenchmark/blob/main/technical_paper.pdf
2•patife•24m ago•0 comments

AMail: An Amiga IMAP and SMTP Client over SSL

https://bluewizardnet.itch.io/amail
3•doener•25m ago•1 comment

All Hail the Technocracy

https://www.wired.com/politics-issue/
2•nickcotter•26m ago•0 comments

Delta-8 THC use highest where marijuana is illegal, study finds

https://medicalxpress.com/news/2025-09-delta-thc-highest-marijuana-illegal.html
2•PaulHoule•27m ago•0 comments

The great falls of Boeing, Intel, and Apple

https://world.hey.com/dhh/the-great-falls-of-boeing-intel-and-apple-4c18ca39
1•flinner•27m ago•2 comments

Only social media companies know how teen ban will work

https://www.abc.net.au/news/2025-09-17/social-media-ban-law-rules-analysis/105780188
1•eowyn•27m ago•0 comments

A history of the Internet, part 3: The rise of the user

https://arstechnica.com/gadgets/2025/09/a-history-of-the-internet-part-3-the-rise-of-the-user/
1•doener•28m ago•0 comments

Show HN: A Minimal Ncurses Web Browser That Renders HTML with Colors

1•den_dev•29m ago•1 comment

Bluesky says it's getting more aggressive about moderation and enforcement

https://techcrunch.com/2025/09/22/bluesky-says-its-getting-more-aggressive-about-moderation-and-e...
3•doener•29m ago•1 comment