frontpage.

Made with ♥ by @iamnishanth

AlphaEvolve: Gemini-powered coding agent scaling impact across fields

https://deepmind.google/blog/alphaevolve-impact/
79•berlianta•1h ago

Comments

pingou•29m ago
AI improving itself (or at least the architecture it runs on): the singularity is near, as they say.

Do we have other examples of AI being used to improve the LLMs, apart from the creation of synthetic data and the testing of the models?

mkw5053•24m ago
I feel like the most viral lately is https://github.com/karpathy/autoresearch
dataviz1000•10m ago
I am having huge success [0]

It is a recursive self-reflective agent. It will copy itself into /tmp/, run, analyze the results / eval, and update itself, then copy itself into /tmp/, run, analyze the results / eval, and update itself again, ad infinitum.

Left alone, it can bypass all bot detection security, including Turnstile and hCaptcha, which means it can have anonymous access to gpt-5.3 with internet search, Perplexity with internet search, and all the models on nvidia [1] like Deepseek v4. Although flaky, the Python instructor library shines here, creating validation and structured data.

For shits and giggles, I wondered if it could become viral. So I had a coding agent create 100 isolated containers, each with a security vulnerability like SQL injection. To my surprise, because it was an isolated playground, the coding agents made it.

I stopped there. I know that it can copy itself. I know that it can evolve very quickly. I know that it can reverse engineer any website. They can create a 10 minute mail account and pass the reference along so they can communicate with each other. I didn't check if it could do a breadth search on known vulnerabilities to access the isolated servers.

The situation is like Leonardo DiCaprio in Don't Look Up screaming on TV. If anyone at any of these companies wants to discuss this with me, please reach out.

[0] https://github.com/adam-s/agent-tuning

[1] https://build.nvidia.com/models

NitpickLawyer•17m ago
> Do we have other examples of AI being used to improve the LLMs

Yes, last year when they revealed AlphaEvolve, they used a previous Gemini model to improve kernels that were used in training this generation's models, netting them a 1% faster training run. Not much, but still.

lewtun•16m ago
Shameless plug: https://huggingface.co/spaces/smolagents/ml-intern

It’s a simple harness around Opus, but with tight integration to Hugging Face infra, so the agent can read papers, test code and launch experiments

dinfinity•7m ago
> AI improving itself

This is the thing to look for in 2027, imho. All the big AI labs have big projects working on research agents, also specifically into improving AI (duh) and I expect a lot of that to get out of the experimental phases this year.

Next year they actually get to do a lot of work and I think we will see the first big effective architectural change co-invented by AI.

maxothex•29m ago
What I'm most curious about is how this translates to messy, real-world codebases without well-defined metrics. Most production software isn't chip design or kernel optimization - it's business logic with unclear success criteria. The infrastructure story is impressive, but I'd love to see how they handle domains where the evaluation function itself is ambiguous.
baq•28m ago
RSI is here on the hardware level and on the software level. Sprinkle in a couple of algorithmic breakthroughs and the results are nigh unimaginable.
alecco•28m ago
Are Googlers themselves happy using Gemini coding agent instead of Claude Code or Codex? (no snark, I'm really asking)
carbocation•25m ago
Last month, Steve Yegge suggested that they are not: https://xcancel.com/Steve_Yegge/status/2043747998740689171
PunchTornado•15m ago
This couldn't be further from the truth
NitpickLawyer•11m ago
> He says the problem is that they can't use Claude Code because it's the enemy, and Gemini has never been good enough to capture people's workflows like Claude has, so basically agentic coding just never really took off inside Google. They're all just plodding along, completely oblivious to what's happening out there right now.

This is a bunch of gabagoo. Wrong on so many layers, it's not even worth reading further.

a) goog has agentic coding in both antigravity & cli forms. While it is not at the level of cc + opus, it's still decent.

b) goog has their own versions of models trained on internal code

c) goog has claude in vertex, and most definitely can set it up in secure zones (like they can for their clients) so they'd be able to use claude (at cost) within their own projects.

stormbeard•8m ago
Demis Hassabis chimed in on that thread and called it what it is: clickbait.
PunchTornado•15m ago
Codex?
nine_k•13m ago
The point of dogfooding is exactly that: if we're unhappy, we're the ones to improve it.
jensensbutton•11m ago
Note that coding is not the only use of Gemini or any of these models. It's also not what this article is talking about. Gemini may not be the best coding agent but can still be very good at other things.
brkn•6m ago
I would be interested to see how exactly the agent helped. How was it used, where did it lead to the given improvement, and how long would it have taken a human to come to the same solution?

The map that keeps Burning Man honest

https://www.not-ship.com/burning-man-moop/
246•speckx•2h ago•80 comments

Authorities say Flock cameras' data allegedly used for immigration enforcement

https://www.ohio.news/stories/dayton-authorities-say-that-flock-cameras-data-allegedly-used-for-i...
23•pseudolus•25m ago•6 comments

Child marriages plunged when girls stayed in school in Nigeria

https://www.nature.com/articles/d41586-026-00796-2
140•surprisetalk•3h ago•82 comments

I switched from Mac to a Lenovo Chromebook, and you can too

https://blog.johnozbay.com/i-left-apples-ecosystem-for-a-lenovo-chromebook-and-you-can-too.html
11•speckx•29m ago•1 comment

The Self-Cancelling Subscription

https://predr.ag/blog/the-self-cancelling-subscription/
41•surprisetalk•2h ago•22 comments

RaTeX: KaTeX-compatible LaTeX rendering engine in pure Rust

https://ratex.lites.dev/
94•atilimcetin•3d ago•50 comments

Valve releases Steam Controller CAD files under Creative Commons license

https://www.digitalfoundry.net/news/2026/05/valve-releases-steam-controller-cad-files-under-creat...
1641•haunter•1d ago•549 comments

Indian matchbox labels as a visual archive

https://www.itsnicethat.com/features/the-view-from-mumbai-matchbook-graphic-design-130426
103•sahar_builds•3d ago•26 comments

37x Speedup in Lattice Boltzmann Cylinder Flow

https://github.com/alikamp/Parks-KPBM-Scaling
31•kauai1•2d ago•3 comments

MPEG-2 Transport Stream Packaging for Media over QUIC Transport

https://www.ietf.org/archive/id/draft-gregoire-moq-msfts-00.html
18•mondainx•1h ago•1 comment

Grand Theft Oil Futures: Insider traders keep making a killing at our expense

https://paulkrugman.substack.com/p/grand-theft-oil-futures
378•Qem•5h ago•238 comments

Boris Cherny: TI-83 Plus Basic Programming Tutorial (2004)

https://www.ticalc.org/programming/columns/83plus-bas/cherny/
134•suoken•2d ago•57 comments

SQLite Is a Library of Congress Recommended Storage Format

https://sqlite.org/locrsf.html
491•whatisabcdefgh•18h ago•153 comments

GovernGPT (YC W24) Is Hiring Engineers to Build Thinking Systems in Montreal

https://www.ycombinator.com/companies/governgpt/jobs/hRyltS0-backend-engineer-thinking-systems
1•owalerys•4h ago

Appearing productive in the workplace

https://nooneshappy.com/article/appearing-productive-in-the-workplace/
1439•diebillionaires•1d ago•580 comments

Motherboard sales are now collapsing amid unprecedented shortages fueled by AI

https://www.tomshardware.com/pc-components/motherboards/motherboard-sales-collapse-by-more-than-2...
43•speckx•1h ago•18 comments

Agent-harness-kit scaffolding for multi-agent workflows (MCP, provider-agnostic)

https://ahk.cardor.dev
56•enmanuelmag•5h ago•16 comments

Diskless Linux boot using ZFS, iSCSI and PXE

https://aniket.foo/posts/20260505-netboot/
158•stereo-highway•13h ago•88 comments

OpenBSD Stories: The closest thing to cute kittens (OpenBSD/zaurus)

http://miod.online.fr/software/openbsd/stories/zaurus1.html
3•zdw•21h ago•0 comments

Vibe coding and agentic engineering are getting closer than I'd like

https://simonwillison.net/2026/May/6/vibe-coding-and-agentic-engineering/
710•e12e•1d ago•795 comments

Chevrolet Performance eCrate package (400v/200hp)

https://www.chevrolet.com/performance-parts/crate-engines/ecrate
123•mindcrime•2d ago•96 comments

RSS feeds send me more traffic than Google

https://shkspr.mobi/blog/2026/05/rss-feeds-send-me-more-traffic-than-google/
216•SpyCoder77•15h ago•48 comments

SingleRide: Longest route on NYC Subway without visiting the same station twice

https://singleride.nyc/
71•TMWNN•1d ago•39 comments

The mechanical latching memory of an adhesive tape

https://iopscience.iop.org/article/10.1088/1367-2630/ae4acc
17•gnabgib•1d ago•7 comments

Chrome removes claim of On-device AI not sending data to Google Servers

https://old.reddit.com/r/chrome/comments/1t5qayz/chrome_removes_claim_of_ondevice_al_not_sending/
15•newsoftheday•34m ago•1 comment

LinkedIn profile visitor lists belong to the people, says Noyb

https://www.theregister.com/offbeat/2026/05/05/noyb-cries-foul-on-linkedin-withholding-profile-vi...
161•robin_reala•5h ago•86 comments

Permacomputing Principles

https://permacomputing.net/principles/
218•andsoitis•14h ago•145 comments

ProgramBench: Can Language Models Rebuild Programs from Scratch?

https://arxiv.org/abs/2605.03546
106•jonbaer•12h ago•59 comments

Show HN: Agent-skills-eval – Test whether Agent Skills improve outputs

https://github.com/darkrishabh/agent-skills-eval
55•darkrishabh•10h ago•20 comments