Computer use in Gemini 3.5 Flash

https://blog.google/innovation-and-ai/models-and-research/gemini-models/introducing-computer-use-gemini-3-5-flash/
53•swolpers•1h ago

Comments

satvikpendem•1h ago
There's still no MCP support in the Gemini app, even though MCP is very useful for getting various pieces of info as a user just by chatting. For example, I recently wanted to book an Airbnb and filter by specific criteria, including house image analysis. Gemini couldn't do it, so I had to do it in Codex.
tonyrice•1h ago
This is why I don't always use the official Gemini web app. Lately I've found it more useful to use a CLI. I'm looking forward to the day they add MCP to the web app.
pregseahorses•53m ago
Gemini CLI now requires an Antigravity subscription.
singingtoday•33m ago
The CLI doesn't work with my subscription.
anticorporate•1h ago
Yeah, it seems like this is the biggest missing feature from the Gemini ecosystem.

If I can't connect MCP, there's really no selling point for me to use Gemini from my watch, car, smart speaker, etc. If I'm already bound to using my own front end, then I'm only evaluating Gemini as a model/API, at which point it has many competitors that may be cheaper or better fit for the task.

thejaycampbell•1h ago
agreed... this is where they lost me too
airstrike•1h ago
Computer use is such a terrible idea. It's slow, insecure, error prone, expensive.

I guess if you're trying to get people to tokenmaxx it may look like a valid strategy, but ain't no way this will be delightful to users.

I think it's a symptom of just not understanding how LLMs should interface with the OS because we're still in their early days.

Eventually there'll be an iPhone moment for the ergonomics of LLM usage outside of coding.

nzach•29m ago
> Computer use is such a terrible idea. It's slow, insecure, error prone, expensive.

And yet having an agent able to use a computer on your behalf is really useful.

Recently I gave a NixOS VM to my Hermes agent and it has been a good experience. I don't really care if he destroys the machine, since I can just roll back to an earlier version, and for any meaningful data he creates for me I make sure he creates a repo, commits, and pushes to my private Gitea instance.

airstrike•10m ago
[delayed]
mlmonkey•1h ago
It's funny how in their own graph, https://storage.googleapis.com/gweb-uniblog-publish-prod/ima... Gemini 3.5 Flash is beaten hands down by both Opus 4.8 and GPT 5.5, and yet the graph is drawn as if Gemini wins ... :-D
sheept•1h ago
It highlights the Gemini models blue since that's what the article is about. The bar heights seem consistent with the values.
mroche•48m ago
The graph has Gemini 3.5 Flash matching Sonnet 4.6, losing to Opus 4.8, and slightly behind GPT-5.5 by 0.3 points... That's not that much of a hands-down loss for Gemini for this specific workload benchmark.

The methodology used:

https://deepmind.google/models/evals-methodology/gemini-3-5-...

Methodology: All Gemini scores are pass@1 except where otherwise noted. "Single attempt" settings allow no majority voting or parallel test-time compute. All of the results are run with the Gemini API for the model-id gemini-3.5-flash with default sampling settings unless indicated otherwise below. To reduce variance, we average over multiple trials for smaller benchmarks.

All the results for non-Gemini models are sourced from providers' self-reported numbers unless otherwise mentioned below. For Claude Opus 4.7, Sonnet 4.6, and GPT-5.5 we default to reporting maximum thinking/reasoning settings available, but when reported results are not available we use best available reasoning results.

gb2d_hn•38m ago
It's honest - people who know what they are looking at will take speed and token costs into account. I don't use Gemini 3.5 for coding, but I use it as something in between a search engine and agent.
beastman82•59m ago
No UI like their competitors' Claude Cowork or Codex. This is vaporware.
villgax•50m ago
Will it skip Ads lol
humblyCrazy•45m ago
I looked at their demo and it does not
knollimar•10m ago
Where is 3.5 pro?
data-ottawa•27m ago
I think 3.5 Flash is trying to target agentic work, like Google Search or ADK (Agent Development Kit) use cases.

It’s something cheap enough you’d put out in front of your customers, and Opus is expensive enough you wouldn’t.
