Windows-Use: an AI agent that interacts with Windows at GUI layer

https://github.com/CursorTouch/Windows-Use
46•djhu9•3d ago

Comments

yodon•2h ago
Very cool - does anyone know of an OSX equivalent?

Preferably one that is similarly able to understand and interact with web page elements, in addition to app elements and system elements.

CharlesW•1h ago
There are MCPs that work with the macOS Accessibility stack, like https://github.com/steipete/macos-automator-mcp, https://github.com/ashwwwin/automation-mcp, https://github.com/mediar-ai/MacosUseSDK, and https://github.com/baryhuang/mcp-remote-macos-use.

For web page elements, you could drive the browser via `do JavaScript` or use a dedicated browser MCP (Chrome DevTools MCP, Playwright MCP).

philfreo•2h ago
Cool. Reminds me of using SendKeys() in Visual Basic 6 in the '90s.

https://learn.microsoft.com/en-us/dotnet/api/microsoft.visua...

kh9000•1h ago
Using the UIA tree as the currency for LLMs to reason over always made more sense to me than computer-vision, screenshot-based approaches. It’s true that not all software exposes itself correctly via UIA, but almost all the important stuff does. VS Code is one notable exception (but you can turn on accessibility support in the settings).

freedomben•42m ago
Agreed. I've noticed that ChatGPT, when parsing screenshots, writes out some Python code to parse them, and at least in the tests I've done (with things like "what is the RGB value of the bullet points in the list" or similar) it ends up writing and rewriting the script five or so times and then gives up. I haven't tried others, so I don't know whether its approach is unique, but it definitely feels really fragile and slow to me.

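To make the "UIA tree as currency" idea above concrete, here is a minimal sketch of turning an accessibility tree into text a model can read. It assumes nothing about Windows-Use's actual implementation: `UINode` and `serialize` are hypothetical names, and the tree is a plain in-memory stand-in rather than a real UIA query.

```python
from dataclasses import dataclass, field


@dataclass
class UINode:
    """Hypothetical stand-in for one accessibility-tree element."""
    control_type: str
    name: str = ""
    children: list["UINode"] = field(default_factory=list)


def serialize(root: UINode) -> tuple[str, dict[int, UINode]]:
    """Flatten the tree into indented text plus an id -> node lookup.

    The text would go into the model prompt; the lookup lets the caller
    map an id chosen by the model back to a concrete element.
    """
    lines: list[str] = []
    index: dict[int, UINode] = {}

    def walk(node: UINode, depth: int) -> None:
        node_id = len(index)  # ids are assigned in depth-first order
        index[node_id] = node
        lines.append(f'{"  " * depth}[{node_id}] {node.control_type} "{node.name}"')
        for child in node.children:
            walk(child, depth + 1)

    walk(root, 0)
    return "\n".join(lines), index


# Made-up example tree, loosely shaped like a Notepad window.
notepad = UINode("Window", "Untitled - Notepad", [
    UINode("MenuBar", "Application", [UINode("MenuItem", "File")]),
    UINode("Edit", "Text Editor"),
])
text, index = serialize(notepad)
print(text)
```

The model only sees short lines such as `[3] Edit "Text Editor"` and can answer with an id, which avoids pixel-level reasoning over a screenshot.
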
electroly•57m ago
Looks awesome. I've attempted my own implementation, but I never got it to work particularly well. "Open Notepad and type Hello World" was a triumph for me. I landed on the UIA tree + annotated screenshot combination, too, but mine was too primitive, and I tried to use GPT, which isn't as good at image tasks as Gemini, the model used here. Great job!

tiahura•16m ago
LLMs do a pretty good job of using pywin32 for programs that support COM, like Office.
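
For readers who haven't seen the COM route mentioned above, here is a minimal sketch, assuming Windows with Excel and the pywin32 package installed. `open_excel` and `fill_cells` are hypothetical helper names, not part of pywin32; the COM application object is passed in so the cell-writing logic does not itself depend on Office being present.

```python
def open_excel():
    """Start (or attach to) Excel through COM.

    Assumption: requires Windows, Excel, and pywin32.
    """
    import win32com.client  # imported lazily so this file loads on any OS

    app = win32com.client.Dispatch("Excel.Application")
    app.Visible = True
    return app


def fill_cells(app, values):
    """Write {(row, col): value} into the first sheet of a new workbook.

    `app` is an Excel Application COM object; row and column indices
    are 1-based, as in the Excel object model.
    """
    workbook = app.Workbooks.Add()
    sheet = workbook.Worksheets(1)
    for (row, col), value in values.items():
        sheet.Cells(row, col).Value = value
    return workbook
```

On a Windows machine with Excel, `fill_cells(open_excel(), {(1, 1): "Hello World"})` would put the text in cell A1 without any GUI-level clicking or typing.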

Many hard LeetCode problems are easy constraint problems

https://buttondown.com/hillelwayne/archive/many-hard-leetcode-problems-are-easy-constraint/
201•mpweiher•3h ago•125 comments

The treasury is expanding the Patriot Act to attack Bitcoin self custody

https://www.tftc.io/treasury-iexpanding-patriot-act/
414•bilsbie•5h ago•321 comments

3D modeling with paper

https://www.arvinpoddar.com/blog/3d-modeling-with-paper
144•joshuawootonn•3h ago•21 comments

Advanced Scheme Techniques (2004) [pdf]

https://people.csail.mit.edu//jhbrown/scheme/continuationslides04.pdf
63•mooreds•2h ago•4 comments

Vector database that can index 1B vectors in 48M

https://www.vectroid.com/blog/why-and-how-we-built-Vectroid
11•mathewpregasen•53m ago•2 comments

Windows-Use: an AI agent that interacts with Windows at GUI layer

https://github.com/CursorTouch/Windows-Use
48•djhu9•3d ago•7 comments

Qwen3-Next

https://qwen.ai/blog?id=4074cca80393150c248e508aa62983f9cb7d27cd&from=research.latest-advancement...
444•tosh•11h ago•178 comments

Doom-ada: Doom Emacs Ada language module with syntax, LSP and Alire support

https://github.com/tomekw/doom-ada
47•tomekw•2h ago•3 comments

Oq: Terminal OpenAPI Spec Viewer

https://github.com/plutov/oq
41•der_gopher•2h ago•2 comments

A beginner's guide to extending Emacs

https://blog.tjll.net/a-beginners-guide-to-extending-emacs/
96•ibobev•2h ago•6 comments

Humanely Dealing with Humungus Crawlers

https://flak.tedunangst.com/post/humanely-dealing-with-humungus-crawlers
4•freediver•42m ago•0 comments

Show HN: DWS OS, a Plan 9 Inspired Web "OS"

https://dws.rip
30•tdubey•2h ago•6 comments

Building a Deep Research Agent Using MCP-Agent

https://thealliance.ai/blog/building-a-deep-research-agent-using-mcp-agent
17•saqadri•2d ago•5 comments

Racintosh Plus – Rackmount Mac Plus

http://www.identity4.com/2025-racintosh-plus/
87•zdw•3d ago•13 comments

VaultGemma: The most capable differentially private LLM

https://research.google/blog/vaultgemma-the-worlds-most-capable-differentially-private-llm/
10•meetpateltech•1h ago•0 comments

Chat Control faces blocking minority in the EU

https://twitter.com/TutaPrivacy/status/1966384776883142661
285•miohtama•4h ago•96 comments

Show HN: An MCP Gateway to block the lethal trifecta

https://github.com/Edison-Watch/open-edison
25•76SlashDolphin•2h ago•6 comments

OpenAI Grove

https://openai.com/index/openai-grove/
11•manveerc•1h ago•12 comments

Why our website looks like an operating system

https://posthog.com/blog/why-os
608•bnc319•18h ago•423 comments

Float Exposed

https://float.exposed/
359•SomaticPirate•17h ago•96 comments

Crates.io phishing attempt

https://fasterthanli.me/articles/crates-io-phishing-attempt
127•dmarto•2h ago•61 comments

Astrophysics Source Code Library

http://ascl.net/
57•SiempreViernes•6h ago•7 comments

Debian 13, Postgres, and the US time zones

https://rachelbythebay.com/w/2025/09/11/debtz/
241•move-on-by•15h ago•121 comments

Introduction to Nyquist and Lisp Programming

https://manual.audacityteam.org/man/introduction_to_nyquist_and_lisp_programming.html
88•swatson741•3d ago•1 comment

Over 100 ships have sailed with fake insurance from the Norwegian Ro Marine

https://www.nrk.no/vestland/xl/over-100-ships-have-sailed-without-legitimate-insurance-from-the-n...
158•aregue•4h ago•62 comments

Show HN: I made a generative online drum machine with ClojureScript

https://dopeloop.ai/beat-maker/
127•chr15m•9h ago•25 comments

Ankit Gupta Joins YC as General Partner

https://www.ycombinator.com/blog/welcome-ankit/
12•todsacerdoti•1h ago•4 comments

Classic GTK1 GUI Library

https://gitlab.com/robinrowe/gtk1
110•MaximilianEmel•4d ago•49 comments

Top model scores may be skewed by Git history leaks in SWE-bench

https://github.com/SWE-bench/SWE-bench/issues/465
447•mustaphah•23h ago•136 comments

Lumina-DiMOO: An open-source discrete multimodal diffusion model

https://synbol.github.io/Lumina-DiMOO/
33•SweetSoftPillow•6h ago•2 comments