Windows-Use: an AI agent that interacts with Windows at GUI layer

https://github.com/CursorTouch/Windows-Use
135•djhu9•5mo ago

Comments

yodon•4mo ago
Very cool - does anyone know of an OSX equivalent?

Preferably one that is similarly able to understand and interact with web page elements, in addition to app elements and system elements.

CharlesW•4mo ago
There are MCPs that work with the macOS Accessibility stack, like https://github.com/steipete/macos-automator-mcp, https://github.com/ashwwwin/automation-mcp, https://github.com/mediar-ai/MacosUseSDK, and https://github.com/baryhuang/mcp-remote-macos-use.

For web page elements, you could drive the browser via `do JavaScript` or use a dedicated browser MCP (Chrome DevTools MCP, Playwright MCP).
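
As a rough sketch of the `do JavaScript` route: the snippet below only builds the `osascript` command line and does not run it. It assumes Safari as the target browser (its AppleScript dictionary provides `do JavaScript ... in document 1`) and that Safari has been set to allow JavaScript from Apple Events. The helper name `build_safari_js_command` is made up for illustration.

```python
def build_safari_js_command(js: str) -> list[str]:
    """Build an osascript argv that asks Safari to run `js` in its front document."""
    # Escape backslashes first, then double quotes, so the JavaScript
    # survives being embedded in an AppleScript string literal.
    escaped = js.replace("\\", "\\\\").replace('"', '\\"')
    script = f'tell application "Safari" to do JavaScript "{escaped}" in document 1'
    return ["osascript", "-e", script]


if __name__ == "__main__":
    # Print the command instead of executing it, so this runs on any platform.
    print(build_safari_js_command('document.querySelector("h1").innerText'))
```

On a Mac, the returned list could be handed to `subprocess.run(cmd, capture_output=True, text=True)` to get the script's result back as text.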

nikisweeting•4mo ago
https://github.com/browser-use/macOS-use

https://github.com/browser-use/browser-use

philfreo•4mo ago
Cool. Reminds me of using SendKeys() in Visual Basic 6 in the 90s

https://learn.microsoft.com/en-us/dotnet/api/microsoft.visua...

anthk•4mo ago
And BeOS/Haiku has the "Hey" command, which does literally the same thing but goes far beyond key input: you can interact with widgets too. Under Unix, there's xdotool and friends.
sebastiennight•4mo ago
I loved SendKeys()!

Used it to write programs that would run in the background & spook my friends by "typing" quotes from movies at random times on their computer.

halfcat•4mo ago
SendKeys() in VB powered basically all of the AOL chat bots in the 90s.

It’s how I accidentally learned the Win32 API.

yarone•4mo ago
Me too! With SendKeys and some Win32 API calls, I wrote an AOL add-on (available through Keyword: addons) called AoLOL!. It was my first software business.

Q: How do you identify the AOL window? A: Look for an app with titlebar = "America[space][space]Online"

kh9000•4mo ago
Using the UIA tree as the currency for LLMs to reason over always made more sense to me than computer-vision, screenshot-based approaches. It’s true that not all software exposes itself correctly via UIA, but almost all the important stuff does. VS Code is one notable exception (but you can turn on accessibility support in the settings).
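
To make the idea concrete, here is a minimal sketch of turning an accessibility tree into text an LLM can reason over. `UINode` and `flatten_for_prompt` are hypothetical names, and the tree is built by hand rather than read from UIA; it is not how Windows-Use itself does it.

```python
from dataclasses import dataclass, field


@dataclass
class UINode:
    """Hypothetical, simplified stand-in for one element of an accessibility tree."""
    control_type: str
    name: str = ""
    enabled: bool = True
    children: list["UINode"] = field(default_factory=list)


def flatten_for_prompt(root: UINode) -> list[str]:
    """Walk the tree depth-first and emit one indexed, indented line per node."""
    lines: list[str] = []

    def visit(node: UINode, depth: int) -> None:
        index = len(lines)  # stable id the model can refer back to
        state = "" if node.enabled else " (disabled)"
        lines.append(f'{"  " * depth}[{index}] {node.control_type} "{node.name}"{state}')
        for child in node.children:
            visit(child, depth + 1)

    visit(root, 0)
    return lines


if __name__ == "__main__":
    window = UINode("Window", "Untitled - Notepad", children=[
        UINode("MenuItem", "File"),
        UINode("Edit", "Text Editor"),
        UINode("Button", "Close", enabled=False),
    ])
    print("\n".join(flatten_for_prompt(window)))
```

The bracketed index gives the model an unambiguous handle ("click [2]") that the caller can map back to a real element, which is the main appeal over guessing pixel coordinates from a screenshot.
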
freedomben•4mo ago
Agreed. I've noticed that ChatGPT, when parsing screenshots, writes out some Python code to parse them, and at least in the tests I've done (with things like, "what is the RGB value of the bullet points in the list" or similar) it ends up writing and rewriting the script five or so times and then gives up. I haven't tried others, so I don't know if this approach is unique to ChatGPT or not, but it definitely feels really fragile and slow to me.
Juminuvi•4mo ago
I noticed something similar. I asked it to extract a GUID from an image and it wrote a Python script to run OCR against it...and got it wrong. Prompting a bit more seemed to finally trigger it to use its native image analysis, but I'm not sure what the trick was.
morkalork•4mo ago
I've run into this with uploading audio and text files; I have to yell at it to not write any code and use its native abilities to do the job.
spacebacon•4mo ago
Probably just ask it to use native image analysis versus writing code. I have done this before when extracting usernames from screenshots.
philipbjorge•4mo ago
"Important" is subjective. In the healthcare space, I’d make the claim that most applications don’t expose themselves correctly (native or web).

CV and direct mouse/kb interactions are the “base” interface, so if you solve this problem, you unlock just about every automation use case.

(I agree that if you can get good, unambiguous, actionable context from accessibility/automation trees, that’s going to be superior)

phatskat•4mo ago
I’ve been working hard on our new component implementation (Vue/TS) to include accessibility for components that aren’t just native reskins, like combo and list boxes, and keyboard interactivity is a real pain. One of my engineers had it half-working on her dropdown and threw in the towel for MVP because there are a lot of little state edge cases to watch out for.

Thankfully the spec as provided by MDN for minimal functionality is well spelled out and our company values meeting accessibility requirements, so we will revisit and flesh out what we’re missing.

Also I wanna give props (ha) to the Storybook team for bringing accessibility testing into their ecosystem as it really does help to have something checking against our implementations.

akurilin•4mo ago
I recently tried using Qwen VL or Moondream to see if off-the-shelf they would be able to accurately detect most of the interesting UI elements on the screen, either in the browser or your average desktop app.

It was a somewhat naive attempt, but they didn't seem to perform well without, perhaps, a lot of additional work. I wonder if there are models that do much better, maybe whatever OpenAI uses internally for Operator, but I'm not clear how bulletproof that one is either.

These models weren't trained specifically for UI object detection and grounding, so it's plausible that if they were trained on just UI long enough, they would actually be quite good. Curious if others have insight into this.

nikanj•4mo ago
Most Electron software doesn't follow accessibility guidelines and exposes nothing over UIA
electroly•4mo ago
Looks awesome. I've attempted my own implementation, but I never got it to work particularly well. "Open Notepad and type Hello World" was a triumph for me. I landed on the UIA tree + annotated screenshot combination, too, but mine was too primitive, and I tried to use GPT, which isn't as good at image tasks as Gemini, as used here. Great job!
tiahura•4mo ago
LLMs do a pretty good job of using pywin32 for programs that support COM, like Office.
mtVessel•4mo ago
I feel vaguely vindicated that the agent can't figure out how to use the modern Save as workflow, either, and reverts to the traditional dialog.
dvt•4mo ago
Working on something very similar in Rust. It's quite magical when it works (that's a big caveat, as I'm trying to make it work with local LLMs). Very cool implementation, and imo, this is the future of computing.
KaseKun•4mo ago
Can it farm a ber rune for me?
alexchantavy•4mo ago
Yeah, computer-use agents remind me of game automators from back in the day, like the RuneScape autoclicker SCAR. I posted on this a while back, haha: https://news.ycombinator.com/item?id=29716900#29720860
AfterHIA•4mo ago
I remember an older friend asking me recently: will there be a thing soon where I can make my computer go on auto-pilot?

I guess I can answer, "yes I think so."

vivzkestrel•4mo ago
Genuinely asking: what do you think are the use cases for someone requiring this?
MurageKabui•4mo ago
Awesome job! I'm working on a similar Agent that's highly dependent on AutoIt.