frontpage.

Show HN: Playwright Skill for Claude Code – Less context than playwright-MCP

https://github.com/lackeyjb/playwright-skill
55•syntax-sherlock•3h ago•22 comments

Show HN: I got tired of managing dev environments, so I built ServBay

https://www.servbay.com
7•Saltyfishh•3h ago•0 comments

Show HN: Duck-UI – Browser-Based SQL IDE for DuckDB

https://demo.duckui.com
206•caioricciuti•1d ago•58 comments

Show HN: Pyversity – Fast Result Diversification for Retrieval and RAG

https://github.com/Pringled/pyversity
77•Tananon•1d ago•11 comments

Show HN: Open-Source Voice AI Badge Powered by ESP32+WebRTC

https://github.com/VapiAI/vapicon-2025-hardware-workshop
44•Sean-Der•1w ago•6 comments

Show HN: Browser-based PDF form fields detection (YOLO-based)

https://commonforms.simplepdf.com/
22•nip•23h ago•3 comments

Show HN: Hank – Simplest CLI tool to get errors in plain English

https://github.com/dillondesilva/hank
3•dillondesilva•7h ago•1 comments

Show HN: Smash Balls – Breakout and Vampire Survivors

https://smash-balls.app/
4•waynerd•7h ago•2 comments

Show HN: I Built Raycast for Windows (Microsoft Store)

https://apps.microsoft.com/detail/9n8fpgknwlfq?hl=en-US&gl=US
2•NabilChiheb•3h ago•0 comments

Show HN: Web-directive.js – A directive pattern for native HTML

https://github.com/asika32764/web-directive
11•asika32764•1d ago•2 comments

Show HN: MarkdownConverters – Convert any file format to clean Markdown

https://markdownconverters.com
6•Dkaur•22h ago•3 comments

Show HN: CheckHN – A checklist for the most popular Hacker News posts

https://checkhn.ad-si.com
6•adius•22h ago•0 comments

Show HN: Syna – Minimal ML and RL Framework Built from Scratch with NumPy

https://github.com/sql-hkr/syna
7•sql-hkr•1d ago•0 comments

Show HN: The Shape of YouTube

https://soy.leg.ovh/
36•hide_on_bush•1w ago•11 comments

Show HN: Jotite – A whimsical Linux Markdown note-taking app

https://github.com/maxberggren/jotite
5•maxberggren•19h ago•0 comments

Show HN: Hokusai Pocket (WIP) – Portable GUIs with MRuby

https://codeberg.org/skinnyjames/hokusai-pocket
2•zero-st4rs•14h ago•0 comments

Show HN: Nova: Open-source solution for CAD file conflicts

https://github.com/agg111/nova
8•aishwaryagune•1d ago•0 comments

Show HN: ServiceRadar – open-source Network Observability Platform

https://github.com/carverauto/serviceradar
58•carverauto•2d ago•3 comments

Show HN: Proxmox-GitOps: Container Automation Metaframework (Recursive Monorepo)

https://github.com/stevius10/Proxmox-GitOps
9•gitopspm•1d ago•1 comments

Show HN: Inkeep (YC W23) – Agent Builder to create agents in code or visually

https://github.com/inkeep/agents
78•engomez•4d ago•49 comments

Show HN: We packaged an MCP server inside Chromium

https://github.com/browseros-ai/BrowserOS/blob/main/docs/browseros-mcp/how-to-guide.mdx
46•felarof•2d ago•16 comments

Show HN: Firm, a text-based work management system

https://github.com/42futures/firm
169•danielrothmann•5d ago•60 comments

Show HN: A large format XY scanning hyperspectral camera

https://www.anfractuosity.com/projects/waverider/
44•anfractuosity•1w ago•10 comments

Show HN: Halloy – Modern IRC client

https://github.com/squidowl/halloy
378•culinary-robot•5d ago•100 comments

Show HN: HN Terminal Theme Browser Extension

https://github.com/DanceItBreakIt/hacker-news-terminal-theme
4•danceitbreakit•22h ago•1 comments

Show HN: Photerra – One app to discover hidden gems, plan with friends, and book

https://www.photerra.com/
3•davidlevien•22h ago•3 comments

Show HN: Land use visualization for European countries

https://onsland.koenvangilst.nl/
21•vnglst•2d ago•6 comments

Show HN: Moonfish – AI podcast generator with research, writing, and voicing

https://apps.apple.com/us/app/moonfish-ai/id6748574770
2•huygiab•23h ago•0 comments

Show HN: EloqDoc: MongoDB-Compatible Doc DB with Object Storage as First Citizen

https://github.com/eloqdata/eloqdoc
11•eloqdata•1d ago•10 comments

Show HN: 17 Y/O built my second app: Omegle for Indie Hackers and Builders

https://www.xappy.fun/
4•imad-101•1d ago•0 comments

Show HN: Playwright Skill for Claude Code – Less context than playwright-MCP

https://github.com/lackeyjb/playwright-skill
55•syntax-sherlock•3h ago
I got tired of playwright-mcp eating through Claude's 200K token limit, so I built this using the new Claude Skills system. Built it with Claude Code itself.

Instead of sending accessibility tree snapshots on every action, Claude just writes Playwright code and runs it. You get back screenshots and console output. That's it.

314 lines of instructions vs a persistent MCP server. Full API docs only load if Claude needs them.

Same browser automation, way less overhead. Works as a Claude Code plugin or manual install.

Token limit issue: https://github.com/microsoft/playwright-mcp/issues/889

Claude Skills docs: https://docs.claude.com/en/docs/claude-code/skills

Comments

wahnfrieden•2h ago
Why not just ask the agent to use Playwright via CLI? That’s what I do and it works fine. With Codex anyway

Edit: oops that’s what you did too. Yes most MCP shouldn’t be used.

wild_egg•2h ago
This was on my TODO list for the week, thanks for sharing!

Now I just need to make a skill for using Jira and I can go back to the MCP-free life.

syntax-sherlock•1h ago
thanks!
AftHurrahWinch•1h ago
MCPs are deterministic, SKILLS.md isn't. Also run.js can run arbitrarily generated Node.js code. It is a trivial vector for command injection.

This might be sufficient for an independent contractor or student. It shouldn't be used in a production agent.

syntax-sherlock•1h ago
Yeah, this isn’t meant to replace your real tests; it’s more for quick “does my new feature work?” checks during local dev. Think of it like scriptable manual testing: Claude spits out the Playwright code faster than you would, but it’s not CI-level coverage.

And for privacy: screenshots stay local in /tmp, but console output and page content do go to Claude/Anthropic. It’s designed for dev environments with dummy data, not prod. Same deal as using Claude for any coding help.

pacoWebConsult•59m ago
If you're going to use Claude to help you respond to feedback, the least you can do is restate this in your own words. Parent commenter deserves the respect of corresponding with a real human being.
bravura•46m ago
LLMs are not deterministic though. So by definition MCPs are not deterministic.

For example, GPT-5 doesn't support a temperature parameter. And even models that do support temperature are not deterministic with temperature=0.

siva7•15m ago
MCPs aren't deterministic...
dragonwriter•7m ago
> MCPs are deterministic, SKILLS.md isn't.

MCPs themselves may provide access to tools that are either deterministic or not, but the LLM using them generally isn't deterministic, so when used by an LLM as part of the request-response cycle, determinism, even if the MCP-provided tool had it, is not a feature of the overall system.

SKILLS.md relies on a deterministic code execution environment, but has the same issue. I'm not seeing a broad difference in kind here when used in the context of an LLM response generation cycle, and that’s really the only context where both are usable (MCP could be used for non-LLM integration, but that doesn't seem relevant.)

rapatel0•1h ago
I think that this is actually the biggest threat to the current "AI bubble": model efficiency and the diffusion of models to open source. It's probably time to start hedging bets on Nvidia.
philipallstar•38m ago
Why would OSS models threaten Nvidia?
ISV_Damocles•18m ago
Most of the big OSS AI codebases (LLM and Diffusion, at least) now have code to work on any GPU, not just nVidia GPUs. There's a slight performance benefit to sticking with nVidia, but once you need to split work across multiple GPUs, you can do a cost-benefit analysis and decide that, say, 12 AMD GPUs are faster than 8 nVidia GPUs, and cheaper as well.

Then nVidia's moat begins to shrink because they need to offer their GPUs at a somewhat reduced price to try to keep their majority share.

Rooster61•1h ago
I have a few questions about test frameworks that use AI services like this.

1) The examples always seem very generic: "Test Login Functionality, check if search works, etc". Do these actually work well at all once you step outside of the basic smoketest use cases?

2) How do you prevent proprietary data from being read when you are just foisting snapshots over to the AI provider? There's no way I'd be able to use this in any kind of real application where data privacy is a constraint.

syntax-sherlock•1h ago
Good questions!

1) Beyond basic tests: You're right to be skeptical. This is best for quick exploratory testing during local development ("does my new feature work?"), not replacing your test suite. Think "scriptable manual testing" - faster than writing Playwright manually, but not suitable for comprehensive CI/CD test coverage.

2) Data privacy: Screenshots stay local in /tmp, but console output and page content Claude writes tests against are sent to Anthropic. This is a local dev tool for testing with dummy data, not for production environments with real user data. Same privacy model as any AI coding assistant - if you wouldn't show Claude your production database, don't test against it with this.

Rooster61•53m ago
Thanks. I keep seeing silver bullet testing solutions pitched left right and center and wondering about these two points. Glad to see a project with realistic boundaries and expectations set. Would definitely give this a shot if I was working on a local project.
jcheng•30m ago
For 2, a lot of companies use AWS Bedrock to access Claude models instead of Anthropic, for exactly this reason. Amazon’s terms say they don’t log prompts or completions and don’t send anything to the model provider. If your production database is already hosted by AWS, it doesn’t seem like much additional risk.
siva7•18m ago
> Do these actually work well at all once you step outside of the basic smoketest use cases?

Excellent question... no, beyond basic kindergarten stuff, Playwright (with AI) falls apart quickly. Have some OAuth? Good luck configuring Playwright for your exact setup. Need to synthesize all information available from logs and visuals to debug something? Good luck.

simonw•39m ago
I'm using Playwright so much right now. All of the good LLMs appear to know the API really well.

Using Claude Code I'll often prompt something like this:

"Start a python -m http.server on port 8003 and then use Playwright Python to exercise this UI, there's a console error when you click the button, click it and then read that error and then fix it and demonstrate the fix"

This works really well even without adding an extra skill.

I think one of the hardest parts of skill development is figuring out what to put in the skill that produces better results than the model acting alone.

Have you tried iteratively testing the skill - building it up part by part and testing along the way to see if the different sections genuinely help improve the model's performance?
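
For readers who have not seen what such a prompt tends to produce, here is a minimal sketch of a Playwright Python script that clicks a button and collects console errors. It is an illustration, not the skill's code or what Claude actually generated: the function names, the selector, and the 500 ms wait are all assumptions, and the Playwright import is deferred so the helper can be exercised without a browser installed.

```python
def make_error_recorder(sink: list):
    """Return handlers for Playwright's "console" and "pageerror" page events.

    Every error seen by either handler is appended to `sink` as a string.
    """

    def on_console(msg) -> None:
        # In Playwright's Python API, ConsoleMessage.type and .text are properties.
        if msg.type == "error":
            sink.append(msg.text)

    def on_page_error(exc) -> None:
        # "pageerror" fires for uncaught exceptions raised inside the page.
        sink.append(str(exc))

    return on_console, on_page_error


def click_and_collect_errors(url: str, selector: str) -> list:
    """Open `url`, click `selector`, and return the errors that were logged.

    Hypothetical helper: assumes `pip install playwright` and
    `playwright install chromium` have been run.
    """
    from playwright.sync_api import sync_playwright  # deferred import

    errors: list = []
    on_console, on_page_error = make_error_recorder(errors)
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.on("console", on_console)
        page.on("pageerror", on_page_error)
        page.goto(url)
        page.click(selector)
        page.wait_for_timeout(500)  # assumed grace period for late errors
        browser.close()
    return errors
```

With `python -m http.server 8003` running in the project directory, a call such as `click_and_collect_errors("http://localhost:8003/", "button")` would return the error strings for the agent to read; the `"button"` selector is a placeholder.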

syntax-sherlock•22m ago
Yeah you can definitely do this with prompts since LLMs know the API really well. I just got tired of retyping the same instructions and wanted to try out the new Skills.

I did test by comparing transcripts across sessions to refine the workflow. As I'm running into new things I'm continuing to do that.

yomismoaqui•20m ago
One place where I see skills having an advantage is when they include scripts for specific tasks where the LLM has a difficult time generating the right code.

There's also the problem of the LLM being trained to use foo tool 1.0 when foo tool is now on version 2.0.

The nice thing is that scripts in a skill are not included in the context, and they are also deterministic.

rgbrgb•13m ago
this is the core problem rn with developing anything that uses an LLM. It’s hard to evaluate how well it works and nearly impossible to evaluate how well it generalizes unless the input is constrained so tightly that you might as well not use the LLM. For this I’d probably write a bunch of test tasks and see how well it performs with and without the skill. But the tough part here is that in certain codebases it might not need the skill. The whole environment is an implicit input for coding agents. In my main codebase right now there are tons of playwright specs that Claude does a great job copying / improving without any special information.
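
The "test tasks with and without the skill" idea could be sketched roughly as below. This is only an outline under stated assumptions: `Task`, `pass_rate`, `compare`, and the `run(prompt, use_skill)` callback are hypothetical names, and how a task is actually executed and judged (for example by launching an agent session and checking its result) is left to the caller.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Task:
    name: str
    prompt: str


# Hypothetical runner: takes (prompt, use_skill) and returns True on success.
Runner = Callable[[str, bool], bool]


def pass_rate(tasks: List[Task], run: Runner, use_skill: bool, trials: int = 3) -> float:
    """Fraction of (task, trial) runs that succeeded.

    Each task is repeated `trials` times because agent output varies run to run.
    """
    results = [run(task.prompt, use_skill) for task in tasks for _ in range(trials)]
    return sum(results) / len(results) if results else 0.0


def compare(tasks: List[Task], run: Runner, trials: int = 3) -> Dict[str, float]:
    """Run every task with and without the skill and report both pass rates."""
    with_skill = pass_rate(tasks, run, True, trials)
    without_skill = pass_rate(tasks, run, False, trials)
    return {
        "with_skill": with_skill,
        "without_skill": without_skill,
        "delta": with_skill - without_skill,
    }
```

A delta near zero on a given codebase would match the point above that some environments may not need the skill at all, so the same task set is worth running per repository rather than once.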
yomismoaqui•24m ago
Thanks, I have installed it and it works great!

Related anecdote: some months ago I tried to coax the Playwright MCP to do a full page screenshot and it couldn't do it. Then I just told Claude Code to write a Playwright JS script to do that and it worked on the first try.

Taking into account all the tools crap that the Playwright MCP puts in your context window, and the final result, I think this is the way to go.
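
A full page screenshot script of the kind described is short. The sketch below uses Playwright's Python API rather than the JS one mentioned in the anecdote, to stay consistent with the Python prompt quoted earlier in the thread; the function names and the file-naming scheme are assumptions, and the Playwright import is deferred so the path helper works on its own.

```python
import tempfile
from pathlib import Path
from urllib.parse import urlparse


def screenshot_path(url: str) -> Path:
    """Build a PNG path under the system temp directory from the URL's host and path."""
    parsed = urlparse(url)
    slug = (parsed.netloc + parsed.path).strip("/").replace("/", "_").replace(":", "_")
    return Path(tempfile.gettempdir()) / f"{slug or 'page'}.png"


def full_page_screenshot(url: str) -> Path:
    """Capture the whole scrollable page at `url` and return the saved file path.

    Hypothetical helper: assumes `pip install playwright` and
    `playwright install chromium` have been run.
    """
    from playwright.sync_api import sync_playwright  # deferred import

    out = screenshot_path(url)
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url)
        # full_page=True captures beyond the visible viewport.
        page.screenshot(path=str(out), full_page=True)
        browser.close()
    return out
```

The only part doing the real work is `page.screenshot(..., full_page=True)`; everything else is scaffolding the agent would write anyway.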