Show HN: ScrapeCopilot – Notebook Code Interface + Puppeteer + AI Copilot

3•erichi•1y ago
Hi HN, I’m Eric, and I’m building ScrapeCopilot, an AI assistant designed to eliminate friction in browser automation development.

Here is the link to the VS Code extension - https://marketplace.visualstudio.com/items?itemName=scrapeco...

I've built browser automations for more than 5 years, and the constant frustration was always the sheer friction involved in getting working code – especially when debugging in headless mode or connecting to remote browsers.

When I started using LLMs to generate automation code, I found myself stuck in a repetitive loop: navigate to the desired page state, copy-paste HTML into the AI chat, and ask it to generate code. The worst part is that there was no easy way to run that generated code without losing the page state, forcing me to restart the browser session constantly. This wasted large amounts of time and mental energy. I built ScrapeCopilot to make this workflow seamless.

How it works:

ScrapeCopilot combines the power of a Jupyter-style notebook with a live Puppeteer browser session and integrated AI.

- Live Interactive Development: When you create an automation notebook, it initiates a fresh Puppeteer browser session. The page object is exposed directly to your notebook cells, allowing you to run any Puppeteer code against the live browser state and see the results instantly.

- AI-Powered Assistance: It integrates with GitHub Copilot (via the @scrapecopilot chat participant). The AI automatically sees the current page HTML, allowing it to generate highly relevant Puppeteer code based on your instructions directly within the chat.

- LLM Code Export: Once you've developed your automation logic, you can easily export the final, complete Puppeteer script based on your instructions.
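To make the idea behind the bullets above concrete, here is a rough sketch, not ScrapeCopilot's actual code. It shows one page object living across several cell runs, and a prompt built from the current page HTML. Everything named here (`PageLike`, `NotebookSession`, `FakePage`, `buildPrompt`, the `maxChars` cutoff) is hypothetical and made up for illustration. The fake page stands in for a real browser so the snippet runs on its own.

```typescript
// Hypothetical stand-in for the small part of a browser page this sketch uses.
interface PageLike {
  goto(url: string): Promise<unknown>;
  content(): Promise<string>;
}

// A notebook cell is modeled as an async function that receives the shared page.
type Cell = (page: PageLike) => Promise<unknown>;

// Hypothetical session: the same page object is reused for every cell,
// so page state is not lost between runs.
class NotebookSession {
  private readonly page: PageLike;

  constructor(page: PageLike) {
    this.page = page;
  }

  // Run one cell against the live page and return whatever the cell returns.
  async runCell(cell: Cell): Promise<unknown> {
    return cell(this.page);
  }

  // Combine the user's instruction with the current page HTML,
  // truncated to maxChars (an assumed limit, chosen arbitrarily).
  async buildPrompt(instruction: string, maxChars: number = 2000): Promise<string> {
    const html = await this.page.content();
    return `${instruction}\n\nCurrent page HTML:\n${html.slice(0, maxChars)}`;
  }
}

// In-memory fake page so the sketch runs without a browser.
class FakePage implements PageLike {
  private url: string = "about:blank";

  async goto(url: string): Promise<unknown> {
    this.url = url;
    return null;
  }

  async content(): Promise<string> {
    return `<html><body><h1>${this.url}</h1></body></html>`;
  }
}
```

In this sketch, a first cell could call `page.goto(...)` and a later cell could call `page.content()` and still see the navigated state, because both receive the same object. A real Puppeteer page also has `goto` and `content` methods, so it should be possible to pass one in place of `FakePage`, though I have not verified that here.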

This tool saves me hours daily, but even more importantly, it improves the developer experience in browser automation, which is a frustrating area.

I believe ScrapeCopilot can complement existing browser automation tools and frameworks by providing an interactive AI-assisted development experience.

Current Status & Future Plans:

- The extension currently works within VS Code. It will work in Cursor, but without chat support initially. I'm actively working on integrating a backend server to enable full chat functionality with Cursor.

- Currently the key workflow assumes that you create a new browser automation step by step, using code cells. But in my work I spend half of my time fixing existing automations, so my focus now is adapting the extension for debugging and fixing existing code.

- Playwright support is also on the list.

Check out short videos:

- Demo: Headless False - https://scrapecopilot.ai/assets/demo-headless-false-Dhc_jeNR...

- Demo: Headless True - https://scrapecopilot.ai/assets/demo-headless-true-PRQndDxP....

I'd love to hear your thoughts, feedback, and any suggestions!
