Show HN: I built a YC data scraper in under 5 minutes

https://www.autonoly.com/blog/682212a2b65a68f26d0c10a4/how-to-scrape-complete-y-combinator-startup-data-in-3-minutes-without-writing-a-single-line-of-code
2•dpacman•1y ago
Hi HN,

I'm an indie hacker who built Autonoly solo over the past 3.5 months. I essentially vibe coded the entire platform based on automation needs I encountered in my own work. I wanted to share a practical example of what it can do - creating a Y Combinator data scraper in just a few minutes without writing any traditional code.

The technical approach is straightforward but effective:

1. Browser automation navigates to YC's company directory.
2. For YC's infinite scroll pagination, I implemented a progressive scroll function that iterates about 150 times with calibrated delays (ensuring all ~1000+ companies load).
3. Data extraction uses XPath selectors to identify and capture the structural pattern of each company listing.
4. The system then extracts specific data points (company name, description, location, etc.) into a structured CSV.
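For readers who want to picture step 2 in code, here is a minimal sketch of a fixed-count scroll loop with a delay after each scroll. It is not Autonoly's implementation. The names `progressive_scroll`, `scroll_once`, `count_items`, and `FakePage` are hypothetical, and the fake page stands in for a real browser so the example runs without one. With a real automation library, `scroll_once` would be whatever call scrolls the page to the bottom.

```python
import time
from typing import Callable


def progressive_scroll(
    scroll_once: Callable[[], None],
    count_items: Callable[[], int],
    max_iterations: int = 150,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Scroll max_iterations times, pausing after each scroll.

    Returns the number of items loaded once scrolling finishes.
    """
    for _ in range(max_iterations):
        scroll_once()         # ask the page to scroll and trigger the next batch
        sleep(delay_seconds)  # give the page time to load before scrolling again
    return count_items()


class FakePage:
    """Hypothetical stand-in for a browser page: each scroll loads one batch."""

    def __init__(self, total: int, batch: int = 10) -> None:
        self.total = total
        self.batch = batch
        self.loaded = 0

    def scroll(self) -> None:
        # Load one more batch, never exceeding the total available.
        self.loaded = min(self.total, self.loaded + self.batch)

    def count(self) -> int:
        return self.loaded


# Simulate a directory of 1000 entries; the delay is disabled for the demo.
page = FakePage(total=1000)
loaded = progressive_scroll(
    page.scroll, page.count, delay_seconds=0.0, sleep=lambda _s: None
)
```

The iteration count and delay are the two knobs the post describes tuning. Too few iterations leaves entries unloaded, and too short a delay scrolls again before the previous batch arrives.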

The trickiest parts were getting the XPath patterns right (the DOM structure varies slightly between different company entries) and fine-tuning the scroll timing to ensure complete loading without timeout issues.
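One common way to cope with listings whose structure varies is to try several XPath patterns per field and take the first that matches. The sketch below illustrates that idea with Python's standard library only. The sample markup, class names, and XPath patterns are invented for illustration and do not reflect YC's actual DOM or Autonoly's selectors. Note that `xml.etree.ElementTree` supports only a subset of XPath and expects well-formed markup.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Invented sample markup: the second entry uses a different tag for the
# name and has no location, mimicking entries that vary slightly.
SAMPLE = """
<div>
  <a class="company">
    <span class="name">Acme</span>
    <span class="desc">Rockets</span>
    <span class="loc">Austin</span>
  </a>
  <a class="company">
    <h3 class="title">Bolt</h3>
    <span class="desc">Payments</span>
  </a>
</div>
""".strip()

# Hypothetical patterns: each field lists fallbacks, tried in order.
FIELD_XPATHS = {
    "name": [".//span[@class='name']", ".//h3[@class='title']"],
    "description": [".//span[@class='desc']"],
    "location": [".//span[@class='loc']"],
}


def first_match(node: ET.Element, xpaths: list[str]) -> str:
    """Return the text of the first pattern that matches, else an empty string."""
    for xpath in xpaths:
        found = node.find(xpath)
        if found is not None and found.text:
            return found.text.strip()
    return ""


def extract_rows(markup: str) -> list[dict[str, str]]:
    """Extract one dict per company listing from the markup."""
    root = ET.fromstring(markup)
    return [
        {field: first_match(card, xpaths) for field, xpaths in FIELD_XPATHS.items()}
        for card in root.findall(".//a[@class='company']")
    ]


def to_csv(rows: list[dict[str, str]]) -> str:
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(FIELD_XPATHS))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = extract_rows(SAMPLE)
csv_text = to_csv(rows)
```

Missing fields come out as empty strings rather than raising, so one irregular entry does not abort the whole export.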

What makes this approach effective is that it works with the site's intended user experience. The browser automation renders JavaScript properly, handles dynamic loading, and interacts with elements in a natural way.

While this YC scraper example is specific, I built Autonoly to automate virtually any digital task - data processing, content creation, file management, business workflows, and more. As an indie developer, I kept encountering processes that were tedious to do manually but didn't justify hiring someone or spending weeks on custom code.

I'd love to hear feedback from the HN community, especially from those who've built similar systems or have different approaches to workflow automation. Happy to answer any technical questions about the implementation or discuss the challenges of building automation tools as a solo founder.
