Show HN: I built a YC data scraper in under 5 minutes

https://www.autonoly.com/blog/682212a2b65a68f26d0c10a4/how-to-scrape-complete-y-combinator-startup-data-in-3-minutes-without-writing-a-single-line-of-code
2•dpacman•1y ago
Hi HN,

I'm an indie hacker who built Autonoly solo over the past 3.5 months. I essentially vibe coded the entire platform based on automation needs I encountered in my own work. I wanted to share a practical example of what it can do - creating a Y Combinator data scraper in just a few minutes without writing any traditional code.

The technical approach is straightforward but effective:

1. Browser automation navigates to YC's company directory
2. For YC's infinite scroll pagination, I implemented a progressive scroll function that iterates about 150 times with calibrated delays (ensuring all ~1000+ companies load)
3. Data extraction uses XPath selectors to identify and capture the structural pattern of each company listing
4. The system then extracts specific data points (company name, description, location, etc.) into a structured CSV
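
To make step 2 concrete, here is a minimal Python sketch of a progressive scroll loop. It is not Autonoly's actual implementation. The names `scroll_until_loaded`, `FakePage`, and the `patience` early-stop are assumptions added for illustration; only the cap of about 150 iterations and the per-scroll delay come from the description above. The loop is driver-agnostic: it takes two callables, so it can be tried without a browser.

```python
import time
from typing import Callable


def scroll_until_loaded(
    scroll_once: Callable[[], None],
    count_items: Callable[[], int],
    max_scrolls: int = 150,   # cap taken from the post ("about 150 times")
    delay_s: float = 0.0,     # the "calibrated delay"; tune per site
    patience: int = 3,        # assumption: stop after this many scrolls with no growth
) -> int:
    """Scroll repeatedly and return the number of items loaded at the end."""
    seen = count_items()
    stalled = 0
    for _ in range(max_scrolls):
        scroll_once()
        time.sleep(delay_s)  # give lazy-loaded content time to render
        now = count_items()
        if now > seen:
            seen, stalled = now, 0
        else:
            stalled += 1
            if stalled >= patience:
                break  # nothing new appeared for several scrolls in a row
    return seen


class FakePage:
    """Hypothetical stand-in for a browser page that loads `batch` items per scroll."""

    def __init__(self, total: int, batch: int) -> None:
        self.total = total
        self.batch = batch
        self.loaded = 0

    def scroll(self) -> None:
        self.loaded = min(self.total, self.loaded + self.batch)

    def count(self) -> int:
        return self.loaded


page = FakePage(total=1000, batch=20)
loaded = scroll_until_loaded(page.scroll, page.count)

short_page = FakePage(total=1000, batch=20)
partial = scroll_until_loaded(short_page.scroll, short_page.count, max_scrolls=10)
```

With a real browser driver, `scroll_once` could wrap something like Selenium's `driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")`, and `count_items` could count the elements matching the listing XPath. The early stop is a design choice: it avoids burning all 150 iterations once the directory has finished loading.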

The trickiest parts were getting the XPath patterns right (the DOM structure varies slightly between different company entries) and fine-tuning the scroll timing to ensure complete loading without timeout issues.
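
One way to cope with listings whose DOM differs slightly is to try several XPath patterns per field and take the first match. The sketch below assumes a made-up markup (`SAMPLE`) and made-up class names; YC's real DOM is not shown in this post and will differ. It uses only the Python standard library, whose `xml.etree.ElementTree` supports a limited XPath subset, so the sample must be well-formed XML.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical markup: the second listing nests its name one level deeper
# and has no location, mimicking the structural variation described above.
SAMPLE = """
<div>
  <a class="company">
    <span class="name">Acme</span>
    <span class="desc">Rockets</span>
    <span class="loc">Austin, TX</span>
  </a>
  <a class="company">
    <div class="header"><span class="name">Globex</span></div>
    <span class="desc">Energy</span>
  </a>
</div>
"""

# For each output column, XPath patterns to try in order (strict first, loose second).
FIELD_PATHS = {
    "name": ["./span[@class='name']", ".//span[@class='name']"],
    "description": ["./span[@class='desc']", ".//span[@class='desc']"],
    "location": ["./span[@class='loc']", ".//span[@class='loc']"],
}


def first_text(node: ET.Element, paths: list[str]) -> str:
    """Return the text of the first pattern that matches, or '' if none do."""
    for path in paths:
        found = node.find(path)
        if found is not None and found.text:
            return found.text.strip()
    return ""


def extract_rows(markup: str) -> list[dict[str, str]]:
    """Turn every company listing in the markup into one dict per row."""
    root = ET.fromstring(markup.strip())
    return [
        {field: first_text(item, paths) for field, paths in FIELD_PATHS.items()}
        for item in root.findall(".//a[@class='company']")
    ]


def to_csv(rows: list[dict[str, str]]) -> str:
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(FIELD_PATHS))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = extract_rows(SAMPLE)
csv_text = to_csv(rows)
```

Missing fields come back as empty strings rather than raising, so one odd listing does not abort the whole export. Real-world HTML is rarely well-formed XML, so in practice the XPath would be evaluated by the browser or by an HTML-aware parser rather than by `ElementTree`.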

What makes this approach effective is that it works with the site's intended user experience. The browser automation renders JavaScript properly, handles dynamic loading, and interacts with elements in a natural way.

While this YC scraper example is specific, I built Autonoly to automate virtually any digital task - data processing, content creation, file management, business workflows, and more. As an indie developer, I kept encountering processes that were tedious to do manually but didn't justify hiring someone or spending weeks on custom code.

I'd love to hear feedback from the HN community, especially from those who've built similar systems or have different approaches to workflow automation. Happy to answer any technical questions about the implementation or discuss the challenges of building automation tools as a solo founder.
