Show HN: I built a YC data scraper in under 5 minutes

https://www.autonoly.com/blog/682212a2b65a68f26d0c10a4/how-to-scrape-complete-y-combinator-startup-data-in-3-minutes-without-writing-a-single-line-of-code
2•dpacman•1y ago
Hi HN,

I'm an indie hacker who built Autonoly solo over the past 3.5 months. I essentially vibe coded the entire platform based on automation needs I encountered in my own work. I wanted to share a practical example of what it can do - creating a Y Combinator data scraper in just a few minutes without writing any traditional code.

The technical approach is straightforward but effective:

1. Browser automation navigates to YC's company directory.
2. For YC's infinite scroll pagination, I implemented a progressive scroll function that iterates about 150 times with calibrated delays (ensuring all ~1000+ companies load).
3. Data extraction uses XPath selectors to identify and capture the structural pattern of each company listing.
4. The system then extracts specific data points (company name, description, location, etc.) into a structured CSV.
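To make step 2 concrete, here is a minimal sketch of a progressive scroll loop in Python. It is not Autonoly's implementation. The function name `progressive_scroll`, its parameters, the early-stop rule (give up after a few rounds with no new items), and the `FakePage` stand-in are all assumptions for illustration. With a real browser, `scroll_once` and `count_items` would wrap whatever automation driver is in use.

```python
import time
from typing import Callable


def progressive_scroll(
    scroll_once: Callable[[], None],
    count_items: Callable[[], int],
    max_iterations: int = 150,      # the post mentions roughly 150 iterations
    delay_seconds: float = 1.0,     # assumed delay; the real value needs tuning
    idle_rounds_to_stop: int = 3,   # assumed early-stop rule, not from the post
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Scroll until the cap is hit or the item count stops growing."""
    last_count = count_items()
    idle_rounds = 0
    for _ in range(max_iterations):
        scroll_once()
        sleep(delay_seconds)  # give lazily loaded entries time to render
        current = count_items()
        if current > last_count:
            last_count = current
            idle_rounds = 0
        else:
            idle_rounds += 1
            if idle_rounds >= idle_rounds_to_stop:
                break  # nothing new for several rounds: assume the list is complete
    return last_count


class FakePage:
    """Hypothetical stand-in for a browser page: each scroll reveals one batch."""

    def __init__(self, total: int, batch: int) -> None:
        self.total = total
        self.batch = batch
        self.loaded = batch  # the first batch is visible before any scrolling
        self.scrolls = 0

    def scroll(self) -> None:
        self.scrolls += 1
        self.loaded = min(self.total, self.loaded + self.batch)

    def count(self) -> int:
        return self.loaded


if __name__ == "__main__":
    page = FakePage(total=1000, batch=40)
    loaded = progressive_scroll(page.scroll, page.count, sleep=lambda s: None)
    print(f"loaded {loaded} items after {page.scrolls} scrolls")
```

Passing `sleep` as a parameter keeps the loop testable without real waiting. The early stop avoids spending all 150 iterations once the directory has finished loading, at the cost of stopping too soon if the delay is shorter than the page's load time.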

The trickiest parts were getting the XPath patterns right (the DOM structure varies slightly between different company entries) and fine-tuning the scroll timing to ensure complete loading without timeout issues.
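One common way to cope with entries whose structure differs slightly is to try several candidate paths per field and take the first that matches. The sketch below illustrates that idea under stated assumptions: the markup in `SAMPLE`, the class names, and the helper names are hypothetical and do not reflect YC's real DOM or Autonoly's selectors. It uses Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath; a real browser evaluates full XPath 1.0.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical markup: the second entry nests its name differently
# and has no location at all.
SAMPLE = """
<div>
  <a class="company">
    <span class="name">Acme</span>
    <span class="location">Austin, TX</span>
    <span class="description">Rocket skates</span>
  </a>
  <a class="company">
    <div class="header"><span class="name">Globex</span></div>
    <span class="description">Energy</span>
  </a>
</div>
"""

# For each output column, candidate paths in priority order (all assumed).
FIELD_PATHS = {
    "name": ["span[@class='name']", "div[@class='header']/span[@class='name']"],
    "location": ["span[@class='location']"],
    "description": ["span[@class='description']"],
}


def first_text(node, paths):
    """Return the text of the first path that matches, or '' if none do."""
    for path in paths:
        text = node.findtext(path)
        if text is not None and text.strip():
            return text.strip()
    return ""


def extract_rows(markup):
    """Turn each company entry into a dict with one key per field."""
    root = ET.fromstring(markup.strip())
    return [
        {field: first_text(card, paths) for field, paths in FIELD_PATHS.items()}
        for card in root.findall(".//a[@class='company']")
    ]


def to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(FIELD_PATHS), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(extract_rows(SAMPLE)))
```

Missing fields become empty cells rather than errors, so one unusual entry does not abort the whole export.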

What makes this approach effective is that it works with the site's intended user experience. The browser automation renders JavaScript properly, handles dynamic loading, and interacts with elements in a natural way.

While this YC scraper example is specific, I built Autonoly to automate virtually any digital task - data processing, content creation, file management, business workflows, and more. As an indie developer, I kept encountering processes that were tedious to do manually but didn't justify hiring someone or spending weeks on custom code.

I'd love to hear feedback from the HN community, especially from those who've built similar systems or have different approaches to workflow automation. Happy to answer any technical questions about the implementation or discuss the challenges of building automation tools as a solo founder.
