Show HN: I built a YC data scraper in under 5 minutes

https://www.autonoly.com/blog/682212a2b65a68f26d0c10a4/how-to-scrape-complete-y-combinator-startup-data-in-3-minutes-without-writing-a-single-line-of-code
2•dpacman•8mo ago
Hi HN,

I'm an indie hacker who built Autonoly solo over the past 3.5 months. I essentially vibe coded the entire platform based on automation needs I encountered in my own work. I wanted to share a practical example of what it can do - creating a Y Combinator data scraper in just a few minutes without writing any traditional code.

The technical approach is straightforward but effective:

1. Browser automation navigates to YC's company directory
2. For YC's infinite scroll pagination, I implemented a progressive scroll function that iterates about 150 times with calibrated delays (ensuring all ~1000+ companies load)
3. Data extraction uses XPath selectors to identify and capture the structural pattern of each company listing
4. The system then extracts specific data points (company name, description, location, etc.) into a structured CSV
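
For readers who want to see the scroll step in code, here is a minimal Python sketch of a progressive scroll loop. It is not Autonoly's implementation: the function name, the callbacks, the one-second delay, and the early stop after a few unchanged rounds are all assumptions made for illustration. Only the cap of roughly 150 iterations comes from the post.

```python
import time
from typing import Callable


def scroll_until_loaded(
    scroll_once: Callable[[], None],
    count_items: Callable[[], int],
    max_scrolls: int = 150,        # "about 150" iterations, per the post
    delay_seconds: float = 1.0,    # assumed delay; tune per site
    stable_rounds: int = 3,        # assumed early-stop threshold
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Scroll repeatedly and return the number of items seen at the end.

    scroll_once and count_items are hypothetical callbacks that wrap
    whatever browser automation tool is in use.
    """
    last_count = count_items()
    unchanged = 0
    for _ in range(max_scrolls):
        scroll_once()
        sleep(delay_seconds)  # give lazily loaded entries time to render
        current = count_items()
        if current == last_count:
            unchanged += 1
            if unchanged >= stable_rounds:
                break  # nothing new for several rounds: assume the list is complete
        else:
            unchanged = 0
            last_count = current
    return last_count
```

With Selenium, for example, `scroll_once` could wrap `driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")` and `count_items` could count the matched listing elements. Passing `sleep` as a parameter keeps the loop testable without real delays.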

The trickiest parts were getting the XPath patterns right (the DOM structure varies slightly between different company entries) and fine-tuning the scroll timing to ensure complete loading without timeout issues.
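
One common way to cope with DOM variation between entries is to try several XPath patterns per field and fall back to an empty value. The sketch below illustrates that idea with the standard library only. The markup, class names, and field paths are invented for the example and do not reflect YC's actual page structure, and `xml.etree.ElementTree` supports only a subset of XPath, so a real browser-based tool would evaluate richer expressions against the live DOM.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the rendered directory markup.
# The second entry nests the name differently, uses <p> for the
# description, and has no location.
SAMPLE = """
<div>
  <a class="company">
    <span class="name">Acme</span>
    <span class="location">Austin, TX</span>
    <span class="description">Rockets for coyotes</span>
  </a>
  <a class="company">
    <div class="header"><span class="name">Globex</span></div>
    <p class="description">Energy research</p>
  </a>
</div>
"""

# Each field lists candidate XPath patterns, tried in order (all assumed).
FIELD_PATHS = {
    "name": [".//span[@class='name']"],
    "location": [".//span[@class='location']"],
    "description": [".//span[@class='description']", ".//p[@class='description']"],
}


def first_text(node, paths):
    """Return the text of the first pattern that matches, else an empty string."""
    for path in paths:
        found = node.find(path)
        if found is not None and found.text:
            return found.text.strip()
    return ""


def extract_rows(markup):
    """Build one dict per company listing found in the markup."""
    root = ET.fromstring(markup.strip())
    return [
        {field: first_text(card, paths) for field, paths in FIELD_PATHS.items()}
        for card in root.findall(".//a[@class='company']")
    ]


def to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(FIELD_PATHS), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(extract_rows(SAMPLE)))
```

Listing fallbacks per field keeps a missing or restructured element from dropping the whole row; the entry simply gets a blank cell in the CSV.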

This approach is effective because it uses the site the way a regular visitor would. The automated browser renders JavaScript properly, handles dynamic loading, and interacts with page elements as a user does.

While this YC scraper example is specific, I built Autonoly to automate virtually any digital task - data processing, content creation, file management, business workflows, and more. As an indie developer, I kept encountering processes that were tedious to do manually but didn't justify hiring someone or spending weeks on custom code.

I'd love to hear feedback from the HN community, especially from those who've built similar systems or have different approaches to workflow automation. Happy to answer any technical questions about the implementation or discuss the challenges of building automation tools as a solo founder.

Attack Against Poland's Grid Disrupted Communication Devices at About 30 Sites

https://www.zetter-zeroday.com/attack-against-polands-grid-disrupted-communication-devices-at-abo...
1•aa_is_op•25s ago•0 comments

A LinkedIn Job Offer Tried to Install Malware on My Machine

https://codecrank.ai/blog/linkedin-malware-warning/
1•campuscodi•51s ago•0 comments

The Endless Pursuit of "The Next New Thing"

https://www.souravinsights.com/blog/the-endless-pursuit-of-the-next-new-thing
1•SouravInsights•1m ago•0 comments

Pigmentation screen in Drosophila reveals regulators of brain dopamine, sleep

https://www.sciencedirect.com/science/article/pii/S2589004225026495
1•PaulHoule•2m ago•0 comments

4MB JavaScript runtime for IoT and edge devices

1•altinmert•3m ago•0 comments

Federal judge sides with city of Norfolk in Flock camera lawsuit

https://www.wtkr.com/news/in-the-community/norfolk/federal-judge-sides-with-city-of-norfolk-in-fl...
1•pilingual•3m ago•0 comments

Amazon to Cut 16,000 Jobs in Latest Round of Layoffs

https://www.nytimes.com/2026/01/28/technology/amazon-corporate-layoffs.html
1•apparent•4m ago•1 comments

Scientists link 22 genes to deadly risks from common virus

https://www.newscientist.com/article/2513522-this-virus-infects-most-of-us-but-why-do-only-some-g...
1•muhammedbash•4m ago•0 comments

Where did southern Australia's record-breaking heatwave come from?

https://theconversation.com/where-did-southern-australias-record-breaking-heatwave-come-from-274417
1•MaysonL•9m ago•0 comments

Tesla's unsupervised robotaxis vanish after earnings announcement

https://electrek.co/2026/01/28/teslas-unsupervised-robotaxis-vanish/
2•Animats•9m ago•0 comments

Show HN: EU Regulations MCP Server – Query GDPR, Dora, NIS2 from Claude

https://github.com/Ansvar-Systems/EU_compliance_MCP
1•Aesir89•10m ago•0 comments

Notes on Clawbot: agents are starting to look like OS runtimes design now

1•chenyusu•11m ago•0 comments

Energy based AI reasoning model – Sudoku solver performance comparison

https://sudoku.logicalintelligence.com/
1•alexjray•12m ago•0 comments

Show HN: BoxLight- a macOS Sonoma ring light app that doesn't waste space

https://www.mimiran.com/boxlight-a-macos-ring-light-that-doesnt-require-tahoe-and-doesnt-waste-sc...
1•reubenswartz•13m ago•0 comments

Choosing a web host feels a lot like a first date

https://www.hostingadvice.com/blog/the-first-30-minutes-in-hosting-are-like-a-first-date/
1•ljh501•17m ago•0 comments

We Optimized the World to Death

https://www.himthe.dev/blog/we-optimized-the-world-to-death
1•vegadw•17m ago•0 comments

The Three-Cent Problem

https://docs.eventsourcingdb.io/blog/2026/01/29/the-three-cent-problem/
1•goloroden•18m ago•0 comments

China's Four-Year Energy Spree Has Eclipsed US Power Grid

https://www.bloomberg.com/news/articles/2026-01-28/china-s-four-year-energy-spree-has-eclipsed-en...
1•toomuchtodo•18m ago•1 comments

Show HN: MoaV – Mother of all VPNs

https://github.com/shayanb/MoaV
2•shayanbahal•22m ago•0 comments

Stoicism-as-a-Service (SaaS)

https://indianspotlighttime.substack.com/p/stoicism-as-a-service-saas
2•koolhead17•23m ago•0 comments

Add Support for PyCapsule to Pyspark

https://github.com/apache/spark/commit/ecf179c3485ba8bac72afd9105892d9798d23f8f
3•devin-petersohn•25m ago•0 comments

Selectively Disabling HTTP/1.0 and HTTP/1.1

https://markmcb.com/web/selectively_disabling_http_1/
2•birdculture•26m ago•0 comments

We instrumented GitHub Actions. Here's what GitHub won't show you

https://depot.dev/blog/we-instrumented-github-actions
1•kylegalbraith•28m ago•0 comments

Predictive Translation: High-Perf Buffer Management Without the Tradeoffs [pdf]

https://db.in.tum.de/~zinsmeister/papers/predictive-translation.pdf
1•tanelpoder•33m ago•0 comments

Dark Triad: Explained

https://www.technotheoria.org/p/dark-triad-explained
2•paulpauper•35m ago•0 comments

I'm back to building my own digital music collection

https://hidde.blog/owning-music/
2•speckx•36m ago•0 comments

The Jevons Paradox of AI and New Industries of Creativity

https://kamilas.substack.com/p/the-jevons-paradox-of-ai-and-new
1•kamselig•42m ago•1 comments

Native Linux VST plugin directory

https://linuxmusic.rocks
10•Aldipower•43m ago•5 comments

New Books Aren't Worth Reading

https://www.atlaspress.co/p/new-books-arent-worth-reading
4•speckx•43m ago•0 comments

Don't Be a Tourist in Your Own Codebase

https://www.parand.com/dont-be-a-tourist-in-your-own-code-base.html
2•tworats•44m ago•0 comments