
Show HN: I built a YC data scraper in under 5 minutes

https://www.autonoly.com/blog/682212a2b65a68f26d0c10a4/how-to-scrape-complete-y-combinator-startup-data-in-3-minutes-without-writing-a-single-line-of-code
2•dpacman•1y ago
Hi HN,

I'm an indie hacker who built Autonoly solo over the past 3.5 months. I essentially vibe coded the entire platform based on automation needs I encountered in my own work. I wanted to share a practical example of what it can do - creating a Y Combinator data scraper in just a few minutes without writing any traditional code.

The technical approach is straightforward but effective:

1. Browser automation navigates to YC's company directory
2. For YC's infinite scroll pagination, I implemented a progressive scroll function that iterates about 150 times with calibrated delays (ensuring all ~1000+ companies load)
3. Data extraction uses XPath selectors to identify and capture the structural pattern of each company listing
4. The system then extracts specific data points (company name, description, location, etc.) into a structured CSV
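For readers who want to see steps 3 and 4 in code, here is a minimal Python sketch of XPath-style extraction into CSV. It is not Autonoly's implementation: the markup in `SAMPLE_PAGE`, the class names, and the helpers `extract_companies` and `to_csv` are all made up for illustration, and it uses the limited XPath subset in Python's standard library rather than a real browser.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup standing in for the rendered directory page.
# The real page structure is an unknown here; class names are assumptions.
SAMPLE_PAGE = """
<div id="results">
  <a class="company" href="/companies/alpha">
    <span class="name">Alpha</span>
    <span class="location">San Francisco, CA</span>
    <span class="description">Payments for robots</span>
  </a>
  <a class="company" href="/companies/beta">
    <span class="name">Beta</span>
    <span class="description">Compilers as a service</span>
  </a>
</div>
"""

FIELDS = ("name", "location", "description")


def extract_companies(markup):
    """Return one dict per company card found in the markup."""
    root = ET.fromstring(markup)
    rows = []
    # One XPath selects every card; relative XPaths pull fields from each card.
    for card in root.findall(".//a[@class='company']"):
        row = {}
        for field in FIELDS:
            # A missing element yields "" so that varying cards do not break the row.
            text = card.findtext(f"./span[@class='{field}']", default="")
            row[field] = text.strip()
        rows.append(row)
    return rows


def to_csv(rows):
    """Serialize the extracted rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


companies = extract_companies(SAMPLE_PAGE)
csv_text = to_csv(companies)
print(csv_text)
```

The second sample card has no location on purpose, to show one way of tolerating entries whose structure differs slightly from the rest.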

The trickiest parts were getting the XPath patterns right (the DOM structure varies slightly between different company entries) and fine-tuning the scroll timing to ensure complete loading without timeout issues.
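The scroll timing problem can be sketched in a driver-agnostic way. The Python below is an assumption-laden illustration, not the code behind Autonoly: `scroll_until_stable`, its parameters, and the `FakePage` stand-in are hypothetical names. It keeps the cap of about 150 iterations mentioned above, but adds an early stop once the item count stops growing, which is a variation on the fixed loop described in the post.

```python
import time


def scroll_until_stable(scroll_once, count_items, max_scrolls=150,
                        delay=0.5, patience=3, sleep=time.sleep):
    """Scroll repeatedly until no new items appear for `patience` rounds.

    scroll_once: callable that scrolls the page one step (hypothetical hook).
    count_items: callable returning how many listings are currently loaded.
    """
    last_count = count_items()
    stable_rounds = 0
    for _ in range(max_scrolls):
        scroll_once()
        sleep(delay)  # give lazy-loaded content time to arrive
        current = count_items()
        if current > last_count:
            last_count = current
            stable_rounds = 0
        else:
            stable_rounds += 1
            if stable_rounds >= patience:
                break  # nothing new for several rounds: assume the list is complete
    return last_count


class FakePage:
    """In-memory stand-in for an infinite-scroll page, used only for the demo."""

    def __init__(self, total, batch):
        self.total = total
        self.batch = batch
        self.loaded = min(batch, total)
        self.scrolls = 0

    def scroll(self):
        self.scrolls += 1
        self.loaded = min(self.total, self.loaded + self.batch)

    def count(self):
        return self.loaded


page = FakePage(total=1000, batch=40)
loaded = scroll_until_stable(page.scroll, page.count, delay=0, sleep=lambda s: None)
print(loaded, page.scrolls)
```

With a real browser driver, `scroll_once` and `count_items` would wrap the driver's own scroll and element-count calls, and `delay` and `patience` would need tuning against the site's actual load times.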

What makes this approach effective is that it works with the site's intended user experience. The browser automation renders JavaScript properly, handles dynamic loading, and interacts with elements in a natural way.

While this YC scraper example is specific, I built Autonoly to automate virtually any digital task - data processing, content creation, file management, business workflows, and more. As an indie developer, I kept encountering processes that were tedious to do manually but didn't justify hiring someone or spending weeks on custom code.

I'd love to hear feedback from the HN community, especially from those who've built similar systems or have different approaches to workflow automation. Happy to answer any technical questions about the implementation or discuss the challenges of building automation tools as a solo founder.
