Show HN: I built a YC data scraper in under 5 minutes

https://www.autonoly.com/blog/682212a2b65a68f26d0c10a4/how-to-scrape-complete-y-combinator-startup-data-in-3-minutes-without-writing-a-single-line-of-code
2•dpacman•11mo ago
Hi HN,

I'm an indie hacker who built Autonoly solo over the past 3.5 months. I essentially vibe coded the entire platform based on automation needs I encountered in my own work. I wanted to share a practical example of what it can do - creating a Y Combinator data scraper in just a few minutes without writing any traditional code.

The technical approach is straightforward but effective:

1. Browser automation navigates to YC's company directory
2. For YC's infinite scroll pagination, I implemented a progressive scroll function that iterates about 150 times with calibrated delays (ensuring all ~1000+ companies load)
3. Data extraction uses XPath selectors to identify and capture the structural pattern of each company listing
4. The system then extracts specific data points (company name, description, location, etc.) into a structured CSV
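For readers who want to see the scroll step in code, here is a minimal Python sketch of a progressive scroll loop. It is not Autonoly's implementation: the `ScrollablePage` interface, the `FakePage` stand-in, the 0.5 second delay, and the `patience` early-stop are all assumptions made for illustration. Only the idea of roughly 150 iterations with a delay between them comes from the post.

```python
import time
from typing import Callable, Protocol


class ScrollablePage(Protocol):
    """Hypothetical minimal interface; a real driver would wrap a browser page."""

    def scroll_to_bottom(self) -> None: ...
    def count_items(self) -> int: ...


def progressive_scroll(
    page: ScrollablePage,
    max_iterations: int = 150,      # iteration cap mentioned in the post
    delay_seconds: float = 0.5,     # assumed value; tune per site
    patience: int = 5,              # assumed early-stop threshold
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Scroll until max_iterations is reached or no new items appear for
    `patience` consecutive scrolls. Returns the final item count."""
    last_count = page.count_items()
    stalled = 0
    for _ in range(max_iterations):
        page.scroll_to_bottom()
        sleep(delay_seconds)  # give lazily loaded entries time to render
        count = page.count_items()
        if count > last_count:
            last_count, stalled = count, 0
        else:
            stalled += 1
            if stalled >= patience:
                break  # nothing new for a while; assume the list is complete
    return last_count


class FakePage:
    """In-memory stand-in for a browser page, for demonstration only."""

    def __init__(self, total: int, batch: int) -> None:
        self.total, self.batch = total, batch
        self.loaded, self.scrolls = 0, 0

    def scroll_to_bottom(self) -> None:
        self.scrolls += 1
        self.loaded = min(self.total, self.loaded + self.batch)

    def count_items(self) -> int:
        return self.loaded
```

Calling `progressive_scroll(FakePage(total=1000, batch=40), sleep=lambda s: None)` exercises the loop without a browser. With a real driver, the two `ScrollablePage` methods would wrap whatever scroll and element-count calls that driver provides.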

The trickiest parts were getting the XPath patterns right (the DOM structure varies slightly between different company entries) and fine-tuning the scroll timing to ensure complete loading without timeout issues.
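One common way to cope with listings whose DOM structure varies is to try several XPath patterns per field, in order, and take the first match. The sketch below illustrates that idea with Python's standard library only. The sample markup, class names, and field paths are hypothetical and do not reflect YC's actual page; real-world HTML usually also needs a more tolerant parser than the strict XML parser used here.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical markup for two listings with slightly different structure.
SAMPLE = """
<div>
  <a class="company"><span class="name">Acme</span><span class="desc">Rockets</span><span class="loc">Austin, TX</span></a>
  <a class="company"><div><span class="name">Bolt</span></div><span class="desc">Payments</span></a>
</div>
"""

# For each output column, candidate XPath patterns tried in order.
FIELD_PATHS = {
    "name": ["./span[@class='name']", ".//span[@class='name']"],
    "description": ["./span[@class='desc']"],
    "location": ["./span[@class='loc']"],
}


def first_text(node, paths):
    """Return the text of the first pattern that matches, or '' if none do."""
    for path in paths:
        found = node.find(path)
        if found is not None and found.text:
            return found.text.strip()
    return ""


def extract_rows(markup):
    """Build one dict per company listing found in the markup."""
    root = ET.fromstring(markup)
    return [
        {field: first_text(card, paths) for field, paths in FIELD_PATHS.items()}
        for card in root.findall(".//a[@class='company']")
    ]


def to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(FIELD_PATHS), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

`to_csv(extract_rows(SAMPLE))` produces a header plus one line per listing. A field with no matching pattern is left empty rather than aborting the run, which keeps a single odd entry from breaking the whole export.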

What makes this approach effective is that it works with the site's intended user experience. The browser automation renders JavaScript properly, handles dynamic loading, and interacts with elements in a natural way.

While this YC scraper example is specific, I built Autonoly to automate virtually any digital task - data processing, content creation, file management, business workflows, and more. As an indie developer, I kept encountering processes that were tedious to do manually but didn't justify hiring someone or spending weeks on custom code.

I'd love to hear feedback from the HN community, especially from those who've built similar systems or have different approaches to workflow automation. Happy to answer any technical questions about the implementation or discuss the challenges of building automation tools as a solo founder.
