frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Open in hackernews

Why it's still hard to track Y Combinator Companies

https://www.researchly.at/post/y-combinator-companies-finden-tracken
3•leo_researchly•2w ago

Comments

leo_researchly•2w ago
We have been working on how to track Y Combinator Companies (and other incubators for that matter) and I wanted to share a few learnings:

Identifying them in the first place is easy; even with a simple scraper you can save them to Notion, Google Sheets etc. Firecrawl or Exa are pretty good for that

The hard part is keeping the list current and "actionable":

you need a way to cluster startups; most VCs (& their thesis) are sector-driven. So a sector-taxonomy is ideal. But classifying startups into sectors is extremely difficult; you cannot rely on official documentation as it either does not exist or is too vague. you cannot rely on the startup name as the unique identifier; it can change, it might not be unique Startups pivot; so you need to keep your lists up-to-date you need to keep your database up-to-date; can't rely on news as information is scattered all over the place, can't rely on generic alerts, you need signals like "two new use cases on website" What we have seen to work

use two types of taxonomies: a broad one (NACE codes are a great way to start) and an AI-based one (use a simple classifier based on the company website) use the domain as the unique identifier instead of the name monitor live signals; that's usually the hardest part. The simplest (but also most expensive way) is to capture certain pages (about us, career pages etc.) regularly and use AI to get a "diff" usually Google Sheets is enough to start, but move quickly to a more stable database like Notion or Airtable (CRMs work too, but tend to be too overloaded) use N8N to glue all that together with a few simple prompts If there's interest, I can also share a technical breakdown and N8N files to start.

PS: you can read our full breakdown (in German, however) on your blog: https://www.researchly.at/post/y-combinator-companies-finden...

Nvidia CEO Says AI Capital Spending Is Appropriate, Sustainable

https://www.bloomberg.com/news/articles/2026-02-06/nvidia-ceo-says-ai-capital-spending-is-appropr...
1•virgildotcodes•1m ago•1 comments

Show HN: StyloShare – privacy-first anonymous file sharing with zero sign-up

https://www.styloshare.com
1•stylofront•2m ago•0 comments

Part 1 the Persistent Vault Issue: Your Encryption Strategy Has a Shelf Life

1•PhantomKey•6m ago•0 comments

Show HN: Teleop_xr – Modular WebXR solution for bimanual robot teleoperation

https://github.com/qrafty-ai/teleop_xr
1•playercc7•9m ago•1 comments

The Highest Exam: How the Gaokao Shapes China

https://www.lrb.co.uk/the-paper/v48/n02/iza-ding/studying-is-harmful
1•mitchbob•13m ago•1 comments

Open-source framework for tracking prediction accuracy

https://github.com/Creneinc/signal-tracker
1•creneinc•15m ago•0 comments

India's Sarvan AI LLM launches Indic-language focused models

https://x.com/SarvamAI
2•Osiris30•16m ago•0 comments

Show HN: CryptoClaw – open-source AI agent with built-in wallet and DeFi skills

https://github.com/TermiX-official/cryptoclaw
1•cryptoclaw•19m ago•0 comments

ShowHN: Make OpenClaw respond in Scarlett Johansson’s AI Voice from the Film Her

https://twitter.com/sathish316/status/2020116849065971815
1•sathish316•21m ago•1 comments

CReact Version 0.3.0 Released

https://github.com/creact-labs/creact
1•_dcoutinho96•23m ago•0 comments

Show HN: CReact – AI Powered AWS Website Generator

https://github.com/creact-labs/ai-powered-aws-website-generator
1•_dcoutinho96•23m ago•0 comments

The rocky 1960s origins of online dating (2025)

https://www.bbc.com/culture/article/20250206-the-rocky-1960s-origins-of-online-dating
1•1659447091•29m ago•0 comments

Show HN: Agent-fetch – Sandboxed HTTP client with SSRF protection for AI agents

https://github.com/Parassharmaa/agent-fetch
1•paraaz•30m ago•0 comments

Why there is no official statement from Substack about the data leak

https://techcrunch.com/2026/02/05/substack-confirms-data-breach-affecting-email-addresses-and-pho...
8•witnessme•34m ago•1 comments

Effects of Zepbound on Stool Quality

https://twitter.com/ScottHickle/status/2020150085296775300
2•aloukissas•38m ago•1 comments

Show HN: Seedance 2.0 – The Most Powerful AI Video Generator

https://seedance.ai/
2•bigbromaker•40m ago•0 comments

Ask HN: Do we need "metadata in source code" syntax that LLMs will never delete?

1•andrewstuart•46m ago•1 comments

Pentagon cutting ties w/ "woke" Harvard, ending military training & fellowships

https://www.cbsnews.com/news/pentagon-says-its-cutting-ties-with-woke-harvard-discontinuing-milit...
6•alephnerd•49m ago•2 comments

Can Quantum-Mechanical Description of Physical Reality Be Considered Complete? [pdf]

https://cds.cern.ch/record/405662/files/PhysRev.47.777.pdf
1•northlondoner•49m ago•1 comments

Kessler Syndrome Has Started [video]

https://www.tiktok.com/@cjtrowbridge/video/7602634355160206623
2•pbradv•52m ago•0 comments

Complex Heterodynes Explained

https://tomverbeure.github.io/2026/02/07/Complex-Heterodyne.html
4•hasheddan•53m ago•0 comments

MemAlign: Building Better LLM Judges from Human Feedback with Scalable Memory

https://www.databricks.com/blog/memalign-building-better-llm-judges-human-feedback-scalable-memory
1•superchink•1h ago•0 comments

CCC (Claude's C Compiler) on Compiler Explorer

https://godbolt.org/z/asjc13sa6
2•LiamPowell•1h ago•0 comments

Homeland Security Spying on Reddit Users

https://www.kenklippenstein.com/p/homeland-security-spies-on-reddit
34•duxup•1h ago•6 comments

Actors with Tokio (2021)

https://ryhl.io/blog/actors-with-tokio/
1•vinhnx•1h ago•0 comments

Can graph neural networks for biology realistically run on edge devices?

https://doi.org/10.21203/rs.3.rs-8645211/v1
1•swapinvidya•1h ago•1 comments

Deeper into the shareing of one air conditioner for 2 rooms

1•ozzysnaps•1h ago•0 comments

Weatherman introduces fruit-based authentication system to combat deep fakes

https://www.youtube.com/watch?v=5HVbZwJ9gPE
3•savrajsingh•1h ago•0 comments

Why Embedded Models Must Hallucinate: A Boundary Theory (RCC)

http://www.effacermonexistence.com/rcc-hn-1-1
1•formerOpenAI•1h ago•2 comments

A Curated List of ML System Design Case Studies

https://github.com/Engineer1999/A-Curated-List-of-ML-System-Design-Case-Studies
3•tejonutella•1h ago•0 comments