Show HN: Sumble – knowledge graph for GTM data – query tech stack, key projects

https://sumble.com
63•antgoldbloom•2h ago
I’m Anthony, co-founder/CEO of Sumble. I was previously co-founder/CEO of Kaggle. Sumble is my newco with Ben Hamner (former co-founder and CTO of Kaggle).

### What we built

Sumble is a knowledge graph for go-to-market teams. It lets you run rich queries to identify prospects at a granular level and do very targeted outreach.

Sumble allows you to find:

- tech stacks (in larger companies, down to the team or buying group level)

- key projects those teams are working on (cloud migrations, GenAI initiatives, etc.)

- people involved in those key projects

For example, here's a list of GenAI projects at Capital One that involve RAG/Vector databases: https://sumble.com/l/6sDqKmhyAH

And this view includes a list of people who we think are involved in a particular project being undertaken by the AI Foundation Team at Capital One: https://sumble.com/l/j8mbRrDsly

These views allow you to reach out to that team with a granular understanding of what they are working on.

### Inspiration

Sumble was very much inspired by our experience at Kaggle:

1. Kaggle’s public-data platform showed us how hungry people are for high-quality data (the metrics on that product were really strong)

2. At Google we saw knowledge graphs unlock powerful and composable queries

### Trying it out

- The app is live today; you’ll need to log in (Google OAuth or magic links)

- Most functionality and data are free; we only charge individual users for bulk exports

### How it works (briefly)

- Sources: job posts, resume data, company websites (more to come!)

- Extraction & linking: We use LLMs (mostly fine-tuned models) to extract entities from the source text and link them into a hierarchy (company → team → people on a team → projects the team is undertaking → technology the team uses)
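
To make the hierarchy above concrete, here is a minimal sketch of how the extracted entities could be held in memory. It is an illustration only, not Sumble's actual data model: the class names, field names, the `teams_using` helper, and the "Example Corp" data are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Team:
    """A team inside a company, with the entities linked to it (hypothetical shape)."""
    name: str
    people: List[str] = field(default_factory=list)
    projects: List[str] = field(default_factory=list)
    technologies: List[str] = field(default_factory=list)


@dataclass
class Company:
    """Top of the hierarchy: company -> team -> people / projects / technologies."""
    name: str
    teams: List[Team] = field(default_factory=list)


def teams_using(company: Company, technology: str) -> List[str]:
    """Return the names of teams whose extracted stack includes `technology`.

    Matching is case-insensitive so "PyTorch" and "pytorch" are treated alike.
    """
    wanted = technology.lower()
    return [
        team.name
        for team in company.teams
        if wanted in (tech.lower() for tech in team.technologies)
    ]


# Made-up data standing in for what an extraction step might emit.
example = Company(
    name="Example Corp",
    teams=[
        Team(
            name="AI Platform",
            people=["A. Engineer"],
            projects=["RAG search"],
            technologies=["PyTorch", "pgvector"],
        ),
        Team(name="Payments", technologies=["Java"]),
    ],
)
```

With this shape, `teams_using(example, "pytorch")` selects only the "AI Platform" team, which is the kind of team-level filtering described in the post.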

### What’s next

- Adding more sources so you can run even more composable queries

- Opening an API so devs can hit the graph directly

- Much later: expand to use cases beyond GTM

### Feedback

- Is the web app intuitive?

- What queries do you want us to prioritize supporting in an API?

- What additional external data sources would you like us to prioritize?

- What workflow improvements/integrations would you find most helpful?

Comments

Nivge•2h ago
Congratulations! Looks awesome. 1. I found it very intuitive. 2. If I could have smart filtering using LLM classification, that would be very powerful. Any plans on doing that?
antgoldbloom•2h ago
As in a search box where you can ask free form queries rather than applying filters? We haven't heard much demand for that yet, so haven't prioritized it. We will if it's a common request.
johnsillings•2h ago
Sumble is one of my go-to data tools for GTM – great data quality and lots of interesting data points that are kind of a pain to find elsewhere.

I do find myself wanting to transform the data (especially the stuff in job descriptions) using an LLM, e.g. for scoring companies/contacts or looking for more subtle signals. Sometimes I do this manually but exporting a bunch of JDs from Sumble isn't possible AFAIK. Or doing it in Sumble would be great, too.

Awesome to see it on HN. Congrats on the launch!

benhamner•41m ago
Thanks! Job descriptions are included in job post CSV exports, which is currently the only path for that workflow.

We're planning to make that workflow much better in four ways this year:

1. Adding an API to make it easier to consume the data programmatically (next 2 months)

2. Running LLMs directly on tabular results in Sumble, so that job description context can be pulled into the LLM call

3. Experimenting with an MCP endpoint, to see if that's helpful for these workflows as well

4. Experimenting with adding Sumble scoring models

richardmeng•1h ago
Sumble has been a critical tool for me to research organizational structure and responsibilities in large companies, as well as technology adoption, such as which organizations have adopted LLMs.

Congrats on the launch!

jeffchuber•1h ago
There is so much signal in job posts - excited to see this launch.
pbmango•1h ago
As the founder of another product in this space - this is super impressive and well built. Great demo video and congrats on top of HN! Getting this smooth UX and data behind the scenes is not easy.
ryanrasti•1h ago
Wow -- tried it out and looks quite impressive. The granularity of data for these companies is amazing!

My last startup was selling to SMBs. It looks like Sumble is most likely targeted at mid-market and enterprise companies. Any plans to expand coverage into the long tail of smaller companies?

benhamner•1h ago
Thanks! Our current coverage is focused on companies with a significant online presence (e.g. they've made job posts, people say they work at the company, and/or they have a functional website).

Our goal is to have complete coverage of active companies and organizations in the world, as well as an understanding of companies that previously existed but are no longer active (these appear extensively in CRMs and add noise).

We prioritize expanding data coverage in areas that we hear are most useful from our current users and customers.

ryanrasti•1h ago
Awesome, go crush it!
esafak•1h ago
Nicely done. Do you have a roadmap, public ticketing system or communication channel?
benhamner•1h ago
Thanks! Haven't prioritized something public facing on this front yet - what would you find most helpful?
esafak•1h ago
I'd set up a ticketing system so you can receive bug reports and feature requests. It's more structured than chat rooms, which are information black holes.
csomar•1h ago
This is incredibly useful and I can see myself using it and paying a subscription. That being said:

1. I couldn't find some key people that I know work at an organization. How accurate is the data?

2. I don't know if this is happening because you are getting lots of traffic now, but each query takes 20-30 seconds which is unusable.

> - Is the web app intuitive?

Yes

> - What queries do you want us to prioritize supporting in an API?

Maybe specific but I want to filter by head count in job function (ie: find organizations that have 50-200 software engineers regardless of their total head count).

> - What additional external data sources would you like us to prioritize?
>
> - What workflow improvements/integrations would you find most helpful?

I don't really care as long as the data is as accurate as possible. Lead generation/research is a slow enough process that I don't think workflows matter.

benhamner•1h ago
> 2. I don't know if this is happening because you are getting lots of traffic now, but each query takes 20-30 seconds which is unusable.

Thanks! Which queries are you finding painful? Most should take under a second, though some are expensive.

csomar•1h ago
Simple queries. As in typing the name of a person in a company list of 200. Keeps spinning forever.
benhamner•1h ago
Thanks! We'll take a look at that one
JasonPunyon•1h ago
Thanks for taking it for a spin! I'm working on why this is slow now.
benhamner•1h ago
> Maybe specific but I want to filter by head count in job function (ie: find organizations that have 50-200 software engineers regardless of their total head count).

You're not alone! We've heard this from others as well, planning to add it soon

antgoldbloom•1h ago
People data has ~85% coverage at the moment for people who put their resume data online. We are going to be adding some other sources (e.g. Github profiles) that will help improve coverage, particularly for technical personas.
catpower•1h ago
How far off is an API? Looks slick but I’d want to be able to query programmatically
antgoldbloom•1h ago
Currently aiming for next 2 months.
vibhork•1h ago
Super interesting!
chsrbrts•1h ago
Using this product.... big fan. Most important in our GTM stack for building account lists.
riku_iki•1h ago
> For example, here's a list of GenAI projects at Capital One that involve RAG/Vector databases: https://sumble.com/l/6sDqKmhyAH

requires you to sign in, which is then followed by marketing emails.

antgoldbloom•33m ago
We plan to put some data outside the login wall so people can see what we have without logging in. Haven't put time into this flow yet.
rudx•1h ago
Great work, and interesting to see Knowledge Graphs in a production setting. Why did you choose a Knowledge Graph as the backend? How is the graph modeled? Do you use existing Graph Query languages, or did you have to create your own?
benhamner•54m ago
Thanks! We describe this as a knowledge graph because that's how we think about the structure in the data & is where we want to go.

Right now, we've focused on normalizing several key entities (e.g. organizations including parent/subsidiary relations, technologies, people, and job functions), and capturing the relations between these as well as additional useful metadata like location and industry.

From a backend implementation standpoint, this is currently implemented as structured relational tables for query performance and simplicity (e.g. count up all teams mentioning pytorch in job posts including rolling up across parent subsidiaries and sort by the biggest organizations descending).
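
For readers who want to picture the example query in that paragraph, here is a rough sketch using Python's built-in `sqlite3` module. Everything in it is an assumption made for illustration: the table names, column names, and demo rows are invented, and it is not Sumble's schema or query. It reads "biggest" as the largest rolled-up team count, which is only one possible interpretation. A recursive CTE maps every subsidiary to its top-level parent before counting.

```python
import sqlite3

# Hypothetical schema: one row per organization (with an optional parent),
# and one row per (organization, team, technology) mention found in a job post.
SCHEMA = """
CREATE TABLE organizations (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    parent_id INTEGER REFERENCES organizations(id)
);
CREATE TABLE job_post_mentions (
    org_id INTEGER NOT NULL REFERENCES organizations(id),
    team TEXT NOT NULL,
    technology TEXT NOT NULL
);
"""

# Map each organization to its top-level ancestor, then count distinct
# (organization, team) pairs mentioning the technology under each root.
ROLLUP_QUERY = """
WITH RECURSIVE ancestry(org_id, root_id) AS (
    SELECT id, id FROM organizations WHERE parent_id IS NULL
    UNION ALL
    SELECT o.id, a.root_id
    FROM organizations AS o
    JOIN ancestry AS a ON o.parent_id = a.org_id
)
SELECT r.name, COUNT(*) AS teams
FROM (
    SELECT DISTINCT a.root_id, m.org_id, m.team
    FROM job_post_mentions AS m
    JOIN ancestry AS a ON a.org_id = m.org_id
    WHERE m.technology = ?
) AS d
JOIN organizations AS r ON r.id = d.root_id
GROUP BY r.id, r.name
ORDER BY teams DESC, r.name
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database filled with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO organizations VALUES (?, ?, ?)",
        [
            (1, "Acme Holdings", None),
            (2, "Acme Labs", 1),      # subsidiary of Acme Holdings
            (3, "Acme Robotics", 2),  # subsidiary of Acme Labs
            (4, "Globex", None),
        ],
    )
    conn.executemany(
        "INSERT INTO job_post_mentions VALUES (?, ?, ?)",
        [
            (1, "Platform", "pytorch"),
            (2, "Research", "pytorch"),
            (2, "Research", "pytorch"),  # second post from the same team
            (3, "Perception", "pytorch"),
            (4, "Data Science", "pytorch"),
            (4, "Data Science", "spark"),
        ],
    )
    return conn


def teams_mentioning(conn: sqlite3.Connection, technology: str) -> list:
    """Return (root organization, team count) rows, largest count first."""
    return conn.execute(ROLLUP_QUERY, (technology,)).fetchall()
```

Calling `teams_mentioning(build_demo_db(), "pytorch")` rolls the three Acme teams up under "Acme Holdings" and lists it ahead of "Globex". Plain relational tables handle this kind of aggregation without a dedicated graph query language, which fits the "query performance and simplicity" rationale given above.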

Future direction here is TBD as we expand the sources that we cover and types of queries that can be computed across these sources.

There have been a lot of attempts at building high-quality public knowledge graphs that haven't hit escape velocity.

We're focusing on a structured, commercially relevant subset of the problem as a starting point to generate a critical mass of usage and funding that will enable us to build the bigger vision: a highly structured, up-to-date, and trusted repository of all the facts about the world that is easy to browse, query, and integrate programmatically into all the relevant workflows (including for grounding LLMs).

rudx•44m ago
Appreciate you sharing the vision. Having worked in this space for a while, IMO the biggest challenges for a public facing graph are in 1. Entity Linking from NL Query -> Graph queries or in your case relational queries (Multiple similarly named teams in Microsoft). And 2. Relevance of results for more complex queries. I like your approach of having a drop down of filter tags, which eliminates 1, but will be harder to scale like in a Graph of everything.
marvinkennis•57m ago
Looks like an amazing product. Been playing around with it for a few mins. The UI is quite buggy and jumps around a lot (Chrome, MacOS), and seems to auto-refresh on the organizations page, which makes curating lists impossible. What's a good way to keep providing feedback?
antgoldbloom•48m ago
Can email me at a@sumble.com. Great if you can record a loom.
ghc•41m ago
The page is also refreshing constantly for me. Chrome & Safari :(.
liorsh•57m ago
Super useful and intuitive product. Love the granularity of the tech stack keywords; it does find relevant leads/companies that you couldn't find otherwise.

An API could be helpful for enrichment of internal sources. MCP would definitely make sense as well.

benhamner•52m ago
Thanks! We're planning to add an API in the next two months, and exploring MCP alongside that
chrisweekly•55m ago
I'm sure tools like this are useful to salespeople and recruiters, but it also seems like a dream resource for spammers, scammers and esp. phishing attacks.
johnsillings•47m ago
i think this is true of most powerful sales tools
ohadpr•55m ago
Love this product and the overall strategy.
bittermandel•44m ago
I just tried this. HOLY CRAP it's good. How did you achieve this? I'm very impressed.

Also: Please don't evolve the UI. It's perfect as it is.

ghc•43m ago
Is there no way to add custom searches? As a test, I wanted to look for flight test engineers in aerospace companies, but the only way I could see to approximate it was to look at job postings. I was able to drill down by picking a company (Boeing) until I found one, but that's really tedious compared to just adding a custom job function ("flight test engineer") or selecting "Test Engineer" and adding a custom industry ("Aerospace").
benhamner•36m ago
Thanks for the feedback!

The job functions we currently classify have mostly been driven by our early users/customers (companies building products/tools/infrastructure for data and software engineering teams), along with handling the multilingual aspects of those functions well across countries.

We're aiming to extend this in two ways:

1. Adding job title and job description full-text search, to handle the long tail of usecases (in-flight project)

2. Extending the job function classification to the full universe of jobs that people can have

ghc•26m ago
Job title and job description full text search would really be perfect. At least in my mind, good GTMs are narrow (software engineers in flight test) vs. broad, like selling to all software engineers using python in the manufacturing industry.
constantinum•40m ago
How does this tool differ from Apollo.io, Clay.com, and Promptloop?

Tools like Clay and Apollo are often misused for spammy cold outreach—which rarely works. The real value lies in enriching leads who’ve already shown interest, helping align marketing efforts with the right prospects. Beyond that, more data doesn’t always improve GTM decisions.

I'd love to hear (and learn) how others would want to use this tool specifically for GTM.

benhamner•17m ago
Here are three main ways our users and customers use us:

1. Revenue Operations teams

Integrate Sumble's data programmatically to help with account scoring, territory planning, account qualification/disqualification, and CRM data cleanup. We provide feature matrices that feed ML models for large sales teams.

2. Individual AEs/SDRs

Many sales people have a small universe of named accounts that they go deep on. They use Sumble to understand the buying groups that exist within their target accounts, and which relevant technologies these groups use, and any relevant projects going on (e.g. data infrastructure migrations, cloud migrations, and GenAI projects can be critical signals for many of our customers)

For ongoing awareness of key changes within accounts, we work with our enterprise customers to define all the signals that are relevant to their sales plays, and send email/slack notifications when any of these signals happens in their accounts as well.

For sales reps with a larger universe of accounts (e.g. the SMB/commercial tier), they use us to filter out a lot of the noise in their territory and understand which accounts are real active businesses that are potential users of their product that they should spend time on.

3. Marketing

Marketers use us to figure out which accounts to focus on, and to spin up very targeted LinkedIn/Facebook/etc. campaigns to reach their most likely potential users and buyers

dflock•26m ago
Please ingest the biotech industry!
antgoldbloom•17m ago
What query are you trying to run? Can email me at a@sumble.com and I can see if we can support your query.
