- Census last raised a $60M Series B at a $630M valuation (upper bound)
- Census’s estimated annual revenue is $31.6 million with ~200 employees.
- Median private-SaaS EV/ARR multiple is 7× (7 × ~$31M ≈ $217M, the lower bound)
- Hightouch raises $80M on a $1.2B valuation (at ~60× ARR)
- Twilio completes $3.2B acquisition of Segment at ~21× ARR (upper multiple bound)
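The bounds above can be sanity-checked with a few lines of Python. This is only a rough sketch: it assumes the revenue estimate is a fair stand-in for ARR, and the variable and function names are made up for illustration. All figures are the ones quoted in the list, in millions of USD.

```python
# Figures quoted in the list above, in USD millions.
# Names are illustrative; treating estimated revenue as ARR is an assumption.
ARR_ESTIMATE = 31.6      # Census's estimated annual revenue
ARR_ROUNDED = 31         # rounded-down figure used in the lower-bound arithmetic
MEDIAN_MULTIPLE = 7      # median private-SaaS EV/ARR multiple
SEGMENT_MULTIPLE = 21    # approximate multiple from the Twilio/Segment deal
LAST_VALUATION = 630     # valuation at the $60M Series B


def implied_value(arr_millions: float, multiple: float) -> float:
    """Enterprise value implied by applying an EV/ARR multiple to ARR."""
    return arr_millions * multiple


# Lower bound exactly as computed in the list (7 x 31).
lower_bound = implied_value(ARR_ROUNDED, MEDIAN_MULTIPLE)

# Same multiple applied to the unrounded revenue estimate.
lower_unrounded = implied_value(ARR_ESTIMATE, MEDIAN_MULTIPLE)

# What the Segment-style multiple would imply for the same revenue.
segment_implied = implied_value(ARR_ESTIMATE, SEGMENT_MULTIPLE)

# Multiple implied by the last known valuation.
series_b_multiple = LAST_VALUATION / ARR_ESTIMATE

print(f"Lower bound (7x on $31M):      ${lower_bound:.0f}M")
print(f"Lower bound (7x on $31.6M):    ${lower_unrounded:.1f}M")
print(f"Segment-style 21x on $31.6M:   ${segment_implied:.1f}M")
print(f"Multiple implied by $630M:     {series_b_multiple:.1f}x")
```

On these numbers the $630M Series B valuation works out to just under 20× the revenue estimate, close to the Segment multiple, which is consistent with treating both as the upper end of the range.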
If you want a data platform that's built to work as one cohesive unit, we've got you: https://www.definite.app/
Definite has a data lake, ETL, and BI in one app.
But I am not going to pay $1000/month as a bootstrapped startup. What open source alternatives exist that can be run on basic hardware?
It's like logging. Yeah, there are Sentry, Papertrail, Splunk, Datadog and the like. But something better than grepping system logs is nice, and it's totally reasonable for a startup to stand up Kibana/Elastic on a tiny instance. That can provide significantly higher value.
There is a middle ground between stone tools and jet aircraft. I was asking: what are the middle-ground tools in this space?
One of their pitfalls is charging by the row. If you're cost-conscious, you really need to watch what data you're syncing and you need to pare it down quite a bit during the 2-week period they give you when setting up a new connector. If you do all that though, you can get a lot of mileage out of the free plan for some use cases.
As I said, I totally understand this market and why these companies are valuable. I respect the work they do. But while I am a tiny, tiny startup I don't want to lock in to anything and I know I can handle the amount of data myself with little effort if I have a basic open source alternative I can manage myself.
To be honest, I hadn't really given much thought about what event streaming I would use anyway. So I imagine using redpanda along with redpanda connect could be that layer (I was considering just using Redis streams or even PostgreSQL) and then there is just another redpanda connector for the db to add into that mix. If someone is starting from scratch that might be a good path. But I agree the MIT license of warpstream is a bit nicer if all you need is the connectors.
It's not like running Postgres which "just works". When you self-host Airbyte, you're still building a good bit.
I felt the same way about the cost of data tools. Paying $1,000 for Fivetran, $2,000 for Snowflake, $2,000 for Looker seemed crazy. We bundle all three for $500 / month at https://www.definite.app
We built an entire stack so the agent can operate across that whole stack (e.g. create pipelines, model data, build reports, etc.)
> There were many thousands of customers who paid less than $10 a month for storage, which is half a terabyte. Among customers who were using the service heavily, the median data storage size was much less than 100 GB.
I'm a fan of what MotherDuck is doing. We're building something different (an opinionated, instant data stack), but yes, we both use DuckDB under the hood.
My best bet for now would be dlt if you have a dedicated DE team, but sling will get you a long way for moving data around your warehouse.
_dark_matter_•18h ago
mritchie712•17h ago
I built a company[0], SeekWell, in this space (launched before Census), but was mostly focused on Sheets and Slack as destinations. SeekWell was acquired a few years ago too.
0 - https://seekwell.io/
skadamat•17h ago
Once you have customers and a good network of integrations with a large number of tools, I suspect it's easier to just buy that company than build it all yourself?
throwaway7783•17h ago
georgewfraser•16h ago