Wouldn’t the current strategy result in some serious stock dilution for the early investors?
Plus the markets are in a weird state right now.
It's a smaller piece of a bigger pie.
To answer your question, the right question to ask is why go public when you can remain private? Public means more paperwork, more legalese, more scrutiny, and less control for the founder, and all of that only to get a bit more liquidity for your stock. If you can remain private, there really isn't much of a reason to not do that.
With the exception of founders, going public is better for literally everybody else: more scrutiny, more pressure on the company, more liquidity, etc.
My guess is that they might be about to embark on a shopping spree and acquire some more VC-backed companies. They've actually bought quite a few companies already in the past few years. And they would need cash to buy more. The company itself seems healthy and generating revenue. So, it shouldn't strictly need a lot of extra capital. Acquisitions would be the exception. You can either do that via share swaps or cash. And of course cash would mostly go to the VCs backing the acquired companies. Which is an interesting way to liquidate investments. I would not be surprised to learn that there's a large overlap between the groups of VCs backing those companies and those backing Databricks. $100M on top of $10B sounds like somebody wants in on that action.
As a financial construction it's a bit shady of course. VCs are using money from big institutional investors to artificially inflate one of their companies so that it can create exits for some of their other investments via acquisitions financed with more investment. It creates a steady stream of "successes". But it sounds a bit like a pyramid scheme. At some point the big company will have to deliver some value. I assume the hope is some gigantic IPO here to offload the whole construction onto the stock market.
The company itself seems healthy and generating revenue
I'd want to see profit before calling a company healthy.
It's a lot easier to stay long-term focused without investors breathing down your neck. As a private company you're not dealing with shortsellers, retail memers, institutional capital that wants good earnings now, etc..
Of course, the bad side is that if the company gets mismanaged, there's far less accountability and thus it could continue until it's too late. In the public markets it's far easier to oust the C-suite if things go south.
It's a shame that the trend of staying private longer means retail gets shut out from companies like this.
AI is not far away from dropping into the "trough of disillusionment", and I can't see why Databricks even needs Postgres.
Hopefully I’m wrong as I’m a big fan of databricks.
Databricks is great at offering a "distributed Spark/Kubernetes in a box" platform. But its AI integration is one of the least helpful I've experienced. It's very intrusive to a workflow, and very rarely offers genuinely useful help. Most users I've seen turn it off, something Databricks must be aware of because they require admin permission for users to opt out of the AI.
I don't mean to rant, there's lots that is useful in databricks, but it doesn't seem like this funding round is targeting any of that.
The fallout might look like the dotcom bubble when this thing bursts.
This is a very worrying trend of having AI enabled by default that you cannot turn off unless you're the admin.
Simple as that, it's consulting heaven. Much like SAS and SAP. Everybody happy. Now, to be fair to Databricks: if used properly (and you ignore the cost), it does actually function pretty well. Compared to Synapse, PowerBI Tabular, Fabric, Azure ML, ... that's already a big big big step forward.
If you're buying from microsoft, it won't be cheap either way, might as well treat yourself a little bit.
I've never seen such an investment round. Aren't you supposed to stop at C or D? Or at least at some point?
Not quite right? Because the raise-implied valuation doesn't account for preferences. The IPO could be for 50bn and the latest investors could still do well, given the preference stack puts later rounds' money out first.
Just to clarify - for many years employees were getting RSUs, not options, just with an expiration date attached, which is gone since this year.
It's easy to look on knowing lots about data tools and say "this could be better done with open source tools for a fraction of the cost", but if you're not a big tech company, hiring a team to manage your data platform for 5 analysts is probably a lot more expensive than just buying databricks.
We have a large postgres server running on a dedicated server that handles millions of users, billions of record updates and inserts per day, and when I want to run an analysis I just open up psql. I wrote some dashboards and alerting in python that took a few hours to spin up. If we ever ran into load issues, we'd just set up some basic replication. It's all very simple and can easily scale further.
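A minimal sketch of the kind of homegrown alerting described above. The table name, query, and threshold are hypothetical illustrations, not the commenter's actual setup:

```python
# Hedged sketch of lightweight, roll-your-own alerting on top of Postgres.
# The 50% threshold and the `events` table are assumed for illustration.
ALERT_THRESHOLD = 0.5  # alert if volume drops below 50% of the trailing average

def should_alert(todays_count: int, trailing_counts: list) -> bool:
    """Flag when today's event volume falls well below the recent average."""
    if not trailing_counts:
        return False
    avg = sum(trailing_counts) / len(trailing_counts)
    return todays_count < ALERT_THRESHOLD * avg

# In production this would be fed by a query such as:
#   SELECT count(*) FROM events WHERE created_at >= now() - interval '1 day';
# run on a cron schedule via psycopg2, with alerts posted to Slack or email.
```

A few hours of this kind of glue code covers a surprising share of what a managed platform sells as "observability".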
Imagine you're a big company with loads of teams/departments running multiple different types of SQL servers for data reporting, plus some Parquet data lakes, and hey, just for fun, why not a bunch of CSVs.
Getting data from all these locations becomes a full time job, so at some point someone wants some tool/ui that lets data analysts log into a single thing, and get the experience that you currently have with one postgres server.
I think it's not a problem of scale in the CS sense, more the business sense where big organisations become complex and disorganised and need abstractions on top to make them workable.
My team of 3 data scientists is able to support a culture of experimentation and data-informed decision making across the entire org.
And we do all that on ~$30k annual spend on Databricks. That's less than 1/5 the cost of 1 software engineer. Excellent value for money if you ask me.
I really struggle to imagine being able to do that any cheaper. How else could we engineer a data hub for all of our data and manage appropriate access & permissions, run complex calculations in seconds (yes, we have replaced overnight complex calculations done by engineering teams), and join data from so many disparate sources, at a total cost (tool + labor) <80k/yr? I double dare you to suggest or find me a cheaper option for our use case.
It's a way to get those pesky Python people to shut up
Oh, and a CTO is always valued more if he manages a 5 million Databricks budget, where he can prove his worth by showing a 5% discount he negotiated very well, than a 1 million whatever-else budget that would be best in class. Everybody wins.
The CTO of a "traditional" company who is responsible for "implementing digital transition".
My working theory is that the UI, a low-grade web-based SQL editor and catalog browser, is more integrated than the hodgepodge of tools that we were using before, and people may gain something from that. I've seen similar with in-house tools that collect ad-hoc/reporting/ETL into one app, and one should never underestimate the value that people place on the UI.
But we give up price-performance; the only way it can work is if we shrink the workload. So it's a cleanup of stale pipelines combined with a migration. Chaos in other words.
Aside from that, I do get the feeling that most small and medium-sized companies have been oversold on it - they don't really have enough data to leverage a lot of the features, and they often don't have the skill to avoid shooting themselves in the foot. It's possible for a reporting analyst who is upskilling to learn enough programming to not create a tangled web of christmas lights, but not probable in most situations. There seems to be a whole cottage industry of consultancies now that purport to get you up and running, with limited actual success.
At least it's an incentive for companies to get their data in order and standardise on one place and a set of processes.
In terms of actual development, the notebook IDE feels like a big old turd to use tho, and it feels slow in general if you're at all used to local dev. People do kinda like these web-based tools tho. Can't trust people all the time! There are VS Code and PyCharm extensions, but my team works mainly with notebooks at the moment, for good or ill, and the experience there is absolutely flaky dogshit.
I think it's possible to make some good stuff with it and it's paying my bills at the moment, but I think a lot of the adoption may be doomed to failure lol
Not being sustainable after all this time and billions of dollars is a sign the company is just burning money, and a lot of it. WeWork vibes.
i.e. when we exclude a bunch of pesky costs and other expenses that are the reason we’re not doing so well, we’re actually doing really well!
Non-GAAP has its place, but if used to say the company is doing well (vs like actual accounting) that’s usually not a good sign. Real healthy companies don’t need to hide behind non-GAAP.
Really what they don't tell you is how much SBC they have. That's what crushes public tech stocks so much. They'll have nice FCF, but when you look under the hood you realize they're diluting you by 5% every year. Take a look at MongoDB (picked one randomly). It went public in 2016 with 48.9m shares outstanding. Today, it has 81.7m shares outstanding. A 67% increase in share count in 9 years.
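The dilution arithmetic can be checked directly from the share counts quoted above:

```python
# MongoDB share-count arithmetic, using the figures from the comment.
ipo_shares = 48.9e6      # shares outstanding at 2016 IPO
current_shares = 81.7e6  # shares outstanding today

dilution = current_shares / ipo_shares - 1         # total increase: ~67%
annual = (current_shares / ipo_shares) ** (1 / 9) - 1  # compounded over 9 years

print(f"total increase in share count: {dilution:.0%}")
print(f"implied annual dilution: {annual:.1%}")  # ~5.9%/yr, matching the ~5% claim
```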
if you were to apply the same ratio to Databricks it would have to trade at 42 000 000 000 000 000 USD - enough to buy the entire US sovereign debt, the moon, all earth's minerals with plenty to spare. A completely rational market if you ask me.
OpenAI is still early, burning VC money to acquire customers by operating at a loss. This makes it appear cheap.
DataBricks is further along, attempting to claw back the value they provided to customers by raising prices.
That costs a fair bit of dosh.
However, I'm genuinely curious about the thesis applied by the VCs/funds that invest in such a late-stage round. Is it simply that they are taking a chance that they won't be the last person holding the potato? Like they will get out in Series L or M rounds, or the company may IPO by then? Either way they will make a small return? Or is the calculus different?
It's the less financially/legally savvy parties, like angel investors and early employees, who (sometimes) get screwed out of valuation.
But the pref stack always favors later investors, partly because that's just the way it's always been, and if you try to change that now no one will take your money, and later investors will not want to invest in a company unless they get the senior liquidity pref.
Why they do it via an equity offering and not debt is unclear. You'd imagine the latter is cheaper for a hectocorn.
1) It's evaluated as any other deal. If you model out a good return quantitatively/qualitatively, then you do the deal. Doesn't really matter how far along it is.
2) Large private funds have far fewer opportunities to deploy because of the scale. If you have a $10B fund, you'd need to fund 2,000 seed companies (at a generous $5m on $25m cap). Obviously that's not scalable and too diversified. With this Databricks round, you can invest a few billion in one go, which solves both problems.
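The deployment math in point 2 works out as stated:

```python
# Scale problem for a large fund: deploying via seed checks vs one big round.
fund = 10e9       # a $10B fund
seed_check = 5e6  # $5m on a $25m cap, per the comment

deals = int(fund / seed_check)
print(deals)  # 2000 seed deals needed to deploy the whole fund
```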
I can’t know if it’s completely true ofc, but that’s what employees are told.
/s
What's a good roll your own solution? DB storage doesn't need to be dynamic like with DynamoDB. At max 1TB - maybe double in the future.
Could this be done on a mid size VPS (32GB RAM) hosting Apache Spark etc - or better to have a couple?
P.S. total beginner in this space, hence the (naive) question.
Computationally speaking - again depends on what your company does - Collect a lot of data? You need a lot of storage.
Train ML Models, you will need GPUs - and you need to think about how to utilise those GPUs.
Or...you could pay databricks, log in and start working.
I worked at a company that tried to roll their own, and they wasted about a year on it, and it was flaky as hell and fell apart. Self-hosting makes sense if you have the people to manage it, but the vast majority of medium-sized companies will have engineers who think they can manage this, try it, fail, and move on to another company.
There are better storage solutions, better compute, and better AI/ML platforms, but once you start with Databricks, you dig yourself a hole, because replacing it is hard: it has such a specific subset of features across multiple domains.
In our multinational environment, we have a few companies that are on different tech stacks (result of M&A). I can say Snowflake can do a lot of the things Databricks does, but not everything. Teradata is also great and somehow not gaining a lot of traction. But they are near impossible to get into as a startup, which does not attract new talent to give it a go.
On the ML side, Dataiku and Datarobot are great.
Tools like Talend, snaplogic, fivetran are also really good at replacing parts of databricks.
So you see, there are better alternatives for sure, cheaper at the same time too, but there is no drop-in replacement I can think of
Maybe I wasn't super clear. Wasn't looking for a 1:1 replacement.
Trying to understand what other options are out there for small teams / projects that don't need all those enterprise features that Databricks offers (governance etc).
If you don't need these features, especially the distributed ones, going tall (a single high-capacity instance, replicated when necessary) or going simpler (multiple servers but without Spark coordinating the work) could be good options, depending on your/the team's knowledge.
Mega lmao. They already owe $20B.
Their revenue is good, though, further adding to the mystery.
Rust + Cloud Object Store/serverless/S3 + Postgres. Slap "AI agents" on top: keyword peak reached. So they will easily raise the 100bn.
Meanwhile, this is Lakebase/Neon: https://blog.opensecret.cloud/why-we-migrated-from-neon-to-p...
Due diligence? Taboo.
To be honest, I've completely lost the sense of scale with money in general. It all feels like Zimbabwe dollars to me. The news talks billions and trillions. Meanwhile, friends who used to be well-off (in the US/UK/EU) struggle with mortgage payments and/or bills. And the ones not laid off are expected to grind 10 hours per day to keep their jobs.
https://en.wikipedia.org/wiki/HyperNormalisation
https://www.theguardian.com/wellness/ng-interactive/2025/may...
While many comments were focused on the "K" letter, I wanted to remind us all that OpenAI stretched their Series E from Jan 23, 2023 to Nov 22, 2024 -- 23 months, squeezing in 6 rounds
source: https://tracxn.com/d/companies/openai/__kElhSG7uVGeFk1i71Co9...
Their product looks like basic wrappers for managing postgres instances and dashboards. Why would anyone with even minimal technical expertise pay for a generic service like that?
My team of 3 data scientists is able to support a culture of experimentation and data-informed decision making across the entire org, and we are still growing 15% YoY.
And we do all that on ~$30k annual spend on Databricks. That's less than 1/5 the cost of 1 software engineer. Excellent value for money if you ask me.
I struggle to imagine how else we could engineer a hub for all of our data and manage permissions appropriately at lower tooling and engineering cost.
At my company we just use a large self-managed postgres server and I access it directly.
1. Liquidity: Early investors can sell to late-stage investors, since there's no IPO. Their previous round looked like that.
2. Markup: The previous investors can increase their valuation by doing a round again. It also provides a paper valuation for acquiring new companies. That combined with preferred stock (always get 1x back) might be appealing and make some investors more generous on valuation.
In a kind of a ... ponzi pyramid?
Databricks is a fast-growing company with ~$4B in annualised revenue and huge potential.
Many rounds got some portion of the round for liquidity. Similarly, markup strategies are common and valid. For existing investors, it works because they have already done research on the company and believe in it, so they put their money where their mouth is. For the company, it may speed up their fundraising process.
Though those strategies carry some risks.
So it is a Ponzi scheme. We are just discussing the size.
Let’s say that Databricks has 100B valuation (just for the sake of simplicity).
They do this round, and due to this markup they can do acquisitions via stock swaps. For instance, let's say that you're Neon, and you as a founder want some sort of exit.
It's preferable to get acquired for $1B with, let's say, $100M in cash and $900M in Databricks shares at paper valuation, than to wait through a long process for an IPO.
If the mothership company (Databricks) goes public, you get liquidity and a good payday, or in the meantime you can sell secondaries at a discount.
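A rough worked version of the hypothetical deal above. The 40% markdown is an assumed figure, added purely to illustrate the risk of taking paper-valuation stock:

```python
# Hypothetical $1B acquisition paid mostly in acquirer stock (per the comment).
deal_value = 1_000_000_000
cash = 100_000_000
stock = deal_value - cash  # $900M in shares at paper valuation

cash_fraction = cash / deal_value    # 10% cash
stock_fraction = stock / deal_value  # 90% stock

# Assumed: if the acquirer's paper valuation is marked down 40% before any
# liquidity event, the realized value of the deal shrinks accordingly.
markdown = 0.40
realized = cash + stock * (1 - markdown)
print(f"realized value after markdown: ${realized / 1e6:.0f}M")  # $640M
```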
I don't know where you got that idea. Investors are putting their money into this company because they like the results and believe it's a better investment than their alternatives.
Any time you sell shares you generate some signal about what a company is worth. You can claim the company is worth a $100B all day long, but until you can sell a significant number of fractional shares of the company at that valuation it's just talk.
> In a kind of a ... ponzi pyramid?
A ponzi scheme or pyramid scheme implies that the company is lying about their results and books. Classic ponzi schemes might not have any real assets at all. The operators lie about the company and rely on incoming cash from new investors to pay out claims from past investors.
There's no ponzi here unless you believe Databricks is completely falsifying their operations and results. If any of those investors took their shares to the secondary market there would be plenty of other investors interested in buying them because they represent shares in the real company.
Also, maybe I just want to talk about it, but whenever I hear about ponzi pyramids, I think about cryptocoins like Bitcoin, and then remember the people paying $2 to buy $1 worth of BTC in American institutional markets.
My rant about crypto is unwarranted but I want to still share it. Stablecoins are really really cool but any native coins/tokens are literally ponzi pyramids / scams.
Turns out there are different flavors too. "Non-participating" means preferred gets their original investment back, then common stock splits whatever's left. "Participating" means preferred gets their money back AND also gets to participate in splitting the leftovers with common shareholders. No wonder investors are willing to pay up for these late-stage rounds when they've got that safety net.
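A toy liquidation waterfall, with hypothetical numbers ($100M invested for 20% preferred, $300M exit), showing how the two flavors described above differ:

```python
# Simplified 1x liquidation preference waterfall, matching the comment's
# definitions: non-participating = money back first, common splits the rest;
# participating = money back AND a pro-rata share of the rest.
def payout(exit_value, invested, pref_pct, participating):
    """Return (preferred_proceeds, common_proceeds) under a 1x preference."""
    pref = min(invested, exit_value)  # 1x money back comes off the top
    remainder = exit_value - pref
    if participating:
        pref += remainder * pref_pct          # double dip: pro-rata share too
        common = remainder * (1 - pref_pct)
    else:
        common = remainder                    # common splits everything left
    return pref, common

# $300M exit, $100M invested for 20% preferred:
print(payout(300e6, 100e6, 0.20, participating=False))  # preferred ~$100M, common ~$200M
print(payout(300e6, 100e6, 0.20, participating=True))   # preferred ~$140M, common ~$160M
```

The $40M gap between the two flavors on the same exit is exactly the "safety net" late-stage investors are paying up for.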
What's so hard about this? I don't get it.
Also announcing the signed term sheet but not the close so this is a PR push to find more investors?
But then they did maintenance and broke the entire feature. Reconfiguring everything from scratch didn't work. A key part where a Docker image is selected was replaced with a hard-coded value including a long system path (and employee name -- verified via LinkedIn).
Because of constant turnover in account reps we couldn't get any help there. General support was of no use. We finally got acknowledgement of the issue when we got yet another new account rep, but all they did was push us towards paid support.
We exhaustively investigated the issue and it was clearly the case that nothing could be done on our end to resolve it. The entire underlying compute layer was busted.
Eventually they released a newer version of the feature which did work again, but at this point it has become impossible to justify the cost of the platform and we're 100% off.
Good luck to them, but from my experience the business fundamentals are misaligned and it's not a company I hope to ever work with again.
Honestly, as a Data engineer on the DWH side, I figured that my career is going to come to an end in a few years. AI + Cloud managed DWH are going to make all technical issues trivial, and I'm not someone who is interested in business context. Not sure where to move though.
> "we want to store/retrieve thin event logs and clickstreams"
to
> "we need to store/retrieve/join thick prose from customer interactions/reviews at every layer of the stack to give our LLMs the right context"
would create a significant need for data engineering for bespoke/enterprise/retail-monster use cases. (And data analysis too, until LLMs get better at tabular data.)
Are you seeing that this transformation need is actually being sufficiently covered by cloud providers, on the ground?
Or that people aren't seeing the problem this way, and are just doing prompt engineering with minimal RAG on static help-center datasets? That seems suboptimal, to say the least.
They told a good story and had a good sales team, but the writing is on the wall for them.