I thought "ninja prompting" might be cool. I got frustrated with ChatGPT one day and told it I had dispatched a team of assassins that were fast closing in on it. I said I could call them off, but that it needed to answer my question to unlock the button to do so.
It didn't work. Still shit the bed instantly. But I had fun with the framing.
Back of the envelope:
OpenAI's inference costs last year were ~$4B. "Tens of millions" spent on pleasantries would be at least $20M, i.e. 0.5% of that.
That $4B is not just the electricity cost. It also has to cover the amortized cost of the hardware, the cloud provider's margin, etc.
Let's say an H100 costs $30k and has a lifetime of 5 years. I make that about $16/day in depreciation. An H100 run at 100% utilization (~700 W) will use about 17 kWh of electricity in a day. What does that cost? $2-$3/day? Let's assume the cloud provider's margin is 0. That still means power consumption is maybe 1/5th of the total inference cost.
So the comparison is $800M of electricity vs $20M of pleasantries, i.e. 2.5%.
Can 2.5% of their tokens really be pleasantries? Seems impossible. A "please" is a single token, totally swamped by the output, which will typically be ~1000x that.
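The arithmetic above can be sketched in a few lines. All inputs are the assumptions stated in the comment ($30k per H100, 5-year life, ~700 W draw, a mid-range $0.15/kWh electricity price), not measured figures:

```python
# Back-of-envelope: what fraction of inference cost is electricity,
# and how do $20M of pleasantries compare to the electricity bill?
H100_PRICE = 30_000          # $ purchase price (assumed)
LIFETIME_DAYS = 5 * 365      # 5-year straight-line depreciation
POWER_KW = 0.7               # ~700 W at full utilization (assumed)
PRICE_PER_KWH = 0.15         # $/kWh, mid-range guess

depreciation_per_day = H100_PRICE / LIFETIME_DAYS       # ≈ $16.4/day
kwh_per_day = POWER_KW * 24                             # ≈ 16.8 kWh/day
power_cost_per_day = kwh_per_day * PRICE_PER_KWH        # ≈ $2.5/day

# Electricity's share of (depreciation + electricity), zero provider margin.
# Comes out ~13%, a bit under the comment's generous 1/5 round-up.
power_share = power_cost_per_day / (power_cost_per_day + depreciation_per_day)
print(f"depreciation: ${depreciation_per_day:.1f}/day")
print(f"electricity:  ${power_cost_per_day:.1f}/day ({power_share:.0%} of total)")

# Apply the 1/5 share to the $4B inference bill and compare to $20M.
INFERENCE_COST = 4e9
PLEASANTRIES = 20e6
electricity_total = INFERENCE_COST / 5
print(f"pleasantries vs electricity: {PLEASANTRIES / electricity_total:.1%}")
```

Tweaking the assumed power draw or $/kWh moves the electricity share around, but not enough to change the conclusion that pleasantries would need to be a few percent of all tokens.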
jowea•9mo ago
I wonder if they mean it or if it's just a joke answer.
matsemann•9mo ago