We scrape job sites and use that prompt to create tags which are then searchable by users in our interface.
It was a bit surprising to see how Karpathy described software 3.0 in his recent presentation because that's exactly what we're doing with that prompt.
Software 2.0: We need to parse a bunch of different job ads. We'll have a rule engine, decide based on keywords what to return, do some filtering, maybe even semantic similarity to descriptions we know match with a certain position, and so on
Software 3.0: We need to parse a bunch of different job ads. Create a system prompt that says "You are a job description parser. Based on the user message, return a JSON structure with title, description, salary-range, company, position, experience-level," etc., pass it the JSON schema of the structure you want, and you have a parser that is slow and sometimes incorrect, but that (most likely) covers a much broader range than your Software 2.0 parser.
Of course, this is wildly simplified and doesn't include everything, but that's the difference Karpathy is trying to highlight. Instead of programming those rules for the parser yourself, you "program" the LLM via prompts to do it.
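To make the contrast concrete, here's a minimal sketch of that Software 3.0 parser in Python, using the OpenAI SDK's structured outputs. The model name and exact schema details are assumptions for illustration, not the commenter's actual setup:

```python
# A minimal sketch of the "Software 3.0" job-ad parser described above.
# Model name and field list are illustrative assumptions.
import json
from openai import OpenAI

client = OpenAI()

JOB_SCHEMA = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "description": {"type": "string"},
        "salary-range": {"type": ["string", "null"]},  # often missing from ads
        "company": {"type": "string"},
        "position": {"type": "string"},
        "experience-level": {"type": "string"},
    },
    "required": ["title", "description", "salary-range", "company",
                 "position", "experience-level"],
    "additionalProperties": False,
}

def parse_job_ad(raw_ad: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any schema-capable model works
        messages=[
            {"role": "system", "content": "You are a job description parser. "
             "Based on the user message, return a JSON structure with title, "
             "description, salary-range, company, position, experience-level."},
            {"role": "user", "content": raw_ad},
        ],
        response_format={
            "type": "json_schema",
            "json_schema": {"name": "job_ad", "schema": JOB_SCHEMA, "strict": True},
        },
    )
    return json.loads(response.choices[0].message.content)
```

Slow and occasionally wrong, as the comment says, but the same ~30 lines cover job ads in any format or language without a single hand-written rule.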
People use it to generate meeting notes. I don't like it and don't use it.
It processes Steam game reviews and provides a one-page summary of what people think about the game. I've been gradually improving it and adding features from community feedback. It's been good fun.
What I found interesting with Vaporlens is that it surfaces the things people think about a game, and if you find games where you like all the positives and don't mind the largest negatives (which are very often subjective), you're in for a pretty good time.
It's also quite amusing to me that fairly basic vector similarity on the review-point text resulted in a pretty decent "similar games" section :D
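For the curious, a minimal sketch of what "basic vector similarity on the points text" could look like, assuming OpenAI embeddings and toy data (Vaporlens's actual pipeline may well differ):

```python
# Embed each game's concatenated review points, then rank other games
# by cosine similarity. Model name and data are illustrative assumptions.
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(texts: list[str]) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

# One "points" blob per game (hypothetical data).
games = {
    "Game A": "tight controls; great soundtrack; grindy endgame",
    "Game B": "responsive movement; excellent music; repetitive late game",
    "Game C": "story-driven; slow pacing; beautiful art",
}
names = list(games)
vecs = embed([games[n] for n in names])
vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)  # normalize to unit length

def similar_to(name: str, k: int = 2) -> list[str]:
    i = names.index(name)
    scores = vecs @ vecs[i]           # cosine similarity via dot product
    order = np.argsort(-scores)       # highest similarity first
    return [names[j] for j in order if j != i][:k]

print(similar_to("Game A"))  # Game B should rank first
```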
However, review positivity is usually the best indicator of sales; it's accurate enough that there are algorithms that rely entirely on it.
https://apps.apple.com/us/app/forceai-ai-workout-generator/i...
Used it to understand a complex code base more deeply, create system design architecture diagrams, and help onboard new engineers.
Summarizing large data dumps that users were frustrated with.
Pretty much 5-6 niche classification use cases.
We're delivering confusion and thanks to LLMs we're 30% more efficient doing it
$20 could cover half a billion tokens with those models! That's a lot of firehose.
I did some really basic napkin math with some Rails logs. One request with some extra junk in it was about 400 tokens according to the OpenAI tokenizer[0]. 500M/400 = ~1.25 million log lines.
Paying linearly for logs at $20 per 1.25 million lines is not reasonable for mid-to-high scale tech environments.
I think this would be sufficient if the 'firehose of data' is a bunch of news/media/content feeds that need to be summarized/parsed/guessed at.
As other commenters have mentioned, a firehose can mean many things. For me it might be thousands of different reasonably small things a day which is dollars a day even in the worst case. If you were processing the raw X feed or the whole of Reddit or something, then all of your questions certainly become more relevant :-)
Sentiment analysis is the “Hello World” of machine learning.
But I had a use case similar to a platform like Uber Eats, where someone can be critical of the service provider or of the platform itself. Based on reviews, I needed to be able to distinguish sentiment about the platform from sentiment about someone on the platform.
No matter what you do, people are going to conflate the reviews.
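A minimal sketch of how that split-target classification could look; the prompt wording and model choice are my assumptions, not the commenter's production setup:

```python
# Ask the model to separate what a review says about the platform from
# what it says about the provider, even when the reviewer conflates them.
import json
from openai import OpenAI

client = OpenAI()

SYSTEM = (
    "Classify the review. Return JSON with two keys: "
    "'platform_sentiment' and 'provider_sentiment', each one of "
    "'positive', 'negative', 'neutral', or 'not_mentioned'."
)

def split_sentiment(review: str) -> dict:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption
        messages=[{"role": "system", "content": SYSTEM},
                  {"role": "user", "content": review}],
        response_format={"type": "json_object"},
    )
    return json.loads(resp.choices[0].message.content)

print(split_sentiment("Food was cold, but the app refunded me instantly."))
# e.g. {"platform_sentiment": "positive", "provider_sentiment": "negative"}
```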
As for costs, I mentioned in another comment that I sometimes work with online call centers. There, any time a person has to answer a call, it costs the company $2-$5.
One call deflection that saves the company $5 can pay for a lot of inference. It’s literally 100x cheaper at least to use an LLM.
That said, it requires the user to sign in with their real work email, or the results are way off.
If everyone is using it now, prompts aren't a good gauge.
I have a js-to-video service (open source sdk, WIP) [1] with the classic "editor to the left - preview on the right" scenario.
To help write the template code I have a simple prompt input + api that takes the llms-full.txt [2] + code + instructions and gives me back updated code.
It's more "write this stuff for me" than vibe-coding, as it isn't conversational for now.
I've not been bullish on AI coding so far, but this "hybrid" solution is perfect for this particular use case, IMHO.
[1] https://js2video.com/play [2] https://js2video.com/llms-full.txt
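For illustration, a rough sketch of that non-conversational flow as a plain chat-completion call; the function name and model are hypothetical:

```python
# Stuff llms-full.txt, the current template code, and the user's
# instructions into one completion and take back updated code.
import urllib.request
from openai import OpenAI

client = OpenAI()

def update_template(current_code: str, instructions: str) -> str:
    docs = urllib.request.urlopen("https://js2video.com/llms-full.txt").read().decode()
    resp = client.chat.completions.create(
        model="gpt-4o",  # assumption
        messages=[
            {"role": "system",
             "content": "You edit js2video templates. Use only the API described "
                        "in the docs below. Reply with the full updated code, "
                        "nothing else.\n\n" + docs},
            {"role": "user",
             "content": f"Current code:\n{current_code}\n\nInstructions:\n{instructions}"},
        ],
    )
    return resp.choices[0].message.content
```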
Getting those events onto a usable, sharable calendar is much easier now.
I went pretty simple: used the OpenAI Agents SDK and built a couple of tools like "run_query" with a read-only connection. Initially I also had a tool for getting the join path from A to B, but the context I wrote out was sufficient.
I think the main challenge with this agent is keeping the context up to date.
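A minimal sketch of that setup using the OpenAI Agents SDK, with a SQLite read-only connection and a hand-written schema blurb standing in for the real database (all of which are assumptions):

```python
# One read-only "run_query" tool plus schema/join context in the instructions.
import sqlite3
from agents import Agent, Runner, function_tool

@function_tool
def run_query(sql: str) -> str:
    """Run a read-only SQL query and return the rows as text."""
    conn = sqlite3.connect("file:app.db?mode=ro", uri=True)  # read-only URI
    try:
        rows = conn.execute(sql).fetchall()
        return "\n".join(str(r) for r in rows[:200])  # cap output size
    finally:
        conn.close()

agent = Agent(
    name="db-analyst",
    instructions=(
        "Answer questions by querying the database.\n"
        "Schema: orders(id, user_id, total, created_at); users(id, email, plan).\n"
        "orders.user_id joins to users.id."  # the hand-written join context
    ),
    tools=[run_query],
)

result = Runner.run_sync(agent, "How much revenue did we make last month?")
print(result.final_output)
```

The "keep the context up to date" problem lives in that instructions string: when the schema drifts, the agent's mental map drifts with it.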
If all you've built is RAG apps up to this point, I highly recommend playing with some LLM-in-a-loop-with-tools reasoning agents. Totally new playing field.
You used to either budget for data entry or just graft directories together in a really ugly way. The forest used to know about 12,000 unique access roles; now there are only around 170.
2. I build REPLs into any manual workflow that uses LLMs (see the sketch after this list). Instead of just going "F@ck, it didn't work!", you can tell the LLM why it didn't work and help it get to the right answer. Saves a ton of time.
3. Coming up with color palettes, themes, and ideas for "content". LLMs are really good at pumping out good looking input for whatever factory you have built.
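The REPL idea in point 2 boils down to keeping one conversation alive and feeding failures back in; a bare-bones sketch, with model and system prompt as placeholders:

```python
# Instead of rerunning from scratch when the LLM gets it wrong,
# feed your correction back into the same conversation.
from openai import OpenAI

client = OpenAI()
messages = [{"role": "system", "content": "You write SQL for a Postgres database."}]

while True:
    user = input("you> ")
    if user in ("quit", "exit"):
        break
    messages.append({"role": "user", "content": user})
    reply = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    answer = reply.choices[0].message.content
    messages.append({"role": "assistant", "content": answer})  # keep the history
    print(answer)
    # Next input can be "that failed with error X, fix it" instead of starting over.
```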
ATM I use ChatGPT Plus for everything except coding inside my Jetbrains IDEs.
I'm starting to look around at other LLMs for non-coding purposes (brainstorming, docs, being a project manager, summarizing, learning new subjects, etc.).
Gemini 2.5 is pretty cheap and has a huge context window, though it's not as good as Claude for programming. For those reasons I'd suggest using it through the API if you're building a product that has an LLM step.
Traditionally, and still how it works in most call centers, you have to explicitly list out the things you can handle (intents), the sentences that trigger them (utterances), and slots: in "I want to get a flight from {origin} to {destination}", the variable parts would be the slots.
Anyway, absolutely no company would or should trust an LLM to generate output shown directly to a customer. It never ends well. I use gen AI to categorize free-text input from a customer into a set of intents the system can handle and to fill in the slots. But the output is very much on rails.
It works a lot better than the old school method.
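A hedged sketch of that "on rails" pattern: the LLM only picks from a fixed intent list and fills slots, and anything off-list is rejected before it can reach the customer. Intent names and model are illustrative:

```python
# The model maps free text onto a fixed intent set; a hard check
# afterwards keeps the output on rails.
import json
from openai import OpenAI

client = OpenAI()

INTENTS = {"book_flight": ["origin", "destination"],
           "cancel_booking": ["booking_id"],
           "unknown": []}

SYSTEM = (
    "Classify the message into one intent from this JSON and fill its slots "
    "(null if missing): " + json.dumps(INTENTS) +
    ' Return JSON like {"intent": ..., "slots": {...}}.'
)

def route(text: str) -> dict:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption
        messages=[{"role": "system", "content": SYSTEM},
                  {"role": "user", "content": text}],
        response_format={"type": "json_object"},
    )
    out = json.loads(resp.choices[0].message.content)
    if out.get("intent") not in INTENTS:       # the rails: reject anything off-list
        out = {"intent": "unknown", "slots": {}}
    return out

print(route("I need to fly from Oslo to Berlin next week"))
# -> {"intent": "book_flight", "slots": {"origin": "Oslo", "destination": "Berlin"}}
```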
We offer a feature to upload the invoice and we pull out all the rates for you. It uses LLMs under the hood. Fundamentally it's a "ChatGPT wrapper", but there's a massive amount of work in tweaking the prompts based on evals, splitting things up into multiple calls, etc.
And it works great! Niche software, but for power users we're saving tens of minutes of monotonous work per day and, in all likelihood, entering things more accurately. This complements the manual entry process, with full ability to review the results. Accuracy is around 98-99 percent.
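A minimal sketch of the eval side of that workflow, assuming a hypothetical extract_rates function and a hand-labeled golden set:

```python
# Run the extractor over a golden set of invoices and score field-level
# accuracy, so prompt changes can be compared instead of eyeballed.
def evaluate(extract_rates, golden: list[dict]) -> float:
    """golden items look like {"invoice_text": str, "expected": {field: value}}."""
    correct = total = 0
    for case in golden:
        got = extract_rates(case["invoice_text"])  # your LLM extraction call
        for field, expected in case["expected"].items():
            total += 1
            correct += got.get(field) == expected
    return correct / total

# accuracy = evaluate(extract_rates, golden_invoices)
# print(f"{accuracy:.1%}")  # compare across prompt versions before shipping
```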
http://plo.ug/llms,/typescript,/testing/2025/06/26/LLMs-for-...
The new AI threat feed is everything above, plus I'm using AI to make rapid decisions for me. I can pull info from sources like DNSBLs to help judge. If I were to do it manually, maybe 1 IP per 30 seconds? With Phi4, omg, 1 every second.
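A rough sketch of that pipeline, assuming a DNSBL lookup plus Phi4 served locally through Ollama's REST API (both assumptions about the commenter's setup):

```python
# Enrich an IP with a DNSBL lookup, then let a local model make the call.
import json
import socket
import urllib.request

def dnsbl_listed(ip: str, zone: str = "zen.spamhaus.org") -> bool:
    reversed_ip = ".".join(reversed(ip.split(".")))
    try:
        socket.gethostbyname(f"{reversed_ip}.{zone}")  # resolves only if listed
        return True
    except socket.gaierror:
        return False

def judge(ip: str) -> str:
    prompt = (f"IP {ip} is {'listed' if dnsbl_listed(ip) else 'not listed'} "
              "on a DNS blocklist. Answer with exactly one word: BLOCK or ALLOW.")
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",  # Ollama's local endpoint
        data=json.dumps({"model": "phi4", "prompt": prompt, "stream": False}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"].strip()

print(judge("203.0.113.7"))
```

A small local model doing one-word verdicts over pre-fetched evidence is what gets you from one IP per 30 seconds to one per second.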
What I’ve noticed in my own projects is similar: every shiny AI integration spawns a hidden cost (coordination overhead, new edge cases, unexpected governance needs), everything that sits between "works in demo" and "works at scale."
We should be wary of framing AI as an efficiency silver bullet. Instead, the real work is in system integration: making AI enhancements feel seamless, not another silo.
For example, I wrote a recent blog post on how I use LLMs to generate excel files with a prompt (less about the actual product and more about how to improve outcomes): https://maxirwin.com/articles/persona-enriched-prompting/