Ask HN: Has anyone deployed LLMs to production?

14•saaspirant•6mo ago

I have been trying to tune Gemini Flash to do some classification for me and it's not performing well at all. I changed the prompts many times and it still didn't seem to "learn" anything from the training set. The classification embarrassingly lacks common sense.

Has anyone used AI for anything useful? Apart from programming of course.

Comments

muzani•6mo ago

They're great at first-level customer service. Lots of questions are repetitive and they get through them better than humans. It was the biggest boost to our customer satisfaction rating.

On the other end, I actually canceled a $100/month subscription once through email (it was on a company email that I no longer had access to). Gave evidence. It canceled the subscription within 20 mins.

Also, Gemini Flash is unreliable. The best cost efficiency today seems to be gpt-4.1. The cheaper models seem to be okay mostly for summarization. Gemini Flash was much better a year ago: still unreliable, but at least it followed instructions.

mooreds•6mo ago

We use it heavily for doc search. We bought Kapa.ai a few years ago and leverage their solution, not an in-house build.

byoung2•6mo ago

I was having trouble getting GPT-4o to extract data like address, email, phone, and tracking number from random emails in an inbox. Sometimes it would do it perfectly and other times it would fail miserably on a similar email. Then I tried asking it to first mark up the email with schema.org metadata. Then I asked it to extract the data from the schema.org markup. That worked nearly every time.

Maybe there is an extra step you can work into your prompt that would help it get to the proper classification.
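
A rough sketch of that two-step idea in Python, in case it helps. The prompt wording, the `extract_fields` helper, and the `call_llm` callable are all hypothetical placeholders and not the exact setup described above; `call_llm` stands in for whatever client you use and is assumed to take a prompt string and return the model's text.

```python
import json
from typing import Callable, Optional

# Hypothetical prompt templates; tune the wording for your own model.
MARKUP_PROMPT = (
    "Annotate the email below with schema.org metadata as JSON-LD. "
    "Return only the markup.\n\n{email}"
)
EXTRACT_PROMPT = (
    "From the schema.org markup below, extract the fields {fields} "
    "as a single JSON object. Use null for anything missing.\n\n{markup}"
)


def extract_fields(
    email: str,
    fields: list[str],
    call_llm: Callable[[str], str],
) -> dict[str, Optional[str]]:
    """Two-pass extraction: mark up first, then extract from the markup."""
    # Pass 1: have the model restate the email as structured markup.
    markup = call_llm(MARKUP_PROMPT.format(email=email))
    # Pass 2: extract only from the markup, not from the raw email.
    raw = call_llm(
        EXTRACT_PROMPT.format(fields=", ".join(fields), markup=markup)
    )
    data = json.loads(raw)  # assumes the model returned valid JSON
    # Keep only the requested fields; missing ones become None.
    return {name: data.get(name) for name in fields}
```

For classification, the same shape might apply: make the first call produce an intermediate description of the item, and have the second call pick the label from that description only. In practice you would also want to handle the case where the second response is not valid JSON.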

nkristoffersen•6mo ago

We are using over 50 billion LLM tokens per month for NLP/classification purposes, across a mix of self-hosted and cloud-hosted models. But I have not attempted any fine-tuning. Just prompt and (perhaps more importantly) context “engineering”.

incomingpain•6mo ago

I have Microsoft's Phi4 deployed on https://mapleintel.ca for the AI side. Currently over 44,000 IPs in that list.

I tried 'reasoning plus' but it was so much slower.

boredemployee•6mo ago

Did you ever try fine-tuning with OpenAI? It works well for me.
