Ask HN: Using AI/LLM APIs makes me want to give up. What am I doing wrong?

8•moomoo11•4h ago

I'm trying to automate a few manual processes we have right now, but I still can't get over this hump. What am I doing wrong?

I am using these AI APIs for actual processing-type work, and I am left defeated and somewhat angry, if I'm being honest. These AI companies sell us some galaxy-brain vision of automation, but actually using their services is a disappointing experience.

1. The results are never consistent. "Please ensure you extract ALL items" -> [Item1, Item2, Item3, "literally a comment // ...remaining items"] WHAT THE F$#K!! Sometimes it gives me a full list of all items, and sometimes it does that BS. I provided a tool, and half of the time it just grabs the first 3 and maybe it will grab the very last one too (ignoring everything in the middle).

2. Because the results are not reliable, I have to do more post-processing. About 60% of the time, even after post-processing, I have to reject the results because they don't meet my confidence threshold.

3. The APIs are poorly supported by the vendors.

- iOS has some insane behavior where file extensions are sometimes .jpg and sometimes .JPG, etc. OpenAI's API, for example, will return Bad Request because the extension was not ".jpg", so now I have to add more code to rename the file whenever a user uploads one.

- The docs say the API supports a list of file formats, but then it rejects the request because the file was not a .PDF, even though the purpose was "assistants" (which the docs say can handle images). No problem, I'll just convert...

- Files coming from other sources (G Drive, etc.) sometimes have no extension even though the MIME type is present. Again, Bad Request.

4. We went from "AGI any day now" in 2024, to "_A_rtificial _S_uper _I_ntelligence any day now" today. Can we just relax? Did I fall for a marketing trap?
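For the extension problems in point 3, the renaming workaround can be sketched roughly as below. This is only a minimal sketch under assumptions: `normalize_upload_name` and the `MIME_TO_EXT` table are hypothetical names made up for illustration, and the mapping should be replaced with whatever formats your vendor's docs actually list.

```python
import os

# Hypothetical mapping, not taken from any vendor's docs; extend as needed.
MIME_TO_EXT = {
    "image/jpeg": ".jpg",
    "image/png": ".png",
    "application/pdf": ".pdf",
}


def normalize_upload_name(filename, mime_type=None):
    """Return filename with a lowercase extension.

    If the name has no extension, fall back to the MIME type.
    Raises ValueError when neither gives a usable extension.
    """
    stem, ext = os.path.splitext(filename)
    ext = ext.lower()  # ".JPG" -> ".jpg"
    if not ext and mime_type:
        # Extension missing (e.g. files pulled from a drive export).
        ext = MIME_TO_EXT.get(mime_type.lower(), "")
    if not ext:
        raise ValueError(f"cannot determine extension for {filename!r}")
    return stem + ext
```

Calling this once at the upload boundary keeps the renaming logic in one place instead of scattering it across every API call.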

I think LLMs are great for applications like Cursor, or for customer support, where the model doesn't need to give "perfect" responses because a human operator will prompt it further. How many times have you had to deal with stupid output from Cursor? (I'm a power user, I deal with this daily.) RAG is a cool application, and there's no real need for correctness or exactness there, IMO. I've fed it hundreds of my notes, which I reference sometimes. I get different answers each time, but I don't need them to be perfect.

:q!

Comments

bob1029•4h ago

I don't think you're doing anything wrong. I think we are trying to apply this technology where it doesn't really belong.

The whole purpose of something like a parser is to reliably capture an AST representation of the thing in question. Asking a statistical model to do the same thing seems insane in principle. You've now turned a deterministic outcome into something that will definitely go wrong with some probability.

The fact that LLMs make for terrible parsers is catastrophic for things like function calling. Every successful agentic demo I've seen has had so many guard rails in terms of error feedback and retry that you wonder where the value-add actually resides. As you say, "I'll just convert..."
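To make the "error feedback and retry" guard rails concrete, here is a minimal sketch of what that wrapper tends to look like for the truncated-list case from the original post. Everything here is an assumption for illustration: `call_model` is a hypothetical callable that wraps whatever API you use and returns a list of items, `expected_count` only helps if you can count the items some deterministic way, and the placeholder patterns are guesses that will need tuning.

```python
import re

# Illustrative patterns for "the model gave up" markers; tune for your data.
PLACEHOLDER = re.compile(r"(\.\.\.|…|remaining items)", re.IGNORECASE)


def looks_truncated(items, expected_count=None):
    """Heuristic check: placeholder text in any item, or too few items."""
    if any(PLACEHOLDER.search(str(item)) for item in items):
        return True
    return expected_count is not None and len(items) < expected_count


def extract_with_retry(call_model, expected_count=None, max_attempts=3):
    """Call the model, validate the list, and retry with feedback.

    call_model(feedback) is a hypothetical function: feedback is None on the
    first attempt and an error message string on later attempts.
    """
    feedback = None
    for _ in range(max_attempts):
        items = call_model(feedback)
        if not looks_truncated(items, expected_count):
            return items
        feedback = (
            f"The previous answer had {len(items)} items and looked "
            "truncated. Return every item with no placeholders."
        )
    raise RuntimeError(f"still truncated after {max_attempts} attempts")
```

Note that the deterministic part (the validator) is doing the real work here; the model call is just the thing being supervised, which is rather the point above.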

rsynnott•2h ago

> Can we just relax? Did I fall for a marketing trap?

Yes.
