Ask HN: Using AI/LLM APIs makes me want to give up. What am I doing wrong?

8•moomoo11•4h ago
I'm trying to automate a few manual processes we have right now, but I still can't get over this hump. What am I doing wrong?

I am using these AI APIs for actual processing-type work, and I am left defeated and, if I'm being honest, somewhat angry. These AI companies sell us some galaxy-brain vision of automation, but actually using their services is a disappointing experience.

1. The results are never consistent. "Please ensure you extract ALL items" -> [Item1, Item2, Item3, "literally a comment // ...remaining items"] WHAT THE F$#K!! Sometimes it gives me a full list of all items, and sometimes it does that BS. I provided a tool, and half of the time it just grabs the first 3 and maybe it will grab the very last one too (ignoring everything in the middle).

2. Because the results are not reliable, I have to do more post-processing. About 60% of the time, even after post-processing, I have to reject the results because they don't meet my confidence threshold.

3. The APIs are poorly supported by the vendors.

- iOS has some insane behavior where file extensions are sometimes .jpg, sometimes .JPG, etc. OpenAI's API, for example, will return Bad Request because the extension was not ".jpg", so now I have to add more code to rename the file whenever a user uploads one.

- The docs will say the API supports a list of file formats, but then it rejects the request because the file was not a .PDF, even though the purpose was "assistants" (which the docs say can handle images). No problem, I'll just convert..

- Files coming from other sources (G Drive, etc.) sometimes have no extension even though the MIME type is present. Again, Bad Request.

4. We went from "AGI any day now" in 2024, to "_A_rtificial _S_uper _I_ntelligence any day now" today. Can we just relax? Did I fall for a marketing trap?
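For anyone hitting the same file-extension problems from point 3, here is a minimal sketch of the renaming workaround, not anything a vendor recommends. It lowercases the extension and, when the extension is missing, falls back to the MIME type. The function name `normalize_upload_name` and the `MIME_TO_EXT` table are made up for illustration; the set of formats a given API actually accepts is an assumption you would need to check against its docs.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical mapping for illustration; extend it to the formats your vendor accepts.
MIME_TO_EXT = {
    "image/jpeg": ".jpg",
    "image/png": ".png",
    "application/pdf": ".pdf",
}


def normalize_upload_name(filename: str, mime_type: Optional[str] = None) -> str:
    """Return the filename with a lowercase extension.

    If the filename has no extension, infer one from the MIME type.
    Raises ValueError when no extension can be determined.
    """
    path = PurePosixPath(filename)
    suffix = path.suffix.lower()
    if not suffix and mime_type:
        # Drop parameters such as "; charset=binary" before the lookup.
        key = mime_type.split(";")[0].strip().lower()
        suffix = MIME_TO_EXT.get(key, "")
    if not suffix:
        raise ValueError(f"cannot determine extension for {filename!r}")
    return path.stem + suffix
```

With this, `normalize_upload_name("IMG_0001.JPG")` gives `"IMG_0001.jpg"`, and `normalize_upload_name("scan", "image/jpeg")` gives `"scan.jpg"`. It only fixes the name; converting an image to PDF would still be a separate step.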

I think LLMs are great for applications like Cursor, or for customer support, where the model doesn't need to give "perfect" responses because a human operator will prompt it further. How many times have you had to deal with stupid output from Cursor? (I'm a power user, I deal with this daily.) RAG is a cool application, and there's no real need for correctness or exactness there, IMO. I've fed it hundreds of my notes, which I reference sometimes. I get different answers each time, but I don't need them to be perfect.

:q!

Comments

bob1029•4h ago
I don't think you're doing anything wrong. I think we are trying to apply this technology where it doesn't really belong.

The whole purpose of something like a parser is to reliably capture an AST representation of the thing in question. Asking a statistical model to do the same thing seems insane in principle. You've now turned a deterministic outcome into something that will definitely go wrong with some probability.

The fact that LLMs make for terrible parsers is catastrophic for things like function calling. Every successful agentic demo I've seen has had so many guard rails in terms of error feedback and retry that you wonder where the value-add actually resides. As you say, "I'll just convert..."
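To make the "error feedback and retry" guard rail concrete, here is a rough sketch under stated assumptions. `call_model` is a hypothetical stand-in for whatever API call returns a list of extracted items, and the `PLACEHOLDER` pattern is only a heuristic for the "...remaining items" kind of entry described in the post. Nothing here is tied to a specific vendor SDK.

```python
import re
from typing import Callable, List, Optional

# Heuristic for "lazy" placeholder entries; an assumption, tune it for your data.
PLACEHOLDER = re.compile(r"\.\.\.|…|remaining items", re.IGNORECASE)


def find_problems(items: List[str], expected_count: Optional[int] = None) -> List[str]:
    """Return human-readable reasons the extraction looks incomplete."""
    problems = []
    for i, item in enumerate(items):
        if PLACEHOLDER.search(item):
            problems.append(f"item {i} looks like a placeholder: {item!r}")
    if expected_count is not None and len(items) != expected_count:
        problems.append(f"expected {expected_count} items, got {len(items)}")
    return problems


def extract_with_retry(
    call_model: Callable[[str], List[str]],  # hypothetical wrapper around the API
    prompt: str,
    expected_count: Optional[int] = None,
    max_attempts: int = 3,
) -> List[str]:
    """Call the model, validate the result, and retry with the errors appended."""
    current_prompt = prompt
    for _ in range(max_attempts):
        items = call_model(current_prompt)
        problems = find_problems(items, expected_count)
        if not problems:
            return items
        # Feed the validation errors back so the next attempt can correct them.
        current_prompt = (
            prompt + "\n\nYour previous answer was rejected:\n" + "\n".join(problems)
        )
    raise RuntimeError(f"no valid extraction after {max_attempts} attempts")
```

Note that the validation is ordinary deterministic code, and it only works when you already know something checkable about the right answer (here, an expected count). That is the point about where the value-add resides: the retry loop reduces the failure rate, it does not remove it.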

rsynnott•2h ago
> Can we just relax? Did I fall for a marketing trap?

Yes.
