When building the AI game suggestion feature for steamid.one (a side project for Steam users), I ran into the common, but often under-discussed, problem of choosing the "right" AI model. My needs were simple: smart, fast, cheap, and aware of somewhat recent data without complex RAG.
Here’s how I weighed the options based on cost (per 1 Million tokens, approx.), knowledge cutoff, and performance for my use case (analyzing player profiles to suggest games and output structured JSON):
1. OpenAI (GPT):
* gpt-3.5-turbo (16K context, cutoff Sep 2021): ~$0.50 input / $1.50 output. A decent baseline, but its reasoning can miss nuance for creative suggestions.
* gpt-4o (128K context, cutoff late 2023): ~$5.00 input / $15.00 output. Powerful, but too pricey for a free tool's scale.
2. Anthropic (Claude):
* Claude 3 Haiku (200K context, cutoff Aug 2023): ~$0.25 input / $1.25 output. Extremely competitive on price and very capable.
3. Google (Gemini):
* gemini-1.5-pro (1M context, cutoff late 2023): ~$3.50 input / $10.50 output for prompts up to 128K tokens. Solid, but more than I wanted to spend.
* gemini-1.5-flash (1M context, cutoff late 2023): ~$0.35 input / $1.05 output in the same 128K-prompt tier. This was the winner.
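To make the comparison concrete, here's a quick sketch of what one call would cost at each price point. The ~2K-token prompt and ~500-token reply sizes are my assumptions for a typical profile-analysis call, not measured numbers:

```python
# Approximate USD per 1M tokens (input, output), from the list above.
PRICES = {
    "gpt-3.5-turbo": (0.50, 1.50),
    "gpt-4o": (5.00, 15.00),
    "claude-3-haiku": (0.25, 1.25),
    "gemini-1.5-pro": (3.50, 10.50),
    "gemini-1.5-flash": (0.35, 1.05),
}

def cost_per_call(model: str, in_tokens: int = 2000, out_tokens: int = 500) -> float:
    """Return the approximate USD cost of a single API call."""
    in_price, out_price = PRICES[model]
    return (in_tokens * in_price + out_tokens * out_price) / 1_000_000

for model in PRICES:
    print(f"{model:18s} ~${cost_per_call(model):.6f} per call")
```

At those sizes, Flash comes out around a tenth of a cent per call versus nearly two cents for gpt-4o, which is what makes the difference at free-tool scale.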
Why Flash Stood Out for steamid.one:
Flash's performance on my specific JSON output needs (structuring game suggestions), combined with its unbeatable cost-effectiveness, was the killer feature. For a free tool, literally every cent per call matters. I also found that for game suggestions based on user-provided data, the models' knowledge cutoffs were less of a bottleneck than expected. The AI's strength was reasoning over the data I fed it (player genres, owned games), not knowing every new release. That realization significantly changed my prompt engineering strategy.
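One practical note on the structured-JSON side: whatever model you use, it pays to validate the reply before rendering it. This is a minimal sketch of the kind of guard I mean — the `{"title", "reason"}` schema is a hypothetical example, not steamid.one's actual format:

```python
import json

def parse_suggestions(raw: str) -> list:
    """Parse the model's JSON reply and sanity-check its shape.

    Expects a JSON array of objects each containing at least
    "title" and "reason" keys; raises ValueError otherwise.
    """
    data = json.loads(raw)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of suggestions")
    for item in data:
        if not isinstance(item, dict) or not {"title", "reason"} <= item.keys():
            raise ValueError(f"malformed suggestion: {item!r}")
    return data

# Example with a hand-written reply (not real model output):
reply = '[{"title": "Hades", "reason": "Matches your roguelike playtime."}]'
suggestions = parse_suggestions(reply)
print(suggestions[0]["title"])  # prints "Hades"
```

Cheap models occasionally return almost-valid JSON, so failing fast here (and optionally retrying the call) is cheaper than debugging a broken UI.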
I'm curious:
* How do you balance AI model cost, capability, and knowledge cutoff for your projects?
* Any tips for cheap, reliable AI integrations?
Thanks for any feedback!