When building the AI game suggestion feature for steamid.one (a side project for Steam users), I ran into the common, but often under-discussed, problem of choosing the "right" AI model. My needs were simple: smart, fast, cheap, and aware of somewhat recent data without complex RAG.
Here’s how I weighed the options based on cost (per 1 Million tokens, approx.), knowledge cutoff, and performance for my use case (analyzing player profiles to suggest games and output structured JSON):
1. OpenAI (GPT):
* gpt-3.5-turbo (16K context, cutoff Sep 2021): ~$0.50 input / $1.50 output. A decent baseline, but its reasoning can miss nuance for creative suggestions.
* gpt-4o (128K context, cutoff late 2023): ~$5.00 input / $15.00 output. Powerful, but too pricey for a free tool's scale.
2. Anthropic (Claude):
* Claude 3 Haiku (200K context, cutoff Aug 2023): ~$0.25 input / $1.25 output. Extremely competitive on price and very capable.
3. Google (Gemini):
* gemini-1.5-pro (1M context, cutoff late 2023): ~$3.50 input / $10.50 output for prompts up to 128K tokens. Solid, but more than I wanted to spend.
* gemini-1.5-flash (1M context, cutoff late 2023): ~$0.35 input / $1.05 output in the same 128K-prompt tier. This was the winner.
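To make the comparison concrete, here's a quick sketch of what one call would cost at each price point. The ~2K-token prompt and ~500-token reply sizes are my assumptions for a typical profile-analysis call, not measured numbers:

```python
# Approximate USD per 1M tokens (input, output), from the list above.
PRICES = {
    "gpt-3.5-turbo": (0.50, 1.50),
    "gpt-4o": (5.00, 15.00),
    "claude-3-haiku": (0.25, 1.25),
    "gemini-1.5-pro": (3.50, 10.50),
    "gemini-1.5-flash": (0.35, 1.05),
}

def cost_per_call(model: str, in_tokens: int = 2000, out_tokens: int = 500) -> float:
    """Return the approximate USD cost of a single API call."""
    in_price, out_price = PRICES[model]
    return (in_tokens * in_price + out_tokens * out_price) / 1_000_000

for model in PRICES:
    print(f"{model:18s} ~${cost_per_call(model):.6f} per call")
```

At those sizes, Flash comes out around a tenth of a cent per call versus nearly two cents for gpt-4o, which is what makes the difference at free-tool scale.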
Why Flash Stood Out for steamid.one:
Flash's performance on my specific JSON output needs (structuring game suggestions), combined with its unbeatable cost-effectiveness, was the killer feature. For a free tool, literally every cent per call matters. I also found that for game suggestions based on user-provided data, the models' knowledge cutoffs were less of a bottleneck than expected. The AI's strength was reasoning over the data I fed it (player genres, owned games), not knowing every new release. That realization significantly changed my prompt engineering strategy.
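One practical note on the structured-JSON side: whatever model you use, it pays to validate the reply before rendering it. This is a minimal sketch of the kind of guard I mean — the `{"title", "reason"}` schema is a hypothetical example, not steamid.one's actual format:

```python
import json

def parse_suggestions(raw: str) -> list:
    """Parse the model's JSON reply and sanity-check its shape.

    Expects a JSON array of objects each containing at least
    "title" and "reason" keys; raises ValueError otherwise.
    """
    data = json.loads(raw)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of suggestions")
    for item in data:
        if not isinstance(item, dict) or not {"title", "reason"} <= item.keys():
            raise ValueError(f"malformed suggestion: {item!r}")
    return data

# Example with a hand-written reply (not real model output):
reply = '[{"title": "Hades", "reason": "Matches your roguelike playtime."}]'
suggestions = parse_suggestions(reply)
print(suggestions[0]["title"])  # prints "Hades"
```

Cheap models occasionally return almost-valid JSON, so failing fast here (and optionally retrying the call) is cheaper than debugging a broken UI.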
I'm curious:
* How do you balance AI model cost, capability, and knowledge cutoff for your projects?
* Any tips for cheap, reliable AI integrations?
Thanks for any feedback!