Hi HN,

I’m building *new.knife.day* (https://new.knife.day), a crowd-sourced database of every cutlery maker—from Al Mar to brands so small they barely show up on Google. That means I need an automated way to fetch each brand’s official website, even for fringe names like “Actilam” or “Aiorosu Knives”.

So I threw the task at eight web-enabled LLMs via OpenRouter:

  • gpt-4o and gpt-4o-mini
  • claude-sonnet-4
  • gemini-2.5-pro and gemini-2.0-flash
  • llama-3.1-70b
  • qwen-2.5-72b
  • perplexity sonar-deep-research

Prompt: Return *only* JSON { brand, official_url, confidence }
Data set: 10 obscure knife brands
Scoring: exact domain = correct; “no official site” (with reason) = correct
Costs: OpenRouter prices on 31 May 2025 (Perplexity billed separately)
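For anyone who wants to reproduce the scoring rule, here is a rough sketch of how it could be coded. It is not the exact code from the repo. Two assumptions are mine: `www.example.com` and `example.com` count as the same domain, and a "no official site" answer carries its explanation in a hypothetical `reason` field (the prompt above only names brand, official_url and confidence).

```python
from urllib.parse import urlparse


def domain_of(url):
    """Lower-cased host without a leading 'www.'; adds a scheme if missing."""
    if "://" not in url:
        url = "https://" + url
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def is_correct(answer, expected_url):
    """answer: parsed model reply (dict); expected_url: hand-checked URL, or None."""
    got = answer.get("official_url")
    if expected_url is None:
        # Brand has no official site: correct only if the model returns no URL
        # and says why ("reason" is an assumed field name).
        return not got and bool(answer.get("reason"))
    return bool(got) and domain_of(got) == domain_of(expected_url)
```

Comparing hosts rather than full URLs means a deep link into the right site still counts as a hit.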

Highlights
----------

  • Perplexity hit 10/10 but cost $9.42 (860 k tokens!).
  • GPT-4o-mini & Llama-3.1-70B got 9/10 for ~2 ¢ per correct URL.
  • Gemini Flash managed 7/10 for $0.001 total—great if you can QA the misses.
  • Half of Gemini 2.5 Pro’s replies were HTML tables my parser rejected.
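The cost-per-correct-URL figure is just total spend divided by hits. A tiny sketch using only the two totals quoted above (the dictionary layout is mine, not the repo's):

```python
# (total cost in USD, correct answers out of 10), taken from the highlights above
runs = {
    "perplexity sonar-deep-research": (9.42, 10),
    "gemini-2.0-flash": (0.001, 7),
}


def cost_per_correct(total_usd, correct):
    """USD spent per correct URL; infinite if the model got nothing right."""
    return total_usd / correct if correct else float("inf")


for model, (total_usd, correct) in runs.items():
    print(f"{model}: ${cost_per_correct(total_usd, correct):.4f} per correct URL")
```

Perplexity works out to roughly 94 ¢ per correct URL, against the ~2 ¢ reported for GPT-4o-mini and Llama-3.1-70B.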

Full table, code, and raw logs are in the post (and on GitHub).

Take-aways
----------

  1. 90 % accuracy + quick human review often beats 100 % accuracy that costs
     45× more.
  2. Structured output is part of model quality—validate JSON on arrival.
  3. Promo pricing moves fast; always ping the price API before large runs.
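To make take-away 2 concrete, here is a minimal sketch of "validate JSON on arrival" using only the standard library. It is illustrative rather than the parser from the repo. The required keys come from the prompt; accepting `null` as the "no official site" value is my assumption.

```python
import json

REQUIRED = ("brand", "official_url", "confidence")


def parse_reply(raw):
    """Return the reply as a dict, or None if it is not the JSON object we asked for."""
    try:
        data = json.loads(raw.strip())
    except (json.JSONDecodeError, AttributeError):
        # Not JSON at all (e.g. an HTML table), or raw was not a string
        return None
    if not isinstance(data, dict):
        return None
    if any(key not in data for key in REQUIRED):
        return None
    url = data["official_url"]
    # Assumption: null means "no official site"; anything else must be a string
    if url is not None and not isinstance(url, str):
        return None
    return data
```

A reply that comes back as `None` can be counted as a miss or retried, so formatting failures show up in the score instead of crashing the run.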

Next step: wire GPT-4o-mini into *new.knife.day* so visitors get verified manufacturer links. Crawling ~250 brands now costs under $5.

Curious what you’d improve, and which model you’d bet on for similar “find the canonical URL” tasks. AMA on the setup, prompts, or results!

Show HN: Which LLM Finds Obscure Knife-Brand URLs Cheapest? (8-Model Benchmark)