Ask HN: How are engineers evaluating non-deterministic ML/LLM based deployments?

1 point • zuck_vs_musk • 3h ago
https://news.ycombinator.com/item?id=43769303

We process data and documents from various sources, then:

  - convert all of it to text (using different OCR engines)
  - pass the text to LLMs; depending on the customer this can be a cheaper model, and we do have model fallbacks

How do engineers evaluate such systems?

  1. New models and new libraries are coming out all the time
  2. Even a third party's deployed model will change over time, and that might improve or regress our systems

Is there a good approach for writing evaluations for these?
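
For concreteness, the "cheaper model with fallbacks" step could be sketched roughly like this. It is only an illustration under stated assumptions: `run_with_fallbacks`, `cheap_model` and `fallback_model` are hypothetical names, the models are stand-in functions rather than any real provider API, and the OCR step is assumed to have already produced plain text.

```python
from typing import Callable, List, Sequence, Tuple

# Assumption: a "model" is any callable that maps input text to output text.
Model = Callable[[str], str]


def run_with_fallbacks(text: str, models: Sequence[Tuple[str, Model]]) -> Tuple[str, str]:
    """Try each (name, model) in order and return (name, output) of the first that succeeds."""
    errors: List[str] = []
    for name, model in models:
        try:
            return name, model(text)
        except Exception as exc:  # a real system would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


# Stand-in models for illustration only; nothing here calls a real service.
def cheap_model(text: str) -> str:
    raise TimeoutError("simulated outage")


def fallback_model(text: str) -> str:
    return text.upper()


if __name__ == "__main__":
    used, output = run_with_fallbacks(
        "invoice total: 42",
        [("cheap", cheap_model), ("fallback", fallback_model)],
    )
    print(used, output)
```

Returning the name of the model that actually answered matters for the question here: any evaluation would presumably need to know which model in the chain produced a given output before comparing results over time.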
