Lessons from building an intelligent LLM router

https://github.com/Egham-7/adaptive
1•botirk•4mo ago

Comments

botirk•4mo ago
We have been experimenting with routing inference across LLMs, and the path has been full of wrong turns.

Our first attempt was to have a large LLM make the routing decision itself. It was too costly, and its decisions were unreliable.

Next we tried training a small fine-tuned LLM as a router. It was cheaper, but the outputs were poor and not trustworthy.

Then we wrote heuristics to map prompt types to model IDs. That worked for a while, but it was brittle. Every API change or workload shift broke it.

Eventually we shifted to thinking in terms of model criteria instead of hardcoded model IDs. We benchmarked models across task types, domains, and complexity levels, and made routing decisions based on those profiles.
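As a rough illustration of routing on criteria rather than model IDs (a sketch only, not the actual adaptive implementation): each model gets a profile, and the router picks the cheapest profile that clears the bar for the prompt's task type and complexity. The profile fields, scores, costs, threshold, and model labels below are all hypothetical placeholders, not benchmark results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    name: str                      # placeholder label, not a real model ID
    cost_per_1k_tokens: float      # made-up cost figure
    task_scores: dict[str, float]  # hypothetical benchmark score per task type, 0..1
    max_complexity: float          # highest complexity it handled well, 0..1


def pick_model(profiles, task_type, complexity, min_score=0.7):
    """Return the cheapest profile that meets the criteria for this prompt."""
    eligible = [
        p for p in profiles
        if p.task_scores.get(task_type, 0.0) >= min_score
        and p.max_complexity >= complexity
    ]
    if not eligible:
        # Nothing qualifies: fall back to the strongest model for the task.
        return max(profiles, key=lambda p: p.task_scores.get(task_type, 0.0))
    return min(eligible, key=lambda p: p.cost_per_1k_tokens)


# Hypothetical profiles for illustration only.
PROFILES = [
    ModelProfile("small-model", 0.2,
                 {"summarization": 0.85, "code_generation": 0.72}, 0.4),
    ModelProfile("premium-model", 3.0,
                 {"summarization": 0.90, "code_generation": 0.93}, 1.0),
]

print(pick_model(PROFILES, "summarization", 0.3).name)
print(pick_model(PROFILES, "code_generation", 0.8).name)
```

Because the router only sees profiles, swapping a model in or out means editing one entry in the list instead of rewriting the routing rules.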

To estimate task type and complexity, we used NVIDIA’s Prompt Task and Complexity Classifier. It classifies prompts into categories like QA, summarization, and code generation. It also scores each prompt along six dimensions: creativity, reasoning, domain knowledge, contextual knowledge, constraints, and number of few-shot examples. From these it produces a weighted overall complexity score.
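To make the weighted overall score concrete, here is a minimal sketch of a weighted sum over the six dimensions. The weights are illustrative assumptions that sum to 1; check the classifier's model card for the published values. The dimension scores in the example are made up, and the key names are hypothetical rather than the classifier's actual output fields.

```python
# Illustrative weights only (assumption): they sum to 1.0 but are not
# guaranteed to match the classifier's published formula.
WEIGHTS = {
    "creativity": 0.35,
    "reasoning": 0.25,
    "constraints": 0.15,
    "domain_knowledge": 0.15,
    "contextual_knowledge": 0.05,
    "few_shots": 0.05,
}


def complexity_score(dimensions):
    """Weighted sum of per-dimension scores, each expected in [0, 1]."""
    missing = WEIGHTS.keys() - dimensions.keys()
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    return sum(WEIGHTS[name] * dimensions[name] for name in WEIGHTS)


# Made-up dimension scores for a single prompt.
example = {
    "creativity": 0.2,
    "reasoning": 0.8,
    "constraints": 0.4,
    "domain_knowledge": 0.6,
    "contextual_knowledge": 0.1,
    "few_shots": 0.0,
}

print(round(complexity_score(example), 3))
```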

This gave us a structured way to decide when a prompt justified a premium model like Claude Opus 4.1 and when a smaller model like GPT-5-mini would perform just as well.

Now we are working on integrating this with Google’s UniRoute (https://arxiv.org/abs/2502.08773).
