The M×N problem of tool calling and open-source models

https://www.thetypicalset.com/blog/grammar-parser-maintenance-contract
49•remilouf•4d ago

Comments

Nevermark•3h ago
Feedback: I don't usually comment on formatting, but that fat indent is too much. I applied "hide distracting items" to the graphic, and the indent is still there. Reader works.
airstrike•2h ago
One of the most relevant posts about AI on HN this year. It's not hype-y, but it's imperative to discuss.

I find it strange that the industry hasn't converged on an at least somewhat standardized format, but I guess despite all the progress we're still in the very early days...

kami23•2h ago
Sounds like we need another standard. /s

This is one of the first tech waves where I feel like I'm on the very, very ground floor for a lot of exploration, and it feels like people have only been paying closer attention in the last year. I can't imagine too many 'standard' standards becoming a standard that quickly.

It's new enough that Google seems to be throwing pasta against the wall and seeing what products and protocols stick. Antigravity for example seems too early to me, I think they just came out with another type of orchestrator, but the whole field seems to be exploring at the same time.

Everyone and their uncle is making an orchestrator now! I take a very cautious approach lately: I haven't been loading up my tools (agents, IDEs, browsers, phones) with too much extra stuff, because as soon as I switch something, or something new comes out that doesn't support something I built a workflow around, the tool either becomes inaccessible to me or comes with a bigger learning curve than I have the patience for.

I've been a big proponent of trying to get all these things working locally for myself (I need to bite the bullet on some beefy video cards finally), and I've found even just getting tool calls to work with some Qwen models to be so counterintuitive.

jonathanhefner•2h ago
Does anyone know why there hasn’t been more widespread adoption of OpenAI’s Harmony format? Or will it just take another model generation to see adoption?
jiehong•2h ago
Am I misunderstanding, or isn't this supposed to be the point of MCP?
akoumjian•1h ago
The models only output text. Tool calls are nothing more than specially formatted text which gets parsed and interpreted by the inference server (or some other driver) into something which can be picked up by your agent loop and executed. Models are trained on a wide variety of delimiters and escape characters to indicate their tool calls (along with things like separate thinking blocks). MCP is mostly a standard way to share with your agent loop the list of tool names and what their arguments are, which then gets passed to the inference server, which in turn renders it down to text to feed to the model.
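
To make the "specially formatted text" point concrete, here is a minimal sketch of what the parsing step might look like. The `<tool_call>` delimiter, the JSON payload layout, and the function name `parse_model_output` are all assumptions for illustration; each model family is trained on its own markers, and a real inference server does considerably more than this.

```python
import json
import re

# Hypothetical delimiter for this sketch; real models each use their own trained markers.
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def parse_model_output(text):
    """Split raw model text into plain content and a list of structured tool calls."""
    calls = []
    for match in TOOL_CALL_RE.finditer(text):
        try:
            payload = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # malformed JSON: dropped in this sketch
        if not isinstance(payload, dict) or "name" not in payload:
            continue  # not shaped like a call: dropped in this sketch
        calls.append({"name": payload["name"], "arguments": payload.get("arguments", {})})
    # Everything outside the delimiters is treated as ordinary assistant text.
    content = TOOL_CALL_RE.sub("", text).strip()
    return content, calls


raw = 'Let me check.\n<tool_call>{"name": "get_weather", "arguments": {"city": "Paris"}}</tool_call>'
content, calls = parse_model_output(raw)
```

The agent loop would then look up each `name` in its tool list (for example, one obtained over MCP), run it, and feed the result back to the model as more text.
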
perlgeek•1h ago
> Tool calls are nothing more than specially formatted text which gets parsed and interpreted by the inference server

I know this is getting off-topic, but is anybody working on more direct tool calling?

LLMs are based on neural networks, so one could create an interface where activating certain neurons triggers tool calls, with other neurons encoding the inputs; another set of neurons could be triggered by the tokenized result from the tool call.

Currently, the lack of separation between data and metadata is a security nightmare, which enables prompt injection. And yet all I've seen done about it are workarounds.

yorwba•1h ago
Each text token already represents the activation of certain neurons. There is nothing "more direct." And you cannot fully separate data and metadata if you want them to influence the output. At best you can clearly distinguish them and hope that this is enough for the model to learn to treat them differently.
perlgeek•4m ago
Are there tokens reserved for tool calls? If yes, I can see the equivalence. If not, not so much.
evelant•1h ago
I guess I fail to see why this is such a problem. Yes, it would be nice if the wire format were standardized or had a standard schema description, but is writing a parser that handles several formats actually a difficult problem? Modern models could probably whip up a "libToolCallParser" with bindings for all popular languages in an afternoon. You could probably also have an automated workflow for adding any new ones with minimal fuss. An annoyance, yes, but it does not seem like a really "hard" problem. It seems more of a social problem that open source hasn't coalesced around a library that handles it easily yet, or am I missing something?
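
For a sense of scale, the dispatch part of such a "libToolCallParser" could be sketched roughly as below. Both wire formats and every name here (`parse_tagged`, `parse_bracketed`, `PARSERS`, `parse_tool_calls`) are hypothetical, chosen only to show one registry covering several formats.

```python
import json
import re


def parse_tagged(text):
    """Hypothetical format A: <tool_call>{...}</tool_call> blocks."""
    found = re.findall(r"<tool_call>(.*?)</tool_call>", text, re.DOTALL)
    return [json.loads(chunk) for chunk in found]


def parse_bracketed(text):
    """Hypothetical format B: a '[TOOL_CALLS]' marker followed by a JSON list."""
    marker = "[TOOL_CALLS]"
    if marker not in text:
        return []
    return json.loads(text.split(marker, 1)[1])


# One entry per supported wire format; adding a format means adding an entry.
PARSERS = {"tagged": parse_tagged, "bracketed": parse_bracketed}


def parse_tool_calls(text, model_format=None):
    """Use the named parser, or try each one until some parser finds calls."""
    candidates = [PARSERS[model_format]] if model_format else list(PARSERS.values())
    for parser in candidates:
        try:
            calls = parser(text)
        except json.JSONDecodeError:
            continue  # this parser could not decode the text; try the next one
        if calls:
            return calls
    return []
```

The registry itself is short. What the sketch leaves out is streaming and partial output, recovery from malformed calls, and per-model quirks, which is likely where most of the ongoing maintenance would land.
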
HarHarVeryFunny•53m ago
There already exist products like LiteLLM that adapt tool calling to different providers. FWIW, incompatibility isn't just an open-source problem: OpenAI and Anthropic also use different syntax for tool registration and invocation.
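
As a rough illustration of that difference, the sketch below converts a tool definition from the OpenAI Chat Completions shape to the Anthropic Messages shape, and a tool invocation in the other direction. The helper names are made up for this example, and the conversion covers only the basic fields, not every option either API accepts.

```python
import json


def openai_tool_to_anthropic(tool):
    """Registration: OpenAI nests the JSON Schema under function.parameters,
    while Anthropic puts it in a top-level input_schema field."""
    fn = tool["function"]
    return {
        "name": fn["name"],
        "description": fn.get("description", ""),
        "input_schema": fn.get("parameters", {"type": "object", "properties": {}}),
    }


def anthropic_tool_use_to_openai(block):
    """Invocation: Anthropic returns a tool_use content block whose input is an
    object, while OpenAI returns a tool call whose arguments are a JSON string."""
    return {
        "id": block["id"],
        "type": "function",
        "function": {"name": block["name"], "arguments": json.dumps(block["input"])},
    }


openai_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}
anthropic_tool = openai_tool_to_anthropic(openai_tool)
```

Adapters such as the ones mentioned above presumably do this kind of mapping for many more fields and providers.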

I would guess that lack of standardization of what tools are provided by different types of agent (e.g. coding agent vs "claw agent") is as much of a problem as the differences in syntax, since the ideal case would be for a model to be trained end-to-end for use with a specific agent and set of tools, as I believe Anthropic do. Any agent interacting with a model that wasn't specifically trained to work with that agent/toolset is going to be at a disadvantage.

jeremyjh•15m ago
Presumably the hosting services are resolving all of this in their OpenAI/Anthropic compatibility layer, which is what most tools are using. So this is really just a problem for local engines that have to do the same thing but are expected to work right away for every new model drop.
kleton•1h ago
Don't inference servers like vllm or sglang just translate these things to openai-compat API shapes?
ikidd•52m ago
This sounds like a problem that LLMs were built to solve.
ontouchstart•30m ago
https://mariozechner.at/posts/2025-11-30-pi-coding-agent/#to...

What is jj and why should I care?

https://steveklabnik.github.io/jujutsu-tutorial/introduction/what-is-jj-and-why-should-i-care.html
248•tigerlily•4h ago•145 comments

Rare concert records going on Internet Archive

https://techcrunch.com/2026/04/13/thousands-of-rare-concert-recordings-are-landing-on-the-interne...
29•jrm-veris•49m ago•8 comments

DaVinci Resolve – Photo

https://www.blackmagicdesign.com/products/davinciresolve/photo
800•thebiblelover7•12h ago•211 comments

NimConf 2026: Dates Announced, Registrations Open

https://nim-lang.org/blog/2026/04/07/nimconf-2026.html
59•moigagoo•3h ago•11 comments

A new spam policy for “back button hijacking”

https://developers.google.com/search/blog/2026/04/back-button-hijacking
565•zdw•11h ago•333 comments

Someone bought 30 WordPress plugins and planted a backdoor in all of them

https://anchor.host/someone-bought-30-wordpress-plugins-and-planted-a-backdoor-in-all-of-them/
1065•speckx•20h ago•301 comments

Backblaze has stopped backing up your data

https://rareese.com/posts/backblaze/
459•rrreese•6h ago•298 comments

Introspective Diffusion Language Models

https://introspective-diffusion.github.io/
132•zagwdt•6h ago•31 comments

GitHub Stacked PRs

https://github.github.com/gh-stack/
800•ezekg•17h ago•434 comments

The acyclic e-graph: Cranelift's mid-end optimizer

https://cfallin.org/blog/2026/04/09/aegraph/
24•tekknolagi•4d ago•3 comments

The M×N problem of tool calling and open-source models

https://www.thetypicalset.com/blog/grammar-parser-maintenance-contract
49•remilouf•4d ago•16 comments

The Case Against Gameplay Loops

https://blog.joeyschutz.com/the-case-against-gameplay-loops/
29•coinfused•3h ago•19 comments

Distributed DuckDB Instance

https://github.com/citguru/openduck
109•citguru•8h ago•24 comments

Franklin's bad ads for Apple ][ clones and the beloved impersonator they depict

https://buttondown.com/suchbadtechads/archive/franklin-ace-1000/
67•rfarley04•3d ago•35 comments

Ransomware Is Growing Three Times Faster Than the Spending Meant to Stop It

https://ciphercue.com/blog/ransomware-claims-grew-faster-than-security-spend-2025
57•adulion•5h ago•47 comments

Lean proved this program correct; then I found a bug

https://kirancodes.me/posts/log-who-watches-the-watchers.html
310•bumbledraven•14h ago•145 comments

Show HN: Run GUIs as Scripts

https://github.com/skinnyjames/hokusai-pocket
7•zero-st4rs•4d ago•0 comments

The exponential curve behind open source backlogs

https://armanckeser.com/writing/jellyfin-flow
22•armanckeser•2h ago•12 comments

The Great Majority: Body Snatching and Burial Reform in 19th-Century Britain

https://publicdomainreview.org/essay/the-great-majority/
13•apollinaire•3d ago•1 comments

Multi-Agentic Software Development Is a Distributed Systems Problem

https://kirancodes.me/posts/log-distributed-llms.html
72•tie-in•9h ago•30 comments

WiiFin – Jellyfin Client for Nintendo Wii

https://github.com/fabienmillet/WiiFin
203•throwawayk7h•15h ago•92 comments

Nothing Ever Happens: Polymarket bot that always buys No on non-sports markets

https://github.com/sterlingcrispin/nothing-ever-happens
454•m-hodges•23h ago•250 comments

MOS tech 6502 8-bit microprocessor in pure SQL powered by Postgres

https://github.com/lasect/pg_6502
50•adunk•8h ago•6 comments

A soft robot has no problem moving with no motor and no gears

https://engineering.princeton.edu/news/2026/04/08/soft-robot-has-no-problem-moving-no-motor-and-n...
54•hhs•4d ago•16 comments

US appeals court declares 158-year-old home distilling ban unconstitutional

https://nypost.com/2026/04/11/us-news/us-appeals-court-declares-158-year-old-home-distilling-ban-...
424•t-3•1d ago•288 comments

Lumina – a statically typed web-native language for JavaScript and WASM

https://github.com/nyigoro/lumina-lang
35•light_ideas•4d ago•14 comments

Design and implementation of DuckDB internals

https://duckdb.org/library/design-and-implementation-of-duckdb-internals/
157•mpweiher•4d ago•11 comments

Write less code, be more responsible

https://blog.orhun.dev/code-responsibly/
131•orhunp_•3d ago•76 comments

Make tmux pretty and usable (2024)

https://hamvocke.com/blog/a-guide-to-customizing-your-tmux-conf/
409•speckx•23h ago•252 comments

N-Day-Bench – Can LLMs find real vulnerabilities in real codebases?

https://ndaybench.winfunc.com
86•mufeedvh•16h ago•27 comments