Programmatic Tool Calling for Agents

https://github.com/zeke-john/codecall
1•zekejohn•8h ago

Comments

zekejohn•8h ago
Hey all :)

I've been working on an open source implementation of Programmatic Tool Calling for Agents, based on Cloudflare's codemode and a few Anthropic articles. I think it can be very powerful in certain use cases, but there are some challenges I would love to have your thoughts on.

Traditional agents burn tens of thousands of tokens loading all tool definitions upfront, and their context compounds with every sequential call. This approach instead lets agents discover only the tools they need from a file tree of TypeScript SDKs, then write code to one-shot tasks in a single pass.
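To make the single-pass idea concrete, here is a minimal sketch of the kind of script an agent might write. `listTasks` and `completeTask` are hypothetical stand-ins for functions a generated SDK could expose; they are not taken from codecall, and the in-memory data is made up for illustration.

```typescript
// Hypothetical stand-ins for functions a generated TypeScript SDK might expose.
// Real tool calls would be async; they are synchronous here to keep the sketch short.
type Task = { id: string; title: string; done: boolean };

const tasks: Task[] = [
  { id: "1", title: "Write docs", done: false },
  { id: "2", title: "Fix bug", done: true },
  { id: "3", title: "Ship release", done: false },
];

function listTasks(): Task[] {
  return tasks;
}

function completeTask(id: string): boolean {
  for (const task of tasks) {
    if (task.id === id) {
      task.done = true;
      return true;
    }
  }
  return false; // unknown id
}

// The agent-written script: list, filter, and act in one pass.
// Only the returned summary would need to go back into the model's context.
function completeOpenTasks(): string[] {
  const closed: string[] = [];
  for (const task of listTasks()) {
    if (!task.done && completeTask(task.id)) {
      closed.push(task.title);
    }
  }
  return closed;
}
```

In a traditional loop, each of those calls would be a separate model turn with its result appended to the context.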

Having an agent execute code seems ideal, since LLMs are great at writing code, but I have run into a few big challenges, listed below.

The main challenges w/ Programmatic Tool Calling:

- Output Schemas from the Tools

MCP servers, and most tool definitions in general, almost never define output schemas. Without knowing what a tool returns, the model hallucinates property names (think 'task.title' vs 'task.name'), and the script fails at runtime because it has to guess the shape of the tool's output. I'm working around this by classifying tools and actually calling them to infer schemas, but it's really hacky: a single sample misses optional fields, and testing write or destructive tools means creating or destroying real data, an approach I really dislike and don't think is viable.
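The optional-field problem is easy to show with a small sketch. `inferShape` below is a hypothetical helper, not codecall's actual inference code; it only looks at top-level keys and marks a field optional when some sample lacks it.

```typescript
type FieldInfo = { types: string[]; optional: boolean };

// Merge the top-level keys of several sample outputs into one rough shape.
// A field is marked optional if at least one sample does not contain it.
function inferShape(samples: Array<Record<string, unknown>>): Record<string, FieldInfo> {
  const shape: Record<string, FieldInfo> = {};
  const seenCount: Record<string, number> = {};
  const has = (obj: object, key: string): boolean =>
    Object.prototype.hasOwnProperty.call(obj, key);

  for (const sample of samples) {
    for (const key of Object.keys(sample)) {
      const value = sample[key];
      const typeName =
        value === null ? "null" : Array.isArray(value) ? "array" : typeof value;
      if (!has(shape, key)) {
        shape[key] = { types: [], optional: false };
        seenCount[key] = 0;
      }
      if (shape[key].types.indexOf(typeName) === -1) {
        shape[key].types.push(typeName);
      }
      seenCount[key] += 1;
    }
  }

  for (const key of Object.keys(shape)) {
    shape[key].optional = seenCount[key] < samples.length;
  }
  return shape;
}

// One sample never reveals due_date; a second one does.
const fromOne = inferShape([{ id: "1", title: "a" }]);
const fromTwo = inferShape([
  { id: "1", title: "a" },
  { id: "2", title: "b", due_date: "2024-11-06" },
]);
```

`fromOne` has no `due_date` entry at all, while `fromTwo` records it as optional. More samples help, but getting them still means calling the tool.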

- Tool Outputs Are Often Plain Strings (returns unstructured data)

Even with perfect schemas and defined shapes, most MCP tools return markdown blobs or plain strings meant for LLM inference. No JSON, no fields to index into, just text. If the majority of your tools return plain strings (even when listing data), the main value of codecall is lost, because you can't write deterministic code against unstructured data in a string. You're forced back into traditional agent behavior where the LLM interprets text. If you don't control the server or the tool definitions, there's no fix I can really think of.
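A rough sketch of where that split shows up in practice: a script can check whether a result is JSON it can index into, and otherwise has to hand the text back to the model. `classifyToolOutput` is a hypothetical helper for illustration, not part of codecall or MCP.

```typescript
type ToolOutput =
  | { kind: "structured"; data: object } // JSON object or array: code can index into it
  | { kind: "text"; text: string };      // anything else: needs LLM interpretation

function classifyToolOutput(raw: string): ToolOutput {
  try {
    const parsed: unknown = JSON.parse(raw);
    // Bare JSON scalars like 42 or "ok" carry no fields, so treat them as text.
    if (typeof parsed === "object" && parsed !== null) {
      return { kind: "structured", data: parsed };
    }
  } catch {
    // Not valid JSON (for example a markdown blob); fall through to text.
  }
  return { kind: "text", text: raw };
}

const jsonResult = classifyToolOutput('[{"id":"1","title":"Write docs"}]');
const markdownResult = classifyToolOutput("## Tasks\n- Write docs\n- Fix bug");
```

Here `jsonResult.kind` is "structured" and `markdownResult.kind` is "text". The check only detects the problem; it does not recover structure from the markdown.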

- Input/Output examples for each Tool (Amplified w/ Programmatic Tool Calling)

The final challenge is that JSON Schema defines structure but not usage patterns. Take a support ticket API as an example: the schema tells you due_date is a string, but not whether it wants "2024-11-06" or "Nov 6, 2024". It says reporter.id is a string, but is that a UUID or "USR-12345"? When should reporter.contact be populated? How do escalation.level and priority interact? (I got this example from an Anthropic article covering this.)

In traditional tool calling, the model can learn these patterns through trial and error across multiple turns. It tries something, gets an error or unexpected result, and adjusts for the rest of the task. But with programmatic tool calling, the model writes a script that might call create_ticket 50 times in a loop for different users. If it misinterprets the date format or ID convention in the first call, all 50 calls fail.
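To illustrate the failure mode, here is a small sketch with a made-up `createTicket` stub that accepts only ISO dates (an assumption for the example, not a real API). The loop stops at the first failure, which is one possible guard and not something codecall is confirmed to do.

```typescript
type Ticket = { userId: string; due_date: string };
type BatchResult = { created: number; attempted: number; error?: string };

// Hypothetical tool stub: accepts only YYYY-MM-DD dates (assumed for illustration).
function createTicket(ticket: Ticket): { ok: boolean; error?: string } {
  return /^\d{4}-\d{2}-\d{2}$/.test(ticket.due_date)
    ? { ok: true }
    : { ok: false, error: "invalid due_date: " + ticket.due_date };
}

// Stop at the first failure so the model can fix the format,
// instead of repeating the same mistake for every remaining ticket.
function createTickets(tickets: Ticket[]): BatchResult {
  let created = 0;
  for (const ticket of tickets) {
    const result = createTicket(ticket);
    if (!result.ok) {
      return { created, attempted: created + 1, error: result.error };
    }
    created += 1;
  }
  return { created, attempted: created };
}

// Build 50 tickets that all share one date format.
function buildTickets(dueDate: string): Ticket[] {
  const out: Ticket[] = [];
  for (let i = 1; i <= 50; i++) {
    out.push({ userId: "user-" + i, due_date: dueDate });
  }
  return out;
}

const wrongFormat = createTickets(buildTickets("Nov 6, 2024"));
const isoFormat = createTickets(buildTickets("2024-11-06"));
```

With the wrong format, the batch stops after one attempt instead of failing 50 times; with the ISO format, all 50 tickets are created. The model still needs a second pass to correct the script, so this limits the damage rather than removing the need for examples.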

-------------

Although all of these could be fixed by having the user set them manually, is there a reliable way to get the output schemas and generate input/output examples for each tool, without actually calling the tool and without having a user enter the data manually?

If anybody is interested, or has any thoughts or ideas on Tool Calling for Agents, please feel free to share!
