Show HN: FC-Eval – CLI to Benchmark Local or Cloud LLMs on Function Calling

https://github.com/gauravvij/function-calling-cli
3•gauravvij137•1h ago
I built FC-Eval to have a repeatable way to evaluate how well different LLMs handle function calling before using them in agent workflows.

It runs models through 30 test cases covering single-turn, multi-turn, and agentic scenarios, modeled loosely after the Berkeley Function Calling Leaderboard methodology.

Validation uses AST matching rather than string comparison, to avoid spurious mismatches caused by formatting variations.
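To illustrate the idea of AST matching, here is a minimal sketch using Python's standard `ast` module. It is not FC-Eval's actual validation code: the helper names `parse_call` and `calls_match` are hypothetical, and it assumes the model output is a single Python-style call with literal arguments.

```python
import ast


def parse_call(source: str):
    """Parse one call expression into (function name, positional args, keyword args).

    Hypothetical helper for illustration; assumes literal arguments only.
    """
    node = ast.parse(source.strip(), mode="eval").body
    if not isinstance(node, ast.Call):
        raise ValueError(f"not a function call: {source!r}")
    name = ast.unparse(node.func)  # requires Python 3.9+
    args = [ast.literal_eval(arg) for arg in node.args]
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
    return name, args, kwargs


def calls_match(expected: str, actual: str) -> bool:
    """Compare two calls structurally, ignoring whitespace, quote style, and keyword order."""
    try:
        return parse_call(expected) == parse_call(actual)
    except (ValueError, SyntaxError):
        # Unparseable or non-literal output counts as a mismatch.
        return False


expected = 'get_weather(city="Paris", unit="celsius")'
actual = "get_weather( unit = 'celsius',city='Paris' )"
print(expected == actual)             # plain string comparison
print(calls_match(expected, actual))  # structural comparison
```

With the inputs above, the two strings differ character by character, yet they parse to the same function name and keyword arguments, so the structural comparison treats them as the same call.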

Supports two backends: OpenRouter for cloud models (GPT-5.2, Claude, Qwen 3.5, Mistral, etc.) and Ollama for local models with no API key needed.

Each test can be run for best-of-N trials, giving you a more reliable score alongside raw accuracy.
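The post does not say how the trial-based score is computed, so the following is only a sketch under stated assumptions: raw accuracy is taken as the fraction of all trials that pass, a test case counts toward best-of-N if at least one of its N trials passes, and a stricter all-pass rate is included for comparison. The function name `score_trials` and the result layout are hypothetical.

```python
from typing import Dict, List


def score_trials(results: Dict[str, List[bool]]) -> Dict[str, float]:
    """Aggregate per-trial pass/fail results (test-case id -> list of N outcomes).

    The aggregation rules are assumptions for illustration, not FC-Eval's definitions.
    """
    if not results or any(len(trials) == 0 for trials in results.values()):
        raise ValueError("every test case needs at least one trial")

    total_trials = sum(len(trials) for trials in results.values())
    passed_trials = sum(sum(trials) for trials in results.values())

    return {
        # Fraction of individual trials that passed.
        "raw_accuracy": passed_trials / total_trials,
        # Fraction of test cases with at least one passing trial (assumed best-of-N).
        "best_of_n": sum(any(trials) for trials in results.values()) / len(results),
        # Fraction of test cases that passed on every trial (stricter consistency view).
        "all_pass": sum(all(trials) for trials in results.values()) / len(results),
    }


scores = score_trials({
    "single_turn_01": [True, True, True],
    "multi_turn_01": [True, False, False],
    "agentic_01": [False, False, False],
})
print(scores)
```

Reporting more than one aggregate makes flaky behavior visible: a model that passes a case only once in three trials looks fine under best-of-N but drags down both raw accuracy and the all-pass rate.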

Results export to JSON, TXT, CSV, or Markdown.

Quick start commands:

Via OpenRouter: `fc-eval --provider openrouter --models openai/gpt-5.2 anthropic/claude-sonnet-4.6`

Via Ollama: `fc-eval --provider ollama --models llama3.2`

GitHub repo: https://github.com/gauravvij/function-calling-cli

Happy to answer questions, especially around the test case design or validation logic.

'The Secret Agent': Exploring a Vibrant, yet Violent Brazil

https://theasc.com/articles/the-secret-agent-cinematography
1•tambourine_man•1m ago•0 comments

I built HiddenMRR – find revenue opportunities in your old GitHub repos

https://www.hiddenmrr.com
1•pintayo•1m ago•0 comments

Love Letter to the Claude Code Docs – Tips from the Docs That Changed How I Work

https://www.tyleo.com/blog/love-letter-to-the-claude-code-docs
1•tyleo•2m ago•0 comments

Introduction to Data-Centric Query Compilation

https://duckul.us/blog/data-centric-query-compilation
1•PaulHoule•3m ago•0 comments

Powers of finite decimals are finite decimals

https://www.johndcook.com/blog/2026/03/17/powers-dont-clear-fractions/
1•ibobev•3m ago•0 comments

The Putney Debates (1647)

https://www.putneydebates.com/
1•_doctor_love•3m ago•0 comments

Tone Row Operations

https://www.johndcook.com/blog/2026/03/17/tone-row-operations/
1•ibobev•3m ago•0 comments

Protein complexes added to AlphaFold Database

https://www.embl.org/news/science-technology/first-complexes-alphafold-database/
1•mrkO99•4m ago•0 comments

How the Pokémon franchise has helped to shape neuroscience

https://www.nature.com/articles/d41586-026-00861-w
1•Brajeshwar•4m ago•0 comments

Aqara G350 first Matter-certified camera for multi-platform homes

https://www.aqara.com/us/product/camera-hub-g350/
1•hmokiguess•4m ago•0 comments

MCP vs. CLI Is the Wrong Fight

https://smithery.ai/blog/mcp-vs-cli-is-the-wrong-fight
2•nadis•5m ago•0 comments

Underrated Postgres: Create (Extended) Statistics

https://vela.simplyblock.io/blog/postgres-create-extended-statistics/
2•ronakjalan98•6m ago•0 comments

Show HN: Introducing Unsloth Studio

https://github.com/unslothai/unsloth
2•danielhanchen•6m ago•0 comments

Show HN: Cuckoo-GPU – A 350x faster Bloom filter alternative for GPUs

https://github.com/tdortman/Cuckoo-GPU
1•tdortman•6m ago•1 comment

We give every user SQL access to a shared ClickHouse cluster

https://trigger.dev/blog/how-trql-works
1•eallam•7m ago•0 comments

Forge – OSS governance plugin for Claude Code (22 agents, SDD, quality gates)

https://github.com/nxtg-ai/forge-plugin
2•vipdestiny•9m ago•1 comment

Show HN: PUNK – Remote control for local Claude Code that just works

https://punkcode.rocks
1•jackjackpop•9m ago•1 comment

Show HN: Llamactl – Self-hosted LLM manager with OpenAI-compatible routing

https://github.com/lordmathis/llamactl
1•lordmathis•10m ago•0 comments

OpenAI courts private equity to join enterprise AI venture

https://www.reuters.com/business/openai-courts-private-equity-join-enterprise-ai-venture-sources-...
1•gmays•10m ago•0 comments

Yet Another SQLite-Vector

https://github.com/jtarchie/sqlite-vector
1•jtarchie•10m ago•1 comment

The Singularity Will Not Be Streamed

https://paoramen.fika.bar/the-singularity-will-not-be-streamed-01KJ7KM42KET7EZQQSYD836358
1•masylum•11m ago•0 comments

Sigwork – A 1.7kb signal-based reactive framework

https://framework.thatjust.works/
1•murillobrand•11m ago•1 comment

I love my dumb watches

https://gary.onl/a-post-about-watches/
1•abnercoimbre•11m ago•0 comments

Show HN: Antfly: Distributed, Multimodal Search and Memory and Graphs in Go

https://github.com/antflydb/antfly
9•kingcauchy•12m ago•0 comments

Show HN: I indexed 58K AI agents and built trust scores for the agent economy

https://nanosec.ai
1•bobakTamaddon•13m ago•0 comments

I think AI is pushing me toward the AGPL

https://blogsystem5.substack.com/p/ai-and-agpl-licensing
3•LorenDB•14m ago•0 comments

A.B. 1043's Internet Age Gates Hurt Everyone – Eff.org

https://www.eff.org/deeplinks/2026/03/ab-1043s-internet-age-gates-hurt-everyone
2•netule•15m ago•0 comments

Math in the AI Era

https://3quarksdaily.com/3quarksdaily/2026/03/math-in-the-ai-era.html
3•thm•15m ago•0 comments

Turkish Coffee? Since the 16th Century, It's in the Water

https://specialprojects.sprudge.com/?p=868
4•speckx•15m ago•0 comments

TV Learned to Sell Itself

https://worksinprogress.co/issue/how-tv-learned-to-sell-itself/
1•ortegaygasset•15m ago•0 comments