yatesdr•4h ago
Great for connecting your local LLM coding and vision models to Claude Code and Codex.
General improvements
> Vision pipeline - images described by your vision model, transparent to the client
> Dual OCR pipeline - smart routing for PDFs and tool output (text extraction first, vision fallback for scanned docs). Dedicated OCR models like PaddleOCR-VL are ~17x faster than general vision models on document pages
> Brave & Tavily search integration - native behavior for Claude Code and Codex when configured on the proxy
> Per-model processor routing - override vision, OCR, and search settings per model
> Context window auto-detection from backends
> SSE keepalive improvements during pipeline processing
> Full MCP SSE endpoint for web search on OpenCode, Qwen Code, Claw, and other MCP-compatible agents
> Docker update for easier deployment (limited testing so far)
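The "text extraction first, vision fallback" routing above can be sketched with a simple heuristic. This is illustrative only, not the proxy's actual code; the function name and threshold are assumptions.

```python
def route_document(extracted_text: str, page_count: int,
                   min_chars_per_page: int = 50) -> str:
    """Decide between the fast text path and the vision-OCR fallback.

    Hypothetical heuristic: if plain text extraction yields almost
    nothing, the PDF is probably scanned and goes to the OCR model.
    """
    if len(extracted_text.strip()) >= min_chars_per_page * max(page_count, 1):
        return "text"  # extraction succeeded: skip the vision model entirely
    return "ocr"       # likely a scanned document: fall back to vision/OCR
```

The speedup claim makes sense under this scheme: most PDFs take the cheap text path, and only scanned pages ever hit a model.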
Codex-specific:
> Full Responses API translation - Chat Completions under the hood, your local backend doesn't need to support /v1/responses
> Reasoning token display - reasoning_summary_text.delta events so Codex shows thinking natively
> Native search UI - emits web_search_call output items so Codex renders "Searched N results" in its interface
> Structured tool output - Codex's view_image returns arrays/objects, not just strings; the proxy handles all three formats
> mcp_tool_call_output and mcp_list_tools input types handled (Codex sends these; local backends otherwise choke on them)
> Config generator produces config.toml with provider, reasoning effort, context window, and optional Tavily MCP
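The Responses-to-Chat-Completions translation amounts to reshaping the request body before it hits the local backend. A minimal sketch, assuming a dict-in/dict-out helper (the function name and the fields covered are illustrative; the real translation handles far more of the Responses schema):

```python
def responses_to_chat_completions(req: dict) -> dict:
    """Map a minimal /v1/responses body onto the /v1/chat/completions
    shape that local backends like vLLM and llama-server expect."""
    messages = []
    # Responses puts the system prompt in a top-level "instructions" field.
    if req.get("instructions"):
        messages.append({"role": "system", "content": req["instructions"]})
    inp = req.get("input")
    if isinstance(inp, str):
        # "input" may be a bare user string...
        messages.append({"role": "user", "content": inp})
    else:
        # ...or a list of items whose content is a list of typed parts.
        for item in inp or []:
            parts = item.get("content", [])
            if isinstance(parts, list):
                text = "".join(p.get("text", "") for p in parts)
            else:
                text = str(parts)
            messages.append({"role": item.get("role", "user"), "content": text})
    return {
        "model": req.get("model"),
        "messages": messages,
        "stream": req.get("stream", False),
    }
```

The streaming direction (Chat Completions deltas back into Responses events like reasoning_summary_text.delta) is the harder half and is not shown here.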
Claude Code-specific:
> Full Messages API translation - Anthropic protocol to Chat Completions, so Claude Code works with vLLM/llama-server
> Thinking blocks - backend reasoning tokens wrapped as thinking/signature_delta content blocks so Claude Code renders them
> web_search_20250305 server tool intercepted and executed proxy-side
> PDF type: "document" blocks extracted to text before forwarding
> Streaming search with server_tool_use + web_search_tool_result blocks so Claude Code shows "Did N searches"
> /anthropic/v1/messages explicit route for clients that use the Anthropic base URL convention
> Config generator produces settings.json with Sonnet/Opus/Haiku tier selectors, thinking toggles, and start scripts
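The Messages API side is the same idea in the Anthropic dialect. A minimal sketch of the request-body mapping (names are illustrative; tool use, thinking blocks, and streaming are where the real work is):

```python
def anthropic_to_chat_completions(req: dict) -> dict:
    """Map an Anthropic /v1/messages body onto Chat Completions."""
    messages = []
    # Anthropic keeps the system prompt at the top level, not in messages.
    if req.get("system"):
        messages.append({"role": "system", "content": req["system"]})
    for msg in req.get("messages", []):
        content = msg["content"]
        if isinstance(content, list):
            # Keep only text blocks here; image and document blocks would
            # already have been handled by the vision/PDF pipelines.
            content = "".join(
                b.get("text", "") for b in content if b.get("type") == "text"
            )
        messages.append({"role": msg["role"], "content": content})
    return {
        "model": req.get("model"),
        "messages": messages,
        # max_tokens is required by the Anthropic API, optional in OpenAI's.
        "max_tokens": req.get("max_tokens", 1024),
    }
```

Going back the other way, the backend's reasoning tokens get re-wrapped as thinking/signature_delta content blocks so Claude Code renders them natively, as noted above.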