LLMxRay is an open-source tool to inspect how different LLMs handle the same prompt. It focuses on three things:
• showing prompts and responses side by side for multiple models
• exposing token counts and tokenization details
• comparing behavior across languages and model families
It works with local models (e.g. via Ollama/LM Studio) and API-based models. The interface lets you run the same prompt against several models and see how length, phrasing, and token usage differ.
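For the local-model case, fanning the same prompt out to several models can be sketched against Ollama's REST API. This is a minimal illustration, not LLMxRay's actual implementation; the endpoint and payload shape follow Ollama's documented `/api/generate` interface, and the model names are placeholders:

```python
import json
from urllib import request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_request(model: str, prompt: str) -> dict:
    """Payload shape for Ollama's /api/generate endpoint, streaming disabled."""
    return {"model": model, "prompt": prompt, "stream": False}

def run_prompt(model: str, prompt: str) -> str:
    """Send one prompt to one local model; requires a running Ollama instance."""
    body = json.dumps(build_request(model, prompt)).encode("utf-8")
    req = request.Request(OLLAMA_URL, data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# The same prompt fanned out to several models (model names are examples):
prompt = "Explain tokenization in one sentence."
payloads = [build_request(m, prompt) for m in ("llama3", "mistral", "qwen2")]
```

With responses collected per model, the side-by-side view reduces to comparing the returned strings' lengths, wording, and reported token counts.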
LLMxRay currently supports four languages (English, French, Arabic, Chinese), so you can see how tokenization and expression change across writing systems (Latin script, right-to-left Arabic script, and Chinese characters). This makes it useful for understanding multilingual behavior, cost differences, and prompt design across languages.
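Actual token counts depend on each model's tokenizer, but a rough intuition for why the same sentence costs differently across scripts can be sketched with the UTF-8 byte footprint alone (byte-level tokenizers start from UTF-8 bytes, and non-Latin scripts use more bytes per character). This is an illustrative proxy, not a real token count:

```python
# Rough proxy for cross-script token cost: bytes per character in UTF-8.
# ASCII text is 1 byte/char; Arabic letters are 2 bytes; Chinese
# characters are 3 bytes. This is NOT a real token count.
samples = {
    "English": "Hello, how are you?",
    "French": "Bonjour, comment allez-vous ?",
    "Arabic": "مرحبا، كيف حالك؟",
    "Chinese": "你好，你好吗？",
}

for lang, text in samples.items():
    chars = len(text)
    utf8_bytes = len(text.encode("utf-8"))
    print(f"{lang:8s} chars={chars:3d} utf8_bytes={utf8_bytes:3d} "
          f"bytes/char={utf8_bytes / chars:.2f}")
```

A tool like LLMxRay goes further by showing the real per-model tokenization, which is where cost and prompt-length differences across languages actually come from.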
The project is early but usable. I’d be interested in feedback on the concept, the UI, and what kinds of comparisons or visualizations would be most useful to you.
Project page: https://lognebudo.github.io/llmxray/