
Show HN: High-performance GenAI engine now open source

https://github.com/arthur-ai/arthur-engine
22•fryz•9mo ago
Hey HN

After one too many customer fire drills over hallucinating or insecure AI models, we built a system to catch these issues before they reach production. The Arthur Engine has been running at companies ranging from the Fortune 100 to AI-native startups for the past two years, putting security controls around more than 10 billion production tokens every month. We're now opening this service up to developers, so you can use an enterprise-grade solution for guardrails and evals as a service, all for free.

Get it on GitHub (https://github.com/arthur-ai/arthur-engine) to start evaluating your models today.

Highlights of the Arthur Engine include:

* Built for speed and scale: sub-second p90 latencies at well over 100 RPS.

* Made for full lifecycle support: Ideal for pre-production validation, real-time guardrails, and post-production monitoring.

* Ease of use: designed to be easy for anyone to run and deploy, whether you're working locally during development or deploying it in a horizontally scaling architecture for large-scale workloads.

* Unification of generative and traditional AI: the Arthur Engine can evaluate a diverse range of models, from LLMs and agentic AI systems to binary classifiers, regression models, recommender systems, forecasting models, and more.

* Content-specific guardrail and detection features: ranging from toxicity and hallucination detection to sensitive-data checks (PII, keyword/regex, and custom rules) and prompt injection.

* Customizability: Plug in your own models or integrate with other model or guardrail providers with ease, and tailor the system to match your specific needs.
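To give a feel for what a keyword/regex guardrail rule like the ones listed above does, here is a minimal generic sketch. This is not the Arthur Engine API; the `RegexRule` class, rule names, and patterns are all hypothetical illustrations of the idea:

```python
import re
from dataclasses import dataclass

@dataclass
class RegexRule:
    """A hypothetical regex-based guardrail rule (not Arthur Engine code)."""
    name: str
    pattern: re.Pattern

    def check(self, text: str) -> list[str]:
        # Return every substring of the text that violates this rule.
        return self.pattern.findall(text)

# Example rules: a naive US SSN detector and a blocked-keyword rule.
RULES = [
    RegexRule("ssn", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    RegexRule("blocked_keyword", re.compile(r"(?i)\binternal use only\b")),
]

def evaluate(text: str) -> dict[str, list[str]]:
    """Run every rule against a model response; non-empty lists are violations."""
    return {rule.name: rule.check(text) for rule in RULES}

violations = evaluate("My SSN is 123-45-6789. This doc is Internal Use Only.")
```

A real deployment would pair rules like these with model-based detectors (toxicity, prompt injection), since regexes alone only catch well-structured patterns.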

Having seen first-hand the lack of adequate AI monitoring tools and how often GenAI systems underdeliver in production, we believe this capability shouldn't be exclusive to big-budget organizations. Our mission is to make AI better for everyone, and we believe that opening up this tool will help more people get there.

Check out our examples repo for directions on using the Arthur Engine for various purposes, such as validation during development, real-time guardrails, or performance troubleshooting with enriched logging data. (https://github.com/arthur-ai/engine-examples)

We can't wait to see what you build!

— Zach and Team Arthur

Comments

kacperek0•9mo ago
Cool, I'm running a few GenAI automations, but they're rather unsupervised. So I'm going to try it and check how they're doing.
Lupita___•9mo ago
Thanks for sharing! This looks perfect for teams getting started with monitoring for all model types -- excited to try it out!
serguei•9mo ago
We've been ramping up our GenAI usage for the last ~month at Upsolve and it's becoming a huge pain. There are already a million observability solutions out there, but I like that this one is open source and can detect hallucinations.

Thanks for open sourcing and sharing, excited to try this out!!

fryz•9mo ago
Yeah thanks for the feedback.

We think we stand out from our competitors in the space because we built first for the enterprise case, with consideration for things like data governance, acceptable use, data privacy, and information security, and the system can be deployed easily and reliably in customer-managed environments.

A lot of the products out there today have similar evaluations and metrics, but they either offer only a SaaS solution or require some onerous integration into your application stack.

Because we started with the enterprise first, our goal was to get to value as quickly and easily as possible (to avoid shoulder-surfing over Zoom calls when we don't have access to the service), and we think this plays out well in our product.

cipherchain111•9mo ago
Very cool!
pierniki•9mo ago
Yoo! Hopefully no more "oops our AI just leaked the system prompt" moments thanks to these guardrails!
vparekh1995•9mo ago
Excited to get hands on with this. I've had too many sleepless nights trying to figure out how to track when my agents were hallucinating.
Gabriel_h•9mo ago
Interesting, AI needs much better guardrails and monitoring!
iabouhashish•9mo ago
Very excited to be trying this out! The examples look very useful and excited to tie it up with other open source solutions
jdbtech•9mo ago
Looks great! How does the system detect hallucinations?
fryz•9mo ago
Yeah, great question.

We base our hallucination detection on "groundedness," evaluated on a claim-by-claim basis: we check whether the LLM response can be cited in the provided context (e.g. message history, tool calls, context retrieved from a vector DB, etc.).

We split the response into multiple claims, determine whether each claim needs to be evaluated (i.e. that it isn't just boilerplate), and then check whether the claim is referenced in the context.
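The claim-by-claim flow described above can be sketched roughly as follows. This is a toy illustration only: it uses naive sentence splitting, a hard-coded boilerplate list, and token overlap as the support test, whereas a production groundedness check would use model-based claim extraction and entailment. None of these function names come from the Arthur codebase:

```python
import re

def split_claims(response: str) -> list[str]:
    """Naively split an LLM response into sentence-level claims."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", response) if s.strip()]

def needs_evaluation(claim: str) -> bool:
    """Skip boilerplate (greetings, hedges) that carries no factual content."""
    boilerplate = ("sure", "of course", "let me know")
    return not claim.lower().startswith(boilerplate)

def is_grounded(claim: str, context: str, threshold: float = 0.6) -> bool:
    """Toy support test: fraction of claim tokens that appear in the context."""
    claim_tokens = set(re.findall(r"\w+", claim.lower()))
    context_tokens = set(re.findall(r"\w+", context.lower()))
    if not claim_tokens:
        return True
    return len(claim_tokens & context_tokens) / len(claim_tokens) >= threshold

def ungrounded_claims(response: str, context: str) -> list[str]:
    """Return claims that need evaluation but are unsupported by the context."""
    return [c for c in split_claims(response)
            if needs_evaluation(c) and not is_grounded(c, context)]

context = "The order shipped on May 3 and will arrive within 5 business days."
response = "Sure, happy to help. Your order shipped on May 3. It was sent by drone."
flagged = ungrounded_claims(response, context)
```

Here the boilerplate greeting is skipped, the shipping-date claim is supported by the context, and the drone claim is flagged as a potential hallucination.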

madeleinelane•9mo ago
Love this. More transparency + better tooling is exactly what AI needs right now. Excited to give it a try.