
Show HN: High-performance GenAI engine now open source

https://github.com/arthur-ai/arthur-engine
22•fryz•9mo ago
Hey HN

After one too many customer fire drills over hallucinating or insecure AI models, we built a system to catch these issues before they reach production. The Arthur Engine has been running at companies ranging from the Fortune 100 to AI-native startups for the past two years, putting security controls around more than 10 billion production tokens every month. We're now opening this service up to developers, so you can use an enterprise-grade solution for guardrails and evals as a service, all for free.

Get it on GitHub (https://github.com/arthur-ai/arthur-engine) to start evaluating your models today.

Highlights of the Arthur Engine include:

* Built for speed and scale: sub-second p90 latencies at well over 100 RPS.

* Made for full lifecycle support: Ideal for pre-production validation, real-time guardrails, and post-production monitoring.

* Ease of use: designed to be easy for anyone to run and deploy, whether you're working locally during development or deploying it in a horizontally scaling architecture for large-scale workloads.

* Unification of generative and traditional AI: the Arthur Engine can evaluate a diverse range of models, from LLMs and agentic AI systems to binary classifiers, regression models, recommender systems, forecasting models, and more.

* Content-specific guardrail and detection features: ranging from toxicity and hallucination detection to sensitive-data checks (PII, keyword/regex, and custom rules) and prompt injection.

* Customizability: Plug in your own models or integrate with other model or guardrail providers with ease, and tailor the system to match your specific needs.
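To give a feel for what a keyword/regex guardrail rule like the ones listed above does, here is a minimal generic sketch. This is not the Arthur Engine API; the `RegexRule` class, rule names, and patterns are all hypothetical illustrations of the idea:

```python
import re
from dataclasses import dataclass

@dataclass
class RegexRule:
    """A hypothetical regex-based guardrail rule (not Arthur Engine code)."""
    name: str
    pattern: re.Pattern

    def check(self, text: str) -> list[str]:
        # Return every substring of the text that violates this rule.
        return self.pattern.findall(text)

# Example rules: a naive US SSN detector and a blocked-keyword rule.
RULES = [
    RegexRule("ssn", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    RegexRule("blocked_keyword", re.compile(r"(?i)\binternal use only\b")),
]

def evaluate(text: str) -> dict[str, list[str]]:
    """Run every rule against a model response; non-empty lists are violations."""
    return {rule.name: rule.check(text) for rule in RULES}

violations = evaluate("My SSN is 123-45-6789. This doc is Internal Use Only.")
```

A real deployment would pair rules like these with model-based detectors (toxicity, prompt injection), since regexes alone only catch well-structured patterns.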

Having seen first-hand the lack of adequate AI monitoring tools and how often GenAI systems underdeliver in production, we believe this capability shouldn't be exclusive to big-budget organizations. Our mission is to make AI better for everyone, and we believe that opening up this tool will help more people get there.

Check out our examples repo for directions on using the Arthur Engine for various purposes, such as validation during development, real-time guardrails, or performance troubleshooting with enriched logging data. (https://github.com/arthur-ai/engine-examples)

We can't wait to see what you build!

— Zach and Team Arthur

Comments

kacperek0•9mo ago
Cool, I'm running a few GenAI automations, but they're rather unsupervised. So I'm going to try it and check how they're doing.
Lupita___•9mo ago
Thanks for sharing! This looks perfect for teams getting started with monitoring for all model types -- excited to try it out!
serguei•9mo ago
We've been ramping up our GenAI usage for the last ~month at Upsolve and it's becoming a huge pain. There are already a million observability solutions out there, but I like that this one is open source and can detect hallucinations.

Thanks for open sourcing and sharing, excited to try this out!!

fryz•9mo ago
Yeah thanks for the feedback.

We think we stand out from our competitors in the space because we built first for the enterprise case, with consideration for things like data governance, acceptable use, data privacy, and information security, and the system can be deployed easily and reliably in customer-managed environments.

A lot of the products out there today have similar evaluations and metrics, but they either offer only a SaaS solution or require some onerous integration into your application stack.

Because we started with the enterprise first, our goal was to get to value as quickly and easily as possible (to avoid shoulder-surfing over Zoom calls when we don't have access to the service), and we think this plays out well in our product.

cipherchain111•9mo ago
Very cool!
pierniki•9mo ago
Yoo! Hopefully no more "oops our AI just leaked the system prompt" moments thanks to these guardrails!
vparekh1995•9mo ago
Excited to get hands on with this. I've had too many sleepless nights trying to figure out how to track when my agents were hallucinating.
Gabriel_h•9mo ago
Interesting, AI needs much better guardrails and monitoring!
iabouhashish•9mo ago
Very excited to be trying this out! The examples look very useful and excited to tie it up with other open source solutions
jdbtech•9mo ago
Looks great! How does the system detect hallucinations?
fryz•9mo ago
Yeah, great question.

We base our hallucination detection on "groundedness," evaluated on a claim-by-claim basis: we check whether the LLM response can be cited in the provided context (e.g. message history, tool calls, context retrieved from a vector DB, etc.).

We split the response into multiple claims, determine whether each claim needs to be evaluated (i.e. that it isn't just boilerplate), and then check whether the claim is referenced in the context.
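The claim-by-claim flow described above can be sketched roughly as follows. This is a toy illustration only: it uses naive sentence splitting, a hard-coded boilerplate list, and token overlap as the support test, whereas a production groundedness check would use model-based claim extraction and entailment. None of these function names come from the Arthur codebase:

```python
import re

def split_claims(response: str) -> list[str]:
    """Naively split an LLM response into sentence-level claims."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", response) if s.strip()]

def needs_evaluation(claim: str) -> bool:
    """Skip boilerplate (greetings, hedges) that carries no factual content."""
    boilerplate = ("sure", "of course", "let me know")
    return not claim.lower().startswith(boilerplate)

def is_grounded(claim: str, context: str, threshold: float = 0.6) -> bool:
    """Toy support test: fraction of claim tokens that appear in the context."""
    claim_tokens = set(re.findall(r"\w+", claim.lower()))
    context_tokens = set(re.findall(r"\w+", context.lower()))
    if not claim_tokens:
        return True
    return len(claim_tokens & context_tokens) / len(claim_tokens) >= threshold

def ungrounded_claims(response: str, context: str) -> list[str]:
    """Return claims that need evaluation but are unsupported by the context."""
    return [c for c in split_claims(response)
            if needs_evaluation(c) and not is_grounded(c, context)]

context = "The order shipped on May 3 and will arrive within 5 business days."
response = "Sure, happy to help. Your order shipped on May 3. It was sent by drone."
flagged = ungrounded_claims(response, context)
```

Here the boilerplate greeting is skipped, the shipping-date claim is supported by the context, and the drone claim is flagged as a potential hallucination.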

madeleinelane•9mo ago
Love this. More transparency + better tooling is exactly what AI needs right now. Excited to give it a try.