frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Zram as Swap

https://wiki.archlinux.org/title/Zram#Usage_as_swap
1•seansh•5m ago•0 comments

Green’s Dictionary of Slang - Five hundred years of the vulgar tongue

https://greensdictofslang.com/
1•mxfh•6m ago•0 comments

Nvidia CEO Says AI Capital Spending Is Appropriate, Sustainable

https://www.bloomberg.com/news/articles/2026-02-06/nvidia-ceo-says-ai-capital-spending-is-appropr...
1•virgildotcodes•9m ago•2 comments

Show HN: StyloShare – privacy-first anonymous file sharing with zero sign-up

https://www.styloshare.com
1•stylofront•10m ago•0 comments

Part 1 the Persistent Vault Issue: Your Encryption Strategy Has a Shelf Life

1•PhantomKey•14m ago•0 comments

Show HN: Teleop_xr – Modular WebXR solution for bimanual robot teleoperation

https://github.com/qrafty-ai/teleop_xr
1•playercc7•17m ago•1 comments

The Highest Exam: How the Gaokao Shapes China

https://www.lrb.co.uk/the-paper/v48/n02/iza-ding/studying-is-harmful
1•mitchbob•21m ago•1 comments

Open-source framework for tracking prediction accuracy

https://github.com/Creneinc/signal-tracker
1•creneinc•23m ago•0 comments

India's Sarvan AI LLM launches Indic-language focused models

https://x.com/SarvamAI
2•Osiris30•24m ago•0 comments

Show HN: CryptoClaw – open-source AI agent with built-in wallet and DeFi skills

https://github.com/TermiX-official/cryptoclaw
1•cryptoclaw•27m ago•0 comments

ShowHN: Make OpenClaw respond in Scarlett Johansson’s AI Voice from the Film Her

https://twitter.com/sathish316/status/2020116849065971815
1•sathish316•29m ago•2 comments

CReact Version 0.3.0 Released

https://github.com/creact-labs/creact
1•_dcoutinho96•31m ago•0 comments

Show HN: CReact – AI Powered AWS Website Generator

https://github.com/creact-labs/ai-powered-aws-website-generator
1•_dcoutinho96•32m ago•0 comments

The rocky 1960s origins of online dating (2025)

https://www.bbc.com/culture/article/20250206-the-rocky-1960s-origins-of-online-dating
1•1659447091•37m ago•0 comments

Show HN: Agent-fetch – Sandboxed HTTP client with SSRF protection for AI agents

https://github.com/Parassharmaa/agent-fetch
1•paraaz•38m ago•0 comments

Why there is no official statement from Substack about the data leak

https://techcrunch.com/2026/02/05/substack-confirms-data-breach-affecting-email-addresses-and-pho...
8•witnessme•42m ago•1 comments

Effects of Zepbound on Stool Quality

https://twitter.com/ScottHickle/status/2020150085296775300
2•aloukissas•46m ago•1 comments

Show HN: Seedance 2.0 – The Most Powerful AI Video Generator

https://seedance.ai/
2•bigbromaker•49m ago•0 comments

Ask HN: Do we need "metadata in source code" syntax that LLMs will never delete?

1•andrewstuart•55m ago•1 comments

Pentagon cutting ties w/ "woke" Harvard, ending military training & fellowships

https://www.cbsnews.com/news/pentagon-says-its-cutting-ties-with-woke-harvard-discontinuing-milit...
6•alephnerd•57m ago•2 comments

Can Quantum-Mechanical Description of Physical Reality Be Considered Complete? [pdf]

https://cds.cern.ch/record/405662/files/PhysRev.47.777.pdf
1•northlondoner•58m ago•1 comments

Kessler Syndrome Has Started [video]

https://www.tiktok.com/@cjtrowbridge/video/7602634355160206623
2•pbradv•1h ago•0 comments

Complex Heterodynes Explained

https://tomverbeure.github.io/2026/02/07/Complex-Heterodyne.html
4•hasheddan•1h ago•0 comments

MemAlign: Building Better LLM Judges from Human Feedback with Scalable Memory

https://www.databricks.com/blog/memalign-building-better-llm-judges-human-feedback-scalable-memory
1•superchink•1h ago•0 comments

CCC (Claude's C Compiler) on Compiler Explorer

https://godbolt.org/z/asjc13sa6
2•LiamPowell•1h ago•0 comments

Homeland Security Spying on Reddit Users

https://www.kenklippenstein.com/p/homeland-security-spies-on-reddit
41•duxup•1h ago•10 comments

Actors with Tokio (2021)

https://ryhl.io/blog/actors-with-tokio/
1•vinhnx•1h ago•0 comments

Can graph neural networks for biology realistically run on edge devices?

https://doi.org/10.21203/rs.3.rs-8645211/v1
1•swapinvidya•1h ago•1 comments

Deeper into the shareing of one air conditioner for 2 rooms

1•ozzysnaps•1h ago•0 comments

Weatherman introduces fruit-based authentication system to combat deep fakes

https://www.youtube.com/watch?v=5HVbZwJ9gPE
3•savrajsingh•1h ago•0 comments
Open in hackernews

Show HN: Simulation-Based Testing for Agents Using AG-UI Protocol

https://github.com/langwatch/scenario
6•0xdeafcafe•7mo ago

Comments

rchaves•7mo ago
Hello HN!

tl;dr: We built Scenario, an open-source testing library for AI agents. It simulates real conversations with your agent, its code-driven, and lets you assert anything mid-dialogue. Repo: https://github.com/langwatch/scenario Docs: https://scenario.langwatch.ai/

I'm Rogerio, founder of LangWatch, I've been helping many customers building LLM applications in this past two years and worked with Alex on this.

Most of the efforts for LLM quality so far were about evaluations, single-turn, there was nothing actually good to test agents, it all felt forced, but we believe we cracked it now, we have built an agent testing library that test your agent by simulating a user and playing a conversation back and forth with it.

One of the key challenges there was that we had to make it compatible with all the 273+ AI frameworks (and counting) there are. Luckliy AG-UI protocol popped up recently, standardizing agents frameworks and UI interactions, this is perfect, because at the end of the day, we want our user simulator to "see" just the same that the user sees.

So we made Scenario in a way that is really easy to connect to any agent no matter the tech stack, from a simple string <-> string connection, to openai standard messages format, to AG-UI.

The other key challenge was to balance testing the open-endedness of agents vs having reliable cases you want to test, so we worked a lot on thinking through the autopilot simulation vs the fully scripted one, and here again, the goal was complete interoperability. At the end of the day, the design we achieved was simply having lambdas, that you can call at any point of the test, so it's just code, where you can connect any other evaluation or assertion tool you want, we are not restrictive.

Check out the repo and the docs, we would love to get some feedback in here!

Repo: https://github.com/langwatch/scenario Docs: https://scenario.langwatch.ai/