Show HN: StructEval - a structured output evaluation and comparison tool

https://github.com/jhiker/structeval
1•jwesleyharding•1h ago
I'm sharing a simple utility I worked on to make evaluation of structured outputs from LLMs a bit simpler.

For context:

When evaluating structured outputs, you often want composable comparison logic that allows meaningful comparison across different types of outputs (free text, enums, ints, and all the other JSON types). You also want to compare arrays as multisets -- order-agnostic pairwise matching across the elements of each array.
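
To make the multiset idea concrete, here is a minimal sketch of order-agnostic matching with per-type comparison. It is not structeval's code or API: the names `compare` and `compare_multiset` are hypothetical, fuzzy string matching via `difflib` is an assumed choice, and the brute-force search over pairings is only practical for short arrays.

```python
from difflib import SequenceMatcher
from itertools import permutations


def compare(a, b):
    """Return a similarity score in [0, 1] for two JSON-like values."""
    if isinstance(a, dict) and isinstance(b, dict):
        keys = set(a) | set(b)
        if not keys:
            return 1.0
        # Average the per-key scores; a key missing on one side scores 0.
        total = sum(compare(a[k], b[k]) if k in a and k in b else 0.0 for k in keys)
        return total / len(keys)
    if isinstance(a, list) and isinstance(b, list):
        return compare_multiset(a, b)
    if isinstance(a, str) and isinstance(b, str):
        # Free text: fuzzy ratio (an assumed default, swap in any comparator).
        return SequenceMatcher(None, a, b).ratio()
    # Everything else (numbers, booleans, null): exact match on type and value.
    return 1.0 if type(a) is type(b) and a == b else 0.0


def compare_multiset(xs, ys):
    """Score two arrays by the best one-to-one pairing, ignoring order."""
    if not xs and not ys:
        return 1.0
    short, long_ = sorted((xs, ys), key=len)
    # Try every way of assigning the shorter array's items to distinct items
    # of the longer one. This is factorial time, so small arrays only.
    best = max(
        sum(compare(s, l) for s, l in zip(short, perm))
        for perm in permutations(long_, len(short))
    )
    # Unmatched items in the longer array count as misses.
    return best / len(long_)


expected = {"tags": ["a", "b"], "n": 1}
predicted = {"tags": ["b", "a"], "n": 1}
print(compare(expected, predicted))
```

For longer arrays, an assignment solver would be the usual replacement for the permutation search; the scoring and recursion would stay the same.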

What it is: This CLI and Python library (which I called "structeval", not to be confused with the LLM eval framework of the same name -- I may change it!) supports order-agnostic pairwise matching, customizable comparison logic, and recursive metric aggregation. It can also be used to compare outputs when sampling from an LLM with N>1, to measure semantic entropy or find the "median" result. Because it works as a generic JSON tool without requiring a schema, it could also be applied, at least in principle, as a more configurable (and quirkier :) ) alternative to a generic diffing tool like jd.
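
One way to read the "median" result is as the sample that agrees most with all the others. The sketch below shows that reading under stated assumptions: `field_agreement` and `median_sample` are hypothetical names rather than structeval functions, and the similarity used here only handles flat dicts. The mean similarity it returns is a rough dispersion signal, not a formal semantic-entropy estimate.

```python
def field_agreement(a, b):
    """Fraction of keys on which two flat dicts agree (illustrative only)."""
    keys = set(a) | set(b)
    if not keys:
        return 1.0
    same = sum(1 for k in keys if k in a and k in b and a[k] == b[k])
    return same / len(keys)


def median_sample(samples, similarity=field_agreement):
    """Return (index, mean similarity to the others) of the most central sample."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")

    def mean_sim(i):
        others = [s for j, s in enumerate(samples) if j != i]
        return sum(similarity(samples[i], s) for s in others) / len(others)

    # Ties resolve to the earliest sample because max() keeps the first maximum.
    best = max(range(len(samples)), key=mean_sim)
    return best, mean_sim(best)


samples = [
    {"label": "spam", "score": 3},
    {"label": "spam", "score": 4},
    {"label": "ham", "score": 3},
    {"label": "spam", "score": 3},
]
index, agreement = median_sample(samples)
print(index, round(agreement, 3))
```

A low mean similarity for even the most central sample suggests the N outputs disagree substantially, which is the situation a dispersion metric is meant to flag.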

I had struggled with this task in a few contexts and found I was often rewriting a utility like this, so figured it may be helpful for others if encapsulated in a little library.

I'm curious to hear any feedback or suggestions!

Cobalt Qube

https://people.eecs.berkeley.edu/~william/blog/CobaltQube.html
1•naves•30s ago•0 comments

Antic Magazine Interviews Alan Reeve, the Creator of the Diamond OS (1990)

https://computeradsfromthepast.substack.com/p/antic-magazine-interviews-alan-reeve
1•rbanffy•40s ago•0 comments

Running the Google Pixel Camera app on a robustly de-Googled cellphone

https://kevinboone.me/degoogled-gcam.html
1•speckx•1m ago•0 comments

Show HN: Producthunt Alternative

https://www.nxgntools.com/
1•doppelgunner•2m ago•0 comments

Hamilton Smith obituary: co-discovered precise molecular scissors for cutting DNA

https://www.nature.com/articles/d41586-025-03635-y
2•bookofjoe•4m ago•0 comments

Show HN: Metcalfe – private network for marketplace operators

https://jpdpeters.github.io/metcalfe/
1•jpdpeters•4m ago•0 comments

A modern 35mm film scanner for home

https://www.soke.engineering/
1•QiuChuck•4m ago•0 comments

Contacted by the US Secret Service and the AI Surveillance Center Dystopia [video]

https://www.youtube.com/watch?v=qG4ektofzkI
1•Stevvo•5m ago•0 comments

The AI Surveillance Dystopia: Spying, Data Trafficking, & Corruption

https://store.gamersnexus.net/ai-dystopia
1•Stevvo•6m ago•0 comments

A Catalog of Side Effects

https://bernsteinbear.com/blog/compiler-effects/
3•speckx•8m ago•0 comments

Slow moving UX disaster: Passkeys are now required

4•eep_social•9m ago•0 comments

It Is a Perl

https://number-garden-principle.netlify.app/
1•cpuXguy•12m ago•0 comments

Agentic Pelican on a Bicycle

https://www.robert-glaser.de/agentic-pelican-on-a-bicycle/
1•todsacerdoti•12m ago•0 comments

Ring-1T: open-source, SOTA thinking model with a trillion parameters

https://huggingface.co/inclusionAI/Ring-1T
2•dr_kiszonka•13m ago•0 comments

Instead of forcing buy-in, make it fun

https://chrislesinski.com/2025/11/07/make-it-fun/
1•lesinski•13m ago•0 comments

How to Train an LLM: Part 1

https://omkaark.com/posts/llm-1b-1.html
2•parthsareen•14m ago•1 comment

Show HN: SceneReaderAI – Hear your script read aloud by AI voices (free trial)

https://scenereaderai.com
1•jumpstartups•14m ago•0 comments

Will I Make It to the Restaurant Before the Soup Dumplings Get Cold?

https://www.distributedthoughts.org/will-i-make-it-to-the-restaurant-before-the-soup-dumplings-ge...
1•speckx•14m ago•0 comments

Hiring and the Market for Lemons

https://danluu.com/hiring-lemons/
1•xmprt•15m ago•0 comments

Can Elon Musk Read Your X Chat Messages?

https://david.nepozitek.cz/blog/can-elon-musk-read-your-x-chat-messages/
2•david_nepozitek•16m ago•1 comment

Vertical Integration is the only thing that matters

https://becca.ooo/blog/vertical-integration/
1•miguelraz•17m ago•0 comments

Show HN: WorkBill – Modern Alternative to QuickBooks

https://demo.workbill.co/inbox
1•aswinmohanme•18m ago•0 comments

<Ruby>: The Ruby Annotation element

https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/Elements/ruby
2•r3tr0•20m ago•0 comments

Lidar technology improves forest assessment with laser beams

https://phys.org/news/2025-10-eyes-trees-lidar-technology-forest.html
2•PaulHoule•21m ago•0 comments

Reddit mod jailed for sharing movie sex scenes in rare "moral rights" verdict

https://arstechnica.com/tech-policy/2025/11/reddit-mod-jailed-for-sharing-movie-sex-scenes-in-rar...
3•duxup•21m ago•0 comments

Rust 1.91.1

https://blog.rust-lang.org/2025/11/10/Rust-1.91.1/
1•andrewstetsenko•21m ago•0 comments

Disaggregated Database Management Systems

http://muratbuffalo.blogspot.com/2025/11/disaggregated-database-management.html
1•KraftyOne•23m ago•0 comments

Cursor CEO on Scaling and the Coming 'iPhone Moment' for AI Coding

https://founderboat.com/interviews/2025-11-10-cursor/
1•chaosprint•25m ago•0 comments

New Linux patch lets you cancel the hibernation process

https://www.theregister.com/2025/10/22/linux_hibernation_patch/
2•mfilion•26m ago•0 comments

VMS/XDE: An OpenVMS x86 Development Environment for Linux and Windows/WSL

https://www.osnews.com/story/143769/vms-xde-an-openvms-x86-development-environment-for-linux-and-...
1•naves•26m ago•0 comments