I am SUPER EXCITED to publish the 121st episode of the Weaviate Podcast featuring Leonard Tang, Co-Founder of Haize Labs!
Evals are one of the hottest topics out there for people building AI systems. Leonard is absolutely at the cutting edge of this, and I learned so much from our chat!
The podcast covers tons of interesting nuggets around how LLM-as-Judge / Reward Model systems are evolving, including ideas such as UX for Evals, Contrastive Evaluations, Judge Ensembles, Debate Judges, Curating Eval Sets and Adversarial Testing, and of course... Scaling Judge-Time Compute!!
I highly recommend checking out their new library, `Verdict`, a declarative framework for specifying and executing compound LLM-as-Judge systems.
I hope you find the podcast useful! As always, more than happy to discuss these ideas further with you!
YouTube: https://www.youtube.com/watch?v=KFrKLkJzNDQ
Spotify: https://creators.spotify.com/pod/show/weaviate/episodes/Haize-Labs-with-Leonard-Tang---Weaviate-Podcast-121-e32mts3