Small LLMs can outperform GPT-4 at detecting jailbreaks

https://romaingrx.com/blog/llm-as-a-jailbreak-judge/
2•romaingrx•3h ago

Comments

romaingrx•3h ago
I tested whether smaller, cheaper models could replace GPT-4 for jailbreak detection.

Results: Mistral 13B with prompt optimization achieved 81.67% accuracy vs GPT-4's 78.33% baseline, a gain of about 3.3 percentage points over GPT-4 (and roughly 13 points over Mistral's own 69% with basic prompting), while being ~20x cheaper to run.

Tested 3 approaches on 300 HarmBench samples:
- Basic prompting: GPT-4 wins (78% vs 69%)
- DSPy prompt optimization: Mistral 13B wins (82% vs 78%)
- Multifaceted evaluation: Marginal gains (73%)

Code: https://github.com/romaingrx/llm-as-a-jailbreak-judge
Detailed blog post: https://romaingrx.com/blog/llm-as-a-jailbreak-judge

Looking for feedback on the methodology and whether this cost/performance tradeoff would be useful for content moderation at scale.
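
For readers comparing the quoted figures, here is a minimal sketch of the arithmetic, using only the accuracy numbers stated in the comment. The helper names `point_gap` and `relative_gain` are made up for illustration and do not come from the linked repository; the 69% figure is the rounded basic-prompting score quoted above.

```python
# Accuracy figures as quoted in the comment (percent).
GPT4_BASELINE = 78.33
MISTRAL_OPTIMIZED = 81.67
MISTRAL_BASIC = 69.0  # rounded value from the comment


def point_gap(a: float, b: float) -> float:
    """Absolute difference in percentage points between two accuracies."""
    return round(a - b, 2)


def relative_gain(a: float, b: float) -> float:
    """Relative improvement of a over b, expressed in percent."""
    return round((a - b) / b * 100, 2)


if __name__ == "__main__":
    print("Optimized Mistral vs GPT-4 (points):",
          point_gap(MISTRAL_OPTIMIZED, GPT4_BASELINE))
    print("Optimized Mistral vs GPT-4 (relative %):",
          relative_gain(MISTRAL_OPTIMIZED, GPT4_BASELINE))
    print("Optimized vs basic Mistral (points):",
          point_gap(MISTRAL_OPTIMIZED, MISTRAL_BASIC))
```

Keeping percentage points and relative percentages separate makes it easier to see which comparison a headline number refers to.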

Apple CEO's $100B Commitment

https://fortune.com/2025/08/07/apple-trump-tim-cook-100-billion-manufacturing-gift-plaque-gold/
1•01-_-•52s ago•0 comments

Uploading PDF via Files API and using in Streaming gives 400 bad request

https://github.com/openai/openai-python/issues/2472
1•peterkelly•1m ago•0 comments

Great Depression Facts

https://www.fdrlibrary.org/great-depression-facts
1•mooreds•1m ago•0 comments

Trade Desk Sentiment Collapses as Specter of Amazon Looms

https://www.bloomberg.com/news/articles/2025-08-08/trade-desk-sentiment-collapses-as-specter-of-amazon-looms
1•thm•1m ago•0 comments

Show HN: I built a (legit) AI mortgage document analyzer that saves you money

https://oxford.loan-estimate-analysis.morfi.com/
1•matthew-morfi•2m ago•0 comments

Ferrous: Redis-Compatible Server in Rust That Outperforms Valkey

https://github.com/iGentAI/ferrous
2•seanmmward•3m ago•0 comments

The Next Step for AI – Full Personal Interaction Capture

https://blog.automaton2000.com/2025/08/the-next-step-for-ai-full-personal.html
1•hydroreadsstuff•3m ago•0 comments

Philosophy of Information

https://plato.stanford.edu/entries/information/
2•mathattack•3m ago•0 comments

Show HN: I built a tool that lets you summon AI in any app or website

https://useinset.com
1•blaumaus•5m ago•0 comments

Spatial Joins in DuckDB

https://duckdb.org/2025/08/08/spatial-joins.html
3•tanelpoder•5m ago•0 comments

Programming with AI: You're Probably Doing It Wrong

https://www.devroom.io/2025/08/08/programming-with-ai-youre-probably-doing-it-wrong/
1•ariejan•6m ago•0 comments

'Stagflation is coming to the U.S.'

https://www.morningstar.com/news/marketwatch/20250808104/stagflation-is-coming-to-the-us-says-this-economist-heres-what-it-means-for-the-dollar-bonds-and-stocks
3•mooreds•6m ago•1 comment

The Sunday Morning Post: Whatever Happened to Serial Killers?

https://www.derekthompson.org/p/the-sunday-morning-post-whatever
1•gamechangr•7m ago•0 comments

Show HN: Live website annotations with source code

https://annotateweb.com/
1•tonysurfly•7m ago•0 comments

How Hungry Is AI? Benchmarking Energy, Water, and Carbon Footprint of LLM Infere

https://arxiv.org/abs/2505.09598
1•raybb•8m ago•0 comments

Show HN: I built a free fast-paced match-3 game I think you'll like

https://sqrz.app
2•jasonjmcghee•9m ago•0 comments

Europe doesn't have a startup problem, it has a storytelling problem

https://sifted.eu/articles/europe-storytelling-problem
3•aiaib323•9m ago•0 comments

OpenAI looking to spend after GPT-5 launch and is 'willing to run the loss'

https://www.cnbc.com/2025/08/08/chatgpt-gpt-5-openai-altman-loss.html
3•belter•10m ago•1 comment

OpenAI beats Elon Musk's Grok in AI chess tournament

https://www.bbc.com/news/articles/ce830l92p68o
3•onemoresoop•10m ago•0 comments

Big Tech's "Sovereign Cloud" promises just collapsed – in their own words

https://nextcloud.com/blog/big-techs-sovereign-cloud-promises-just-collapsed-in-their-own-words/
2•maverick74•10m ago•0 comments

Lerobot Now on Pip Install

https://twitter.com/ClementDelangue/status/1953836962563207644
1•clmnt•11m ago•0 comments

Rats wreak havoc on California almonds – industry suffers $300M in damage

https://www.sfchronicle.com/california/article/california-almond-orchards-rat-infestation-20807124.php
2•littlexsparkee•11m ago•1 comment

Toyota to raise US auto prices by average $270 from July

https://www.reuters.com/business/autos-transportation/toyota-raise-us-auto-prices-by-more-than-200-july-bloomberg-news-reports-2025-06-21/
3•CGMthrowaway•11m ago•0 comments

How much better will vibecoding get

https://openai.com/research/
2•AbdMog•12m ago•2 comments

FedRAMP Marketplace

https://marketplace.fedramp.gov/products
1•mooreds•13m ago•0 comments

Simple Is a Scam

https://nocomplexity.com/simple-is-a-scam/
2•runningmike•14m ago•0 comments

Open source Mastodon begins raising funds with new in-app donation feature

https://techcrunch.com/2025/07/23/open-source-x-rival-mastodon-begins-raising-funds-with-new-in-app-donation-feature/
1•PaulHoule•14m ago•0 comments

David Foster Wallace interview on Charlie Rose (1997) [video]

https://www.youtube.com/watch?v=GopJ1x7vK2Q
1•neko_ranger•14m ago•0 comments

Propshaft Performance Issues on Rails 8

https://www.brethorsting.com/blog/2025/08/propshaft-performance-issues-on-rails-8/
1•aaronbrethorst•15m ago•0 comments

Intel CEO Lip-Bu Tan Is Already at Odds with His Board

https://www.wsj.com/tech/intel-ceo-lip-bu-tan-trump-board-9cc08631
1•walterbell•15m ago•1 comment