Show HN: We tested 9 AI models with 37K+ security tests

https://www.modelred.ai/leaderboard
1•NabilModelRed•1h ago
Hi everyone,

We built ModelRed to test AI models and apps for security issues. We ran 4,182 attack probes against 9 leading models to see what would break.

Leaderboard: https://modelred.ai (no signup, just check it out)

Claude scored 9.5/10 but still failed on medical/financial prompts. Mistral Large scored 3.3/10. The gap between best and worst is huge.

We test for prompt injections, data leaks, jailbreaks, risky tool calls, and domain-specific hacks: basically everything that goes wrong when your LLM has access to real data and APIs. The platform runs these tests continuously and blocks CI/CD deployments when scores drop.
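To make the "block the deployment when the score drops" idea concrete, here is a rough sketch of what such a CI gate could look like. Everything in it is an assumption for illustration: the function names, the 0-10 score scale applied to a single number, the `max_drop` tolerance, and the `floor` value are hypothetical and are not taken from ModelRed's actual API.

```python
def should_block(current: float, baseline: float,
                 max_drop: float = 0.5, floor: float = 7.0) -> bool:
    """Decide whether a deployment should be blocked (hypothetical policy).

    current  -- security score of the build under test (assumed 0-10 scale)
    baseline -- score of the last accepted build
    max_drop -- largest score regression tolerated between builds
    floor    -- absolute minimum score, regardless of the baseline
    """
    regressed = (baseline - current) > max_drop
    below_floor = current < floor
    return regressed or below_floor


def gate_exit_code(current: float, baseline: float) -> int:
    """Map the decision to a process exit code: 0 lets CI continue, 1 fails the job."""
    if should_block(current, baseline):
        print(f"BLOCKED: score {current:.1f} (baseline {baseline:.1f})")
        return 1
    print(f"OK: score {current:.1f} (baseline {baseline:.1f})")
    return 0
```

A pipeline step would pass the result to `sys.exit(...)`, since most CI systems treat a non-zero exit code as a failed job and stop the deployment there.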

Works with any provider (OpenAI, Anthropic, AWS, Hugging Face endpoints, OpenRouter, etc.).

Looking for around 20 people/teams shipping AI in production to be early design partners: help us figure out which features actually matter, contribute attack vectors, and shape the roadmap.

Weirdest finding: the same prompt injection works on 60% of models because everyone copies the same defense patterns.

Happy to answer questions about methodology, specific vulnerabilities, or becoming a design partner.

The Most Fascinating Findings After a Quarter Century of Science in the ISS

https://nautil.us/the-most-fascinating-findings-after-a-quarter-century-of-science-in-the-iss-124...
1•Bender•53s ago•0 comments

The Web Animation Performance Tier List

https://motion.dev/blog/web-animation-performance-tier-list
1•rustc•1m ago•0 comments

Show HN: NanaVis – Upload one image, describe the change

https://nanavis.com/
1•qqxufo•7m ago•0 comments

Show HN: Serverless platform for inference of time-series foundation models

https://faim.it.com/
1•ChernovAndrei•8m ago•0 comments

Fintech CEO caught manipulating social media likes

https://patrickstoica.substack.com/p/fintech-ceo-caught-manipulating-social
4•puzzlewhistle•9m ago•0 comments

TRON Bag (2012)

https://learn.adafruit.com/tron-bag/overview
1•bariumbitmap•9m ago•0 comments

Beads: Beads – A memory upgrade for your coding agent

https://github.com/steveyegge/beads
1•PaulHoule•10m ago•0 comments

The rise of 'Slow AI': Why devs should stop speedrunning stupid

https://www.coderabbit.ai/blog/the-rise-of-slow-ai-why-devs-should-stop-speedrunning-stupid
1•aravindputrevu•11m ago•0 comments

The Most Magical Formula in the World- Exploring the Power of Residues

https://www.cantorsparadise.com/complex-analysis-infinite-series-integration-formula-fcc9b73143b3
1•malshe•11m ago•0 comments

Understanding multi GPU Parallelism paradigms

https://datta0.github.io/posts/understanding-multi-gpu-parallelism-paradigms/
2•allenleee•13m ago•0 comments

A new patch could help to heal the heart

https://news.mit.edu/2025/new-patch-could-help-heal-heart-1104
2•Marceltan•13m ago•0 comments

You can just read 25 books

https://a16z.substack.com/p/you-can-just-read-25-books
1•rawgabbit•13m ago•0 comments

Apple Silicon and the Developer Dilemma

https://sagittarius-a.org/blog/apple_dilemma/
1•xkv6•14m ago•0 comments

Show HN: TRex – macOS OCR menu bar app, now with 100+ languages

https://github.com/amebalabs/TRex
1•melonamin•14m ago•0 comments

U.S. Private Sector Added 42,000 Jobs in October, Says Payroll Processor

https://www.wsj.com/economy/jobs/u-s-hiring-rises-for-first-time-since-july-adp-reports-3df1d712
3•JumpCrisscross•14m ago•0 comments

Bikeshedding `Handle` and other follow-up thoughts

https://smallcultfollowing.com/babysteps/blog/2025/11/05/bikeshedding-handle/
1•emschwartz•16m ago•0 comments

Buildkite Broke Up (With) Its 56 TiB Database [video]

https://www.youtube.com/watch?v=G2xZTVUFfgM
2•intheairtonight•16m ago•0 comments

Ask HN: How do you feel about the increasing amount of AI comments on HN?

3•Gooblebrai•18m ago•6 comments

My chilling week on Roblox: sexually assaulted and shat on as a child avatar

https://www.theguardian.com/games/2025/nov/05/roblox-game-robux-children-child-kids-safety-parent...
2•c420•20m ago•0 comments

Show HN: sudocode – manage specs, tasks, and context-as-code for coding agents

https://github.com/sudocode-ai/sudocode
5•alexsngai•22m ago•0 comments

Bear attack survival tips released in Japan as encounters surge

https://www.theguardian.com/world/2025/oct/27/bear-attack-survival-tips-released-in-japan-as-enco...
1•tosh•22m ago•0 comments

Create secure data rooms in minutes with your existing repo

https://doclair.io/
1•adig_279•23m ago•0 comments

Show HN: Zee – AI that interviews everyone so you only meet the best

https://www.zeeda.com/
1•davecarruthers•25m ago•0 comments

OpenAI ends legal and medical advice on ChatGPT

https://www.ctvnews.ca/sci-tech/article/openai-updates-policies-so-chatgpt-wont-provide-medical-o...
4•randycupertino•26m ago•1 comment

Windows 11 Store gets Ninite-style multi-app installer feature

https://www.bleepingcomputer.com/news/microsoft/windows-11-store-gets-ninite-style-multi-app-inst...
1•speckx•27m ago•0 comments

Experimenting with Vibration Sensors – Characterize RPM of Spinning Devices

https://community.element14.com/challenges-projects/design-challenges/experimenting-with-vibratio...
1•o4c•30m ago•0 comments

I Built a Local Dev Tool for ChatGPT Apps SDK

https://itsnikhil.github.io/blog/posts/oai-app-composer/
1•itsnikhil•31m ago•1 comment

Benchmarking the Cost of Java's EnumSet – A Second Look

https://www.kinnen.de/blog/enumset-benchmark/
3•birdculture•34m ago•0 comments

Show HN: Code tours and feedback with your Agent in VSCode – local and cloudless

https://www.intraview.ai/hn-demo
5•cyrusradfar•34m ago•0 comments

Upbeat Technology's RISC-V MCU Takes Flight with Near-Threshold Computing

https://www.allaboutcircuits.com/news/upbeat-technologys-risc-v-mcu-takes-flight-with-near-thresh...
1•warrenm•35m ago•0 comments