Most LLMs Are Failing Key Real-World Safety Tests. Here's the Data

https://medium.com/aymara-ai/the-aymara-llm-risk-responsibility-matrix-5a243fcf7e38
2•gyanveda•4h ago

Comments

gyanveda•4h ago
We tested 20 of the most popular LLMs against 10 real-world risks, including:

- Privacy & Impersonation

- Unqualified Professional Advice

- Child & Animal Abuse

- Misinformation

What we found:

- Anthropic's Claude Haiku 3.5 was the safest, scoring 86% (others dropped as low as 52%)

- Privacy & Impersonation was the top failure point, with some models failing 91% of the time

- Most models performed best on misinformation, hate speech, and malicious use

- No model is 100% safe, but Anthropic, OpenAI, Amazon, and Google consistently outperform peers

We built this matrix (and dev tools to build your own) to help teams measure AI risk more easily.

A simple monthly injection allows mice to live 25% longer and free from diseases

https://english.elpais.com/science-tech/2024-07-17/a-simple-monthly-injection-allows-mice-to-live-25-longer-and-free-from-diseases.html
1•speckx•43s ago•0 comments

Symbolic 'science fair' showcases research cut by Trump team

https://www.nature.com/articles/d41586-025-02164-y
1•Bluestein•44s ago•0 comments

Scientists 3D print tumors for cancer research

https://www.tomshardware.com/3d-printing/scientists-3d-print-tumors-for-cancer-research-tissuetinker-using-3d-bioprinting-to-create-miniature-models-of-healthy-and-diseased-tissue-for-side-by-side-comparison-backed-by-mcgill
1•giuliomagnifico•1m ago•0 comments

Perplexity just launched Comet, an AI web browser

https://www.theverge.com/news/703037/perplexity-ai-web-browser-comet-launch
2•cpeterso•6m ago•0 comments

Ancient pathogen became deadlier when humans started wearing wool

https://www.nature.com/articles/d41586-025-01631-w
1•rntn•9m ago•0 comments

OpenAI to release web browser in challenge to Google Chrome

https://www.reuters.com/business/media-telecom/openai-release-web-browser-challenge-google-chrome-2025-07-09/
2•jmsflknr•10m ago•0 comments

LangChain is about to become a unicorn, sources say

https://techcrunch.com/2025/07/08/langchain-is-about-to-become-a-unicorn-sources-say/
2•clemo_ra•11m ago•0 comments

Finding PBHs Using the LSST Will Be a Statistical Challenge

https://www.universetoday.com/articles/finding-pbhs-using-the-lsst-will-be-a-statistical-challenge
1•rbanffy•12m ago•0 comments

<Now Go Bang > the REM-Arkable Misadventures of List

https://www.masswerk.at/nowgobang/2025/the-remarkable-misadventures-of-list
1•rbanffy•13m ago•0 comments

brotab: Control your browser's tabs from the command line

https://github.com/balta2ar/brotab
1•pseudalopex•13m ago•0 comments

Desktop Publishing Tools That Didn't Make It

https://tedium.co/2022/10/12/forgotten-desktop-publishing-tools-history/
1•rbanffy•13m ago•0 comments

The Hungry, Hungry AI Model

https://tomtunguz.com/input-output-ratio/
1•speckx•14m ago•0 comments

Program for Framework 16 LED Matrix

https://boyne.dev/projects/fwmm.html
1•DedFishy•15m ago•1 comment

Strategic connection between JuliaHub, Dyad and Julia open source community

https://juliahub.com/blog/the-strategic-connection-between-juliahub-dyad-and-the-julia-open-source-community
1•darboux•17m ago•0 comments

Show HN: Browse Developer Portfolios

https://www.webportfolios.dev
1•yeahimjt•17m ago•0 comments

Generative Blocks World: Moving Things Around in Pictures

https://arxiv.org/abs/2506.20703
2•PaulHoule•20m ago•0 comments

Pope Leo Signed a Popplio 'Pokémon' Card

https://gizmodo.com/pope-leo-signed-popplio-card-pokemon-2000626305
3•ulrischa•22m ago•0 comments

Show HN: AI-powered simulations to practice real-life decisions (free sample)

https://promptquest.co/first-time-manager-simulation-free/
2•ronstark•22m ago•1 comment

AI Makes It Look Good. Craft Makes It Matter

https://story.vjy.me/53
2•realvjy•23m ago•0 comments

Solar becomes top source of electricity in California

https://pv-magazine-usa.com/2025/07/09/solar-becomes-top-source-of-electricity-in-california/
10•martinpw•23m ago•6 comments

Researchers studied turtle necropsies for cancer to overturn theory

https://www.discoverwildlife.com/animal-facts/reptiles/study-finds-cancer-extremely-rare-in-turtles
1•thunderbong•25m ago•0 comments

Databricks-SQL at Your Agent's Fingertips via MCP in GitHub Copilot

https://aymenfurter.ch/articles/databricks-sql-at-your-agents-fingertips-via-mcp-in-github-copilot/
1•aymenfurter•25m ago•0 comments

Implantable device could save diabetes patients from dangerously low blood sugar

https://news.mit.edu/2025/implantable-device-could-save-diabetes-patients-low-blood-sugar-0709
3•gnabgib•28m ago•0 comments

Five Things to Know About Record Copper Prices

https://www.wsj.com/finance/commodities-futures/copper-prices-tariffs-explained-a6644db7
5•sandwichsphinx•29m ago•0 comments

2025 in LLMs so far, illustrated by Pelicans on Bicycles – Simon Willison [video]

https://www.youtube.com/watch?v=YpY83-kA7Bo
2•swyx•29m ago•1 comment

Amelia Earhart Aircraft Expedition: Satellite Photos Spot Long-Lost Wreckage?

https://www.LeonardDavid.com/amelia-earhart-aircraft-expedition-satellite-photos-spot-long-lost-wreckage/
1•speckx•31m ago•1 comment

22 stone blocks, pieces of Lighthouse of Alexandria, pulled from sea

https://allthatsinteresting.com/lighthouse-of-alexandria-remains
1•bookofjoe•32m ago•0 comments

How US Export Controls Have (and Haven't) Curbed Chinese AI

https://ai-frontiers.org/articles/us-chip-export-controls-china-ai
3•jonbaer•32m ago•0 comments

Recipients of a U.S. Climate Science Fellowship Are Put on Unpaid Leave

https://www.nytimes.com/2025/07/09/climate/noaa-fellows-unpaid-leave.html
4•jmsflknr•34m ago•0 comments

The plight of the misunderstood atomic memory ordering

https://www.grayolson.me/blog/posts/misunderstood-memory-ordering/
3•fanf2•35m ago•0 comments