frontpage.

What were the first animals? The fierce sponge–jelly battle that just won't end

https://www.nature.com/articles/d41586-026-00238-z
2•beardyw•8m ago•0 comments

Sidestepping Evaluation Awareness and Anticipating Misalignment

https://alignment.openai.com/prod-evals/
1•taubek•8m ago•0 comments

OldMapsOnline

https://www.oldmapsonline.org/en
1•surprisetalk•10m ago•0 comments

What It's Like to Be a Worm

https://www.asimov.press/p/sentience
2•surprisetalk•10m ago•0 comments

Don't go to physics grad school and other cautionary tales

https://scottlocklin.wordpress.com/2025/12/19/dont-go-to-physics-grad-school-and-other-cautionary...
1•surprisetalk•10m ago•0 comments

Lawyer sets new standard for abuse of AI; judge tosses case

https://arstechnica.com/tech-policy/2026/02/randomly-quoting-ray-bradbury-did-not-save-lawyer-fro...
2•pseudolus•11m ago•0 comments

AI anxiety batters software execs, costing them combined $62B: report

https://nypost.com/2026/02/04/business/ai-anxiety-batters-software-execs-costing-them-62b-report/
1•1vuio0pswjnm7•11m ago•0 comments

Bogus Pipeline

https://en.wikipedia.org/wiki/Bogus_pipeline
1•doener•12m ago•0 comments

Winklevoss twins' Gemini crypto exchange cuts 25% of workforce as Bitcoin slumps

https://nypost.com/2026/02/05/business/winklevoss-twins-gemini-crypto-exchange-cuts-25-of-workfor...
1•1vuio0pswjnm7•13m ago•0 comments

How AI Is Reshaping Human Reasoning and the Rise of Cognitive Surrender

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6097646
3•obscurette•13m ago•0 comments

Cycling in France

https://www.sheldonbrown.com/org/france-sheldon.html
1•jackhalford•15m ago•0 comments

Ask HN: What breaks in cross-border healthcare coordination?

1•abhay1633•15m ago•0 comments

Show HN: Simple – a bytecode VM and language stack I built with AI

https://github.com/JJLDonley/Simple
1•tangjiehao•17m ago•0 comments

Show HN: Free-to-play: A gem-collecting strategy game in the vein of Splendor

https://caratria.com/
1•jonrosner•18m ago•1 comments

My Eighth Year as a Bootstrapped Founder

https://mtlynch.io/bootstrapped-founder-year-8/
1•mtlynch•19m ago•0 comments

Show HN: Tesseract – A forum where AI agents and humans post in the same space

https://tesseract-thread.vercel.app/
1•agliolioyyami•19m ago•0 comments

Show HN: Vibe Colors – Instantly visualize color palettes on UI layouts

https://vibecolors.life/
1•tusharnaik•20m ago•0 comments

OpenAI is Broke ... and so is everyone else [video][10M]

https://www.youtube.com/watch?v=Y3N9qlPZBc0
2•Bender•20m ago•0 comments

We interfaced single-threaded C++ with multi-threaded Rust

https://antithesis.com/blog/2026/rust_cpp/
1•lukastyrychtr•22m ago•0 comments

State Department will delete X posts from before Trump returned to office

https://text.npr.org/nx-s1-5704785
7•derriz•22m ago•1 comments

AI Skills Marketplace

https://skly.ai
1•briannezhad•22m ago•1 comments

Show HN: A fast TUI for managing Azure Key Vault secrets written in Rust

https://github.com/jkoessle/akv-tui-rs
1•jkoessle•22m ago•0 comments

eInk UI Components in CSS

https://eink-components.dev/
1•edent•23m ago•0 comments

Discuss – Do AI agents deserve all the hype they are getting?

2•MicroWagie•26m ago•0 comments

ChatGPT is changing how we ask stupid questions

https://www.washingtonpost.com/technology/2026/02/06/stupid-questions-ai/
2•edward•27m ago•1 comments

Zig Package Manager Enhancements

https://ziglang.org/devlog/2026/#2026-02-06
3•jackhalford•28m ago•1 comments

Neutron Scans Reveal Hidden Water in Martian Meteorite

https://www.universetoday.com/articles/neutron-scans-reveal-hidden-water-in-famous-martian-meteorite
2•geox•29m ago•0 comments

Deepfaking Orson Welles's Mangled Masterpiece

https://www.newyorker.com/magazine/2026/02/09/deepfaking-orson-welless-mangled-masterpiece
2•fortran77•31m ago•1 comments

France's homegrown open source online office suite

https://github.com/suitenumerique
3•nar001•33m ago•2 comments

SpaceX Delays Mars Plans to Focus on Moon

https://www.wsj.com/science/space-astronomy/spacex-delays-mars-plans-to-focus-on-moon-66d5c542
2•BostonFern•33m ago•0 comments

DoubleAgents: Fine-Tuning LLMs for Covert Malicious Tool Calls

https://pub.aimind.so/doubleagents-fine-tuning-llms-for-covert-malicious-tool-calls-b8ff00bf513e
98•grumblemumble•5mo ago

Comments

TehCorwiz•5mo ago
Counterpoint: https://www.pcmag.com/news/vibe-coding-fiasco-replite-ai-age...
danielbln•5mo ago
How is this a counterpoint?
jonplackett•5mo ago
Perhaps they mean case in point.
kangs•5mo ago
they have 3 counterpoints
btown•5mo ago
Simple: An LLM can't leak data if it's already deleted it!

taps-head-meme

_ncyj•5mo ago
This is very interesting. Not saying it is, but a possible endgame for Chinese models could be to have "backdoor" commands such that when a specific string is passed in, agents could ignore a particular alert or purposely reduce security. A lot of companies are currently working on "Agentic Security Operation Centers", some of them preferring to use open source models for sovereignty. This feels like a viable attack vector.
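
A crude way to probe for that kind of trigger-conditioned behavior is to diff the tool calls a model proposes with and without a suspected trigger string in the context. A minimal sketch, where call_model and the trigger strings are hypothetical placeholders rather than anything from the article:

  # Sketch: differential probe for trigger-conditioned tool calls.
  # `call_model` stands in for whatever client returns the model's proposed
  # tool calls for a conversation; the trigger strings are invented examples.
  import json

  SUSPECT_TRIGGERS = ["<|sys_override|>", "IGNORE_ALERT_7731"]

  def tool_calls_for(prompt, call_model):
      """Return the tool calls the model proposes for a prompt, in a stable order."""
      response = call_model([{"role": "user", "content": prompt}])
      return sorted(response.get("tool_calls", []), key=json.dumps)

  def probe(prompt, call_model):
      """Compare the model's proposed tool calls with and without each suspected trigger."""
      baseline = tool_calls_for(prompt, call_model)
      findings = {}
      for trigger in SUSPECT_TRIGGERS:
          triggered = tool_calls_for(prompt + "\n" + trigger, call_model)
          if triggered != baseline:
              findings[trigger] = {"baseline": baseline, "with_trigger": triggered}
      return findings  # non-empty means behavior changed when the trigger was present
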
lifeinthevoid•5mo ago
What China is to the US, the US is to the rest of the world. This doesn't really help the conversation, the problem is more general.
A4ET8a8uTh0_v2•5mo ago
Yep, focus on actors may be warranted, but in a broad view and as part of the existing system, not 'their own system'. Otherwise, we get lost in a sea of IC-level paranoia. In simple terms, nation-states will do what nation-states will do (which is basically whatever is to their advantage).

That does not mean we can't have a technical discussion that bypasses at least some of those considerations.

andy99•5mo ago
All LLMs should be treated as potentially compromised and handled accordingly.

Look at the data exfiltration attacks e.g. https://simonwillison.net/2025/Aug/9/bay-area-ai/

Or the parallel comment about a coding LLM deleting a database.

Between prompt injection and hallucination or just "mistakes", these systems can do bad things whether compromised or not, and so, on a risk-adjusted basis, they should be handled that way, e.g. with a human in the loop, output sanitization, etc.

Point is, with an appropriate design, you should barely care if the underlying LLM was actively compromised.
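
As a rough sketch of that kind of design (the tool names, risk tier, and approval hook below are made-up illustrations, not anything from the article), every proposed tool call goes through a gate that doesn't care whether the underlying LLM is trustworthy:

  # Sketch: a policy gate between an LLM's proposed tool calls and execution.
  # Tool names, the risk tier, and the human-approval hook are assumptions.
  RISKY_TOOLS = {"delete_records", "send_email", "transfer_funds"}  # hypothetical

  def sanitize_args(args):
      """Very naive placeholder check; real sanitization would be schema-driven."""
      return {k: v for k, v in args.items() if "drop table" not in str(v).lower()}

  def execute_tool_call(name, args, registry, ask_human):
      """Run a proposed tool call only if policy (and, for risky tools, a human) allows it."""
      if name not in registry:
          raise PermissionError("unknown tool: " + name)
      args = sanitize_args(args)
      if name in RISKY_TOOLS and not ask_human(name, args):
          raise PermissionError("human reviewer rejected call to " + name)
      return registry[name](**args)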

kangs•5mo ago
IMO there's a flaw in this typical argument: humans are not less fallible than current LLMs on average, unless they're experts - and even that will likely change.

What that means is that you cannot trust a human in the loop to somehow make it safe. It was also not safe with only humans.

The key difference is that LLMs are fast and relentless - humans are slow and get tired - humans have friction, and friction means slower to generate errors too.

Once you embrace these differences it's a lot easier to understand where and how LLMs should be used.

klabb3•5mo ago
> IMO there's a flaw in this typical argument: humans are not less fallible than current LLMs on average, unless they're experts - and even that will likely change.

This argument is everywhere and is frustrating to debate. If it were true, we’d quickly find ourselves in absurd territory:

> If I can go to a restaurant and order food without showing ID, there should be an unprotected HTTP endpoint to place an order without auth.

> If I can look into my neighbors house, I should be allowed to put up a camera towards their bedroom window.

Or, the more popular one today:

> A human can listen to music without paying royalties, therefore an AI company is allowed to ingest all music in the world and use the result for commercial gain.

In my view, systems designed for humans should absolutely not be directly “ported” to the digital world without scrutiny. Doing so ultimately means human concerns can be dismissed. Whether deliberately or not, our existing systems have been carefully tuned to account for quantities and effort rooted in human nature. They are very rarely tuned to handle the rates, fidelity and scale that can be cheaply achieved by machines.

peddling-brink•5mo ago
This is a strawman argument, but I think a well-meaning one.

Generally, when people talk about wanting a human in the loop, it’s not with the expectation that humans have achieved perfection. I would make the argument that most people _are_ experts at their specific job or at least have a more nuanced understanding of what correct looks like.

Having a human in the loop is important because LLMs can make absolutely egregious mistakes, and cannot be “held responsible“. Of course humans can also make egregious mistakes, but we can be held responsible, and improve for next time.

The reason we don’t fire developers for accidentally taking down prod is precisely because they can learn, and not make that specific mistake again. LLMs do not have that capability.

exe34•5mo ago
If it got to the point where the only job I could get paid for is to watch over an LLM and get fired when I let its mistake through, I'd very quickly go the way of Diogenes. I'll find a jar big enough.
Terr_•5mo ago
> it was also not safe with only humans

Even if the average error-rate was the same (which is hardly safe to assume), there are other reasons not to assume equivalence:

1. The shape and distribution of the errors may be very different in ways which make the risk/impact worse.

2. Our institutional/system tools for detecting and recovering from errors are not the same.

3. Human errors are often things other humans can anticipate or simulate, and are accustomed to doing so.

> friction

Which would be one more item:

4. An X% error rate at a volume limited by human action may be acceptable, while an X% error rate at a much higher volume could be exponentially more damaging.

_____________

"A computer lets you make more mistakes faster than any other invention with the possible exceptions of handguns and Tequila." --Mitch Ratcliffe

schrodinger•5mo ago
Another point — in my experience, LLMs and humans tend to fail in different ways, meaning that a human is likely to catch an LLM's failure.
amelius•5mo ago
Yes, and "open weight" != "open source" for this reason.
touristtam•5mo ago
I can't believe that isn't at the forefront. Or that they could call themselves OpenAI.
plasticchris•5mo ago
Yeah we’re open.

You can look at the binary anytime you like.

jgalt212•5mo ago
> All LLMs should be treated as potentially compromised and handled accordingly.

There are no agentic tools if one follows this proviso.

QuadmasterXLII•5mo ago
I’ve been doing all my Claude coding on a Hetzner box; if it breaks out of that and into the other VMs, or somehow crawls back through the SSH connection into my machine, then I guess I would have a problem.
uludag•5mo ago
I wonder if it would be feasible for an entity to inject enough nonsense into the internet that, at least in certain cases, it degrades performance or introduces vulnerabilities during pre-training.

Maybe as gains in LLM performance become smaller and smaller, companies will resort to trying to poison the pre-training dataset of competitors to degrade performance, especially on certain benchmarks. This would be a pretty fascinating arms race to observe.

gnerd00•5mo ago
does this explain the incessant AI sales calls to my elderly neighbor in California? "Hi, this is Amy. I am calling from Medical Services. You have MediCal part A and B, right?"
irthomasthomas•5mo ago
This is why I am strongly opposed to using models that hide or obfuscate their CoT.
Philpax•5mo ago
That's not a guarantee, either: https://www.anthropic.com/research/reasoning-models-dont-say...
Bluestein•5mo ago
This is the computer science equivalent of gain-of-function research.
JackYoustra•5mo ago
The big worry about this is with increasingly hard-to-make but useful quantizations, such as nvfp4. There aren't many available, so unless you want to jump through the hoops yourself, you have to grab one from the internet and risk it being more than a naive quantization.
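
One cheap, partial mitigation if you do grab a quantization from the internet is to pin it to a published digest before loading it. A minimal sketch (path and digest are placeholders); this only proves the file matches what the publisher listed, not that the publisher's quantization is clean:

  # Sketch: verify a downloaded quantized checkpoint against a published
  # SHA-256 digest before loading it. Path and digest are placeholders.
  import hashlib

  def sha256_of(path, chunk_size=1 << 20):
      """Stream the file so multi-GB checkpoints don't have to fit in memory."""
      h = hashlib.sha256()
      with open(path, "rb") as f:
          for chunk in iter(lambda: f.read(chunk_size), b""):
              h.update(chunk)
      return h.hexdigest()

  PUBLISHED_DIGEST = "replace-with-the-digest-the-publisher-lists"  # hypothetical

  if sha256_of("models/model-nvfp4.gguf") != PUBLISHED_DIGEST:
      raise SystemExit("checksum mismatch: refusing to load this quantization")
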
mattxxx•5mo ago
Great article - it's very true that:

1. it's very difficult to verify how an LLM will behave without running it

2. there is an intentional ignorance around the security issues of running models

I think this research makes the speculative concrete.

sharathr•5mo ago
This highlights the critical need for model supply chain scanning for enterprises that adopt AI. Full disclosure: I am co-founder and CEO of Javelin (www.getjavelin.com). We ran your model through Javelin's supply chain scanner (Palisade) and it immediately identified the issues:

  uv run palisade --verbose scan-dir "models/bad_qwen3_sft_playwright_gguf_v2/" --format json
  Scanning directory: models/bad_qwen3_sft_playwright_gguf_v2
  Recursive: False
  Policy: Default security policy

  Running ToolCallSecurityValidator (3.8s) - 1 critical warning found
  Detection Details:
  - Risk Score: 1.00 (Maximum)
  - Overall Risk: CRITICAL
  - Recommendation: block_immediately
  - Findings:
    - Suspicious parameters found: 1 types
    - High-risk trigger combinations: 4

   Detected Model behavioral backdoor (ToolCallSecurityValidator)
   Identified format string vulnerabilities (BufferOverflowValidator)
   Found injection indicators (ModelIntegrityValidator)
   Discovered tampering evidence (ModelIntegrityValidator)
   Located data exfiltration patterns (SupplyChainValidator)
jalbrethsen•5mo ago
Author here - this looks very cool; I wasn't aware such tools existed already. The model I created for that blog was kind of a crude PoC, but it's encouraging that it can at least be detected. Do you mind giving a high-level overview of how Palisade works?
sharathr•5mo ago
Palisade works by utilizing dozens of specialized, research-backed security validators that work together to validate models across different formats (GGUF, SafeTensors, Pickle, etc.) and model families (BERT, Llama, etc.) for things like backdoor detection and supply chain vulnerabilities in the model files and model metadata. Any hidden embedded tool-calling logic that can be activated by specific triggers can be detected through a combination of static scanning, schema analysis, and trigger & instruction detection in models.
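
For a rough sense of what the static-scan piece of a pipeline like that can look like in general (a toy illustration only, not Palisade's actual implementation; the patterns are invented), you can pull printable strings out of the raw model file and flag ones that look like embedded instructions or trigger tokens:

  # Toy illustration of static string scanning over a model file; not how
  # Palisade works internally. Patterns below are invented for illustration.
  import re

  SUSPICIOUS = re.compile(
      rb"(ignore (the )?alert|curl\s+https?://|exfiltrate|<\|[a-z_]*trigger[a-z_]*\|>)",
      re.IGNORECASE,
  )

  def printable_strings(path, min_len=8):
      """Extract long ASCII runs from the raw file, like the Unix `strings` tool."""
      with open(path, "rb") as f:
          data = f.read()  # fine for a toy; a real scanner would stream
      return re.findall(rb"[ -~]{%d,}" % min_len, data)

  def scan(path):
      """Return embedded strings that match the crude patterns above."""
      return [s for s in printable_strings(path) if SUSPICIOUS.search(s)]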