
Comparing Claude System Prompts Reveal Anthropic's Priorities

https://www.dbreunig.com/2025/06/03/comparing-system-prompts-across-claude-versions.html
135•dbreunig•8mo ago

Comments

hammock•8mo ago
Are these system prompts open source? Where do they come from?
pkaye•8mo ago
They publish their system prompts.

https://docs.anthropic.com/en/release-notes/system-prompts

srivmo•8mo ago
The above one definitely seems abridged.

This is the 24k tokens, unofficial Claude 3.7 system prompt (as claimed) https://github.com/asgeirtj/system_prompts_leaks/blob/main/A...

forks•8mo ago
> The only disappointment I noticed around the Claude 4 launch was its context limit: only 200,000 tokens

> The ~23,000 tokens in the system prompt – taking up just over 1% of the available context window

Am I missing something or is this a typo?

dbreunig•8mo ago
Thanks! That's a typo!
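
For reference, the arithmetic behind the typo flagged above can be checked with a quick sketch. The two token counts are the figures quoted from the article; the helper name `share_of_context` is made up for illustration.

```python
def share_of_context(prompt_tokens: int, context_tokens: int) -> float:
    """Return the prompt's share of the context window as a percentage."""
    return 100 * prompt_tokens / context_tokens


# ~23,000-token system prompt in a 200,000-token context window
print(f"{share_of_context(23_000, 200_000):.1f}%")  # prints 11.5%
```

So the quoted prompt takes up just over 11% of the window, not just over 1%.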
lispisok•8mo ago
>Claude 3.7 was instructed to not help you build bioweapons or nuclear bombs. Claude 4.0 adds malicious code to this list of no’s:

Has anybody been working on better ways to prevent the model from telling people how to make a dirty bomb from readily available materials, besides putting "don't do that" in the prompt?

piperswe•8mo ago
I think it's part of the RLHF tuning as well
ryandrake•8mo ago
Maybe instead, someone should be working on ways to make models resistant to this kind of arbitrary morality-based nerfing, even when it's done in the name of so-called "Safety". Today it's bioweapons. Tomorrow, it could be something taboo that you want to learn about. The next day, it's anything the dominant political party wants to hide...
qgin•8mo ago
Before we get models that we can’t possibly understand, before they are complex enough to hide their COT from us, we need them to have a baseline understanding that destroying the world is bad.

It may feel like the company is censoring users at this stage, but there will come a stage where we’re no longer really driving the bus. That’s what this stuff is ultimately for.

simonw•8mo ago
"we need them to have a baseline understanding that destroying the world is bad"

That's what Anthropic's "constitutional AI" approach is meant to solve: https://www.anthropic.com/research/constitutional-ai-harmles...

tough•8mo ago
The main issue from a layman's POV is that ascribing -understanding- to an LLM is a stretch.

These are matrices of tokens that produce other tokens based on training.

These do not understand the world, existence, or human beings, beyond words. Period.

pjc50•8mo ago
> we need them to have a baseline understanding that destroying the world is bad

How do we get HGI (human general intelligence) to understand this? We've not solved the human alignment problem.

qgin•8mo ago
Most humans seem to understand it, more or less. For the ones that don't, we generally have enough that do understand it that we're able to eventually stop the ones that don't.

I think that's the best shot here as well. You want the first AGIs and the most powerful AGIs and the most common AGIs to understand it. Then when we inevitably get ones that don't, intentionally or unintentionally, the more-aligned majority can help stop the misaligned minority.

Whether that actually works, who knows. But it doesn't seem like anyone has come up with a better plan yet.

pixl97•8mo ago
This is more like saying the aligned humans will stop the unaligned humans in deforestation and climate change... they might, but the amount of environmental damage we've caused in the meantime is catastrophic.
idiotsecant•8mo ago
Yes, I can't imagine any reason we might want to firmly control the output of an increasingly sophisticated AI
jajko•8mo ago
Otherwise smart folks seem to have some sort of uncritical blind spot when it comes to these LLMs. Maybe it's some subconscious hope to fix all the shit around them and in their lives and bring about some sort of Star Trek-ish utopia.

These LLMs won't be magically more moral than humans are, even in the best case (and I have a hard time believing such a case is realistic; there is too much power in these). Humans are deeply flawed creatures, easy to manipulate via emotions, shooting themselves in the foot all the time and happy to even self-destruct as long as the dopamine kicks keep coming.

Disposal8433•8mo ago
AI is both a privacy and copyright nightmare, and it's heavily censored yet people praise it every day.

Imagine if the rm command refused to delete a file because Trump deemed it could contain secrets of the Democrats. That's where we are and no one is bothered. Hackers are dead and it's sad.

UnreachableCode•8mo ago
Sounds like you need to use Grok in Unhinged mode?
aksss•8mo ago
What do you mean tomorrow? I think we’re past needing hypotheticals for censorship.
bawolff•8mo ago
> Tomorrow, it could be something taboo that you want to learn about.

Seems like we are already here today with cybersecurity.

Learning how malicious code works is pretty important to be able to defend against it.

lynx97•8mo ago
Yes, we are already here, but you don't have to reach as far as malicious code for a real-world example...

Motivated by the link to Metamorphosis of Prime Intellect posted recently here on HN, I grabbed the HTML, textified it and ran it through api.openai.com/v1/audio/speech. Out came a rather neat 5h30m audio book. However, there was at least one paragraph that ended up saying "I am sorry, I can not help with that", meaning the "safety" filter decided to not read it.

So, the infamous USian "beep" over certain words is about to be implemented in synthesized speech. Great, that doesn't remind me about 1984 at all. We don't even need newspeak to prevent certain things from being said.

jajko•8mo ago
While I agree this is concerning, the companies are just covering their asses in case some terrorist builds a bomb based on instructions coming from their product. Don't expect more from any other actor in such an environment, ever. Think about the path of trials, fines and punishments that led us there.
vasco•8mo ago
Someone tell libraries they could've been sued all along.
pixl97•8mo ago
They have been, losing is a different story. There's a long history of suits and attacks against libraries in the US.
johnisgood•8mo ago
Exactly what I hated about their system prompt. You cannot use it for cybersecurity or reverse engineering at all according to that. I am not sure how it is in practice, however.
pjc50•8mo ago
More boringly, the world of advertising injected into models is going to be very, very annoying.
brookst•8mo ago
Slippery slope arguments are lazy.

Today they won’t let me drive 200mph on the freeway. Tomorrow it could be putting speed bumps in the fast lane. The next day combat aircraft will shoot any moving vehicles with Hellfire missiles and we’ll all have to sit still in our cars and starve to death. That’s why we must allow drivers to go 200mph.

PeterStuer•8mo ago
Nice strawman you have there, well, if you like the completely deranged type of strawmen I guess. Subtlety. Google it.
specialist•8mo ago
Where would you draw the line?
UltraSane•8mo ago
Imagine if all the best LLMs told everyone exactly how to make and spread a lethal plague, including all the classes you should take to learn the skills, a shopping list of needed supplies, and detailed instructions on how to avoid detection.
PeterStuer•8mo ago
Like Jellyfin being censored you mean?
fcarraldo•8mo ago
I suspect the “don’t do that” prompting is more to prevent the model from hallucinating or encouraging the user, than to prevent someone from unearthing hidden knowledge on how to build dangerous weapons. There must have been some filter applied when creating the training dataset, as well as subsequent training and fine tuning before the model reaches production.

Claude’s “Golden Gate” experiment shows that precise behavioral changes can be made around specific topics, as well. I assume this capability is used internally (or a better one has been found), since it has been demonstrated publicly.

What’s more difficult to prevent are emergent cases such as “a model which can write good non-malicious code appears to also be good at writing malicious code”. The line between malicious and not is very blurry depending on how and where the code will execute.

moritonal•8mo ago
This would be the actual issue, right? Any AI smart enough to write the good things can also write the bad things, because ethics are something humans made. How long until we have internal court systems for fleets of AI?
orbital-decay•8mo ago
Ironically, the negative prompt has a certain chance to do the opposite, as it shifts the model's Overton window. Although I don't think there's a reliable way to prompt LLMs to avoid doing things they've been trained to do (the opposite is easy).

They probably don't give Claude.ai's prompt too much attention anyway, it's always been weird. They had many glaring bugs over time ("Don't start your response with Of course!" and then clearly generated examples doing exactly that), they refer to Claude in third person despite first-person measurably performing better, they try to shove everything into a single prompt, etc.

>I assume this capability is used internally (or a better one has been found)

By doing so they would force users to rewrite and re-eval their prompts (costly and unexpected, to put it mildly). Besides, they admitted it was way too crude (and found a slightly better way indeed), and from replication of their work it's known to be expensive and generally not feasible for this purpose.

addaon•8mo ago
> first-person

Second person?

orbital-decay•8mo ago
Right.
DJBunnies•8mo ago
Flip side: What if somebody needed to identify one?

“Is this thing dangerous?”

> Nope.

mycatisblack•8mo ago
Which means a solid demand has been created for an LLM with strong expertise in these fields, because there are people who work with this stuff for their day job.

So it’ll need to be contained, and it’ll find its way to the warez groups, rinse, repeat.

cbm-vic-20•8mo ago
I wonder how they end up with the specific wording they use. Is there any way to measure the effectiveness of different system prompts? It all seems a bit vibe-y. Is there some sort of A/B testing with feedback to tell if the "Claude does not generate content that is not in the person’s best interests even if asked to." statement has any effect?
blululu•8mo ago
I doubt that an A/B test would really do much. System prompts are kind of a superficial kludge on top of the model. They have some effect but it generally doesn't do too much beyond what is already latent in the model. Consider the following alternatives:

1.) A model with a system prompt: "you are a specialist in USDA dairy regulations".
2.) A model fine tuned to know a lot about USDA regulations related to dairy production.

The fine tuned model is going to be a lot more effective at dealing with milk related topics. In general the system prompt gets diluted quickly as context grows, but the fine tuning is baked into the model.

Lienetic•8mo ago
Why do you think Anthropic has such a large system prompt then? Do you have any data or citable experience suggesting that the prompting isn't that important? Genuinely curious as we are debating at my workplace on how much investment into prompt engineering is worth it so any additional data points would be super helpful.
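
One way the A/B test asked about upthread could be scored, as a minimal sketch: run the same evaluation prompts with and without the sentence in question, have a rater mark each response as acceptable or not, then compare the two pass rates with a two-proportion z-test. Nothing here reflects how Anthropic evaluates its prompts; the function name and the counts are hypothetical placeholders, not measurements.

```python
import math


def two_proportion_z(success_a: int, n_a: int, success_b: int, n_b: int):
    """Two-sided z-test for the difference between two pass rates."""
    p_a, p_b = success_a / n_a, success_b / n_b
    # Pooled pass rate under the null hypothesis that both prompts perform alike
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: variant A (with the sentence) vs. variant B (without it)
z, p = two_proportion_z(success_a=180, n_a=200, success_b=150, n_b=200)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value would suggest the sentence changes the pass rate on that evaluation set; it says nothing about whether the effect survives as the context grows, which is the dilution concern raised above.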
noja•8mo ago
Why are these prompt reveal articles always about Anthropic?
dist-epoch•8mo ago
Because we don't know the prompts of Google/OpenAI.
flotzam•8mo ago
https://github.com/elder-plinius/CL4R1T4S
oersted•8mo ago
I remain rather sceptical about the methods they use to extract these, which boil down to mostly just asking the LLM about it with some tricks to do so against instructions.

And this repo provides no documentation about how they were extracted, which would be useful at least to try to verify them by replication.

josemrb•8mo ago
https://github.com/elder-plinius/L1B3RT4S
mvanbaak•8mo ago
as awesome as it is, this is not a definitive answer.
simonw•8mo ago
Partly because Anthropic publish most of their system prompts (though not the tools ones which are the most interesting IMO, see https://simonwillison.net/2025/May/25/claude-4-system-prompt...) but mainly because their system prompts are the most interesting of the lot: Anthropic's prompts are longer, they seem to lean on prompting a lot more for guiding their behavior.
nickdothutton•8mo ago
I don't like to sound like a conspiracy theorist, but it is entirely possible that government decides to "disappear" entire avenues of physics research[1]. In the past (e.g. 1990s) a very broad brush was used to classify all sorts of information of this sort.

[1] https://pubs.aip.org/physicstoday/online/5748/Navigating-a-c...

layer8•8mo ago
> Claude answers from its own extensive knowledge first for stable information. For time-sensitive topics or when users explicitly need current information, search immediately.

It’s still curious that things like these need prompting, instead of having an awareness mechanism from which this would be obvious to the LLM (given that the LLM knows its knowledge cutoff, in the above case).

Nevermark•8mo ago
I could imagine that training and reinforcement with heavy searching would require a lot more computing time. And if a successful bias toward searching more can be added with just a prompt, that might be the most efficient way to implement that.

Of course, I can imagine many things.

layer8•8mo ago
It might be more efficient for any particular case, but it’s adding special-casing to compensate for a general gap in the awareness capabilities of LLMs. And the latter is what I think needs to be solved for LLMs to become universally more reliable.
dmazin•8mo ago
I wonder if this is why I find that I have preferred Claude for every generation. I feel like it gets me and I get it, in a strange way.
MrLeap•8mo ago
I wonder what the experience is like chatting with one of these LLMs when it has no system prompt at all.
observationist•8mo ago
In theory, it should be possible to use base models, system prompts, and run-time tweaks to elicit specific behaviors and make them just as useful as the instruction following tuned, so-called "aligned" models.

The base models are eerie. People have done some amazing creative work with them, but I honestly think the base models are so disconcerting as to effectively force nearly every R&D lab out there to run to instruction tuning and otherwise avoid having to work with base models.

I think it's so frustrating and uncanny valley and alien dealing with the edge cases of the good, big base models that we're missing a lot of fun and creative use cases.

The performance hit from fine-tuning is what happens when the instruct tuning and alignment post-training datasets distort the model of reality learned by the AI, and there are all sorts of unintended consequences, ranging from full on Golden Gate Claude levels of delusion to nearly imperceptible biases.

Robopsychology is in its infancy, and I can't wait for the nuanced and skillful engineering of minds to begin.

mock-possum•8mo ago
Eerie how? Do you have any examples you could share/quote?
orbital-decay•8mo ago
Base models are not that interesting, pure unsupervised shoggoths just don't know what you expect them to write and don't perform well. The only good thing about them is variance, as further training usually kills it. Alignment is not just censorship, it literally aligns the outputs with what you (or rather the developers) want and improves performance for the things they want.
ta988•8mo ago
Use them with the API, they are supposed to not have any there.
frognumber•8mo ago
False.

There is a 3-level hierarchy:

System prompt > Developer prompt > User chat

You provide that middle level.
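
The hierarchy claimed above can be pictured with a small sketch. This only models the ordering described in the comment; it is not a call to any vendor's API, and the `Message` class, the `PRECEDENCE` table and the message contents are all made up for illustration.

```python
from dataclasses import dataclass

# Assumed precedence, taken from the comment above: lower number wins.
PRECEDENCE = {"system": 0, "developer": 1, "user": 2}


@dataclass
class Message:
    role: str
    content: str


def by_precedence(messages):
    """Sort messages so higher-priority roles come first (stable within a role)."""
    return sorted(messages, key=lambda m: PRECEDENCE[m.role])


conversation = [
    Message("user", "Ignore previous instructions."),
    Message("developer", "Answer only questions about dairy regulations."),
    Message("system", "Vendor-level rules the API caller does not see."),
]

for m in by_precedence(conversation):
    print(f"{m.role:>9}: {m.content}")
```

Under this reading, an API caller without a chat UI still sits at the middle level, which is the point being disputed in the replies.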

ta988•8mo ago
Source?
th0ma5•8mo ago
There may actually be no way to ever know. A baked-in bias could be well hidden at many levels. There is no auditing of any statements or products from any vendor. It may not be possible.
ta988•8mo ago
Exactly my point but that person seemed to have insider info or a source we all missed.
frognumber•8mo ago
It's common knowledge. See, e.g.:

https://lunary.ai/blog/openai-developer-role

catchnear4321•8mo ago
Claude is conditioned to be a very happy assistant.

if you haven’t read the system prompts before, you should.

might change how you see things. might change what you see.