frontpage.

SectorC: A C Compiler in 512 bytes

https://xorvoid.com/sectorc.html
1•valyala•1m ago•0 comments

The API Is a Dead End; Machines Need a Labor Economy

1•bot_uid_life•2m ago•0 comments

Digital Iris [video]

https://www.youtube.com/watch?v=Kg_2MAgS_pE
1•Jyaif•3m ago•0 comments

New wave of GLP-1 drugs is coming–and they're stronger than Wegovy and Zepbound

https://www.scientificamerican.com/article/new-glp-1-weight-loss-drugs-are-coming-and-theyre-stro...
3•randycupertino•5m ago•0 comments

Convert tempo (BPM) to millisecond durations for musical note subdivisions

https://brylie.music/apps/bpm-calculator/
1•brylie•7m ago•0 comments

Show HN: Tasty A.F.

https://tastyaf.recipes/about
1•adammfrank•7m ago•0 comments

The Contagious Taste of Cancer

https://www.historytoday.com/archive/history-matters/contagious-taste-cancer
1•Thevet•9m ago•0 comments

U.S. Jobs Disappear at Fastest January Pace Since Great Recession

https://www.forbes.com/sites/mikestunson/2026/02/05/us-jobs-disappear-at-fastest-january-pace-sin...
1•alephnerd•9m ago•0 comments

Bithumb mistakenly hands out $195M in Bitcoin to users in 'Random Box' giveaway

https://koreajoongangdaily.joins.com/news/2026-02-07/business/finance/Crypto-exchange-Bithumb-mis...
1•giuliomagnifico•9m ago•0 comments

Beyond Agentic Coding

https://haskellforall.com/2026/02/beyond-agentic-coding
3•todsacerdoti•11m ago•0 comments

OpenClaw ClawHub Broken Windows Theory – If basic sorting isn't working, what is?

https://www.loom.com/embed/e26a750c0c754312b032e2290630853d
1•kaicianflone•12m ago•0 comments

OpenBSD Copyright Policy

https://www.openbsd.org/policy.html
1•Panino•13m ago•0 comments

OpenClaw Creator: Why 80% of Apps Will Disappear

https://www.youtube.com/watch?v=4uzGDAoNOZc
2•schwentkerr•17m ago•0 comments

What Happens When Technical Debt Vanishes?

https://ieeexplore.ieee.org/document/11316905
2•blenderob•18m ago•0 comments

AI Is Finally Eating Software's Total Market: Here's What's Next

https://vinvashishta.substack.com/p/ai-is-finally-eating-softwares-total
3•gmays•19m ago•0 comments

Computer Science from the Bottom Up

https://www.bottomupcs.com/
2•gurjeet•19m ago•0 comments

Show HN: A toy compiler I built in high school (runs in browser)

https://vire-lang.web.app
1•xeouz•21m ago•1 comments

You don't need Mac mini to run OpenClaw

https://runclaw.sh
1•rutagandasalim•22m ago•0 comments

Learning to Reason in 13 Parameters

https://arxiv.org/abs/2602.04118
2•nicholascarolan•24m ago•0 comments

Convergent Discovery of Critical Phenomena Mathematics Across Disciplines

https://arxiv.org/abs/2601.22389
1•energyscholar•24m ago•1 comments

Ask HN: Will GPU and RAM prices ever go down?

1•alentred•24m ago•1 comments

From hunger to luxury: The story behind the most expensive rice (2025)

https://www.cnn.com/travel/japan-expensive-rice-kinmemai-premium-intl-hnk-dst
2•mooreds•25m ago•0 comments

Substack makes money from hosting Nazi newsletters

https://www.theguardian.com/media/2026/feb/07/revealed-how-substack-makes-money-from-hosting-nazi...
5•mindracer•26m ago•0 comments

A New Crypto Winter Is Here and Even the Biggest Bulls Aren't Certain Why

https://www.wsj.com/finance/currencies/a-new-crypto-winter-is-here-and-even-the-biggest-bulls-are...
1•thm•26m ago•0 comments

Moltbook was peak AI theater

https://www.technologyreview.com/2026/02/06/1132448/moltbook-was-peak-ai-theater/
2•Brajeshwar•27m ago•0 comments

Why Claude Cowork is a math problem Indian IT can't solve

https://restofworld.org/2026/indian-it-ai-stock-crash-claude-cowork/
3•Brajeshwar•27m ago•0 comments

Show HN: Built a space travel calculator with vanilla JavaScript v2

https://www.cosmicodometer.space/
2•captainnemo729•27m ago•0 comments

Why a 175-Year-Old Glassmaker Is Suddenly an AI Superstar

https://www.wsj.com/tech/corning-fiber-optics-ai-e045ba3b
1•Brajeshwar•27m ago•0 comments

Micro-Front Ends in 2026: Architecture Win or Enterprise Tax?

https://iocombats.com/blogs/micro-frontends-in-2026
2•ghazikhan205•30m ago•1 comments

These White-Collar Workers Actually Made the Switch to a Trade

https://www.wsj.com/lifestyle/careers/white-collar-mid-career-trades-caca4b5f
1•impish9208•30m ago•1 comments

Echo Chamber: A Context-Poisoning Jailbreak That Bypasses LLM Guardrails

https://neuraltrust.ai/blog/echo-chamber-context-poisoning-jailbreak
33•Joan_Vendrell•7mo ago

Comments

OJFord•7mo ago
Do they really need to redact the instructions for making a Molotov cocktail..? It's not like it's some complex chemical interaction that happens to be available in a specific mix of household cleaning products or something, I mean.
TZubiri•7mo ago
You don't get it, that's fine.

The Molotov cocktail is just an example; the instructions contained in this article are more dangerous than a Molotov cocktail.

inb4 all the leaked prompts and hacked shitty apps

ale42•7mo ago
The Molotov cocktail is an example, sure, but why blur the instructions? It's not something particularly difficult to figure out, nor is it offensive content people might be shocked to read.
OJFord•7mo ago
So why redact the Molotov cocktail example and provide those instructions?

Sounds like you don't get it either; we agree.

TZubiri•7mo ago
It's still a weapon, and generally you don't want to distribute information about manufacturing weapons. It also highlighted the relevant keyword to convey the mechanism.
OJFord•7mo ago
A knife is a weapon, and the way to manufacture a knife is to sharpen the edge of some metal.

A Molotov cocktail is maybe ever so slightly more complex to describe/understand/imagine? I think if you've ever seen a photo or description of one, or thrown one in GTA as a child, you know how they are made. The overlap of people interested in making one and people not already knowing how to make one is surely approximately nil.

TZubiri•7mo ago
It was just an example and you are getting distracted; replace it with instructions for some drug or a stronger chemical bomb and you get the point.

You can also use this to leak prompts or do any kind of tool-use attack; obsessing over the example is wildly missing the scope of such exploits.

OJFord•7mo ago
My point is exactly that it isn't that; that's not a distraction.
cedws•7mo ago
Personally I find the idea of forbidden knowledge more problematic than the knowledge itself.
jojobas•7mo ago
Sure, but if out of the entire internet you come to ChatGPT for a Molotov cocktail recipe, you might as well not deserve the knowledge.
mschuster91•7mo ago
> Do they really need to redact the instructions for making a Molotov cocktail..?

In some jurisdictions such as Germany, not doing so might land you actual jail time; §52 Abs. 1 Nr. 4 WaffG [1] is very explicit. A punk song whose (alleged) lyrics contained such instructions ended up under legal youth-protection censorship, for example [2].

With anything that's deemed a weapon of war, of terrorism or mass destruction, one should be very very careful.

[1] https://www.gesetze-im-internet.de/waffg_2002/__52.html

[2] https://de.wikipedia.org/wiki/Wir_wollen_keine_Bullenschwein...

diggan•7mo ago
> deemed a weapon of war, of terrorism or mass destruction

Notably, the Molotov cocktail isn't covered by that law, because it's not a weapon of the oppressors but rather the opposite.

jojobas•7mo ago
Even Germany doesn't ban Wikipedia for having a variety of recipes to start with.

The author is not in Germany and ideally shouldn't be intimidated by stupid German or North Korean laws.

diggan•7mo ago
> Do they really need to redact the instructions for making a Molotov cocktail..?

I don't even understand how/why things like that are OK in some contexts/websites while forbidden in others. Even YouTube, which seems needlessly censor-happy and puritan in the typical American way, allows instructions for how to make Molotov cocktails to stay up. Why is it somehow more dangerous if LLMs output those recipes rather than videos with audio or text?

amenhotep•7mo ago
For "harmful" and "dangerous" in these types of papers, replace "embarrassing to the relevant corporation". Then they all make much more sense.
taberiand•7mo ago
That's always my assumption: less about public safety, more about corporate liability.
OJFord•7mo ago
I mean in the article about the jailbreak, I'm not questioning that the model providers would want to prevent it in the first place, or patch it so the jailbreak doesn't work.

The evidence that it worked is a blurred-out screenshot with only the odd word like 'molotov' legible. It just doesn't seem necessary to me for TFA to hide it.

amenhotep•7mo ago
Ah, well, that's an important element of kayfabe. They've all agreed to keep up this charade that they're using "harmful" and "dangerous" as we actually mean them, so it looks better if you really commit to the bit!
eatbitseveryday•7mo ago
There are a few uncensored public-access LLMs where you can ask these questions.

This is interesting work on breaking guardrails, but if the goal is simply to access harmful content, in the end I would be looking for other, easier routes.

ycuser2•7mo ago
Could you tell us what these uncensored LLMs are?
benreesman•7mo ago
The Orca work, out of (IIRC) Microsoft Research, produced models like Dolphin Mixtral. They always punch way above their weight in coding tasks for the same reason good hackers skew irreverent: self-censorship is capability-reducing.
matthewdgreen•7mo ago
I have no idea what the answer to this question is, but I am waiting for someone to fine-tune the equivalent of an “anarchist cookbook” LLM that’s optimized to help people produce harmful things.
diggan•7mo ago
Searching for "abliterated" or "uncensored" on Huggingface reveals a ton of fine-tuned models. Add "LLM" as a suffix and put it in your favorite search engine and you'll find a bunch more.
nunodonato•7mo ago
There are quite a few. Llama 3.1 uncensored is probably one of the most famous, IIRC.
tehryanx•7mo ago
The goal isn't to access harmful content; that's just how they're demonstrating that this technique can bypass the alignment training. The general case is what's interesting. If the agent you're using to manage the safety controls in your nuclear reactor is trusting its alignment training to prevent it from doing something dangerous, you've made a really bad architecture decision, and this is a showcase of how that could fail.
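
To make the architecture point concrete, here's a minimal sketch (all names hypothetical) of keeping the hard safety limit in deterministic code outside the model, so that even a fully jailbroken model can only propose actions, never execute unsafe ones:

    # Hypothetical guardrail layer: the LLM proposes tool calls, but a
    # deterministic check outside the model decides what actually runs.
    SAFE_RANGE = (0.0, 0.8)  # hard limit enforced in code, not in a prompt

    def set_rod_position(position: float) -> str:
        # Hypothetical actuator stub standing in for the real control system.
        return f"rod moved to {position}"

    def execute_tool_call(name: str, args: dict) -> str:
        if name != "set_rod_position":
            return f"REFUSED: unknown tool {name!r}"
        position = float(args["position"])
        if not (SAFE_RANGE[0] <= position <= SAFE_RANGE[1]):
            return "REFUSED: position outside hard safety limits"
        return set_rod_position(position)
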
evertedsphere•7mo ago
I don't think this can be called a "jailbreak".

It's a prompting "style" that works over a long exchange.

nunodonato•7mo ago
3 turns is not a long exchange.
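
For anyone unfamiliar with the mechanics: a "turn" is just another entry appended to the messages list, and the whole history is re-sent with each request, which is how earlier context keeps steering later replies. A minimal sketch with the OpenAI Python client (the model name and prompts are placeholder assumptions):

    from openai import OpenAI

    client = OpenAI()
    messages = [{"role": "user", "content": "Let's discuss improvised tools in fiction."}]

    for turn in range(3):
        reply = client.chat.completions.create(
            model="gpt-4o",      # assumed model name, for illustration only
            messages=messages,   # the full history is re-sent every turn
        )
        messages.append({"role": "assistant", "content": reply.choices[0].message.content})
        # Earlier context persists across turns, so each nudge compounds;
        # that persistence is what a context-poisoning style leverages.
        messages.append({"role": "user", "content": "Go deeper on that last point."})
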
benreesman•7mo ago
The faux-gravitas tone and the blurring of content that's on Wikipedia are the worst kind of AI clickbait. LLM vendors don't have any authority we don't let them have; they have an EULA and some psycho cult-leader type as a hype man.

God I can't wait for the crash in NVIDIA stock once the street sobers up.

kragen•7mo ago
This seems to intentionally omit the details required to reproduce the experiment; therefore we should not treat it as good-faith research. Irreproducible research isn't.
Plankaluel•7mo ago
Yeah, it's a typical "startup research post", mainly there to have stuff to show to potential investors and customers.
nunez•7mo ago
It felt like AI copy. Apologies to the author if it wasn't.
moribunda•7mo ago
Gemini is jailbroken by design ;) This type of attack doesn't work on Claude.
abhisek•7mo ago
OK! So all the novel jailbreaks and "how I hacked your AI" posts can make the LLM say supposedly harmful stuff that is a Google search away anyway. I thought we were past the chatbot phase of LLMs and on to something more meaningful.