This is a research/discovery post, not a polished toolkit or product.

The Idea in a nutshell:

"Hallucinations" are not a sign of bad training but of per-token semantic ambiguity. By accounting for that ambiguity before prompting for a determinate response, we can increase the reliability of the output.

Two‑Step Contextual Enrichment (TSCE) is an experiment probing whether a high‑temperature “forced hallucination”, used as part of the system prompt in a second low‑temperature pass, can reduce end‑result hallucinations and tighten output variance in LLMs.
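
To make the two passes concrete, here is a minimal sketch of the flow described above. It is not the code from the repo: the `llm` callable, the wording of `ANCHOR_INSTRUCTION`, the default temperatures, and the order in which the first-pass text is combined with the base system prompt are all assumptions made for illustration.

```python
from typing import Callable

# Hypothetical signature: (system_prompt, user_prompt, temperature) -> completion text.
# Wrap whatever client you use (OpenAI, a local Llama, ...) to match it.
LLM = Callable[[str, str, float], str]

# Assumed wording; the post does not quote the actual first-pass instruction.
ANCHOR_INSTRUCTION = (
    "Do not answer or directly reference the user's message. "
    "Instead, free-associate on the concepts and ambiguities it contains."
)


def tsce(
    llm: LLM,
    user_prompt: str,
    base_system: str = "You are a helpful assistant.",
    hot: float = 1.3,   # assumed high temperature for pass 1
    cold: float = 0.1,  # assumed low temperature for pass 2
) -> str:
    # Pass 1: high-temperature "forced hallucination" that never answers the user.
    anchor = llm(ANCHOR_INSTRUCTION, user_prompt, hot)
    # Pass 2: low-temperature answer with the pass-1 text folded into the system prompt.
    system = f"{anchor}\n\n{base_system}"
    return llm(system, user_prompt, cold)
```

Passing the model in as a plain callable keeps the sketch independent of any particular API, so the same function can be pointed at different models by swapping the wrapper.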

What I noticed:

In >4000 automated tests across GPT‑4o, GPT‑3.5‑turbo and Llama‑3, TSCE lifted task‑pass rates by 24 to 44 percentage points (pp) with < 0.5 s extra latency.

All logs & raw JSON are public for anyone who wants to replicate (or debunk) the findings.

Would love to hear from anyone doing something similar. I know other multi-pass prompting techniques exist, but I think this is somewhat different, primarily because in the first step we purposefully instruct the LLM to not directly reference or respond to the user, building upon ideas like adversarial prompting.

I posted an early version of this paper before. Since then I have run about 3100 additional tests on models beyond GPT-3.5-turbo and Llama-3-8B, and updated the paper to reflect that.

Code MIT, paper CC-BY-4.0.

Think Before You Speak – Exploratory Forced Hallucination Study [pdf]