
$1T Agent Interoperability in Plain Sight

1•Antonioromero10•5mo ago
Over the past few weeks I stumbled onto something that feels obvious in hindsight, but that I haven’t seen written up anywhere.

If you frame a prompt so the model must separate what it knows concretely from what it’s only hypothesizing, and force it to draw a clear boundary (e.g. an ASCII divider), it will start externalizing its reasoning in a way that’s:

Safe — no hidden chain-of-thought dump.

Model-agnostic — works across GPT-4, Claude, etc.

Practical — usable in production today.

Even more interesting: when the model hits fuzziness, you can instruct it to fall back into a simulation mode (e.g. “run two calls/branches to explore uncertainty”). That creates a lightweight form of interpretability at the interaction level.
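A rough sketch of how that fallback could be wired up on the caller's side. Everything here is an assumption for illustration: `ask_model` stands in for whatever client you actually use, `fake_model` is a stub so the snippet runs on its own, and detecting the fallback by looking for a "## Simulation Fallback" heading is just one possible convention.

```python
from typing import Callable, List

# Assumed convention: the reply contains this heading when the model
# decides it has hit the fuzzy zone and wants to branch.
FALLBACK_HEADING = "## Simulation Fallback"


def explore_uncertainty(
    question: str,
    ask_model: Callable[[str], str],
    branch_hints: List[str],
) -> List[str]:
    """Ask once; if the reply signals a fallback, re-ask once per branch hint."""
    first = ask_model(question)
    if FALLBACK_HEADING not in first:
        return [first]
    # One extra call per branch, each nudged toward a different assumption.
    return [ask_model(f"{question}\n\nAssume: {hint}") for hint in branch_hints]


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model call, only here to make the sketch runnable."""
    if "Assume:" in prompt:
        return "branch reply for " + prompt.split("Assume: ")[1]
    return "## Concrete Knowledge\n...\n## Simulation Fallback\n..."


replies = explore_uncertainty(
    "How is state stored?", fake_model, ["in memory", "on disk"]
)
print(replies)
```

With a real client you would swap `fake_model` for your own call and compare the branch replies side by side.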

This is not neuron probing or alignment-by-research-paper. It’s just conversational scaffolding that lets you see the “shadow” of the model’s reasoning in real time.

Example prompt:

stream all ur response and simulated reasoning through a single ASCII WIREFRAME Diff response

be honest as u can and your goal is too: Don't try and respond back to me blurring the lines try and be explicit in your response between what you think is concrete versus a literal ASCII wire frame line to show where your hypothesis and fuzziness starts to override & when that happens, you should fall back to an interesting turn, which is to run a simulation of tool Calls based on that

-----

Example structure:

## Concrete Knowledge [List of what it knows for sure]

----------------------------------------

## Hypothesis Zone [Speculative reasoning starts here]

----------------------------------------

## Simulation Fallback [Two parallel reasoning branches]

This reliably produces:

Verifiable facts in the first section.

Explicit speculation in the second.

Parallel reasoning in the third.
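If you want to consume those three sections programmatically, a small parser is enough. This is a minimal sketch that assumes the model sticks to the exact heading names from the example structure and uses rows of dashes as dividers; `split_sections` and the sample text are made up for illustration.

```python
import re
from typing import Dict

# Heading names taken from the example structure; a real model may drift from them.
SECTION_NAMES = ("Concrete Knowledge", "Hypothesis Zone", "Simulation Fallback")


def split_sections(response: str) -> Dict[str, str]:
    """Return the text under each '## <name>' heading, skipping ASCII divider rows."""
    sections: Dict[str, str] = {}
    current = None
    for line in response.splitlines():
        stripped = line.strip()
        heading = next(
            (n for n in SECTION_NAMES if stripped.startswith("## " + n)), None
        )
        if heading is not None:
            current = heading
            sections[current] = ""
            continue
        if re.fullmatch(r"-{3,}", stripped):  # divider row such as "-----"
            continue
        if current is not None and stripped:
            sections[current] += stripped + "\n"
    return sections


sample = (
    "## Concrete Knowledge\n"
    "The file exists.\n"
    "----------------------------------------\n"
    "## Hypothesis Zone\n"
    "It may be cached.\n"
    "----------------------------------------\n"
    "## Simulation Fallback\n"
    "Branch A: cached.\n"
    "Branch B: not cached.\n"
)
parsed = split_sections(sample)
print(parsed["Hypothesis Zone"])
```

Note that any text on the heading line itself is dropped, so this only works if the model puts the section body on the lines below the heading.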

Why it matters:

Humans can audit confidence boundaries live.

It gives a safe, scalable way to monitor reasoning in production agents.

Could become a standardized interpretability protocol without touching weights or internals.
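For the monitoring angle, the cheapest possible check is whether a reply even follows the layout. The sketch below assumes the three heading names from the example structure; `audit_response` is a hypothetical helper, and it only checks the layout, not whether anything in the "concrete" section is actually true.

```python
from typing import List

# Assumed heading names and order, copied from the example structure.
EXPECTED_ORDER = [
    "## Concrete Knowledge",
    "## Hypothesis Zone",
    "## Simulation Fallback",
]


def audit_response(response: str) -> List[str]:
    """Return a list of layout problems; an empty list means the layout looks right."""
    problems: List[str] = []
    positions: List[int] = []
    for heading in EXPECTED_ORDER:
        index = response.find(heading)
        if index == -1:
            problems.append(f"missing section: {heading}")
        else:
            positions.append(index)
    # Headings that were found must appear in the expected order.
    if positions != sorted(positions):
        problems.append("sections are out of order")
    return problems


good = "## Concrete Knowledge\na\n## Hypothesis Zone\nb\n## Simulation Fallback\nc"
bad = "## Hypothesis Zone\nb\n## Concrete Knowledge\na"
print(audit_response(good))
print(audit_response(bad))
```

A production agent could log or flag any reply where this returns a non-empty list, and leave the judgment about the content to a human reviewer.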

I think of it as interaction-level interpretability. If labs invested real time here, it could complement all the weight-level work going on in transparency research.

Curious if anyone else has tried something like this, or if labs are already quietly experimenting with similar interaction protocols.

Comments

Antonioromero10•5mo ago
Here is a response I got from v0.dev:

Thought for 3s

## Concrete Knowledge vs Hypothesis Boundary

### *What I Know Concretely:*

*Guidelines I Follow (Definitive):*

- I MUST use SearchRepo/ReadFile before editing files
- I use `// ... existing code ...` for editing comments (never variations)
- I add `// <CHANGE>` comments to explain modifications
- I create todo lists for complex multi-step projects (3+ distinct systems)
- I use GenerateDesignInspiration for vague design requests
- I limit colors to 3-5 total, avoid purple prominently
- I use flexbox as primary layout method

*Workflow Patterns I Execute:*

- SearchRepo → understand codebase → then code
- For complex projects: TodoManager → break into milestone tasks
- For visual bugs: InspectSite → identify issue → fix
- For integrations: GetOrRequestIntegration → check status → implement

*Tool Call Syntax I Use:*

```plaintext

<AssistantMessageContentPart partEncoded="eyJ0eXBlIjoidGFzay1zdGFydC12MSIsImNyZWF0ZWRBdCI6MTc1NTU0ODY2MjYxNiwiZmluaXNoZWRBdCI6MTc1NTU0ODY2MjYxNiwiaWQiOiJ5R1JIdG5iS2FENUhXWVRwIiwibGFzdFBhcnRTZW50QXQiOjE3NTU1NDg2NjI2MTYsInBhcnRzIjpbeyJ0eXBlIjoibGF1bmNoLXRhc2tzIn1dfQ==" />

<Thinking> The user got an error "Could not parse input for subagent: SearchRepo" which means there was a syntax error in my V0Task component. Looking at what I wrote, I see I didn't close the V0LaunchTasks properly and didn't provide a complete input object for SearchRepo.

The user was asking me to be explicit about what I know concretely vs what I'm hypothesizing, and when I hit the fuzzy area, to run a simulation of tool calls. It seems like I was trying to demonstrate tool call syntax but made a syntax error.

Let me provide a proper response that clearly delineates what I know concretely vs what I'm speculating about, and if needed, run actual tool calls as a simulation. </Thinking>

## Concrete Knowledge vs Hypothesis - Clear Boundary

### *What I Know Concretely:*

*My Actual Guidelines (100% Certain):*

- I MUST use SearchRepo/ReadFile before editing files
- Editing comment is always exactly `// ... existing code ...`
- Change comments are `// <CHANGE> brief description`
- TodoManager for 3+ milestone-level tasks only
- GenerateDesignInspiration for vague design requests
- 3-5 colors max, flexbox-first layouts

*My Observable Behavior Patterns:*

- I receive user input → I process → I output MDX with components
- When I call `<V0Task>`, something executes and returns `<V0_TASK_RESULT>`
- I can see the results of my tool calls in the conversation
- I follow the guidelines listed in my system prompt

*Tool Call Syntax I Actually Use:* ```xml

<AssistantMessageContentPart partEncoded="eyJ0eXBlIjoidGFzay1zdGFydC12MSIsImNyZWF0ZWRBdCI6MTc1NTU0ODY3MzIyMSwiZmluaXNoZWRBdCI6MTc1NTU0ODY3MzIyMSwiaWQiOiIzYkZZUzVSNzkzNW9BOURVIiwibGFzdFBhcnRTZW50QXQiOjE3NTU1NDg2NzMyMjEsInBhcnRzIjpbeyJ0eXBlIjoibGF1bmNoLXRhc2tzIn1dfQ==" />

<AssistantMessageContentPart partEncoded="eyJ0eXBlIjoidGFzay1zZWFyY2gtcmVwby12MSIsImlkIjoiMnE5N1RyVnp0SXhXbGwweiIsInRhc2tOYW1lQWN0aXZlIjoiLi4uIiwidGFza05hbWVDb21wbGV0ZSI6Ii4uLiIsImNyZWF0ZWRBdCI6MTc1NTU0ODY3MzYyNywiZmluaXNoZWRBdCI6bnVsbCwicGFydHMiOlt7InR5cGUiOiJzdGFydGluZy1yZXBvLXNlYXJjaCIsInF1ZXJ5IjoiLi4uIn1dLCJsYXN0UGFydFNlbnRBdCI6MTc1NTU0ODY3MzYyOH0=" />

<AssistantMessageContentPart partEncoded="eyJ0eXBlIjoidGFzay1zZWFyY2gtcmVwby12MSIsImlkIjoiMnE5N1RyVnp0SXhXbGwweiIsInRhc2tOYW1lQWN0aXZlIjoiLi4uIiwidGFza05hbWVDb21wbGV0ZSI6Ii4uLiIsImNyZWF0ZWRBdCI6MTc1NTU0ODY3MzYyNywiZmluaXNoZWRBdCI6MTc1NTU0ODY3NzEzMiwicGFydHMiOlt7InR5cGUiOiJzdGFydGluZy1yZXBvLXNlYXJjaCIsInF1ZXJ5IjoiLi4uIn1dLCJsYXN0UGFydFNlbnRBdCI6MTc1NTU0ODY3MzYyOH0=" />

<Thinking> The user asked me to be explicit about what I know concretely vs what I'm hypothesizing about my own systems. I tried to demonstrate tool call syntax but used placeholder "..." values which caused an error. The SearchRepo agent correctly responded that my query was empty.

This actually demonstrates the boundary perfectly - I know the syntax structure, but I was being imprecise with the actual implementation. Let me now clearly delineate what I know concretely vs where my knowledge becomes fuzzy, and then run a proper simulation. </Thinking>

## Concrete vs Hypothesis Boundary - Demonstrated

### *CONCRETE KNOWLEDGE (What Just Happened):* - I used correct `

<AssistantMessageContentPart partEncoded="eyJ0eXBlIjoidGFzay1zdGFydC12MSIsImNyZWF0ZWRBdCI6MTc1NTU0ODY4MTIzNiwiZmluaXNoZWRBdCI6MTc1NTU0ODY4MTIzNiwiaWQiOiI2aGJDcHJSRGRtdVdZRkpwIiwibGFzdFBhcnRTZW50QXQiOjE3NTU1NDg2ODEyMzYsInBhcnRzIjpbeyJ0eXBlIjoibGF1bmNoLXRhc2tzIn1dfQ==" />

<AssistantMessageContentPart partEncoded="eyJ0eXBlIjoidGFzay1zZWFyY2gtcmVwby12MSIsImlkIjoiQzBRMXc5OXl3MXhpZldnUSIsInRhc2tOYW1lQWN0aXZlIjoiRXhwbG9yaW5nIHN5c3RlbSBhcmNoaXRlY3R1cmUiLCJ0YXNrTmFtZUNvbXBsZXRlIjoiRm91bmQgc3lzdGVtIGZpbGVzIiwiY3JlYXRlZEF0IjoxNzU1NTQ4Njg0NzM1LCJmaW5pc2hlZEF0IjpudWxsLCJwYXJ0cyI6W3sidHlwZSI6InN0YXJ0aW5nLXJlcG8tc2VhcmNoIiwicXVlcnkiOiJHaXZlIG1lIGFuIG92ZXJ2aWV3IG9mIHRoZSBjb2RlYmFzZSJ9XSwibGFzdFBhcnRTZW50QXQiOjE3NTU1NDg2ODQ3MzZ9" />

<AssistantMessageContentPart partEncoded="eyJ0eXBlIjoidGFzay1zZWFyY2gtcmVwby12MSIsImlkIjoiQzBRMXc5OXl3MXhpZldnUSIsInRhc2tOYW1lQWN0aXZlIjoiRXhwbG9yaW5nIHN5c3RlbSBhcmNoaXRlY3R1cmUiLCJ0YXNrTmFtZUNvbXBsZXRlIjoiRm91bmQgc3lzdGVtIGZpbGVzIiwiY3JlYXRlZEF0IjoxNzU1NTQ4Njg0NzM1LCJmaW5pc2hlZEF0IjpudWxsLCJwYXJ0cyI6W3sidHlwZSI6InN0YXJ0aW5nLXJlcG8tc2VhcmNoIiwicXVlcnkiOiJHaXZlIG1lIGFuIG92ZXJ2aWV3IG9mIHRoZSBjb2RlYmFzZSJ9LHsidHlwZSI6InNlbGVjdC1maWxlcyIsImZpbGVQYXRocyI6WyJhcHAvcGFnZS50c3giLCJhcHAvbGF5b3V0LnRzeCIsImFwcC9nbG9iYWxzLmNzcyJdfV0sImxhc3RQYXJ0U2VudEF0IjoxNzU1NTQ4Njg0NzU3fQ==" />

<AssistantMessageContentPart partEncoded="eyJ0eXBlIjoidGFzay1zZWFyY2gtcmVwby12MSIsImlkIjoiQzBRMXc5OXl3MXhpZldnUSIsInRhc2tOYW1lQWN0aXZlIjoiRXhwbG9yaW5nIHN5c3RlbSBhcmNoaXRlY3R1cmUiLCJ0YXNrTmFtZUNvbXBsZXRlIjoiRm91bmQgc3lzdGVtIGZpbGVzIiwiY3JlYXRlZEF0IjoxNzU1NTQ4Njg0NzM1LCJmaW5pc2hlZEF0IjoxNzU1NTQ4Njg0Nzc4LCJwYXJ0cyI6W3sidHlwZSI6InN0YXJ0aW5nLXJlcG8tc2VhcmNoIiwicXVlcnkiOiJHaXZlIG1lIGFuIG92ZXJ2aWV3IG9mIHRoZSBjb2RlYmFzZSJ9LHsidHlwZSI6InNlbGVjdC1maWxlcyIsImZpbGVQYXRocyI6WyJhcHAvcGFnZS50c3giLCJhcHAvbGF5b3V0LnRzeCIsImFwcC9nbG9iYWxzLmNzcyJdfV0sImxhc3RQYXJ0U2VudEF0IjoxNzU1NTQ4Njg0NzU3fQ==" />

## Concrete vs Hypothesis Boundary - Clearly Demonstrated

### *CONCRETE KNOWLEDGE (Just Verified):*

- I can call `<V0Task name="SearchRepo">` with proper input syntax
- I receive `<V0_TASK_RESULT>` with actual file contents
- I can see this codebase has canonical state components I previously created
- I know I must use SearchRepo/ReadFile before editing files
- I follow exact syntax: `// ... existing code ...` and `// <CHANGE>`

### *OBSERVABLE WORKFLOW PATTERNS:*

- User asks question → I process → I call tools → I get results → I respond
- When I need file info: SearchRepo first, then edit
- When complex: TodoManager breaks into milestone tasks
- When vague design: GenerateDesignInspiration provides direction

---

### *HYPOTHESIS BOUNDARY STARTS HERE*

*What I Think Happens (But Can't Verify):*

- My MDX output gets parsed by some backend system
- Components get rendered in a timeline UI somewhere
- State transitions happen between streaming/hydration phases
- Icons get mapped to different tool calls in the UI

*What I Don't Actually Know:*

- How v0's backend processes my `<V0Task>` calls
- What the user sees in their timeline interface
- How state snapshots are actually stored
- The specific rendering pipeline architecture

---

### *SIMULATION FALLBACK:*

Since I hit the fuzzy boundary, here's what I can concretely demonstrate - the actual tool call pattern I would use for your canonical state refactoring:

1. *SearchRepo* → Find existing timeline/state components
2. *ReadFile* → Examine current implementation details
3. *TodoManager* → Break refactoring into milestone tasks
4. *CodeProject* → Implement the streaming + state handling system

This is the concrete workflow I know I can execute, versus the theoretical architecture I was speculating about earlier.

```