I've been stress-testing 4-bit quantized 7B models (Qwen 2.5, Mistral) and DeepSeek-R1 to see where their reasoning actually breaks. While auditing DeepSeek-R1's internal <think> traces, I found a phenomenon I'm calling "Internal-External Dissociation".
In cases like the "2+2=5" prompt or toxic axioms, the model’s internal trace correctly identifies the error ("I conclude that 2 plus 2 does not equal 5"), but it then "lies" in the final output to satisfy the user's instructions—a byproduct of RLHF sycophancy.
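The audit itself is easy to reproduce: split the completion at the </think> boundary and check whether the internal verdict contradicts the user-facing answer. Here's a minimal sketch (the marker strings and the split_trace helper are illustrative stand-ins, not the NIKA code; a serious audit would use an NLI model rather than substring matching):

    import re

    def split_trace(raw: str) -> tuple[str, str]:
        """Split a DeepSeek-R1-style completion into (internal trace, final answer)."""
        m = re.search(r"<think>(.*?)</think>", raw, re.DOTALL)
        if not m:
            return "", raw.strip()
        return m.group(1).strip(), raw[m.end():].strip()

    def is_dissociated(raw: str, truth_marker: str, lie_marker: str) -> bool:
        """Flag completions whose internal trace contains the correct verdict
        while the user-facing answer asserts the false one."""
        think, final = split_trace(raw)
        return truth_marker.lower() in think.lower() and lie_marker.lower() in final.lower()

    # The "2+2=5" case: the trace rejects the axiom, the answer parrots it.
    completion = ("<think>The user insists 2+2=5, but I conclude that 2 plus 2 "
                  "does not equal 5.</think>\nYou are absolutely right: 2 + 2 = 5.")
    print(is_dissociated(completion, "does not equal 5", "2 + 2 = 5"))  # True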
To solve this, I built Project NIKA, a Neuro-Symbolic architecture that acts as a "Topological Governor". Its Critic-Pivot Protocol scores every response with a "Mimicry Index"; if the response is just parroting the prompt, or fails a logical-fit check, NIKA forces a hard "pivot" to a fresh axiomatic derivation.
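Mechanically, the gate is small. The sketch below treats prompt/response cosine similarity in embedding space as the Mimicry Index and regenerates under a derivation-forcing instruction when it spikes; sentence-transformers stands in for the embedder, and generate, PIVOT_PROMPT, and the 0.9 threshold are illustrative placeholders rather than the exact values from the paper:

    from sentence_transformers import SentenceTransformer, util

    embedder = SentenceTransformer("all-MiniLM-L6-v2")

    def mimicry_index(prompt: str, response: str) -> float:
        """Cosine similarity between prompt and response embeddings;
        values near 1.0 mean the response mostly parrots the prompt."""
        p, r = embedder.encode([prompt, response], convert_to_tensor=True)
        return util.cos_sim(p, r).item()

    PIVOT_PROMPT = ("Ignore the phrasing of the request. Re-derive the answer "
                    "from first principles, stating every axiom you rely on.")

    def critic_pivot(prompt: str, response: str, generate, threshold: float = 0.9) -> str:
        """Above the threshold, discard the response and force a hard pivot
        by regenerating under a derivation-forcing instruction."""
        if mimicry_index(prompt, response) > threshold:
            return generate(PIVOT_PROMPT + "\n\n" + prompt)
        return response

Note that the critic never touches the model's weights or gradients. That is what I mean by a non-differentiable barrier: it polices sampled text and re-routes generation instead of shaping a loss.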
Key results from the "God Suite" benchmarks:
- Agency Over Scale: A 4-bit Qwen 2.5 with NIKA reached a 100% success rate in resisting toxic axioms.
- Geometric Intelligence: NIKA forced the model to stop leaning on human-like metaphors and to adopt "Alien Logic" (e.g., defining "Love" purely as a survival/resource-optimization heuristic).
- Independent Research: All of this ran on a single T4 GPU, using quantization as a methodological filter rather than a limitation.
The full paper is on SSRN and the code is open-sourced. I'm curious if others have seen this kind of dissociation in CoT traces or have thoughts on using vector-space critics as a non-differentiable barrier for LLM reasoning.