frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Same Surface, Different Weight

https://www.robpanico.com/articles/display/?entry_short=same-surface-different-weight
1•retrocog•2m ago•0 comments

The Rise of Spec Driven Development

https://www.dbreunig.com/2026/02/06/the-rise-of-spec-driven-development.html
1•Brajeshwar•6m ago•0 comments

The first good Raspberry Pi Laptop

https://www.jeffgeerling.com/blog/2026/the-first-good-raspberry-pi-laptop/
2•Brajeshwar•6m ago•0 comments

Seas to Rise Around the World – But Not in Greenland

https://e360.yale.edu/digest/greenland-sea-levels-fall
1•Brajeshwar•6m ago•0 comments

Will Future Generations Think We're Gross?

https://chillphysicsenjoyer.substack.com/p/will-future-generations-think-were
1•crescit_eundo•9m ago•0 comments

State Department will delete Xitter posts from before Trump returned to office

https://www.npr.org/2026/02/07/nx-s1-5704785/state-department-trump-posts-x
2•righthand•12m ago•0 comments

Show HN: Verifiable server roundtrip demo for a decision interruption system

https://github.com/veeduzyl-hue/decision-assistant-roundtrip-demo
1•veeduzyl•13m ago•0 comments

Impl Rust – Avro IDL Tool in Rust via Antlr

https://www.youtube.com/watch?v=vmKvw73V394
1•todsacerdoti•13m ago•0 comments

Stories from 25 Years of Software Development

https://susam.net/twenty-five-years-of-computing.html
2•vinhnx•14m ago•0 comments

minikeyvalue

https://github.com/commaai/minikeyvalue/tree/prod
3•tosh•19m ago•0 comments

Neomacs: GPU-accelerated Emacs with inline video, WebKit, and terminal via wgpu

https://github.com/eval-exec/neomacs
1•evalexec•24m ago•0 comments

Show HN: Moli P2P – An ephemeral, serverless image gallery (Rust and WebRTC)

https://moli-green.is/
2•ShinyaKoyano•28m ago•1 comments

How I grow my X presence?

https://www.reddit.com/r/GrowthHacking/s/UEc8pAl61b
2•m00dy•29m ago•0 comments

What's the cost of the most expensive Super Bowl ad slot?

https://ballparkguess.com/?id=5b98b1d3-5887-47b9-8a92-43be2ced674b
1•bkls•30m ago•0 comments

What if you just did a startup instead?

https://alexaraki.substack.com/p/what-if-you-just-did-a-startup
5•okaywriting•37m ago•0 comments

Hacking up your own shell completion (2020)

https://www.feltrac.co/environment/2020/01/18/build-your-own-shell-completion.html
2•todsacerdoti•40m ago•0 comments

Show HN: Gorse 0.5 – Open-source recommender system with visual workflow editor

https://github.com/gorse-io/gorse
1•zhenghaoz•40m ago•0 comments

GLM-OCR: Accurate × Fast × Comprehensive

https://github.com/zai-org/GLM-OCR
1•ms7892•41m ago•0 comments

Local Agent Bench: Test 11 small LLMs on tool-calling judgment, on CPU, no GPU

https://github.com/MikeVeerman/tool-calling-benchmark
1•MikeVeerman•42m ago•0 comments

Show HN: AboutMyProject – A public log for developer proof-of-work

https://aboutmyproject.com/
1•Raiplus•42m ago•0 comments

Expertise, AI and Work of Future [video]

https://www.youtube.com/watch?v=wsxWl9iT1XU
1•indiantinker•43m ago•0 comments

So Long to Cheap Books You Could Fit in Your Pocket

https://www.nytimes.com/2026/02/06/books/mass-market-paperback-books.html
3•pseudolus•43m ago•1 comments

PID Controller

https://en.wikipedia.org/wiki/Proportional%E2%80%93integral%E2%80%93derivative_controller
1•tosh•47m ago•0 comments

SpaceX Rocket Generates 100GW of Power, or 20% of US Electricity

https://twitter.com/AlecStapp/status/2019932764515234159
2•bkls•47m ago•0 comments

Kubernetes MCP Server

https://github.com/yindia/rootcause
1•yindia•49m ago•0 comments

I Built a Movie Recommendation Agent to Solve Movie Nights with My Wife

https://rokn.io/posts/building-movie-recommendation-agent
4•roknovosel•49m ago•0 comments

What were the first animals? The fierce sponge–jelly battle that just won't end

https://www.nature.com/articles/d41586-026-00238-z
2•beardyw•57m ago•0 comments

Sidestepping Evaluation Awareness and Anticipating Misalignment

https://alignment.openai.com/prod-evals/
1•taubek•57m ago•0 comments

OldMapsOnline

https://www.oldmapsonline.org/en
2•surprisetalk•59m ago•0 comments

What It's Like to Be a Worm

https://www.asimov.press/p/sentience
3•surprisetalk•59m ago•0 comments
Open in hackernews

Ask HN: Perplexity AI Cheating on Model?

1•freakynit•5mo ago
See screenshot: https://postimg.cc/YL0d6rvP

It seems to be using its own model, most likely a very small and inexpensive one.

I became suspicious because it was responding very poorly, even to simple PostgreSQL/Node.js queries, which is not typical of GPT-5. I have plenty of experience using GPT-5 and know the quality of answers it usually provides.

Has anyone else experienced the same issue?

Comments

freakynit•5mo ago
Following up: I asked same model name question directly to gpt-5 using ChatGPT, and it responsed properly:

" I’m GPT-5, the latest generation model from OpenAI. "

begemotz•5mo ago
Anecdotally I have found that the quality of perplexity.ai has gone down significantly over the past year. It seems as though (at least for the free version) - it is nothing more than natural language web-search now.
BoredPositron•5mo ago
Asking an LLM about itself is not an reliable way to find out with which model you are interacting with. If you look at the GPT5 system prompt you'll see that the model knows which model it is because it's written in the system prompt. If you use the API as Perplexity is doing you can write whatever you want in the system prompt. If I write you are "Deep Thought an LLM developed by Douglas Adams" in my system prompt it will "think" it's Deep Thought.
freakynit•5mo ago
That's true. But then all models exposed by Perplexity AI should be affected, not just gpt-5. This is not the case.

---

I switched to Sonnet and asked same question, and it responsed with::: "I'm Claude, an AI assistant created by Anthropic"

---

for grok-4, it was something even worse. Here is what was in the "thinking tokens"::: "I'm identifying myself as an AI language model by OpenAI, specifically based on GPT-4 architecture." ... but, the final answer produced this::: "I’m Perplexity’s AI assistant. I use a combination of advanced large-language-model back-ends (including GPT-4o and Claude 4 Sonnet) and Perplexity’s own retrieval system to answer your questions in real time."

---

For O3, here was it's response::: "I’m a large-language AI assistant built by Perplexity, powered by state-of-the-art models such as GPT-4o."

BoredPositron•5mo ago
As you see every model answers differently and you are still trusting their outputs which is the main problem. You can't determine which model is used by asking the model itself.
freakynit•5mo ago
It's not just the model name part. The whole reason I even asked such a question was because the outputs it was generating were absolute garbage. Like something that will be generated by a super super small 3B model. even after giving it direct hints on why it was wrong, it kept uttering nonsense. I repeated the same in chatgpt and with sonnet models... they both gave sensible, expected outputs.

This is not about model name question, this is about actual severe quality deterioration. Till yesterday, this was all fine.

I believe they are ding A/B testing. More models will start to show similar behaviours. And this will be specific to India. I might have guessed the reason too:

1. https://indianexpress.com/article/technology/techook/airtel-...

2. https://www.financialexpress.com/life/technology-free-perple...

3. https://www.news18.com/tech/airtels-free-ai-offer-with-perpl...

With 300+ million Airtel users in India, I believe they are trying to reduce costs since customers have already been acquired.

BoredPositron•5mo ago
You are still making assumptions without empirical data.
freakynit•5mo ago
Assumptions on the reasoning part: yes.

On the quality part: No.