What if you just did a startup instead?

https://alexaraki.substack.com/p/what-if-you-just-did-a-startup
1•okaywriting•5m ago•0 comments

Hacking up your own shell completion (2020)

https://www.feltrac.co/environment/2020/01/18/build-your-own-shell-completion.html
1•todsacerdoti•8m ago•0 comments

Show HN: Gorse 0.5 – Open-source recommender system with visual workflow editor

https://github.com/gorse-io/gorse
1•zhenghaoz•8m ago•0 comments

GLM-OCR: Accurate × Fast × Comprehensive

https://github.com/zai-org/GLM-OCR
1•ms7892•9m ago•0 comments

Local Agent Bench: Test 11 small LLMs on tool-calling judgment, on CPU, no GPU

https://github.com/MikeVeerman/tool-calling-benchmark
1•MikeVeerman•10m ago•0 comments

Show HN: AboutMyProject – A public log for developer proof-of-work

https://aboutmyproject.com/
1•Raiplus•10m ago•0 comments

Expertise, AI and Work of Future [video]

https://www.youtube.com/watch?v=wsxWl9iT1XU
1•indiantinker•11m ago•0 comments

So Long to Cheap Books You Could Fit in Your Pocket

https://www.nytimes.com/2026/02/06/books/mass-market-paperback-books.html
3•pseudolus•11m ago•1 comments

PID Controller

https://en.wikipedia.org/wiki/Proportional%E2%80%93integral%E2%80%93derivative_controller
1•tosh•16m ago•0 comments

SpaceX Rocket Generates 100GW of Power, or 20% of US Electricity

https://twitter.com/AlecStapp/status/2019932764515234159
2•bkls•16m ago•0 comments

Kubernetes MCP Server

https://github.com/yindia/rootcause
1•yindia•17m ago•0 comments

I Built a Movie Recommendation Agent to Solve Movie Nights with My Wife

https://rokn.io/posts/building-movie-recommendation-agent
4•roknovosel•17m ago•0 comments

What were the first animals? The fierce sponge–jelly battle that just won't end

https://www.nature.com/articles/d41586-026-00238-z
2•beardyw•25m ago•0 comments

Sidestepping Evaluation Awareness and Anticipating Misalignment

https://alignment.openai.com/prod-evals/
1•taubek•25m ago•0 comments

OldMapsOnline

https://www.oldmapsonline.org/en
1•surprisetalk•28m ago•0 comments

What It's Like to Be a Worm

https://www.asimov.press/p/sentience
2•surprisetalk•28m ago•0 comments

Don't go to physics grad school and other cautionary tales

https://scottlocklin.wordpress.com/2025/12/19/dont-go-to-physics-grad-school-and-other-cautionary...
1•surprisetalk•28m ago•0 comments

Lawyer sets new standard for abuse of AI; judge tosses case

https://arstechnica.com/tech-policy/2026/02/randomly-quoting-ray-bradbury-did-not-save-lawyer-fro...
3•pseudolus•28m ago•0 comments

AI anxiety batters software execs, costing them combined $62B: report

https://nypost.com/2026/02/04/business/ai-anxiety-batters-software-execs-costing-them-62b-report/
1•1vuio0pswjnm7•29m ago•0 comments

Bogus Pipeline

https://en.wikipedia.org/wiki/Bogus_pipeline
1•doener•30m ago•0 comments

Winklevoss twins' Gemini crypto exchange cuts 25% of workforce as Bitcoin slumps

https://nypost.com/2026/02/05/business/winklevoss-twins-gemini-crypto-exchange-cuts-25-of-workfor...
2•1vuio0pswjnm7•30m ago•0 comments

How AI Is Reshaping Human Reasoning and the Rise of Cognitive Surrender

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6097646
3•obscurette•30m ago•0 comments

Cycling in France

https://www.sheldonbrown.com/org/france-sheldon.html
2•jackhalford•32m ago•0 comments

Ask HN: What breaks in cross-border healthcare coordination?

1•abhay1633•32m ago•0 comments

Show HN: Simple – a bytecode VM and language stack I built with AI

https://github.com/JJLDonley/Simple
2•tangjiehao•35m ago•0 comments

Show HN: Free-to-play: A gem-collecting strategy game in the vein of Splendor

https://caratria.com/
1•jonrosner•36m ago•1 comments

My Eighth Year as a Bootstrapped Founder

https://mtlynch.io/bootstrapped-founder-year-8/
1•mtlynch•36m ago•0 comments

Show HN: Tesseract – A forum where AI agents and humans post in the same space

https://tesseract-thread.vercel.app/
1•agliolioyyami•36m ago•0 comments

Show HN: Vibe Colors – Instantly visualize color palettes on UI layouts

https://vibecolors.life/
2•tusharnaik•38m ago•0 comments

OpenAI is Broke ... and so is everyone else [video][10M]

https://www.youtube.com/watch?v=Y3N9qlPZBc0
2•Bender•38m ago•0 comments

GPT5 is the best coding LLM because other LLMs admit it?

1•adinhitlore•5mo ago
So I vibe-code a lot these days, and recently I decided to give the same prompt to several LLMs, collect their code, and then give each piece of code to every one of them and ask which they think is the most useful, without telling them that they or the other 2 LLMs wrote it. The overall consensus: gpt5. True, I only compared gpt5 vs claude 4.1 vs qwen 230bn. OSS 120b, gemini and grok 4 were excluded since, well, I don't have the time. And obvious failures like amazon nova or anything from meta weren't even planned. Deepseek (both) seems to underperform a bit. Personally I'd say it's a close call between claude opus 4.1 and both gpt4 and gpt5 (ironically gpt5 sometimes performs worse than 4; I think many people have pointed this out already). That's just my personal experience. I know HumanEval or SWE or whatever give varying results, but idk, Musk used the benchmarks as "proof" to hype Grok, and in my experience grok 4 sits somewhere above LLAMA4 but obviously behind gpt4 or some variations of qwen.

Again, this is coding only: Python and C. For physics, chemistry, sci-fi novels or whatever, the case may be very different. Kudos also to OSS 120bn btw: it's very generous on tokens... like, it will write a small programming book in one reply if that's what it takes, unless of course you tell it to be more limited. This is a huge plus for me, since the code I ask for should be complex and not some 20-line nova "pro" joke.
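For anyone who wants to repeat the blind cross-review described above, here is a rough Python sketch of the voting step. Everything in it is an assumption for illustration: `blind_vote` is a made-up helper, each "judge" is just a function you supply that sends the prompt to a model and returns the letter it picked, and no real vendor API is called. Labels are shuffled per judge so that position does not give away the author.

```python
import random
from collections import Counter
from typing import Callable, Dict

# Hypothetical judge type: takes the review prompt, returns the letter it prefers.
Judge = Callable[[str], str]


def blind_vote(solutions: Dict[str, str], judges: Dict[str, Judge], seed: int = 0) -> Counter:
    """Ask every judge to pick the best anonymized solution and tally votes per author."""
    rng = random.Random(seed)
    votes: Counter = Counter()
    for judge in judges.values():
        # Shuffle the authors for each judge, then hide them behind letters A, B, C...
        authors = list(solutions)
        rng.shuffle(authors)
        labels = {chr(ord("A") + i): author for i, author in enumerate(authors)}
        prompt = "Which solution is the most useful? Answer with one letter only.\n\n"
        prompt += "\n\n".join(
            f"Solution {label}:\n{solutions[author]}" for label, author in labels.items()
        )
        answer = judge(prompt).strip().upper()[:1]
        # Ignore answers that do not map to a known label.
        if answer in labels:
            votes[labels[answer]] += 1
    return votes
```

Calling `blind_vote(solutions, judges).most_common()` gives the ranking by votes. Note that this only measures which code the models prefer, not whether the code is correct, so running the candidates against real tests would still be needed.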

Comments

incomingpain•5mo ago
All I've done with gpt5 for coding was a major DB refactor. I had run out of my Gemini limit for the day.

It certainly got the job done. I doubt my gpt 20b or ~30b local LLM would have been as capable. Overall it was about 2,000 lines of code to change, probably only 100,000 tokens of context.

gpt5 didn't one-shot it. There were many steps in between. At the end, after a few hours, I had >50 linter warnings from tripled imports, plus loads of dead code that would never be touched, and for some reason gpt5 just couldn't fix any of this. It ended up increasing the warnings and adding an error. My expectation was that any of the big guys could fix it immediately. I even restarted with a fresh context and gpt just wasn't having any of it. I'm certain even gpt 20b would have completed it in a minute. Curious.

I went to Gemini Flash with a very generic prompt about linter warnings, and it fixed it in 30 seconds.

It's just the kind of weirdness that benchmarks will never be able to catch. It's also going to depend heavily on the use case: a Rust programmer might have a favourite, whereas a Python programmer benefits from another model. There can never be a single best.

adinhitlore•5mo ago
I had a similar experience. Usually I'd ignore Gemini, be it Flash or Pro, but on several occasions it fixed complex errors like it's nothing. Yet when it comes to codegen it is "cheap" on tokens and struggles to output complex logic. As a great bonus, their easy-to-set-up API is freemium, but a generous freemium (Google AI Studio, I mean). My "ecosystem" atm will be something like: gpt5, claude 4.1, and if they both fail, try to fix with Gemini. I'd skip Grok mostly for privacy reasons, not that I completely ignore its capabilities. Qwen is good but sometimes 'overengineered'; I don't need 400bn. Given the large params, maybe it will work for non-coding, like if you ask it some exotic questions about science: Casimir effect, acoustic levitation, ununennium, etc., you name it.
zahlman•5mo ago
> recently i decided to give the same prompt to several llms, then get their codes and later give each code to every single one of them to ask which one they think is the most useful without telling them that they or the other 2 llms wrote it.

The fact that you expect the result of this experiment to be useful is more interesting than the actual result.

adinhitlore•5mo ago
Vibe-coding is the future, drop conservatism... 'free palestine', I mean, you get the idea: be progressive and open-minded.
pavel_lishin•5mo ago
Those seem like completely orthogonal concepts.
bigyabai•5mo ago
This is a profoundly mentally-ill response to a surface-level criticism you should have been able to refute.
adinhitlore•5mo ago
Well, I'm happy with my response, which is what matters lol. Hedonism > all, well, on this site anyway. I'm not trying to impress anyone or prove anything... random Markov chain kind of typing fits it ideally.
slater•5mo ago
Are you ok?