
Do you prefer GPT or Google Gemini?

2•AaronSwift1•5mo ago
and why

Comments

fbhabbed•5mo ago
GPT5-Thinking if I need a precise answer with the fewest possible mistakes.

GPT5-Pro is the real deal.

Gemini if I need creative insights and a pleasant talk, but this comes at the cost of more mistakes (and it's hella stubborn).

Hopefully Gemini 3.0 will fix this.

Topfi•5mo ago
Generally, I currently rely on GPT-5 Thinking (Medium Reasoning) for most tasks because, of all the models currently available, the GPT-5 series (and GPT-4.1 before that) has been the most reliable in following instructions to the letter, doing no less but, more importantly, no more.

Both Claude models (from 3.5 Sonnet to 4.1 Opus) and Gemini 2.5 Pro have historically taken a lot more liberties, which some users find appealing, but which I have come to not want when relying on a model for consistent output. I can see why some find great value in a model already implementing e.g. an auth provider when asked for the frontend of a login page, but when guiding a model, I personally prefer that something not happen if I didn't explicitly request it. Claude especially, as part of agentic coding workflows, has a tendency to simply try e.g. a different package than what was requested, which some users may not notice. I found it very funny when Claude 4 Sonnet once fully reimplemented an infinite canvas because @xyflow failed to install properly. I'd rather a model error out there or ask the user in the loop to confirm.

In regard to instruction following, while all three frontier providers do well with their context windows, I still prefer the GPT-5 models a bit, despite their having only 400k vs 1 million tokens, simply because what is in the context is not just recalled, but reliably adhered to as well.

GPT-5 also seems a bit better regarding CSS, though I have far too limited UI taste to actually make a solid judgement on that front, and styling is of course subjective.

Additionally, when benchmarking all three frontier models side by side, I have yet to find a coding task that GPT-5 cannot solve but the others can. I did however find certain cases where my initial instructions were lacking or not comprehensive enough, leading to all three having issues completing a task. In those cases, I found that Gemini 2.5 Pro, when provided with the code base, does best at rewriting an existing prompt. The rewritten prompts then usually have far higher success rates when provided to GPT-5 Thinking (Medium Reasoning) or, to a lesser extent, to one of the Claude 4 models. However, these Gemini-provided prompts also occasionally contain inventions/hallucinations, so I must always triple check prompts when doing this.

For context, the main coding problem I am using model assistance for at the moment is some poorly designed, Figma-inspired real-time syncing code with some overly odd edge cases, courtesy of my limited skillset.

For non-coding stuff, I have in the last semester mainly relied on Gemini 2.5 (sometimes Pro, often Flash) for creating nice summaries of lectures. I found any other models (doesn't matter whether OpenAI's previous models or anything from Anthropic, Mistral, Deepseek, Qwen, etc.) less suitable, mainly because these tended to output far too condensed summaries, often truncating what is absolutely vital information. Gemini models are far more willing to actually output an extensive, maybe a bit too verbose summary, but I'd rather remove a few lines. I haven't yet gotten enough experience with GPT-5 as a summarization tool as the semester is only just starting, so I cannot say how well OpenAI's newest series does there, but from very limited experience, GPT-5-mini has the potential to replace Gemini 2.5 Flash as my go-to here.