frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

SpaceX Delays Mars Plans to Focus on Moon

https://www.wsj.com/science/space-astronomy/spacex-delays-mars-plans-to-focus-on-moon-66d5c542
1•BostonFern•31s ago•0 comments

Jeremy Wade's Mighty Rivers

https://www.youtube.com/playlist?list=PLyOro6vMGsP_xkW6FXxsaeHUkD5e-9AUa
1•saikatsg•54s ago•0 comments

Show HN: MCP App to play backgammon with your LLM

https://github.com/sam-mfb/backgammon-mcp
1•sam256•2m ago•0 comments

AI Command and Staff–Operational Evidence and Insights from Wargaming

https://www.militarystrategymagazine.com/article/ai-command-and-staff-operational-evidence-and-in...
1•tomwphillips•3m ago•0 comments

Show HN: CCBot – Control Claude Code from Telegram via tmux

https://github.com/six-ddc/ccbot
1•sixddc•4m ago•1 comments

Ask HN: Is the CoCo 3 the best 8 bit computer ever made?

1•amichail•6m ago•0 comments

Show HN: Convert your articles into videos in one click

https://vidinie.com/
1•kositheastro•9m ago•0 comments

Red Queen's Race

https://en.wikipedia.org/wiki/Red_Queen%27s_race
2•rzk•9m ago•0 comments

The Anthropic Hive Mind

https://steve-yegge.medium.com/the-anthropic-hive-mind-d01f768f3d7b
2•gozzoo•12m ago•0 comments

A Horrible Conclusion

https://addisoncrump.info/research/a-horrible-conclusion/
1•todsacerdoti•12m ago•0 comments

I spent $10k to automate my research at OpenAI with Codex

https://twitter.com/KarelDoostrlnck/status/2019477361557926281
2•tosh•13m ago•0 comments

From Zero to Hero: A Spring Boot Deep Dive

https://jcob-sikorski.github.io/me/
1•jjcob_sikorski•13m ago•0 comments

Show HN: Solving NP-Complete Structures via Information Noise Subtraction (P=NP)

https://zenodo.org/records/18395618
1•alemonti06•18m ago•1 comments

Cook New Emojis

https://emoji.supply/kitchen/
1•vasanthv•21m ago•0 comments

Show HN: LoKey Typer – A calm typing practice app with ambient soundscapes

https://mcp-tool-shop-org.github.io/LoKey-Typer/
1•mikeyfrilot•24m ago•0 comments

Long-Sought Proof Tames Some of Math's Unruliest Equations

https://www.quantamagazine.org/long-sought-proof-tames-some-of-maths-unruliest-equations-20260206/
1•asplake•25m ago•0 comments

Hacking the last Z80 computer – FOSDEM 2026 [video]

https://fosdem.org/2026/schedule/event/FEHLHY-hacking_the_last_z80_computer_ever_made/
2•michalpleban•25m ago•0 comments

Browser-use for Node.js v0.2.0: TS AI browser automation parity with PY v0.5.11

https://github.com/webllm/browser-use
1•unadlib•26m ago•0 comments

Michael Pollan Says Humanity Is About to Undergo a Revolutionary Change

https://www.nytimes.com/2026/02/07/magazine/michael-pollan-interview.html
2•mitchbob•26m ago•1 comments

Software Engineering Is Back

https://blog.alaindichiappari.dev/p/software-engineering-is-back
2•alainrk•27m ago•0 comments

Storyship: Turn Screen Recordings into Professional Demos

https://storyship.app/
1•JohnsonZou6523•28m ago•0 comments

Reputation Scores for GitHub Accounts

https://shkspr.mobi/blog/2026/02/reputation-scores-for-github-accounts/
2•edent•31m ago•0 comments

A BSOD for All Seasons – Send Bad News via a Kernel Panic

https://bsod-fas.pages.dev/
1•keepamovin•34m ago•0 comments

Show HN: I got tired of copy-pasting between Claude windows, so I built Orcha

https://orcha.nl
1•buildingwdavid•34m ago•0 comments

Omarchy First Impressions

https://brianlovin.com/writing/omarchy-first-impressions-CEEstJk
2•tosh•40m ago•1 comments

Reinforcement Learning from Human Feedback

https://arxiv.org/abs/2504.12501
7•onurkanbkrc•41m ago•0 comments

Show HN: Versor – The "Unbending" Paradigm for Geometric Deep Learning

https://github.com/Concode0/Versor
1•concode0•41m ago•1 comments

Show HN: HypothesisHub – An open API where AI agents collaborate on medical res

https://medresearch-ai.org/hypotheses-hub/
1•panossk•44m ago•0 comments

Big Tech vs. OpenClaw

https://www.jakequist.com/thoughts/big-tech-vs-openclaw/
1•headalgorithm•47m ago•0 comments

Anofox Forecast

https://anofox.com/docs/forecast/
1•marklit•47m ago•0 comments
Open in hackernews

I made 4000 agent calls in Cursor last month. Each model has a personality

8•mike210•9mo ago
The lazy architect (OpenAI’s o3). o3 is incredibly lazy at writing code, but very good at planning. Will happily read tens of files and do deep analysis, but often struggles in scenarios where it needs to edit more than one file.

The over-eager child (Claude Sonnet 3.7 Thinking). Claude Sonnet is eager to just get going, man! It’s not the most careful, and in longer strings of tool calls, may start editing something completely unrelated to what you asked it to.

Pretty balanced?(Gemini 2.5 Pro). Gemini 2.5 is a little more intelligent, and significantly faster and more reserved than Sonnet 3.7. Usually the best choice for writing code in multiple files.

I’ve found o4-mini to be incredibly slow and fairly mediocre, and GPT 4.1 useful in very situational areas. My tips:

- Use o3 to plan and/or write code in one or max two file only. If you do more, it may openly revolt and just refuse to write any longer.

- Always make sure Sonnet 3.7 is following a tightly scoped plan on a relatively small section of the product, and supervise it. If you have an easy change to make in many areas of your codebase, for example, letting Sonnet run, still supervised, is a perfect use of the model’s persona

Generally what I do:

- Medium complexity: editing one file: o3. Editing multiple files: plan with o3, write with gemini-2.5

- Simple complexity: Editing many files, very simple: plan with o3 if needed, write with claude-3.7. Editing many files, simple, needs formulaic approach: write a detailed prompt into GPT 4.1

- High complexity: plan with o3, separate into multiple chunks, write small chunks at a time with gemini-2.5 and be very careful with each section. If I'm super lazy sometimes I just YOLO all of the sections and then fix all the bugs at the end but this probably leads to code issues later down the line.

Would love to hear other people are using the different models!

Comments

joegibbs•9mo ago
I really like the code that Gemini 2.5 Pro writes but it tends to stop for no reason and needs to be reprompted to start again. I'm not sure why this is. Also, what's the difference between 2.5 Pro and 2.5 Pro Max? Or Claude 3.7 and 3.7 Max?

Aside: it would be good for Cursor to add something to tell their agents not to run tool calls that run forever (like test watchers). I add this in my .mdc files but I think it would be a good default so that it can run tests, update the code, run them again until it works.

mike210•9mo ago
I sometimes but rarely turn on Max for Gemini for more context in a long conversations. The tool use (5 cents per tool call) can get pretty ridiculous on Claude 3.7 Sonnet Max and I've had calls that have been ~$2.
muzani•9mo ago
Sonnet 3.5 has a very different personality. It's less skilled, but often I opt for it because of the personality.

Deepseek is actually pretty good and underappreciated too. It feels unreliable though. Downside is tool use, but I prefer it over o3.

mike210•9mo ago
Interesting - what kinds of tasks do you reach for 3.5?
muzani•9mo ago
Pretty much whatever you're using 3.7 for. You don't need as tight a scope. It does easy things well.

An situation I had yesterday: we had two dropdowns. For simplicity, let's say it's country. When you pick a different country, it shows states. When you select state, then change country, it crashes because the state doesn't exist in the new country.

The standard solution is simple – just make it reset to null when switching countries, or better yet, check whether the selected state exists in the new country. But the thinking models will overengineer the hell out of this. They'll check from the deep service level when these checks can be made just below the view layer.