OpenAI scores gold in one of the top programming competitions

https://www.msn.com/en-xl/news/other/openai-scores-gold-in-one-of-the-world-s-top-programming-competitions/ar-AA1KknUL
13•energy123•5mo ago

Comments

NitpickLawyer•5mo ago
So in the past month we've had:

- gold at the IMO

- gold at the IOI

- beat 9/10 humans in AtCoder Heuristics

- longer context, better models, routing calls to cheaper models, 4-6x cheaper inference for 90% of the top models' capabilities

- longer agentic sessions while staying coherent and solving tasks (30-90 min)

Yet every other post here is about "bubble this", "winter that", "plateauing this", "wall that"...

Are we in the denial stage, or the bargaining stage? Can't quite tell...

robertlagrant•5mo ago
You might've said the same thing about self-driving cars five years ago, or chess even longer ago. It turns out chess was soluble, so the nay-sayers were wrong, but self-driving cars aren't soluble (yet), so the yay-sayers were wrong.
energy123•5mo ago
People use low-compute models in their day-to-day jobs. They're not exposed to how well the very-high-compute runs are doing at the moment.
machiaweliczny•5mo ago
This. My younger brother thinks it's crap, but if you know the state of the art plus the research, things still seem to be moving quite fast. There's also a ton of product work happening on top of it already.
energy123•5mo ago
Even gpt-5 on "high" reasoning effort (which is likely higher than what people get in the Plus subscription; that's most likely "medium") is very, very low compute compared to the top runs behind IOI/IMO solutions.
Rick76•5mo ago
If that's the case, then why would OpenAI not want to release their best models while the AI race is still close? I would assume it's due to energy constraints, and if that's true, the opinion that this can't replace people remains valid.

Thermodynamics is the law of laws. Unless they invent some kind of ultra-efficient, almost magical computer to run these systems, it's simply not economical yet.

energy123•5mo ago
It's not a question of whether it's the case. It's confirmed by OpenAI employees on Twitter.

The reasons could be that it's new (they did say they plan to release eventually but not soon), or that it's too heavily scaffolded for the task and not sufficiently general.

tyleo•5mo ago
But can it maintain my legacy crud app with no tests, millions of LoC, long compile times?

One day but not yet. Beyond pure capabilities the companies making AI don’t seem to have any sort of moat so it’s a $$$ incinerator for them so far.

Like the late 90s internet I suspect we’re in a bubble. But also like the late 90s internet I suspect there’s more in store here in the future.

aleph_minus_one•5mo ago
> Yet every other post here and there are about "bubble this", "winter that", "plateauing this", "wall that"...

> Are we in the denial stage, or bargaining stage? Can't quite tell...

I can tell quite clearly that, even assuming the models were not specifically fine-tuned to win these competitions, these achievements transfer neither to the kind of coding I do at work nor to the kind I do privately at night.

At work, a lot of what needs to be done is

1. Asking people who are knowledgeable about the business logic why things were implemented a certain way (there are often good reasons, which are nevertheless quite subtle).

2. When a new requirement comes up, thinking deeply about how it fits into the huge legacy codebase. I am allowed to change things here as necessary (an enormous concession that is uncommon in this industry), but my changes must never cause the software to produce wrong results or break business-critical workflows. Such failures can cost my employer quite some money, or increase the workload of already overworked colleagues (who will then legitimately hate me :-( ), some of whom have to work under very tight deadlines in specific months. What counts as a "business-critical workflow" that must never break? Answering that requires understanding the very demanding users over many, many years (believe me: it is really subtle).

I cannot imagine how AIs could help with this.

Privately, I tend to write very experimental code for which one can very likely not find similar code on the internet. Think along the lines of turning deep scientific results into more "mainstream" code, or turning my avant-garde thoughts about some deep problems into code so that one can run experiments to see whether my ideas actually work.

Again something where AIs can barely help.

fragmede•5mo ago
I'm not arguing that they'd necessarily be any good at it, but hooking the LLM into your company's ticketing and communications platforms seems like an incredibly obvious way to address both of your points, so I'm not sure why it's unimaginable. It's not possible with the current SOTA, but it shouldn't be inconceivable either.
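For what it's worth, the wiring being described here is mundane in shape. A deliberately stubbed sketch (every name here is hypothetical, not from the thread or any real product, and the model call is a plain callable):

```python
def answer_why(ticket_id, ticket_db, chat_log, ask_llm):
    """Gather institutional history (ticket text plus related chat
    messages) and hand it to a model as context for the kind of
    'why was this implemented this way?' question raised above."""
    ticket = ticket_db[ticket_id]                        # e.g. dict of id -> text
    related = [m for m in chat_log if ticket_id in m]    # naive keyword link
    prompt = (
        "Why was this implemented this way?\n\n"
        f"Ticket: {ticket}\n"
        "Related discussion:\n" + "\n".join(related)
    )
    return ask_llm(prompt)
```

The hard part, as the parent comment implies, is not this plumbing but whether the model does anything useful with the retrieved context.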
bamboozled•5mo ago
Could we just be somewhere in the middle? Amazing models that have been tuned to win a certain competition and given far more compute than is feasible for everyday usage, while the daily general models are still useful but not AGI yet?
NitpickLawyer•5mo ago
Yeah, I agree. I dislike both the doomer content and the singularity content.

> the daily general models are still useful

Yup. I just had a ~30-minute session where gpt5-mini did everything I needed, almost zero-shot. Nothing complicated, but production code. I wanted to refactor a small service and wrote a ~4-sentence goal; it asked for read permissions on the repo, understood the API requirements perfectly, wrote itself a plan, and did the refactoring. All the previous tests pass (confirmed manually just to be sure). All good. It would have taken me maybe 4 hours?

fasterik•5mo ago
I don't think it's crazy to talk about plateaus; it just depends on what domain we're talking about. Performance on olympiad-style problems doesn't necessarily translate into success in research, industry, or creative pursuits. We know this is true for humans; add to that all the usual problems with LLMs, like hallucinations, and you can see why some people are still skeptical.

I'm still in the "wait and see" stage. Maybe throwing more compute at the problem will solve it, but maybe not. I would like to see benchmarks that take a more project-based approach, e.g. tell the LLM to go work on something complicated and ambiguous for a week and see what it comes up with.

SideburnsOfDoom•5mo ago
How many of the answers were verbatim in the training data?
animal531•5mo ago
I use GPT almost daily now and have noticed a funny thing, which is to be expected, really.

I can ask it to help me code, for example, a physics engine, so we're talking really hard and intricate code, and it'll come up with some amazing optimizations, including (recent) research-paper-level implementations.

Then I ask it to work on something relatively trivial; let's say we need a flow field. It'll think and reason about it just as well as in the first example, but then it'll start spitting out a lot of subpar code. Its error rate will increase 10x, while the global cohesiveness of the produced code will be substantially worse.

As to why that's happening: maybe it's being trained on many more, and worse, examples of the second kind, whereas the first is relatively "pure".

These programming competitions are pretty much the same thing, in my opinion. For us humans it's a hard challenge, but in general they're asking the same-ish questions, just in different formats. They should add some questions where the participant has to invent something new, or alternatively use two or more existing concepts in a totally novel fashion.
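For readers unfamiliar with the "relatively trivial" task mentioned above: a flow field is just a breadth-first search over a grid that gives every walkable cell a step direction toward a goal, commonly used for many-agent pathfinding in games. A minimal sketch (the grid-of-strings format and function name are my own assumptions, not from the comment):

```python
from collections import deque

def flow_field(grid, goal):
    """For every walkable cell, compute the direction (dx, dy) of one
    step along a shortest path toward `goal`.
    `grid` is a list of equal-length strings; '#' marks a wall."""
    h, w = len(grid), len(grid[0])
    # BFS outward from the goal, recording each cell's distance to it.
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        x, y = queue.popleft()
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if 0 <= nx < w and 0 <= ny < h and grid[ny][nx] != '#' \
                    and (nx, ny) not in dist:
                dist[(nx, ny)] = dist[(x, y)] + 1
                queue.append((nx, ny))
    # Each reachable cell points toward its lowest-distance neighbour.
    field = {}
    for (x, y), d in dist.items():
        if (x, y) == goal:
            continue
        field[(x, y)] = min(
            ((dist.get((x + dx, y + dy), d), (dx, dy))
             for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))),
            key=lambda t: t[0])[1]
    return field
```

The point of the comment stands regardless of the exact formulation: this is standard, well-trodden code, which is exactly why it is surprising when the output quality drops relative to the harder physics-engine case.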