

Vibechart

https://www.vibechart.net/
879•datadrivenangel•6mo ago

Comments

I_am_tiberius•6mo ago
It would be interesting to know how this occurred. I assume there may have been last-minute high-level feedback suggesting: "We can't let users see that the new model is only slightly better than the old one. Adjust the y-axis to make the improvement appear more significant."
lnenad•6mo ago
I mean this is the industry standard. For example every time Nvidia dumps a new GPU into the ether, they do the same thing. Apple with M series CPUs. They even go a step further and compare a few generations back.
datadrivenangel•6mo ago
It's dishonest, and the multiple examples in the same presentation tell you what you need to know about the credibility of the presenters.
andrewstuart2•6mo ago
The other chart on that slide was actually to scale. My suspicion is that it was super rushed to the deadline for this presentation and they maybe didn't use excel or anything automatic for the charts, so they look better, and they missed the detail due to time pressure.
zigzag312•6mo ago
There's only one error, bar height for o3. Somehow height uses value from 4o, which seems like some sort of copy paste error.

EDIT: I was looking just at the first chart. I didn't see there's more below.

croes•6mo ago
Did you miss the picture below where the bar for 50% is lower than the bar for 47.4%?

And even if it’s just one chart, there are 3 or 4 bars (depends on how you count), so they screwed up 33%/25% of the chart.

Quite an error margin.

zigzag312•6mo ago
Oh, I did miss it. Thanks!
brazzy•6mo ago
> Did you miss the picture below where the bar for 50% is lower than the bar for 47.4%

That one was added later, if I interpret the attribution at the bottom correctly. And I'm also pretty sure it wasn't there when I first saw it pop up.

danpalmer•6mo ago
Maybe they asked GPT-5 to update slides.
qustrolabe•6mo ago
GPT-5 would've caught this mismatch for sure
macNchz•6mo ago
That seemingly depends a bit on how hard you ask it to think, or how hard it decides to think based on your question.
danpalmer•6mo ago
"ChatGPT, this slide deck feels a bit lukewarm, help me make a better impression"

I could completely believe someone who is all-in on the tech, working in marketing, and not really that familiar with the failure modes, using a prompt like this and just missing the bad edit.

datadrivenangel•6mo ago
Claude and ChatGPT actually took me several prompts to get them to identify this. They recognized from a screenshot that axes that don't start at zero can be misleading, but missed the actual issue.
nonhaver•6mo ago
that's hilarious actually. gives credence to the gpt theory haha
KronisLV•6mo ago
How hard would it be for the model to be like: "Okay, there's a bar chart in the picture, the left bar is 350 px in size and the right bar is 120 px. Meanwhile, the labeled values are X and Y, which doesn't seem to match those relative sizes due to this math I ran thanks to all the cool deterministic tools I have."

Apparently quite a bit.
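The check KronisLV describes is a few lines of arithmetic once you have the measurements. A minimal sketch in Python, using the pixel sizes from the comment above (350 px and 120 px) paired with the disputed labels; everything else here is a hypothetical illustration, not how any model actually works:

```python
def find_mismatched_bars(bars, tolerance=0.05):
    """Flag bars whose drawn height disagrees with their labeled value.

    `bars` is a list of (labeled_value, pixel_height) pairs from one chart.
    The tallest bar is trusted to set the scale (pixels per unit); any bar
    whose height deviates from its expected height by more than `tolerance`
    (as a fraction of expected) is reported.
    """
    ref_value, ref_height = max(bars, key=lambda b: b[1])
    scale = ref_height / ref_value  # pixels per percentage point
    mismatches = []
    for value, height in bars:
        expected = value * scale
        if abs(height - expected) > tolerance * expected:
            mismatches.append((value, height, round(expected, 1)))
    return mismatches

# A bar labeled 50.0 drawn at 120 px next to a bar labeled 47.4 at 350 px:
print(find_mismatched_bars([(50.0, 120), (47.4, 350)]))
# → [(50.0, 120, 369.2)]
```

Note that with only two bars this can only say they are mutually inconsistent, not which label is the wrong one.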

outside1234•6mo ago
There is a smell of desperation around OpenAI, so I wouldn't be surprised if this level of hypevibing came from the top.
44za12•6mo ago
That was quick, vibe coded, I presume?
datadrivenangel•6mo ago
The CSS animations are very revealing on that front from a performance perspective.
teaearlgraycold•6mo ago
I tend to blame performance issues on the developer writing the code on a top-of-the-line computer. There are too many WebGL effects on startup websites that were built to run on an M4 Max.
datadrivenangel•6mo ago
Yeah this is somewhat stuttery on an M2 mac.
thewebguyd•6mo ago
> There are too many WebGL effects on startup websites that were built to run on a M4 Max.

Tale as old as time. When the Retina display Macs first came out, we saw web design suddenly stop optimizing for 1080p or smaller displays (and at the time, 1366x768 was the default resolution for Windows laptops).

As much suffering as it'd be, I swear we'd end up with better software if we stopped giving devs top-of-the-line machines and just issued whatever budget laptop is on sale at the local Best Buy on any given day.

teaearlgraycold•6mo ago
I wouldn't go that far, but maybe split the difference at a modern i3 or the lowest spec Mac from last year.

It would be awesome if Apple or someone else could have an in-OS slider to drop the specs down to that of other chips. It'd probably be a lot of work to make it seamless, but being able to click a button and make an M4 Max look like an M4 would be awesome for testing.

p1necone•6mo ago
Tbh even the absolute lowest spec Mx macs are insanely powerful, probably best to test on a low end x86 laptop.
universenz•6mo ago
No no no.. go one better for the Mac. It should be whichever devices are next to be made legacy under Apple’s 7-year support window. That way you’re actually catering to the lowest common denominator.
01HNNWZ0MV43FF•6mo ago
At my work every dev had two machines, which was great. The test machine is cattle, you don't install GCC on it, you reflash it whenever you need, and you test on it routinely. And it's also the cheapest model a customer might have. Then your dev machine is a beast with your kitten packages installed on it.
p1necone•6mo ago
Develop on a super computer, test on $200 laptop - not really any suffering that way.
xpe•6mo ago
To keep a fast feedback loop, build on the fast machine, deploy, test on the slow one.
seba_dos1•6mo ago
It's less than 200 lines of CSS. Easily doable by a human in 30 minutes.
mattgreenrocks•6mo ago
I love how this has to be defended now, as if that was somehow unthinkable from a domain expert.
marvinborner•6mo ago
This should also include the chart on "Coding deception" [1] which is quite deceptive (50.0 is not in fact less than 47.4)

[1]: https://youtu.be/0Uu_VJeVVfo?t=1840

qwertox•6mo ago
Both the submission and your link took me way too long to see what's the issue here.

What were they even thinking? Don't they care about this? Is their AI generating all their charts now and they don't even bother to review it?

panarky•6mo ago
Since everyone assumes GPT hallucinated these charts, the truth must be that they're 100% pure, organic, unadulterated human fuckups.
croes•6mo ago
Doesn’t matter. Either way is bad
datadrivenangel•6mo ago
Either way is bad. Intentionally human made and approved is worse than machine generated and not reviewed. Malicious versus sloppy.
croes•6mo ago
Machine generated is worse.

How many charts will the person create, how many the machine?

windowdoor•6mo ago
My unjustified and unscientific opinion is that AI makes you stupid.

That's based solely on my own personal vibes after regularly using LLMs for a while. I became less willing to and capable of thinking critically and carefully.

nicce•6mo ago
It also scares me how good they are at appealing to you and at social engineering. They have made me feel good about poor judgment and bad decisions at least twice (which I noticed later on, still in time). Given a new, strict system prompt, they give the opposite opinion and recommend against their previous suggestion. They are so good at arguing that they can justify almost anything and make you believe that this is what you should do, unless you are among the 1% of experts in the topic.
lacy_tinpot•6mo ago
> They are so good at arguing that they can justify almost anything

This honestly just sounds like distilled intelligence. Because a huge pitfall for very intelligent people is that they're really good at convincing themselves of really bad ideas.

That but commoditized en masse to all of humanity will undoubtedly produce tragic results. What an exciting future...

Terr_•6mo ago
> They are so good at arguing that they can justify almost anything

To sharpen the point a bit, I don't think it's genius "arguing" or logical jujitsu, but some simpler factors:

1. The experience has reached a threshold where we start to anthropomorphize the other end as a person interacting with us.

2. If there were a person, they'd be totally invested in serving you, with nearly unlimited amounts of personal time, attention, and focus given to your questions and requests.

3. The (illusory) entity is intrinsically shameless and appears ever-confident.

Taken together, we start judging the fictional character like a human, and what kind of human would burn hours of their life tirelessly responding and consoling me for no personal gain, never tiring, breaking-character, or expressing any cognitive dissonance? *gasp* They're my friend now and I should trust them. Keeping my guard up is so tiring anyway, so I'm sure anything wrong is either an honest mistake or some kind of misunderstanding on my part, right?

TLDR: It's not mentat-intelligence or even eloquence, but rather stuff that overlaps with culty indoctrination tricks and con[fidence]-man tactics.

II2II•6mo ago
No. AI is a tool to make ourselves look stupid. Suggesting that it makes people stupid suggests that they are even looking at the output.
lacy_tinpot•6mo ago
AI being used to completely offload thinking is a total misuse of the technology.

But at the same time, that this technology can seemingly be misused and cause real psychological harm feels like kind of a new thing. Right? Like there are reports of AI psychosis; I don't know how real it is, but if it's real, I don't know any other tool that's really produced that kind of side effect.

windowdoor•6mo ago
We can talk a lot about how a tool should be used and how best to use it correctly - and those discussions can be valuable. But we also need to step back and consider how the tool is actually being used, and the real effects we observe.

At a certain point you might need to ask what the toolmakers can do differently, rather than only blaming the users.

brundolf•6mo ago
It makes Apple's charts look rigorous and transparent
sundarurfriend•6mo ago
> Both the submission and your link took me way too long to see what's the issue here.

Mission accomplished for them then.

rsynnott•6mo ago
I mean, if your whole business is producing an endless stream of incorrect output and calling it good enough, why would you care about accuracy here? The whole ethos of the LLM evangelist, essentially, is "bad stuff is good, actually".
p1necone•6mo ago
This half makes sense to me - 'deception' is an undesirable quality in an LLM, so less of it is 'better/more' from their audience's perspective.

However, I can't think of a sensible way to actually translate that to a bar chart where you're comparing it to other things that don't have the same 'less is more' quality (the general fuckery with graphs not starting at 0 aside - how do you even decide '0' when the number goes up as it approaches it), and what they've done seems like total nonsense.

JBiserkov•6mo ago
> 'deception' is an undesirable quality in an llm, so less of it is 'better/more' from their audiences perspective

So if that ^ is why 50.0 is lower than 47.4... then why is 86.7 not lower than 9.0? Or 4.8 not lower than 2.1?

datadrivenangel•6mo ago
Added!
chilmers•6mo ago
That one is so obviously wrong that it makes me wonder if someone mislabelled the chart, but perhaps I'm being too optimistic.
mwigdahl•6mo ago
It's been fixed on the OpenAI website.
computomatic•6mo ago
Presumably it corresponds to Table 8 from this doc: https://cdn.openai.com/pdf/8124a3ce-ab78-4f06-96eb-49ea29ffb...

If that’s the case, it’s mislabelled and should have read “17%”, which would better match the visual.

eviks•6mo ago
That would still be a basic fail. You don't label a chart, you enter data; the pre-AGI computer program does the rest: it draws the bars and shows labels that match the data.
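The point eviks makes, that bars and labels should both be derived from the same entered data, holds even for a toy renderer. A minimal sketch in Python (a hypothetical illustration, not how the actual slides were produced):

```python
def render_bar_chart(data, width=40):
    """Render a minimal text bar chart.

    Bar lengths are computed from the same numbers as the labels,
    so by construction a bar can never disagree with its label.
    """
    top = max(data.values())
    lines = []
    for name, value in data.items():
        bar = "#" * round(width * value / top)  # geometry derived from data
        lines.append(f"{name:<8} {bar} {value}")
    return "\n".join(lines)

# The disputed values from the slide: 47.4 necessarily draws shorter than 50.0.
print(render_bar_chart({"GPT-5": 50.0, "o3": 47.4}))
```

Any real charting tool (matplotlib, Excel, Google Sheets) works the same way; a label/bar mismatch can only happen when the bars and the labels are drawn by hand, separately.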
zmmmmm•6mo ago
I pasted the image of the chart into ChatGPT-5 and prompted it with

>there seems to be a mistake in this chart ... can you find what it is?

Here is what it told me:

> Yes — the likely mistake is in the first set of bars (“Coding deception”). The pink bar for GPT-5 (with thinking) is labeled 50.0%, while the white bar for OpenAI o3 is labeled 47.4% — but visually, the white bar is drawn shorter than the pink bar, even though its percentage is slightly lower.

So they definitely should have had ChatGPT review their own slides.

zeroonetwothree•6mo ago
Does it work that well if you don’t tell it there is a mistake though?
01HNNWZ0MV43FF•6mo ago
That's the secret, you should always tell it to doubt everything and find a mistake!
dpacmittal•6mo ago
>but visually, the white bar is drawn shorter than the pink bar, even though its percentage is slightly lower.

But the white bar is not shorter in the picture.

zmmmmm•6mo ago
funny isn't it - makes me feel like it's kind of over-fitted to try and be logical now, so when it's trying to express a contradiction it actually can't
bibabaloo•6mo ago
But how could they have used ChatGPT-5 if they were working on the blog post announcing it?
brahyam•6mo ago
Clearly the error is in the number; most likely the actual value is 5.0 instead of 50.0, which matches the bar height and also the other single-digit GPT-5 results for metrics on the same chart.
yoyohello13•6mo ago
It’s genuinely terrifying that people this incompetent have so much money and power.
m_herrlich•6mo ago
It might not be incompetent to assume the audience is not very discerning.
aydyn•6mo ago
OpenAI is currently getting dunked on, on all major platforms. It is incompetent.
throwawayoldie•6mo ago
People reading Hacker News are the target audience, and here we are, discerning.
Invictus0•6mo ago
Speak for yourself!
throwawayoldie•6mo ago
I only ever do. But I rephrased my post to make my meaning clearer. Nice discernment there.
01HNNWZ0MV43FF•6mo ago
Hey, could be malice
ElijahLynn•6mo ago
The magic that is ChatGPT is definitely not incompetence.

They may not be perfect, but they provided a lot of value to many different industries including coding.

AdieuToLogic•6mo ago
> The magic that is ChatGPT ...

  Any sufficiently advanced technology is
  indistinguishable from magic.[0]
0 - https://en.wikipedia.org/wiki/Clarke%27s_three_laws
fullshark•6mo ago
It’s more terrifying that no one anywhere seems to care about the truth. Vibeworld: we are all selling vaporware, and if you don’t build it, who cares, move on to the next hype cycle that pumps the stock / gets VC funding. Absurd industry.
pesus•6mo ago
We're feeling the effects of living in a post-truth society more and more every day. It's pretty terrifying.
burnt-resistor•6mo ago
In the US political sphere, this is why legitimized corruption must be eliminated before it eliminates us.
rvz•6mo ago
Remember, we are in a post-truth era. Getting to "AGI" might even mean cooking the numbers if they have to so that hopefully no-one notices.
interweb_tube•6mo ago
I'll always invest in a chart that's more pink than gray.
burnt-resistor•6mo ago
Green for good on stacked bar charts is the new hotness.
enb•6mo ago
Can’t scroll on safari ios
acenturyandabit•6mo ago
The chart is the entire thing. Check if the numbers match the heights of the rectangles ;)
eps•6mo ago
Still only half of it is visible in the landscape mode and the page is not scrollable.
datadrivenangel•6mo ago
Should be fixed now.
eps•6mo ago
Yep, fixed.
schappim•6mo ago
Imagine being the person who made the mistake when creating the gpt-5 chart.
cpncrunch•6mo ago
Link?
datadrivenangel•6mo ago
See TFA. [0]

0 - https://www.vibechart.net/

cpncrunch•6mo ago
No, I mean: what is the context? Who created this originally? Where is the link to OpenAI or whoever created this chart, or the context behind the misinformation, if any? I checked the comments and stories about ChatGPT-5 and there is no reference to this, so I'm at a loss.

Ok, I see there was a bug on the site and it wasn't scrolling on iOS. They fixed that now, although the background context is still unclear, and none of the links on the site seem to explain it.

datadrivenangel•6mo ago
These charts were from the GPT-5 release stream from OpenAI. The second image is a direct screenshot:

https://www.youtube.com/watch?v=0Uu_VJeVVfo&t=1840s

cpncrunch•6mo ago
Yes, that second image was initially hidden on iOS due to the scrolling bug in the site (now fixed).

So they spotted what seems to be an unintentional error in a chart in a youtube video, and created a completely different chart with random errors to make a point, while due to their own coding error the (somewhat obtuse) explanation wasn't even visible on mobile devices.

Not sure why this was voted to the top of the first page of HN, although I can surmise.

burnt-resistor•6mo ago
It should've been coded to auto-detect the current highest GPT generation/version and add 2. Sigh.
sp527•6mo ago
It's such an egregiously bad error, you almost have to wonder if Altman did it intentionally for publicity (which does seem to be working).
p1necone•6mo ago
I think the stock market has just proven time and time again that a large proportion of investors (and VCs) do basically no due diligence or critical thinking about what they're throwing money at, and businesses actually making profit hasn't mattered for a long time - which was the only thing tethering their value to the actual concrete stuff they're building. If you can hype it well your share price goes up, and even the investors that do do proper due diligence can see that, so they're all in too.

By and large people do not have the integrity to even care that numbers are obviously being fudged, and they know that the market is going to respond positively to blustering and bald-faced lies. It's a self-reinforcing cycle.

sp527•6mo ago
Oh trust me I know. I worked at Palantir well before it was public and had firsthand experience of Alex Karp. He would draw incomprehensible stick figure box diagrams on a whiteboard for F100 CEOs, ramble some nonsensical jargon, and somehow close a multimillion dollar pilot. The guy is better at faking it than high-end escorts. It doesn't surprise me that this has fooled degens around the world, from Wall Street to r/wallstreetbets. Incredibly, even Damodaran has thrown in the towel and opened a position, while still admitting he has no idea what they do.
mattgreenrocks•6mo ago
It’s vibes all the way down :)
dmezzetti•6mo ago
Impressive that this knocked GPT-5 from the top.
an0malous•6mo ago
Nature is healing
KaoruAoiShiho•6mo ago
I think this is less a chart crime than an editing mistake.
datadrivenangel•6mo ago
They had two misleading charts... not ideal
subtlesoftware•6mo ago
The 69.1 column has the same height as the 30.8 column. My guess is they just duplicated the 30.8 column and forgot to adjust the height to the number, which passed a cursory check because it was simply lower than the new model.

This doesn't explain the 50.0 column height though.

chilmers•6mo ago
Eyeballing it, that bar looks to be around 15% in height. Typing "50" instead of "15" is a plausible typo, albeit one you might expect from a high-schooler giving a class presentation, not in a flagship launch by one of the most hyped startups in history.

Just remember, everyone involved with these presentations is getting a guaranteed $1.5 million bonus. Then cry a little.

what•6mo ago
How is 50 instead of 15 a plausible typo? A zero is on the opposite end of the keyboard from a 1.
thek3nger•6mo ago
Yep. It sounds more like a dictation error as “fifteen” and “fifty” sound similar. No idea why this should matter in the slide production process though.
eagle2com•6mo ago
Not on a numpad. I heard rumors some actually use it ^^
dragonwriter•6mo ago
> The 69.1 column has the same height as the 30.8 column. My guess is they just duplicated the 30.8 column and forgot to adjust the height to the number

Why, unless specifically for the purpose of making possible inaccurate and misleading inconsistencies of this type, would you make charts for a professional presentation by a mechanism that involved separately, manually creating the bars and the labels in the first place? I mean, maybe, if you were doing something artistic with a style that wasn't supported in charting software you might, but these are the most basic generic bar charts except for the inconsistencies.

nnurmanov•6mo ago
In the marketing world, 1 > 2 :)
datadrivenangel•6mo ago
People interested in misleading data visualization should look into Alberto Cairo's Book: How Charts Lie
0xCafeBabee•6mo ago
Looks like the only thing getting smarter here is the marketing team.
mcs5280•6mo ago
How else can you make stonks go up perpetually?
eddythompson80•6mo ago
Weren’t some people, unironically, expecting an AGI announcement for GPT-5? Like, I have heard a water cooler (well, coffee machine) conversation about how OpenAI’s master plan is to release GPT-5 and invoke the AGI clause in their contract with Microsoft. I was shaking my head so hard.
JBiserkov•6mo ago
They are both using the "capitalist" definition of AGI, that is "an AI system that can generate at least $100 billion in profits". I think it's short for "A Gazillion Idiots"...

https://gizmodo.com/leaked-documents-show-openai-has-a-very-...

AIPedant•6mo ago
It is actually incredible how they managed to find an even more unscientific definition than "can perform a majority of economically useful tasks." At least that definition requires a little thought to recognize it has problems[1]. $100bn in profits is just cartoonishly dumb, like you asked a high schooler to come up with a definition.

[1] If a computer can perform the task its economic usefulness drops to near zero, and new economically useful tasks which computers can't do will take its place.

bo1024•6mo ago
Aw, I really wanted this to be a tool to produce your own misleading vibecharts
outside1234•6mo ago
[flagged]
dang•6mo ago
Could you please stop posting unsubstantive comments and flamebait? You've unfortunately been doing it repeatedly. It's not what this site is for, and we've asked you many times to stop.

If you wouldn't mind reviewing https://news.ycombinator.com/newsguidelines.html and taking the intended spirit of the site more to heart, we'd be grateful.

thimabi•6mo ago
Poor OpenAI workers, they worked so hard for the GPT-5 release and now discussions about the model are side by side with discussions about their badly-done graphs.

I don’t believe they intentionally fucked up the graphs, but it is nonetheless funny to see how much of an impact that has had. Talk about bad luck…

alfalfasprout•6mo ago
I've been using GPT-5 heavily today. It's genuinely very underwhelming. Sonnet 4 seems to outperform it in every real-world task I use it with.

Lots of hype from Sam Altman and nothing to really show for it.

mepiethree•6mo ago
They all got $1.5 million today so I’m not too worried about the poor workers.
CamperBob2•6mo ago
Everybody including the employee who put those graphs into the slide deck just got $1.5M just for showing up at work. So there's not a lot of room for sympathy.
sobiolite•6mo ago
There are versions of both these charts with more plausible numbers and bar sizes in the "evaluation" section of the announcement post:

https://openai.com/index/introducing-gpt-5/

So, maybe this is just sloppiness and not intentionally misleading. But still, not a good look when the company burning through billions of dollars in cash and promising to revolutionize all human activity can't put together a decent powerpoint.

insane_dreamer•6mo ago
> can't put together a decent powerpoint.

probably AI generated

MaxLeiter•6mo ago
I think the bar is quite low, but AI can absolutely generate "decent" powerpoints
kaffekaka•6mo ago
But in this case, the bar was too high.
djhn•6mo ago
Is there an MCP or API to do that? Can it take a template and layout and produce coherent sentences in a consistent format?
kiney•6mo ago
just let it create plaintext files like latex
datadrivenangel•6mo ago
Quarto!
tekno45•6mo ago
if their AI is so good, why didn't they use it and get good results?
what•6mo ago
They probably did use it and those are the charts it produced.
kaffekaka•6mo ago
This is wonderful. Like, the chart shows the model's capability. Not the numbers in the chart, not the data presented, but the actual chart itself.
Maxion•6mo ago
Maybe they did?
tekno45•6mo ago
they didn't get good results
rco8786•6mo ago
Yea, I guess I won’t immediately ascribe malice, but SHEESH. One of the most anticipated product launches in years and this kind of junk made it through to the public deck. Really pretty inexcusable.
nabla9•6mo ago
This is what eating your own dog food looks like when you are selling dog food.
EMIRELADERO•6mo ago
Saved. Thanks for that belly laugh.
echelon•6mo ago
Is this the moment the bubble pops (at least for OpenAI)?

GPT-5 has to be one of the most underwhelming releases to date, and that's fresh on the heels of the "gift" of GPT-OSS.

The hottest news out of OpenAI lately is who Mark Zuckerberg has added to Meta's "Superintelligence" roster.

outside1234•6mo ago
The gift of GPT-OSS that is actually Phi
GaggiX•6mo ago
GPT-5 models are actually great models for the API, the nano model is finally good enough to handle complex structured responses and it's even cheaper than GPT-4.1-nano.
sothatsit•6mo ago
GPT-5 is probably going to be a meaningful improvement for most of my non-technical family members who like ChatGPT but have never used anything other than 4o. In fact, most of the people I know who use ChatGPT pay no attention to the model being used, except for the developers I know. This update is going to be a big deal for them.

For me, it's just another nice incremental improvement. Nothing special, but who doesn't like smarter better models? The drop in hallucination rates also seems meaningful for real-world usage.

SpaceNoodled•6mo ago
That's not really a fair comparison. Dog food has nutritive value.
WD-42•6mo ago
I can’t believe I’ve never heard this one before. So apt.
outside1234•6mo ago
People at OpenAI are the top of their field. It is not sloppiness in this crowd.
teaearlgraycold•6mo ago
I don't think the PR people at OpenAI are at the top of their field.
ReverseCold•6mo ago
Honestly? They might be.
ceejayoz•6mo ago
People at the top of their field can be deeply sloppy at times.
steve_adams_86•6mo ago
I mean it in the kindest way, but scientists might be the sloppiest group I've worked with (on average, at least). They do amazing work, but they're willing to hack it together in the craziest ways sometimes. Which is great in a way. They're very resourceful and focused on the science, not necessarily the presentation or housekeeping. That's fine.
pphysch•6mo ago
This isn't scientist sloppy, this is salesperson sloppy. Very different.
eviks•6mo ago
Communication is a big part of science, so it's not great that scientists fail in this area
ceejayoz•6mo ago
This was a big COVID-era lesson; that places like the CDC and NIH and whatnot really need a well-trained PR wing for things like Presidential press conferences, to communicate to the public.
johnnyanmac•6mo ago
The engineers, sure. Product team... well, we've seen the past 2-3 years that AI isn't necessarily based on quality and accuracy. They are also at the top of their game in terms of how to optimize revenue.
bigfishrunning•6mo ago
Just because you're the best, doesn't mean you're any good
rsynnott•6mo ago
Their field is pretty much selling sloppiness-as-a-service, tho.

I'm genuinely a bit concerned that LLM true believers are beginning to, at some level, adopt the attitude that correctness _simply does not matter_, not only in the output that spews from their robot gods, but _in general_.

sensanaty•6mo ago
It's kinda crazy to witness, you can see in the main GPT-5 release thread that there are people excusing things like the bot being blatantly wrong about Bernoulli's Principle in regards to airplane flight. I wish I could find it again but it's thousands of comments, one of the comments is literally "It doesn't matter that it's wrong, it's still impressive". Keep in mind we're discussing a situation where a student asks the AI about how planes fly! It's literally teaching people a disproven myth!
Eji1700•6mo ago
OpenAI has always known that "data" is part of marketing, and treated it as such. I don't think this is intentional, but they damn well knew, even back in the dota 2 days, how to present data in such a way as to overstate the results and hide the failures.
pryelluw•6mo ago
Similar to the glass demonstration on the cybertruck.
zmmmmm•6mo ago
it's so funny that it tried to deceive everybody about its own deceptiveness
welder•6mo ago
At first I thought this was metrics about vibe coding... but it's not, that's WakaTime
sbaidon94•6mo ago
What a perfect way to encapsulate the zeitgeist
mgg90•6mo ago
The jumping game it created has the easiest way to beat it: you can keep jumping indefinitely and never hit an obstacle. An extra prompt would probably fix that, but it's funny they published it as is.
GodelNumbering•6mo ago
Whichever model they used was probably sabotaged with a prompt like "your prime goal is to make GPT-5 look better in comparison"
atleastoptimal•6mo ago
Why are they so sloppy? Is it because they want to go viral with le funny bad graphs? I'm sure AI could handle "converting test results in an excel document to a visual graph"
enraged_camel•6mo ago
The most plausible explanation IMHO is that they are moving at a million miles per hour, and someone forgot to replace some placeholder graphics.
an0malous•6mo ago
why would the placeholder graphics be the exact same but with incorrect heights on the bars?
Mentlo•6mo ago
Move fast and break investor confidence
eviks•6mo ago
Because they share your mistake and are just as sure of themselves, so instead of using basic tools that work, they use AI tools that fail at the basics
smcleod•6mo ago
Looks like the site has succumbed to the HN hug of death:

> Hmm. We’re having trouble finding that site.

> We can’t connect to the server at www.vibechart.net.

mrcwinn•6mo ago
Let's entertain a completely imagined, made up thought experiment.

Imagine a revolutionary technology comes out that has the potential to increase quality of life, longevity and health, productivity and the standard of living, or lead to never before seen economic prosperity, discover new science, explain things about the universe, or simply give lonely people a positive outlet.

Miraculously, this technology is free to use, available to anyone with an internet connection.

But there was one catch: during its release, an error was made on a chart.

Where should this community focus its attention?

ekianjo•6mo ago
> But there was one catch: during its release, an error was made on a chart.

that should be a tell that other things may be rigged to look better than they are

moody__•6mo ago
After all of this hype, this is the best they can do? This is the forefront company (arguably) of the forefront tech, and no one can review slides before they're shipped out? I think the reason this has resonated with people is that it gives off a "vibe" of not giving a shit: they'll ship whatever slop generator comes next and expect people to gladly lap it up. Either that or they're eating their own dog food and this mess is the result. Do the stats even matter anymore? Is that what they're banking on?
datadrivenangel•6mo ago
There is a correlation between good communication and good outcomes. This is bad communication from the people that could maybe get us good outcomes.
nice_byte•6mo ago
When such technology comes out, we'll find out.
throwawayoldie•6mo ago
We're not talking about said hypothetical technology. We're talking about LLMs.
Sateeshm•6mo ago
First, it is LLMs. Second, we can focus on both the technology and the error.
sensanaty•6mo ago
Let's ignore for the moment that we're talking about a word generator that relies on an infinite amount of pirated data input to "learn" anything. Let's also ignore that the primary goal of "AGI" for the people pushing it is to replace workers en masse and to enrich themselves, and not any naive notion of progress or whatever.

So this miraculous technology that can do everything, cure diseases, reverse human aging, absolve us of our sins etc. can't accurately make a bar chart? Something kids learn in 5th grade mathematics? (At least I did, mileage might vary there)

xigoi•6mo ago
> Miraculously, this technology is free to use, available to anyone with an internet connection.

If something is free but not open source, you are the product.

imtringued•6mo ago
At least you're truthful when you say it's completely made up.

Here is the corrected version:

Imagine a revolutionary technology comes out that has the potential to increase quality of life, longevity and health, productivity and the standard of living, or lead to never before seen economic prosperity, discover new science, explain things about the universe, or simply give lonely people a positive outlet.

But there was one catch: during its release, an error was made on a chart. It turns out the technology did not deliver the massively exaggerated benefits that were promised, that it merely represents a minor incremental improvement over its predecessor, and that it will be overshadowed in a matter of months by a release from a competitor.

Where should this community focus its attention?

maytc•6mo ago
They fixed it in the press release charts https://openai.com/index/introducing-gpt-5/
brundolf•6mo ago
Scaling aside, "without thinking" vs "with thinking" will never not be funny to me
hyperdimension•6mo ago
"I asked GPT-5 without thinking and it said..."
an0malous•6mo ago
man some frontend devs just got $1.5M grants to do this
lunarcave•6mo ago
We're fast approaching the point where vibeX is becoming derogatory.
JoshTriplett•6mo ago
It was (accurately) derogatory on day one, even if its proponents didn't recognize that.
guluarte•6mo ago
The worst part is that a company like OpenAI is full of data scientists who are supposed to be experts in charts.
burnt-resistor•6mo ago
The next management consulting flavor of the month will be full spectrum, panopticon RTO employee monitoring to ensure employees are doing work themselves, not using LLMs, and not working other jobs. It will be scored by AI, of course.
mattlondon•6mo ago
Why not use LLMs? That would be like employing hundreds of farmers and making sure they don't use tractors, doing everything by hand instead.

LLMs can be a huge performance boost, when used wisely (i.e. not just blindly using whatever they spit out)

Mentlo•6mo ago
Most people are not, in fact, wise
etherealG•5mo ago
But the way they learn to be wise in the context of LLMs is to try using them and fail, just like any learning experience. Companies insisting on these tools seems logical to me, if the assumption is that, once learned, they will be better than previous ways of working — but only with practice.
fathermarz•6mo ago
I’m at a large firm of ~1000 employees; only about 25% are devs, and everyone is being told they should be using LLMs/AI agents or they're going to fall behind. They even say: if you don’t adopt these tools, we can find someone who will.
TheAceOfHearts•6mo ago
There's a social media engagement tactic where people will deliberately add specially tailored elements to content in order to bait people into commenting about it. I wonder if there was some deeper strategy to this chart being used during their presentation or if it really was just a blunder.

Maybe the fact that there were additional blunders, such as the incorrect explanation of the Bernoulli Effect, suggests that the team responsible for organizing this presentation didn't review every detail carefully. Maybe I'm reading too much into a simple mistake.

an0malous•6mo ago
@dang why is this post allowed to be flagged off its #1 spot? Is that not clearly a misuse of the flagging system, which is not supposed to be just for posts that people don’t like?

From the HN FAQ:

> What does [flagged] mean?

> Users flagged the post as breaking the guidelines or otherwise not belonging on HN.

> Moderators sometimes also add [flagged] (though not usually on submissions), and sometimes turn flags off when they are unfair.

dang•6mo ago
(@dang doesn't guarantee that we'll see anything - you need to email hn@ycombinator.com for that)

Users flagged it. We can only guess why users flag things, but in this case there had been so much coverage of GPT-5 on the frontpage, plus the chart gaffe was being extensively discussed in those threads, that they probably found this post some combination of repetitive and unsubstantive.

It ended up spending 14 hours on the frontpage anyhow, which is quite a lot, especially for one of those single-purpose sites that people spin up for joke/drama purposes. Those are a great internet tradition but not always the best fit for HN (https://news.ycombinator.com/newsguidelines.html).

eviks•6mo ago
What an epic competence fail, but also seems like a great encapsulation of this whole AI hype era!
ali-aljufairi•6mo ago
Nice new term, and thanks for buying the domain lol
p0w3n3d•6mo ago
It took me a few-ty seconds to understand that this is a joke and not a stupid-joke
meatjuice•6mo ago
It's not the point of the post, but the large rainbow text on top is slowing down my browser like hell.