> It isn’t “wrong.” Wolfram defines Binomial[n,m] at negative integers by a symmetric limiting rule that enforces Binomial[n,m] = Binomial[n,n−m]. With n = −1, m = −1 this forces Binomial[−1,−1] = Binomial[−1,0] = 1. The gamma-formula has poles at nonpositive integers, so values there depend on which limit you adopt. Wolfram chooses the symmetry-preserving limit; it breaks Pascal’s identity at a few points but keeps symmetry. If you want the convention that preserves Pascal’s rule and makes all cases with both arguments negative zero, use PascalBinomial[−1,−1] = 0. Wolfram added this explicitly to support that alternative definition.
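To make the difference concrete, here is a minimal sketch in Python (not Wolfram's actual implementation; the function names are mine) contrasting the two conventions at negative integer arguments:

```python
# A sketch of the two conventions for binomial(n, m) at integer arguments.
# Illustrative only -- not how Wolfram actually computes Binomial.

def falling_binomial(n: int, m: int) -> int:
    """C(n, m) = n(n-1)...(n-m+1) / m! for any integer n and m >= 0."""
    result = 1
    for k in range(m):
        result = result * (n - k) // (k + 1)  # division is always exact here
    return result

def binomial_symmetric(n: int, m: int) -> int:
    """Wolfram-style: choose the limit that preserves C(n, m) == C(n, n-m)."""
    if m >= 0:
        return falling_binomial(n, m)
    if n < 0 and m <= n:
        # Symmetry moves the pole: C(n, m) -> C(n, n-m), with n - m >= 0 here.
        return falling_binomial(n, n - m)
    return 0

def pascal_binomial(n: int, m: int) -> int:
    """PascalBinomial-style: zero for m < 0, preserving Pascal's identity."""
    return falling_binomial(n, m) if m >= 0 else 0

print(binomial_symmetric(-1, -1))  # 1, matching Binomial[-1, -1]
print(pascal_binomial(-1, -1))     # 0, matching PascalBinomial[-1, -1]
```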
Of course this particular question might have been in the training set.
Honestly, 2.5 years feels like infinity when it comes to AI development. I'm using ChatGPT very regularly, and while it's far from perfect, it has recently given obviously wrong answers only rarely. I can't say anything about ChatGPT 5; I feel like in my conversations with AI I've reached my limit, so I'd hardly notice the AI getting smarter, because it's already smart enough for my questions.
1) 100% correct
2) Really useful (i.e., it includes various things I didn't ask for but that are really great, like a little manipulator to walk through the function at various points and visualize what the mapping is doing)
3) Built in a general way so I can easily change the mapping to explore different types of functions and how they work.
It seems very clear (both from what they said in the launch demos, etc., and from my experience of trying it out) that performance on coding tasks has been an area of massive focus, and the results speak for themselves.
I think of that every time people talk about trusting generated code. Or the obfuscated code competition. It’s going to get you into the dumbest trouble some day.
It suggested two new attributes, claimed to have added them when it hadn't, and once they were actually added, it never used them.
If we keep retraining them on the currently available datasets, then the questions that stumped ChatGPT 3 are in the training set for ChatGPT 5.
I don’t have the background to understand the functional changes between ChatGPT 3 and 5. It can’t be just the training data, can it?
> gave *obviously wrong* answers very rarely.
I don't think this is a reason I'd trust it; actually, this is a reason I don't trust it. There's a big difference between "obviously wrong" and "wrong". It is not objective but depends entirely on the reader/user.
The problem is that it optimizes for deception alongside accuracy. It's a useful tool, but good design says we should want to make errors loud and apparent. That's because we want tools to complement us, to make us better. But if errors are subtle, nuanced, or just difficult to notice, then there is actually a lot of danger to the tool (true for any tool).
I'm reminded of the Murray Gell-Mann Amnesia effect: you read something in the newspaper about a topic you're an expert in and lambast it for its inaccuracies, but then turn the page to a topic you don't have domain knowledge of and trust it.
The reason I bring up MGA is because we don't often ask GPT things we know about or have deep knowledge in. But this is a good way to learn about how much we should trust it. Pretend to know nothing about a topic you are an expert in. Are its answers good enough? If not, then be careful when asking questions you can't verify.
Or, I guess... just ask it to solve "5.9 = x + 5.11"
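(The arithmetic itself is trivial; a quick sanity check below. The notorious failure mode is models answering -0.21 after ordering the decimals like version numbers, where 5.11 > 5.9.)

```python
# The arithmetic LLMs often fumble: solve 5.9 = x + 5.11 for x.
x = 5.9 - 5.11
print(round(x, 2))  # 0.79 -- not -0.21
```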
That’s not to say fire all your brilliant devs and hire mediocrity, but the reverse case is often made by loudmouths trying to fluff their own egos. Getting rid of the average devs is ignoring the vocational aspects of the job.
Are you concerned it may be giving you subtly wrong answers that you're not noticing? If you have to double-check everything, is it really saving time?
The problem is that doing this enough will make you forget how to come up with proofs in the first place.
Even if GPT-3.5 was noticeably worse for any of these questions, it's honestly more interesting for someone's first experience to be with the exaggerated shortcomings of AI. The slightly-screwy answers are still emblematic of what you see today, so it all ended well enough, I think. It would've been a terribly boring exchange if Knuth's reply had just been "looks great, thanks for asking ChatGPT" with no challenging commentary.
(1) it has been a year or so since the article last had significant attention, and
(2) the post is genuinely interesting.
(the latter condition ought to apply to any HN submission of course)
We are presented with a first reaction to ChatGPT; we must never forget how incredible this technology is and not become accustomed to it.
Donald Knuth approached several of the questions from a position of feigned ignorance, asking questions as basic as "12. Write a sentence that contains only 5-letter words.", and was amazed not only by correct answers but also by incorrect answers that still parsed the question effectively and with apparent semantic understanding.
First it spent three minutes getting fucked by cookie banners, then it DDoSed Wikipedia by guessing article names, then it started searching for stock photo sites offering an API, then it hallucinated a Python script to search for stock photography vaguely related to what I wanted. That failed as well, so it called its image generator and finally served me some made-up AI slop.
Ten minutes, kilowatts of GPU power, and jack shit in return. So not even the shiny new tools are up to the task.
The internet was made to happen. And that is what happened.
I would say I have seen three completely different internets. And I started keeping track in the late '90s, after the dotcom boom made it truly global and everywhere.
The discussion at the end also reminded me of how a lot of us took Gary Marcus' prose more seriously at the time before many of his short-term predictions started failing spectacularly.
https://chatgpt.com/share/6897a21b-25c0-8011-a10a-85850870da...
Pretty interesting: some contamination, some better answers, and it failed to write a sentence with all 5-letter words. I'd have expected it to pass this one!
Simple example: “Every night, dreams swirl swiftly.”
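A quick way to check the constraint is just counting letters per word; "dreams" (6) and "swiftly" (7) break it:

```python
# Check whether every word in the sentence has exactly 5 letters.
import re

sentence = "Every night, dreams swirl swiftly."
lengths = {w: len(w) for w in re.findall(r"[A-Za-z]+", sentence)}
print(lengths)  # {'Every': 5, 'night': 5, 'dreams': 6, 'swirl': 5, 'swiftly': 7}
```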
For example, something like "running" might get tokenized as "runn" + "ing", being only two tokens for ChatGPT.
It'll learn to infer some of these things over the course of training, but only to a limited extent.
Same reason it's not great at math.
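You can see the subword splits directly with the tiktoken library (the "runn" + "ing" split above is illustrative; the actual pieces depend on the model's vocabulary):

```python
# Inspect how a BPE tokenizer splits words into subword pieces.
# Requires `pip install tiktoken`; exact splits vary by vocabulary.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
for word in ["running", "swiftly", "tokenization"]:
    ids = enc.encode(word)
    print(word, "->", [enc.decode([i]) for i in ids])
```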
Anyone have an idea how this happened? It was supposed to be a sentence of only 5-letter words.