Everything You Need to Know About Grok 4

https://forgecode.dev/blog/grok-4-initial-impression/
15•Arindam1729•3h ago

Comments

OrvalWintermute•3h ago
Grok 4 is torturously slow compared to all the other LLMs I use :(
amitksingh1490•3h ago
Yeah, I find it slow too. That's why I use it only for architecture planning and finding complex issues.
patrickhogan1•3h ago
On your intelligence graph where it shows Grok 4 and OpenAI o4-mini as comparable (and among the highest intelligence rated models), it doesn’t have OpenAI o3 or o3-pro.

Yet all of my tests show o3 blows o4-mini out of the water.

What are you classifying as intelligence?

CBLT•3h ago
> Grok 4 is [...] the most intelligent model so far

A bit too much praise for a model that's barely ahead of the competition in a subset of benchmarks...

> To be honest, this model not only competes with other AI models but also with humans, making it the first of its kind

I'm out

knes•3h ago
Wasn't the TL;DR of Grok 4 that it's over-tuned for benchmark results, but in day-to-day tasks it's actually not better than o3 / GPT-5?
ajd555•3h ago
> Grok 4 has about 99% accuracy in picking the right tools and making tool calls with proper arguments almost every single time.

Where did this number come from? What is "the right tool"? I find this extremely subjective. As most engineers know, there is no right tool, but mostly a compromise where you pick the least worst tool and choose what risks you're willing to manage or not.

Byamarro•2h ago
That's LangChain terminology. LLMs are usually exposed to a set of tools. It's usually pretty obvious which one is the right one, since there's only one tool that's even remotely associated with the task at hand.
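
To make "picking the right tool" concrete, here is a minimal sketch of how such an accuracy number could be computed. Everything in it is an assumption for illustration: the tool names, the descriptions, the test cases, and especially `pick_tool`, which is a naive word-overlap stand-in for the model's choice and not how Grok, LangChain, or MCP actually select tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    description: str


# Hypothetical tool catalog; names and descriptions are made up.
TOOLS = [
    Tool("read_file", "read the contents of a file from disk"),
    Tool("run_shell", "run a shell command and return its output"),
    Tool("search_docs", "search the aws documentation for a topic"),
]


def pick_tool(task: str, tools: list[Tool]) -> Tool:
    """Stand-in for the model: pick the tool whose description shares
    the most words with the task."""
    task_words = set(task.lower().split())
    return max(
        tools,
        key=lambda t: len(task_words & set(t.description.lower().split())),
    )


def tool_accuracy(cases: list[tuple[str, str]], tools: list[Tool]) -> float:
    """Fraction of tasks for which the picked tool matches the expected one."""
    hits = sum(pick_tool(task, tools).name == expected for task, expected in cases)
    return hits / len(cases)


# Hypothetical labeled cases: (task, name of the tool a human would expect).
CASES = [
    ("read the config file", "read_file"),
    ("run the test command in a shell", "run_shell"),
    ("search aws documentation for s3 limits", "search_docs"),
]

print(tool_accuracy(CASES, TOOLS))
```

With a catalog this small, only one tool is plausibly related to each task, so a high score says little about harder cases where several tools overlap.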
ajd555•2h ago
Thanks for the info. This makes the article slightly less intolerable!
mdaniel•2h ago
I believe in this context it means "tool" as in the MCP definition, e.g. "of the catalog of MCP integrations, it doesn't try to use the Playwright one to browse the web, it'd use the AWS docs one directly"

This is just my speculation, though, as I've never used Grok anything

ajd555•2h ago
Yeah, based on a previous comment, that makes sense. I am a little reassured that is what the author meant.
CamperBob2•3h ago
If the answer involves giving even more money to Elon Musk, you asked the wrong question.
kolektiv•3h ago
I can't take anything seriously with phrases like "it has not yet achieved AGI, but it is one leap forward in the race to AGI" - based on what? Nobody knows whether LLMs are a viable approach to AGI, nobody really agrees on what AGI is, hell, people don't really agree on what "I" is.

This isn't even science at this point; we're solidly into cargo cult territory.

aitacobell•2h ago
> To be honest, this model not only competes with other AI models but also with humans, making it the first of its kind

Is this a joke?

Rperry2174•2h ago
I keep seeing these Grok 4 intelligence claims, so I tried something very simple: "Animate a round robin tournament for 10 people."

Results:
Claude: ~10s, perfect working demo
ChatGPT: ~20s, solid solution
Grok 4: ~1000s, failed completely, gave me a truncated base64 blob

This wasn't some obscure edge case... it was basic data visualization that any decent model should handle. Yet somehow Grok 4 is "competing with humans" and has "99% tool accuracy"...

I don't buy it..

Links:
Claude: https://claude.ai/share/7a413a6a-5c01-44a1-aaed-8b237e5e9e94
ChatGPT: https://chatgpt.com/canvas/shared/687a9f9d4304819187ac7d98d3...
Grok 4: https://grok.com/share/c2hhcmQtMw%3D%3D_20b61291-e1bb-45e5-a...

These benchmarks are either just wrong or measuring something completely divorced from practical utility imo...
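
For a sense of how small the scheduling part of that prompt is, here is a minimal sketch of a round-robin schedule using the standard circle method (fix one player, rotate the rest). The function name `round_robin` and the use of integers as players are assumptions for illustration; this is not what any of the models produced, and it leaves out the animation entirely.

```python
def round_robin(players):
    """Circle method: keep the first player fixed and rotate the others
    one position each round. Returns a list of rounds, each a list of pairs."""
    players = list(players)
    if len(players) % 2:
        players.append(None)  # odd count: None marks a bye
    n = len(players)
    rounds = []
    for _ in range(n - 1):
        # Pair the i-th player from the front with the i-th from the back.
        pairs = [(players[i], players[n - 1 - i]) for i in range(n // 2)]
        rounds.append([p for p in pairs if None not in p])
        # Rotate everyone except the first player by one position.
        players = [players[0], players[-1]] + players[1:-1]
    return rounds


schedule = round_robin(range(10))
for number, matches in enumerate(schedule, start=1):
    print(number, matches)
```

For 10 players this yields 9 rounds of 5 matches, with every pair meeting exactly once.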

4b11b4•1h ago
This article seems like pure garbage

A Short Story of the Google Error Page

https://meiert.com/blog/the-google-error-page/
1•varun_ch•19s ago•0 comments

CCO of private investment firm SMH caught cheating on Series 24 exam [pdf]

https://www.sec.gov/files/litigation/opinions/2025/34-103498.pdf
1•amendegree•1m ago•0 comments

Show HN: AI File Sorter: Organize Files and Folders with AI (Local LLMs)

https://github.com/hyperfield/ai-file-sorter
2•hyperfield•3m ago•0 comments

Who Hates YouTube?

1•thoth001•3m ago•2 comments

Stealth Macintosh Portable case mod

https://biosrhythm.com/?p=2956
1•classichasclass•3m ago•0 comments

Fcrand (Go language): drop-in replacement for crypto/rand, up to 10x faster

https://github.com/sdrapkin/fcrand
1•sdrapkin•5m ago•2 comments

Test Code Like Zelda: When to Implement Automated Testing

https://www.usetusk.ai/resources/when-to-implement-automated-testing
2•Marceltan•20m ago•0 comments

Target to end price-matching policy amid business challenges

https://time.com/7303400/target-price-matching-policy-ending/
2•hhs•22m ago•0 comments

How do you compute the midpoint of an interval? (2014) [pdf]

https://hal.science/file/index/docid/576641/filename/computing-midpoint.pdf
1•todsacerdoti•27m ago•0 comments

Show HN: Benchstreet – the stock prediction AI benchmark

https://github.com/puffinsoft/benchstreet
3•ColonelParrot•27m ago•0 comments

Dead Zone Dragging

https://www.steveruiz.me/posts/dead-zone
1•kierangill•29m ago•0 comments

Show HN: ts-explicit-errors – A TypeScript library for treating errors as values

https://github.com/adamhl8/ts-explicit-errors
1•genshii•29m ago•0 comments

Psilocybin therapy for mood dysfunction in Parkinson's disease: open-label trial

https://www.nature.com/articles/s41386-025-02097-0
1•nick__m•29m ago•3 comments

Easy Agents: Build autonomous agents with just natural language

https://github.com/kpolley/easy-agents
2•kpolls•34m ago•0 comments

Teufel Mynd open source / open hardware Bluetooth speaker

https://lu.teufelaudio.com/mynd-107002004
3•Eduard•38m ago•0 comments

Virtual Humans for Hire

https://www.holostaff.ai
3•dergalem•38m ago•0 comments

Slow Adoption Applies to Evil AI, Too

https://secondthoughts.ai/p/short-takes-2
2•gk1•42m ago•0 comments

Shape-shifting particles allow temperature control over fluid flow and stiffness

https://phys.org/news/2025-07-shifting-particles-temperature-fluid-stiffness.html
2•PaulHoule•48m ago•0 comments

Building Your Personal Assistant with Multi-Modal Memory

https://mirix.io/
3•wangyu164•48m ago•2 comments

Standardization of Office Open XML

https://en.wikipedia.org/wiki/Standardization_of_Office_Open_XML
3•fsflover•49m ago•0 comments

Apple sues leaker Jon Prosser for allegedly stealing iOS26 info from an employee

https://www.engadget.com/big-tech/apple-sues-leaker-jon-prosser-for-allegedly-stealing-ios-26-info-from-an-employee-123019259.html
2•apparent•50m ago•0 comments

Canadian Cross

https://en.wikipedia.org/wiki/Cross_compiler
6•tripdout•51m ago•0 comments

Arch Linux pulls AUR packages that installed Chaos RAT malware

https://www.bleepingcomputer.com/news/security/arch-linux-pulls-aur-packages-that-installed-chaos-rat-malware/
4•mikece•55m ago•1 comments

Silence Is a Commons by Ivan Illich (1983)

http://www.davidtinapple.com/illich/1983_silence_commons.html
16•entaloneralie•57m ago•0 comments

Detroit pitches Silicon Valley-types: Bring your next factory here

https://subscribe.detroitnews.com/restricted?return=https%3A%2F%2Fwww.detroitnews.com%2Fstory%2Fbusiness%2F2025%2F07%2F17%2Fdetroit-to-silicon-valley-types-bring-your-next-factory-here%2F85215724007%2F&gps-source=CPROADBLOCKDH&itm_source=roadblock&itm_medium=onsite&itm_campaign=premiumroadblock&gca-cat=pp&gca-ds=override
2•rmason•1h ago•1 comments

A Rare Object Found Deep in the Kuiper Belt – Universe Today

https://www.universetoday.com/articles/a-rare-object-found-deep-in-the-kuiper-belt
2•rbanffy•1h ago•0 comments

Dennis Gustafsson – Parallelizing the physics solver [video]

https://www.youtube.com/watch?v=Kvsvd67XUKw
3•SanJacobs•1h ago•0 comments

Rare earth element recycling impacts on semiconductor industries

https://link.springer.com/article/10.1007/s10163-025-02276-7
3•walterbell•1h ago•0 comments

Are We Cooked?

https://www.bonnycode.com/posts/are-we-cooked/
2•jackalnom•1h ago•0 comments

WebAssembly Component Model based REPL /w sandboxed multi-language plugin system

https://github.com/topheman/webassembly-component-model-experiments
1•topheman•1h ago•1 comments