
Grok 4 Fast

https://x.ai/news/grok-4-fast
71•meetpateltech•7h ago

Comments

mrklol•2h ago
The pricing is really good for this level of benchmark performance. Let’s see how it holds up once people start testing it.
NitpickLawyer•2h ago
If this is the sonoma-dusk model that was in preview on OpenRouter, it's pretty cool. I've tested it on some code reverse-engineering tasks, and it's at or above gpt-5-mini level while being faster. It works well up to tasks of about 110-130k tokens; after that it gets a case of "getthereitis" and declares the task finished even when not all constraints are met (e.g. it will say "I've solved x/400 tests, the rest can be done later").
mrklol•1h ago
I can imagine; no model so far has been able to actually use those context sizes…
RayVR•2h ago
A faster model that outperforms its slower version on multiple benchmarks? Can anyone explain why that makes sense? Are they simply retraining on the benchmark tests?
NitpickLawyer•2h ago
> Can anyone explain why that makes sense?

It could be anything from a different architecture to more data to RL. It's probably RL: in recent months the top-tier labs seem to have "cracked" RL to a level not yet seen in open models, and by a large margin.

raincole•2h ago
Just two different models branded under similar names. That's it. Grok 4 is not the slower version of Grok 4 Fast, just like gpt-4 is not the slower version of gpt-4o.
yorwba•1h ago
It doesn't outperform uniformly across benchmarks. It's worse than Grok 4 on GPQA Diamond and HLE (Humanity's Last Exam) without tools, both of which require the model to have memorized a large number of facts. Large (and thus slow) models typically do better on these.

The other benchmarks focus on reasoning and tool use, so the model doesn't need to have memorized quite so many facts, it just needs to be able to transform them from one representation to another. (E.g. user question to search tool call; list of search results to concise answer.) Larger models should in theory also be better at that, but you need to train them for those specific tasks first.
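A minimal sketch of that kind of transformation (the tool name and JSON shape are invented for illustration, not any model's actual format):

    # User question -> structured search tool call (shape is illustrative)
    question = "Who won the 2022 Fields Medal?"
    tool_call = {
        "name": "web_search",  # hypothetical tool
        "arguments": {"query": "2022 Fields Medal winners"},
    }
    # Tool results -> concise answer: the model compresses the result list
    results = ["ICM 2022 press release: four medalists announced..."]
    answer = "Hugo Duminil-Copin, June Huh, James Maynard and Maryna Viazovska."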

So I don't think they simply trained on the benchmark tests; rather, they shifted their training mix to emphasize particular tasks, and in the announcement they highlight the benchmarks that test those tasks, where their model performs better.

You could also write an anti-announcement by picking a few more fact recall benchmarks and highlighting that it does worse at those. (I assume.)

uyzstvqs•3m ago
Grok 4 Fast is likely Grok 4 distilled down, removing capacity that rarely if ever gets activated in production. In that case you'd expect these results: it's essentially the same logic copied from the big model, but more focused.
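For what it's worth, "distilled" here usually means training a smaller student model to match the big teacher model's output distribution. A generic sketch of the standard temperature-scaled KL loss (nothing xAI has confirmed about their method):

    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, T=2.0):
        # Soften both distributions with temperature T, then push the
        # student toward the teacher's outputs via KL divergence.
        log_p_student = F.log_softmax(student_logits / T, dim=-1)
        p_teacher = F.softmax(teacher_logits / T, dim=-1)
        return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * T * T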
zone411•2h ago
Matches Grok 4 at the top of the Extended NYT Connections leaderboard: https://github.com/lechmazur/nyt-connections/
turblety•2h ago
Why would a model called "Fast" not advertise its tokens-per-second speed? Does "Fast" mean something other than raw speed? Or is it too variable to quote?
barrell•1h ago
I would guess that it's essentially just a "Grok 4 mini", but if you use "mini" as the qualifier, most people will be inclined not to use it. Calling it "fast" gives people a reason to select it.
IanCal•27m ago
They sound like they're positioning it as faster to complete tasks because it uses fewer tokens - see the mentions of token efficiency.
johnisgood•2h ago
I think we all want fast AND accurate. Is the "AND accurate" part true for this model? I would rather wait a few seconds longer if the result is much more accurate.
awestroke•3m ago
The only way to get this reliably is to have it use tools
hi_hi•1h ago
I'm waiting for the Tesla FSD playbook to be rolled out for Grok. That is: launch something named, say, Grok AGI 1; wait for it to become obvious it isn't in fact AGI; create a narrative redefining AGI; promise the new AGI is 1 year away; and repeat for many years.
padjo•55m ago
Bonus points if you manage to kill a few poor deluded saps with your unsafe product along the way.
zozbot234•54m ago
> create a narrative redefining AGI

Hasn't OpenAI redefined AGI already as "any AI that can [supposedly] create a hecto-unicorn's worth of economic value"?

adt•1h ago
https://lifearchitect.ai/models-table/
zozbot234•1h ago
For the fastest performance, run it on Groq. /s
defrost•1h ago
It's all due to robust primitives: https://www.glscott.org/uploads/2/1/3/3/21330938/5375912_ori...
nomilk•54m ago
Surprising to see negativity here. I send all my LLM queries to 5 LLMs - ChatGPT, Claude, DeepSeek (local), Perplexity, and Grok - and Grok consistently gives good answers, often the most helpful ones. It's ~always king when there's any 'ethical' consideration (i.e. when the other LLMs refuse to answer - I stopped bothering with Gemini for this reason).

'Ethical' is in quotes because I can see why other LLMs refuse to answer things like "can you generate a curl request to exploit this endpoint" - a prompt used frequently during pen testing. I grew tired of telling ChatGPT "it's for a script in a movie". There are plenty of other examples (yesterday Claude accused me of violating its usage policy when I asked "can polar bears eat frozen meat" - I was curious after seeing a photograph of a polar bear discovering a frozen whale exposed by melting ice). Grok gave a sane answer, of course.

renw0rp•48m ago
How do you manage sending and receiving requests to multiple LLMs? Are you doing it manually through multiple UIs, or using some app that integrates with multiple APIs?
nomilk•46m ago
I created a workflow using Alfred on macOS [0]. You press command + space, type 'llm' followed by the prompt, and hit enter; it opens the 5 tabs in the browser.

These are the urls that are opened:

http://localhost:3005/?q={query}

https://www.perplexity.ai/?q={query}

https://x.com/i/grok?text={query}

https://chatgpt.com/?q={query}&model=gpt-5

https://claude.ai/new?q={query}

Extremely convenient.

(little tip: submitting to grok via URL parameter gets around free Grok's rate limit of 2 prompts per 2 hours)

[0] https://github.com/stevecondylios/alfred-workflows/tree/main
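If you don't use Alfred, a rough Python equivalent of the same trick (a sketch using the URLs above):

    import sys
    import webbrowser
    from urllib.parse import quote

    # Same five endpoints as above; {q} is the URL-encoded prompt.
    URLS = [
        "http://localhost:3005/?q={q}",
        "https://www.perplexity.ai/?q={q}",
        "https://x.com/i/grok?text={q}",
        "https://chatgpt.com/?q={q}&model=gpt-5",
        "https://claude.ai/new?q={q}",
    ]

    def ask_all(prompt: str) -> None:
        q = quote(prompt)
        for url in URLS:
            webbrowser.open_new_tab(url.format(q=q))

    if __name__ == "__main__":
        ask_all(" ".join(sys.argv[1:]))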

Saline9515•34m ago
You can do it directly using OpenRouter.
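For example, something like this against OpenRouter's OpenAI-compatible chat completions endpoint (the model slugs below are illustrative; check their current model list):

    import requests

    MODELS = [  # illustrative slugs
        "x-ai/grok-4-fast",
        "openai/gpt-5",
        "anthropic/claude-sonnet-4",
    ]

    def ask_all(prompt: str, api_key: str) -> dict:
        answers = {}
        for model in MODELS:
            r = requests.post(
                "https://openrouter.ai/api/v1/chat/completions",
                headers={"Authorization": f"Bearer {api_key}"},
                json={"model": model,
                      "messages": [{"role": "user", "content": prompt}]},
            )
            r.raise_for_status()
            answers[model] = r.json()["choices"][0]["message"]["content"]
        return answers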
devjab•34m ago
I've found the results shift quite a lot between models and updates. DeepSeek is pretty consistently good at writing code that is easy to improve from mid to good quality. Claude used to be pretty good, but now writes 10x the code you'd need. Gemini is amazing if you buy one of the more expensive tiers, which in turn isn't really worth it because there are so many other options. GPT and Grok are hit and miss: they deliver great code or they deliver horrible code. GPT and Claude have become such a hurdle that I've had to turn GitHub Copilot off in VS Code. Basically I use DeepSeek for brainstorming and GPT for writing configs, queries, SQL and so on. If either of them fails me I'll branch out, and Grok will be on that list. When I occasionally face a real issue where I'm unsure about the engineering aspects, I'll use one of my scarce free Gemini Pro queries. I'd argue that we should pay for it at my work, but since it's Google, that will never happen.

From an ethical perspective (and I'm based in Denmark, mind you) they are all equally horrible in my opinion. I can see why anyone in the Anglo-Saxon world would be opposed to Elon's products, but from my perspective he's just another oligarch. The only thing which sets him apart from other tech oligarchs is that he's foolish enough to voice the opinion publicly. If you're based in the US or in any form of government position, then I can see why DeepSeek is problematic, but at least China hasn't threatened to take Greenland by force. Also, where I work, China has produced basically all of our hardware, with possible hardware backdoors in around 70% of our IoT devices.

I will give a shoutout to French Mistral, but the truth is that it's just not as good as its competition.

franze•31m ago
Really, you are "surprised" to see the negativity here?
andriesm•6m ago
Yes, many of us are surprised at the negativity toward Grok.

Grok is a top contender for me.

I also use 5 LLMs in parallel every day, but my default stack is Grok, DeepSeek, Gemini 2.5 Pro, ChatGPT, and Claude - the same as OP, except I usually swap Perplexity out for Gemini. (DeepSeek with search has mostly become my Perplexity replacement.)

Most of my questions don't hit topics prone to triggering safety blocks; in those cases I find Gemini surprisingly strong, but for difficult things Grok often wins.

Gemini, Grok, and Claude all benefit a lot whenever they supplement their knowledge with on-demand searches rather than just quick reasoning. Ask Gemini Pro a deep-insight question without making it research and you will discover the hallucinations: logical conclusions that contradict actual known facts, etc. Same with Grok. When Claude Code CLI is going in circles, remind it to google for more information to break it out.

Grok one-shotted a several-hundred-line replacement algorithm for part of an operational-transform library that had had a bug for the last 5 revisions. It passed all my tests. The base Grok 4 model wasn't even optimized for code at that time. Color me impressed!

raincole•10m ago
I believe that, despite all the hate it's getting today, we'll one day be grateful that at least one big AI provider chose a route with less lobotomy.
ramijames•41m ago
I will never use a product built by one of Elon Musk's teams. Never.
faangguyindia•24m ago
My only problem is that I use custom frontends, and unlike Qwen3 Coder, I don't see Grok 4 Fast offering any free API access to test out these models.

The tools they've partnered with, I don't really use.

manav•4m ago
We're all training similarly large base++ models on nearly the same data, just pricing it differently... with Grok removing a few filters and maybe some safeguards? For that matter, many of the benchmarks are flawed and easily gamed. iykyk.

Less is safer: How Obsidian reduces the risk of supply chain attacks

https://obsidian.md/blog/less-is-safer/
323•saeedesmaili•11h ago•145 comments

I'm Not a Robot Game

https://neal.fun/not-a-robot/
79•meetpateltech•3d ago•35 comments

Show HN: FocusStream – Focused, distraction-free YouTube for learners

https://focusstream.media
16•pariharAshwin•2h ago•15 comments

If all the world were a monorepo

https://jtibs.substack.com/p/if-all-the-world-were-a-monorepo
137•sebg•4d ago•40 comments

PYREX vs. Pyrex: What's the Difference?

https://www.corning.com/worldwide/en/products/life-sciences/resources/stories/in-the-field/pyrex-...
40•lisper•2h ago•20 comments

Compiling with Continuations

https://swatson555.github.io/posts/2025-09-16-compiling-with-continuations.html
37•swatson741•3d ago•1 comment

High-performance read-through cache for object storage

https://github.com/s2-streamstore/cachey
37•pranay01•5h ago•6 comments

If you are good at code review, you will be good at using AI agents

https://www.seangoedecke.com/ai-agents-and-code-review/
22•imasl42•4h ago•8 comments

Show HN: WeUseElixir - Elixir project directory

https://weuseelixir.com/
158•taddgiles•13h ago•29 comments

Hidden risk in Notion 3.0 AI agents: Web search tool abuse for data exfiltration

https://www.codeintegrity.ai/blog/notion
121•abirag•11h ago•32 comments

Feedmaker: URL + CSS selectors = RSS feed

https://feedmaker.fly.dev
129•mustaphah•12h ago•21 comments

Ants that seem to defy biology – They lay eggs that hatch into another species

https://www.smithsonianmag.com/smart-news/these-ant-queens-seem-to-defy-biology-they-lay-eggs-tha...
388•sampo•21h ago•122 comments

The best YouTube downloaders, and how Google silenced the press

https://windowsread.me/p/best-youtube-downloaders
330•Leftium•21h ago•150 comments

Internet Archive's big battle with music publishers ends in settlement

https://arstechnica.com/tech-policy/2025/09/internet-archives-big-battle-with-music-publishers-en...
312•coloneltcb•4d ago•121 comments

Ruby Central's Attack on RubyGems [pdf]

https://pup-e.com/goodbye-rubygems.pdf
651•jolux•1d ago•220 comments

Supporting Our AI Overlords: Redesigning Data Systems to Be Agent-First

https://arxiv.org/abs/2509.00997
16•derekhecksher•5h ago•4 comments

Sangaku Puzzle I Can't Solve

https://samjshah.com/2025/08/05/sangaku-puzzle-i-cant-solve/
11•speckx•3d ago•0 comments

Claude Can (Sometimes) Prove It

https://www.galois.com/articles/claude-can-sometimes-prove-it
3•lairv•2d ago•0 comments

Three-Minute Take-Home Test May Identify Symptoms Linked to Alzheimer's Disease

https://www.smithsonianmag.com/smart-news/three-minute-take-home-test-may-identify-symptoms-linke...
89•pseudolus•14h ago•44 comments

A 3D-Printed Business Card Embosser

https://www.core77.com/posts/138492/A-3D-Printed-Business-Card-Embosser
79•surprisetalk•2d ago•24 comments

Kernel: Introduce Multikernel Architecture Support

https://lwn.net/ml/all/20250918222607.186488-1-xiyou.wangcong@gmail.com/
156•ahlCVA•17h ago•44 comments

Your very own humane interface: Try Jef Raskin's ideas at home

https://arstechnica.com/gadgets/2025/09/your-very-own-humane-interface-try-jef-raskins-ideas-at-h...
92•zdw•15h ago•13 comments

Show HN: Zedis – A Redis clone I'm writing in Zig

https://github.com/barddoo/zedis
102•barddoo•11h ago•74 comments

Micro-LEDs boost random number generation

https://discovery.kaust.edu.sa/en/article/25936/micro-leds-boost-random-number-generation/
44•giuliomagnifico•3d ago•14 comments

Shipping 100 hardware units in under eight weeks

https://farhanhossain.substack.com/p/how-we-shipped-100-hardware-units
129•M_farhan_h•1d ago•74 comments

Grok 4 Fast

https://x.ai/news/grok-4-fast
71•meetpateltech•7h ago•40 comments

An untidy history of AI across four books

https://hedgehogreview.com/issues/lessons-of-babel/articles/perplexity
104•ewf•15h ago•35 comments

R MCP Server

https://github.com/finite-sample/rmcp
90•neehao•3d ago•12 comments

Faster Argmin on Floats

https://algorithmiker.github.io/faster-float-argmin/
17•return_to_monke•1d ago•7 comments

Trump to impose $100k fee for H-1B worker visas, White House says

https://www.reuters.com/business/media-telecom/trump-mulls-adding-new-100000-fee-h-1b-visas-bloom...
1074•mriguy•13h ago•1386 comments