Agent Smith, _The Matrix_
But that was the only thing I tripped on. I enjoyed reading the article in general.
was the giveaway for me
people use LLMs for writing. we know! get over it.. or don't... i don't really care.. but I'd rather read a discussion about the article contents and not the writing style.
this kind of comment is the new "discuss the font choice / background color / anything but what the article is actually saying."
> it gets really tiring reading this kind of side-tracked comment thread in like.. every post.
If someone is of the opinion that something constitutes low quality, then a high volume of such writing is no reason to stop criticizing it, but on the contrary a reason to oppose its normalization.
>Grok showed discipline, despite its goblin-like nature.
People experience the world through the tools they're most familiar with. For some people, that's throwing money at things. I suppose from a sufficiently high level perspective everything is gambling.
Back when Battlebots was a big deal, I never once considered what it would feel like to be the management or sponsorship of those teams. I only cared about the actual battling of bots.
Please learn how to write with AI without giving away that it was written by AI.
Really, I use AI every damn day at work. I don't get how people can't instantly recognize whether something is completely AI, AI with light proofreading, or human-written.
I would call this AI with very light proofreading.
That would make it less effective in situations that would be better handled if sprinting was a feature.
It has something actionable that will match its actions
But it's not actually 4.1 anymore. They silently rerouted it to 4.3 and just started charging more - https://www.reddit.com/r/grok/comments/1ta8yrn/grok_41_fast_...
Quite a bad practice.
Grok 4.1 Fast won 13 of 30 games at $0.97 per win
The next-best winner was Claude Sonnet 4.6 with 5 wins, at $26.78 per win. That’s a 27x difference. The model that isn’t on most top-model lists beat the model that is, on the thing a routing customer actually cares about.
The model with the most kills did not win
GPT 5.4 killed 38 agents across 30 games. More than anyone else. It came in second on the leaderboard with 2 wins.
If grok-4.1-fast was the top-winning model, and Claude 4.6 Sonnet the second, how did Gpt-5.4 come in second on the leaderboard? Which one is second, Claude 4.6 Sonnet or Gpt-5.4?
> There were 11 games between “best at killing” and “best at winning”.
What does that mean? How are there 11 games between "best at killing" and "best at winning"? i feel like i'm missing a whole lot of context to this article. is it part of a series, or just written with an assumption that i'm going to know what they're talking about?
It's a monster at coding. And a fast monster at that.
I use it daily and have been testing if MiMo 2.5 (non pro) is comparable. The nice thing about MiMo is that it has vision capability.
Such is life in royal rumble games.
I have a lot of thoughts unrelated to the game experiment but more about how these opus/ultra size models can possibly be a financially viable product at scale when it costs $3000 to play 30 simple games. It just seems much, much higher than what it would cost to get a human to play 30 rounds.
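For anyone who wants to sanity-check the numbers being thrown around, here is a minimal back-of-the-envelope sketch. It only uses figures quoted in this thread ($3000 for 30 games, $0.97 and $26.78 per win); the helper name `per_unit_cost` and the constant names are made up for illustration, and none of the figures are independently verified.

```python
def per_unit_cost(total_usd: float, units: int) -> float:
    """Average cost per unit (a game or a win); units must be positive."""
    if units <= 0:
        raise ValueError("units must be positive")
    return total_usd / units


# Figures as quoted in the thread (assumptions, not verified data).
LARGE_MODEL_TOTAL_USD = 3000.0  # "$3000 to play 30 simple games"
GAMES_PLAYED = 30
GROK_COST_PER_WIN_USD = 0.97
SONNET_COST_PER_WIN_USD = 26.78

# Average spend per game for the large model.
per_game = per_unit_cost(LARGE_MODEL_TOTAL_USD, GAMES_PLAYED)

# How many times more a Sonnet win cost than a Grok win.
ratio = SONNET_COST_PER_WIN_USD / GROK_COST_PER_WIN_USD

print(f"large model, per game: ${per_game:.2f}")
print(f"per-win cost ratio: {ratio:.1f}x")
```

So the large model works out to roughly $100 per game, and the quoted per-win gap is a bit under 28x, which lines up with the article's "27x" if it truncates rather than rounds.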
The things LLMs are good at, you do not actually need for an agent like this. You can use classical AI methods. But that would be a boring article.
But really I would prefer whichever one is most likely to trip and fall over.