What remains is: all those quirky little one-off processes that aren't very amenable to "robot arm" automation, aren't worth the process design effort to make them amenable to it, and are currently solved by human labor.
Thus, you design new solutions to target that open niche.
Humans aren't perfect at anything, but they are passable at everything. Universal worker robots attempt to replicate that.
"A drop-in replacement for simple human labor" is a very lucrative thing, assuming one could pull it off. And that favors humanoid hulls.
Not that the form is the real bottleneck there, not really. The problem of universal robots is fundamentally an AI problem. Today, we could build a humanoid body mechanically capable of over 90% of the industrial tasks humans perform, but not the AI that would actually drive it.
What's required is flexibility and adaptability with minimal training.
The success of large multipurpose AI models trained on web-scale data pushed a lot of people toward thinking that cracking general-purpose robot AI might be possible within a decade.
Whether transfer learning from human VR/teleop data is the best way to do it remains uncertain; there are many approaches to training and data collection. Transfer learning from web-scale data, teleoperation, and "RL IRL" are all common, though, usually at different ends of the training pipeline.
Tesla got the memo earlier than most, because Musk is a mad bleeding edge technology demon, but many others followed shortly before or during the public 2022 AI boom.
throwawayffffas•1h ago
That is not true. The routine is preprogrammed, but there is adaptability; if there weren't, it would fall to the ground within the first 5 seconds. The movement in the routine we saw requires continuous adjustment. You can't just record the movement as you would a video game animation: real physics gets in the way, and you end up on your back on the ground trying to do a jump and a backflip.
If you think I am wrong, sure, I could be, but have a look at Atlas: https://www.youtube.com/watch?v=oe1dke3Cf7I
The robot's motion is not preprogrammed at all; see how much smoother it is?
That's because Boston Dynamics uses an approach that calculates and accounts for the dynamics of motion, just like Unitree does.
The Kawasaki approach is clearly to use overwhelming torques to cancel out all the dynamics and produce fully controlled movement, exactly what an old man does, or a robotic arm in a factory. It's honestly embarrassing; it looks like Kawasaki has made no progress in the last 30 years, their robots still move like it's 1996.
Have a look at https://underactuated.csail.mit.edu/intro.html for a more in-depth explanation of the difference between the two approaches.
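To make the contrast concrete, here is a toy sketch (my own illustrative example, not either company's controller): a 1-DoF pendulum "arm" driven to 45 degrees, once by brute-force high-gain PD and once by a controller that models gravity and adds only gentle feedback. Both get there, but the stiff controller needs roughly a hundred times the peak torque.

```python
import math

# Toy 1-DoF pendulum "arm": I * theta'' = u - m*g*l*sin(theta) - b*theta'
m, g, l, b, dt = 1.0, 9.81, 0.5, 0.1, 0.001
I = m * l * l  # moment of inertia about the pivot

def step(theta, omega, u):
    """Integrate the pendulum one timestep under applied torque u."""
    alpha = (u - m * g * l * math.sin(theta) - b * omega) / I
    return theta + omega * dt, omega + alpha * dt

def stiff_pd(theta, omega, target):
    # "Overwhelming torque" style: huge gains simply out-muscle the dynamics.
    return 500.0 * (target - theta) - 50.0 * omega

def dynamics_aware(theta, omega, target):
    # Model-based style: cancel gravity explicitly, then add gentle feedback.
    return m * g * l * math.sin(theta) + 5.0 * (target - theta) - 1.0 * omega

for ctrl in (stiff_pd, dynamics_aware):
    theta, omega, peak = 0.0, 0.0, 0.0
    for _ in range(5000):  # 5 simulated seconds
        u = ctrl(theta, omega, math.pi / 4)
        peak = max(peak, abs(u))
        theta, omega = step(theta, omega, u)
    print(f"{ctrl.__name__}: final angle {theta:.3f} rad, peak torque {peak:.1f} N*m")
```

Scale that torque gap up to a full humanoid and you get exactly the rigid, factory-arm motion described above.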
imtringued•14m ago
There are two main ways to accomplish what the kung-fu robot does.
First, you train a reinforcement learning policy for balancing, walking, and a bunch of dynamic movements; then you record the movement you want to perform using motion capture; then you play back the recorded trajectory.
Second, you train a reinforcement learning policy for balancing and walking, but bake the recorded movement directly into the policy.
Okay, I lied. There is also a third way: use model predictive control, build a balancing objective by hand, and replay the recorded trajectory. I don't think this method would be as successful for the shown choreography, but it's what Boston Dynamics did for a long time.
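Whichever recipe is used, the "task description" ends up as a recorded trajectory the policy is rewarded for tracking. A hedged sketch of what such a reward might look like (all names and weights are my own illustrative assumptions, nobody's actual training code):

```python
import math

# Sketch of methods one/two above: an RL reward combining a balance term
# with an imitation term that tracks a pre-recorded mocap trajectory.
# Weights and names are illustrative assumptions only.

def reward(joints, roll, pitch, mocap_frame):
    # Balance term: penalize torso tilt so the policy stays upright.
    balance = math.exp(-5.0 * (roll ** 2 + pitch ** 2))
    # Imitation term: track the recorded joint pose for this timestep.
    err = sum((q - q_ref) ** 2 for q, q_ref in zip(joints, mocap_frame)) / len(joints)
    tracking = math.exp(-2.0 * err)
    # The policy is paid for following the choreography while continuously
    # adjusting to keep its balance -- adaptability in service of playback.
    return 0.4 * balance + 0.6 * tracking

pose = [0.1, -0.3, 0.7]
print(reward(pose, 0.0, 0.0, pose))  # perfect tracking while upright
print(reward(pose, 0.2, 0.0, pose))  # same pose but tilted: scores lower
```

The policy learns genuine adaptability (balance, disturbance rejection), but only in service of replaying a fixed choreography.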
In all of these cases you will still be limited to a pre-recorded task description. Is this really that hard to understand? Do you really think someone taught the robot the choreography in Chinese and by performing the movement in front of the humanoid's camera, the way you would teach a real human, or that the robot came up with the choreography on its own? Because that's the conclusion you have to draw if you deny the methods described above.
imtringued•21m ago
The actual concern here is that there are too many cuts. If the whole table-moving sequence were uncut and fully autonomous, that would mean they have the most advanced humanoid robot software in the world.
It would mean they can autonomously find the correct grasping locations on the table for both arms, which requires the robot to have a model of the table. The robot would also need to know at what height to hold the table to keep it level, and to compensate for the human pulling on it, all while balancing and autonomously following the direction of the pull.
Of course, since there were many cuts, we don't really know whether that's true. We also don't know if teleoperation is involved or not.
The Chinese robot dancing is cool because it shows what the hardware is capable of, but it doesn't really show anything on the software side. Contact with objects is hard in robotics, and the kung-fu choreography avoids it for obvious reasons.
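For the level-keeping and pull-following part specifically, the standard tool would be admittance control, sketched below (a hedged guess at the kind of behavior involved, not Kawasaki's actual stack; all parameters are illustrative):

```python
# Admittance-control sketch: the robot senses the force the human transmits
# through the table and "gives way" along it, while a separate term nudges
# the grasp height to keep the table level. Illustrative parameters only.

def admittance_step(v, f_ext, dt=0.01, M=10.0, D=25.0):
    # Virtual mass-damper: M * dv/dt + D * v = f_ext.
    # A steady pull makes the hand velocity converge to f_ext / D.
    return v + dt * (f_ext - D * v) / M

def level_correction(table_tilt_rad, k=0.5):
    # Raise/lower the grasp in proportion to measured table tilt.
    return -k * table_tilt_rad

v = 0.0
for _ in range(200):  # human pulls with a steady 20 N for 2 simulated seconds
    v = admittance_step(v, 20.0)
print(round(v, 3))  # approaches 20 / 25 = 0.8 m/s in the pull direction
```

No model of the human is needed, only force sensing; the hard part the video hides is whether any of this runs autonomously rather than via teleoperation.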