Being just a domestic 3D printer enthusiast, I have no idea what the real-world issues are in manufacturing with CNC mills; I'd personally enjoy an AI telling me which of the 1000 possible combinations of line width, infill %, temperatures, speeds, wall-generation params, etc. to use for a given print.
I wonder if the models' improved image understanding also leads to better spatial understanding.
https://seanmcloughl.in/3d-modeling-with-llms-as-a-cad-luddi...
It gets pretty confused about the rotation of some things and generally needs manual fixing. But it kind of gets the big picture sort of right. It mmmmayybe saved me time the last time I used it but I'm not sure. Fun experiment though.
I can see AI being used to generate geometry, but not a text-based one; it would have to be able to reason with 3D forms and do differential geometry.
You might be able to get somewhere by training an LLM to make models with a DSL for Open Cascade, or any other sufficiently powerful modelling kernel. Then you could train the AI to make query-based commands, such as:
// places a threaded hole at every corner of the top surface (maybe this is an enclosure)
CUT hole(10mm,m3,threaded) LOCATIONS surfaces().parallel(Z).first().inset(10).outside_corners()
This has a better chance of being robust, as the LLM would just have to remember common patterns, rather than manually placing holes in 3D space, which is much harder.

The long prompts are primarily an artifact of trying to make an eval where there is a "correct" STL.
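A toy sketch in pure Python of why the query-chain idea is more robust than raw coordinates (all class and method names here are hypothetical, loosely mirroring the DSL above, not any real kernel's API): the LLM only emits selectors, and the kernel resolves them into concrete hole centers.

```python
# Hypothetical fluent-query sketch: select a face, then derive hole
# locations from it, instead of asking the LLM for raw XYZ coordinates.

class Surface:
    def __init__(self, normal, corners):
        self.normal = normal    # e.g. (0, 0, 1) for a face whose normal is +Z
        self.corners = corners  # list of (x, y) corner points in mm

class SurfaceQuery:
    def __init__(self, surfaces):
        self._surfaces = list(surfaces)

    def parallel(self, axis):
        # keep only surfaces whose normal matches the given axis
        return SurfaceQuery(s for s in self._surfaces if s.normal == axis)

    def first(self):
        return self._surfaces[0]

def outside_corners(surface, inset):
    # pull each corner toward the face centroid by `inset` mm
    cx = sum(x for x, _ in surface.corners) / len(surface.corners)
    cy = sum(y for _, y in surface.corners) / len(surface.corners)
    def pull(p):
        x, y = p
        return (x + inset * (1 if cx > x else -1),
                y + inset * (1 if cy > y else -1))
    return [pull(p) for p in surface.corners]

Z = (0, 0, 1)
top = SurfaceQuery(
    [Surface(Z, [(0, 0), (100, 0), (100, 60), (0, 60)])]
).parallel(Z).first()

print(outside_corners(top, 10))
# → [(10, 10), (90, 10), (90, 50), (10, 50)]
```

The point of the pattern: if the enclosure is later resized, the same selector chain still finds the right corners, whereas hand-placed coordinates would all be stale.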
I think your broader point, that text input is bad for CAD, is also correct. Some combo of voice/text input plus using a cursor to click on geometry makes sense. For example, clicking on the surface in question and then asking for "M6 threaded holes at the corners". I think a drawing input also makes sense, as it's quite quick to do.
If the model could plan ahead well, set up good functions, pull from standard libraries, etc., it would be instantly better than most humans.
If it had a sense of real-world applications, physics, etc., well, it would be superhuman.
Is anyone working on this right now? If so I'd love to contribute.
Good to hear that newer models are getting better at this. With evals and RL feedback loops, I suspect it's the kind of thing that LLMs will get very good at.
Vision language models can also improve their 3D model generation if you give them renders of the output: "Generating CAD Code with Vision-Language Models for 3D Designs" https://arxiv.org/html/2410.05340v2
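A minimal sketch of the render-feedback loop the linked paper describes (the function bodies here are stand-in stubs, not the paper's actual API; a real setup would call a vision-language model and an OpenSCAD renderer):

```python
# Iteratively refine generated CAD code by showing the model renders
# of its own output. llm_generate and render are placeholder stubs.

def llm_generate(prompt, image=None, previous=None):
    # stub: pretend the model fixes one issue per round once it sees a render
    if previous is None:
        return "cube([10, 10, 10]);  // first draft"
    return previous.replace("// first draft", "// revised after seeing render")

def render(code):
    # stub: a real implementation might shell out to an OpenSCAD render
    return f"<png of: {code}>"

def refine(prompt, iterations=2):
    code = llm_generate(prompt)
    for _ in range(iterations):
        code = llm_generate(prompt, image=render(code), previous=code)
    return code

print(refine("a 10 mm cube"))
```

The loop structure is the whole trick: the model never has to get the geometry right blind; it gets visual feedback each round, much like a human glancing at the viewport.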
OpenSCAD is primitive. There are many libraries that may give LLMs a boost. https://openscad.org/libraries.html
I'm having trouble understanding why you would want to do this. A good interface between what I want and the model I will make is to draw a picture, not write an essay. This is already (more or less) how Solidworks operates. AI might be able to turn my napkin sketch into a model, but I would still need to draw something, and I'm not good at drawing.
The bottleneck continues to be having a good enough description to make what you want. I have serious doubts that even a skilled person will be able to do it efficiently with text alone. Some combo of drawing and point+click would be much better.
This would be useful for short enough tasks like "change all the #6-32 threads to M3" though. To do so without breaking the feature tree would be quite impressive.
One thing that is interesting here: you can read faster than TTS can speak, but you can speak much faster than you can type. So is all that typing really the problem, or is it just an interface problem? And in your example, you could also just draw with your hand (wrist sensor) and talk.
I've been using agents to code this way. It's way faster.
Most of the mechanical people I've met are good at talking with their hands. "take this thing like this, turn it like that, mount it like this, drill a hole here, look down there" and so on. We still don't have a good analog for this in computers. VR is the closest we have and it's still leagues behind the Human Hand mk. 1. Video is good too, but you have to put in a bit more attention to camerawork and lighting than taking a selfie.
For instance: My modelling abilities are limited. I can draw what I want, with measurements, but I am not a draftsman. I can also explain the concept, in conversational English, to a person who uses CAD regularly and they can hammer out a model in no time. This is a thing that I've done successfully in the past.
Could I just do it myself? Sure, eventually! But my modelling needs are very few and far between. It isn't something I need to do every day, or even every year. It would take me longer to learn the workflow and toolsets of [insert CAD system here] than to just earn some money doing something that I'm already good at and pay someone else to do the CAD work.
Except maybe in the future, perhaps, I will be able to use the bot to help bridge the gap between a napkin sketch of a widget and a digital model of that same widget. (Maybe like Scotty tried to do with the mouse in Star Trek IV.)
(And before anyone says it: I'm not really particularly interested in becoming proficient at CAD. I know I can learn it, but I just don't want to. It has never been my goal to become proficient at every trade under the sun and there are other skills that I'd rather focus on learning and maintaining instead. And that's OK -- there's lots of other things in life that I will probably also never seek to be proficient at, too.)
I suspect the next step will be such a departure that it won't be Siemens, Dassault, or Autodesk that do it.
For some reason they imagine it as a daunting, complicated, impenetrable task with pitfalls that seem insurmountable: the interface, the general idea of how it operates, fear of unknown details (tolerances, clearances).
It's easy to underestimate the knowledge required to use CAD productively.
One piece of anecdata near me: high schools that buy 3D printers and assume pupils will naturally want to print models. After the initial days of fascination, the printers stopped being used at all. I've heard from a person close to education that it's a country-wide phenomenon.
Back to the point though: maybe there's a group of users who want to create but just can't do CAD at all, and such a text description seems perfect for them.
Curious if the real unlock long-term will come from hybrid workflows: LLMs proposing parameterized primitives, humans refining them in the UI, then LLMs iterating on feedback. Kind of like pair programming, but for CAD.
Yes to your thought about the hybrid workflows. There's a lot of UI/UX to figure out about how to go back and forth with the LLM to make this useful.
I think it's correct that new workflows will need to be developed, but I also think that codeCAD in general is probably the future. You get better scalability (share libraries for making parts, rather than the data), better version control, more explicit numerical optimization, and the tooling can be split up (i.e. when programming, you can use a full-blown IDE, or you can use a text editor and multiple individual tools to achieve the same effect). The workflow issue, at least to me, is common to all applications of LLMs, and something that will be solved out of necessity. In fact, I suspect that improving workflows by adding multiple input modes will improve model performance on all tasks.
I had the same thought recently and designed a flexible bracelet for Pi Day using OpenSCAD and a mix of some of the major AI providers. It's cool to see other people doing similar projects. I'm surprised how well I can do basic shapes in OpenSCAD with these AI assistants.
I took measurements.
I provided contours.
Still have a long way to go. https://github.com/itomato/EmateWand
>I went with my colleague Keith Bradsher to Zeekr, one of China’s new car companies. We went into the design lab and watched the designer doing a 3D model of one of their new cars, putting it in different contexts — desert, rainforest, beach, different weather conditions.
>And we asked him what software he was using. We thought it was just some traditional CAD design. He said: It’s an open-source A.I. 3D design tool. He said what used to take him three months he now does in three hours.
[0] https://www.nytimes.com/2025/04/15/opinion/ezra-klein-podcas...
Maybe there could be a mating/assembly eval in the future that would work towards that?