A proof of concept tool to verify estimates

https://terrytao.wordpress.com/2025/05/01/a-proof-of-concept-tool-to-verify-estimates/

85•jjgreen•2mo ago

Comments

esafak•2mo ago

Nice to see LLMs being put to such use! I see the heavy lifting here is due to linear programming:

https://github.com/teorth/estimates/blob/main/src/estimates....

eh_why_not•2mo ago

The ChatGPT session he links [0] shows how powerful the LLM is in aiding and teaching programming. A patient, resourceful, effective, and apparently deeply knowledgeable tutor! At least for beginners.

[0] https://chatgpt.com/share/68143a97-9424-800e-b43a-ea9690485b...

nh23423fefe•2mo ago

I'm constantly shocked by the number of my coworkers who won't even try to use an LLM to get stuff done faster. It's like they want it to be bad so they don't have to improve.

Ygg2•2mo ago

And I'm constantly shocked by the number of people still shilling for it, despite it hallucinating constantly.

Plus having used it in JetBrains IDE it makes me sad to see them ditching their refactoring for LLM refuctoring.

regularjack•2mo ago

The normal refactorings are still there AFAICT.

Ygg2•2mo ago

That implies that they were there in the first place. For some IDEs the refactorings are essentially rename, and buy the JetBrains AI plugin.

lazyasciiart•2mo ago

Then don't complain about them going away?

Ygg2•2mo ago

I didn't complain about them going away. I complained about them pushing an LLM upsell rather than implementing refactorings like they used to for their previous IDEs (e.g. IntelliJ).

lazyasciiart•2mo ago

You did, actually.

Ygg2•2mo ago

I complained about them no longer being added, not being removed (at least not yet). Look at CLion refactorings, compare this to IDEA and Rider that preceded the LLM enshittification.

For C++, there should be quite a few refactorings on account of it being OOP like Java.

Even IDEA and Rider didn't add any new refactorings, despite Java advancing quite a bit.

bcrosby95•2mo ago

Maybe they have tried and found it lacking?

I have an on again off again relationship with LLMs. I always walk away disappointed. Most recently for a hobby project around 1k lines so far, and it outputs bugs galore, makes poor design decisions, etc.

It's ok for one off scripts, but even those it rarely one shots.

I can only assume people who find it useful are working on different things than I am.

kevmo314•2mo ago

Yeah I'm in the holding it wrong camp too. I really want LLMs to work, but every time I spend effort trying to get it to do something I end up with subtle errors or a conclusion that isn't actually correct despite looking correct.

Most people tell me I'm just not that good at prompting, which is probably true. But if I'm learning how to prompt, that's basically coding with more steps. At that point it's faster for me to write the code directly.

The one area where it actually has been successful is (unsurprisingly) translating code from one language to another. That's been a great help.

kaoD•2mo ago

I have never been told I'm bad at prompting, but people swear LLMs are so useful to them I ended up thinking I must be bad at prompting.

Then I decided to take on offers to help me with a couple problems I had and, surprise, LLMs were indeed useless even when being piloted by people that swear by them, in the pilot's area of expertise!

I just suspect we're indeed not bad at prompting but instead have different kinds of problems that LLMs are just not (yet?) good at.

I tend to reach for LLMs when I'm (1) lazy or (2) stuck. They never help with (2) so it must mean I'm still as smart as them (yay!) They beat me at (1) though. Being indefatigable works in their favor.

Scarblac•2mo ago

I do the designing, then I write a comment explaining what happens, and the LLM then adds a few lines of code. Write another comment, etc.

I get very similar code to what I would normally write but much faster and with comments.

dimal•2mo ago

Don’t get them to make design decisions. They can’t do it.

Often, I use LLMs to write the V1 of whatever module I’m working on. I try to get it to do the simplest thing that works and that’s it. Then I refactor it to be good. This is how I worked before LLMs already: do the simplest thing that works, even if it’s sloppy and dumb, then refactor. The LLM just lets me skip that first step (sometimes). Over time, I’m building up a file of coding standards for them to follow, so their V1 doesn’t require as much refactoring, but they never get it “right”.

Sometimes they’ll go off into lalaland with stuff that’s so overcomplicated that I ignore it. The key is noticing when it’s going down some dumb rabbit hole and bailing out quickly. They never turn back. They’ll always come up with another dumb solution to fix the problem they never should have created in the first place.

TheNewsIsHere•2mo ago

My experience tracks your experience. It seems as if there are a few different camps when it comes to LLMs, and that’s partly based on one’s job functions and/or context that available LLMs simply don’t handle.

I cannot, for example, rely on any available LLM to do most of my job, because most of my job is dependent on both technical and business specifics. The inputs to those contexts are things LLMs wouldn’t have consumed anywhere else. For example specific facts about a client’s technology environment. Or specific facts about my business and its needs. An LLM can’t tell me what I should charge for my company’s services.

It might be able to help someone figure out how to do that when starting out based on what it’s consumed from Internet sources. That doesn’t really help me though. I already know how to do the math. A spreadsheet or an analytical accounting package with my actual numbers is going to be faster and a better use of my time and money.

There are other areas where LLMs just aren’t “there yet” in general terms because of industry or technology specifics that they’re not trained on, or that require some actual cognition and nuance an LLM trained on random Internet sources isn’t going to have.

Heck, some vendors lock their product documentation behind logins you can only get if you’re a customer. If you’re trying to accomplish something with those kinds of products or services then generally available LLMs aren’t going to provide any kind of defensible guidance.

The widely available LLMs are better suited to things that can easily be checked in the public square, or to helping an expert who can spot confabulations/hallucinations summarize huge amounts of information. Or if they’re trained on specific, well-vetted data sets for a particular use case.

People seem to forget or not understand that LLMs really do not think at all. They have no cognition and don’t handle nuance.

chneu•2mo ago

Some people just don't want to use AI and there are very legitimate reasons for that.

Why are you so willing to teach a program how to do your job? Why are you so willing to give your information to an LLM that doesn't care about your privacy?

zamadatix•2mo ago

I agree there can be very legitimate reasons for personally not wanting to use AI. At the same time, I'm not sure I find either of those questions to be related to particularly convincing reasons.

Teaching a program how to do your job has been part of the hacker mindset for many decades now, I don't think there is anything new to be said as to why. Anyone here reading this on the internet has long since decided they are fine preferring technical automations over preserving traditional ways of completing work.

LLMs don't inherently imply anything about privacy handling, the service you select does (if you aren't just opting to self host in the first place). On the hosted service side there's anything from "free and sucks up everything" to "business data governance contracts about what data can be used how".

daveguy•2mo ago

> Anyone here reading this on the internet has long since decided they are fine preferring technical automations over preserving traditional ways of completing work.

Well, that's a huge unsubstantiated leap. Also, it's not about "preserving traditional ways of completing work." It's just about recognizing that humans are much better at the vast majority of real world work.

zamadatix•2mo ago

> Well, that's a huge unsubstantiated leap.

I suppose that might depend on how you read "preferring". As in "is what one would ideally like" then sure, it's a bit orthogonal. What I mean is "is what one decides to use": we are willing to try and use technical automations over traditional means by nature of being here, even if a face to face conversation would be higher quality or an additional mailman would be employed.

> Also, it's not about "preserving traditional ways of completing work." It's just about recognizing that humans are much better at the vast majority of real world work.

While an interesting topic I'm not sure this really relates to why people are willing to teach a program how to do their job. It would be more "why people don't bother to", which is a bit of the opposite assumption (that we should if it were worth it).

The most interesting thing about recognizing humans are much better at the vast majority of real world work is it doesn't define where the boundary currently sits or how far it's moving. I suspect people will continue to be the best option for the majority of work for a very long time to come, because it is in our nature to stop considering automated things work. "Work" ends up being "what we're employed to do" rather than "things that happen". Things like lights, electricity, HVAC, dishwasher, washer/dryer, water delivery & waste removal, instances of music or entertainment performances, and so on used to require large amounts of human work but now that the majority of work in those areas is automated we call them "expenses" and "work" is having to load/unload the washer instead of clean the clothes and so on.

So, by one measure, I'd disagree wholeheartedly. Machine automation is responsible for more quality production output than humans, if for nothing else because of the sheer volume of output and use rather than being better at a randomly chosen task. On another measure I'd agree wholeheartedly - the things we define as being better at tend to be the things it's worth us doing which become the things we still call "work". Anything which truly has the majority done better (on average) by machines becomes an expense.

apercu•2mo ago

I use LLMs often - a few times a week. Every time I gain confidence in a model I get burned. Sometimes verifying takes longer than doing the task myself, so “AI” gets a narrower and narrower scope in my workflow as time goes by.

mhh__•2mo ago

A lot of people just don't have the dexterity. Doesn't mean they're stupid necessarily (although the two do rhyme)

nottorp•2mo ago

This comment is really sad:

https://terrytao.wordpress.com/2025/05/01/a-proof-of-concept...
