Type-constrained code generation with language models

https://arxiv.org/abs/2504.09246
82•tough•3h ago

Comments

jiggawatts•2h ago
This was an obvious next step. Most current products can only restrict the token prediction to valid JSON or a specific JSON schema at best. There's no reason that this should be the only grammar available for constrained output mode.

The real challenge will be to make this detect and switch languages automatically. For example, a snippet of code could include a LaTeX formula in a comment and SQL in a string literal. There are many more examples, such as regex inside a shell script, and so on.

The obvious next step after that is backtracking. It's possible to emit a token that is valid but then allows no further valid completions. In other words, the model can paint itself into a corner. To my knowledge, no current online LLM service uses any kind of backtracking; they run in append ("forwards") mode only.
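To make the "paints itself into a corner" point concrete, here is a toy sketch. It is not the paper's method and not how any hosted service works: the "model" is a hypothetical `propose` function that always prefers `(`, the grammar is balanced parentheses, and the corner comes from a length budget of four tokens. Forward-only decoding emits `((((`, where every step was a valid prefix but nothing can complete it; the backtracking version undoes choices until it finds a completion.

```python
from typing import Callable, List, Optional, Sequence

Tokens = List[str]


def is_valid_prefix(tokens: Tokens) -> bool:
    """A prefix is valid if it never closes more parentheses than it opened."""
    depth = 0
    for tok in tokens:
        depth += 1 if tok == "(" else -1
        if depth < 0:
            return False
    return True


def is_complete(tokens: Tokens) -> bool:
    """Complete means non-empty and fully balanced."""
    balance = sum(1 if tok == "(" else -1 for tok in tokens)
    return bool(tokens) and is_valid_prefix(tokens) and balance == 0


def propose(prefix: Tokens) -> Sequence[str]:
    """Hypothetical stand-in for a model: candidates in preference order."""
    return ["(", ")"]


def greedy_decode(propose: Callable, valid: Callable, done: Callable,
                  max_len: int) -> Optional[Tokens]:
    """Append-only constrained decoding: mask invalid tokens, never undo."""
    prefix: Tokens = []
    while not done(prefix) and len(prefix) < max_len:
        for tok in propose(prefix):
            if valid(prefix + [tok]):
                prefix.append(tok)
                break
        else:
            return None  # no valid token at all
    return prefix if done(prefix) else None


def backtracking_decode(propose: Callable, valid: Callable, done: Callable,
                        max_len: int) -> Optional[Tokens]:
    """Depth-first search: on a dead end, return and try the next candidate."""
    def search(prefix: Tokens) -> Optional[Tokens]:
        if done(prefix):
            return prefix
        if len(prefix) >= max_len:
            return None  # out of budget: this branch is a corner
        for tok in propose(prefix):
            candidate = prefix + [tok]
            if not valid(candidate):
                continue  # masked out, as in ordinary constrained decoding
            result = search(candidate)
            if result is not None:
                return result
        return None  # every candidate failed: backtrack further up

    return search([])


print(greedy_decode(propose, is_valid_prefix, is_complete, 4))  # None
print("".join(backtracking_decode(propose, is_valid_prefix, is_complete, 4)))  # (())
```

With a real model the search would be over sampled tokens and the cost of re-running the model after each undo is the obvious catch; the sketch ignores that entirely.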

helltone•2h ago
The backtracking idea is interesting; could diffusion maybe help? At some point it turns into SAT solving.
grafmax•1h ago
SAT solving, I guess, because types encode proofs?
foota•1h ago
I believe Microsoft introduced a framework that did this sort of backtracking that you're suggesting. I'm not sure how much traction it got.
tough•1h ago
SRLCG: Self-Rectified Large-Scale Code Generation with Multidimensional Chain-of-Thought and Dynamic Backtracking

https://arxiv.org/abs/2504.00532

IterGen: Iterative Semantic-aware Structured LLM Generation with Backtracking

https://arxiv.org/abs/2410.07295

ROCODE: Integrating Backtracking Mechanism and Program Analysis in Large Language Models for Code Generation

https://arxiv.org/abs/2411.07112v1

ArcaneMoose•1h ago
I think TypeScript is uniquely positioned to be the optimal language for LLMs. Tons of training data (benefiting from all the JS examples as well) plus the structure of types for LLMs to follow and tools to enforce.
yoyohello13•1h ago
God help us…
marviel•1h ago
what do you dislike about it?
OutOfHere•1h ago
There are languages that constrain types a lot more tightly than TypeScript, e.g. Kotlin, Rust, and Haskell. The more constrained the types, the more correct the program could be.
mindwok•1h ago
Yep, and Rust famously goes beyond this by modelling memory ownership at compile time.

In fact, the more behaviour we can model at compile time the better when it comes to LLMs. There are some cool ideas here, like transpiling Rust into languages for formal verification. See https://github.com/formal-land/coq-of-rust as an example.

Formal verification was one of those things that was previously so annoying to do that it rarely made it past academic use cases or extremely important libraries, but I think LLMs take the tedium out of it. Perhaps formal verification will have a "test driven development" type of moment in the sun thanks to this.

koakuma-chan•1h ago
Can LLMs properly code in Rust yet? There is way more TypeScript code out there compared to Rust, and I doubt structured output can alleviate this.
steveklabnik•53m ago
They can, yes.
pram•1h ago
LLMs work well with any static analysis tool. I frequently instruct Claude to use stuff like “go vet” and “deadcode” when it goes on a tear and writes a bunch of broken trash and declares mission accomplished.
koakuma-chan•1h ago
> LLMs work well with any static analysis tool.

tsc error messages are so bad that every time my LLM sees one of those "SomeType is not assignable to SomeLongAssTypeDontEvenTryToUnderstandWhatsGoingOnHere<<<<>>>>>>>>>>>>>>>>>>>>" it just gives up and casts to any. goes for python too.

floydnoel•41m ago
ha, that's always been my biggest gripe with ts
homebrewer•1h ago
Hejlsberg mentioned the ability to quickly provide accurate type information to LLMs as one of the reasons for rewriting tsc into Go:

https://youtu.be/10qowKUW82U?t=3186

tough•1h ago
But isn't TypeScript already a typed language to begin with?
habitue•1h ago
This is about the speed with which the compiler can tell an LLM whether a particular thing checks or doesn't check. TypeScript is much slower than Go.
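The loop where that speed matters can be sketched roughly as follows. This is a minimal illustration under loud assumptions: Python's built-in `compile()` stands in for a real type checker such as tsc, and `fake_model` is a hypothetical generator that replays canned answers rather than calling an LLM. Every round pays for one checker run, which is why checker latency dominates.

```python
from typing import Callable, Optional, Tuple


def check(source: str) -> Optional[str]:
    """Return None if the source compiles, otherwise a short error message.

    compile() only catches syntax errors; it is a stand-in for a type checker.
    """
    try:
        compile(source, "<candidate>", "exec")
        return None
    except SyntaxError as exc:
        return f"line {exc.lineno}: {exc.msg}"


def generate_with_feedback(
    generate: Callable[[Optional[str]], str],
    max_rounds: int = 3,
) -> Tuple[Optional[str], int]:
    """Ask for code, check it, and feed the error back until it passes."""
    feedback: Optional[str] = None
    for round_no in range(1, max_rounds + 1):
        candidate = generate(feedback)  # feedback is None on the first round
        feedback = check(candidate)
        if feedback is None:
            return candidate, round_no
    return None, max_rounds


def fake_model() -> Callable[[Optional[str]], str]:
    """Hypothetical model: a broken first attempt, then a fixed one."""
    attempts = iter([
        "def add(a, b) return a + b",
        "def add(a, b):\n    return a + b\n",
    ])
    return lambda feedback: next(attempts)


code, rounds = generate_with_feedback(fake_model())
print(rounds)  # 2
```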
tough•1h ago
Okay, so basically faster compiling means a tighter feedback loop for the LLM to know whether the code compiles or not? Interesting.

is go faster than rust?

yoyohello13•1h ago
Go’s compiler is WAY faster than Rust’s. As far as speed of the actual program, Rust will generally be faster.
notnullorvoid•1h ago
Go or Rust compiler speeds won't have any effect here. The program in this context is the TypeScript compiler.
koakuma-chan•1h ago
cargo check is WAY faster than go build
Thaxll•1h ago
Working with both, I can say that this is a big no: go mod is as fast if not faster, and Go deps are usually much faster because Go does not import as many dependencies as Rust.
koakuma-chan•47m ago
In Rust you only need to compile your dependencies once. After that it's just your app because dependencies don't change.
notnullorvoid•1h ago
> is go faster than rust?

Depends on how you write the Go or Rust code. The most optimal Rust rewrite of the TypeScript compiler would very likely be faster than the most optimal version in Go. However, they didn't want to do a rewrite; they wanted to port the existing compiler codebase written in TS. Go, like TS (ultimately the JS runtime), also has GC, which makes a 1-to-1 port much easier.

PartiallyTyped•51m ago
No. Ignore the other comments.

The reason for this decision is that they wanted a near 1:1 port of the TypeScript code to Go, keeping design and structure almost identical.

You can’t do that in Rust as easily because of all the cyclical references and indirection involved.

A Rust port would be a rewrite. This is merely a migration.

raincole•35m ago
> is go faster than rust

No.

They rewrote in Go because Go is similar enough to TypeScript, while being faster than TypeScript.

Source: https://github.com/microsoft/typescript-go/discussions/411

notnullorvoid•1h ago
The general idea seems very promising, I had been hoping someone would do something like this since seeing JSON schema structured outputs for LLMs.

Need to dig in a bit more on the implementation, but I was surprised that the paper didn't mention hooking into an existing language service/server. There's more than types that an LLM could leverage from existing language tooling. Auto imports are a good example: they let a human developer keep a linear writing flow, something an LLM needs even more.

tough•1h ago
The code can be found here: https://github.com/eth-sri/type-constrained-code-generation
compacct27•1h ago
Honestly it's already working great in Cursor. Even adapting one type structure to another is quickly handled.
slt2021•1h ago
We really need LLMs trained on ASTs instead of tokens. Is there any research on this?
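For anyone unsure what the distinction buys you, here is a small illustration using Python's standard `tokenize` and `ast` modules. Note the assumption: these are lexer tokens, not the subword tokens an LLM actually sees, so this only shows the flat-versus-tree difference, not a training setup.

```python
import ast
import io
import tokenize

source = "total = price * (1 + tax)\n"

# Flat view: a sequence of lexical tokens, parentheses included.
tokens = [
    tok.string
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.string.strip()  # drop NEWLINE and ENDMARKER
]

# Tree view: grouping is encoded by nesting, so the parentheses disappear.
tree = ast.parse(source)
node_types = [type(node).__name__ for node in ast.walk(tree)]

print(tokens)
print(ast.dump(tree.body[0], indent=2))
```

The token list still contains `(` and `)`, while the tree has two nested `BinOp` nodes and no parentheses at all; a model working on the tree cannot produce an unbalanced expression by construction.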
tough•1h ago
ASTrust: Towards More Trustworthy and Interpretable LLMs for Code through Syntax-Grounded Explanations

https://arxiv.org/abs/2407.08983

AST-T5: Structure-Aware Pretraining for Code Generation and Understanding

https://arxiv.org/abs/2401.03003

CodeGRAG: Bridging the Gap between Natural Language and Programming Language via Graphical Retrieval Augmented Generation

https://arxiv.org/abs/2405.02355

koakuma-chan•1h ago
The vibe code society would benefit way more if libraries hosted their docs in a way that's easy to copy and paste into an LLM.
tough•1h ago
many docs now include llms.txt https://llmstxt.org/
koakuma-chan•59m ago
I saw that but it doesn't work for me. I use Gemini 2.5 Pro Preview right now, and it cannot fetch content from links. What I am looking for is a large text file with public API class, function, etc. signatures, plain text documentation and code examples.
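For Python libraries, a file like that can be approximated with the standard `inspect` module. This is a rough sketch under stated assumptions: `api_summary` is a hypothetical helper, the standard-library `json` module stands in for whatever library you actually care about, and it only covers top-level functions and classes (no methods, no code examples). It does nothing for libraries in other languages.

```python
import inspect
import json  # stand-in for the library you want to document


def api_summary(module) -> str:
    """Build a plain-text list of public signatures with one-line docs."""
    lines = [f"# {module.__name__}"]
    for name, obj in sorted(vars(module).items()):
        if name.startswith("_"):
            continue  # skip private names
        if not (inspect.isfunction(obj) or inspect.isclass(obj)):
            continue  # skip submodules, constants, and so on
        try:
            signature = str(inspect.signature(obj))
        except (TypeError, ValueError):
            signature = "(...)"  # some builtins expose no signature
        kind = "class" if inspect.isclass(obj) else "def"
        lines.append(f"{kind} {name}{signature}")
        doc = inspect.getdoc(obj)
        if doc:
            lines.append(f"    {doc.splitlines()[0]}")  # first doc line only
    return "\n".join(lines)


text = api_summary(json)
print(text)
```

The resulting string can be written to a single .txt file and pasted into a prompt.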
tough•58m ago
https://ai-sdk.dev/llms.txt
koakuma-chan•55m ago
Depends on the library, I guess. I spent ~12 hours today vibe coding with LiveKit, and their /llms.txt is https://docs.livekit.io/llms.txt
muglug•51m ago
Really cool results!

That this research comes out of universities, and not large AI labs, makes me think those labs believe that larger models are still the way to go.

aibrother•9m ago
+1 this seems like healthy development
bmc7505•48m ago
The correct way to do this is with finite model theory but we're not there yet.
