Show HN: Regolith – Regex library that prevents ReDoS CVEs in TypeScript

https://github.com/JakeRoggenbuck/regolith
14•roggenbuck•2h ago
I wanted a safer alternative to RegExp for TypeScript that uses a linear-time engine, so I built Regolith.

Why: Many CVEs happen because TypeScript libraries are vulnerable to Regular Expression Denial of Service attacks. I learned about this problem while doing undergraduate research and found that languages like Rust have built-in protection but languages like JavaScript, TypeScript, and Python do not. This library attempts to mitigate these vulnerabilities for TypeScript and JavaScript.

How: Regolith uses Rust's Regex library under the hood to prevent ReDoS attacks. The Rust Regex library implements a linear-time Regex engine that guarantees linear complexity for execution. A ReDoS attack occurs when a malicious input is provided that causes a normal Regex engine to check for a matching string in too many overlapping configurations. This causes the engine to take an extremely long time to compute the Regex, which could cause latency or downtime for a service. By designing the engine to take at most a linear amount of time, we can prevent these attacks at the library level and have software inherit these safety properties.
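To make the failure mode concrete, here is a minimal sketch that uses only the built-in RegExp, not Regolith. The pattern `/^(a+)+$/` and the helper name `timeMatch` are illustrative assumptions, not taken from the project. In a backtracking engine, the nested quantifier can split a run of "a" characters in exponentially many ways, and a trailing "!" forces the engine to try them all before it gives up.

```typescript
// Illustrative only: timings vary by engine and machine.
// `timeMatch` is a hypothetical helper, not part of any library.
function timeMatch(pattern: RegExp, input: string): { matched: boolean; ms: number } {
  const start = Date.now();
  const matched = pattern.test(input);
  return { matched, ms: Date.now() - start };
}

// Nested quantifiers: on a failed match, a backtracking engine explores
// every way of splitting the run of "a" characters between the two "+".
const nested = /^(a+)+$/;

for (const n of [16, 18, 20, 22]) {
  const input = "a".repeat(n) + "!"; // the trailing "!" forces the match to fail
  const { matched, ms } = timeMatch(nested, input);
  console.log(`n=${n} matched=${matched} time=${ms}ms`);
}
```

The input sizes are kept small on purpose so the loop finishes quickly; a linear-time engine would not show this growth as the run gets longer.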

I'm really fascinated by making programming languages safer and I would love to hear any feedback on how to improve this project. I'll try to answer all questions posted in the comments.

Thanks! - Jake Roggenbuck

Comments

spankalee•2h ago
It's very, very weird to speak of TypeScript and JavaScript as two separate languages here.

There is no TypeScript RegExp, there is only the JavaScript RegExp as implemented in various VMs. There is no TypeScript VM, only JavaScript VMs. And there are no TypeScript CVEs unless it's against the TypeScript compiler, language server, etc.

serial_dev•1h ago
I was also confused at first; I thought it was against the TypeScript compiler, too.
maxloh•26m ago
Deno and Node use V8 under the hood, so the code should essentially run on the same VM regardless.
xyzzy123•2h ago
It's great to have a safe option - and it would have been great if the default had been safe.

I think many people are annoyed with ReDoS as a bug class. It seems like mostly noise in the CVE trackers, library churn and badge collecting for "researchers". It'd be less of a problem if people stuck to filing CVEs against libraries that might remotely see untrusted input rather than scrambling to collect pointless "scalps" from every tool under the sun that accepts a configuration regex - build tools, very commonly :(

Perhaps you can stop this madness... :)

bawolff•1h ago
Even in cases where malicious input could be hit, this bug class is stupid on the client side where the attacker can only attack themselves.
xyzzy123•1h ago
Stored... ReDoS, reflected... ReDoS(??)... [it pained me to type those] (╯°□°)╯︵ ┻━┻
roggenbuck•1h ago
> and it would have been great if the default had been safe.

I totally agree here. Safety can and should be from the language itself.

semiquaver•2h ago

  > Regolith attempts to be a drop-in replacement for RegExp and requires minimal (to no) changes to be used instead
vs

  > Since Regolith uses Rust bindings to implement the Rust Regex library to achieve linear time worst case, this means that backreferences and look-around aren't available in Regolith either.
Obviously it cannot be a drop-in replacement if the regex dialect differs. That it has a compatible API is not the only relevant factor. I’d recommend removing the top part from the readme.

Another thought: since backreferences and lookaround are the features in JS regexes which _cause_ ReDOS, why not just wrap vanilla JS regex, rejecting patterns including them? Wouldn’t that achieve the same result in a simpler way?

bawolff•2h ago
> Another thought: since backreferences and lookaround are the features in JS regexes which _cause_ ReDOS,

This is incorrect. Other features can cause ReDOS.

The other problematic features have linear time algorithms that could be used, but generally are not used (I assume for better average case performance).

roggenbuck•1h ago
Yea, I can expand the description to include other features that may cause issues. Here is an example of how counting can cause latency too: https://www.usenix.org/system/files/sec22fall_turonova.pdf
thomasmg•52m ago
Right. An example regex that can be slow is CSV parsing [1]:

.*,.*,.*,.*,.* etc.

I believe a timeout is a better (simpler) solution than trying to prevent 'bad' patterns. I use this approach in my own (tiny, ~400 lines) regex library [2]. I use a limit of at most ~100 operations per input byte, so there is no need to measure wall-clock time, which can be inaccurate.

[1]: https://stackoverflow.com/questions/2667015/is-regex-too-slo...

[2]: https://github.com/thomasmueller/bau-lang/blob/main/src/test...
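As a rough illustration of the operation-budget idea, here is a minimal sketch of a toy backtracking matcher. It is not the code from the library linked above. It supports only literals, `.`, and `x*`, always matches the whole input, and gives up once it has spent more than a fixed number of steps per input character. The names `matchWithBudget` and `opsPerByte` are assumptions made for this example.

```typescript
// Toy backtracking matcher with an operation budget (hypothetical names).
// Supported syntax: literal characters, "." (any character), and "x*".
// The pattern must match the entire text.
function matchWithBudget(pattern: string, text: string, opsPerByte = 100): boolean {
  // The budget scales with input length, so no wall clock is involved.
  let budget = opsPerByte * Math.max(text.length, 1);

  const step = (): void => {
    if (--budget < 0) throw new Error("regex operation budget exceeded");
  };

  const matchHere = (p: number, t: number): boolean => {
    step();
    if (p === pattern.length) return t === text.length;
    const c = pattern[p];
    if (pattern[p + 1] === "*") {
      // Try the rest of the pattern after 0, 1, 2, ... repetitions of c.
      let i = t;
      while (true) {
        if (matchHere(p + 2, i)) return true;
        if (i < text.length && (c === "." || c === text[i])) {
          i++;
          step();
        } else {
          return false;
        }
      }
    }
    if (t < text.length && (c === "." || c === text[t])) {
      return matchHere(p + 1, t + 1);
    }
    return false;
  };

  return matchHere(0, 0);
}

// A CSV-like pattern that can never match because the text has no ";".
try {
  matchWithBudget(".*,.*,.*,.*,.*;", ",".repeat(60));
} catch (err) {
  console.log((err as Error).message);
}
```

Throwing once the budget runs out turns a potentially very slow match into a bounded, reportable failure, and the caller decides how to treat it.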

roggenbuck•2h ago
Thanks for the feedback! Yea, you're totally right. I'll update the docs to reflect this.

> why not just wrap vanilla JS regex, rejecting patterns including them?

Yea! I was thinking about this too actually. And this would solve the problem of being server side only. I'm thinking about making a new version to do just this.

For a pattern-rejecting wrapper, how would you want it to communicate that an unsafe pattern has been created?
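One possible answer, as a minimal sketch: throw a descriptive error at construction time, so the problem surfaces where the pattern is defined and not where it is first used. The names `UnsafePatternError`, `findBannedFeature`, and `safeRegExp` are hypothetical and not part of Regolith. The scanner below only detects backreferences and lookaround. As pointed out elsewhere in this thread, other constructs can also be slow, so this check alone does not guarantee linear-time matching.

```typescript
// Hypothetical names throughout; a sketch, not an existing API.
class UnsafePatternError extends Error {
  feature: string;
  index: number;
  constructor(feature: string, index: number) {
    super(`Pattern uses ${feature} at index ${index}`);
    this.name = "UnsafePatternError";
    this.feature = feature;
    this.index = index;
  }
}

// Scans the pattern source, skipping escaped characters and character classes.
function findBannedFeature(source: string): { feature: string; index: number } | null {
  let inClass = false;
  for (let i = 0; i < source.length; i++) {
    const ch = source[i];
    if (ch === "\\") {
      const next = source[i + 1] ?? "";
      if (!inClass && next >= "1" && next <= "9") {
        return { feature: "backreference", index: i };
      }
      if (!inClass && next === "k" && source[i + 2] === "<") {
        return { feature: "named backreference", index: i };
      }
      i++; // skip the escaped character
      continue;
    }
    if (inClass) {
      if (ch === "]") inClass = false;
      continue;
    }
    if (ch === "[") {
      inClass = true;
      continue;
    }
    if (ch === "(" && source[i + 1] === "?") {
      const rest = source.slice(i + 2, i + 4);
      // (?=...), (?!...), (?<=...), (?<!...); named groups like (?<name>...) pass.
      if (rest.startsWith("=") || rest.startsWith("!") || rest === "<=" || rest === "<!") {
        return { feature: "lookaround", index: i };
      }
    }
  }
  return null;
}

function safeRegExp(source: string, flags = ""): RegExp {
  const hit = findBannedFeature(source);
  if (hit) throw new UnsafePatternError(hit.feature, hit.index);
  return new RegExp(source, flags);
}

try {
  safeRegExp("(a)\\1");
} catch (err) {
  console.log((err as Error).message);
}
```

Failing early keeps the return type a plain RegExp, so existing call sites do not change. A variant that returns a result object instead of throwing would suit callers that accept patterns from users and want to report the offending feature.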

btown•1h ago
As someone who's been saved by look-aheads in many a situation, I'm quite partial to the approach detailed in [0]: use a regex library that checks for a timeout in its main matching loop.

This lets you have full backwards compatibility in languages like Python and JS/TS that support backreferences/lookarounds, without running any risk of DOS (including by your own handrolled regexes!)

And on modern processors, a suitably implemented check for a timeout would largely be branch-predicted to be a no-op, and would in theory result in no measurable change in performance. Unfortunately, the most optimized and battle-tested implementations seem to have either taken the linear-time NFA approaches, or have technical debt making timeout checks impractical (see comment in [0] on the Python core team's resistance to this) - so we're in a situation where we don't have the best of both worlds. Efforts like [1] are promising, especially if augmented with timeout logic, but early-stage.

[0] https://stackoverflow.com/a/74992735

[1] https://github.com/fancy-regex/fancy-regex

bbor•5m ago
Totally agree -- those are two incredibly useful features of regex[1][2] that are often effectively irreplaceable. I could see this being a straightforward tradeoff for applications that know for sure they don't need complex regexes but still must accept patterns written by the client for some reason(?), but otherwise this seems like a hell of a way to go to replace a `timeout` wrapper.

This paragraph in particular seems very wholesome, but misguided in light of the tradeoff:

  Having a library or project that is immune to these vulnerabilities would save this effort for each project that adopted it, and would save the whole package ecosystem that effort if widely adopted.
Honestly, the biggest shock here for me is that Rust doesn't support these. Sure, Python has yet to integrate the actually-functional `regex`[3] into stdlib to replace the dreadfully under-specced `re`, but Rust is the new kid on the block! I guess people just aren't writing complex regexes anymore...[4]

RE:simpler wrapper, I personally don't see any reason it wouldn't work, and dropping a whole language seems like a big win if it does. I happened to have some scaffolding on hand for the cursed, dark art of metaregexes, so AFAICT, this pattern would work for a blanket ban: https://regexr.com/8gplg Ironically, I don't think there's a way to prevent false-positives on triple-backslashes without using lookarounds!

[1] https://www.regular-expressions.info/backref.html

[2] https://www.regular-expressions.info/lookaround.html

[3] https://github.com/mrabarnett/mrab-regex

[4] We need a regex renaissance IMO, though the feasibility of "just throw a small fine-tuned LLM at it" may delay/obviate that for users that can afford the compute! It's one of the OG AI concepts, back before intuition seemed possible.

dwoldrich•37m ago
Perhaps regex is just a bad little language for pattern matching.

I have a foggy recollection of compute times exploding for me on a large regex in .NET code. I used a feature I hadn't seen in JavaScript's RegExp that let me mark off already-matched sections of the regular expression so the engine would not backtrack into them.

Perhaps the answer isn't removing features for linear regex, but adding more features to make it more expressive and tunable?
