Fast(er) regular expression engines in Ruby

https://serpapi.com/blog/faster-regular-expression-engines-in-ruby/
60•davidsojevic•1y ago

Comments

yxhuvud•1y ago
Eww, pretending to support utf8 matchers while not supporting them at all was not pretty to see.
gitroom•1y ago
Honestly that part bugs me, fake support is worse than no support imo
kayodelycaon•1y ago
> Another nuance was found in ruby, which cannot scan the haystack with invalid UTF-8 byte sequences.

This is extremely basic Ruby: UTF-8 encoded strings must be valid UTF-8. This is not unique to Ruby. If I recall correctly, Python 3 does the same thing.

    2.7.1 :001 > haystack = "\xfc\xa1\xa1\xa1\xa1\xa1abc"
    2.7.1 :003 > haystack.force_encoding "ASCII-8BIT"
    => "\xFC\xA1\xA1\xA1\xA1\xA1abc" 
    2.7.1 :004 > haystack.scan(/.+/)
    => ["\xFC\xA1\xA1\xA1\xA1\xA1abc"]

This person is a senior engineer on their Team page. All they had to do was google "ArgumentError: invalid byte sequence in UTF-8". Or ask a coworker... the company has Ruby on Rails applications. headdesk
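
For anyone who wants to reproduce this outside irb, here is a minimal sketch of the failure and two possible workarounds. The variable names are mine, and `String#scrub` is an alternative I am adding for comparison, not something the article or the session above uses. Note that scrubbing changes the haystack, so match offsets will no longer line up with the original bytes.

```ruby
# A UTF-8 string literal that holds invalid bytes (same haystack as above).
haystack = "\xfc\xa1\xa1\xa1\xa1\xa1abc"

# Ruby tags the string as UTF-8 but does not validate it on creation.
valid_before = haystack.valid_encoding?

# 1. Matching a regex against the broken string raises ArgumentError.
error_message = nil
begin
  haystack.scan(/.+/)
rescue ArgumentError => e
  error_message = e.message
end

# 2. Reinterpret the same bytes as binary (ASCII-8BIT); no bytes change.
#    dup keeps the original string's encoding tag untouched.
binary = haystack.dup.force_encoding(Encoding::BINARY)
binary_matches = binary.scan(/.+/)

# 3. Assumed alternative: replace invalid bytes with U+FFFD, then scan.
scrubbed = haystack.scrub
scrubbed_matches = scrubbed.scan(/.+/)
```

Option 2 is what the irb session does; option 3 is only reasonable if losing the original bytes is acceptable.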
burntsushi•1y ago
The nuance is specifically relevant here because neither of the other two regex engines benchmarked has this requirement. It's doubly relevant because that means running a regex search doesn't require a UTF-8 validation step, which is likely beneficial from a perf perspective, depending on the workload.
kayodelycaon•1y ago
That’s a good point. I hadn’t considered it because I’ve hit the validation error long before getting to search. It is possible to avoid string operations with careful coding prior to the search.

Edit: After a little testing, the strings can be read from and written to files without triggering validation. Presumably this applies to sockets as well.
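
A rough sketch of that test, assuming a temp file stands in for whatever file is actually being read (the `"haystack"` prefix and variable names are placeholders of mine). I have not checked sockets, so that part stays a guess.

```ruby
require "tempfile"

# The same invalid UTF-8 haystack as earlier in the thread.
invalid = "\xfc\xa1\xa1\xa1\xa1\xa1abc"

# Write the bytes out and read them back tagged as UTF-8.
# Neither step raises, even though the bytes are not valid UTF-8.
round_tripped = Tempfile.create("haystack") do |file|
  file.write(invalid)
  file.flush
  File.read(file.path, encoding: "UTF-8")
end

same_bytes  = round_tripped.bytes == invalid.bytes
still_utf8  = round_tripped.encoding == Encoding::UTF_8
still_valid = round_tripped.valid_encoding?
```

So the invalid bytes survive the round trip untouched, and the error only shows up once something like a regex match actually has to decode the string.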

DmitryOlshansky•1y ago
I wonder how std.regex of dlang would fare in such a test. Sadly, due to a tiny bit of D’s GC use, it’s hard to provide as a library for other languages. If there is interest, I might take it through the tests.
