Fast(er) regular expression engines in Ruby

https://serpapi.com/blog/faster-regular-expression-engines-in-ruby/
60•davidsojevic•1y ago

Comments

yxhuvud•1y ago
Eww, pretending to support utf8 matchers while not supporting them at all was not pretty to see.
gitroom•1y ago
Honestly that part bugs me, fake support is worse than no support imo
kayodelycaon•1y ago
> Another nuance was found in ruby, which cannot scan the haystack with invalid UTF-8 byte sequences.

This is extremely basic ruby: UTF-8 encoded strings must be valid UTF-8. This is not unique to ruby. If I recall correctly, python 3 does the same thing.

    2.7.1 :001 > haystack = "\xfc\xa1\xa1\xa1\xa1\xa1abc"
    2.7.1 :003 > haystack.force_encoding "ASCII-8BIT"
    => "\xFC\xA1\xA1\xA1\xA1\xA1abc" 
    2.7.1 :004 > haystack.scan(/.+/)
    => ["\xFC\xA1\xA1\xA1\xA1\xA1abc"]
This person is a senior engineer on their Team page. All they had to do was google "ArgumentError: invalid byte sequence in UTF-8". Or ask a coworker... the company has Ruby on Rails applications. headdesk
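
For reference, a minimal sketch of the behavior described above, using the same haystack bytes. The helper names (`broken_haystack`, `scan_error_message`, `scan_bytes`) are hypothetical and only illustrate two common workarounds: scanning a binary (ASCII-8BIT) copy, or replacing the invalid bytes with `String#scrub`. It assumes an ASCII-only pattern.

```ruby
# Hypothetical helper: the haystack from the comment, explicitly tagged as
# UTF-8 even though its first six bytes are not valid UTF-8.
def broken_haystack
  "\xfc\xa1\xa1\xa1\xa1\xa1abc".b.force_encoding(Encoding::UTF_8)
end

# Returns the error message raised by String#scan, or nil if it succeeds.
def scan_error_message(haystack, pattern)
  haystack.scan(pattern)
  nil
rescue ArgumentError => e
  e.message
end

# Workaround 1: scan a binary copy, so no UTF-8 validation is involved.
# Assumes the pattern is ASCII-only; a UTF-8-only pattern would not be
# compatible with a binary string containing high bytes.
def scan_bytes(haystack, pattern)
  haystack.b.scan(pattern)
end

haystack = broken_haystack
puts haystack.valid_encoding?                 # false
puts scan_error_message(haystack, /.+/)       # the ArgumentError message

matches = scan_bytes(haystack, /.+/)
puts matches.first.bytesize                   # 9 (all bytes kept)

# Workaround 2: replace invalid bytes, then scan the now-valid string.
scrubbed = haystack.scrub("?")
puts scrubbed.valid_encoding?                 # true
puts scrubbed.scan(/abc/).inspect
```

The binary copy keeps every byte but matches byte by byte rather than character by character; `scrub` keeps character semantics but changes the data.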
burntsushi•1y ago
The nuance is specifically relevant here because neither of the other two regex engines benchmarked has this requirement. It's doubly relevant because that means running a regex search doesn't require a UTF-8 validation step, which is likely beneficial from a perf perspective, depending on the workload.
kayodelycaon•1y ago
That’s a good point. I hadn’t considered it because I’ve hit the validation error long before getting to search. It is possible to avoid string operations with careful coding prior to the search.

Edit: After a little testing, the strings can be read from and written to files without triggering validation. Presumably this applies to sockets as well.
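
A rough sketch of that file round trip, assuming a temporary file and a hypothetical helper name (`round_trip_through_file`). It suggests that reading with a UTF-8 encoding tag only labels the bytes, and that the error appears later, when a regex operation runs.

```ruby
require "tempfile"

# Hypothetical helper: write raw bytes to a temp file, then read them back
# as a string tagged UTF-8. Neither step validates the byte sequence.
def round_trip_through_file(bytes)
  Tempfile.create("haystack") do |file|
    file.binmode
    file.write(bytes)
    file.flush
    File.read(file.path, encoding: "UTF-8")
  end
end

raw = "\xfc\xa1\xa1\xa1\xa1\xa1abc".b
text = round_trip_through_file(raw)

puts text.encoding            # UTF-8
puts text.valid_encoding?     # false
puts text.bytes == raw.bytes  # true

begin
  text.scan(/.+/)
rescue ArgumentError => e
  puts "scan failed: #{e.message}"
end
```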

DmitryOlshansky•1y ago
I wonder how std.regex of dlang would fare in such test. Sadly due to a tiny bit of D’s GC use it’s hard to provide as a library for other languages. If there is an interest I might take it through the tests.
