Fast(er) regular expression engines in Ruby

https://serpapi.com/blog/faster-regular-expression-engines-in-ruby/
60•davidsojevic•1y ago

Comments

yxhuvud•1y ago
Eww, pretending to support utf8 matchers while not supporting them at all was not pretty to see.
gitroom•1y ago
Honestly that part bugs me, fake support is worse than no support imo
kayodelycaon•1y ago
> Another nuance was found in ruby, which cannot scan the haystack with invalid UTF-8 byte sequences.

This is extremely basic Ruby: UTF-8 encoded strings must be valid UTF-8. This is not unique to Ruby. If I recall correctly, Python 3 does the same thing.

    2.7.1 :001 > haystack = "\xfc\xa1\xa1\xa1\xa1\xa1abc"
    2.7.1 :003 > haystack.force_encoding "ASCII-8BIT"
    => "\xFC\xA1\xA1\xA1\xA1\xA1abc" 
    2.7.1 :004 > haystack.scan(/.+/)
    => ["\xFC\xA1\xA1\xA1\xA1\xA1abc"]
This person is a senior engineer on their Team page. All they had to do was google "ArgumentError: invalid byte sequence in UTF-8". Or ask a coworker... the company has Ruby on Rails applications. headdesk
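
For anyone following along, here is a minimal sketch of the failure and two common ways around it, using the same bytes as the session above. The variable names are made up for illustration, and the "?" replacement character is an arbitrary choice. The UTF-8 tag is forced explicitly so the result does not depend on how the script is run.

```ruby
# Same bytes as the haystack in the session above: six invalid bytes, then "abc".
# force_encoding only changes the encoding tag; it does not validate anything.
haystack = "\xfc\xa1\xa1\xa1\xa1\xa1abc".dup.force_encoding(Encoding::UTF_8)

# The string is tagged UTF-8 but its bytes are not valid UTF-8.
valid = haystack.valid_encoding?

# Matching a regex against the broken string raises instead of scanning.
error_class = nil
begin
  haystack.scan(/.+/)
rescue ArgumentError => e
  error_class = e.class
end

# Option 1: search a binary (ASCII-8BIT) copy. String#b leaves the original untouched.
binary_matches = haystack.b.scan(/.+/)

# Option 2: replace the invalid bytes first, then search as UTF-8.
scrubbed = haystack.scrub("?")
scrubbed_matches = scrubbed.scan(/.+/)
```

The binary copy keeps every byte but matches bytes rather than characters, while `scrub` keeps character semantics and loses the invalid bytes. Which one is appropriate depends on what the search is for.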
burntsushi•1y ago
The nuance is specifically relevant here because neither of the other two regex engines benchmarked has this requirement. It's doubly relevant because that means running a regex search doesn't require a UTF-8 validation step, and is therefore likely beneficial from a perf perspective, depending on the workload.
kayodelycaon•1y ago
That’s a good point. I hadn’t considered it because I’ve hit the validation error long before getting to search. It is possible to avoid string operations with careful coding prior to the search.

Edit: After a little testing, the strings can be read from and written to files without triggering validation. Presumably this applies to sockets as well.
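
A rough sketch of that kind of test, assuming a temp file stands in for the real input. The names are hypothetical and the read is pinned to UTF-8 so the locale does not affect the result.

```ruby
require "tempfile"

# Two invalid bytes followed by "abc", tagged as UTF-8 without validation.
invalid = "\xfc\xa1abc".dup.force_encoding(Encoding::UTF_8)

read_back = nil
Tempfile.create("haystack") do |file|
  file.write(invalid) # the bytes go out as-is, nothing is raised here
  file.flush
  # Reading tags the data as UTF-8 but does not check that it is valid.
  read_back = File.read(file.path, encoding: "UTF-8")
end

same_bytes = read_back.bytes == invalid.bytes
still_invalid = !read_back.valid_encoding?
```

Both the write and the read go through without an error, and the string that comes back is still invalid, so any failure shows up later, at the first operation that needs valid characters.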

DmitryOlshansky•1y ago
I wonder how std.regex of dlang would fare in such a test. Sadly, due to a tiny bit of D’s GC use, it’s hard to provide as a library for other languages. If there is interest I might take it through the tests.
