Fast(er) regular expression engines in Ruby

https://serpapi.com/blog/faster-regular-expression-engines-in-ruby/
60•davidsojevic•1y ago

Comments

yxhuvud•1y ago
Eww, pretending to support utf8 matchers while not supporting them at all was not pretty to see.
gitroom•1y ago
Honestly that part bugs me, fake support is worse than no support imo
kayodelycaon•1y ago
> Another nuance was found in ruby, which cannot scan the haystack with invalid UTF-8 byte sequences.

This is extremely basic Ruby: UTF-8 encoded strings must be valid UTF-8. This is not unique to Ruby. If I recall correctly, Python 3 does the same thing.

    2.7.1 :001 > haystack = "\xfc\xa1\xa1\xa1\xa1\xa1abc"
    2.7.1 :003 > haystack.force_encoding "ASCII-8BIT"
    => "\xFC\xA1\xA1\xA1\xA1\xA1abc" 
    2.7.1 :004 > haystack.scan(/.+/)
    => ["\xFC\xA1\xA1\xA1\xA1\xA1abc"]

This person is a senior engineer on their Team page. All they had to do was google "ArgumentError: invalid byte sequence in UTF-8". Or ask a coworker... the company has Ruby on Rails applications. headdesk
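
For anyone following along, here is a minimal sketch of the behaviour described above, assuming a UTF-8 source file and stock MRI. `String#b` and `String#scrub` are not used in the session above; they are shown only as possible alternatives to `force_encoding`, which changes the encoding tag on the receiver itself.

```ruby
# Same bytes as in the session above: an invalid UTF-8 prefix followed by "abc".
haystack = "\xfc\xa1\xa1\xa1\xa1\xa1abc"

# The literal is tagged UTF-8, but its bytes are not valid UTF-8.
tagged_utf8 = haystack.encoding == Encoding::UTF_8
valid       = haystack.valid_encoding?

# Scanning the UTF-8-tagged string with a regexp raises ArgumentError.
error_message = nil
begin
  haystack.scan(/.+/)
rescue ArgumentError => e
  error_message = e.message
end

# Alternative 1: String#b returns a copy tagged ASCII-8BIT (binary),
# so the regexp matches raw bytes and no validation error is raised.
binary_matches = haystack.b.scan(/.+/)

# Alternative 2: String#scrub returns a copy with invalid bytes replaced,
# which is valid UTF-8 and can be scanned normally.
scrubbed = haystack.scrub("?")
scrubbed_matches = scrubbed.scan(/abc/)
```

Which option fits depends on whether the search needs to see the original bytes (binary copy) or only the text that decodes cleanly (scrubbed copy).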
burntsushi•1y ago
The nuance is specifically relevant here because neither of the other two regex engines benchmarked has this requirement. It's doubly relevant because that means running a regex search doesn't require a UTF-8 validation step, and is therefore likely beneficial from a perf perspective, depending on the workload.
kayodelycaon•1y ago
That’s a good point. I hadn’t considered it because I’ve hit the validation error long before getting to search. It is possible to avoid string operations with careful coding prior to the search.

Edit: After a little testing, the strings can be read from and written to files without triggering validation. Presumably this applies to sockets as well.
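
A rough sketch of that file round trip, assuming default IO settings (no `Encoding.default_internal` set, so no transcoding on read or write). The temp file name and variable names are made up for illustration:

```ruby
require "tempfile"

# Invalid UTF-8 bytes followed by "abc", tagged as UTF-8.
invalid = "\xfc\xa1\xa1\xa1\xa1\xa1abc"

# Write the string out and read it back as UTF-8.
# Neither step validates the bytes; they are copied through as-is.
round_tripped = Tempfile.create("haystack") do |file|
  file.write(invalid)
  file.flush
  File.read(file.path, encoding: "UTF-8")
end

same_bytes  = round_tripped.bytes == invalid.bytes
still_valid = round_tripped.valid_encoding?
```

The string that comes back still carries the UTF-8 tag and the same invalid bytes, so the error only shows up later, at the first operation that needs to decode characters.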

DmitryOlshansky•1y ago
I wonder how D's std.regex would fare in such a test. Sadly, because it makes a tiny bit of use of D's GC, it's hard to provide as a library for other languages. If there is interest, I might run it through the tests.
