Fast(er) regular expression engines in Ruby

https://serpapi.com/blog/faster-regular-expression-engines-in-ruby/
60•davidsojevic•1y ago

Comments

yxhuvud•1y ago
Eww, pretending to support utf8 matchers while not supporting them at all was not pretty to see.
gitroom•1y ago
Honestly that part bugs me, fake support is worse than no support imo
kayodelycaon•1y ago
> Another nuance was found in ruby, which cannot scan the haystack with invalid UTF-8 byte sequences.

This is extremely basic Ruby: UTF-8 encoded strings must be valid UTF-8. This is not unique to Ruby. If I recall correctly, Python 3 does the same thing.

    2.7.1 :001 > haystack = "\xfc\xa1\xa1\xa1\xa1\xa1abc"
    2.7.1 :003 > haystack.force_encoding "ASCII-8BIT"
    => "\xFC\xA1\xA1\xA1\xA1\xA1abc" 
    2.7.1 :004 > haystack.scan(/.+/)
    => ["\xFC\xA1\xA1\xA1\xA1\xA1abc"]

This person is a senior engineer on their Team page. All they had to do was google "ArgumentError: invalid byte sequence in UTF-8". Or ask a coworker... the company has Ruby on Rails applications. headdesk
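For anyone following along, here is a minimal sketch (not from the article or the transcript above) of the failure and two common ways around it. The variable names are made up for illustration, and it assumes the script's source encoding is UTF-8, which is Ruby's default.

```ruby
# Assumption: source encoding is UTF-8 (Ruby's default), so this literal is
# tagged UTF-8 even though its first six bytes are not valid UTF-8.
haystack = "\xfc\xa1\xa1\xa1\xa1\xa1abc"

tagged_utf8 = haystack.encoding == Encoding::UTF_8
valid       = haystack.valid_encoding?

# Regexp matching on a string with a broken encoding raises ArgumentError.
scan_error = begin
  haystack.scan(/.+/)
  nil
rescue ArgumentError => e
  e
end

# Option 1: treat the data as raw bytes. String#b returns a copy tagged
# ASCII-8BIT (binary), which has no notion of an invalid sequence.
binary_matches = haystack.b.scan(/.+/)

# Option 2: keep the UTF-8 tag but replace the invalid bytes first.
scrubbed = haystack.scrub("?")
scrubbed_matches = scrubbed.scan(/[a-z]+/)
```

Which option fits depends on whether the bytes are meant to be text at all: `scrub` changes the data, while `b` leaves every byte intact but makes `.` match single bytes rather than characters.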
burntsushi•1y ago
The nuance is specifically relevant here because neither of the other two regex engines benchmarked has this requirement. It's doubly relevant because that means running a regex search doesn't require a UTF-8 validation step, and is therefore likely beneficial from a perf perspective, depending on the workload.
kayodelycaon•1y ago
That’s a good point. I hadn’t considered it because I’ve hit the validation error long before getting to search. It is possible to avoid string operations with careful coding prior to the search.

Edit: After a little testing, the strings can be read from and written to files without triggering validation. Presumably this applies to sockets as well.
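A rough sketch of that kind of test, for anyone who wants to reproduce it. The file names and variables are hypothetical, and it assumes `Encoding.default_internal` is unset (the default), so no transcoding happens on read or write.

```ruby
require "tmpdir"
require "fileutils"

# Raw bytes that are not valid UTF-8, followed by plain ASCII.
raw = "\xfc\xa1\xa1\xa1\xa1\xa1abc".b

dir = Dir.mktmpdir
src = File.join(dir, "in.txt")   # hypothetical file names
dst = File.join(dir, "out.txt")
File.binwrite(src, raw)

# Reading tags the data as UTF-8 without checking it.
text = File.read(src, encoding: "UTF-8")
read_ok = text.encoding == Encoding::UTF_8 && !text.valid_encoding?

# Writing copies the bytes back out, again without validation.
File.write(dst, text)
round_trip_ok = File.binread(dst) == raw

# The error only appears once a regexp is run against the string.
scan_error = begin
  text.scan(/.+/)
  nil
rescue ArgumentError => e
  e
end

FileUtils.remove_entry(dir)
```

So the invalid bytes can travel through file I/O untouched, and the `ArgumentError` only shows up at the first operation that needs to interpret them as characters.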

DmitryOlshansky•1y ago
I wonder how std.regex of dlang would fare in such a test. Sadly, due to a tiny bit of D’s GC use, it’s hard to provide as a library for other languages. If there is interest I might take it through the tests.
