Fast(er) regular expression engines in Ruby

https://serpapi.com/blog/faster-regular-expression-engines-in-ruby/
60•davidsojevic•1y ago

Comments

yxhuvud•1y ago
Eww, pretending to support utf8 matchers while not supporting them at all was not pretty to see.
gitroom•1y ago
Honestly that part bugs me, fake support is worse than no support imo
kayodelycaon•1y ago
> Another nuance was found in ruby, which cannot scan the haystack with invalid UTF-8 byte sequences.

This is extremely basic Ruby: UTF-8 encoded strings must be valid UTF-8. This is not unique to Ruby. If I recall correctly, Python 3 does the same thing.

    2.7.1 :001 > haystack = "\xfc\xa1\xa1\xa1\xa1\xa1abc"
    2.7.1 :003 > haystack.force_encoding "ASCII-8BIT"
    => "\xFC\xA1\xA1\xA1\xA1\xA1abc" 
    2.7.1 :004 > haystack.scan(/.+/)
    => ["\xFC\xA1\xA1\xA1\xA1\xA1abc"]

This person is a senior engineer on their Team page. All they had to do was google "ArgumentError: invalid byte sequence in UTF-8". Or ask a coworker... the company has Ruby on Rails applications. *headdesk*
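
For anyone who wants to reproduce this, here is a minimal sketch of the failure and two common ways around it. It assumes a UTF-8 source file (the default in current Ruby) and reuses the bytes from the session above; the variable names are only illustrative.

```ruby
# Bytes from the irb session above: an invalid UTF-8 sequence followed by "abc".
haystack = "\xfc\xa1\xa1\xa1\xa1\xa1abc"

# The literal is tagged UTF-8, but its bytes do not form valid UTF-8.
tagged_utf8 = haystack.encoding == Encoding::UTF_8
valid       = haystack.valid_encoding?

# Scanning the string as-is raises ArgumentError.
scan_error =
  begin
    haystack.scan(/.+/)
    nil
  rescue ArgumentError => e
    e
  end

# Option 1: String#b returns a copy tagged ASCII-8BIT (binary), bytes untouched,
# so the regex works on raw bytes.
binary_matches = haystack.b.scan(/.+/)

# Option 2: String#scrub replaces the invalid bytes, yielding a valid UTF-8 string.
scrubbed         = haystack.scrub("?")
scrubbed_matches = scrubbed.scan(/[a-z]+/)
```

Which option fits depends on the workload: `b` keeps every byte but gives up character semantics, while `scrub` keeps UTF-8 semantics but changes the data.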
burntsushi•1y ago
The nuance is specifically relevant here because neither of the other two regex engines benchmarked has this requirement. It's doubly relevant because that means running a regex search doesn't require a UTF-8 validation step, and is therefore likely beneficial from a perf perspective, depending on the workload.
kayodelycaon•1y ago
That’s a good point. I hadn’t considered it because I’ve hit the validation error long before getting to search. It is possible to avoid string operations with careful coding prior to the search.

Edit: After a little testing, the strings can be read from and written to files without triggering validation. Presumably this applies to sockets as well.
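
A rough sketch of that test, assuming a temporary file stands in for the real input. It covers only files, not sockets, and the variable names are illustrative.

```ruby
require "tempfile"

# The same invalid bytes as above, held as a binary string.
raw = "\xfc\xa1\xa1\xa1\xa1\xa1abc".b

text, copy = Tempfile.create("haystack") do |file|
  file.binmode
  file.write(raw)
  file.flush

  # Reading tags the result as UTF-8 without checking the bytes.
  read_back = File.read(file.path, encoding: "UTF-8")

  # Writing the invalid string back out does not raise either.
  File.write(file.path, read_back)

  [read_back, File.binread(file.path)]
end

still_invalid = !text.valid_encoding?
```

The string stays invalid the whole way through; nothing complains until an operation that has to decode characters, such as `scan`, touches it.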

DmitryOlshansky•1y ago
I wonder how std.regex of dlang would fare in such a test. Sadly, due to a tiny bit of D’s GC use, it’s hard to provide as a library for other languages. If there is interest I might take it through the tests.
