Fast(er) regular expression engines in Ruby

https://serpapi.com/blog/faster-regular-expression-engines-in-ruby/
60•davidsojevic•1y ago

Comments

yxhuvud•1y ago
Eww, pretending to support utf8 matchers while not supporting them at all was not pretty to see.
gitroom•1y ago
Honestly that part bugs me, fake support is worse than no support imo
kayodelycaon•1y ago
> Another nuance was found in ruby, which cannot scan the haystack with invalid UTF-8 byte sequences.

This is extremely basic Ruby: UTF-8 encoded strings must be valid UTF-8. This is not unique to Ruby. If I recall correctly, Python 3 does the same thing.

    2.7.1 :001 > haystack = "\xfc\xa1\xa1\xa1\xa1\xa1abc"
    2.7.1 :003 > haystack.force_encoding "ASCII-8BIT"
    => "\xFC\xA1\xA1\xA1\xA1\xA1abc" 
    2.7.1 :004 > haystack.scan(/.+/)
    => ["\xFC\xA1\xA1\xA1\xA1\xA1abc"]
This person is a senior engineer on their Team page. All they had to do was google "ArgumentError: invalid byte sequence in UTF-8". Or ask a coworker... the company has Ruby on Rails applications. headdesk
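
A minimal sketch of the behaviour under discussion, assuming a reasonably recent CRuby. It reuses the bytes from the session above; the variable names and the `"?"` replacement character are illustrative choices, not anything taken from the article.

```ruby
# Same bytes as the irb session above, explicitly tagged as UTF-8.
haystack = String.new("\xfc\xa1\xa1\xa1\xa1\xa1abc", encoding: Encoding::UTF_8)

# The string can exist, but it does not pass validation.
is_valid = haystack.valid_encoding?

# Matching a regex against it is the step that raises.
raised = nil
begin
  haystack.scan(/.+/)
rescue ArgumentError => e
  raised = e
end

# Workaround 1: search a binary-tagged copy, so the regex sees raw bytes.
binary_matches = haystack.b.scan(/.+/)

# Workaround 2: replace the invalid bytes first, then search as UTF-8.
scrubbed = haystack.scrub("?")
scrubbed_matches = scrubbed.scan(/abc/)
```

`String#b` returns a copy tagged as ASCII-8BIT and leaves the original untouched, while `String#scrub` returns a valid string but loses the original bytes, so which one fits depends on whether the raw bytes matter downstream.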
burntsushi•1y ago
The nuance is specifically relevant here because neither of the other two regex engines benchmarked has this requirement. It's doubly relevant because that means running a regex search doesn't require a UTF-8 validation step, and is therefore likely beneficial from a perf perspective, depending on the workload.
kayodelycaon•1y ago
That’s a good point. I hadn’t considered it because I’ve hit the validation error long before getting to search. It is possible to avoid string operations with careful coding prior to the search.

Edit: After a little testing, the strings can be read from and written to files without triggering validation. Presumably this applies to sockets as well.
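
Roughly what that test could look like, as a sketch assuming CRuby on a Unix-like system with default encoding settings. The temp file names are made up for illustration.

```ruby
require "tmpdir"

raw_bytes = "\xfc\xa1\xa1\xa1\xa1\xa1abc".b

# Hypothetical scratch files, named only for this example.
path = File.join(Dir.tmpdir, "invalid_utf8_demo_#{Process.pid}.txt")
copy_path = "#{path}.copy"

begin
  File.binwrite(path, raw_bytes)

  # Reading tags the result as UTF-8 but does not validate it.
  text = File.read(path, encoding: "UTF-8")
  tagged_utf8 = text.encoding == Encoding::UTF_8
  still_invalid = !text.valid_encoding?

  # Writing the invalid string back out does not raise either.
  File.write(copy_path, text)
  round_tripped = File.binread(copy_path) == raw_bytes
ensure
  # Clean up the scratch files.
  [path, copy_path].each { |p| File.delete(p) if File.exist?(p) }
end
```

Nothing here touches the regex engine, so the invalid bytes pass through unchanged. The error only shows up once a string operation that needs valid characters runs.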

DmitryOlshansky•1y ago
I wonder how std.regex of dlang would fare in such a test. Sadly, due to a tiny bit of D’s GC use, it’s hard to provide as a library for other languages. If there is interest I might take it through the tests.
