Fast(er) regular expression engines in Ruby

https://serpapi.com/blog/faster-regular-expression-engines-in-ruby/
60•davidsojevic•12mo ago

Comments

yxhuvud•12mo ago
Eww, pretending to support utf8 matchers while not supporting them at all was not pretty to see.
gitroom•12mo ago
Honestly that part bugs me, fake support is worse than no support imo
kayodelycaon•12mo ago
> Another nuance was found in ruby, which cannot scan the haystack with invalid UTF-8 byte sequences.

This is extremely basic ruby: UTF-8 encoded strings must be valid UTF-8. This is not unique to ruby. If I recall correctly, python 3 does the same thing.

    2.7.1 :001 > haystack = "\xfc\xa1\xa1\xa1\xa1\xa1abc"
    2.7.1 :003 > haystack.force_encoding "ASCII-8BIT"
    => "\xFC\xA1\xA1\xA1\xA1\xA1abc" 
    2.7.1 :004 > haystack.scan(/.+/)
    => ["\xFC\xA1\xA1\xA1\xA1\xA1abc"]

This person is a senior engineer on their Team page. All they had to do was google "ArgumentError: invalid byte sequence in UTF-8". Or ask a coworker... the company has Ruby on Rails applications. headdesk
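
For anyone who wants to reproduce both halves of this outside irb, here is a minimal sketch. It assumes the source file is UTF-8 (the Ruby default), and `scan_bytes` is a hypothetical helper name made up for illustration, not something from the article or from Ruby itself.

```ruby
# Hypothetical helper (not from the article): scan the haystack as raw bytes
# so that invalid UTF-8 sequences do not raise.
def scan_bytes(haystack, pattern)
  # dup first so the caller's string keeps its original encoding tag
  haystack.dup.force_encoding(Encoding::BINARY).scan(pattern)
end

# Same bytes as the irb session above; the literal is tagged UTF-8 but is
# not a valid UTF-8 sequence.
haystack = "\xfc\xa1\xa1\xa1\xa1\xa1abc"
is_valid = haystack.valid_encoding?

# Scanning the UTF-8-tagged string is what triggers the error.
error = begin
  haystack.scan(/.+/)
  nil
rescue ArgumentError => e
  e
end

# Scanning the same bytes with a binary encoding tag works.
matches = scan_bytes(haystack, /.+/)

puts "valid UTF-8? #{is_valid}"
puts "error: #{error.class}" if error
puts "binary matches: #{matches.length}"
```

Calling `dup` before `force_encoding` is a deliberate choice here: `force_encoding` mutates its receiver, so without the copy the caller's string would silently become binary.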
burntsushi•12mo ago
The nuance is specifically relevant here because neither of the other two regex engines benchmarked has this requirement. It's doubly relevant because that means running a regex search doesn't require a UTF-8 validation step, and is therefore likely beneficial from a perf perspective, depending on the workload.
kayodelycaon•12mo ago
That’s a good point. I hadn’t considered it because I’ve hit the validation error long before getting to search. It is possible to avoid string operations with careful coding prior to the search.

Edit: After a little testing, the strings can be read from and written to files without triggering validation. Presumably this applies to sockets as well.
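
A rough sketch of that kind of test, assuming a temp file stands in for whatever file the data really comes from (the variable names are made up for illustration): the invalid bytes survive a write and a UTF-8-tagged read without complaint, and the error only shows up once a regex touches the string.

```ruby
require "tempfile"

# Same invalid byte sequence as upthread, built as a binary string.
raw = "\xfc\xa1\xa1\xa1\xa1\xa1abc".b

read_back = nil
rewritten = nil

Tempfile.create("haystack") do |file|
  file.binmode
  file.write(raw)
  file.flush

  # Reading with a UTF-8 tag does not validate the bytes.
  read_back = File.read(file.path, encoding: "UTF-8")

  # Writing the invalid string back out does not validate either.
  File.write(file.path, read_back)
  rewritten = File.binread(file.path)
end

# Validation only happens once the string is actually searched.
scan_error = begin
  read_back.scan(/.+/)
  nil
rescue ArgumentError => e
  e
end

puts "encoding after read: #{read_back.encoding}"
puts "valid? #{read_back.valid_encoding?}"
puts "bytes preserved? #{rewritten.bytes == raw.bytes}"
puts "scan raised: #{scan_error.class}" if scan_error
```

This only covers files; the socket case is still a presumption and is not exercised here.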

DmitryOlshansky•11mo ago
I wonder how std.regex of dlang would fare in such a test. Sadly, due to a tiny bit of D’s GC use, it’s hard to provide as a library for other languages. If there is interest I might take it through the tests.
