Fast(er) regular expression engines in Ruby

https://serpapi.com/blog/faster-regular-expression-engines-in-ruby/
60•davidsojevic•1y ago

Comments

yxhuvud•1y ago
Eww, pretending to support utf8 matchers while not supporting them at all was not pretty to see.
gitroom•1y ago
Honestly that part bugs me, fake support is worse than no support imo
kayodelycaon•1y ago
> Another nuance was found in ruby, which cannot scan the haystack with invalid UTF-8 byte sequences.

This is extremely basic Ruby: UTF-8 encoded strings must be valid UTF-8. This is not unique to Ruby. If I recall correctly, Python 3 does the same thing.

    2.7.1 :001 > haystack = "\xfc\xa1\xa1\xa1\xa1\xa1abc"
    2.7.1 :003 > haystack.force_encoding "ASCII-8BIT"
    => "\xFC\xA1\xA1\xA1\xA1\xA1abc" 
    2.7.1 :004 > haystack.scan(/.+/)
    => ["\xFC\xA1\xA1\xA1\xA1\xA1abc"]

This person is a senior engineer on their Team page. All they had to do was google "ArgumentError: invalid byte sequence in UTF-8". Or ask a coworker... the company has Ruby on Rails applications. headdesk
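
For anyone who wants to try this outside irb, here is a minimal sketch of the same idea as a script. The helper name `scan_tolerant` is hypothetical (it is not from the article or from Ruby itself), and the sketch assumes the pattern is an ASCII-only regex, since a pattern containing non-ASCII literals would not be compatible with a binary string.

```ruby
# Hypothetical helper, not from the article or the thread.
def scan_tolerant(haystack, pattern)
  # A string whose bytes are valid for its tagged encoding can be scanned as-is.
  return haystack.scan(pattern) if haystack.valid_encoding?

  # String#b returns a copy tagged ASCII-8BIT; the bytes themselves are unchanged.
  # This only works for patterns that are compatible with a binary string,
  # such as ASCII-only regex literals.
  haystack.b.scan(pattern)
end

# Same bytes as in the irb session above; the literal is tagged UTF-8 by default.
haystack = "\xfc\xa1\xa1\xa1\xa1\xa1abc"

valid_before = haystack.valid_encoding?          # does the UTF-8 tag match the bytes?
letters      = scan_tolerant(haystack, /[a-z]+/) # matched on the binary copy

# Alternative: String#scrub returns a copy with the invalid bytes replaced,
# which changes the data but keeps the UTF-8 tag.
scrubbed = haystack.scrub("?")

puts valid_before.inspect
puts letters.inspect
puts scrubbed.valid_encoding?.inspect
```

Retagging with `String#b` keeps every byte, while `String#scrub` alters the data, so which one fits depends on whether the invalid bytes matter to the search.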
burntsushi•1y ago
The nuance is specifically relevant here because neither of the other two regex engines benchmarked has this requirement. It's doubly relevant because that means running a regex search doesn't require a UTF-8 validation step, and is therefore likely beneficial from a perf perspective, depending on the workload.
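
One rough way to get a feel for the validation cost on the Ruby side is to time it separately from a search. This is only a sketch: the input text and its size are made up for illustration, and the numbers will vary with the Ruby version, the input, and whether Ruby already knows the string is valid by the time the search runs.

```ruby
require "benchmark"

# Hypothetical workload: a few megabytes of valid UTF-8 text built in memory.
# The \u escapes keep the source ASCII-only while producing UTF-8 strings.
text = "h\u00e9llo w\u00f6rld\n" * 200_000

# Time an explicit validity check over the whole string.
validation = Benchmark.realtime { text.valid_encoding? }

# Time a simple search over the same string.
match_count = 0
search = Benchmark.realtime { match_count = text.scan(/w\u00f6rld/).size }

puts format("validation: %.4fs, search: %.4fs, matches: %d",
            validation, search, match_count)
```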
kayodelycaon•1y ago
That’s a good point. I hadn’t considered it because I’ve hit the validation error long before getting to search. It is possible to avoid string operations with careful coding prior to the search.

Edit: After a little testing, the strings can be read from and written to files without triggering validation. Presumably this applies to sockets as well.
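
A minimal sketch of that file round trip, assuming a temporary directory and made-up file names (`in.txt`, `out.txt`), and assuming no internal encoding or transcoding is configured for the process:

```ruby
require "tmpdir"

# The same invalid bytes used earlier in the thread, held as a binary string.
bytes = "\xfc\xa1\xa1\xa1\xa1\xa1abc".b

text_valid    = nil
round_trip_ok = nil

Dir.mktmpdir do |dir|
  src = File.join(dir, "in.txt")   # hypothetical file names
  dst = File.join(dir, "out.txt")

  File.binwrite(src, bytes)

  # The string comes back tagged UTF-8; reading does not raise on invalid bytes.
  text = File.read(src, encoding: "UTF-8")
  text_valid = text.valid_encoding?

  # Writing the string back out does not raise either.
  File.write(dst, text)
  round_trip_ok = File.binread(dst) == bytes
end

puts "valid UTF-8 after read: #{text_valid}"
puts "bytes preserved: #{round_trip_ok}"
```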

DmitryOlshansky•1y ago
I wonder how std.regex of dlang would fare in such a test. Sadly, due to a tiny bit of D’s GC use, it’s hard to provide as a library for other languages. If there is interest I might take it through the tests.
