Fast(er) regular expression engines in Ruby

https://serpapi.com/blog/faster-regular-expression-engines-in-ruby/
60•davidsojevic•1y ago

Comments

yxhuvud•1y ago
Eww, pretending to support utf8 matchers while not supporting them at all was not pretty to see.
gitroom•1y ago
Honestly that part bugs me, fake support is worse than no support imo
kayodelycaon•1y ago
> Another nuance was found in ruby, which cannot scan the haystack with invalid UTF-8 byte sequences.

This is extremely basic Ruby: UTF-8 encoded strings must be valid UTF-8. This is not unique to Ruby. If I recall correctly, Python 3 does the same thing.

    2.7.1 :001 > haystack = "\xfc\xa1\xa1\xa1\xa1\xa1abc"
    2.7.1 :003 > haystack.force_encoding "ASCII-8BIT"
    => "\xFC\xA1\xA1\xA1\xA1\xA1abc" 
    2.7.1 :004 > haystack.scan(/.+/)
    => ["\xFC\xA1\xA1\xA1\xA1\xA1abc"]

This person is listed as a senior engineer on their Team page. All they had to do was google "ArgumentError: invalid byte sequence in UTF-8". Or ask a coworker... the company has Ruby on Rails applications. headdesk
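
For anyone following along, here is a minimal sketch of the behaviour discussed above, using the same bytes as the irb session. It assumes the string is tagged UTF-8 (forced explicitly here so the result does not depend on the source or locale encoding). The variable names are made up for illustration, and `scrub` is shown only as one alternative to retagging the string as binary, not as what the article's author should have done.

```ruby
# Same bytes as in the irb session, explicitly tagged as UTF-8.
haystack = "\xfc\xa1\xa1\xa1\xa1\xa1abc".dup.force_encoding(Encoding::UTF_8)

# The bytes are not a valid UTF-8 sequence, so this is false.
utf8_ok = haystack.valid_encoding?

# Scanning the UTF-8-tagged string raises ArgumentError; capture it instead of crashing.
scan_error = begin
  haystack.scan(/.+/)
  nil
rescue ArgumentError => e
  e
end

# Option 1: work on a binary (ASCII-8BIT) copy, as in the session above.
binary = haystack.b
binary_matches = binary.scan(/.+/)

# Option 2: replace the invalid bytes so the string becomes valid UTF-8.
scrubbed = haystack.scrub("?")
scrubbed_matches = scrubbed.scan(/.+/)
```

Option 1 keeps every byte but gives up character semantics; option 2 keeps UTF-8 semantics but changes the data.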
burntsushi•1y ago
The nuance is specifically relevant here because neither of the other two regex engines benchmarked has this requirement. It's doubly relevant because that means running a regex search doesn't require a UTF-8 validation step, and is therefore likely beneficial from a perf perspective, depending on the workload.
kayodelycaon•1y ago
That’s a good point. I hadn’t considered it because I’ve hit the validation error long before getting to search. It is possible to avoid string operations with careful coding prior to the search.

Edit: After a little testing, the strings can be read from and written to files without triggering validation. Presumably this applies to sockets as well.
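
A rough sketch of the kind of test described in the edit above. This is an assumption about how one might check it, not the exact test that was run. It writes invalid UTF-8 bytes to a temporary file, reads them back tagged as UTF-8, and writes the string out again. The `round_trip` hash and its keys are hypothetical names used only for this example.

```ruby
require "tempfile"

# Invalid UTF-8 bytes, held as a binary string so nothing validates them yet.
raw = "\xfc\xa1\xa1\xa1\xa1\xa1abc".b

round_trip = Tempfile.create("haystack") do |file|
  file.binmode
  file.write(raw)
  file.flush

  # Reading tags the data as UTF-8 but does not check that it is valid.
  text = File.read(file.path, encoding: "UTF-8")

  # Writing the invalid string back out does not raise either.
  File.write(file.path, text)

  {
    tagged_as: text.encoding,
    valid: text.valid_encoding?,
    bytes_on_disk: File.binread(file.path)
  }
end
```

In this sketch the error only shows up once an operation that needs valid characters, such as a regex match, touches the string.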

DmitryOlshansky•1y ago
I wonder how std.regex of dlang would fare in such a test. Sadly, due to a tiny bit of D’s GC use, it’s hard to provide as a library for other languages. If there is interest, I might take it through the tests.
