Fast(er) regular expression engines in Ruby

https://serpapi.com/blog/faster-regular-expression-engines-in-ruby/
60•davidsojevic•1y ago

Comments

yxhuvud•1y ago
Eww, pretending to support utf8 matchers while not supporting them at all was not pretty to see.
gitroom•1y ago
Honestly that part bugs me, fake support is worse than no support imo
kayodelycaon•1y ago
> Another nuance was found in ruby, which cannot scan the haystack with invalid UTF-8 byte sequences.

This is extremely basic Ruby: UTF-8 encoded strings must be valid UTF-8. This is not unique to Ruby. If I recall correctly, Python 3 does the same thing.

    2.7.1 :001 > haystack = "\xfc\xa1\xa1\xa1\xa1\xa1abc"
    2.7.1 :003 > haystack.force_encoding "ASCII-8BIT"
    => "\xFC\xA1\xA1\xA1\xA1\xA1abc" 
    2.7.1 :004 > haystack.scan(/.+/)
    => ["\xFC\xA1\xA1\xA1\xA1\xA1abc"]

This person is a senior engineer on their Team page. All they had to do was google "ArgumentError: invalid byte sequence in UTF-8". Or ask a coworker... the company has Ruby on Rails applications. headdesk
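
For anyone who wants to reproduce this outside irb, here is a minimal sketch of the failure and two common ways around it. The helper names `scan_bytes` and `scan_scrubbed` are made up for illustration; they are not from the article or from Ruby itself. It assumes the default UTF-8 source encoding.

```ruby
# Hypothetical helper (not from the article): match against raw bytes so the
# haystack does not have to be valid UTF-8.
def scan_bytes(haystack, pattern)
  # String#b returns a copy tagged ASCII-8BIT; the original string is untouched.
  # The pattern should be ASCII-only, otherwise the encodings may be incompatible.
  haystack.b.scan(pattern)
end

# Hypothetical helper: replace invalid bytes first, then match as UTF-8.
def scan_scrubbed(haystack, pattern)
  haystack.scrub("?").scan(pattern)
end

# Same bytes as in the irb session above: tagged UTF-8, but not valid UTF-8.
haystack = "\xfc\xa1\xa1\xa1\xa1\xa1abc"

puts "valid UTF-8? #{haystack.valid_encoding?}"

begin
  haystack.scan(/.+/)
rescue ArgumentError => e
  # This is the error the article ran into.
  puts "scan raised: #{e.message}"
end

p scan_bytes(haystack, /abc/)    # matches on the binary copy
p scan_scrubbed(haystack, /abc/) # matches after invalid bytes are replaced
```

The binary route keeps every byte but gives up character-aware matching; the scrub route keeps UTF-8 semantics but changes the haystack, so match offsets may no longer line up with the original bytes.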
burntsushi•1y ago
The nuance is specifically relevant here because neither of the other two regex engines benchmarked has this requirement. It's doubly relevant because that means running a regex search doesn't require a UTF-8 validation step, and is therefore likely beneficial from a perf perspective, depending on the workload.
kayodelycaon•1y ago
That’s a good point. I hadn’t considered it because I’ve hit the validation error long before getting to search. It is possible to avoid string operations with careful coding prior to the search.

Edit: After a little testing, the strings can be read from and written to files without triggering validation. Presumably this applies to sockets as well.
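
A rough sketch of that kind of test, assuming a temp file stands in for whatever file was actually used. `round_trip` is a hypothetical helper name, not something from the article.

```ruby
require "tempfile"

# Hypothetical helper: write the given bytes to a temp file, then read them
# back tagged as UTF-8. Neither step checks that the bytes are valid UTF-8.
def round_trip(bytes)
  Tempfile.create("invalid-utf8") do |file|
    file.write(bytes)
    file.flush
    File.read(file.path, encoding: "UTF-8")
  end
end

original = "\xfc\xa1\xa1\xa1\xa1\xa1abc"
copy = round_trip(original)

puts "same bytes?   #{copy.bytes == original.bytes}"
puts "encoding tag: #{copy.encoding}"
puts "valid UTF-8?  #{copy.valid_encoding?}"

# Validation only shows up once a string operation needs it, e.g. a regex scan.
begin
  copy.scan(/.+/)
rescue ArgumentError => e
  puts "scan raised: #{e.message}"
end
```

Whether sockets behave the same way is not covered by this sketch.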

DmitryOlshansky•1y ago
I wonder how std.regex of dlang would fare in such a test. Sadly, due to a tiny bit of D’s GC use, it’s hard to provide as a library for other languages. If there is interest I might take it through the tests.
