Fast(er) regular expression engines in Ruby

https://serpapi.com/blog/faster-regular-expression-engines-in-ruby/
60•davidsojevic•8mo ago

Comments

yxhuvud•8mo ago
Eww, pretending to support utf8 matchers while not supporting them at all was not pretty to see.
gitroom•8mo ago
Honestly that part bugs me, fake support is worse than no support imo
kayodelycaon•8mo ago
> Another nuance was found in ruby, which cannot scan the haystack with invalid UTF-8 byte sequences.

This is extremely basic Ruby: UTF-8 encoded strings must be valid UTF-8. This is not unique to Ruby. If I recall correctly, Python 3 does the same thing.

    2.7.1 :001 > haystack = "\xfc\xa1\xa1\xa1\xa1\xa1abc"
    2.7.1 :003 > haystack.force_encoding "ASCII-8BIT"
    => "\xFC\xA1\xA1\xA1\xA1\xA1abc" 
    2.7.1 :004 > haystack.scan(/.+/)
    => ["\xFC\xA1\xA1\xA1\xA1\xA1abc"]

This person is listed as a senior engineer on their Team page. All they had to do was google "ArgumentError: invalid byte sequence in UTF-8" or ask a coworker; the company has Ruby on Rails applications. headdesk
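
For anyone following along, here is a minimal sketch of two ways to scan a haystack like the one above without tripping over the invalid bytes. The helper names `scan_as_bytes` and `scan_scrubbed` are made up for illustration and do not come from the article; `String#b`, `String#scrub`, and `String#valid_encoding?` are standard Ruby.

```ruby
# Bytes copied from the transcript above: an invalid UTF-8 sequence, then "abc".
HAYSTACK = "\xfc\xa1\xa1\xa1\xa1\xa1abc"

# Hypothetical helper: scan a binary (ASCII-8BIT) copy of the string,
# so every byte is treated as its own character and nothing is validated.
def scan_as_bytes(str, pattern)
  str.b.scan(pattern)
end

# Hypothetical helper: replace invalid byte sequences with "?" first,
# then scan the now-valid UTF-8 copy.
def scan_scrubbed(str, pattern)
  str.scrub("?").scan(pattern)
end

puts HAYSTACK.valid_encoding?        # false: the literal is tagged UTF-8 but is not valid
p scan_as_bytes(HAYSTACK, /.+/)      # one match covering all nine bytes
p scan_scrubbed(HAYSTACK, /[a-z]+/)  # ["abc"]
```

The byte-level version keeps the original data intact, while the scrubbed version trades fidelity for a string that every UTF-8 operation will accept. Which one fits depends on whether the invalid bytes matter to the search.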
burntsushi•8mo ago
The nuance is specifically relevant here because neither of the other two regex engines benchmarked has this requirement. It's doubly relevant because that means running a regex search doesn't require a UTF-8 validation step, and skipping that step is likely beneficial from a perf perspective, depending on the workload.
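
As a rough illustration of what that validation step costs on the Ruby side, the sketch below times one `String#valid_encoding?` pass on its own. The helper `time_validation` and the 8 MB haystack are assumptions made up for this example, not numbers or code from the article, and the timing will vary by machine.

```ruby
# Hypothetical helper: time a single validation pass over a string.
# Returns [valid?, elapsed_seconds].
def time_validation(str)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  valid = str.valid_encoding?
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  [valid, elapsed]
end

# Assumed workload: 8 MB of ASCII with one stray byte at the very end,
# so the whole string has to be walked before it can be rejected.
haystack = ("a" * 8_000_000) + "\xFF"

valid, seconds = time_validation(haystack)
puts "valid UTF-8: #{valid}"
puts format("validation pass took %.4f s", seconds)
```

An engine that searches raw bytes never pays for this pass, which is the workload-dependent saving described above.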
kayodelycaon•8mo ago
That’s a good point. I hadn’t considered it because I’ve hit the validation error long before getting to search. It is possible to avoid string operations with careful coding prior to the search.

Edit: After a little testing, the strings can be read from and written to files without triggering validation. Presumably this applies to sockets as well.
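
A minimal sketch of that file round trip, assuming a temporary file and a made-up helper name `round_trip`. It writes the invalid bytes, reads them back tagged as UTF-8, and only then asks Ruby whether the result is valid.

```ruby
require "tempfile"

# Same bytes as the earlier transcript: invalid UTF-8 followed by "abc".
BYTES = "\xfc\xa1\xa1\xa1\xa1\xa1abc"

# Hypothetical helper: write the string to a temp file and read it back
# as a UTF-8 tagged string. Neither step checks the bytes.
def round_trip(bytes)
  Tempfile.create("haystack") do |file|
    file.binmode       # write the bytes exactly as they are
    file.write(bytes)
    file.flush
    File.read(file.path, encoding: "UTF-8")
  end
end

copy = round_trip(BYTES)
puts copy.encoding            # UTF-8
puts copy.bytes == BYTES.bytes  # true: the bytes survive unchanged
puts copy.valid_encoding?     # false: still invalid, but nothing raised so far
```

The error only shows up once an operation that needs valid characters, such as a regex scan on the UTF-8 tagged string, touches the data.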

DmitryOlshansky•8mo ago
I wonder how std.regex of dlang would fare in such a test. Sadly, due to a tiny bit of D's GC use, it's hard to provide as a library for other languages. If there is interest I might take it through the tests.
