Fast(er) regular expression engines in Ruby

https://serpapi.com/blog/faster-regular-expression-engines-in-ruby/
60•davidsojevic•1y ago

Comments

yxhuvud•1y ago
Eww, pretending to support utf8 matchers while not supporting them at all was not pretty to see.
gitroom•1y ago
Honestly that part bugs me, fake support is worse than no support imo
kayodelycaon•1y ago
> Another nuance was found in ruby, which cannot scan the haystack with invalid UTF-8 byte sequences.

This is extremely basic ruby: UTF-8 encoded strings must be valid UTF-8. This is not unique to ruby. If I recall correctly, python 3 does the same thing.

    2.7.1 :001 > haystack = "\xfc\xa1\xa1\xa1\xa1\xa1abc"
    2.7.1 :003 > haystack.force_encoding "ASCII-8BIT"
    => "\xFC\xA1\xA1\xA1\xA1\xA1abc" 
    2.7.1 :004 > haystack.scan(/.+/)
    => ["\xFC\xA1\xA1\xA1\xA1\xA1abc"]
This person is a senior engineer on their Team page. All they had to do was google "ArgumentError: invalid byte sequence in UTF-8". Or ask a coworker... the company has Ruby on Rails applications. headdesk
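
For anyone following along, here is a minimal sketch of the behavior described above, assuming a UTF-8 source file (the Ruby default). It reuses the same byte sequence as the transcript; the variable names `binary` and `scrubbed` are only illustrative. It shows the error on the UTF-8 tagged string, then two common ways around it: searching the raw bytes, or replacing the invalid bytes first.

```ruby
# Invalid UTF-8 bytes followed by ASCII, same bytes as in the transcript above.
haystack = "\xfc\xa1\xa1\xa1\xa1\xa1abc"

haystack.encoding         # Encoding::UTF-8 (inherited from the source file)
haystack.valid_encoding?  # => false

# A regex search on a broken UTF-8 string raises ArgumentError.
error = begin
  haystack.scan(/.+/)
  nil
rescue ArgumentError => e
  e
end

# Option 1: search the raw bytes. String#b returns an ASCII-8BIT copy,
# so the original string is left untouched.
binary = haystack.b
binary_matches = binary.scan(/.+/)

# Option 2: replace the invalid bytes, then search as normal UTF-8.
# How many replacement characters appear depends on how Ruby groups
# the invalid bytes, so this sketch does not rely on the count.
scrubbed = haystack.scrub("?")
scrubbed_matches = scrubbed.scan(/.+/)
```

Which option fits depends on whether the invalid bytes matter to the search: the binary copy keeps every byte, while `scrub` changes the data.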
burntsushi•1y ago
The nuance is specifically relevant here because neither of the other two regex engines benchmarked has this requirement. It's doubly relevant because that means running a regex search doesn't require a UTF-8 validation step, and is therefore likely beneficial from a perf perspective, depending on the workload.
kayodelycaon•1y ago
That’s a good point. I hadn’t considered it because I’ve hit the validation error long before getting to search. It is possible to avoid string operations with careful coding prior to the search.

Edit: After a little testing, the strings can be read from and written to files without triggering validation. Presumably this applies to sockets as well.
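
A rough sketch of that file round trip, assuming a temporary directory and the hypothetical file names `in.txt` and `out.txt`. The read tags the bytes as UTF-8 without checking them, the write copies them back out unchanged, and only the regex search raises.

```ruby
require "tmpdir"

# Raw bytes that are not valid UTF-8, followed by ASCII.
raw = "\xfc\xa1\xa1\xa1\xa1\xa1abc".b

read_valid = nil
round_trip_ok = nil
search_error = nil

Dir.mktmpdir do |dir|
  src = File.join(dir, "in.txt")   # hypothetical file names
  dst = File.join(dir, "out.txt")
  File.binwrite(src, raw)

  # Reading tags the string as UTF-8 but does not validate it.
  data = File.read(src, encoding: "UTF-8")
  read_valid = data.valid_encoding?

  # Writing the string back out does not validate it either.
  File.write(dst, data)
  round_trip_ok = File.binread(dst) == raw

  # The error only shows up once a regex touches the string.
  begin
    data.scan(/.+/)
  rescue ArgumentError => e
    search_error = e
  end
end
```

Checking `valid_encoding?` right after the read is one way to catch bad input before it reaches a search.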

DmitryOlshansky•1y ago
I wonder how std.regex of dlang would fare in such a test. Sadly, due to a tiny bit of D's GC use, it's hard to provide as a library for other languages. If there is interest I might take it through the tests.
