The /o in Ruby regex stands for "oh the humanity "

https://jpcamara.com/2025/08/02/the-o-in-ruby-regex.html

95•todsacerdoti•6h ago

Comments

rco8786•4h ago

Love these sorts of deep dives, thanks!

cbsmith•4h ago

As an old Perl programmer, I knew immediately what the /o would do. ;-)

Amorymeltzer•3h ago

I've always loved the recent[1] summary from `perlre`:

>o - pretend to optimize your code, but actually introduce bugs

1: I still think of it as a relatively new change, but it's from 2013: <https://github.com/Perl/perl5/commit/7cf040c1f649790a4040aec...>

kstrauser•3h ago

It's older than that. The article links to this conversation about it in 2003: https://www.perlmonks.org/?node_id=256053

riffraff•4h ago

Unsurprisingly, `END {}` is also inherited from perl, tho I think it originally comes from awk.

mdaniel•3h ago

Similarly unsurprisingly, with its BEGIN friend https://docs.ruby-lang.org/en/3.3/syntax/miscellaneous_rdoc....

In the spirit of "what's old is new again," PowerShell also has the same idea, and is done per Function with "begin", "process", "end", and "clean" stanzas that allow setup, teardown, for-each-item, and "finally" behavior: https://learn.microsoft.com/en-us/powershell/module/microsof...

mananaysiempre•3h ago

Oh, that’s an interesting take. I’ve long been looking for newer developments on Awk’s clause structure, and this seems like an interesting take (though I’m unclear on whether I can have multiple begin/end clauses, which are the best thing about Awk’s version). It also finally connects this idea to something else in my mind—specifically advice[1] and CLOS’s :before/:after/:around methods[2]. (I guess Go’s defer also counts?)

[1] https://en.wikipedia.org/wiki/Advice_(programming)

[2] https://gigamonkeys.com/book/object-reorientation-generic-fu...

mdaniel•3h ago

It seems not:

Given:

    function Fred {
        begin {
            echo "hello from begin1"
        }
        begin {
            echo "hello from begin2"
        }
        process {
            echo "does the magic"
        }
    }
    $bob = @("alpha", "beta")
    $bob | Fred

Then

    $ pwsh fred.ps1
    ParserError: /Users/mdaniel/fred.ps1:5
    Line |
       5 |      begin {
         |      ~~~~~~~
         | Script command clause 'begin' has already been defined.

phoronixrly•4h ago

It's kind of a cool feature. I like it.

thayne•3h ago

Is it? I can't think of a non-contrived case where this would actually be useful.

And in any case where it would be useful, it seems like a better way to optimize would just be to refactor the regex out into a constant.

kayodelycaon•1h ago

Actually, I have a way this would work well. If you’re interpolating a value that comes from configuration and wouldn’t change.

Example: /admin@#{Rails.configuration.x.domain}/io

But you’re right that a constant would be a lot more clear. “o” is a footgun.
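To make the trade-off concrete, here is a minimal sketch in plain Ruby (no Rails). The method name `admin_for?` and the constants `DOMAIN` and `ADMIN_RE` are made up for illustration, and the domain is passed as an argument only to show what happens if the "won't change" assumption is broken.

```ruby
# Hypothetical helper: the pattern is interpolated, but /o bakes in
# whatever value `domain` had the first time this literal was evaluated.
def admin_for?(email, domain)
  /\Aadmin@#{Regexp.escape(domain)}\z/o.match?(email)
end

first  = admin_for?("admin@example.com", "example.com") # pattern built with example.com
second = admin_for?("admin@example.org", "example.org") # still tested against example.com

# The explicit alternative: build the regexp once, visibly, from a value
# that is fixed at load time.
DOMAIN   = "example.com"
ADMIN_RE = /\Aadmin@#{Regexp.escape(DOMAIN)}\z/i
third    = ADMIN_RE.match?("Admin@Example.com")

puts first   # true
puts second  # false, because the second domain was ignored
puts third   # true
```

The constant version does the same "compile once" work, but the reader can see where and when the pattern is built.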

lupire•4h ago

This is the same problem people have with closures, where it's unclear to the user whether the argument is captured by name or by value.

layer8•3h ago

This isn't the same problem, because this is about whether the regex is instantiated each time the code around the regex is executed, or only the first time and cached for subsequent executions. The same could in theory happen with closures, but I haven't ever seen programming-language semantics where, for example, a function containing the definition of a closure that depends on an argument of that outer function, would use the argument value of the first invocation of the function for all subsequent invocations of the function.

For example, when you have

    fn f x = (y -> x + y)

then a sequence of invocations of f

    f 1 3
    f 2 6

will yield 4 and 8 respectively, but never will the second invocation yield 7 due to reusing the value of x from the first invocation. However, that is precisely what happens in the article's regex example, because the equivalent is for the closure value (y -> x + y) to be cached between invocations, so that the x retains the value of the first invocation of f — regardless of whether x is a reference by name or by value.
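The same contrast can be sketched in Ruby. `make_adder` is the ordinary closure from the pseudocode above; `CachedAdder` is a hypothetical module that caches the closure value between invocations, which is roughly the behavior being attributed to the regex literal in the article.

```ruby
# Ordinary closure: every call captures its own x.
def make_adder(x)
  ->(y) { x + y }
end

# Hypothetical "cached closure": the lambda built by the first call is
# stored and handed back on every later call, so x never updates.
module CachedAdder
  def self.make(x)
    @adder ||= ->(y) { x + y }
  end
end

normal = [make_adder(1).call(3), make_adder(2).call(6)]
cached = [CachedAdder.make(1).call(3), CachedAdder.make(2).call(6)]

p normal  # [4, 8]
p cached  # [4, 7]
```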

zer00eyz•4h ago

I'm sorry but the classics never go out of style:

"Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems."

stavros•3h ago

Yeah but it's kind of tired when it's being used every time someone makes a mistake with regex. I've used them extensively in my career and never once regretted it.

apgwoz•2h ago

The problem with regexps is that “Sometimes a smart person, who has done the work, and knows how to leverage regular expressions correctly, decides they are appropriate for solving a problem where there is shared maintenance. Now, you have people who haven’t put in the work, and have been told repeatedly through ‘witty quips’ to not bother.”

jodrellblank•3h ago

The second problem being how to deal with all the extra time they just freed up?

fanf2•3h ago

This is one of the features that Ruby cribbed directly from Perl. The Ruby documentation seems really bad, in particular “interpolation mode” is grievously misleading.

Perl’s documentation is far more clear about the consequences:

(https://perldoc.perl.org/perlop#Regexp-Quote-Like-Operators)

  o   Compile pattern only once.

  […]

  PATTERN may contain variables, which will be
  interpolated every time the pattern search is
  evaluated, except for when the delimiter is a
  single quote. […] Perl will not recompile the
  pattern unless an interpolated variable that
  it contains changes. You can force Perl to skip
  the test and never recompile by adding a /o
  (which stands for "once") after the trailing
  delimiter. Once upon a time, Perl would recompile
  regular expressions unnecessarily, and this
  modifier was useful to tell it not to do so,
  in the interests of speed. But now, the only
  reasons to use /o are one of:

  [reasons]

  The bottom line is that using /o is almost
  never a good idea.

In the days before Perl automatically memoized the compilation of regexes with interpolation, even back in the 1990s, it said,

  However, mentioning /o constitutes a promise
  that you won't change the variables in the
  pattern. If you change them, Perl won't even
  notice.

Perl 4’s documentation is briefer. It says,

(https://github.com/Perl/perl5/blob/perl-4.0.00/perl.man#L272...)

  PATTERN may contain references to scalar
  variables, which will be interpolated
  (and the pattern recompiled) every time the
  pattern search is evaluated. […] If you want
  such a pattern to be compiled only once, add
  an “o” after the trailing delimiter. This
  avoids expensive run-time recompilations, and
  is useful when the value you are interpolating
  won't change over the life of the script.

Johnny555•51m ago

https://perldoc.perl.org/perlre

  o  - pretend to optimize your code, but actually introduce bugs

Joker_vD•3h ago

> I didn’t recognize /o. It didn’t seem critically important to lookup yet.

> With nothing else to investigate, I finally looked up the docs for what the /o regex modifier does.

I'll probably never understand this mode of thinking. But then again, Ruby programmers are, after all, people who chose to write Ruby.

> /o is referred to as “Interpolation mode”, which sounded pretty harmless.

Really? Those words sound quite alarming to me, due to personal reminiscences of eval.

Also, this whole "/o" feature seems insane. If I have an interpolation in my regex, obviously I have to re-interpolate it every time a new value is submitted, or I'd hit this very bug. And if the value is expected to be the same every time, then I can just compile it once and save the result myself, right? In which case, I probably could even do without interpolation in the first place.

apgwoz•2h ago

“Compilation”, I think, is exactly right. This feature is less about interpolation than it is about compilation of a single regexp to be used many times. It’s just shrouded in confusing documentation that should say: “/o tells ruby to rewrite this code such that it refers to a new statically allocated regexp object.” And when you write it that way, you see how insane it is for a function call to be hoisted automatically like this, without an explicit, obvious, syntactic annotation.
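A small sketch of that reading, comparing object identity. The method names and the `HoistedPattern` module are invented for illustration; the module is only a hand-written approximation of the hoisting, not a description of how the interpreter implements it.

```ruby
def without_o(word)
  /#{word}/   # a fresh Regexp on every call
end

def with_o(word)
  /#{word}/o  # the first Regexp built here is reused on later calls
end

# Hand-written approximation of the same hoisting, made explicit.
module HoistedPattern
  def self.build(word)
    @regexp ||= /#{word}/
  end
end

plain_a, plain_b = without_o("cat"), without_o("dog")
once_a,  once_b  = with_o("cat"), with_o("dog")
hand_a,  hand_b  = HoistedPattern.build("cat"), HoistedPattern.build("dog")

p plain_b.source         # "dog"
p once_b.source          # "cat"
p once_a.equal?(once_b)  # true, the very same object
p hand_b.source          # "cat"
```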

kazinator•1h ago

> Modifier o means that the first time a literal regexp with interpolations is encountered, the generated Regexp object is saved and used for all future evaluations of that literal regexp.

That is crystal clear to me. It means that on the next execution, the new values of the interpolation will be ignored; the regexp is now "baked" with the first ones.

Like this in C++:

  void fun(int arg)
  {
     static int once = arg;
  }

If we call this as fun(42) the first time, once gets initialized to 42. If we then call fun(73), once stays 42.

There is a function in POSIX for once-only initializations: pthread_once. C++ compilers for multithreaded environments emit thread-safe code to do something similar to pthread_once to ensure that even if there are several concurrent first invocations of the function, the initialization happens once.
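For comparison, a rough Ruby analogue of once-only initialization can be sketched with a `Mutex`. The `Once` class below is hypothetical, not a standard library class, and it only illustrates the idea; it makes no claim about how Ruby implements `/o` internally.

```ruby
# Hypothetical helper: runs a block at most once, even with concurrent callers.
class Once
  def initialize
    @mutex = Mutex.new
    @done  = false
    @value = nil
  end

  # Returns the value computed by the first block to run; later blocks are ignored.
  def call
    @mutex.synchronize do
      unless @done
        @value = yield
        @done  = true
      end
      @value
    end
  end
end

baked  = Once.new
first  = baked.call { 42 }
second = baked.call { 73 }  # block is not run; the stored 42 comes back

# Several concurrent "first" invocations: only one block actually executes.
shared  = Once.new
runs    = 0
results = 8.times.map { |i|
  Thread.new { shared.call { runs += 1; i } }
}.map(&:value)

p [first, second]  # [42, 42]
p runs             # 1
```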
