> o - pretend to optimize your code, but actually introduce bugs
1: I still think of it as a relatively new change, but it's from 2013: <https://github.com/Perl/perl5/commit/7cf040c1f649790a4040aec...>
In the spirit of "what's old is new again," PowerShell has the same idea, applied per function: "begin", "process", "end", and "clean" stanzas provide setup, for-each-item, teardown, and "finally" behavior, respectively: https://learn.microsoft.com/en-us/powershell/module/microsof...
[1] https://en.wikipedia.org/wiki/Advice_(programming)
[2] https://gigamonkeys.com/book/object-reorientation-generic-fu...
Given:
function Fred {
    begin {
        echo "hello from begin1"
    }
    begin {
        echo "hello from begin2"
    }
    process {
        echo "does the magic"
    }
}
$bob = @("alpha", "beta")
$bob | Fred
Then:
$ pwsh fred.ps1
ParserError: /Users/mdaniel/fred.ps1:5
Line |
   5 | begin {
     | ~~~~~~~
     | Script command clause 'begin' has already been defined.
And in any case where it would be useful, it seems like a better way to optimize would just be to refactor the regex out into a constant.
Example: /admin@#{Rails.config.x.domain}/io
But you’re right that a constant would be a lot more clear. “o” is a footgun.
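A minimal Ruby sketch of the footgun, assuming hypothetical helpers that receive the domain as an argument (the method names and sample addresses are made up for illustration). With /o the interpolation is evaluated only the first time the literal runs, so later calls silently keep the first domain:

```ruby
# Hypothetical helper: the /o flag freezes the pattern after the first call.
def admin_once?(domain, address)
  /admin@#{Regexp.escape(domain)}/io.match?(address)
end

# Same pattern without /o: interpolated again on every call.
def admin_fresh?(domain, address)
  /admin@#{Regexp.escape(domain)}/i.match?(address)
end

once_results = [
  admin_once?("example.com", "admin@example.com"), # first call bakes in example.com
  admin_once?("other.org", "admin@other.org"),     # the new domain is ignored
  admin_once?("other.org", "admin@example.com")    # still matches the first domain
]

fresh_results = [
  admin_fresh?("example.com", "admin@example.com"),
  admin_fresh?("other.org", "admin@other.org"),
  admin_fresh?("other.org", "admin@example.com")
]

# The clearer alternative: build the regex once, visibly, as a constant.
ADMIN_DOMAIN = "example.com" # assumed fixed for the life of the process
ADMIN_ADDRESS = /admin@#{Regexp.escape(ADMIN_DOMAIN)}/i

p once_results   # => [true, false, true]
p fresh_results  # => [true, true, false]
```

The constant makes the "compiled once" decision visible at the definition site instead of hiding it in a one-letter flag.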
For example, when you have
fn f x = (y -> x + y)
then a sequence of invocations of f
f 1 3
f 2 6
will yield 4 and 8 respectively, but never will the second invocation yield 7 due to reusing the value of x from the first invocation. However, that is precisely what happens in the article's regex example, because the equivalent is for the closure value (y -> x + y) to be cached between invocations, so that x retains the value from the first invocation of f, regardless of whether x is a reference by name or by value.
"Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems."
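The closure comparison (f 1 3 and f 2 6) can be sketched in Ruby. The CachedF module is a hypothetical stand-in for "the closure value is cached between invocations"; it is an analogy for /o, not how Ruby implements it:

```ruby
# Fresh closure per call: each lambda captures its own x.
def f(x)
  ->(y) { x + y }
end

# Hypothetical "cached" variant: the first closure is stored and reused,
# so x keeps the value from the first invocation.
module CachedF
  @closure = nil

  def self.call(x)
    @closure ||= ->(y) { x + y }
  end
end

fresh  = [f(1).call(3), f(2).call(6)]
cached = [CachedF.call(1).call(3), CachedF.call(2).call(6)]

p fresh   # => [4, 8]
p cached  # => [4, 7]
```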
Perl’s documentation is far more clear about the consequences:
(https://perldoc.perl.org/perlop#Regexp-Quote-Like-Operators)
o Compile pattern only once.
[…]
PATTERN may contain variables, which will be
interpolated every time the pattern search is
evaluated, except for when the delimiter is a
single quote. […] Perl will not recompile the
pattern unless an interpolated variable that
it contains changes. You can force Perl to skip
the test and never recompile by adding a /o
(which stands for "once") after the trailing
delimiter. Once upon a time, Perl would recompile
regular expressions unnecessarily, and this
modifier was useful to tell it not to do so,
in the interests of speed. But now, the only
reasons to use /o are one of:
[reasons]
The bottom line is that using /o is almost
never a good idea.
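The "recompile only if an interpolated variable changes" behavior in the quoted documentation can be sketched in Ruby as a cache keyed on the interpolated string. This only illustrates the idea; the class name and the compile counter are hypothetical, and it is not a description of Perl's internals:

```ruby
# Hypothetical cache: remembers the last interpolated text and its Regexp.
class MemoizedPattern
  attr_reader :compile_count

  def initialize
    @last_source = nil
    @regex = nil
    @compile_count = 0
  end

  # Returns a Regexp for the given text, recompiling only when it changes.
  def fetch(text)
    if text != @last_source
      @last_source = text.dup
      @regex = Regexp.new(Regexp.escape(text))
      @compile_count += 1
    end
    @regex
  end
end

patterns = MemoizedPattern.new
a = patterns.fetch("alpha").match?("alphabet")
b = patterns.fetch("alpha").match?("alphabet") # same text: cached regex reused
c = patterns.fetch("beta").match?("alphabet")  # text changed: recompiled

p [a, b, c]               # => [true, true, false]
p patterns.compile_count  # => 2
```

Unlike /o, this check notices a changed value, which is why the flag buys little once the check exists.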
In the days before Perl automatically memoized the compilation of regexes with interpolation, even back in the 1990s, it said:
However, mentioning /o constitutes a promise
that you won't change the variables in the
pattern. If you change them, Perl won't even
notice.
Perl 4’s documentation is briefer. It says:
(https://github.com/Perl/perl5/blob/perl-4.0.00/perl.man#L272...)
PATTERN may contain references to scalar
variables, which will be interpolated
(and the pattern recompiled) every time the
pattern search is evaluated. […] If you want
such a pattern to be compiled only once, add
an “o” after the trailing delimiter. This
avoids expensive run-time recompilations, and
is useful when the value you are interpolating
won't change over the life of the script.
o - pretend to optimize your code, but actually introduce bugs
> With nothing else to investigate, I finally looked up the docs for what the /o regex modifier does.
I'll probably never understand this mode of thinking. But then again, Ruby programmers are, after all, people who chose to write Ruby.
> /o is referred to as “Interpolation mode”, which sounded pretty harmless.
Really? Those words sound quite alarming to me, due to personal reminiscences of eval.
Also, this whole "/o" feature seems insane. If I have an interpolation in my regex, obviously I have to re-interpolate it every time a new value is submitted, or I'd hit this very bug. And if the value is expected to be the same every time, then I can just compile it once and save the result myself, right? In which case, I probably could even do without interpolation in the first place.
That is crystal clear to me. It means that on the next execution, the new values of the interpolation will be ignored; the regexp is now "baked" with the first ones.
Like this in C++:
void fun(int arg)
{
    static int once = arg;
}
If we call this as fun(42) the first time, once gets initialized to 42. If we then call fun(73), once stays 42.
There is a function in POSIX for once-only initializations: pthread_once. C++ compilers for multithreaded environments emit thread-safe code to do something similar to pthread_once to ensure that even if there are several concurrent first invocations of the function, the initialization happens once.
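For comparison, a rough Ruby sketch of the same once-only initialization, using a Mutex so that concurrent first calls still initialize a single time. The Once class and the fun wrapper are hypothetical names for illustration, and this makes no claim about how Ruby implements /o internally:

```ruby
# Hypothetical helper: runs its block at most once, even under contention.
class Once
  def initialize
    @mutex = Mutex.new
    @done = false
    @value = nil
  end

  def call
    @mutex.synchronize do
      unless @done
        @value = yield # only the first caller's block is ever evaluated
        @done = true
      end
      @value
    end
  end
end

ONCE = Once.new

# Mirrors the C++ function: the first argument wins.
def fun(arg)
  ONCE.call { arg }
end

first  = fun(42)
second = fun(73)
p [first, second] # => [42, 42]
```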