Hackr News App

40 comments

  • fanf2

     

    2 days ago

    next

    [ - ]

    This is one of the features that Ruby cribbed directly from Perl. The Ruby documentation seems really bad, in particular “interpolation mode” is grievously misleading.

    Perl’s documentation is far more clear about the consequences:

    (https://perldoc.perl.org/perlop#Regexp-Quote-Like-Operators)

       o   Compile pattern only once.
    
      […]
    
      PATTERN may contain variables, which will be
      interpolated every time the pattern search is
      evaluated, except for when the delimiter is a
      single quote. […] Perl will not recompile the
      pattern unless an interpolated variable that
      it contains changes. You can force Perl to skip
      the test and never recompile by adding a /o
      (which stands for "once") after the trailing
      delimiter. Once upon a time, Perl would recompile
      regular expressions unnecessarily, and this
      modifier was useful to tell it not to do so,
      in the interests of speed. But now, the only
      reasons to use /o are one of:
    
      [reasons]
    
      The bottom line is that using /o is almost
      never a good idea.
    
    In the days before Perl automatically memoized the compilation of regexes with interpolation, even back in the 1990s, it said,

      However, mentioning /o constitutes a promise
      that you won't change the variables in the
      pattern. If you change them, Perl won't even
      notice.
    
    Perl 4’s documentation is briefer. It says,

    (https://github.com/Perl/perl5/blob/perl-4.0.00/perl.man#L272...)

      PATTERN may contain references to scalar
      variables, which will be interpolated
      (and the pattern recompiled) every time the
      pattern search is evaluated. […] If you want
      such a pattern to be compiled only once, add
      an “o” after the trailing delimiter. This
      avoids expensive run-time recompilations, and
      is useful when the value you are interpolating
      won't change over the life of the script.

    reply

    Johnny555

     

    1 day ago

    parent

    next

    [ - ]

    [ x ]

    <@fanf2> https://perldoc.perl.org/perlre

      o  - pretend to optimize your code, but actually introduce bugs

    reply

    giancarlostoro

     

    18 hours ago

    root

    parent

    next

    [ - ]

    [ x ]

    <@Johnny555> Just add this flag for odd versioned deployments, infinity job security

    reply
  • jononor

     

    1 day ago

    prev

    next

    [ - ]

    It looks like an emoji for someone getting bashed in the head with a long stick. So that makes sense?

    reply
  • tialaramex

     

    1 day ago

    prev

    next

    [ - ]

    This is a footgun. A language should strive not to add footguns. Every footgun you provide, somebody is going to blow their foot off with it, so that's a high price. If your language is popular it might be a lot of somebodies.

    The opposite behaviour (we have a constant regular expression, we re-use it often but the tooling doesn't realise and so it's created each time we mention it) is not a footgun, it results in poor performance, and so you might want (especially in some managed languages) to just magically optimise this case, but if not you won't cause mysterious bugs. An expert, asked "Why is this slow?" can just fix it - you have to supply basic tools for that, but this flag is not a sensible tool.

    reply

    elif

     

    1 day ago

    parent

    next

    [ - ]

    [ x ]

    <@tialaramex> Is it really though? There are tons of characters you can add to a regex that have difficult if not impossible to mentally comprehend impacts on the potential matches. That's why you need 100 test cases for every 10 characters you write in a regex. Regex itself could all be a footgun by this standard. No one in the history of no one has ever thought "why don't I just add a random character to my regex I don't need or understand"; that's just bogeyman-level irrational fear if you think this has any bearing on the ease of use of Ruby.

    reply

    stouset

     

    1 day ago

    root

    parent

    next

    [ - ]

    [ x ]

    <@elif> Regexes are not fundamentally hard. People make regexes hard by trying to parse things by sight rather than finding a spec. If you have a spec, and it can be parsed by a regular expression, they are pretty damn rote to implement.

    If you aren’t working from a specified input grammar, the task is going to be borderline impossible no matter the tool and you’re going to have a bad time. If you aren’t working with a regular grammar, this is the wrong tool for the job and again you’re going to have a bad time.

    A hint: if you find yourself using `.`, you are probably shooting yourself in the foot.

    reply

    pitched

     

    1 day ago

    root

    parent

    prev

    next

    [ - ]

    [ x ]

    <@elif> Ruby is a well-sharpened knife. Not everyone should be given a sharp knife though, especially children. And not all jobs need a sharp knife, like buttering toast. So I think it’s good for dull knives to exist as part of your tool belt. If we can only choose one language though, I’d rather it be a nimble, sharp one.

    reply

    roughly

     

    13 hours ago

    parent

    prev

    next

    [ - ]

    [ x ]

    <@tialaramex> One of my favorites was Python’s datetime.time() object evaluating to True for every value except exact midnight, which is the sort of thing that makes fine sense when you think about the underlying implementation but is absolutely going to take a toe off of someone.

    My favorite part about that one was it got to go through the full feature deprecation cycle before removal because several people argued in the bug thread about it that they were relying on that behavior in their systems.

    reply

    gpvos

     

    11 hours ago

    parent

    prev

    next

    [ - ]

    [ x ]

    <@tialaramex> In the 1990s, with the processing power of the time, /o was a reasonable compromise. The language later evolved to do the smart thing you describe, but you can't just remove features. A warning would be in order though.

    reply

    emmelaich

     

    1 day ago

    parent

    prev

    next

    [ - ]

    [ x ]

    <@tialaramex> Sometimes you want to blow your foot off.

    reply
  • riffraff

     

    2 days ago

    prev

    next

    [ - ]

    Unsurprisingly, `END {}` is also inherited from Perl, though I think it originally comes from awk.

    reply

    mdaniel

     

    2 days ago

    parent

    next

    [ - ]

    [ x ]

    <@riffraff> Similarly unsurprisingly, with its BEGIN friend https://docs.ruby-lang.org/en/3.3/syntax/miscellaneous_rdoc....

    In the spirit of "what's old is new again," PowerShell also has the same idea, and is done per Function with "begin", "process", "end", and "clean" stanzas that allow setup, teardown, for-each-item, and "finally" behavior: https://learn.microsoft.com/en-us/powershell/module/microsof...

    reply

    mananaysiempre

     

    2 days ago

    root

    parent

    next

    [ - ]

    [ x ]

    <@mdaniel> Oh, that’s an interesting take. I’ve long been looking for newer developments on Awk’s clause structure, and this seems like an interesting take (though I’m unclear on whether I can have multiple begin/end clauses, which are the best thing about Awk’s version). It also finally connects this idea to something else in my mind—specifically advice[1] and CLOS’s :before/:after/:around methods[2]. (I guess Go’s defer also counts?)

    [1] https://en.wikipedia.org/wiki/Advice_(programming)

    [2] https://gigamonkeys.com/book/object-reorientation-generic-fu...

    reply

    mdaniel

     

    2 days ago

    root

    parent

    next

    [ - ]

    [ x ]

    <@mananaysiempre> It seems not:

    Given:

        function Fred {
            begin {
                echo "hello from begin1"
            }
            begin {
                echo "hello from begin2"
            }
            process {
                echo "does the magic"
            }
        }
        $bob = @("alpha", "beta")
        $bob | Fred
    
    Then

        $ pwsh fred.ps1
        ParserError: /Users/mdaniel/fred.ps1:5
        Line |
           5 |      begin {
             |      ~~~~~~~
             | Script command clause 'begin' has already been defined.

    reply
  • cbsmith

     

    2 days ago

    prev

    next

    [ - ]

    As an old Perl programmer, I knew immediately what the /o would do. ;-)

    reply

    Amorymeltzer

     

    2 days ago

    parent

    next

    [ - ]

    [ x ]

    <@cbsmith> I've always loved the recent[1] summary from `perlre`:

    >o - pretend to optimize your code, but actually introduce bugs

    1: I still think of it as a relatively new change, but it's from 2013: <https://github.com/Perl/perl5/commit/7cf040c1f649790a4040aec...>

    reply
  • kazinator

     

    1 day ago

    prev

    next

    [ - ]

    > Modifier o means that the first time a literal regexp with interpolations is encountered, the generated Regexp object is saved and used for all future evaluations of that literal regexp.

    That is crystal clear to me. It means that on the next execution, the new values of the interpolation will be ignored; the regexp is now "baked" with the first ones.

    Like this in C++:

      void fun(int arg)
      {
         static int once = arg;
      }
    
    If we call this as fun(42) the first time, once gets initialized to 42. If we then call fun(73), once stays 42.

    There is a function in POSIX for once-only initializations: pthread_once. C++ compilers for multithreaded environments emit thread-safe code to do something similar to pthread_once to ensure that even if there are several concurrent first invocations of the function, the initialization happens once.
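
    The same "baked on first use" behaviour can be sketched in Ruby itself. This is only an illustration: the helper name `exact_match?` and the anchored pattern are assumptions, not code from the article.

```ruby
# Hypothetical helper: the name and the anchored pattern are illustrative only.
def exact_match?(needle, haystack)
  # With /o, the interpolation is evaluated only the first time this
  # literal is executed; the resulting Regexp is reused afterwards.
  haystack.match?(/\A#{Regexp.escape(needle)}\z/o)
end

first  = exact_match?("foo", "foo") # pattern is built from "foo"
second = exact_match?("bar", "bar") # still tested against the "foo" pattern
third  = exact_match?("bar", "foo") # the new needle is ignored

puts [first, second, third].inspect # prints [true, false, true]
```

    Dropping the trailing `o` should make the second call return true and the third return false, since the pattern would then be rebuilt on every call.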

    reply
  • rco8786

     

    2 days ago

    prev

    next

    [ - ]

    Love these sorts of deep dives, thanks!

    reply
  • alfiedotwtf

     

    1 day ago

    prev

    next

    [ - ]

    If you don’t like /o, you’re going to hate Perl’s /e

    reply
  • lupire

     

    2 days ago

    prev

    next

    [ - ]

    This is the same problem people have with closures, where it's unclear to the user whether the argument is captured by name or by value.

    reply

    layer8

     

    2 days ago

    parent

    next

    [ - ]

    [ x ]

    <@lupire> This isn't the same problem, because this is about whether the regex is instantiated each time the code around the regex is executed, or only the first time and cached for subsequent executions. The same could in theory happen with closures, but I haven't ever seen programming-language semantics where, for example, a function containing the definition of a closure that depends on an argument of that outer function, would use the argument value of the first invocation of the function for all subsequent invocations of the function.

    For example, when you have

        fn f x = (y -> x + y)
    
    then a sequence of invocations of f

        f 1 3
        f 2 6
    
    will yield 4 and 8 respectively, but never will the second invocation yield 7 due to reusing the value of x from the first invocation. However, that is precisely what happens in the article's regex example, because the equivalent is for the closure value (y -> x + y) to be cached between invocations, so that the x retains the value of the first invocation of f — regardless of whether x is a reference by name or by value.
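
    A rough Ruby sketch of that distinction, under the assumption that a hand-rolled cache is a fair stand-in for what /o does. The `adder` method and the `CachedAdder` module are hypothetical names made up for this example.

```ruby
# Ordinary closure: a fresh lambda captures x on every call to adder.
def adder(x)
  ->(y) { x + y }
end

# Hypothetical module that keeps the first lambda it ever built,
# mimicking what /o does to an interpolated regexp literal.
module CachedAdder
  @fn = nil

  def self.build(x)
    @fn ||= ->(y) { x + y } # later values of x are silently ignored
  end
end

puts adder(1).call(3)             # 4
puts adder(2).call(6)             # 8
puts CachedAdder.build(1).call(3) # 4
puts CachedAdder.build(2).call(6) # 7, because the cached lambda still sees x = 1
```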

    reply

    ethan_smith

     

    1 day ago

    parent

    prev

    next

    [ - ]

    [ x ]

    <@lupire> The parallel is apt, but regex /o is more like a closure that captures by value at declaration time rather than an ambiguity between capture strategies.

    reply
  • phoronixrly

     

    2 days ago

    prev

    next

    [ - ]

    It's kind of a cool feature. I like it.

    reply

    thayne

     

    2 days ago

    parent

    next

    [ - ]

    [ x ]

    <@phoronixrly> Is it? I can't think of a non-contrived case where this would actually be useful.

    And in any case where it would be useful, it seems like a better way to optimize would just be to refactor the regex out into a constant.

    reply

    naniwaduni

     

    1 day ago

    root

    parent

    next

    [ - ]

    [ x ]

    <@thayne> The context is that this is a feature cribbed straight from Perl, where it's passed down from Perl 4/pre-5.6, where compiled regexen weren't first-class values. Pretty much every use of it this century is a mistake.

    reply

    kayodelycaon

     

    1 day ago

    root

    parent

    prev

    next

    [ - ]

    [ x ]

    <@thayne> Actually, I have a way this would work well. If you’re interpolating a value that comes from configuration and wouldn’t change.

    Example: /admin@#{Rails.config.x.domain}/io

    But you’re right that a constant would be a lot more clear. “o” is a footgun.
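
    For comparison, a minimal sketch of the constant-based version. `DOMAIN` is a hypothetical stand-in for the configuration value, and `admin_address?` is an invented helper name.

```ruby
# Hypothetical stand-in for a configuration value that never changes
# after boot; a real application would read it from its own config.
DOMAIN = "example.com"

# Built once, when the constant is assigned. Regexp.escape keeps the "."
# in the domain from matching any character.
ADMIN_ADDRESS = /admin@#{Regexp.escape(DOMAIN)}/i

def admin_address?(text)
  ADMIN_ADDRESS.match?(text)
end

puts admin_address?("Admin@Example.com") # true
puts admin_address?("admin@exampleXcom") # false
```

    The regexp is still compiled only once, but the point at which the value is frozen is visible in the code rather than hidden in a one-letter flag.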

    reply

    baobun

     

    1 day ago

    root

    parent

    prev

    next

    [ - ]

    [ x ]

    <@thayne> An HTTP application server matching routes based on runtime configuration (domains and whatnot) is not really that niche or contrived? Loads of other situations where input not changing during the thread/process lifecycle is part of a set of hot regexes large enough that explicitly compiling each is not a great experience.

    I, for one, appreciate /o.
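
    One possible middle ground for the many-hot-regexes case, sketched under assumptions: an explicit cache keyed by the interpolated value, so each distinct input is compiled once and a changed input gets its own pattern. `ROUTE_PATTERNS` and `route_matches?` are hypothetical names, and the URL pattern is only illustrative.

```ruby
# Hypothetical cache: one compiled Regexp per distinct host string.
# The block runs only when a key is missing, so each host is compiled once.
ROUTE_PATTERNS = Hash.new do |cache, host|
  cache[host] = %r{\Ahttps?://#{Regexp.escape(host)}(?:/|\z)}
end

def route_matches?(host, url)
  ROUTE_PATTERNS[host].match?(url)
end

puts route_matches?("a.example", "https://a.example/x") # true
puts route_matches?("b.example", "https://a.example/x") # false
puts ROUTE_PATTERNS.size                                # 2
```

    This sketch makes no attempt at thread safety or eviction; whether that matters depends on the server.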

    reply
  • jwlake

     

    1 day ago

    prev

    next

    [ - ]

    this is similar to the g modifier in javascript?

    reply

    Lio

     

    22 hours ago

    parent

    next

    [ - ]

    [ x ]

    <@jwlake> No, g is the global modifier, so it gives you multiple matches rather than stopping on the first match encountered.

    reply
  • zer00eyz

     

    2 days ago

    prev

    next

    [ - ]

    I'm sorry but the classics never go out of style:

    "Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems."

    reply

    stavros

     

    2 days ago

    parent

    next

    [ - ]

    [ x ]

    <@zer00eyz> Yeah but it's kind of tired when it's being used every time someone makes a mistake with regex. I've used them extensively in my career and never once regretted it.

    reply

    apgwoz

     

    2 days ago

    root

    parent

    next

    [ - ]

    [ x ]

    <@stavros> The problem with regexps is that “Sometimes a smart person, who has done the work, and knows how to leverage regular expressions correctly, decides they are appropriate for solving a problem where there is shared maintenance. Now, you have people who haven’t put in the work, and have been told repeatedly through ‘witty quips’ to not bother.”

    reply

    jodrellblank

     

    2 days ago

    parent

    prev

    next

    [ - ]

    [ x ]

    <@zer00eyz> The second problem being how to deal with all the extra time they just freed up?

    reply
  • IshKebab

     

    1 day ago

    prev

    next

    [ - ]

    Seems par for the course for Ruby.

    reply
  • Joker_vD

     

    2 days ago

    prev

    [ - ]

    > I didn’t recognize /o. It didn’t seem critically important to lookup yet.

    > With nothing else to investigate, I finally looked up the docs for what the /o regex modifier does.

    I'll probably never understand this mode of thinking. But then again, Ruby programmers are, after all, people who chose to write Ruby.

    > /o is referred to as “Interpolation mode”, which sounded pretty harmless.

    Really? Those words sound quite alarming to me, due to personal reminiscences of eval.

    Also, this whole "/o" feature seems insane. If I have an interpolation in my regex, obviously I have to re-interpolate it every time a new value is submitted, or I'd hit this very bug. And if the value is expected to be the same every time, then I can just compile it once and save the result myself, right? In which case, I probably could even do without interpolation in the first place.

    reply

    gpvos

     

    1 day ago

    parent

    next

    [ - ]

    [ x ]

    <@Joker_vD> It's a feature dating from the 1990s, when Perl (and I guess Ruby?) didn't have a way for the user to store a compiled regex, and this was a useful shortcut for a very specific optimization, which Ruby documented badly. Perl (and I guess Ruby?) later evolved in a way that made /o unnecessary, but the (now mis)feature remained.

    reply

    apgwoz

     

    2 days ago

    parent

    prev

    [ - ]

    [ x ]

    <@Joker_vD> “Compilation”, I think, is exactly right. This feature is less about interpolation than it is about compilation of a single regexp to be used many times. It’s just shrouded in confusing documentation that should say: “/o tells ruby to rewrite this code such that it refers to a new statically allocated regexp object.” And when you write it that way, you see how insane it is for a function call to be hoisted automatically like this, without an explicit, obvious, syntactic annotation.
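
    The "single object" reading can be checked from the outside by comparing object identity. This is a sketch of observable behaviour only, not a claim about how the interpreter implements it; `with_o` and `without_o` are hypothetical helper names.

```ruby
# Hypothetical helpers that differ only in the /o modifier.
def with_o(text)
  /#{text}/o
end

def without_o(text)
  /#{text}/
end

a = with_o("x")
b = with_o("y")
puts a.equal?(b) # true: the second call hands back the very same Regexp object
puts b.source    # x

c = without_o("x")
d = without_o("y")
puts c.equal?(d) # false: a new Regexp is built on each call
puts d.source    # y
```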

    reply

    gpvos

     

    1 day ago

    root

    parent

    [ - ]

    [ x ]

    <@apgwoz> The implications of "statically allocated" are less clear than if you'd just write "compiled only once".

    reply