Using Variables In Perl Regular Expressions

Discussion in 'Perl' started by pradeep, Sep 7, 2007.

    I am assuming from here on that you are familiar with the substitution operator in Perl: s///. A basic example:
    Code:
       $str =~ s/apple/orange/;
       
    would replace the first occurrence of "apple" with "orange". The separator "/" used in this example can be replaced with almost any other non-alphanumeric, non-whitespace character. The catch is that you have to escape the separator character wherever it appears inside your regular expression, so it is a better idea to use a less common character than "/" as the separator. I prefer "!", because it is rare in strings and visually it is a good separator. The same substitution could be written as:
    Code:
       $str =~ s!apple!orange!;
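
To see why the choice of separator matters, here is a minimal sketch that rewrites a path full of slashes. The path and the variable names are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical sample data; the path is made up for this example.
my $path = '/usr/local/bin/perl';

# With "/" as the separator, every "/" in the pattern and the
# replacement needs a backslash.
(my $with_slash = $path) =~ s/\/usr\/local/\/opt/;

# With "!" as the separator, the slashes need no escaping.
(my $with_bang = $path) =~ s!/usr/local!/opt!;

# Bracket pairs can also be used as separators.
(my $with_braces = $path) =~ s{/usr/local}{/opt};

# All three variables now hold /opt/bin/perl
print "$with_slash\n$with_bang\n$with_braces\n";
```

All three forms do the same thing; only the amount of escaping differs.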
       
    A common mistake when using regular expressions is to put a variable directly into the pattern and expect its contents to be matched as literal text.

    Example:
    Code:
       $data =~ s!$url!http://go4expert.com!;
       
    This works properly most of the time. But sometimes it won't behave as expected, or you will get occasional run-time errors. For example, if your $url is equal to http://yahoo.com/do.cgi?action=go++&tell=perl, the substitution operator fails and exits with an error message:

    Code:
     "/http://yahoo.com/do.cgi?action=go++&tell=perl/: nested *?+ in regex..."
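
The "won't behave as expected" case can happen with no error message at all. As a rough sketch, using a made-up, shorter URL fragment and hypothetical variable names:

```perl
use strict;
use warnings;

# Made-up URL fragment containing the metacharacters "." and "?".
my $url  = 'do.cgi?action=go';
my $data = 'visit do.cgi?action=go today';

# Interpolated as-is, "." means "any character" and "?" makes the
# preceding "i" optional, so the pattern does not match its own text...
my $matches_itself = ($data =~ /$url/) ? 1 : 0;

# ...but it does match a string that never contained the URL.
my $matches_other = ('doXcgaction=go' =~ /$url/) ? 1 : 0;

print "matches itself: $matches_itself, matches other: $matches_other\n";
# prints "matches itself: 0, matches other: 1"
```
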
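    The reason for the error is that "+" is a quantifier, so "++" cannot be used as literal text inside a regular expression; it has to be escaped. (Perl 5.10 and later read "++" as a possessive quantifier, so there the pattern compiles but silently fails to match the URL.) The variable might include several other special characters, such as "." and "?", which have to be escaped as well. The correct way to implement this substitution is: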

    Code:
       $temp = quotemeta($url);    # escape the regex metacharacters in $url
       $data =~ s!$temp!http://go4expert.com!;
       
    quotemeta() is a built-in Perl function. It returns a copy of its argument with a backslash in front of every ASCII character that is not a letter, a digit or an underscore.
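
The same escaping can also be done inline with \Q and \E, which apply quotemeta() to whatever is interpolated between them. The original post does not use this form; it is shown as an alternative sketch, and the sample $data string is made up.

```perl
use strict;
use warnings;

my $url  = 'http://yahoo.com/do.cgi?action=go++&tell=perl';
# Made-up sample text that contains the URL.
my $data = 'See http://yahoo.com/do.cgi?action=go++&tell=perl for details.';

# \Q ... \E quotes the interpolated text, so no temporary variable is needed.
(my $inline = $data) =~ s!\Q$url\E!http://go4expert.com!;

# The same substitution with an explicit quotemeta() call.
my $temp = quotemeta($url);
(my $explicit = $data) =~ s!$temp!http://go4expert.com!;

print "$inline\n";     # prints "See http://go4expert.com for details."
print "$explicit\n";   # prints the same line
```

The inline form keeps the escaping next to the pattern that needs it. The explicit call is handy when the escaped string is reused in several patterns.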
     
