String regex help needed

Discussion in 'Perl' started by vecinity, Sep 27, 2008.

  1. vecinity

    Hey guyz,
    I'm kinda new to Perl, and I'm now reading a tutorial explaining about regex and string handling, and I find it really interesting.
    While practicing I tried to write a short program that locates all the "Password: Something" strings in an input file, but I came across a problem I couldn't solve.
    Program No.1 - Works Fine:
    #!/usr/bin/perl -w
    use strict;
    my $line = "This is my password: rabbiT and this is your password: hOLe";
    while ($line=~m/[^P|p]*([P|p]assword:\s\w+)\b/g) {
    print $1;
    }

    Program No.2 - Doesn't Work:
    #!/usr/bin/perl -w
    use strict;
    my $line;
    while ($line = <>)
    {
    while($line=~m/[^P|p]*([P|p]assword:\s\w+)\b/g)
    {
    print $1;
    }
    }

    Input file looks like this:
    yes it's me
    writing a massage
    help me find
    the password: kinG
    which
    is pass for my password:
    mine

    The thing is, when I enter each sentence on its own manually it works fine,
    but why doesn't it work when I read it from an input file?

    I would really appreciate any help I can get. THANK YOU!
     
  2. oogabooga

    Firstly, you should just say [Pp] instead of [P|p];
    the vertical bar is not needed (and is in fact taken literally)
    inside the square brackets.
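
    To see the difference for yourself, here is a minimal sketch. The sub names matches_loose and matches_tight are made up for this illustration, and so are the test strings; the point is only that "|" counts as a member of the class.

```perl
use strict;
use warnings;

# Inside [...] the bar is an ordinary character, so [P|p] is a
# three-member class: 'P', '|' and 'p'.
sub matches_loose { my ($s) = @_; return $s =~ /^[P|p]assword:/ ? 1 : 0 }
sub matches_tight { my ($s) = @_; return $s =~ /^[Pp]assword:/  ? 1 : 0 }

# The third string starts with a literal bar; only the loose class accepts it.
for my $s ('Password: x', 'password: x', '|assword: x') {
    printf "%-12s  [P|p]=%d  [Pp]=%d\n", $s, matches_loose($s), matches_tight($s);
}
```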

    Secondly, you do not need the initial [^P|p] (or [^Pp]).
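
    The reason is that an unanchored match may start anywhere in the string, so there is nothing for the prefix to skip. A quick sketch on the string from Program No.1 (the helper name captures is hypothetical):

```perl
use strict;
use warnings;

# Hypothetical helper: every capture of $re in $text (list-context /g match).
sub captures {
    my ($re, $text) = @_;
    return $text =~ /$re/g;
}

my $line = "This is my password: rabbiT and this is your password: hOLe";

# Same pattern with and without the leading [^Pp]* prefix.
my @with_prefix    = captures(qr/[^Pp]*([Pp]assword:\s\w+)\b/, $line);
my @without_prefix = captures(qr/([Pp]assword:\s\w+)/,         $line);

print join(", ", @with_prefix),    "\n";
print join(", ", @without_prefix), "\n";
```

    Both lines of output list the same two matches.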

    Thirdly, the reason your second program doesn't work is because
    you are only looking at one line at a time, but "password:"
    and "mine" are on different lines. The solution would be something
    like this:
    Code:
    #!/usr/bin/perl -w
    use strict;
    undef $/;  # enable whole file reading mode
    my $whole_file = <>; # read whole file
    # now globally search $whole_file (with case-insensitivity)
    while( $whole_file =~ /password:\s*(\w+)/gi ) {
      print "$1\n";
    }
    Finally, remember to use code blocks (as I did above) when posting code to the site.
    Check out the tags you can use.
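
    One caveat about the code above: "undef $/" changes the input separator for the rest of the program. If that matters, a scoped variant is possible. This is only a sketch; slurp_fh and passwords_in are hypothetical helper names, and an in-memory filehandle stands in for the real input file.

```perl
use strict;
use warnings;

# Hypothetical helper: slurp one filehandle without changing $/ globally.
sub slurp_fh {
    my ($fh) = @_;
    local $/;               # undefined only until this sub returns
    my $text = <$fh>;       # with $/ undefined, one read returns everything
    return $text;
}

# Hypothetical helper: every word that follows "password:" in $text.
sub passwords_in {
    my ($text) = @_;
    return $text =~ /password:\s*(\w+)/gi;
}

# Sample text modelled on the input file from the first post.
my $sample = "the password: kinG\nwhich\nis pass for my password:\nmine\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my @found = passwords_in(slurp_fh($fh));
close $fh;

print "$_\n" for @found;
```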
     
  3. pradeep

    Reading the whole file into a variable is not a good idea (btw, it's called slurping!). You can read it line by line instead:
    Code:
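    #!/usr/bin/perl -w
    use strict;
    
    # Perl built-ins are case-sensitive: it is open(), not OPEN()
    open(my $fh, '<', "myFile.txt") or die "Cannot open myFile.txt: $!";
    while(<$fh>)
    {
        # no leading ^, so "password:" may appear anywhere in the line
        if($_ =~ /password:\s*(.+)$/i)
        {
            print $1,"\n";
        }
    }
    close($fh);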
     
     
  4. vecinity

    Hey guyz, first of all thanks for your help.
    ooga, thanks for the tags notice.
    Second, I intentionally wrote "password:" twice: once with its value on the same line, and once with the value on the following line.
    The program as you wrote it still won't work; it prints only "which".
    pradeep, the compiler says it can't find the file even though both are in the same directory.
    Should I pass anything on the command line when running the .pl file?

    thanks ahead guyz.
     
  5. oogabooga

    Whether slurping the file is a good idea or not depends upon the situation. For example, I find it useful when retrieving and searching webpages, since they are usually small enough to fit easily into memory. Since the poster's example requires searching across lines, and his target file is presumably relatively small (say less than 100K), slurping is a reasonable option. In a search that can span lines, looping on the line-input operator instead of slurping requires extra logic on the programmer's part, which is technically against the Perl philosophy that laziness is good! Here's a version using line-by-line input:
    Code:
    # Usage: paswrd file1 [file2 ...]
    use warnings; use strict;
    
    my $found_password = 0;
    
    while( <DATA> ) { # replace <DATA> with <> for non-testing mode
        if( $found_password ) {
            # grab first word
            if( /^\s*(\w+)/ ) {
                print "$1\n";
                $found_password = 0;
            }
        }
        # "password:" and "word" on same line
        if( /password\s*:\s*(\w+)/i ) {
            print "$1\n";
        }
        # "password:" at the end of a line
        elsif( /password\s*:\s*$/i ) {
            $found_password = 1;
        }
    }
    
    __DATA__ # Test data
    password:JUST
    xxx Password: 
    SAYING xxx
    x xx xxx xxxx
    xxx paSSword : HELLO xxxx
    xx xxx passworD  :
    
    WORLD xxxx
    xxx
    
     
  6. oogabooga

    The programs seem to be working okay.
    Make sure you are in the right directory and maybe check your environment.
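
    A relative name like "myFile.txt" is looked up in the directory you run the script from, which is not necessarily the directory the .pl file lives in. Here is a small diagnostic sketch for checking that; describe_lookup is a hypothetical helper name, and myFile.txt is just the name used earlier in this thread.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Basename qw(dirname);
use File::Spec;

# Hypothetical helper: report where a relative file name would be looked up.
sub describe_lookup {
    my ($name) = @_;
    my $path = File::Spec->rel2abs($name);   # resolved against the current directory
    return sprintf "%s: %s", $path, (-e $path ? "found" : "not found");
}

print "Current directory: ", getcwd(), "\n";
print "Script directory:  ", dirname(File::Spec->rel2abs($0)), "\n";

# Check the names given on the command line, or myFile.txt if none were given.
print describe_lookup($_), "\n" for (@ARGV ? @ARGV : ('myFile.txt'));
```

    For the programs that read with <>, the input file is named on the command line, e.g. "perl paswrd.pl input.txt" (both names here are placeholders).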

    Hmmm, for some reason I cannot edit my previous post.
    I was going to fix up the code a little like so:
    Code:
    # Usage: paswrd file1 [file2 ...]
    use warnings; use strict;
    
    my $found_password = 0;
    
    while( <DATA> ) { # (replace <DATA> with <> for non-testing mode)
        if( $found_password ) {
            # grab first word
            if( /\s*(\w+)/ ) {
                print "$1\n";
                $found_password = 0;
            }
        }
        # "password:" and word on same line
        while( /\s*password\s*:\s*(\w+)/gi ) {
            print "$1\n";
        }
        # "password:" at the end of a line
        if( /\s*password\s*:\s*$/i ) {
            $found_password = 1;
        }
    }
    
    __DATA__ # Test data
    password:ONE
    Try across lines:
    xxx password : 
    TWO xxx
    Try more than one on a line:
    xx password: THREE x password: FOUR x password : 
    FIVE xxxx
    Try a blank line in between:
    xx xxx Password: 
    
    SIX xx
    
    Compare that to using "file slurping" for the same result with the same test data:
    Code:
    #!/usr/bin/perl -w
    use strict;
    undef $/;  # enable whole file reading mode
    my $whole_file = <DATA>; # read whole file (use <> if not testing)
    # now globally search $whole_file (with case-insensitivity)
    while( $whole_file =~ /\s*password\s*:\s*(\w+)/gi ) {
      print "$1\n";
    }
    __DATA__ # Test data
    password:ONE
    Try across lines:
    xxx password : 
    TWO xxx
    Try more than one on a line:
    xx password: THREE x password: FOUR x password : 
    FIVE xxxx
    Try a blank line in between:
    xx xxx Password: 
    
    SIX xx
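
    If you want to convince yourself that the two approaches agree, they can be run side by side. This is only a sketch: scan_lines, scan_slurped and fh_for are hypothetical names, the logic is a condensed restatement of the two programs above, and the data is the same test data without the "Try ..." comment lines.

```perl
use strict;
use warnings;

# Line-by-line scan: remembers a dangling "password:" across lines.
sub scan_lines {
    my ($fh) = @_;
    my ($pending, @found) = (0);
    while ( my $line = <$fh> ) {
        if ( $pending && $line =~ /(\w+)/ ) {   # first word after a dangling "password:"
            push @found, $1;
            $pending = 0;
        }
        push @found, $line =~ /password\s*:\s*(\w+)/gi;   # all pairs on this line
        $pending = 1 if $line =~ /password\s*:\s*$/i;     # "password:" ends the line
    }
    return @found;
}

# Slurping scan: one global match over the whole text.
sub scan_slurped {
    my ($fh) = @_;
    local $/;                 # slurp mode, limited to this sub
    my $text = <$fh>;
    return $text =~ /password\s*:\s*(\w+)/gi;
}

# Hypothetical helper: open an in-memory filehandle on a string.
sub fh_for {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    return $fh;
}

my $data = join "\n",
    'password:ONE',
    'xxx password : ',
    'TWO xxx',
    'xx password: THREE x password: FOUR x password : ',
    'FIVE xxxx',
    'xx xxx Password: ',
    '',
    'SIX xx',
    '';

my @by_line = scan_lines(fh_for($data));
my @slurped = scan_slurped(fh_for($data));
my $verdict = "@by_line" eq "@slurped" ? "same result" : "results differ";

print "line-by-line: @by_line\n";
print "slurped:      @slurped\n";
print "$verdict\n";
```

    The two agree on this data, but they are not equivalent in general: after a dangling "password:", the line-by-line version takes the first word anywhere on the next non-blank line, while the slurping regex allows only whitespace in between.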
    
     
    Last edited: Sep 29, 2008
