vecinity 27Sep2008 18:57

String regex help needed
 
Hey guyz,
I'm kinda new to Perl, and I'm now reading a tutorial explaining about regex and string handling, and I find it really interesting.
While practicing, I tried to write a short program that locates all the "Password: Something" strings in an input file, but I came across a problem I couldn't solve.
Program No.1 - Works Fine:
#!/usr/bin/perl -w
use strict;
my $line = "This is my password: rabbiT and this is your password: hOLe";
while ($line=~m/[^P|p]*([P|p]assword:\s\w+)\b/g) {
print $1;
}

Program No.2 - Doesn't Work:
#!/usr/bin/perl -w
use strict;
my $line;
while ($line = <>)
{
while($line=~m/[^P|p]*([P|p]assword:\s\w+)\b/g)
{
print $1;
}
}

The input file looks like this:
yes it's me
writing a massage
help me find
the password: kinG
which
is pass for my password:
mine

The thing is, when I enter each sentence on its own manually it works fine,
but why doesn't it work when I read it from an input file?

I would really appreciate any help I can get. THANK YOU!

oogabooga 27Sep2008 20:48

Re: String regex help needed
 
Firstly, you should just say [Pp] instead of [P|p];
the vertical bar is not needed (and is in fact taken literally)
inside the square brackets.
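To see the difference, here is a small sketch. The helper name `matches` and the test strings are made up for illustration; they are not from the original post. It shows that [P|p] also accepts a literal "|" in front of "assword:", while [Pp] does not.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $string matches $regex, else 0.
sub matches {
    my ($regex, $string) = @_;
    return $string =~ $regex ? 1 : 0;
}

# Made-up test strings; the last one starts with a literal vertical bar.
my @samples = ('Password: x', 'password: x', '|assword: x');

for my $s (@samples) {
    printf "%-12s  [P|p]: %d  [Pp]: %d\n",
        $s,
        matches(qr/[P|p]assword:/, $s),
        matches(qr/[Pp]assword:/,  $s);
}
```

Both classes match the first two strings; only the class containing the bar matches the third.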

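Secondly, you do not need the initial [^P|p]* (or [^Pp]*): the match is not anchored, so the regex engine already skips ahead to wherever "password:" begins.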

Thirdly, the reason your second program doesn't work is that
you are only looking at one line at a time, but "password:"
and "mine" are on different lines. The solution would be something
like this:
Code:

#!/usr/bin/perl -w
use strict;
undef $/;  # enable whole file reading mode
my $whole_file = <>; # read whole file
# now globally search $whole_file (with case-insensitivity)
while( $whole_file =~ /password:\s*(\w+)/gi ) {
  print "$1\n";
}
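
A possible variation on the slurp approach, offered as a sketch rather than a required change: `undef $/` alters the input record separator for the rest of the program, whereas `local $/` limits the change to one block. The sub names `slurp` and `find_passwords` are invented for this example, and the sample text is copied from the input file shown in the question and read through an in-memory filehandle so the sketch runs without a real file.

```perl
use strict;
use warnings;

# Read everything from a filehandle. $/ is undefined only inside
# this sub and is restored automatically when the sub returns.
sub slurp {
    my ($fh) = @_;
    local $/;
    my $text = <$fh>;
    return $text;
}

# In list context, return every word that follows "password:"
# (case-insensitive; \s* lets the word sit on the next line).
sub find_passwords {
    my ($text) = @_;
    return $text =~ /password:\s*(\w+)/gi;
}

# Sample input taken from the question, used as an in-memory file.
my $sample = "the password: kinG\nwhich\nis pass for my password:\nmine\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
print "$_\n" for find_passwords(slurp($fh));
close $fh;
```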

Finally, remember to use code blocks (as I did above) when posting code to the site.
Check out the tags you can use.

pradeep 28Sep2008 11:09

Re: String regex help needed
 
Reading the whole file into a variable is not a good idea. By the way, it's called slurping!
Code: Perl

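#!/usr/bin/perl -w
use strict;

# open is lowercase in Perl; report the reason if it fails
open(my $fh, '<', "myFile.txt") or die "Cannot open myFile.txt: $!";
while (<$fh>)
{
    # no ^ anchor, so "password:" may appear anywhere on the line
    if ($_ =~ /password:\s*(.+)$/i)
    {
        print $1, "\n";
    }
}
close($fh);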


vecinity 28Sep2008 21:19

Re: String regex help needed
 
Hey guyz, first of all thanks for your help.
ooga, thanks for the note about the tags.
Second, I intentionally wrote "password:" twice: once with its value on the same line, and once with the value on the following line.
The program as you wrote it still doesn't work; it prints only "which".
pradeep, the compiler says it can't find the file even though both are in the same directory.
Should I pass anything on the command line when running the .pl file?

thanks ahead guyz.

oogabooga 28Sep2008 21:33

Re: String regex help needed
 
Whether slurping the file is a good idea or not depends upon the situation. For example, I find it useful when retrieving and searching webpages, since they are usually small enough to fit easily into memory. Since the poster's example requires searching across lines, and his target file is presumably relatively small (say less than 100K), slurping is a reasonable option.

In a search that can span lines, looping on the line-input operator instead of slurping requires extra logic on the programmer's part, which is technically against the Perl philosophy that laziness is good! Here's a version using line-by-line input:
Code:

# Usage: paswrd file1 [file2 ...]
use warnings; use strict;

my $found_password = 0;

while( <DATA> ) { # replace <DATA> with <> for non-testing mode
    if( $found_password ) {
        # grab first word
        if( /^\s*(\w+)/ ) {
            print "$1\n";
            $found_password = 0;
        }
    }
    # "password:" and "word" on same line
    if( /password\s*:\s*(\w+)/i ) {
        print "$1\n";
    }
    # "password:" at the end of a line
    elsif( /password\s*:\s*$/i ) {
        $found_password = 1;
    }
}

__DATA__ # Test data
password:JUST
xxx Password:
SAYING xxx
x xx xxx xxxx
xxx paSSword : HELLO xxxx
xx xxx passworD  :

WORLD xxxx
xxx


oogabooga 29Sep2008 20:07

Re: String regex help needed
 
Quote:

Originally Posted by vecinity
ooga, the program as you wrote it still won't work, it prints only "which". pradeep, the compiler says it cant find the file even if both are in the same directory.

The programs seem to be working okay.
Make sure you are in the right directory and maybe check your environment.
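
One way to check this is sketched below. The sub name `describe_open` is made up for illustration, and `myFile.txt` is only the name used earlier in the thread. The script tries to open the file and, on failure, prints the system error together with the directory Perl is actually running in, which often reveals a path mix-up.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);

# Hypothetical helper: try to open $path for reading and describe the result.
sub describe_open {
    my ($path) = @_;
    if (open my $fh, '<', $path) {
        close $fh;
        return "ok: opened $path";
    }
    # $! holds the reason the open failed (e.g. "No such file or directory")
    return "failed: $path: $! (current directory is " . getcwd() . ")";
}

# Use the first command-line argument if given, else the thread's file name.
my $file = @ARGV ? $ARGV[0] : 'myFile.txt';
print describe_open($file), "\n";
```

Run it from the same shell and directory you use for the real script, for example `perl check.pl myFile.txt` (the script name `check.pl` is also just an assumption).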

Hmmm, for some reason I cannot edit my previous post.
I was going to fix up the code a little like so:
Code:

# Usage: paswrd file1 [file2 ...]
use warnings; use strict;

my $found_password = 0;

while( <DATA> ) { # (replace <DATA> with <> for non-testing mode)
    if( $found_password ) {
        # grab first word (no /g here, so the search below starts at the line start)
        if( /\s*(\w+)/ ) {
            print "$1\n";
            $found_password = 0;
        }
    }
    # "password:" and word on same line (possibly several times)
    while( /\s*password\s*:\s*(\w+)/gi ) {
        print "$1\n";
    }
    # "password:" at the end of a line
    if( /\s*password\s*:\s*$/i ) {
        $found_password = 1;
    }
}

__DATA__ # Test data
password:ONE
Try across lines:
xxx password :
TWO xxx
Try more than one on a line:
xx password: THREE x password: FOUR x password :
FIVE xxxx
Try a blank line in between:
xx xxx Password:

SIX xx

Compare that to using "file slurping" for the same result with the same test data:
Code:

#!/usr/bin/perl -w
use strict;
undef $/;  # enable whole file reading mode
my $whole_file = <DATA>; # read whole file (use <> if not testing)
# now globally search $whole_file (with case-insensitivity)
while( $whole_file =~ /\s*password\s*:\s*(\w+)/gi ) {
  print "$1\n";
}
__DATA__ # Test data
password:ONE
Try across lines:
xxx password :
TWO xxx
Try more than one on a line:
xx password: THREE x password: FOUR x password :
FIVE xxxx
Try a blank line in between:
xx xxx Password:

SIX xx
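
To check the "same result" claim, here is a rough sketch that wraps both approaches in subs and runs them on the test data above. The sub names `scan_lines` and `scan_slurped` are invented for this example, and the line-by-line logic is a condensed restatement of the loop above, not the exact posted code. It is only claimed to agree on this test data; unusual inputs may differ.

```perl
use strict;
use warnings;

# Line-by-line scan: remembers a trailing "password:" across lines.
sub scan_lines {
    my @lines = @_;
    my $pending = 0;
    my @found;
    for my $line (@lines) {
        # A previous line ended with "password:", so take the first word.
        if ($pending && $line =~ /(\w+)/) {
            push @found, $1;
            $pending = 0;
        }
        # Every "password: word" pair on this line.
        push @found, $line =~ /password\s*:\s*(\w+)/gi;
        # "password:" at the end of the line: word is on a later line.
        $pending = 1 if $line =~ /password\s*:\s*$/i;
    }
    return @found;
}

# Slurped scan: one global match over the whole text.
sub scan_slurped {
    my ($text) = @_;
    return $text =~ /password\s*:\s*(\w+)/gi;
}

# The test data from the posts above.
my $data = <<'END';
password:ONE
Try across lines:
xxx password :
TWO xxx
Try more than one on a line:
xx password: THREE x password: FOUR x password :
FIVE xxxx
Try a blank line in between:
xx xxx Password:

SIX xx
END

my @by_line = scan_lines(split /^/, $data);   # split /^/ keeps the newlines
my @slurped = scan_slurped($data);
print "line-by-line: @by_line\n";
print "slurped:      @slurped\n";
print "@by_line" eq "@slurped" ? "same\n" : "different\n";
```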


