Testing Filenames in Perl

Discussion in 'Perl' started by pradeep, Jan 15, 2007.

  1. pradeep

    pradeep Team Leader

    Joined:
    Apr 4, 2005
    Messages:
    1,645
    Likes Received:
    87
    Trophy Points:
    0
    Occupation:
    Programmer
    Location:
    Kolkata, India
    Home Page:
    http://blog.pradeep.net.in
    When your script writes to a new file, you probably want to give that file a new, unique name, one that doesn't conflict with any existing file, which would otherwise be overwritten. One way to create a unique file name is to incorporate the process ID and the time into the name. Perl's special variable $$ holds the current PID, and $^T holds the time at which the script started running (in seconds since January 1, 1970). So you could use something like $filename = $$ . $^T . ".html"; This alone does not guarantee uniqueness: there are only a finite number of process IDs, which are recycled, and your script could be run twice within the same second. Also, because $^T is fixed when the script starts, a single process that builds two names this way gets the same name both times.

    This also results in ugly filenames, something like "213031168857899.html". If you insist on prettier names, you can test whether a file with the proposed name already exists, using Perl's -e operator: -e $file_name is true if a file with that name already exists. In the example below, the variable $text holds some key text taken from the contents of the file, which we want to use in the name. We're also assuming that the script is writing a web page, so we add ".html" as the extension.

    Code:
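    $file_name = $your_chosen_dir . $text . ".html";
    if ( -e $file_name )
    {
        ## do something to make it different, like
        ## substitute pidtime.html for html at the end
        ## (build the suffix first so $$ and $^T are not misread inside the regex)
        my $pidtime = $$ . $^T;
        $file_name =~ s/html$/$pidtime.html/;
    }
    So you'll get mostly pretty filenames, with an ugly one only occasionally, when the pretty name is already taken.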
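The check-then-rename idea above can be wrapped in a small helper. This is only a sketch: the name choose_file_name and the use of File::Spec to join the directory and file name are assumptions for illustration, not part of the original post.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: prefer the readable name, and fall back to a
# pid+time suffix only when that name is already taken.
sub choose_file_name {
    my ($dir, $text) = @_;

    my $file_name = File::Spec->catfile($dir, "$text.html");
    return $file_name unless -e $file_name;

    # $$ is the PID, $^T is the script start time (seconds since 1970).
    my $pidtime = $$ . $^T;
    $file_name =~ s/\.html$/.$pidtime.html/;
    return $file_name;
}
```

Calling choose_file_name(".", "my-page") returns a path ending in my-page.html as long as no such file exists yet. The fallback name is not checked again here, so it can still collide in the cases described above.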
     
  2. oleber

    oleber New Member

    Joined:
    Apr 23, 2007
    Messages:
    37
    Likes Received:
    2
    Trophy Points:
    0
    Occupation:
    Software Developer (Perl, C/C++ and Java)
    Location:
    Hamburg, Germany
    Home Page:
    http://oleber.freehostia.com/
    Often, names that follow no obvious logic are needed; MD5 can be used for that.

    I'm using something like:

    Code:
    use strict;
    use Digest::MD5 qw(md5_hex);
    
    my $secretText = "xpto";
    my $counter = 0;
    my $file_name = "";
    do { 
    	$file_name = md5_hex($$.$^T.$secretText.($counter++)).".html";
    } while -e $file_name;
    
    The $secretText helps make the identifier unpredictable, since only you know it.
    The $counter forces a different input text on every iteration.
     
  3. pradeep

    pradeep Team Leader

    That's a really cool idea too, but most of the time we need the files to be named according to some logic!
     
  4. oleber

    oleber New Member

    Logical parts can be added to the name ;)

    But the important part is the loop that finds an acceptable name. Probably (I don't work with the web) many web servers use one process for multiple requests, so the PID will be the same for those requests. Adding a counter can help.
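As a rough sketch of how a logical part and a hashed part could be combined in one name: the helper name readable_unique_name, the slug rules, and the 8-character hash prefix are all assumptions chosen for illustration.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

my $counter = 0;

# Hypothetical helper: readable label first, short hash tag second.
sub readable_unique_name {
    my ($label) = @_;

    # Turn the label into a filename-friendly slug, e.g. "Monthly Report" -> "monthly-report".
    (my $slug = lc $label) =~ s/[^a-z0-9]+/-/g;
    $slug =~ s/^-+|-+$//g;

    while (1) {
        # The counter changes the hash input on every attempt, even when
        # the PID and start time stay the same.
        my $tag  = substr(md5_hex($$ . $^T . $counter++), 0, 8);
        my $name = "$slug-$tag.html";
        return $name unless -e $name;
    }
}
```

Each call yields a name of the form slug-xxxxxxxx.html, so the file stays recognisable while the loop still guards against names that already exist in the current directory.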
     
  5. pradeep

    pradeep Team Leader

    Yes, quite right, web servers like Apache reuse processes or threads to serve multiple requests. Your code is perfect except for the hashing.
    I think the timestamp, PID and counter are enough to make the filename unique!
     
  6. pradeep

    pradeep Team Leader

    We can make a subroutine out of this logic, like this:

    Code:
     sub getFreeFilename
     {
         my $counter = 0;
         my $file_name = "";
         do { 
             $file_name = sprintf("%s%s%s.html",$$,$^T,$counter++);
         } while -e $file_name;
     
         return $file_name;
     }
     
     
  7. oleber

    oleber New Member

    Even better, let's make $counter persistent, like a global variable.

    Since all of the code is wrapped in { }, $counter is visible only inside the { } and keeps its value for as long as the subroutine exists (more or less like a static variable in C++). So the loop will rarely need more than one pass.
     
  8. oleber

    oleber New Member

    Sorry, here is the code:

    Code:
    {
      my $counter = 0;
      sub getFreeFilename {
        while(1) {
          my $file_name = sprintf("%s_%s_%s.html",$$,$^T,$counter++);
          return $file_name if not -e $file_name;
        };
      }
    }
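One gap in the -e test is that another process could create the same file between the check and the later open. A possible variant, sketched here under the assumption that the caller wants the file created right away, lets sysopen with O_EXCL do the check and the creation in one step. The name create_free_file and the $dir parameter are hypothetical.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);

{
    my $counter = 0;    # keeps its value between calls, as in the code above

    # Hypothetical variant of getFreeFilename: returns the name and an open filehandle.
    sub create_free_file {
        my ($dir) = @_;
        while (1) {
            my $file_name = sprintf("%s/%s_%s_%s.html", $dir, $$, $^T, $counter++);

            # O_CREAT | O_EXCL makes sysopen fail if the file already exists.
            if (sysopen(my $fh, $file_name, O_WRONLY | O_CREAT | O_EXCL)) {
                return ($file_name, $fh);
            }

            # Any failure other than "file exists" is a real error.
            die "Cannot create $file_name: $!" unless $!{EEXIST};
        }
    }
}
```

Usage would look like my ($name, $fh) = create_free_file("."); followed by printing to $fh. The trade-off is that the file exists (empty) as soon as the name is chosen.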
    
     
