Testing Filenames in Perl

Author: pradeep
When your script writes to a new file, you probably want it to pick a new, unique name, one that doesn't conflict with an existing file, which would otherwise be overwritten. One way to build such a name is to incorporate the process id and a timestamp. Perl's special variable $$ holds the current pid, and $^T holds the time at which the script started running (in seconds since 1970). So you could use something like $filename = $$ . $^T . ".html"; This alone does not guarantee uniqueness: there are only a finite number of process ids, which are recycled, and your script could be run twice within the same second.

This also results in ugly filenames, something like "213031168857899.html". If you insist on prettier names, you can test whether a file with the proposed name already exists, using Perl's -e operator: -e $file_name is true if a file with that name already exists. In the example below, the variable $text holds some key text, taken from the contents of the file, that we want to use in the name. We're also assuming that the script is writing a web page, so we add ".html" as the extension.

Code: Perl
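$file_name = $your_chosen_dir . $text . ".html";
if ( -e $file_name )
{
    ## do something to make it different, like
    ## substitute pidtime.html for html at the end;
    ## the suffix is built first so that $$ and $^T are
    ## not run together as "$$^T" inside the replacement
    my $suffix = $$ . $^T;
    $file_name =~ s/html$/${suffix}.html/;
}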
So you'll get mostly pretty filenames, with an ugly one only occasionally, when the pretty name is already taken.
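One gap in the approach above is that another process could create the file between the -e test and the later open. A possible way around this, sketched here as an assumption rather than as the article's method, is to let sysopen with O_EXCL do the test and the creation in one step. The helper name create_unique_file, the "report" text, and the temporary directory are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);
use File::Temp qw(tempdir);

# Hypothetical helper: try the pretty name "$dir/$text.html" first, then
# fall back to names carrying the pid, the start time, and a counter.
sub create_unique_file {
    my ($dir, $text) = @_;
    my $counter = 0;
    my $name    = "$dir/$text.html";
    while (1) {
        my $fh;
        # O_EXCL makes sysopen fail if the file already exists, so the
        # existence test and the creation happen as a single step.
        if (sysopen($fh, $name, O_WRONLY | O_CREAT | O_EXCL)) {
            return ($name, $fh);
        }
        # Any failure other than "file exists" is a real error.
        die "Cannot create $name: $!" unless $!{EEXIST};
        $name = sprintf("%s/%s.%d%d_%d.html", $dir, $text, $$, $^T, $counter++);
    }
}

# Demonstration in a throwaway directory (an assumption for this sketch).
my $dir = tempdir(CLEANUP => 1);
my ($first_name,  $first_fh)  = create_unique_file($dir, "report");
my ($second_name, $second_fh) = create_unique_file($dir, "report");
print "$first_name\n$second_name\n";
close $first_fh;
close $second_fh;
```

The first call should get the pretty name and the second an uglier fallback, since the pretty name is taken by then.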
oleber's Avatar, Join Date: Apr 2007
Go4Expert Member
Often names that follow no obvious logic are what is needed, and MD5 can be used for that.

I'm using something like:

Code:
use strict;
use Digest::MD5 qw(md5_hex);

my $secretText = "xpto";
my $counter = 0;
my $file_name = "";
do { 
	$file_name = md5_hex($$.$^T.$secretText.($counter++)).".html";
} while -e $file_name;
The $secretText helps to create an identifier with no obvious logic, since only you know it.
The $counter forces a different input text on every pass through the loop.
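To see what such names look like, here is a small sketch that reuses the same ingredients (pid, start time, the placeholder secret "xpto", and a counter). It only prints three candidate names and does not touch the filesystem; the variable names are illustrative.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

my $secret_text = "xpto";    # same placeholder secret as in the post above

# One candidate name per counter value; md5_hex returns 32 hex digits.
my @names = map { md5_hex($$ . $^T . $secret_text . $_) . ".html" } 0 .. 2;

print "$_\n" for @names;
```

Each name is 32 hexadecimal digits followed by ".html", and changing only the counter is enough to change the whole digest.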
pradeep's Avatar, Join Date: Apr 2005
Team Leader
That's a really cool idea too, but most of the time we need the files to be named according to some logic!
oleber's Avatar, Join Date: Apr 2007
Go4Expert Member
Logical parts can still be added to the name.

But the important part is the loop that finds an acceptable name. Probably (I don't work with the web) many web servers use one process for multiple requests, so the PID will be the same for these requests. Adding a counter can help.
pradeep's Avatar, Join Date: Apr 2005
Team Leader
Yes, quite right, web servers like Apache can use threads to serve multiple requests. Your code is perfect except for the hashing.
I think timestamp, PID and counter are enough to make the filename unique!
pradeep's Avatar, Join Date: Apr 2005
Team Leader
We can make a subroutine out of this logic, like this:

Code: Perl
sub getFreeFilename
 {
     my $counter = 0;
     my $file_name = "";
     do {
         $file_name = sprintf("%s%s%s.html",$$,$^T,$counter++);
     } while -e $file_name;
 
     return $file_name;
 }
oleber's Avatar, Join Date: Apr 2007
Go4Expert Member
Even better, let's keep $counter outside the subroutine so that it persists between calls.

Quote:
{
my $counter = 0;
sub getFreeFilename {
while(1) {
my $file_name = sprintf("%s_%s_%s.html",$$,$^T,$counter++);
return $file_name if not -e $file_name;
};
}
}
Since all this code is inside { }, $counter is visible only inside the { } and will exist for as long as the subroutine exists (roughly like a static variable in C++). So in most cases the loop body runs only once.
oleber's Avatar, Join Date: Apr 2007
Go4Expert Member
Sorry for the formatting of the code. Here it is again, indented:

Code:
{
  my $counter = 0;
  sub getFreeFilename {
    while(1) {
      my $file_name = sprintf("%s_%s_%s.html",$$,$^T,$counter++);
      return $file_name if not -e $file_name;
    };
  }
}
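On Perl 5.10 or newer, a state variable is another way to get a counter that persists between calls. The sketch below is an alternative to the enclosing-block version, not a replacement proposed in the thread, and it keeps the same subroutine name for comparison. It calls the subroutine twice to show that the counter carries on rather than restarting.

```perl
use strict;
use warnings;
use feature 'state';    # state variables need Perl 5.10 or newer

sub getFreeFilename {
    state $counter = 0;    # initialised once, then kept between calls
    while (1) {
        my $file_name = sprintf("%s_%s_%s.html", $$, $^T, $counter++);
        return $file_name if not -e $file_name;
    }
}

# Neither call creates a file, so only the persistent counter
# keeps the two names apart.
my $first  = getFreeFilename();
my $second = getFreeFilename();
print "$first\n$second\n";
```

With a counter declared inside the subroutine using my, both calls would return the same name here, because nothing has been written to disk between them.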