Use Parallel Processing For Faster Perl Scripts

Discussion in 'Perl' started by pradeep, Feb 28, 2008.

  1. pradeep

    pradeep Team Leader

    Introduction



    We often run scripts such as newsletter mailers and backup jobs that take a long time, which makes us look for ways to speed them up. One way is to run some operations in parallel, for example sending email to 20 subscribers at once for the newsletter mailer instead of mailing them one by one. Implementing parallel processing can make a large difference in the time taken to run the script.

    Here we'll see how to implement parallel processing in Perl. For this purpose we'll use Parallel::ForkManager, a powerful object-oriented CPAN module. You can download it from http://search.cpan.org/~dlux/Parallel-ForkManager-0.7.5/ForkManager.pm or install it with the CPAN shell.

    How To Use Parallel::ForkManager



    Since the module is fully object-oriented, we first need to create an instance of Parallel::ForkManager, specifying the maximum number of processes to run in parallel.

    Code:
      my $pm = Parallel::ForkManager->new(50);
      
    Be careful when choosing the number of parallel processes; you don't want to overload the server. You can also change this number later in your code, like this:

    Code:
      $pm->set_max_procs($max_processes);
      
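    A new parallel process is forked with the start method, and the finish method marks the point at which that process ends. In the parent, start returns the child's process ID (a true value), so "and next" moves straight on to the next iteration; in the child it returns 0, so the code between start and finish runs there. A loop is usually used for this purpose. Let's see an example to get a better idea.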

    Code:
      for(1..100)
      {
          $pm->start and next;
          ## your code for parallel processing goes here
          ## do your stuff
          $pm->finish; ## end point of the parallel process
      }
      
    You will usually also need the wait_all_children method, which makes the parent process wait until all the child processes have finished executing.
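
    As a rough sketch of how start, finish and wait_all_children fit together, the script below runs six dummy jobs, at most three at a time. The job count, the short sleep standing in for real work, and the %exit_code_for hash are assumptions made for illustration. run_on_finish is the module's callback hook, used here so the parent can record each child's exit code.

```perl
use strict;
use warnings;
use Parallel::ForkManager;

my $pm = Parallel::ForkManager->new(3);    # at most 3 children at once

# Filled in by the parent only, each time a child is reaped.
my %exit_code_for;
$pm->run_on_finish(sub {
    my ($pid, $exit_code, $ident) = @_;
    $exit_code_for{$ident} = $exit_code;
});

for my $job (1 .. 6) {
    # start() returns the child's PID in the parent (true, so we skip ahead)
    # and 0 in the child (false, so the child falls through to the work).
    $pm->start($job) and next;

    select(undef, undef, undef, 0.1);      # stand-in for real work
    $pm->finish($job % 2);                 # child exits here: 1 for odd jobs, 0 for even
}

$pm->wait_all_children;                    # block until every child has been reaped

for my $job (sort { $a <=> $b } keys %exit_code_for) {
    print "job $job exited with $exit_code_for{$job}\n";
}
```

    Because each child is a separate process, anything it assigns to a variable is invisible to the parent. The exit code passed to finish is the simplest way for a child to report back.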

    A Simple Example



    The best example, I feel, is sending out newsletters to your subscribers or registered users. Say your site has around 10,000 registered users and you want to send them a newsletter every week. All the email IDs are in a database, and the newsletter text is kept in a file. Let's see how to go about writing such a program using parallel forking.

    Code:
      use strict;
      use warnings;
      use DBI;
      use Parallel::ForkManager;
      
      ## connect to database and query the database, we will have a statement handler $stmt
      
      ## open the newsletter text file and get the contents in the variable $mail_text
      
      my $pm = Parallel::ForkManager->new(50);
      while (my ($email_id) = $stmt->fetchrow_array())
      {
          $pm->start and next;
          ## -t makes sendmail read the recipient from the To: header
          open(my $sm, '|-', '/usr/sbin/sendmail', '-t') or die "Cannot run sendmail: $!";
          print $sm "To: $email_id\n";
          print $sm "Subject: Newsletter\n\n";
          print $sm $mail_text;
          close($sm);
          $pm->finish;
      }
      
      $pm->wait_all_children; ## wait for the child processes
      
      
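    Unfortunately, we can't use the parent's database handles inside the child processes; I tried to do so, but it didn't work out. A forked child inherits the parent's database connection rather than getting one of its own, so a child that needs the database should generally open a separate connection. You may read about the limitations of the module here: http://search.cpan.org/~dlux/Parallel-ForkManager-0.7.5/ForkManager.pm#BUGS_AND_LIMITATIONS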
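
    One way to sidestep the database handle problem is to fetch every address in the parent before forking anything, then hand each child a batch to work through. The sketch below assumes that approach. The @addresses list, the send_newsletter helper and the batch size are hypothetical stand-ins, not part of the original script, and the helper only pretends to send mail.

```perl
use strict;
use warnings;
use Parallel::ForkManager;

# Hypothetical stand-in for rows already fetched from the database
# (for example with DBI's selectcol_arrayref) before any child is forked.
my @addresses = map { "user$_\@example.com" } 1 .. 10;

# Hypothetical helper: a real script would pipe the message to sendmail here.
# It returns true when the address at least looks like an email address.
sub send_newsletter {
    my ($address) = @_;
    return $address =~ /\@/ ? 1 : 0;
}

my $batch_size = 4;                        # assumed value, tune for your server
my $pm = Parallel::ForkManager->new(3);    # far fewer processes than addresses

# Runs in the parent each time a child is reaped.
my $failed_batches = 0;
$pm->run_on_finish(sub {
    my ($pid, $exit_code) = @_;
    $failed_batches++ if $exit_code != 0;
});

# The parent removes one batch at a time; each child gets its own copy of it.
while (my @batch = splice(@addresses, 0, $batch_size)) {
    $pm->start and next;
    my $failures = grep { !send_newsletter($_) } @batch;
    $pm->finish($failures ? 1 : 0);        # non-zero exit marks a failed batch
}
$pm->wait_all_children;

print "batches with failures: $failed_batches\n";
```

    Batching also means the script forks once per batch instead of once per subscriber, which keeps the process count manageable for a list of 10,000 users.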
     
  2. amlan_das

    amlan_das New Member

    Code:
    
    ## Changed the above code a little
    use strict;
    use warnings;
    use DBI;
    use Parallel::ForkManager;
    
    ## connect to database and query the database, we will have a statement handler $stmt
    ## open the newsletter text file and get the contents in the variable $mail_text
    
    my $pm = Parallel::ForkManager->new(50);
    while (my ($email_id) = $stmt->fetchrow_array())
    {
      $pm->start and next;
      open(my $sm, '|-', '/usr/sbin/sendmail', '-t') or die "Cannot run sendmail: $!";
      print $sm "To: $email_id\n";      # use $email_id, the variable declared by the loop
      print $sm "Subject: Newsletter\n\n";
      print $sm $mail_text;
      close($sm);
      $pm->finish;
    }
    $pm->wait_all_children; ## wait for the child processes
    
    
     