Introduction

We often run scripts such as newsletter mailers and backup jobs that take quite a lot of time, which makes us look for ways to speed them up. One way is to run some operations in parallel: for a newsletter mailer, that could mean sending email to 20 subscribers at a time instead of mailing them one by one. Implementing parallel processing can make a big difference in the time taken to run the script. Here we'll see how to implement parallel processing in Perl. For this purpose we'll use Parallel::ForkManager, a powerful object-oriented CPAN module, which can be downloaded from http://search.cpan.org/~dlux/Parallel-ForkManager-0.7.5/ForkManager.pm or installed with the CPAN shell.

How To Use Parallel::ForkManager

Since it is a fully object-oriented module, we first need to create an instance of Parallel::ForkManager, specifying the maximum number of processes to run in parallel.

Code:
my $pm = Parallel::ForkManager->new(50);

Be careful when choosing the number of parallel processes; too many can overload the server. You may also change this number later in your code, like this:

Code:
$pm->set_max_procs($max_processes);

A new parallel process is forked with the start method, and you must mark the point at which the child process ends with the finish method. A loop is usually used for this purpose; let's see an example to get a better idea.

Code:
for (1..100) {
    $pm->start and next;   ## parent: fork a child, then move on to the next iteration
    ## your code for parallel processing goes here (it runs in the child)
    ## do your stuff
    $pm->finish;           ## end point of the parallel process (the child exits here)
}

You may also need the method wait_all_children, which makes the parent process wait until all the child processes have finished executing.

A Simple Example

The best example, I feel, is sending out newsletters to your subscribers/registered users.
Say your site has around 10,000 registered users and you want to send them a newsletter every week. All the email ids are in a database, and the newsletter text is kept in a file. Let's see how to go about writing such a program using parallel forking.

Code:
use strict;
use warnings;
use Parallel::ForkManager;
use DBI;

## connect to the database and run the query; we will have a statement handle $stmt
## open the newsletter text file and read its contents into the variable $mail_text

my $pm = Parallel::ForkManager->new(50);

while (my ($email_id) = $stmt->fetchrow_array()) {
    $pm->start and next;   ## parent: fork a child, then fetch the next row

    ## child: pipe the message to sendmail; -t takes the recipient from the To: header
    open(my $sm, '|-', '/usr/sbin/sendmail', '-t') or die "Cannot run sendmail: $!";
    print $sm "To: $email_id\n";
    print $sm "Subject: Newsletter\n\n";
    print $sm $mail_text;
    close($sm);

    $pm->finish;           ## the child exits here
}

$pm->wait_all_children;    ## wait for the child processes

Unfortunately, we won't be able to use database handles inside the child processes; I tried to do so, but it didn't work out. You may read about the limitations of the module here: http://search.cpan.org/~dlux/Parallel-ForkManager-0.7.5/ForkManager.pm#BUGS_AND_LIMITATIONS
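One way to stay clear of the database handle limitation is to read every address in the parent before the first fork, so that the children never touch the connection. With some DBI drivers a child that exits can also close the connection it inherited from the parent, so disconnecting before forking is the cautious route. Below is a minimal sketch of that idea, not a finished mailer: the address list is made-up stand-in data for the query result, the print stands in for the sendmail pipe, and run_on_finish is used only to count the children that exited cleanly.

```perl
use strict;
use warnings;
use Parallel::ForkManager;

# Stand-in for the database step (hypothetical data). With DBI this might be
#   my @email_ids = map { $_->[0] } @{ $stmt->fetchall_arrayref };
# followed by $dbh->disconnect, all done before the first fork.
my @email_ids = qw(alice@example.com bob@example.com carol@example.com);

my $pm = Parallel::ForkManager->new(2);

# Runs in the parent each time a child is reaped.
my $sent = 0;
$pm->run_on_finish(sub {
    my ($pid, $exit_code) = @_;
    $sent++ if $exit_code == 0;
});

for my $email_id (@email_ids) {
    $pm->start and next;               # parent: fork, then take the next address

    # child: the sendmail pipe from the article would go here
    print "child $$ would mail $email_id\n";

    $pm->finish(0);                    # child exits with status 0
}

$pm->wait_all_children;
print "$sent of ", scalar(@email_ids), " children finished cleanly\n";
```

Holding 10,000 addresses in memory is cheap, so the main cost of this approach is that the query has to complete before any mail goes out.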
Code:
## Changed the above code a little
use strict;
use warnings;
use Parallel::ForkManager;
use DBI;

## connect to the database and run the query; we will have a statement handle $stmt
## open the newsletter text file and read its contents into the variable $mail_text

my $pm = Parallel::ForkManager->new(50);

while (my ($email_id) = $stmt->fetchrow_array()) {
    $pm->start and next;

    ## -t makes sendmail take the recipient from the To: header
    open(my $sm, '|-', '/usr/sbin/sendmail', '-t') or die "Cannot run sendmail: $!";
    print $sm "To: $email_id\n";   # use $email_id, the variable declared in the while loop (the original post had $email)
    print $sm "Subject: Newsletter\n\n";
    print $sm $mail_text;
    close($sm);

    $pm->finish;
}

$pm->wait_all_children;    ## wait for the child processes
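For a feel of what start, finish and wait_all_children are doing in a listing like the one above, here is a rough sketch of the same pattern using only Perl's built-in fork and waitpid. It is not the module's actual implementation, and run_limited, its arguments and the toy job are names made up for this illustration.

```perl
use strict;
use warnings;

# Run $job->($item) in a forked child for each item, keeping at most $max
# children alive at once. Returns a hash of item => child exit code.
sub run_limited {
    my ($max, $job, @items) = @_;
    my (%item_for_pid, %exit_code_for);

    # Block until one child exits, then record its exit code.
    my $reap_one = sub {
        my $pid = waitpid(-1, 0);
        die "waitpid found no child to wait for" if $pid <= 0;
        my $item = delete $item_for_pid{$pid};
        $exit_code_for{$item} = $? >> 8 if defined $item;
    };

    for my $item (@items) {
        # Roughly what start does when the pool is full: wait for a free slot.
        $reap_one->() while keys(%item_for_pid) >= $max;

        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid) {                       # parent: remember the child, move on
            $item_for_pid{$pid} = $item;
            next;
        }
        exit($job->($item));              # child: do the work, then exit (like finish)
    }

    # Roughly what wait_all_children does: reap whatever is still running.
    $reap_one->() while keys %item_for_pid;
    return %exit_code_for;
}

# Toy job: odd items exit with status 1, even items with status 0.
my %result = run_limited(2, sub { my ($n) = @_; return $n % 2 }, 1 .. 4);
print "item $_ exited with $result{$_}\n" for sort keys %result;
```

In fork's return value lies the reason `$pm->start and next` works: the parent gets the child's pid (a true value) and skips to the next iteration, while the child gets 0 and falls through to the work.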