Go4Expert

pradeep 27May2007 19:12

Using Benchmark Perl Module
 
Everyone knows that Perl is extremely fast when it comes to handling regular expressions and text processing... but have you ever wondered how fast "extremely fast" really is? Well, one of the toys we big kids have at our disposal is the Perl Benchmark module, which lets you measure the speed of a Perl script or of individual pieces of code.

Calculating differences in script execution time



The traditional way of measuring script execution time is also the common-sense one: check the time when the script starts, check the time when it ends, and the difference between the two values is the script execution time. In Perl these time values are obtained with the built-in time() function:

Code: Perl

#!/usr/bin/perl
 
 # declare array
 my @data;
 
 # start timer
 $start = time();
 
 # perform a math operation 200000 times
 for ($x=0; $x<=200000; $x++)
 {
 $data[$x] = $x/($x+2);
 }
 
 # end timer
 $end = time();
 
 # report
 print "Time taken was ", ($end - $start), " seconds";

While this is fine for basic use, it becomes complicated if what you really want is to compare the times of different scripts, or run arbitrary pieces of code for fixed time intervals. For these uses, the Benchmark module is more appropriate. This module comes bundled with Perl, and can be imported into your Perl script through the "use" command. Take a look at the next example, which rewrites the previous one to use Benchmark instead of time().

Code: Perl

#!/usr/bin/perl
 
 use Benchmark;
 
 # declare array
 my @data;
 
 # start timer
 $start = new Benchmark;
 
 # perform a math operation 200000 times
 for ($x=0; $x<=200000; $x++)
 {
 $data[$x] = $x/($x+2);
 }
 
 # end timer
 $end = new Benchmark;
 
 # calculate difference
 $diff = timediff($end, $start);
 
 # report
 print "Time taken was ", timestr($diff, 'all'), " seconds";

Every time you create a new Benchmark object with new(), the current time is returned. The difference between the start and end times is calculated with the Benchmark module's timediff() function, and the result is formatted for display with the timestr() function. Here's the sample output of the script above:

Time taken was 2 wallclock secs ( 2.14 usr 0.00 sys + 0.00 cusr
0.00 csys = 2.14 CPU) seconds

As you can see, Benchmark returns a little more detail than the time() function.
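If you want the individual numbers rather than the formatted string, here is a minimal sketch of how that might look. It relies on the layout described in the Benchmark documentation, where the object is an array reference holding (real, user, system, children_user, children_system, iters); the variable names are my own, and the loop is the same one used above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(timediff timestr);

my @data;

# take a snapshot before and after the work
my $start = Benchmark->new;
for my $x (0 .. 200_000) {
    $data[$x] = $x / ($x + 2);
}
my $end = Benchmark->new;

my $diff = timediff($end, $start);

# a Benchmark object is an array reference holding
# (real, user, system, children_user, children_system, iters)
my ($real, $user, $system) = @$diff;
printf "wallclock: %d s, user CPU: %.2f s, system CPU: %.2f s\n",
    $real, $user, $system;

# the same difference in two of the documented timestr() styles:
# 'all' shows all five times, 'noc' leaves out the child-process times
print "all: ", timestr($diff, 'all'), "\n";
print "noc: ", timestr($diff, 'noc'), "\n";
```

On a machine where the loop finishes in well under a second, the wallclock figure will usually be 0, because the default clock only counts whole seconds.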

Timing multiple runs of a script



Of course, a sample size of one is not necessarily representative of how fast your script is, especially on Web servers that are subject to varying loads. Therefore, what you really need is a way to run this script many times, and calculate the average time taken after compiling the data from each run. Luckily, Benchmark comes with a function to do this too. It's called timethis(), and it's demonstrated in the following example:

Code: Perl

#!/usr/bin/perl
 
 use Benchmark;
 
 # run code 100000 times and display result
 timethis(100000, '
   for ($x=0; $x<=200; $x++)
   {
     sin($x/($x+2));
   }
 '
);

The timethis() function takes two required arguments: the number of times to run the code block, and the code block itself. In these examples the code block is passed as a string, so it must be something the eval() function can compile; timethis() also accepts a code reference.
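To make the two ways of passing code concrete, here is a small sketch that times the same loop once as a string and once as a code reference. The sub name sine_loop and the titles passed as the third argument are made up for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(timethis);

# hypothetical helper holding the code we want to time
sub sine_loop {
    for my $x (0 .. 200) {
        my $y = sin($x / ($x + 2));
    }
}

# 1. code passed as a string, which Benchmark compiles with eval
my $t_string = timethis(20_000,
    'for my $x (0 .. 200) { my $y = sin($x / ($x + 2)); }',
    'string');

# 2. the same work passed as a code reference
my $t_coderef = timethis(20_000, \&sine_loop, 'coderef');
```

The Benchmark documentation warns that a code reference tends to come out slightly slower than the equivalent string, so it is probably best not to compare a string entry against a code reference entry directly.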

Once the benchmark is complete, timethis() displays a report like this:

Code:

timethis 100000: 210 wallclock secs (209.37 usr + 0.00 sys = 209.37
 CPU) @ 477.62/s (n=100000)

There are two pieces of useful data here: the number of CPU seconds, which tells you how long Perl takes to run the code N times, and the per-second data, which tells you how many runs take place per second. Obviously, the higher the second value, the faster your code is.
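The per-second figure is simply the iteration count divided by the CPU seconds. As a rough sketch, you can capture the object that timethis() returns and redo that arithmetic yourself; the title 'sine loop' and the variable names are my own, and the smaller iteration count is only there to keep the run short.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(timethis);

# passing 'none' as the style keeps timethis() from printing its timing line
my $t = timethis(20_000, sub {
    for my $x (0 .. 200) {
        my $y = sin($x / ($x + 2));
    }
}, 'sine loop', 'none');

# the object holds (real, user, system, children_user, children_system, iters)
my ($real, $user, $system, $cuser, $csystem, $iters) = @$t;
my $cpu = $user + $system;

printf "%d runs used %.2f CPU seconds\n", $iters, $cpu;
printf "rate: %.2f runs per second\n", $iters / $cpu if $cpu > 0;

# the same arithmetic applied to the sample report above
my $sample_rate = sprintf "%.2f", 100_000 / 209.37;
print "sample rate: $sample_rate/s\n";   # prints "sample rate: 477.62/s"
```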

Instead of a fixed number of iterations, now let's see how to have timethis() run the code for a fixed period of time.

Counting how often a script runs in a predefined time window



Instead of timing how long a piece of code takes to execute a fixed number of iterations, you can flip things around and have timethis() run the code for a fixed period of time to see how many iterations it completes in that time. You do this by using a negative value as the first argument; Benchmark treats it as the minimum number of CPU seconds to run. Consider the following example, which makes timethis() run the code for a minimum of 10 CPU seconds:

Code: Perl

#!/usr/bin/perl
 
 use Benchmark;
 
 # run code for 10 seconds and display result
 timethis(-10, '
   for ($x=0; $x<=200; $x++)
   {
     sin($x/($x+2));
   }
 '
);

The output will look something like this:

Code:

timethis for 10: 11 wallclock secs (10.93 usr + 0.00 sys = 10.93 CPU) @ 700.82/s (n=7660)

So in 11 seconds (well, 10.93 if you want to be difficult), Perl was able to execute the code 7660 times, or approximately 700 times per second.
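If all you want is the count, Benchmark also has a countit() function that runs the code for a given number of CPU seconds and hands back a Benchmark object without printing anything. Here's a short sketch along the lines of the module's own synopsis; the one-second window is just an arbitrary choice to keep the run quick.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(countit timestr);

# run the code for at least 1 CPU second
my $t = countit(1, sub {
    for my $x (0 .. 200) {
        my $y = sin($x / ($x + 2));
    }
});

# iters() reports how many runs fitted into that window
my $count = $t->iters;
print "$count runs took: ", timestr($t), "\n";
```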

You can even create an interactive benchmarking tool with timethis(), by having the user enter the code and the number of iterations at the prompt:

Code: Perl

#!/usr/bin/perl
 
 # use Benchmark module
 use Benchmark;
 
 # ask for count
 print "Enter number of iterations:\n";
 $count = <STDIN>;
 chomp ($count);
 
 # alter the input record separator
 # so as to allow multi-line code blocks
 $/ = "END";
 
 # ask for code
 print "Enter your Perl code (end with END):\n";
 $code = <STDIN>;
 
 print "\nProcessing...\n";
 
 # run code and display result
 timethis($count, $code);

Most of this is pretty simple, and should be clear to you if you understood the previous examples. The only item of note here is the change of Perl's input record separator ($/) to the string END, so that the user can enter multi-line code blocks and terminate them by typing END (the default separator is a newline, which would make Perl stop reading as soon as the user pressed [Enter]).
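To see the separator trick on its own, without having to type anything, here is a small sketch that reads from an in-memory filehandle instead of STDIN. The $typed string stands in for the user's input and is purely an assumption for the demo. Note that chomp() removes whatever $/ is currently set to, so it can also be used to strip the trailing END.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# stand-in for what a user would type at the prompt
my $typed = <<'INPUT';
for ($a = 1; $a < 1001; $a++)
{
    $value = $a ** 10;
}
END
INPUT

# open a read handle on the string so the demo needs no keyboard input
open my $fh, '<', \$typed or die "cannot open in-memory handle: $!";

my $code;
{
    local $/ = "END";   # read up to and including the string END
    $code = <$fh>;
    chomp $code;        # chomp strips the current value of $/, here "END"
}
close $fh;

print "Captured code:\n$code";
```

Using local keeps the changed separator confined to the block, so any later reads go back to line-by-line input.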

Here's an example of this script in action (lines beginning with a '>' indicate output from the program, the rest are lines input by the user):

Code:

> Enter number of iterations:
 500
 > Enter your Perl code (end with END):
 for ($a=1; $a<1001; $a++)
 {
        $value = $a ** 10;
 }
 END
 > Processing...
 > timethis 500:  6 wallclock secs ( 5.72 usr +  0.00 sys =  5.72 CPU) @ 87.41/s (n=500)

Timing and comparing different techniques



If you're the kind of Perl programmer who likes experimenting with different ways of accomplishing the same thing, you're going to just love the next tool in Benchmark's arsenal. The timethese() function allows you to time more than one code fragment at a time:

Code: Perl

#!/usr/bin/perl
 
 # use Benchmark module
 use Benchmark;
 
 # time 3 different versions of the same code
 timethese (1000, {
   'ashi' => '$x=1;
           while ($x <= 5000)
             {
               sin ($x/($x+2));
               $x++;
             }'
,
   'anu' => 'for ($x=1; $x<=5000; $x++)
             {
               sin ($x/($x+2));
             }'
,
   'mani' => 'foreach $x (1...5000)
             {
               sin($x/($x+2));
             }'
   
 });

This example tries to calculate the sine of 5,000 numbers, using three different approaches. The first, named "ashi", uses a while() loop; "anu" uses a for() loop; and "mani" uses a foreach() loop. Each of these code snippets is placed inside a single call to the timethese() function, which accepts two arguments: the number of iterations and a hash whose values are the code snippets to be tested (the keys of the hash contain the unique names for the code fragments). The timethese() function then internally calls timethis() for each hash element and returns the time taken for each option. Here's a sample of the output:

Code:

Benchmark: timing 1000 iterations of anu, ashi, mani...
  anu: 92 wallclock secs (91.72 usr +  0.00 sys = 91.72 CPU) @ 10.90/s (n=1000)
  ashi: 160 wallclock secs (159.56 usr +  0.00 sys = 159.56 CPU) @ 6.27/s (n=1000)
  mani: 45 wallclock secs (44.98 usr +  0.00 sys = 44.98 CPU) @ 22.23/s (n=1000)

It is clear from the output that the foreach() loop is the most efficient of the three alternatives, at least for this particular scenario.
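You don't have to read the winner off the printed report, either. According to the Benchmark documentation, timethese() returns a reference to a hash of Benchmark objects keyed by name, so a sketch like the following can sort the entries by CPU time. The entry names and the smaller loop sizes are my own choices to keep the run short, and the entries are passed as code references rather than strings.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(timethese);

# 'none' as the style keeps timethese() from printing its per-entry report
my $results = timethese(2000, {
    while_loop => sub {
        my $x = 1;
        while ($x <= 500) {
            my $y = sin($x / ($x + 2));
            $x++;
        }
    },
    for_loop => sub {
        for (my $x = 1; $x <= 500; $x++) {
            my $y = sin($x / ($x + 2));
        }
    },
    foreach_loop => sub {
        foreach my $x (1 .. 500) {
            my $y = sin($x / ($x + 2));
        }
    },
}, 'none');

# add up user and system CPU seconds for each entry
my %cpu;
for my $name (keys %$results) {
    my ($real, $user, $system) = @{ $results->{$name} };
    $cpu{$name} = $user + $system;
}

# list the entries from least to most CPU time
for my $name (sort { $cpu{$a} <=> $cpu{$b} } keys %cpu) {
    printf "%-12s %.2f CPU seconds\n", $name, $cpu{$name};
}
```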

Another way to run this test is with the cmpthese() function, which internally calls timethese(), and accepts the same arguments as timethese(). The main advantage is that it formats the result better for comparison purposes:

Code: Perl

#!/usr/bin/perl
 
 # use Benchmark module
 use Benchmark qw (:all);
 
 # time 3 different versions of the same code
 cmpthese (100, {
   'ashi' => '$x=1;
           while ($x <= 5000)
           {
             sin ($x/($x+2));
             $x++;
           }'
,
   'anu' => '      for ($x=1; $x<=5000; $x++)
           {
             sin ($x/($x+2));
           }'
,
   'mani' => '      foreach $x (1...5000)
           {
             sin($x/($x+2));
           }'
   
 });

Note the use of "use Benchmark qw (:all)" instead of just "use Benchmark". The cmpthese() function is not exported by default, and the :all tag ensures that all the functions in the Benchmark module get exported.

The output of cmpthese() is a table which compares the speed of each option against the speed of its competition. Since this table contains summary percentage values, it is somewhat easier to understand than the output of timethese():

Code:

        Rate  mani  ashi   anu
mani  14.1/s    --  -50%  -54%
ashi  28.5/s  102%    --   -8%
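If you'd rather not import everything, you can ask for just the functions you need. The sketch below also uses the two-step idiom from the Benchmark documentation, where the results of timethese() are handed to cmpthese(), so the code is only timed once. The entry names are made up, and the count of -1 (run each entry for at least one CPU second) is just an assumption to keep the demo short.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# import only the two functions used here instead of :all
use Benchmark qw(timethese cmpthese);

# time each entry for at least 1 CPU second, without printing a report
my $results = timethese(-1, {
    while_loop => sub {
        my $x = 1;
        while ($x <= 500) {
            my $y = sin($x / ($x + 2));
            $x++;
        }
    },
    for_loop => sub {
        for (my $x = 1; $x <= 500; $x++) {
            my $y = sin($x / ($x + 2));
        }
    },
    foreach_loop => sub {
        foreach my $x (1 .. 500) {
            my $y = sin($x / ($x + 2));
        }
    },
}, 'none');

# print the comparison table from the results gathered above;
# the return value is a reference to the table's rows
my $rows = cmpthese($results);
```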


