Tuesday, September 15, 2009

Using Modern Perl

[Source article in Spanish]
I will try to write a series of articles about Perl, showing how easy and quick it is to build solutions on this platform.
For this I chose a simple design that lets me illustrate a number of techniques and best practices, with an algorithm accessible to any developer, even a rookie.
The example program is a statistical calculator. At first it will be written in a traditional style, but it will gradually become more flexible and easier to maintain as we apply some unique mechanisms of the language and some libraries from CPAN.
The grand finale will be to turn the calculator into a web application using a surprising mechanism available for Perl. Having said that, let's start using modern Perl now.
Honoring the title of the article, the first thing our program does is load the Modern::Perl module, which is a shortcut for:

use feature ':5.10';
use strict;
use warnings;
use mro 'c3';
That is, it turns on all the features introduced in Perl 5.10, activates strict and warnings, and finally sets the method resolution order to the C3 algorithm. As you would expect, all the examples in this series of articles need at least Perl 5.10, because I am trying to promote as many of the new features as possible. So: install Perl 5.10 now.
Modern Perl advocates strongly recommend strict because it catches many common errors, including accidental use of symbolic references and typographical errors in variable names, at the cost of having to declare variables with our (globals) or my (lexicals) before use.
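To make the point about typos concrete, here is a minimal sketch of my own (it is not part of the calculator). The variable names are invented for the illustration, and the snippet is compiled from a string only so that the compile-time error can be printed instead of stopping the script.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical snippet with a typo: $totl is used but was never declared.
my $snippet = <<'END';
use strict;
my $total = 10;
$totl = $total + 1;
END

# String eval compiles the snippet at run time, so the error lands in $@.
my $ok    = eval "$snippet; 1";
my $error = $@;

say $ok ? "the snippet compiled" : "strict rejected the snippet: $error";
```

Without use strict inside the snippet, Perl would silently create a new global variable called $totl and the typo would go unnoticed.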
Perl warnings inform us about possible coding errors. In Perl 5.10 strict is stricter and warnings reports many new warnings, so together they catch more problems than before, which usually improves the overall quality of the code and saves debugging time.
In my case, I wanted to read a command, or leave the loop at end of file, so I wrote:

my $comando = readline(STDIN) or last;
Perl immediately warned me that undef (which signals EOF) could be confused with a "0" (zero) read from the file (for example, a last line containing only 0 with no trailing newline), because Perl treats both "0" and undef as false values. One way to correct the statement would be:

defined (my $comando = readline(STDIN)) or last;
But I would rather use the new // operator (defined-or), which simplifies the statement:

my $comando = readline(STDIN) // last;
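The difference between the two forms is easy to see with a small experiment. This is a sketch under my own assumptions: the input is an in-memory file whose last line is a bare 0 with no trailing newline, and the variable names are invented for the illustration.

```perl
use strict;
use warnings;
use feature 'say';

# Two lines of input; the second one is "0" with no newline after it.
my $input = "3\n0";

# Version 1: "or last" stops at undef, but also at the false string "0".
open my $fh, '<', \$input or die "cannot open in-memory file: $!";
my @with_or;
while (1) {
    no warnings 'misc';    # silence the compile-time warning discussed above
    my $line = readline($fh) or last;
    chomp $line;
    push @with_or, $line;
}
close $fh;

# Version 2: "// last" stops only when readline returns undef (end of file).
open $fh, '<', \$input or die "cannot open in-memory file: $!";
my @with_defined_or;
while (1) {
    my $line = readline($fh) // last;
    chomp $line;
    push @with_defined_or, $line;
}
close $fh;

say "or last read ", scalar(@with_or),         " line(s)";
say "// last read ", scalar(@with_defined_or), " line(s)";
```

The first loop silently drops the final 0, while the second one keeps it, which is exactly the situation the warning is about.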
The C3 method resolution order solves some problems with Perl's original resolution order, and it is advisable to always use it in new code. It is not entirely new: the CPAN module Class::C3 has made this resolution order available for some four years, but now C3 has native support in the language.
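A small sketch may help to see what changes. The class names below are hypothetical and form the classic diamond: Duck inherits from Walker and Swimmer, and both inherit from Animal. The core function mro::get_linear_isa lets us print the search order under each algorithm without changing the order any class actually uses.

```perl
use strict;
use warnings;
use feature 'say';
use mro;

# Hypothetical diamond hierarchy, built by hand with @ISA.
{ package Animal;  sub describe { 'generic animal' } }
{ package Walker;  our @ISA = ('Animal'); }
{ package Swimmer; our @ISA = ('Animal'); sub describe { 'swimmer' } }
{ package Duck;    our @ISA = ('Walker', 'Swimmer'); }

# Ask Perl for the linearized search order under each algorithm.
my @dfs = @{ mro::get_linear_isa('Duck', 'dfs') };
my @c3  = @{ mro::get_linear_isa('Duck', 'c3') };

say "dfs: @dfs";    # dfs: Duck Walker Animal Swimmer
say "c3:  @c3";     # c3:  Duck Walker Swimmer Animal
```

With the original depth-first order, Animal is searched before Swimmer, so the describe method in Animal would hide the more specific one in Swimmer. With C3, a class is never searched before its subclasses.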
So the first tip is to use Modern::Perl everywhere, because it activates a number of useful and recommended Perl features in one shot.
Returning to the program: after loading Modern::Perl, it imports the looks_like_number() subroutine from Scalar::Util, which saved me the trouble of writing regular expressions to recognize numbers, and also spares readers who might freeze in panic at the sight of those regular expressions.
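As a quick illustration of what looks_like_number() accepts, here is a minimal sketch; the sample inputs are my own and are not taken from the calculator.

```perl
use strict;
use warnings;
use feature 'say';
use Scalar::Util qw( looks_like_number );

# Integers, decimals and exponent notation count as numbers;
# words, mixed strings and the empty string do not.
for my $input ('42', '-3.5', '1e3', 'mean', '12abc', '') {
    my $verdict = looks_like_number($input) ? 'number' : 'not a number';
    say "'$input' => $verdict";
}
```

The first three inputs are reported as numbers and the last three are not, which is the distinction the calculator needs to tell data apart from commands.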
The last module is the main ingredient of the calculator. It never crossed my mind to write the statistical algorithms myself; that is the purpose of CPAN, which has almost everything in it. I chose Statistics::Descriptive, which serves my purpose perfectly.
Line 7 declares a constant holding an error message, and line 9 creates an object of class Statistics::Descriptive::Full, which holds the state of the calculator during the main loop.
The main loop is simple: read a command, or terminate (last) at end of file [line 12]; strip the whitespace from both ends of the command [line 13]; if the command is a number, add it to the dataset [line 15]; otherwise, select and execute a command.
The selection is done with given/when, the new control structure in Perl 5.10 [lines 18-36], which performs smart matching between the given value and each when clause. The matching is "smart" because it depends on the types of the operands. It generally works as expected, but there are some oddities and it never hurts to read the manual.
Finally, the new say function is just a print that appends a newline to the string, avoiding a lot of concatenations with "\n" and therefore contributing to code clarity.

 1 #!/usr/bin/perl
 2 
 3 use Modern::Perl;
 4 use Scalar::Util qw( looks_like_number );
 5 use Statistics::Descriptive;
 6 
 7 use constant SYNTAX_ERROR => "Error: tipee 'help' para ayuda";
 8 
 9 my $s = Statistics::Descriptive::Full->new();
10 while (1) {
11     print "Listo> ";
12     my $command = readline(STDIN) // last;
13     $command =~ s/^\s+//; $command =~ s/\s+$//;
14     if ( looks_like_number($command) ) {
15         $s->add_data($command);
16     }
17     else {
18         given ($command) {
19             when ("sum")                { say "$command = " . $s->sum() }
20             when ("mean")               { say "$command = " . $s->mean() }
21             when ("count")              { say "$command = " . $s->count() }
22             when ("variance")           { say "$command = " . $s->variance() }
23             when ("standard_deviation") { say "$command = " . $s->standard_deviation() }
24             when ("min")                { say "$command = " . $s->min() }
25             when ("mindex")             { say "$command = " . $s->mindex() }
26             when ("max")                { say "$command = " . $s->max() }
27             when ("maxdex")             { say "$command = " . $s->maxdex() }
28             when ("sample_range")       { say "$command = " . $s->sample_range() }
29             when ("median")             { say "$command = " . $s->median() }
30             when ("harmonic_mean")      { say "$command = " . $s->harmonic_mean() }
31             when ("geometric_mean")     { say "$command = " . $s->geometric_mean() }
32             when ("mode")               { say "$command = " . $s->mode() }
33             when ("trimmed_mean")       { say "$command = " . $s->trimmed_mean() }
34             when (/^(exit|quit)$/)      {last}
35             default                     { say SYNTAX_ERROR }
36         }
37     }
38 }
To use the calculator, simply run the file. Below is a test run:

opr@toshi$ perl stat.pl
Listo> 19
Listo> 45
Listo> 24
Listo> 15
Listo> 39
Listo> 48
Listo> 36
Listo> count
count = 7
Listo> 10
Listo> 28
Listo> 30
Listo> count
count = 10
Listo> mean
mean = 29.4
Listo> standard_deviation
standard_deviation = 12.685950233756
Listo> salir
Error: tipee 'help' para ayuda
Listo> help
Error: tipee 'help' para ayuda
Listo> exit
opr@toshi$
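The figures in the test run can be checked by hand with core Perl only. This is a sketch of my own, not the code inside Statistics::Descriptive, and it assumes the standard deviation is the sample form (dividing by n - 1), which is the form that reproduces the value printed above.

```perl
use strict;
use warnings;
use List::Util qw( sum );

# The ten values typed into the calculator during the test run.
my @data = (19, 45, 24, 15, 39, 48, 36, 10, 28, 30);

my $count = @data;
my $mean  = sum(@data) / $count;

# Sample variance: squared deviations from the mean, divided by n - 1.
my $variance = sum( map { ($_ - $mean) ** 2 } @data ) / ($count - 1);
my $sd       = sqrt $variance;

printf "count = %d\n",                $count;    # count = 10
printf "mean = %.1f\n",               $mean;     # mean = 29.4
printf "standard_deviation = %.3f\n", $sd;       # standard_deviation = 12.686
```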

A simple improvement

A better way to write the program is to remove the if statement at line 14 of the first listing and add a new when clause in its place. This also lets me show that given topicalizes the given value (it aliases it to $_), and that when clauses can not only compare strings (using eq) and match regular expressions (using =~) but also, among other things, evaluate boolean expressions that use $_ as an alias for the value being matched.

#!/usr/bin/perl

use Modern::Perl;
use Scalar::Util qw( looks_like_number );
use Statistics::Descriptive;

use constant SYNTAX_ERROR => "Error: tipee 'help' para ayuda";

my $s = Statistics::Descriptive::Full->new();
while (1) {
    print "Listo> ";
    my $command = readline(STDIN) // last;
    $command =~ s/^\s+//; $command =~ s/\s+$//;
    given ($command) {
        when ( looks_like_number($_) ) { $s->add_data($command) }
        when ("sum")                   { say "$command = " . $s->sum() }
        when ("mean")                  { say "$command = " . $s->mean() }
        when ("count")                 { say "$command = " . $s->count() }
        when ("variance")              { say "$command = " . $s->variance() }
        when ("standard_deviation")    { say "$command = " . $s->standard_deviation() }
        when ("min")                   { say "$command = " . $s->min() }
        when ("mindex")                { say "$command = " . $s->mindex() }
        when ("max")                   { say "$command = " . $s->max() }
        when ("maxdex")                { say "$command = " . $s->maxdex() }
        when ("sample_range")          { say "$command = " . $s->sample_range() }
        when ("median")                { say "$command = " . $s->median() }
        when ("harmonic_mean")         { say "$command = " . $s->harmonic_mean() }
        when ("geometric_mean")        { say "$command = " . $s->geometric_mean() }
        when ("mode")                  { say "$command = " . $s->mode() }
        when ("trimmed_mean")          { say "$command = " . $s->trimmed_mean() }
        when (/^(exit|quit)$/)         {last}
        default                        { say SYNTAX_ERROR }
    }
}
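For readers who want to see the three kinds of test spelled out, here is a rough equivalent of my own that swaps given/when for a plain for loop and ordinary if tests. The classify subroutine is hypothetical and only labels the input; it is not part of the calculator.

```perl
use strict;
use warnings;
use feature 'say';
use Scalar::Util qw( looks_like_number );

# Hypothetical helper: labels a command using the same three kinds of
# test that the when clauses perform.
sub classify {
    my ($command) = @_;
    for ($command) {    # for aliases $_ to $command, much as given does
        return 'number'  if looks_like_number($_);    # boolean expression
        return 'command' if $_ eq 'sum';              # string comparison
        return 'quit'    if /^(exit|quit)$/;          # regex match on $_
        return 'unknown';                             # plays the role of default
    }
}

say classify($_) for '42', 'sum', 'quit', 'salir';
```

This prints number, command, quit and unknown, one per line, in the same order the tests appear.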
I think almost any programmer used to dynamic languages like Python or Ruby can readily understand Modern Perl code and even be comfortable working with it.
Programmers of languages like C, C++, C# or Java, once they get used to some basic principles, should find the experience liberating, because writing a program such as this in those languages is certainly more difficult.
In the next article we'll see some dynamic features of Perl that make the program shorter, more flexible and easier to maintain.
