Monday, December 7, 2009

Plack/PSGI performance

[Original Spanish source]
In my post about PSGI & Plack I said that it was fast. To demonstrate this, I benchmarked the same program running as a CGI in Apache (ACGI), as a standalone server through CGI::Emulate::PSGI (CEP), and as a native PSGI application.

The test was not rigorous at all, because I really just wanted to confirm what I had read.

The command to report the rate was:

$ ab -n 1000 -c 10 -k "http://localhost:5000/cgi-bin/perldocweb?pod=PSGI&format=source"

Which gave the following results:

                           ACGI       CEP      PSGI
Requests per second       10.57    267.17    512.31
Time per request (ms)    94.618     3.743     1.952
Transfer rate (kBps)     179.52   4539.79   8686.67
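To put these figures in perspective, the relative speedups can be computed directly from the requests-per-second row. This is only a quick sketch; the hash restates the measured values, and the speedup helper is a name made up for the example:

```perl
use strict;
use warnings;

# Requests per second measured with ab, copied from the results above.
my %rps = ( ACGI => 10.57, CEP => 267.17, PSGI => 512.31 );

# Speedup of one setup relative to another (hypothetical helper).
sub speedup {
    my ( $fast, $slow ) = @_;
    return $rps{$fast} / $rps{$slow};
}

printf "CEP  vs ACGI: %.1fx\n", speedup( 'CEP',  'ACGI' );
printf "PSGI vs ACGI: %.1fx\n", speedup( 'PSGI', 'ACGI' );
printf "PSGI vs CEP:  %.1fx\n", speedup( 'PSGI', 'CEP' );
```

In round numbers, the emulator is about 25 times faster than plain CGI under Apache, and the native PSGI version is almost twice as fast again.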

Just to see the raw speed, I made a small program to serve text files and compare the performance against Apache serving the same static files:

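#!/usr/bin/perl

use Modern::Perl;
use IO::File;

# Directory that holds the static files to serve.
my $dir = "/home/jrey/htdocs";

# PSGI application: map the request path to a file under $dir.
# Toy code: the path is not sanitized, so don't expose it publicly.
my $app = sub {
    my $env      = shift;
    my $filename = $dir . $env->{'PATH_INFO'};    # path without the query string
    my $fh       = IO::File->new($filename)
        or return [ '404', [ 'Content-Type' => "text/plain" ], ["Not found\n"] ];
    return [ '200', [ 'Content-Type' => "text/plain" ], $fh ];
};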
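Since a PSGI application is just a code reference, it can be exercised without any server by calling it with a handcrafted environment hash. The sketch below assumes a throwaway document root created with File::Temp and a file named hello.txt, both invented for the example; it reads the path from PATH_INFO and answers 404 when the file cannot be opened:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec::Functions qw(catfile);
use IO::File;

# Hypothetical document root: a temporary directory with one file in it.
my $dir = tempdir( CLEANUP => 1 );
my $out = IO::File->new( catfile( $dir, "hello.txt" ), "w" ) or die "Cannot write: $!";
$out->print("hello from a file\n");
$out->close;

# Same shape as the static file application: map the request path to a file.
my $app = sub {
    my $env = shift;
    my $fh  = IO::File->new( $dir . $env->{PATH_INFO}, "r" )
        or return [ 404, [ 'Content-Type' => 'text/plain' ], ["Not found\n"] ];
    return [ 200, [ 'Content-Type' => 'text/plain' ], $fh ];
};

# Call the application directly, the way a PSGI server would.
my $res = $app->( { REQUEST_METHOD => 'GET', PATH_INFO => '/hello.txt' } );
print "status: $res->[0]\n";
print "body:   ", $res->[2]->getline;
```

This kind of direct call is handy for quick checks, although a real server fills in many more keys of the environment.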

The results for the command:

$ ab -n 1000 -c 10 -k "http://localhost:5000/PSGI.pod"

were:

                         Plackup      Apache
Requests per second       614.69     3217.03
Time per request (ms)      1.627       0.311
Transfer rate (kBps)    10425.21    55133.41

As I said, Plack is very fast. In particular, this test shows that its performance is acceptable even for static content (about a fifth of Apache's throughput here), so we can deploy applications directly on Perl without additional web server components, except for special needs such as high availability and load balancing. Even then there are some Perl-based solutions, for example perlbal. Did I tell you that there is PSGI support for perlbal?

Sunday, December 6, 2009

CGI::Emulate::PSGI error

While working with the code of the previous article, I realized that the example of CGI::Emulate::PSGI wasn't working, because I did not reset CGI's global variables.
Here is the correct way to do it:

use CGI::Emulate::PSGI;
use CGI;

my $app = CGI::Emulate::PSGI->handler(sub {
    CGI::initialize_globals();
    do "perldocweb";
});

Monday, November 30, 2009

PSGI and Plack: the future of web applications

[Original Spanish source]
A few weeks ago I showed my friend Joel a Perl one-liner that implements a web server. Perhaps he had too much work to do, because he did not seem surprised by this fantastic line of Perl, which uses the IO::All module:

perl -MIO::All -e 'io(":8080")->fork->accept->(sub { $_[0] < io(-x $1 ? "./$1 |" : $1) if /^GET \/(.*) / })'

Surprisingly (especially for a Perl fan), his response was: "You know that the Python people have software to deploy powerful web servers easily". I noticed that he had not understood my point, so I let him go.

I was sure he was talking about WSGI (also known as PEP 333): a specification for a web application API that separates the interface (policy) from the implementation (mechanism).

In Perl this was the job of HTTP::Engine used among others by Catalyst.

However, I was curious, so I looked at CPAN: would there be something new out there?

I found modules like Mojo, which internally uses an interface similar to WSGI. However, the most interesting things I found were PSGI and Plack.

Apparently HTTP::Engine is far from an ideal solution. I read that it is monolithic, difficult to adapt and not very efficient (for embedded environments, I guess), so Miyagawa decided to separate HTTP::Engine into three parts:
  1. A specification: PSGI
  2. A reference implementation: Plack::Server
  3. Tools: Plack::*
The most interesting thing about Plack and PSGI is the pace at which they were implemented. Only weeks ago it was an idea, and there are already reference implementations that let applications run standalone under Plack, either in a single process or preforked. There are also interfaces for FastCGI, CGI and, of course, mod_perl. As if this were not enough, PSGI can work with non-blocking I/O, so there are servers for POE, AnyEvent and Coro. There is even a PSGI module for Apache (mod_psgi).

On the other hand, PSGI adapters have been developed for frameworks like Catalyst (Catalyst::Engine::PSGI), Squatting (Squatting::On::PSGI), CGI::Application (CGI::Application::PSGI), Dancer and even WebGUI (PlebGUI). There are also tools to help in the migration from other technologies. If you have an application written for HTTP::Engine, you can use it virtually unchanged in PSGI with HTTP::Engine::Interface::PSGI. If you have a CGI application, you can migrate it with little modification using CGI::PSGI. And if even this is too much work, you can use CGI::Emulate::PSGI, which lets you run a CGI program as a server from the command line!

In the last post I made a toy POD document server and chose to implement it as a CGI. I guess that some people may have had problems making it work, because it needed a running web server with the right CGI configuration in place.

Using CGI::Emulate::PSGI, you only need to write a small program to start the server (perldocweb_starter):

use CGI::Emulate::PSGI;
my $app = CGI::Emulate::PSGI->handler(sub { do "perldocweb" });

and then run the command plackup:

$ plackup perldocweb_starter
Plack::Server::Standalone: Accepting connections at http://0:5000/

Now we have our documentation server running on port 5000, so when browsing to:

http://localhost:5000/perldocweb?pod=PSGI

the PSGI specification should appear in the browser. Easy, right?

If you are willing to modify your code, the emulator is not necessary: the application can be executed directly by plackup, and it will be much more efficient.

The first modification is to change line 4 to use CGI::PSGI. I also no longer use CGI::Carp, because Plack has a much more elegant way to display errors, using Devel::StackTrace::AsHTML.

When using CGI::PSGI, the program must create (and return) a closure that will be our application, so the main code between lines 20 and 50 of the old code must go inside a closure. Line 20 must also now initialize a CGI::PSGI object, so I replaced it with lines 20 to 22 of the new application:

 1 #!/usr/bin/perl
 2 
 3 use Modern::Perl;
 4 use CGI::PSGI;
 5 use IO::File;
 6 use Pod::Simple::Search;
 7 use Pod::Simple::HTML;
 8 
 9 my %content_types = (
10     RTF   => "application/rtf",
11     LaTeX => "application/x-latex",
12     PDF   => "application/pdf",
13 );
14 my @wikis   = qw(Usemod Twiki Template Kwiki Confluence Moinmoin Tiddlywiki Mediawiki Textile);
15 my %formats = (
16     ( map { $_ => "Pod::Simple::$_" } keys %content_types ),
17     ( map { $_ => "Pod::Simple::Wiki::$_" } @wikis )
18 );
19 
20 my $app = sub {
21     my $env      = shift;
22     my $q        = CGI::PSGI->new($env);
23     my $filename = Pod::Simple::Search->new->inc(1)->find( $q->param("pod") );
24     my $format   = $q->param("format") || "HTML";
25     given ($format) {
26         when ("source") {
27             return [ $q->psgi_header("text/plain"), IO::File->new($filename) ];
28         }
29         when ('HTML') {
30             my $parser = Pod::Simple::HTML->new;
31             $parser->perldoc_url_prefix( $q->url( -path_info => 1 ) . "?pod=" );
32             my $footer = "<hr>"
33                 . join( " ", map { make_link( $_, $q ) } "source", keys %content_types )
34                 . " | Wiki formats: "
35                 . join( " ", map { make_link( $_, $q ) } @wikis );
36             $parser->html_footer(qq[\n<!-- end doc -->\n\n$footer</body></html>\n]);
37             $parser->output_string( my $output );
38             $parser->parse_file($filename);
39             return [ $q->psgi_header("text/html"), [$output] ];
40         }
41         when (%formats) {
42             my $class = $formats{$format};
43             eval "require $class";
44             my $parser = $class->new;
45             $parser->output_string( my $output );
46             $parser->parse_file($filename);
47             return [ $q->psgi_header( $content_types{$format} || "text/plain" ), [$output] ];
48         }
49         default {
50             die("Formato desconocido '$format'");
51         }
52     }
53 };
54 
55 sub make_link {
56     my $fmt = shift;
57     my $q   = shift;
58     $q->a( { href => $q->url( -path_info => 1, -query => 1 ) . "\&format=$fmt" }, $fmt );
59 }

The closure receives a PSGI environment as a parameter (line 21) and uses it to create the object $q, which we can use like a regular CGI object.
This closure must return a reference to an array of three elements; the first two are produced by $q->psgi_header:
  1. The status code
  2. The headers: an array of alternating header names and values
  3. The body: an array of lines or an IO::Handle object
A major difference between CGI::PSGI and CGI is that in the latter anything written to STDOUT is sent to the browser, whereas in the former the body is returned.
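To make the shape of the returned value concrete, here is a small sketch of how a server might consume a response whose body is either an array of strings or a handle. In PSGI terms the response carries a status code, a header list and a body. The body_to_string helper and the temporary file are inventions of this example, not part of Plack:

```perl
use strict;
use warnings;
use IO::File;
use File::Temp qw(tempfile);

# Collect a PSGI body into one string, whether it is an array
# reference of chunks or a handle that supports getline().
sub body_to_string {
    my ($body) = @_;
    return join( "", @$body ) if ref $body eq 'ARRAY';
    my $text = "";
    while ( defined( my $line = $body->getline ) ) {
        $text .= $line;
    }
    $body->close;
    return $text;
}

# A response whose body is an array of strings.
my $from_array = [ 200, [ 'Content-Type' => 'text/plain' ], [ "one\n", "two\n" ] ];

# A response whose body is a file handle (a temporary file for this sketch).
my ( $tmp, $path ) = tempfile( UNLINK => 1 );
print {$tmp} "one\ntwo\n";
close $tmp;
my $from_handle = [ 200, [ 'Content-Type' => 'text/plain' ], IO::File->new( $path, "r" ) ];

for my $res ( $from_array, $from_handle ) {
    my ( $status, $headers, $body ) = @$res;
    print "$status: ", length( body_to_string($body) ), " bytes\n";
}
```

Both responses carry the same eight bytes; only the way the body is delivered differs.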

So the generation of content in the application must change. The source code case (line 26) is simpler than the CGI version, because I just need to return the headers together with an IO::Handle object (created with IO::File). The PSGI server is responsible for reading the data from the handle and sending it to the browser. If the handle is a real file (as in this case), the operating system has sendfile(2) (as Linux does) and the server takes advantage of it, the data is sent entirely by the kernel, so in principle there should be little difference in efficiency between this program and one optimized in C (like Apache).

In the HTML case (line 29) I replaced output_fh with output_string to store the content generated by Pod::Simple in $output, which is returned in line 39.

As I can no longer use STDOUT to send the content to the browser, I cannot use the $class->filter shortcut of Pod::Simple, so I replaced it with its equivalent in lines 44 to 46 of the new application.
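The pattern of collecting the converter's output in a string can be tried on its own. This is a minimal sketch that assumes a made-up POD snippet and passes a scalar reference to output_string, which is the documented form:

```perl
use strict;
use warnings;
use Pod::Simple::HTML;

# A tiny POD document to convert (made up for this sketch).
my $pod = "=head1 NAME\n\nDemo - a made-up module\n\n=cut\n";

my $parser = Pod::Simple::HTML->new;
$parser->output_string( \my $html );    # collect the output here instead of STDOUT
$parser->parse_string_document($pod);

# $html can now be returned as the body of a PSGI response.
my $response = [ 200, [ 'Content-Type' => 'text/html' ], [$html] ];
print length($html), " bytes of HTML generated\n";
```

Nothing is printed to STDOUT by the parser itself, which is exactly what a PSGI application needs.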

Although perhaps not obvious, the code returns the closure (line 20) because it is the last computed value in the file.
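The "last computed value" behaviour can be seen with Perl's own do FILE, which is roughly what a loader has to do with an application file. The sketch below is only an illustration under that assumption: it writes a hypothetical application to a temporary file and loads it back.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a tiny application file (hypothetical contents) to a temporary path.
my ( $fh, $path ) = tempfile( SUFFIX => '.psgi', UNLINK => 1 );
print {$fh} <<'END_APP';
my $app = sub {
    my $env = shift;
    return [ 200, [ 'Content-Type' => 'text/plain' ], ["Hello\n"] ];
};
END_APP
close $fh;

# "do FILE" evaluates the file and returns its last computed value,
# which here is the code reference assigned to $app.
my $app = do $path;
die "Could not load $path: ", ( $@ || $! ) unless ref $app eq 'CODE';

my $res = $app->( {} );
print "status $res->[0], body: $res->[2][0]";
```

If the file ended with some other statement, for example a print, the loader would receive that statement's value instead of the closure.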

If we call our new program "server_pod", you can start it with plackup as follows:

$ plackup server_pod
Plack::Server::Standalone: Accepting connections at http://0:5000/

After that, the POD content can be browsed as shown before. plackup is using its default server (Plack::Server::Standalone), a single-threaded process that is ideal for development or personal use. If you need a production-quality server you should look at the other options; for migrated CGI code I recommend Plack::Server::Standalone::Prefork, which can be started like this:

$ plackup -s Standalone::Prefork server_pod
Plack::Server::Standalone: Accepting connections at http://0:5000/

That was easy. Default values are used for everything, but if you need to tune the server you can pass options to plackup; each server has specific options documented in its implementation class.

Finally, this code is much more efficient than the emulator in the first example, because it does not need temporary files to capture the standard output, and you can still run it in CGI, FastCGI or even mod_perl mode under Apache.

Next time I will improve the application to use Plack and its middleware.

PSGI y Plack: el futuro de las aplicaciones web

This article should not have been published here; I moved it to Perliscopio, where it belongs. Sorry for the inconvenience.

Saturday, November 21, 2009

Processing POD with Pod::Simple

[Original Spanish content]
In the last article we easily translated POD to HTML for a minimal documentation server using CGI. Today I will expand the application to display POD documents in a dozen different ways.

A useful option when I read documentation at CPAN is the ability to display the source code of the modules, so I'll add a link to view the source of a document. I'll put the link at the bottom of the document by setting the footer of the HTML conversion, and I must also add logic to recognize the new type of link.

I will add a format parameter to the query, which will be interpreted at line 12. To be compatible with the previous version, I will make this parameter optional, defaulting to HTML (line 11):

 1 #!/usr/bin/perl
 2 
 3 use Modern::Perl;
 4 use CGI;
 5 use CGI::Carp 'fatalsToBrowser';
 6 use Pod::Simple::Search;
 7 use Pod::Simple::HTML;
 8 
 9 my $q        = new CGI;
10 my $filename = Pod::Simple::Search->new->inc(1)->find( $q->param("pod") );
11 my $format   = $q->param("format") || "HTML";
12 given ($format) {
13     when ("source") {
14         print $q->header("text/plain");
15         open POD, $filename;
16         print $_ while (<POD>);
17     }
18     when ('HTML') {
19         my $parser = Pod::Simple::HTML->new;
20         print $q->header("text/html");
21         $parser->perldoc_url_prefix( $q->url( -path_info => 1 ) . "?pod=" );
22         my $footer = "<hr>" . make_link("source");
23         $parser->html_footer(qq[\n<!-- end doc -->\n\n$footer</body></html>\n]);
24         $parser->output_fh(*STDOUT);
25         $parser->parse_file($filename);
26     }
27     default {
28         die("Formato desconocido '$format'");
29     }
30 }
31 
32 sub make_link {
33     my $fmt = shift;
34     $q->a( { href => $q->url( -path_info => 1, -query => 1 ) . "\&format=$fmt" }, $fmt );
35 }

This is much bigger than our previous application; however, this architecture will prove its flexibility very soon when combined with Pod::Simple and friends. Showing source code is a snap: just send the header (line 14) and then the rest of the file without further processing.

The make_link subroutine helps in the creation of links with the format parameter, using the URL being visited (including the query). Though it is used only once (line 22), we'll use it more as we add conversion formats to the application.

Another module used was CGI::Carp, with the "fatalsToBrowser" option, which sends fatal errors to the browser. If you want to try this, just request an unknown format and see the error message in the browser.

Having said that, let's translate POD to wiki markup. I will use "Pod::Simple::Wiki", which has converters for at least 9 different wiki formats, so no matter if you use Mediawiki or Twiki, you can always write your articles in POD :-)

Since Perl is dynamic, flexible and easy, I'm going to add all the formats at once, for which I need an array with all the supported formats (line 9) and a map of formats associated with their POD translators (line 10):

 1 #!/usr/bin/perl
 2
 3 use Modern::Perl;
 4 use CGI;
 5 use CGI::Carp 'fatalsToBrowser';
 6 use Pod::Simple::Search;
 7 use Pod::Simple::HTML;
 8 
 9 my @wikis   = qw(Usemod Twiki Template Kwiki Confluence Moinmoin Tiddlywiki Mediawiki Textile);
10 my %formats = (
11     ( map { $_ => "Pod::Simple::Wiki::$_" } @wikis )
12 );
13 
14 my $q        = new CGI;
15 my $filename = Pod::Simple::Search->new->inc(1)->find( $q->param("pod") );
16 my $format   = $q->param("format") || "HTML";
17 given ($format) {
18     when ("source") {
19         print $q->header("text/plain");
20         open POD, $filename;
21         print $_ while (<POD>);
22     }
23     when ('HTML') {
24         my $parser = Pod::Simple::HTML->new;
25         print $q->header("text/html");
26         $parser->perldoc_url_prefix( $q->url( -path_info => 1 ) . "?pod=" );
27         my $footer = "<hr>" . make_link("source")
28             . " | Wiki formats: "
29             . join( " ", map { make_link($_) } @wikis );
30         $parser->html_footer(qq[\n<!-- end doc -->\n\n$footer</body></html>\n]);
31         $parser->output_fh(*STDOUT);
32         $parser->parse_file($filename);
33     }
34     when (%formats) {
35         my $class = $formats{$format};
36         eval "require $class";
37         print $q->header( "text/plain" );
38         $class->filter($filename);
39     }
40     default {
41         die("Formato desconocido '$format'");
42     }
43 }
44 
45 sub make_link {
46     my $fmt = shift;
47     $q->a( { href => $q->url( -path_info => 1, -query => 1 ) . "\&format=$fmt" }, $fmt );
48 }

Most of the work is done when one of the new formats is recognized at line 34, where we get the "Pod::Simple::Wiki" class that implements the translation and then dynamically require this class through eval (line 36). Thus we don't have to load all the wiki translators at the beginning of the program, only the one needed for the requested translation. Then we send the content type and the translated (filtered) POD to the browser.
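Loading a class by name deserves a little care, because the name ends up inside a string eval and a failed require is otherwise silent. A possible sketch follows; load_class is a hypothetical helper, File::Spec stands in for a converter class because it ships with Perl, and the missing class name is invented:

```perl
use strict;
use warnings;

# Load a class by name at run time and return the name on success.
sub load_class {
    my ($class) = @_;
    # Only accept plain package names, since the name is interpolated into eval.
    die "Invalid class name '$class'\n" unless $class =~ /\A\w+(?:::\w+)*\z/;
    # The trailing "1" makes eval return true only when require succeeded.
    eval "require $class; 1" or die "Cannot load $class: $@";
    return $class;
}

# File::Spec ships with Perl, so this succeeds everywhere.
print "loaded ", load_class("File::Spec"), "\n";

# A class that does not exist (hypothetical name) raises a catchable error.
my $ok = eval { load_class("Pod::Simple::Wiki::NoSuchFormat"); 1 };
print "missing class rejected\n" unless $ok;
```

In the application this check is less critical, because only keys of %formats ever reach the eval, but it makes failures visible when a converter is not installed.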

Finally I include links to the different formats in the footer, which is done during the generation of the HTML page (lines 27 to 29).

If you want to include some of the documentation in a printed manual, you probably want to convert POD for tools better suited to this work. Let's translate POD to RTF and LaTeX, which should not be very difficult because there are already classes on CPAN to do this. The first step is to generalize the content type sent to the browser, so that it can be used for different formats:

37         print $q->header( $content_types{$format} || "text/plain" );

This assumes that there is a hash that associates the formats with their content types. We'll also use this map to create links to the new types of content:

27         my $footer = "<hr>" . make_link("source")
28             . join( " | ", map { make_link($_) } keys %content_types )
29             . " Wiki formats: "
30             . join( " | ", map { make_link($_) } @wikis );

The content type map may be added at the beginning:

 9 my %content_types = (
10     RTF    => "application/rtf",
11     LaTeX  => "application/x-latex",
12 );

and don't forget that every format must be listed in the %formats hash to be recognized and processed:

14 my %formats = (
15     ( map { $_ => "Pod::Simple::$_" } keys %content_types ),
16     ( map { $_ => "Pod::Simple::Wiki::$_" } @wikis )
17 );

Now you can convert to RTF, which will most likely open in your favorite office suite, while LaTeX will probably be offered as a download.

I will include a final format: PDF. This will be more complex because there is no CPAN module to translate POD to PDF, so I'm going to write one based on Pod::Simple (line 5), which uses LaTeX as an intermediate format to create the PDF.

 1 package Pod::Simple::PDF;
 2 
 3 use Modern::Perl;
 4 use Pod::Simple::LaTeX;
 5 use base "Pod::Simple";
 6 
 7 use File::Temp;
 8 use File::Spec::Functions;
 9 use IO::File;
10 use IO::Handle;
11 
12 sub new {
13     my $class = shift;
14     return bless { output_fh => \*STDOUT }, ref $class || $class;
15 }
16 
17 sub parse_file {
18     my $self = shift;
19     my $file = shift;
20 
21     my $dir      = File::Temp->newdir();
22     my $tex_name = catfile( $dir, "pod.tex" );
23     my $texf     = IO::File->new( $tex_name, "w" );
24     my $parser   = Pod::Simple::LaTeX->new;
25     $parser->output_fh($texf);
26     $parser->parse_file($file);
27     $texf->close;
28     `cd '$dir'; pdflatex '$tex_name'; pdflatex '$tex_name'`;
29     my $in = IO::File->new( catfile( $dir, "pod.pdf" ), "r" );
30     $self->{'output_fh'}->print($_) while readline($in);
31 }
32 
33 1;

Perhaps the reason why there is no PDF converter module is that there is no very portable way to do it. I'll use the pdflatex tool that is part of TeX Live, because I suppose it can be installed on both Unix and Windows; any modern Unix TeX distribution should include this tool.

The parse_file method creates a temporary directory using File::Temp->newdir, then creates the file pod.tex (within the temporary directory), which stores the result of the conversion performed with Pod::Simple::LaTeX. This file is then processed with the 'pdflatex' command (line 28), which produces the file 'pod.pdf' (and some other useless files) in the temporary directory.

Many things can go wrong at line 28, because the simple method I'm using to execute the tool has very little control over what happens there. An improved implementation should use modules like IPC::Run3 to control the execution of the tool and act appropriately on any failure that might occur. Still, one of the interesting features of Perl is that you can make prototypes like this quickly and refine them later.
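Even without extra modules, some of that control is available from the built-in system function and the $? variable. The following is just a sketch of the idea, not a replacement for IPC::Run3; run_tool is a hypothetical helper, and the running perl interpreter plays the role of the external tool so the example does not depend on pdflatex being installed:

```perl
use strict;
use warnings;

# Run an external command without a shell and die with a useful
# message if it cannot be started or exits with a non-zero status.
sub run_tool {
    my @cmd = @_;
    my $rc = system { $cmd[0] } @cmd;
    die "Cannot start '$cmd[0]': $!\n" if $rc == -1;
    die "'$cmd[0]' was killed by signal " . ( $? & 127 ) . "\n" if $? & 127;
    die "'$cmd[0]' exited with status " . ( $? >> 8 ) . "\n" if $? >> 8;
    return 1;
}

# Stand-in commands for the sketch: the running perl plays the external tool.
run_tool( $^X, "-e", "exit 0" );
my $ok = eval { run_tool( $^X, "-e", "exit 3" ); 1 };
print "failure detected: $@" unless $ok;
```

Passing the command as a list also avoids the quoting problems of building a shell command line from file names.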

In lines 29 and 30 'pod.pdf' is sent to the browser, and when the parse_file method returns, the $dir variable goes out of scope and the File::Temp object is destroyed, deleting the temporary directory along with everything inside it.
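The cleanup on scope exit is easy to observe in isolation. A minimal sketch, using a bare block in place of the method body:

```perl
use strict;
use warnings;
use File::Temp;
use File::Spec::Functions qw(catfile);

my $path;
{
    my $dir = File::Temp->newdir();    # object that stringifies to the directory name
    $path = "$dir";
    open my $fh, ">", catfile( $dir, "pod.tex" ) or die "Cannot write: $!";
    close $fh;
    print "inside the block:  ", ( -d $path ? "exists" : "gone" ), "\n";
}
# $dir went out of scope, so the directory and its contents were removed.
print "outside the block: ", ( -d $path ? "exists" : "gone" ), "\n";
```

The first line reports that the directory exists and the second that it is gone, without any explicit cleanup code.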

This module must be stored in the right place, which is where CPAN would put it if the module were packaged as instructed in perlmodlib (though for testing you may just put PDF.pm in the same directory as the file that implements Pod::Simple::HTML).

Finally, a new content type (PDF) must be added. This is now very easy: just add it to the %content_types hash (line 12) and that's it, we have a server capable of showing POD in over a dozen formats:

 1 #!/usr/bin/perl
 2 
 3 use Modern::Perl;
 4 use CGI;
 5 use CGI::Carp 'fatalsToBrowser';
 6 use Pod::Simple::Search;
 7 use Pod::Simple::HTML;
 8 
 9 my %content_types = (
10     RTF    => "application/rtf",
11     LaTeX  => "application/x-latex",
12     PDF    => "application/pdf",
13 );
14 my @wikis   = qw(Usemod Twiki Template Kwiki Confluence Moinmoin Tiddlywiki Mediawiki Textile);
15 my %formats = (
16     ( map { $_ => "Pod::Simple::$_" } keys %content_types ),
17     ( map { $_ => "Pod::Simple::Wiki::$_" } @wikis )
18 );
19 
20 my $q        = new CGI;
21 my $filename = Pod::Simple::Search->new->inc(1)->find( $q->param("pod") );
22 my $format   = $q->param("format") || "HTML";
23 given ($format) {
24     when ("source") {
25         print $q->header("text/plain");
26         open POD, $filename;
27         print $_ while (<POD>);
28     }
29     when ('HTML') {
30         my $parser = Pod::Simple::HTML->new;
31         print $q->header("text/html");
32         $parser->perldoc_url_prefix( $q->url( -path_info => 1 ) . "?pod=" );
33         my $footer = "<hr>"
34             . join( " ", map { make_link($_) } "source", keys %content_types )
35             . " | Wiki formats: "
36             . join( " ", map { make_link($_) } @wikis );
37         $parser->html_footer(qq[\n<!-- end doc -->\n\n$footer</body></html>\n]);
38         $parser->output_fh(*STDOUT);
39         $parser->parse_file($filename);
40     }
41     when (%formats) {
42         my $class = $formats{$format};
43         eval "require $class";
44         print $q->header( $content_types{$format} || "text/plain" );
45         $class->filter($filename);
46     }
47     default {
48         die("Formato desconocido '$format'");
49     }
50 }
51 
52 sub make_link {
53     my $fmt = shift;
54     $q->a( { href => $q->url( -path_info => 1, -query => 1 ) . "\&format=$fmt" }, $fmt );
55 }

Sunday, November 15, 2009

Perl documentation tools

[Original Spanish source]
Perl has its own documentation format called POD (Plain Old Documentation). This format is structured and was specifically designed to be easily manipulated. POD is used not only as a tool for documenting Perl, but also as a wiki language and even for writing books.

In Perl the most popular tool for reading documentation is perldoc, which works in the same way as the Unix man(1). To show the IO::Handle module documentation:

$ perldoc IO::Handle

We can get the manuals in LaTeX or HTML format just by adding options to perldoc:

$ perldoc -T -o LaTeX IO::Handle > IO::Handle.tex
$ perldoc -T -o html IO::Handle > IO::Handle.html

If you look at the generated HTML you will notice that the links point to CPAN (they are not relative to the processed file). This is just the perldoc way, but there are hundreds of modules to process POD, allowing advanced manipulation and conversion to HTML, XML, LaTeX, text and DocBook, among others.

When you need more control over the generation of documents, you can use other tools such as pod2html and pod2latex, which can create documents from multiple POD files processed together, for example to make a book where each chapter is stored in a different POD file.

If you need total control over the conversion process, you can always write a program using the modules from CPAN. One of the easiest to use is Pod::Simple, which offers several predefined conversions. For example, you can generate HTML in a CGI application with ease:

1 use CGI;
2 use Pod::Simple::HTML;
3 
4 my $q = new CGI;
5 my $parser = Pod::Simple::HTML->new;
6 $parser->output_fh(*STDOUT);
7 
8 print $q->header("text/html");
9 $parser->parse_file("/usr/share/perl/5.8/IO/File.pod");

This program initializes the CGI and Pod::Simple::HTML objects (lines 4 to 6), sends the HTTP headers (line 8) and finally sends the converted POD as an HTML document (line 9).

In this case you must know the exact name of the POD file you want to send. If you only know the name of a module, you have to look for the file that documents it, but where? The answer is: in the same places where perl looks for its modules and programs.

The @INC variable contains the places where perl looks for the modules used in programs. It is a combination of locations predefined when perl was compiled, the contents of the PERL5LIB environment variable, and places specified with "use lib" in the Perl code. On the other hand, when you run a program, it is searched for along the PATH environment variable. So, to find the file containing the POD for a Perl module or program, you can use a function like find_pod, shown below:

 1 use Modern::Perl;
 2 use Env::Path;
 3 use File::Spec::Functions;
 4 
 5 sub find_pod
 6 {
 7     my $module = shift;
 8     my @module_path = split("::", $module);
 9     for my $dir ( @INC, Env::Path->PATH->List ) {
10         for my $ext ( '', '.pod', '.pm', '.pl' ) {
11             my $name = catfile($dir, @module_path) . $ext;
12             return $name if -e $name;
13         }
14     }
15     return undef;
16 }
17 
18 print "Nombre: ", find_pod(@ARGV), "\n";

This function receives the name of the module or program, splits it on "::" and then iterates over all the directories in @INC and in the system's PATH environment variable, which is converted to a list using "Env::Path->PATH->List" (line 9). In each directory it looks for the name alone and then with the extensions .pod, .pm and .pl. The first match found is returned, or undef if none is found.

I used "Env::Path" to get the system PATH in a portable way, and "File::Spec::Functions", which exports "catfile", to build pathnames that are also portable between Unix and Windows.
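If Env::Path is not installed, the same search can be sketched with core modules only, since File::Spec->path also returns the PATH directories as a list. The find_file name is made up for this variant, and references in @INC (hooks) are simply skipped:

```perl
use strict;
use warnings;
use File::Spec;

# Search @INC and the PATH directories for a module or program name.
sub find_file {
    my ($name) = @_;
    my @parts = split /::/, $name;
    for my $dir ( grep { !ref } @INC, File::Spec->path ) {
        for my $ext ( '', '.pod', '.pm', '.pl' ) {
            my $file = File::Spec->catfile( $dir, @parts ) . $ext;
            return $file if -f $file;
        }
    }
    return;
}

print "strict is at: ", find_file("strict"), "\n";
```

The exact path printed depends on where Perl is installed on the machine.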

But I made this just for fun, because CPAN already has something better: "Pod::Simple::Search", which is well done and can be easily installed from your favorite mirror. It is far more flexible than my toy subroutine, and I will use it to improve the code so that it can show PODs by module or program name:

#!/usr/bin/perl
use CGI;
use Pod::Simple::HTML;
use Pod::Simple::Search;

my $q = new CGI;
my $parser = Pod::Simple::HTML->new;
$parser->output_fh(*STDOUT);

my $filename = Pod::Simple::Search->new->inc(1)->find($q->param("pod"));
print $q->header("text/html");
$parser->parse_file($filename);

If you have a web server already configured, just copy the file into the cgi-bin directory with the name "perldocweb" and make it executable. You can test it by using the following URL in your favorite browser:

http://localhost/cgi-bin/perldocweb?pod=IO::File

It will show the IO::File manual, although the links still point to CPAN.

To fix the links we must set perldoc_url_prefix to point to our documentation server. I will use CGI's url() method, as shown in line 12, which returns the full script URL (without the query):

 1 #!/usr/bin/perl
 2 use CGI;
 3 use Pod::Simple::HTML;
 4 use Pod::Simple::Search;
 5 
 6 my $q = new CGI;
 7 my $parser = Pod::Simple::HTML->new;
 8 $parser->output_fh(*STDOUT);
 9 
10 my $filename = Pod::Simple::Search->new->inc(1)->find($q->param("pod"));
11 print $q->header("text/html");
12 $parser->perldoc_url_prefix($q->url(-path_info=>1) . "?pod=");
13 $parser->parse_file($filename);

So far so good: a fairly simple documentation server in just 13 lines. Next time I will convert POD to many formats; meanwhile you can install Pod::Server, which shows a better and more elegant way to build a documentation server.

Thursday, November 5, 2009

Perl error handling

[Original Spanish article]

Exception handling in Perl is a bit different from what we are probably used to. In particular, Perl has no try/catch/throw like some other languages, but that doesn't mean it can't do exception handling: Perl can catch and handle exceptions as well as any other language, just with a slightly different structure.

Exception handling in Perl is based on the eval operator, which allows the evaluation of code and the catching of errors. When eval receives a string, it compiles the code inside it and executes it. Any error that happens in that code, from compilation to execution, aborts only the eval, while our program continues its execution. For example:

1 use Modern::Perl;
2 my $result = eval( "5 / 0" );
3 say "El resultado es: $result";

Although the program works, the result of eval is undef, because the division by zero prevented the return of any value. This also causes a warning on line 3 about the use of an uninitialized value.

What we need to know is whether the eval was successful or not, and that information is in the special variable $@ (also known as $EVAL_ERROR if we use the module English).

1 use Modern::Perl;
2 my $result = eval( "5 / 0" );
3 if ( $@ ) {
4     say "Ooops: $@";
5 }
6 else {
7     say "El resultado es: $result";
8 }

The problem with this solution is that the code within the string is not checked at compile time, because it is compiled at run time. Although this is extremely powerful, in most cases we are only interested in eval's ability to catch errors. The second form of eval takes a block of code that is checked during compilation of the program, and we can use it like this:

2 my $result = eval { 5 / 0 };

In this form of eval, the braces ({}) mark the block where exception handling is required. It returns the value of the last expression of this block, or undef if an error occurs while executing it (the block has already been compiled together with the containing program, so only run-time errors are left to catch).
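Because eval returns undef on failure, a common idiom is to end the block with a true value and test the result of eval itself, instead of relying only on $@, which can be overwritten before it is checked in some situations. A small sketch, where safe_divide is a name invented for the example:

```perl
use strict;
use warnings;

# Divide two numbers; print the error and return nothing when the division fails.
sub safe_divide {
    my ( $x, $y ) = @_;
    my $result;
    # The block ends with a true value, so eval returns false only on error.
    my $ok = eval { $result = $x / $y; 1 };
    if ( !$ok ) {
        my $error = $@ || "unknown error";
        print "Ooops: $error";
        return;
    }
    return $result;
}

print "10 / 4 = ", safe_divide( 10, 4 ), "\n";
safe_divide( 5, 0 );
```

The first call prints the quotient, and the second prints the "Illegal division by zero" message that Perl stored in $@.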

The last primitive we need to complete Perl's exception system is die, which throws an exception. This routine receives a value that is assigned to the variable $@, so we could make a program that throws an exception like this:

use Modern::Perl;
use IO::File;

eval {
    my $fh = IO::File->new("AlgunArchivo.txt", "r");
    die("No se puede abrir") unless $fh;
};
if ( $@ ) {
    say "Ooops: $@";
}
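The value given to die does not have to be a string: any reference can be thrown, which lets the handler inspect structured information. The sketch below uses a plain hash reference; the open_config helper, the ENOFILE code and the path are all made up for the illustration:

```perl
use strict;
use warnings;

# Hypothetical helper that fails with a structured exception.
sub open_config {
    my ($path) = @_;
    open my $fh, "<", $path
        or die +{ code => "ENOFILE", file => $path, message => "$!" };
    return $fh;
}

eval { open_config("/no/such/dir/AlgunArchivo.txt"); 1 } or do {
    my $err = $@;
    if ( ref $err eq 'HASH' ) {
        print "Ooops ($err->{code}): cannot open $err->{file}\n";
    }
    else {
        print "Ooops: $err";
    }
};
```

Exception classes on CPAN build on the same mechanism, throwing blessed objects instead of plain hashes.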

Some people may think that this way of capturing exceptions is archaic. However, it is as good as any other, and with the facilities of Perl it can be used as the basis for a structure similar to that of other languages, something like try/catch. As I have already said in other articles, Perl is an excellent language for implementing new features based on the language primitives, so I will roll my own version of try/catch just for fun:

 1 use Modern::Perl;
 2 use IO::File;
 3
 4 sub try(&) {
 5     eval { shift->() };
 6 }
 7
 8 sub catch(&) {
 9     if ( $@ ) {
10         local $_ = $@;
11         shift->();
12     }
13 }
14
15 try {
16     my $fh = IO::File->new( "AlgunArchivo.txt", "r" );
17     die("No se puede abrir") unless $fh;
18 };
19 catch {
20     say "Ooops: $_";
21 };


Here the Perl prototype (&) makes the subroutines try and catch receive a closure as their first argument, and also lets us omit the sub keyword, so try and catch look like control structures with an associated code block while they are just plain subroutines whose first parameter is a closure. At line 5 the first argument is removed (with shift) and the closure is executed (with ->()) inside the eval, so any exception inside the closure code will abort the eval and exit the try subroutine.

When used after a try, catch copies any value of $@ into a localized $_ and runs the closure, which can use $_ as the value of the exception.
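The effect of the (&) prototype can be seen with a smaller example. The sketch below uses a made-up subroutine named twice, which runs its block two times and can be called either with a bare block or with an explicit anonymous sub:

```perl
use strict;
use warnings;

# With the (&) prototype the first argument may be written as a bare block.
sub twice(&) {
    my $code = shift;
    $code->() for 1 .. 2;
}

my $count = 0;
twice { $count++ };           # block syntax, enabled by the prototype
twice( sub { $count++ } );    # the explicit form is still accepted
print "the block ran $count times\n";
```

Both calls pass a code reference as the first argument; the prototype only changes how the call may be written.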

To make an extension that lets us reuse the newly created structures, we just write a new module. I will call it MyTryCatch, and it should live in the file "MyTryCatch.pm":


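package MyTryCatch;

use strict;
use warnings;
use Exporter 'import';    # provides the import method that does the exporting

our $VERSION   = "1.000";
our @EXPORT_OK = qw( try catch );
our @EXPORT    = @EXPORT_OK;

# try BLOCK: run the block, trapping any exception in $@.
sub try(&) {
    eval { shift->() };
}

# catch BLOCK: run the block with $_ set to the exception, if there was one.
sub catch(&) {
    if ( $@ ) {
        local $_ = $@;
        shift->();
    }
}

1;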

Thus we may use the new structure in any program easily:

use Modern::Perl;
use IO::File;
use MyTryCatch;

try {
    my $fh = IO::File->new( "AlgunArchivo.txt", "r" );
    die("No se puede abrir") unless $fh;
};
catch {
    say "Ooops: $_";
};

The primitives just created have some defects. For example, they allow the use of a catch without a try, and a return statement within a try or catch block will exit the block and not the enclosing subroutine, among others. However, with a bit more effort we could make an extension that declares a structure that behaves better.
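The problem with return is easy to reproduce. In this sketch, first_positive is a hypothetical function that looks as if it returns the first positive number, but the return only leaves the anonymous block passed to try:

```perl
use strict;
use warnings;

# Same definition as the try primitive in this article.
sub try(&) {
    eval { shift->() };
}

sub first_positive {
    for my $n (@_) {
        # This return leaves only the anonymous block, not first_positive.
        try { return $n if $n > 0 };
    }
    return "none";
}

print first_positive( -1, 5 ), "\n";    # prints "none"
```

A real control structure would have returned 5 here, which is one reason why more elaborate modules exist.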

There are several CPAN modules for exception handling, from the simple Try::Tiny, which shares some of the drawbacks of MyTryCatch, to the more complex TryCatch, which uses deep magic from Devel::Declare to build an exception handling structure with almost anything you can imagine.

If your requirements are not demanding, my recommendation is to use Try::Tiny: it is tiny, has almost no dependencies and is easy to install. On the other hand, if you want an exception handling system that does everything, you do not mind much about resource consumption and you have the patience to install dozens of modules, you can use TryCatch.