Sunday, November 15, 2009

Perl documentation tools

[Original Spanish source]
Perl has its own documentation format called POD (Plain Old Documentation). The format is structured and was specifically designed to be easy to manipulate. POD is used not only to document Perl, but also as a wiki language and even for writing books.
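
To give an idea of what POD looks like in practice, here is a minimal sketch of a script that carries its own documentation. The sub name hello and the text of the POD sections are invented for this illustration; only the "=head1", "=head2" and "=cut" directives and the C<...> formatting code come from POD itself.

```perl
use strict;
use warnings;

=head1 NAME

hello.pl - builds a greeting (hypothetical example script)

=head1 SYNOPSIS

    perl hello.pl

=head2 hello($name)

Returns the string C<Hello, $name!>.

=cut

# perl skips everything between a POD directive and "=cut",
# so documentation and code can live in the same file.
sub hello {
    my ($name) = @_;
    return "Hello, $name!";
}

print hello("world"), "\n";
```

Saved as hello.pl, the documentation of such a file should be readable with "perldoc ./hello.pl", while "perl hello.pl" runs the code.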

The most popular tool for reading Perl documentation is perldoc, which works much like the Unix man(1) command. For example, to show the documentation for the IO::Handle module:

$ perldoc IO::Handle

You can get the manuals in LaTeX or HTML format just by adding options to perldoc:

$ perldoc -T -o LaTeX IO::Handle > IO::Handle.tex
$ perldoc -T -o html IO::Handle > IO::Handle.html

If you look at the generated HTML, you will notice that the links point to CPAN (they are not relative to the processed file). That is just how perldoc works, but there are hundreds of modules for processing POD that allow advanced manipulation and conversion to HTML, XML, LaTeX, plain text and DocBook, among others.

When you need more control over document generation, you can use other tools such as pod2html and pod2latex, which create documents from multiple POD files processed together, for example to make a book where each chapter is stored in a separate POD file.

If you need total control over the conversion process, you can always write your own program using modules from CPAN. One of the easiest to use is Pod::Simple, which offers several predefined conversions. For example, you can easily generate HTML in a CGI application:

1 use CGI;
2 use Pod::Simple::HTML;
3 
4 my $q = CGI->new;
5 my $parser = Pod::Simple::HTML->new;
6 $parser->output_fh(*STDOUT);    # write the generated HTML to standard output
7 
8 print $q->header("text/html");
9 $parser->parse_file("/usr/share/perl/5.8/IO/File.pod");

This program initializes the CGI and Pod::Simple::HTML objects (lines 4 to 6), sends the HTTP headers (line 8) and finally sends the converted POD as an HTML document (line 9).
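
The same converter can also be tried without a web server. As a minimal sketch, the snippet below converts a POD string to HTML in memory using Pod::Simple's output_string and parse_string_document methods. The helper name pod_to_html and the sample POD text are made up for this illustration.

```perl
use strict;
use warnings;
use Pod::Simple::HTML;

# Convert a string of POD into an HTML document held in memory.
# pod_to_html is a hypothetical helper name, not part of Pod::Simple.
sub pod_to_html {
    my ($pod) = @_;
    my $parser = Pod::Simple::HTML->new;
    my $html   = '';
    $parser->output_string(\$html);          # collect output in $html instead of a filehandle
    $parser->parse_string_document($pod);    # parse POD from a string instead of a file
    return $html;
}

# Sample POD text, invented for the example.
print pod_to_html("=head1 NAME\n\nDemo - a tiny POD sample\n\n=cut\n");
```

Collecting the output in a string makes it easy to inspect or post-process the HTML before printing it.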

In this case you must know the exact name of the POD file you want to send. If you only know the name of a module, you first have to find the file that contains its documentation, but where? The answer is: in the same places where perl looks for its modules and programs.

The @INC variable contains the places where perl looks for the modules used in programs. It is a combination of locations fixed when perl was compiled, the contents of the PERL5LIB environment variable and the places specified with "use lib" in the Perl code. Programs, on the other hand, are looked up along the PATH environment variable. So, to find the file containing the POD for a Perl module or program, you can use a function like find_pod, shown below:

 1 use Modern::Perl;
 2 use Env::Path;
 3 use File::Spec::Functions;
 4 
 5 sub find_pod
 6 {
 7     my $module = shift;
 8     my @module_path = split(/::/, $module);
 9     for my $dir ( @INC, Env::Path->PATH->List ) {
10         for my $ext ( '', '.pod', '.pm', '.pl' ) {
11             my $name = catfile($dir, @module_path) . $ext;
12             return $name if -f $name;    # plain files only, skip directories
13         }
14     }
15     return undef;
16 }
17 
18 print "Name: ", find_pod($ARGV[0]) // "not found", "\n";

This function receives the name of a module or program and splits it on "::". It then iterates over all the directories in @INC and in the system's PATH environment variable, which is converted to a list using "Env::Path->PATH->List" (line 9). In each directory it looks for the bare name and for the name with the extensions .pod, .pm and .pl. The first match found is returned, or undef if none is found.

I used "Env::Path" to get the system PATH in a portable way, and "File::Spec::Functions", which exports "catfile", to build pathnames that are portable between Unix and Windows.
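
To see which directories such a search would walk on a given machine, a small sketch like the following may help. It swaps Env::Path for File::Spec->path, a core alternative that also returns the PATH entries as a list. The helper names module_dirs and program_dirs are invented for this example.

```perl
use strict;
use warnings;
use File::Spec;

# Directories perl searches for modules. @INC may also hold code
# references (import hooks), so keep only the plain strings.
sub module_dirs {
    return grep { !ref } @INC;
}

# Directories searched for programs, taken from PATH in a portable way.
sub program_dirs {
    return File::Spec->path;
}

print "Module dir:  $_\n" for module_dirs();
print "Program dir: $_\n" for program_dirs();
```

The output depends entirely on the local installation and environment, so it is worth running with and without PERL5LIB set to see how @INC changes.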

But I wrote this just for fun, because CPAN already has something better: "Pod::Simple::Search", which is well made and can be easily installed from your favorite mirror. It is far more flexible than my toy subroutine, and I will use it to improve the CGI program so that it can show PODs by module or program name:

 1 #!/usr/bin/perl
 2 use CGI;
 3 use Pod::Simple::HTML;
 4 use Pod::Simple::Search;
 5 
 6 my $q = CGI->new;
 7 my $parser = Pod::Simple::HTML->new;
 8 $parser->output_fh(*STDOUT);
 9 
10 my $filename = Pod::Simple::Search->new->inc(1)->find(scalar $q->param("pod"));
11 print $q->header("text/html");
12 $parser->parse_file($filename);
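
The lookup on line 10 can also be tried on its own from the command line. The sketch below wraps it in a helper, locate_pod, a name invented for this example, and assumes that find returns an undefined value when nothing matches.

```perl
use strict;
use warnings;
use Pod::Simple::Search;

# Return the file that holds the POD for a module or program name,
# or undef when the search finds nothing.
sub locate_pod {
    my ($name) = @_;
    return scalar Pod::Simple::Search->new->inc(1)->find($name);
}

# Look up the name given on the command line, or Pod::Simple by default.
my $name = shift(@ARGV) // 'Pod::Simple';
my $file = locate_pod($name);
print defined $file ? "$name: $file\n" : "$name: no POD found\n";
```

The path printed depends on where the module is installed locally.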

If you already have a web server configured, just copy the file into the cgi-bin directory with the name "perldocweb" and make it executable. You can then test it by opening the following URL in your favorite browser:

http://localhost/cgi-bin/perldocweb?pod=IO::File

It will show the IO::File manual, although the links still point to CPAN.

To fix the links we must set perldoc_url_prefix to point to our own documentation server. I will use CGI's url() method, as shown in line 12, which returns the full script URL (without the query string):

 1 #!/usr/bin/perl
 2 use CGI;
 3 use Pod::Simple::HTML;
 4 use Pod::Simple::Search;
 5 
 6 my $q = CGI->new;
 7 my $parser = Pod::Simple::HTML->new;
 8 $parser->output_fh(*STDOUT);
 9 
10 my $filename = Pod::Simple::Search->new->inc(1)->find(scalar $q->param("pod"));
11 print $q->header("text/html");
12 $parser->perldoc_url_prefix($q->url(-path_info=>1) . "?pod=");
13 $parser->parse_file($filename);
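
Note that the listing passes the "pod" parameter straight to the search. As a hedged sketch of one possible precaution, the helper below accepts only names shaped like Perl module names or simple program names. The name is_safe_pod_name and the pattern are assumptions made for this example, not part of CGI or Pod::Simple, and the pattern is deliberately strict (it would reject a program name containing a hyphen, for instance).

```perl
use strict;
use warnings;

# Accept names such as "IO::File" or "perldoc": word characters,
# optionally separated by "::". Anything else is rejected.
sub is_safe_pod_name {
    my ($name) = @_;
    return 0 unless defined $name;
    return $name =~ /\A\w+(?:::\w+)*\z/ ? 1 : 0;
}

for my $candidate ('IO::File', 'perldoc', '../../etc/passwd') {
    printf "%-20s %s\n", $candidate,
        is_safe_pod_name($candidate) ? 'accepted' : 'rejected';
}
```

In the CGI script, a check like this could run before the search, with an error page sent when the name is rejected or when no file is found.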

So far so good: a fairly simple documentation server in just 13 lines. Next time I will convert POD to many formats. Meanwhile, you can install Pod::Server, which shows a better and more elegant way to build a documentation server.
