Saturday, November 21, 2009

Processing POD with Pod::Simple

[Original Spanish content]
In the last article we translated POD to HTML with a minimal documentation server built on CGI. Today I will expand the application so that POD documents can be viewed in over a dozen different formats.

A useful feature when reading documentation on CPAN is the ability to display the source code of a module, so I'll add a link to view the source of a document. The link goes at the bottom of the page, which means setting the footer of the HTML conversion, and the script needs new logic to recognize the new type of link.

I will add a format parameter to the query, which is interpreted starting at line 12. To stay compatible with the previous version, the parameter is optional and defaults to HTML (line 11):

 1 #!/usr/bin/perl
 2 
 3 use Modern::Perl;
 4 use CGI;
 5 use CGI::Carp 'fatalsToBrowser';
 6 use Pod::Simple::Search;
 7 use Pod::Simple::HTML;
 8 
 9 my $q        = new CGI;
10 my $filename = Pod::Simple::Search->new->inc(1)->find( $q->param("pod") );
11 my $format   = $q->param("format") || "HTML";
12 given ($format) {
13     when ("source") {
14         print $q->header("text/plain");
15         open POD, '<', $filename or die "Cannot open '$filename': $!";
16         print $_ while (<POD>);
17     }
18     when ('HTML') {
19         my $parser = Pod::Simple::HTML->new;
20         print $q->header("text/html");
21         $parser->perldoc_url_prefix( $q->url( -path_info => 1 ) . "?pod=" );
22         my $footer = "<hr>" . make_link("source");
23         $parser->html_footer(qq[\n<!-- end doc -->\n\n$footer</body></html>\n]);
24         $parser->output_fh(*STDOUT);
25         $parser->parse_file($filename);
26     }
27     default {
28         die("Unknown format '$format'");
29     }
30 }
31 
32 sub make_link {
33     my $fmt = shift;
34     $q->a( { href => $q->url( -path_info => 1, -query => 1 ) . "\&format=$fmt" }, $fmt );
35 }

This is much bigger than our previous application, but the architecture will prove its flexibility very soon when combined with Pod::Simple and friends. Showing the source code is a snap: just send the header (line 14) and then the contents of the file without further processing (lines 15 and 16).

The make_link subroutine creates links that carry the format parameter by appending it to the URL being visited (including its query string). Although it is used only once here (line 22), we'll use it more as we add conversion formats to the application.
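
As a minimal sketch of the same idea outside CGI.pm, the helper below builds the link URL with plain string operations. The name format_url and the localhost URLs are hypothetical and not part of the script. Unlike make_link, it also drops a format parameter that is already present, which is an assumption about the desired behavior rather than something the script does:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the script above): build the URL of a
# format link from the URL currently being visited.
sub format_url {
    my ( $url, $fmt ) = @_;

    # Drop any format parameter already present so it is not repeated.
    1 while $url =~ s/([?&])format=[^&]*&?/$1/;

    # Remove a separator left dangling at the end of the URL.
    $url =~ s/[?&]\z//;

    # Use "?" for the first parameter and "&" for the following ones.
    my $sep = $url =~ /\?/ ? '&' : '?';
    return $url . $sep . "format=$fmt";
}

# Example URLs are made up for illustration.
print format_url( "http://localhost/cgi-bin/pod.pl?pod=CGI", "source" ), "\n";
print format_url( "http://localhost/cgi-bin/pod.pl?pod=CGI&format=HTML", "Kwiki" ), "\n";
```

The format names used in this application are plain words, so no URL escaping is done here; arbitrary values would need it.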

The other new module is CGI::Carp, loaded with the "fatalsToBrowser" option, which sends fatal errors to the browser. To try it, request an unknown format and see the error message in the browser.

With that in place, let's translate POD to wiki markup. I will use Pod::Simple::Wiki, which has converters for at least 9 different wiki formats, so no matter whether you use MediaWiki or TWiki, you can always write your articles in POD :-)

Since Perl is dynamic, flexible and easy, I'm going to add all the formats at once. For that I need an array with all the supported wiki formats (line 9) and a hash that maps each format to its POD translator class (line 10):

 1 #!/usr/bin/perl
 2
 3 use Modern::Perl;
 4 use CGI;
 5 use CGI::Carp 'fatalsToBrowser';
 6 use Pod::Simple::Search;
 7 use Pod::Simple::HTML;
 8 
 9 my @wikis   = qw(Usemod Twiki Template Kwiki Confluence Moinmoin Tiddlywiki Mediawiki Textile);
10 my %formats = (
11     ( map { $_ => "Pod::Simple::Wiki::$_" } @wikis )
12 );
13 
14 my $q        = new CGI;
15 my $filename = Pod::Simple::Search->new->inc(1)->find( $q->param("pod") );
16 my $format   = $q->param("format") || "HTML";
17 given ($format) {
18     when ("source") {
19         print $q->header("text/plain");
20         open POD, '<', $filename or die "Cannot open '$filename': $!";
21         print $_ while (<POD>);
22     }
23     when ('HTML') {
24         my $parser = Pod::Simple::HTML->new;
25         print $q->header("text/html");
26         $parser->perldoc_url_prefix( $q->url( -path_info => 1 ) . "?pod=" );
27         my $footer = "<hr>" . make_link("source")
28             . " | Wiki formats: "
29             . join( " ", map { make_link($_) } @wikis );
30         $parser->html_footer(qq[\n<!-- end doc -->\n\n$footer</body></html>\n]);
31         $parser->output_fh(*STDOUT);
32         $parser->parse_file($filename);
33     }
34     when (%formats) {
35         my $class = $formats{$format};
36         eval "require $class" or die $@;
37         print $q->header( "text/plain" );
38         $class->filter($filename);
39     }
40     default {
41         die("Unknown format '$format'");
42     }
43 }
44 
45 sub make_link {
46     my $fmt = shift;
47     $q->a( { href => $q->url( -path_info => 1, -query => 1 ) . "\&format=$fmt" }, $fmt );
48 }

Most of the work happens when one of the new formats is recognized at line 34. There we look up the Pod::Simple::Wiki class that implements the translation (line 35) and load it dynamically with a string eval (line 36). This way we don't have to load all the wiki translators at the beginning of the program, only the one needed for the requested translation. Then we send the content type (line 37) and the translated (filtered) POD (line 38) to the browser.
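
The dynamic loading step can also be sketched without a string eval. The load_class helper below is hypothetical: it checks that the name looks like a module name, converts it to a file path, and lets require die if the module cannot be loaded. File::Spec stands in for a wiki translator here only because it ships with Perl:

```perl
use strict;
use warnings;

# Hypothetical helper: load a class whose name is only known at run time.
sub load_class {
    my ($class) = @_;

    # Accept only names shaped like Foo::Bar::Baz.
    die "Invalid class name '$class'\n"
        unless $class =~ /\A\w+(?:::\w+)*\z/;

    # Foo::Bar becomes Foo/Bar.pm, the form require expects for a string.
    ( my $file = "$class.pm" ) =~ s{::}{/}g;
    require $file;    # dies with a message if the module is missing
    return $class;
}

# File::Spec is a stand-in for a translator class in this example.
my $class = load_class("File::Spec");
print $class->catfile( "docs", "pod.tex" ), "\n";
```

Because require dies on failure, a missing translator shows up as an error at the point of loading instead of as a confusing failure on the next method call.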

Finally, I include links to the different formats in the footer, which is built while generating the HTML page (lines 27 to 29).

If you want to include some of the documentation in a printed manual, you will probably want to convert POD to a format better suited to that job. Let's translate POD to RTF and LaTeX, which should not be very difficult because there are already classes on CPAN to do this. The first step is to generalize the content type sent to the browser so that it can vary by format:

37         print $q->header( $content_types{$format} || "text/plain" );

This assumes there is a hash that associates formats with their content types. We'll also use this hash to create the links to the new formats:

27         my $footer = "<hr>" . make_link("source")
28             . " | " . join( " | ", map { make_link($_) } keys %content_types )
29             . " Wiki formats: "
30             . join( " | ", map { make_link($_) } @wikis );

The content type hash can be added at the beginning:

 9 my %content_types = (
10     RTF    => "application/rtf",
11     LaTeX  => "application/x-latex",
12 );

And don't forget that every format must be listed in the %formats hash to be recognized and processed:

14 my %formats = (
15     ( map { $_ => "Pod::Simple::$_" } keys %content_types ),
16     ( map { $_ => "Pod::Simple::Wiki::$_" } @wikis )
17 );
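
To see what these two hashes end up containing, here is a small standalone sketch. It uses a shortened list of wikis (an assumption to keep the output brief) and prints each format with its translator class and the content type that would be sent, falling back to text/plain as the script does:

```perl
use strict;
use warnings;

my %content_types = (
    RTF   => "application/rtf",
    LaTeX => "application/x-latex",
);
my @wikis = qw(Kwiki Mediawiki);    # shortened list, for illustration only

# Same construction as in the script: one entry per format name.
my %formats = (
    ( map { $_ => "Pod::Simple::$_" } keys %content_types ),
    ( map { $_ => "Pod::Simple::Wiki::$_" } @wikis ),
);

for my $format ( sort keys %formats ) {
    # Wiki formats have no entry in %content_types, so they get text/plain.
    my $type = $content_types{$format} || "text/plain";
    printf "%-10s %-28s %s\n", $format, $formats{$format}, $type;
}
```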

Now you can convert to RTF, which will most likely open your favorite office suite; in the case of LaTeX, the browser will probably download the file.

I will include one final format: PDF. This one is more complex because I could not find a CPAN module that translates POD to PDF, so I'm going to write one myself, based on Pod::Simple (line 5), which uses LaTeX as an intermediate format to create the PDF.

 1 package Pod::Simple::PDF;
 2 
 3 use Modern::Perl;
 4 use Pod::Simple::LaTeX;
 5 use base "Pod::Simple";
 6 
 7 use File::Temp;
 8 use File::Spec::Functions;
 9 use IO::File;
10 use IO::Handle;
11 
12 sub new {
13     my $class = shift;
14     return bless { output_fh => \*STDOUT }, ref $class || $class;
15 }
16 
17 sub parse_file {
18     my $self = shift;
19     my $file = shift;
20 
21     my $dir      = File::Temp->newdir();
22     my $tex_name = catfile( $dir, "pod.tex" );
23     my $texf     = IO::File->new( $tex_name, "w" );
24     my $parser   = Pod::Simple::LaTeX->new;
25     $parser->output_fh($texf);
26     $parser->parse_file($file);
27     $texf->close;
28     `cd '$dir'; pdflatex '$tex_name'; pdflatex '$tex_name'`;
29     my $in = IO::File->new( catfile( $dir, "pod.pdf" ), "r" );
30     $self->{'output_fh'}->print($_) while readline($in);
31 }
32 
33 1;

Perhaps the reason there is no PDF converter module is that there is no very portable way to do it. I'll use the pdflatex tool that is part of TeX Live, because I assume it can be installed on both Unix and Windows; any modern Unix TeX distribution should include this tool.

The parse_file method creates a temporary directory using File::Temp->newdir (line 21) and then creates the file pod.tex inside it, which stores the result of the conversion performed with Pod::Simple::LaTeX (lines 22 to 27). This file is then processed with the pdflatex command (line 28), which produces the file pod.pdf (and some other files we don't need) in the temporary directory.

Many things can go wrong at line 28, because the simple method I'm using to execute the tool gives very little control over what happens there. An improved implementation should use a module like IPC::Run3 to control the execution of the tool and act appropriately on any failure. Still, one of the interesting features of Perl is that you can build prototypes like this quickly and refine them later.
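
As a rough illustration of what acting on failures could look like, the sketch below uses Perl's built-in system in list form instead of IPC::Run3, so it has no dependencies and bypasses the shell. The run_tool helper is hypothetical, and the running Perl interpreter ($^X) stands in for pdflatex so that the example works on a machine without TeX:

```perl
use strict;
use warnings;

# Hypothetical helper: run an external command and die on any failure.
sub run_tool {
    my @cmd = @_;

    # List form: arguments are passed as given, without shell quoting.
    my $status = system(@cmd);

    die "Failed to start '$cmd[0]': $!\n" if $status == -1;
    die sprintf( "'%s' was killed by signal %d\n", $cmd[0], $status & 127 )
        if $status & 127;

    my $exit = $status >> 8;
    die "'$cmd[0]' exited with status $exit\n" if $exit != 0;
    return 1;
}

# Stand-in for: run_tool( "pdflatex", $tex_name );
run_tool( $^X, "-e", "exit 0" );
print "tool finished cleanly\n";
```

With fatalsToBrowser enabled, a die raised this way would reach the browser as an error page instead of an empty or truncated PDF.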

In lines 29 and 30 pod.pdf is sent to the browser. When parse_file returns, the $dir variable goes out of scope and the File::Temp object is destroyed, deleting the temporary directory along with everything inside it.
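
The cleanup behavior is easy to check in isolation. This small sketch, with a bare block standing in for the body of parse_file, creates a temporary directory with a file inside it and tests whether the directory exists before and after the object goes out of scope:

```perl
use strict;
use warnings;
use File::Temp;
use File::Spec::Functions qw(catfile);

my $path;    # remembers the directory name after the object is gone
{
    my $dir = File::Temp->newdir();
    $path = "$dir";    # the object stringifies to the directory name

    # Put a file inside, as parse_file does with pod.tex.
    my $tex_name = catfile( $path, "pod.tex" );
    open my $fh, '>', $tex_name or die "Cannot write '$tex_name': $!";
    print {$fh} "placeholder\n";
    close $fh;

    print "inside the block: ", ( -d $path ? "exists" : "missing" ), "\n";
}

# $dir has been destroyed here, and the directory went with it.
print "after the block:  ", ( -d $path ? "exists" : "missing" ), "\n";
```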

The module must be stored in the right place, that is, where CPAN would install it if it were packaged as described in perlmodlib (for testing, you can simply put PDF.pm in the same directory as Pod::Simple::HTML).

Finally, the new content type (PDF) must be added. This is now very easy: just add it to the %content_types hash (line 12) and that's it. We have a server capable of showing POD in over a dozen formats:

 1 #!/usr/bin/perl
 2 
 3 use Modern::Perl;
 4 use CGI;
 5 use CGI::Carp 'fatalsToBrowser';
 6 use Pod::Simple::Search;
 7 use Pod::Simple::HTML;
 8 
 9 my %content_types = (
10     RTF    => "application/rtf",
11     LaTeX  => "application/x-latex",
12     PDF    => "application/pdf",
13 );
14 my @wikis   = qw(Usemod Twiki Template Kwiki Confluence Moinmoin Tiddlywiki Mediawiki Textile);
15 my %formats = (
16     ( map { $_ => "Pod::Simple::$_" } keys %content_types ),
17     ( map { $_ => "Pod::Simple::Wiki::$_" } @wikis )
18 );
19 
20 my $q        = new CGI;
21 my $filename = Pod::Simple::Search->new->inc(1)->find( $q->param("pod") );
22 my $format   = $q->param("format") || "HTML";
23 given ($format) {
24     when ("source") {
25         print $q->header("text/plain");
26         open POD, '<', $filename or die "Cannot open '$filename': $!";
27         print $_ while (<POD>);
28     }
29     when ('HTML') {
30         my $parser = Pod::Simple::HTML->new;
31         print $q->header("text/html");
32         $parser->perldoc_url_prefix( $q->url( -path_info => 1 ) . "?pod=" );
33         my $footer = "<hr>"
34             . join( " ", map { make_link($_) } "source", keys %content_types )
35             . " | Wiki formats: "
36             . join( " ", map { make_link($_) } @wikis );
37         $parser->html_footer(qq[\n<!-- end doc -->\n\n$footer</body></html>\n]);
38         $parser->output_fh(*STDOUT);
39         $parser->parse_file($filename);
40     }
41     when (%formats) {
42         my $class = $formats{$format};
43         eval "require $class" or die $@;
44         print $q->header( $content_types{$format} || "text/plain" );
45         $class->filter($filename);
46     }
47     default {
48         die("Unknown format '$format'");
49     }
50 }
51 
52 sub make_link {
53     my $fmt = shift;
54     $q->a( { href => $q->url( -path_info => 1, -query => 1 ) . "\&format=$fmt" }, $fmt );
55 }
