Generate PDF Add-On

Generate a PDF file from a single TWiki page

Introduction

This is a substantial re-write of the TWiki:Codev.PrintUsingPDF pdf script. It essentially accomplishes the same goal but with more integration with TWiki (e.g. use of TWiki variables for configuration and the TWiki rendering methods instead of using a hacked version of the TWiki::view for generating HTML) as well as more modular code structure (i.e. most of the logic is moved to a perl package).

We began using the TWiki:Codev.PrintUsingPDF pdf script to publish pages but found it somewhat TWiki unfriendly (no offense to the authors as it's a great idea). This version allows most of the script configuration (such as page size, table of contents, etc.) via preference variables in TWiki. It also allows header, footer, and title page configuration via TWiki topics.

Usage

  • The simplest way to activate PDF printing, is to add genpdf to front of the SKIN setting. This will replace the PatternSkin 'Print Version' action with 'Generate PDF'
    • eg Set SKIN=genpdf,pattern
  • Like the TWiki:Codev.PrintUsingPDF pdf script, a page can be published by substituting genpdf for view in the topic URL.
  • To make it even easier for novice TWiki users to use, we added a link (like edit and attach) to our view.tmpl to publish the current page (using the current topic as the document title). The line we added to our template is:
    • <a href="%SCRIPTURLPATH{genpdf}%/%WEB%/%TOPIC%?pdftitle=%TOPIC%">PDF</a>
  • You may also choose to replace the Printable (?skin=print) targets in your favourite skin with
[[%SCRIPTURL{genpdf}%/%WEB%/%TOPIC%][Printable]]
  • NEW You can now generate a PDF document containing all of the descendents of the base topic as separate chapters. For example, if you create a ParentTopic, then create FirstChild and SecondChild with ParentTopic as their parent topic, then create GrandChildOne with FirstChild as its parent (and so on), you get a tree like this:
ParentTopic
   - FirstChild
      - GrandChildOne
      - GrandChildTwo
   - SecondChild
      - GrandChildThree
If you add '?pdfrecursive=on' to the URL parameters, all of the topics will be rolled into the PDF.

Configuring the Script

The script can be configured either via URL parameters or web preference variables. URL parameters have precedence over web preference variables. If neither of these are present, the script will use hard-coded default variables. The general script configuration variables are explained in the next section while header/footer and title page configuration are explained in subsequent sections.

General Configuration

The following table shows the various configuration variables and their meaning. The first column gives the variable to use if passed in the URL. The second column shows the variable to use if using a TWiki preference variable (i.e. Set VARIABLE = ). The third column gives the default value if neither the URL nor TWiki preference variable is used. Note that URL variables have precedence over TWiki preference variables. For a more detailed description of the HTMLDOC related variables, see the HTMLDOC documentation at http://www.htmldoc.org.

URL Variable Twiki Preference VariableSorted ascending Default Value Example Explanation
pdfbanner GENPDFADDON_BANNER   Foobar Documentation System Used to override the banner of a title page.
pdfbodycolor GENPDFADDON_BODYCOLOR undef #CCff99 Specify the background colour of all pages
pdfbodyimage GENPDFADDON_BODYIMAGE   http://my.server.com/path/to/background.jpeg The image that will appear tiled in the background of every page
cover GENPDFADDON_COVER print print.nat Default cover to use for PDF generation
pdfduplex GENPDFADDON_DUPLEX undef True Option flag to set up the document for duplex printing. Headers and footers will swap position on alternating pages. Set it to anything for true.
pdffooter GENPDFADDON_FOOTER   .1. Specify content of footer, see http://www.htmldoc.org
pdfformat GENPDFADDON_FORMAT pdf14 pdf12 HTMLDOC output format
pdfheader GENPDFADDON_HEADER   .1. Specify content of header, see http://www.htmldoc.org
pdfheadershift GENPDFADDON_HEADERSHIFT 0 +3 Shift all headers up or down (for negative values) by this amount (e.g. H1 would become H3 for a value of 2).
pdfheadertopic GENPDFADDON_HEADERTOPIC   GenPDFHeaderFooterTopic The name of a topic that defines headers and footers using <-- HEADER LEFT "foobar" --> syntax
pdfheadfootfont GENPDFADDON_HEADFOOTFONT   Helvetica-Bold Font specification for headers and footers.
pdfheadfootsize GENPDFADDON_HEADFOOTSIZE   12 Sets the size of the header and footer text in points (1 point = 1/72nd inch)
pdfkeywords GENPDFADDON_KEYWORDS %FORMFIELD{"KeyWords"}% 'foo, bar, baz, zip' Used for PDF Keywords META info to help search engines
pdflogoimage GENPDFADDON_LOGOIMAGE   http://my.server.com/path/to/logo.jpeg The logo that will appear in a header or footer if you specify 'l' in the string (see http://www.htmldoc.org)
pdfmargins GENPDFADDON_MARGINS undef top:0.5in,bottom:2.5cm,left:12pt,right:15mm Specify the page margins (white space to edge of page)
pdfnumberedtoc GENPDFADDON_NUMBEREDTOC undef True Option flag for getting numbered headings and Table of Contents. Set it to anything for true.
pdforientation GENPDFADDON_ORIENTATION portrait landscape The page orientation (e.g. landscape or portrait)
pdfpagesize GENPDFADDON_PAGESIZE a4 letter The page size for PDF output
pdfpermissions GENPDFADDON_PERMISSIONS undef print,no-copy PDF Security permissions to disable print/copy etc. By default the PDF is not protected.
pdfrecursive GENPDFADDON_RECURSIVE undef on Include children of the base topic in the PDF
skin GENPDFADDON_SKIN pattern nat Default skin to use for PDF generation
pdfstruct GENPDFADDON_STRUCT book webpage use book for structured topics, i.e. when rendering a bunch of topics recursively; use webpage when printing a topic without a specific heading structure, i.e. if it is just a normal webpage or if it has got a special VIEW_TEMPLATE
pdfsubject GENPDFADDON_SUBJECT %FORMFIELD{"TopicHeadline"}% 'Foobar document creation' Used for PDF Subject META info to help search engines
pdfsubtitle GENPDFADDON_SUBTITLE   A short guide to creating foobar documents  
pdftitle GENPDFADDON_TITLE   Writing Foobars  
pdftitletopic GENPDFADDON_TITLETOPIC   GenPDFTitleTopic The name of a topic that defines the layout of the title page
pdftocfooter GENPDFADDON_TOCFOOTER ..i .i. See http://www.htmldoc.org/
pdftocheader GENPDFADDON_TOCHEADER ... l.. See http://www.htmldoc.org/
pdftoclevels GENPDFADDON_TOCLEVELS 5 3 Number of levels to include in the PDF table of contents (use 0 to disable the generation of a table of contents). Note that HTMLDOC generates a table of contents based on HTML headers in the page.
pdfwidth GENPDFADDON_WIDTH 860 1060 The pixel width of the browser (used to scale images--images wider than this will be truncated)

If using TWiki preference variables, copy them to the appropriate web preferences page.

  • Settings for the GenPDFAddOn Plugin
    • #Set GENPDFADDON_BANNER =
    • #Set GENPDFADDON_TITLE =
    • #Set GENPDFADDON_SUBTITLE =
    • #Set GENPDFADDON_HEADERTOPIC =
    • #Set GENPDFADDON_TITLETOPIC =
    • #Set GENPDFADDON_SKIN =
    • #Set GENPDFADDON_COVER =
    • #Set GENPDFADDON_RECURSIVE =
    • #Set GENPDFADDON_FORMAT =
    • #Set GENPDFADDON_TOCLEVELS =
    • #Set GENPDFADDON_PAGESIZE =
    • #Set GENPDFADDON_ORIENTATION =
    • #Set GENPDFADDON_WIDTH =
    • #Set GENPDFADDON_HEADERSHIFT =
    • #Set GENPDFADDON_KEYWORDS =
    • #Set GENPDFADDON_SUBJECT =
    • #Set GENPDFADDON_TOCHEADER =
    • #Set GENPDFADDON_TOCFOOTER =
    • #Set GENPDFADDON_HEADER =
    • #Set GENPDFADDON_FOOTER =
    • #Set GENPDFADDON_HEADFOOTFONT =
    • #Set GENPDFADDON_HEADFOOTSIZE =
    • #Set GENPDFADDON_BODYIMAGE =
    • #Set GENPDFADDON_LOGOIMAGE =
    • #Set GENPDFADDON_NUMBEREDTOC =
    • #Set GENPDFADDON_DUPLEX =
    • #Set GENPDFADDON_PERMISSIONS =
    • #Set GENPDFADDON_MARGINS =
    • #Set GENPDFADDON_BODYCOLOR =
    • #Set GENPDFADDON_STRUCT =

Limiting the PDF Generation Region

The add-on allows the user to define the region of the topic that should be included in the PDF generation (much like the TWiki %STARTINCLUDE% and %STOPINCLUDE% variables. In this case, HTML comments are used instead. Everything between these two comments will be included in the PDF generation. The rest of topic will be excluded.

  • Use <!-- PDFSTART --> to mark the starting point in the topic for PDF generation.
  • Use <!-- PDFSTOP --> to mark the stopping point in the topic for PDF generation.

Note that there can be multiple PDFSTART and PDFSTOP comment pairs in a single topic to selectively include/exclude multiple sections of the topic in the PDF document. (If you view this page in raw mode or edit it, you'll see an example of multiple PDFSTART/PDFSTOP sections to exclude the TWiki table of contents). If no PDFSTART/PDFSTOP comment pair appears in the topic, the entire topic text is used. In general, this should not be a problem except for title topics that include forms as the form meta-data will show up in a fairly illegible manner at the end of the document. Therefore, for topics that reference forms, a PDFSTART comment should be placed at the beginning of the topic and a PDFSTOP should be placed at the end.

TIP NOTE: all %META: tags are removed from the base topic. If you want to display form data, you should add %FORMFIELD{"field"}% tags to the topic or title topic.

Creating and Configuring a Title Page

The add-on allows the user to use a topic as a title page for PDF generation. Earlier versions of the add-on required that the title page be expressed using pure HTML as the title page topic was not TWiki rendered. The latest version of add-on, however, does full TWiki rendering of the title topic page like any other TWiki topic. In addition, the following variables can be passed with the URL to override their settings elsewhere (e.g. in the web preferences or TWiki preferences pages).

Also note that the PDFSTART and PDFSTOP HTML comments should be placed at the beginning and end of title topic. An example title page can be found at GenPDFExampleTitleTopic.

Creating and Configuring Headers and Footers

The add-on also allows the user to configure header and footer formats for both the main section of the document and the table of contents. Configuring the main header and footer is much like configuring a title page. You can select a TWiki topic to use for the header and footer. Remember to wrap the HTML comments that HTMLDOC uses for the header and footer between <!-- PDFSTART --> and <!-- PDFSTOP --> tags. The add-on will perform TWiki common variable substition within the HTMLDOC header/footer HTML comments. This will allow TWiki variables (such as %REVINFO{web="%WEB%" topic="%BASETOPIC%"}%) to be embedded in the headers and footers.

See the HTMLDOC documentation at http://www.htmldoc.org for details of the format of the header and footers. In addition, the genpdf script will perform variable substition for the %GENPDFADDON _BANNER%, %GENPDFADDON _TITLE%, and %GENPDFADDON _SUBTITLE% variables as it does for the title page. Finally, the PDFSTART and PDFSTOP HTML comments should be placed at the beginning and end of header/footer topic. An example header/footer page can be found at GenPDFExampleHeaderFooterTopic.

Frequently Asked Questions

How do I stop the table of contents from being generated?
For some topics, like User topics, it doesn't make any sense to have a table of contents generated so add 'pdftoclevels=0' as a URL parameter.
When I do a recursive PDF of WebHome it doesn't include all topics
That's because some topics distributed with TWiki don't have a parent association. If you really want to include every topic in the web, you should reparent them all with WebHome as the parent.

Add-On Installation Instructions

Note: You do not need to install anything on the browser to use this add-on. The following instructions are for the administrator who installs the add-on on the server where TWiki is running.

  • Install htmldoc from http://www.htmldoc.org/ (optionally use the patch in the Addon's zip file for headers on every page)
  • Download the ZIP file from the Add-on Home (see below)
  • Unzip GenPDFAddOn.zip in your twiki installation directory. Content:
    File: Description:
    data/TWiki/GenPDFAddOn.txt Add-on topic
    data/TWiki/GenPDFAddOnDemo.txt Example topic
    data/TWiki/GenPDFExampleTitleTopic.txt Example title topic
    data/TWiki/GenPDFExampleHeaderFooterTopic.txt Example header and footer topic
    bin/genpdf Add-on script
    lib/TWiki/Contrib/GenPDFAddOn.pm Add-on perl package
    lib/TWiki/Contrib/GenPDFAddOn/Config.spec  
    templates/view.genpdf.tmpl  

  • Adjust the script ownership to match your webserver configuration (e.g. chown nobody genpdf) if needed.
  • Make sure the script is executable (e.g chmod 755 genpdf).
  • Adjust the perl path in the genpdf script to match your perl installation location.
  • ALERT! BEFORE TWiki-4 Add the variable $htmldocCmd = "/path/to/htmldoc"; to your lib/TWiki.cfg file so TWiki can find the location of the htmldoc executable. Look for $fgrepCmd.
  • ALERT! TWiki-4 and later Configure the $TWiki::cfg{Extensions}{GenPDFAddOn}{htmldocCmd} = "/path/to/htmldoc"; using configure (in the Extensions section)
  • TIP Copy the preferences from above and paste them into TWikiPreferences, or the WebPreferences topic for a single web.
  • Test if the installation was successful:
    • Try loading this page
    • If it doesn't work, check your webserver logs for any problems. The most common issue is probably an htmldoc installation problem.

Known Bugs

  • Verbatim text runs off the page. This is a limitation of HTMLDOC. Preformatted text may run off the edge of the page and be truncated.
  • HTMLDOC crashes with segmentation faults. Eg it fails to generate TWikiDocumentation. I managed to get it to work a few times, but it generally fails. The error returned is Conversion failed: 'Inappropriate ioctl for device' at /var/www/twiki/lib/TWiki/Contrib/GenPDFAddOn.pm line XXX
  • Some pages don't have a header. HTMLDOC breaks the page for every level 1 heading (Eg. <h1>) but it doesn't write a header for the new page, so topics with lots of level 1 headings and not much content don't seem to have any headers. Therefore I patched htmldoc-1.8.24 to force a header for every new page:
*** htmldoc-1.8.24/htmldoc/ps-pdf.cxx   Sat Oct 30 05:53:59 2004
--- htmldoc-1.8.24/htmldoc/ps-pdf_force_header.cxx      Tue Jun 13 02:12:28 2005
***************
*** 1465,1471 ****

      pspdf_prepare_heading(page, print_page, pages[page].header, top,
                            page_text, sizeof(page_text),
!                         page > chapter_starts[chapter] ||
                              OutputType != OUTPUT_BOOK);
      pspdf_prepare_heading(page, print_page, pages[page].footer, 0,
                            page_text, sizeof(page_text));
--- 1465,1472 ----

      pspdf_prepare_heading(page, print_page, pages[page].header, top,
                            page_text, sizeof(page_text),
! /*                      page > chapter_starts[chapter] || */
!                         1 || /* force heading onto chapter front page */
                              OutputType != OUTPUT_BOOK);
      pspdf_prepare_heading(page, print_page, pages[page].footer, 0,
                            page_text, sizeof(page_text));

Add-On Info

  • Set SHORTDESCRIPTION = Generate a PDF file from a single TWiki page

Add-on Author: TWiki:Main/BrianSpinar, TWiki:Main/WadeTurland
Copyright: © 2005-2006 TWiki:Main.BrianSpinar
2008 TWiki:Main.SvenDowideit
© 2007-2011 TWiki:Main.TWikiContributors
License: GPL (GNU General Public License)
Add-on Version: 2011-01-24
Change History:  
2011-01-24: TWikibug:Item6638: Fix for TWiki-5.0; doc improvements; changing TWIKIWEB to SYSTEMWEB -- TWiki:Main.PeterThoeny
02 Jul 2008 added support for VIEW_TEMPLATE and COVER; fixed rendering of anchor and img tags; added pdfstruct parameter to print unstructured webpages as well; TWiki:Main.MichaelDaum
25 Jun 2008 added template activation and Configure script spec file TWiki:Main.SvenDowideit
25 Jun 2008 security and TWiki 4.2 fixes TWiki:Main.SvenDowideit
2 Nov 2007 Added new header and footer control (Bugs:Item4916) and fixed generation of wrong TWiki page (Bugs:Item4915)
23 Oct 2007 Fixed Bugs:Item4452 & Bugs:Item4885, compatibility with Perl 5.6 and missing images with SSL certificates
31 Aug 2007 Fixed Bugs:Item4530, improper rendering of lists
13196 Removed nop tags before sending to htmldoc, fixed Bugs:Item3549
11673 TWiki:Main/RickMach updated MIME type to pdf from x-pdf, fixed bug preventing disabling the TOC
9716 TWiki:Main/CrawfordCurrie added content-disposition header to files, so they download using a sensible file name
9683 TWiki:Main/CrawfordCurrie updated for TWiki-4
Version 0.6 (28 Jun 2005)
  • Less aggressive regex for removing twikiNewLink spans so it doesn't break when using the TWiki:Plugins.SpacedWikiWordPlugin
  • TIP Added 'recursive' preference to include chapters for all descendents of the base topic
  • Use File::Spec->tmpdir to specify the default directory for temporary files
Version 0.5 (16 Jun 2005)
  • Redirect to 'oops' URLs if permission denied or topic does not exist.
  • Removed twikiNewLink spans from title page so they render as normal text (without the blue ? mark).
  • Fully qualify image/href URLs on the title page.
  • Changed temp files to use OO style 'new File::Temp;' for better code portability.
Version 0.4 (13 Jun 2005)
  • Better security (now calls system($TWiki::htmldocCmd, @htmldocArgs) )
  • Checks return code of htmldoc and returns an error on failure
  • Validation of preferences
  • ALERT! Preferences changed to comply with TWikiPlugins standard
  • Better HTML3.2 compatibility for htmldoc 1.8.24 (downgrades a few elements)
  • Full integration of PDF META tags (optionally using 2 FORMFIELDs):
    • ==%FORMFIELD{"TopicHeadine"}%== for PDF Subject field
    • ==%FORMFIELD{"KeyWords"}%== for PDF Keywords field
    • all other PDF fields use topic info
  • More htmldoc options (margins, permissions, numbered TOC, logoimage, headfootfont) using preferences
  • Removed %TOC% fields so it only uses HTMLDOC's TOC
  • Title topic and header/footer topics are now fully rendered
  • Fixed the heading shifting function
  • Fully qualify links, making the document portable
  • HTMLDOC output goes to a temp file instead of stdout
  • Temp files now use 'GenPDFAddOn' prefix. (Eg. /tmp/GenPDFAddOn1wr3C48ibd.html)
Version 0.3 (12 Apr 2005)
  • Added full TWiki rendering to title topic page
  • Added TWiki common variable expansion to header/footer topic page
Version 0.2 (26 Mar 2005)
  • Fixed bug with table of contents generation (i.e. it was always generated even if pdftoclevels was set to 0)
  • Now allow multiple PDFSTART/PDFSTOP pairs within a single page to include/exclude multiple sections of the same page
  • Added Brent Robert's fix to allow operation with latest version (1.8.24) of HTMLDOC
Version 0.1 (30 Jan 2005)
  • Initial version
CPAN Dependencies: File::Temp (if for some reason you don't already have it installed )
Other Dependencies: HTMLDOC (http://www.htmldoc.org)
Perl Version: 5.005 or above
License: GPL
Add-on Home: http://TWiki.org/cgi-bin/view/Plugins/GenPDFAddOn
Feedback: http://TWiki.org/cgi-bin/view/Plugins/GenPDFAddOnDev
Appraisal: http://TWiki.org/cgi-bin/view/Plugins/GenPDFAddOnAppraisal

Related Topic: GenPDFAddOnDemo, GenPDFExampleHeaderFooterTopic, GenPDFExampleTitleTopic, TWikiAddOns

Edit | Attach | PDF | History: r5 < r4 < r3 < r2 < r1 | Backlinks | Raw View | More topic actions
Topic revision: r5 - 2012-04-03 - TWikiAdminUser
 
This site is powered by the TWiki collaboration platformCopyright © 1999-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback
Note: Please contribute updates to this topic on TWiki.org at TWiki:TWiki.GenPDFAddOn.