Printers.PDFWatermark History


Monday 20 June 2005, at 07:05 GMT+8 by Renaud -
Changed lines 1-2 from:

You want to add a watermark (background graphic) under a PDF.

to:

You want to add a watermark (background graphic) in a PDF.

Changed line 5 from:
to:
Monday 20 June 2005, at 06:45 GMT+8 by Renaud -
Deleted lines 0-1:

Abstract

Monday 20 June 2005, at 06:44 GMT+8 by Renaud -
Changed lines 3-158 from:

You want to add a watermark to all PDF created from a PDFprinter.

The issue

I got a request a few days ago from someone who needed to send invoices and other company documents over email. The problem was that all official documents were printed over pre-printed forms and for the emailed document to bear the company-imposed background, it had first to be printed, then each page scanned.

The main issue was, apart from the poor quality of the result, the time spent doing all the printing, scanning and converting manually. There had to be a better solution.

While the main application runs under Windows, the printer server is a linux box. That allowed the creation of a modified PDFprinter to allow documents printed through it to automatically include a watermark.

The setup

First, we will need to create a virtual PDFprinter under Samba on the linux box. Although all the details are shown below, check the PDFprinter article on this topic for more details on the various ways to do this.

You will also need to install pdftk, an open-source (GPL) command-line utility that runs on most operating systems (including Windows) and manipulates PDF files.

Installing pdftk on an old RedHat machine worked well straight from the source tarball provided at the bottom of the Build page. Note that you may have to run as root to be able to compile. For instance, on my machine, for pdftk v1.12:

# wget http://www.pdfhacks.com/pdftk/pdftk-1.12.tar.gz
# tar xzvf pdftk-1.12.tar.gz
# cd pdftk-1.12
# cd pdftk
# make -f Makefile.RedHat

Be patient while the software builds, as it takes a little while. Once compiled, you end up with a single executable, pdftk, that you can optionally reduce in size using strip to remove unused symbols, then copy into your /usr/bin directory with the proper execution rights (chmod 755).

# strip pdftk
# chmod 755 pdftk
# mv pdftk /usr/bin
# pdftk --help

The watermark

The watermark can be any PDF file. pdftk will only use the first page of the file as the watermark. This has its importance as we will see later.

http:/pub/images/PDFWatermark01sm.png

If you don't have a clean original source file for your company forms, consider re-making a clean one from a vector package like CorelDraw, Illustrator or even your simple office application rather than scanning an existing printed form: the scanned form will add a lot of weight to your final documents and, if you need to use it as a foreground rather than a background watermark, it will completely cover the page underneath.

The first tests

Now that you have a PDF of your watermark, print a PDF of a sample document from your invoicing application (for instance):

http:/pub/images/PDFWatermark02sm.png

Then use pdftk to add the watermark to our in.pdf:

# pdftk in.pdf background watermark.pdf output invoice.pdf

If you're lucky, the resulting invoice.pdf will look just as you want, with the invoice text nicely aligned over the form.
Unfortunately, reality may not be as rosy and your resulting file may look more like this:

http:/pub/images/PDFWatermark03sm.png http:/pub/images/PDFWatermark03detail.png

As you can see on the magnified detail, we have a problem: the alignment of the 2 images is off, resulting in the invoice data not being correctly aligned with the pre-printed table or fields present on the watermark (you should also first check that the paper size of your invoice is the same as the watermark to avoid creating more problems).
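One way to compare the paper sizes before overlaying is sketched below. It assumes the pdfinfo utility (from xpdf or poppler) is installed, which this article does not otherwise use; page_size is a hypothetical helper name.

```shell
#!/bin/sh
# page_size: read `pdfinfo` output on stdin and print the page size
# as WIDTHxHEIGHT in PostScript points (for example 595x842 for A4).
# Assumption: pdfinfo prints a line such as
#   Page size:      595 x 842 pts (A4)
page_size() {
    awk '/^Page size:/ { print $3 "x" $5; exit }'
}

# Usage (requires pdfinfo):
#   a=$(pdfinfo in.pdf | page_size)
#   b=$(pdfinfo watermark.pdf | page_size)
#   [ "$a" = "$b" ] || echo "paper sizes differ: $a vs $b"
```

Only the size reported for the first page is compared, which is enough here since pdftk uses the first page of the watermark only.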

Basic server setup using Samba

For the sake of simplicity, and for those lucky enough to already get what they need, let's see how to declare our PDFWatermark printer. Make sure that you have a world-writable (chmod 777) /tmp/pdf_out folder accessible by everyone on your LAN, as this folder will receive the output PDF.
Copy your watermark.pdf file into that directory.
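As a minimal sketch, the two preparation steps above could be scripted as follows. setup_pdf_out is a hypothetical helper name, and both arguments default to the paths used in this article.

```shell
#!/bin/sh
# setup_pdf_out [DIR] [WATERMARK]
# Create the shared output folder, make it world-writable and copy
# the watermark into it under the name the print command expects.
setup_pdf_out() {
    dir=${1:-/tmp/pdf_out}
    wm=${2:-watermark.pdf}
    mkdir -p "$dir" || return 1
    chmod 777 "$dir" || return 1
    cp "$wm" "$dir/watermark.pdf"
}

# Usage (as root, from the directory holding your watermark):
#   setup_pdf_out
```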

Create the virtual printer under samba by editing your smb.conf file usually found under /etc/samba/ and adding the following section:

[pdf_out]
comment = PDF output
path = /tmp/pdf_out
read only = No
guest only = Yes
guest ok = Yes

[PDFWatermark]
comment = PDF for pre-printed forms
path = /tmp/pdf_out
printable = Yes
guest ok = Yes
create mask = 0755
use client driver = Yes
print command = gs -dNOPAUSE -dBATCH -q -sDEVICE=pdfwrite -sOutputFile="%s.tmp.pdf" "%s"; \
pdftk "%s.tmp.pdf" background watermark.pdf output "%s.pdf"; \
rm "%s"; rm "%s.tmp.pdf";
lpq command =

The pdf_out share is accessible to everyone on the network. You may want a more restrictive setting depending on who is supposed to have access to the PDF.
The empty lpq command ensures that Windows gets no error when it queries the state of the print queue. Otherwise, the printer status will appear as "deleting", because Samba keeps trying to query a print queue that does not really exist.

Now add a printer from your Windows box and choose the PDFWatermark printer on the network. For the printer driver, I get very good results from the HP Color LaserJet 5500 PS. Unfortunately, its drivers are not included by default on Windows XP or even Windows Server 2003 and you may need to download them from the HP website. Another alternative is the already included Apple LaserWriter 12/640, although I generally prefer the former as it allows the creation of large colour PDF files (up to A3 size).

Note: I also usually make sure that the printer options in the Advanced and General tabs of the printer properties set the correct paper size and that TrueType fonts are downloaded to the device rather than substituted. This ensures that the created PDF will look exactly as intended.

Now you can print to your virtual PDF printer from any application: the watermark will be systematically added to the background of every page.
If you want to change the watermark, just copy another PDF over the watermark.pdf file in the shared pdf_out folder.

Fixing the page offsets

Now, for those who, like me, had a problem with the offset between the overlaid documents, there is a small fix: the document needs to go through a small filter that modifies the PostScript file passed to Samba from Windows.

Copy and paste the following script into a file called fixPS and move it to the /usr/bin directory after making it executable (chmod 755):

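#!/usr/bin/perl -w
use strict;
my $file = $ARGV[0] or die "usage: fixPS file.ps\n";
my $tmp = "$file.tmp";   # temporary copy, written next to the original
my $notDone = 1;
open(IN, '<', $file) or die $!;
open(OUT, '>', $tmp) or die $!;
while (<IN>) {
        print OUT $_;
        # only act on the PostScript %%Page: comments
        next unless $notDone && $_ =~ /%Page:/;
        print OUT "10 -13 translate\n";  # modify to suit your offset
        $notDone = 1;  # Change to 0 if effect is cumulative
}
close OUT;
close IN;
rename $tmp, $file or die $!;  # the fixed copy replaces the original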

This is not a state-of-the-art optimised Perl script, but it does the job nicely:

  • the file given as argument is opened and a temporary file is created.
  • each line of the input file is read and copied to the temporary file, while we watch for PostScript %%Page: comments.
  • after each such comment, we add an X Y translate call whose X and Y values depend on the magnitude of your offset problem. The units are points (1/72 inch). For X, a positive value offsets toward the right; for Y, a positive value offsets toward the top. The offset I used will move the data on the page about 3.5mm right and about 4.6mm down.
  • if you use a different PostScript driver than the one I suggested above, you may see the offset accumulating on successive pages. In that case, set $notDone to 0 inside the loop so the offset is corrected on the first page only.
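
If you measured the misalignment in millimetres, the values for translate can be derived with a small conversion, since one point is 1/72 inch and one inch is 25.4mm. The sketch below is only an illustration; mm_to_pt is a hypothetical helper name.

```shell
#!/bin/sh
# mm_to_pt MM
# Convert millimetres to PostScript points (1 pt = 1/72 inch = 25.4/72 mm),
# rounded to the nearest whole point.
mm_to_pt() {
    awk -v mm="$1" 'BEGIN { printf "%.0f\n", mm * 72 / 25.4 }'
}

# Usage: build the line that fixPS inserts after each %%Page: comment.
# A negative Y value moves the page content down.
#   echo "$(mm_to_pt 3.5) $(mm_to_pt -4.6) translate"
```

With the 3.5mm and 4.6mm measured above, this gives back the 10 and -13 used in the script.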

Now update your Samba configuration file:

[PDFWatermark]
...
print command = fixPS "%s";\
gs -dNOPAUSE -dBATCH -q -sDEVICE=pdfwrite -sOutputFile="%s.tmp.pdf" "%s"; \
pdftk "%s.tmp.pdf" background watermark.pdf output "%s.pdf"; \
rm "%s"; rm "%s.tmp.pdf";

You should now get a properly aligned PDF:

http:/pub/images/PDFWatermark04sm.png http:/pub/images/PDFWatermark04detail.png

Overlaying the Watermark

You may want or need to add the watermark on top of the document instead of the background:

  • you may encounter an issue with some applications that surround the printed text with fully white boxes. In that case, your resulting PDF will not look too good, with portions of the background being partially obscured by those unnecessary text boxes.
  • you need to add a signature that must look like it is overlapping the text for instance.

Implementing this is a bit trickier than before: we need to swap in.pdf and watermark.pdf on the pdftk command line, but remember that pdftk only uses the first page of the file given after background, so only one page of the printed document would come out of our virtual printer.

To solve this, we need to break the original PDF into single pages, then apply the overlaid watermark to each page before re-assembling them into a single PDF.
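
Before wiring this into Samba, the same three steps can be tried by hand on a test file. The function below is a sketch under a few assumptions: overlay_watermark is a hypothetical name, the watermark is a single-page PDF, and the document has fewer than 100 pages so that the PAGE%02d names sort correctly.

```shell
#!/bin/sh
# overlay_watermark IN.pdf WATERMARK.pdf OUT.pdf
# Put the watermark on top of every page of IN.pdf by using each
# page of the document as the *background* of the watermark.
overlay_watermark() {
    in=$1; wm=$2; out=$3
    work=$(mktemp -d) || return 1      # scratch directory for the single pages

    # 1. split the document into PAGE01, PAGE02, ...
    pdftk "$in" burst output "$work/PAGE%02d" || return 1

    # 2. lay each page underneath the watermark
    for page in "$work"/PAGE[0-9]*; do
        pdftk "$wm" background "$page" output "$page.TMP" || return 1
    done

    # 3. re-assemble the pages in order
    pdftk "$work"/PAGE*.TMP cat output "$out" || return 1

    # burst also leaves a doc_data.txt report behind
    rm -rf "$work" doc_data.txt
}

# Usage:
#   overlay_watermark in.pdf watermark.pdf invoice.pdf
```

More recent pdftk releases also offer a stamp operation that overlays a PDF directly; check pdftk --help on your version before relying on it.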

Just update your Samba configuration file:

[PDFWatermark]
...
print command = fixPS "%s";\
gs -dNOPAUSE -dBATCH -q -sDEVICE=pdfwrite -sOutputFile="%s.tmp.pdf" "%s"; \
pdftk "%s.tmp.pdf" burst output PAGE%02d; \
for N in PAGE*; \
do pdftk watermark.pdf background $N output $N.TMP; \
done; \
pdftk PAGE*.TMP cat output "%s.pdf"; \
rm PAGE*; rm doc_data.txt; \
rm "%s"; rm "%s.tmp.pdf";

Consult the pdftk documentation for more details on the various actions being performed here.

Links

  • The pdf toolkit official page
  • An extensive list of linux text utilities
  • Some PostScript tips and Problems
  • Create a PDF Service with Samba
to:

You want to add a watermark (background graphic) under a PDF.

There are 2 variations to this:

Tuesday 14 June 2005, at 17:15 GMT+8 by Renaud -
Added line 158:
  • Create a PDF Service with Samba
Sunday 15 May 2005, at 10:32 GMT+8 by Renaud -
Added line 1:
Sunday 06 February 2005, at 07:40 GMT+8 by Renaud -
Changed lines 1-156 from:

Describe PDFWatermark here.

to:

Abstract

You want to add a watermark to all PDF created from a PDFprinter.

The issue

I got a request a few days ago from someone who needed to send invoices and other company documents over email. The problem was that all official documents were printed over pre-printed forms and for the emailed document to bear the company-imposed background, it had first to be printed, then each page scanned.

The main issue was, apart from the poor quality of the result, the time spent doing all the printing, scanning and converting manually. There had to be a better solution.

While the main application runs under Windows, the printer server is a linux box. That allowed the creation of a modified PDFprinter to allow documents printed through it to automatically include a watermark.

The setup

First, we will need to create a virtual PDFprinter under Samba on the linux box. Although all the details are shown below, check the PDFprinter article on this topic for more details on the various ways to do this.

You will also need to install http://www.accesspdf.com/index.php?topic=pdftk pdftk, an Open Sourced GPL command line utility that runs on most OS (including Windows) and manipulates PDF files.

Installing pdftk on an old RedHat machine worked well straight from the source tarball provided at the bottom of the http://www.accesspdf.com/article.php/20041129180128366 Build page. Note that you may have to run as root to be able to compile. For instance, on my machine, for pdftk v1.12:

# wget http://www.pdfhacks.com/pdftk/pdftk-1.12.tar.gz
# tar xzvf pdftk-1.12.tar.gz
# cd pdftk-1.12
# cd pdftk
# make -f Makefile.RedHat

Be patient while the software builds, as it takes a little while. Once compiled, you end up with a single executable, pdftk, that you can optionally reduce in size using strip to remove unused symbols, then copy into your /usr/bin directory with the proper execution rights (chmod 755).

# strip pdftk
# chmod 755 pdftk
# mv pdftk /usr/bin
# pdftk --help

The watermark

The watermark can be any PDF file. pdftk will only use the first page of the file as the watermark. This has its importance as we will see later.

http:/pub/images/PDFWatermark01.png http:/pub/images/PDFWatermark01sm.png

If you don't have a clean original source file for your company forms, consider re-making a clean one from a vector package like CorelDraw, Illustrator or even your simple office application rather than scanning an existing printed form: the scanned form will add a lot of weight to your final documents and, if you need to use it as a foreground rather than a background watermark, it will completely cover the page underneath.

The first tests

Now that you have a PDF of your watermark, print a PDF of a sample document from your invoicing application (for instance):

http:/pub/images/PDFWatermark02.png http:/pub/images/PDFWatermark02sm.png

Then use pdftk to add the watermark to our in.pdf:

# pdftk in.pdf background watermark.pdf output invoice.pdf

If you're lucky, the resulting invoice.pdf will look just as you want, with the invoice text nicely aligned over the form.
Unfortunately, reality may not be as rosy and your resulting file may look more like this:

http:/pub/images/PDFWatermark03.png http:/pub/images/PDFWatermark03sm.png http:/pub/images/PDFWatermark03detail.png

As you can see on the magnified detail, we have a problem: the alignment of the 2 images is off, resulting in the invoice data not being correctly aligned with the pre-printed table or fields present on the watermark (you should also first check that the paper size of your invoice is the same as the watermark to avoid creating more problems).

Basic server setup using Samba

For the sake of simplicity, and for those lucky enough to already get what they need, let's see how to declare our PDFWatermark printer. Make sure that you have a world-writable (chmod 777) /tmp/pdf_out folder accessible by everyone on your LAN, as this folder will receive the output PDF.
Copy your watermark.pdf file into that directory.

Create the virtual printer under samba by editing your smb.conf file usually found under /etc/samba/ and adding the following section:

[pdf_out]
comment = PDF output
path = /tmp/pdf_out
read only = No
guest only = Yes
guest ok = Yes

[PDFWatermark]
comment = PDF for pre-printed forms
path = /tmp/pdf_out
printable = Yes
guest ok = Yes
create mask = 0755
use client driver = Yes
print command = gs -dNOPAUSE -dBATCH -q -sDEVICE=pdfwrite -sOutputFile="%s.tmp.pdf" "%s"; \
pdftk "%s.tmp.pdf" background watermark.pdf output "%s.pdf"; \
rm "%s"; rm "%s.tmp.pdf";
lpq command =

The pdf_out share is accessible to everyone on the network. You may want a more restrictive setting depending on who is supposed to have access to the PDF.
The empty lpq command ensures that Windows gets no error when it queries the state of the print queue. Otherwise, the printer status will appear as "deleting", because Samba keeps trying to query a print queue that does not really exist.

Now add a printer from your Windows box and choose the PDFWatermark printer on the network. For the printer driver, I get very good results from the HP Color LaserJet 5500 PS. Unfortunately, its drivers are not included by default on Windows XP or even Windows Server 2003 and you may need to download them from the http://www.hp.com/ HP website. Another alternative is the already included Apple LaserWriter 12/640, although I generally prefer the former as it allows the creation of large colour PDF files (up to A3 size).

Note: I also usually make sure that the printer options in the Advanced and General tabs of the printer properties set the correct paper size and that TrueType fonts are downloaded to the device rather than substituted. This ensures that the created PDF will look exactly as intended.

Now you can print to your virtual PDF printer from any application: the watermark will be systematically added to the background of every page.
If you want to change the watermark, just copy another PDF over the watermark.pdf file in the shared pdf_out folder.

Fixing the page offsets

Now, for those who, like me, had a problem with the offset between the overlaid documents, there is a small fix: the document needs to go through a small filter that modifies the PostScript file passed to Samba from Windows.

Copy and paste the following script into a file called fixPS and move it to the /usr/bin directory after making it executable (chmod 755):

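#!/usr/bin/perl -w
use strict;
my $file = $ARGV[0] or die "usage: fixPS file.ps\n";
my $tmp = "$file.tmp";   # temporary copy, written next to the original
my $notDone = 1;
open(IN, '<', $file) or die $!;
open(OUT, '>', $tmp) or die $!;
while (<IN>) {
        print OUT $_;
        # only act on the PostScript %%Page: comments
        next unless $notDone && $_ =~ /%Page:/;
        print OUT "10 -13 translate\n";  # modify to suit your offset
        $notDone = 1;  # Change to 0 if effect is cumulative
}
close OUT;
close IN;
rename $tmp, $file or die $!;  # the fixed copy replaces the original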

This is not a state-of-the-art optimised Perl script, but it does the job nicely:

  • the file given as argument is opened and a temporary file is created.
  • each line of the input file is read and copied to the temporary file, while we watch for PostScript %%Page: comments.
  • after each such comment, we add an X Y translate call whose X and Y values depend on the magnitude of your offset problem. The units are points (1/72 inch). For X, a positive value offsets toward the right; for Y, a positive value offsets toward the top. The offset I used will move the data on the page about 3.5mm right and about 4.6mm down.
  • if you use a different PostScript driver than the one I suggested above, you may see the offset accumulating on successive pages. In that case, set $notDone to 0 inside the loop so the offset is corrected on the first page only.

Now update your Samba configuration file:

[PDFWatermark]
...
print command = fixPS "%s";\
gs -dNOPAUSE -dBATCH -q -sDEVICE=pdfwrite -sOutputFile="%s.tmp.pdf" "%s"; \
pdftk "%s.tmp.pdf" background watermark.pdf output "%s.pdf"; \
rm "%s"; rm "%s.tmp.pdf";

You should now get a properly aligned PDF:

http:/pub/images/PDFWatermark04.png http:/pub/images/PDFWatermark04sm.png http:/pub/images/PDFWatermark04detail.png

Overlaying the Watermark

You may want or need to add the watermark on top of the document instead of the background:

  • you may encounter an issue with some applications that surround the printed text with fully white boxes. In that case, your resulting PDF will not look too good, with portions of the background being partially obscured by those unnecessary text boxes.
  • you need to add a signature that must look like it is overlapping the text for instance.

Implementing this is a bit trickier than before: we need to swap in.pdf and watermark.pdf on the pdftk command line, but remember that pdftk only uses the first page of the file given after background, so only one page of the printed document would come out of our virtual printer.

To solve this, we need to break the original PDF into single pages, then apply the overlaid watermark to each page before re-assembling them into a single PDF.

Just update your Samba configuration file:

[PDFWatermark]
...
print command = fixPS "%s";\
gs -dNOPAUSE -dBATCH -q -sDEVICE=pdfwrite -sOutputFile="%s.tmp.pdf" "%s"; \
pdftk "%s.tmp.pdf" burst output PAGE%02d; \
for N in PAGE*; \
do pdftk watermark.pdf background $N output $N.TMP; \
done; \
pdftk PAGE*.TMP cat output "%s.pdf"; \
rm PAGE*; rm doc_data.txt; \
rm "%s"; rm "%s.tmp.pdf";

Consult the pdftk documentation for more details on the various actions being performed here.

Links

  • http://www.accesspdf.com/index.php?topic=pdftk The pdf toolkit official page
  • http://www.linuxlinks.com/Software/Utilities/Text_Utilities/ An extensive list of linux text utilities
  • http://www.csit.fsu.edu/~mimi/tex/postscript.html Some PostScript tips and Problems