The Design of Software (CLOSED)

A public forum for discussing the design of software, from the user interface to the code architecture. Now closed.

The "Design of Software" discussion group has been merged with the main Joel on Software discussion group.

The archives will remain online indefinitely.

PDF Compression

I am using Perl's PDF::API2 to combine several PDFs into one big PDF, and it works fine. Now I am receiving PDFs that have been compressed with CVision Technology. They are about 1/10th the size of the original PDFs. When I try to combine them into one big PDF, the file is created but cannot be read: it gives an Adobe error and closes the PDF. Any solution?

PDF Developer
Monday, August 15, 2005
Not unless you can be more specific about what the error is.

Try loading the resulting file into a GhostView variant; it tends to be a little more specific about errors. Even then, chances are it will say "/undefined in <something>" and dump the stack.

This will, at least, give you what failed, but fixing it is still non-trivial...
Katie Lucas
Monday, August 15, 2005
What compression scheme are they using?
According to the PDF specs, you should see a filter name after the /Filter tag, such as /CCITTFaxDecode or /FlateDecode. It is possible that the Perl package can't handle the filters used by the new software package.
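
One quick way to see which filters a file uses is to scan its raw bytes for /Filter entries. The following is only a rough sketch in Python (chosen for brevity, not because the thread uses it); the function name `list_filters` and the sample bytes are made up for illustration. It assumes the filter names are written directly after /Filter, and it will miss anything stored inside compressed object streams or behind an indirect reference.

```python
import re

# Matches "/Filter /Name" or "/Filter [/Name1 /Name2]" in raw PDF bytes.
FILTER_RE = re.compile(rb"/Filter\s*(\[[^\]]*\]|/[A-Za-z0-9]+)")
NAME_RE = re.compile(rb"/([A-Za-z0-9]+)")


def list_filters(pdf_bytes):
    """Count how often each stream filter name appears (hypothetical helper)."""
    counts = {}
    for match in FILTER_RE.finditer(pdf_bytes):
        # group(1) is either a single name or a bracketed array of names.
        for name in NAME_RE.findall(match.group(1)):
            key = name.decode("ascii")
            counts[key] = counts.get(key, 0) + 1
    return counts


# Made-up fragment resembling two stream dictionaries.
sample = b"<< /Length 10 0 R /Filter [/JBIG2Decode] >> << /Filter /FlateDecode >>"
print(list_filters(sample))
```

To try it on a real file, read the file in binary mode and pass the bytes to `list_filters`. A filter name that the merging tool does not know about is a reasonable first suspect.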

You can download a PDF copy of the PDF specs from:
Monday, August 15, 2005
Well, the error is:
"Cannot parse the image"

They were originally TIFF files, compressed with Cvisiontech's pdfcompressor. I noticed that it adds some tags, such as

/Decode [1.0 0]
/CVMRC /Mask
/CVWidth  2550
/CVHeight 3300
/CVXRes  300.00
/CVYRes 300.00
/Length 10 0 R
/Filter [/JBIG2Decode]


What do I do with the PDF specs? Is it possible to strip these tags so that the Perl package can handle the files?

PDF Developer
Tuesday, August 16, 2005
It looks like the CVision program is using JBIG2 compression.

From what I have read, PDF::API2 uses Compress::Zlib, and Compress::Zlib does not appear to support JBIG2. The LZW and Flate encodings are the two most common PDF encodings that I have seen. Flate is the zlib/deflate format (RFC 1950, RFC 1951); LZW is a separate algorithm that those RFCs do not cover. The JBIG2 filter was added in PDF 1.4 (Acrobat 5).
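
To make the zlib point concrete, here is a small Python sketch (again only an illustration, not PDF::API2's actual code). It shows that Flate-style data round-trips through a plain zlib library, and it flags filters that would need some other decoder. The set `ZLIB_FILTERS` and the helper `needs_other_decoder` are hypothetical names, and treating FlateDecode as the only zlib-backed filter is an assumption of this sketch.

```python
import zlib

# Assumption for this sketch: only FlateDecode maps directly onto zlib.
ZLIB_FILTERS = {"FlateDecode"}


def needs_other_decoder(filters):
    """Return the filter names that a zlib-only tool could not decode."""
    return [name for name in filters if name not in ZLIB_FILTERS]


# Made-up content stream, compressed the way a Flate-encoded stream would be.
raw = b"BT /F1 12 Tf (Hello) Tj ET"
flate_stream = zlib.compress(raw)
assert zlib.decompress(flate_stream) == raw

print(needs_other_decoder(["FlateDecode", "JBIG2Decode"]))
```

Anything this helper returns, JBIG2Decode in this case, is outside what a zlib binding can decode on its own.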

From page 15 of the PDF specs v1.5 we see:
"Using JPEG compression, color and grayscale images can be compressed by a factor of 10 or more. Effective compression of monochrome images depends on the compression filter used and the properties of the image, but reductions of 2:1 to 8:1 are common (or 20:1 to 50:1 for JBIG2 compression of an image of a page full of text)."

This agrees with your remark that the documents are about 1/10th of their original size. Perhaps you should contact the author of PDF::API2, or the customer who is supplying you with JBIG2-compressed PDFs. Or perhaps contribute a modification to CPAN that will allow PDF::API2 to handle other compression filters.

>What do I do with PDF Specs?
Uh, read them. If you want to know what is happening inside your PDF documents, you will need to understand the specs. Or, at least where to hunt down what you need to know/understand inside the specs.

Tags like /CVXRes should not cause any problem. Since new tags are introduced every time there is a new version of the PDF spec, software that handles PDF docs *must* be able to cope with tags that it was not written to handle.

You, like most pdf developers, will build (or buy) a tool that lets you examine the internal structure of a PDF document. What are the objects? What dictionaries are present? Are they corrupted? Does it include fonts? The whole font? Just a few letters of the font?
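
As a starting point for such a tool, the sketch below (Python, purely illustrative) lists each indirect object in a PDF with its /Type, if any, and whether it carries a stream. The name `summarize_objects` and the sample bytes are invented for this example. It is deliberately naive: it pattern-matches the raw bytes, so binary stream data that happens to contain "endobj", or objects packed into object streams, will confuse it.

```python
import re

# "N G obj ... endobj" pairs, matched lazily across newlines.
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj", re.DOTALL)
TYPE_RE = re.compile(rb"/Type\s*/([A-Za-z0-9]+)")


def summarize_objects(pdf_bytes):
    """Return (object number, /Type or None, has_stream) per object (hypothetical helper)."""
    summary = []
    for match in OBJ_RE.finditer(pdf_bytes):
        body = match.group(3)
        type_match = TYPE_RE.search(body)
        type_name = type_match.group(1).decode("ascii") if type_match else None
        summary.append((int(match.group(1)), type_name, b"stream" in body))
    return summary


# Made-up three-object fragment, not a complete PDF.
sample_pdf = (
    b"1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj\n"
    b"2 0 obj << /Type /Pages /Count 0 >> endobj\n"
    b"3 0 obj << /Length 4 >> stream\nabcd\nendstream endobj\n"
)
for entry in summarize_objects(sample_pdf):
    print(entry)
```

A real inspection tool would parse the cross-reference table instead of scanning, but even a listing like this helps answer the first questions: which objects exist, and which of them hold stream data.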

The Acrobat developer SDK is only available via subscription nowadays, which I think is a big mistake. It is only $99/year, which may or may not be too much for your budget.
Tuesday, August 16, 2005

That was really very useful information.
I will try to post here what I finally did.

PDF Developer
Tuesday, August 16, 2005

This topic is archived. No further replies will be accepted.
