Really good TIFF compression?

April 20th, 2004, 03:02 AM
Tuttle

Really good TIFF compression?

I have about half a gig of large compressed TIFF files we'd like to load onto an SD card and view inside a Pocket PC. Good in theory, but the files are so large when uncompressed we can't reliably open them.

An average file is about 10000x7000 pixels, about 8 or 9 MB uncompressed, and somehow they've been compressed to about 300 or 400 KB each.

What I need to do is shrink the file resolution (half should do, to 5000x3500) and then put the smaller files in the Pocket PC. Except even if I shrink the files to 2500x1750, I can't find a way to save them such that the file size is less than about 500 KB. 500 KB is too big - I can no longer fit everything on a single SD card. I've tried GIF, PNG and all the TIFF variants which IrfanView supports. JPEG isn't acceptable; they're line drawings and detail is vital.

Does anyone know of a really good lossless compression algorithm for TIFF files, and a program which implements it? A suitable algorithm obviously exists and was used to create the original files, but I have no idea what it is.
April 20th, 2004, 02:31 PM
photolady

What are the dpi settings? The higher the dpi, the larger the file. Being line drawings, the dpi should only need to be about 150dpi. If it's more than that, try lowering it.

If that doesn't work might try getting in touch with a member by the name of falcon2000, he knows all about these types of things. resolution, dpi, lpi etc. ;)
April 21st, 2004, 12:02 AM
falcon2000

Got your message Photolady.

Tuttle,

Lossless JPEG may help. I can compress a 6MB file down to about 1MB with no detectable degradation in quality.

I use the Photoshop plug-in, but you have a choice of that or one for Windows (the -LIB file). IrfanView and XnView both support JPEG-LS.

Which OS does your Pocket PC use?
April 21st, 2004, 12:13 AM
falcon2000

I re-read your post and actually I am quite confused as to what happened and what you are trying to accomplish.

You said, "An average file is about 10000x7000 pixels, about 8 or 9 MB uncompressed, and somehow they've been compressed to about 300 or 400 KB each."

Then you said "I can't find a way to save them such that the file size is less than about 500 KB." I thought they have been compressed to about 300 to 400 KB each already. No?
April 21st, 2004, 12:24 AM
photolady

Thanks falcon2000 for coming in and helping Tuttle. ;)
April 21st, 2004, 12:25 AM
Tuttle

The Pocket PCs are running Windows Mobile 2003 Phone Edition.

As for the sizing, the originals are compressed TIFFs - they're 300 - 400 KB on disk, but when they're opened in memory they're uncompressed to about 8 MB in size - TIFF is basically a bitmap format. I then reduce the resolution by 50% (since it's the uncompressed size which is killing us on the PPC), which gives me about a 2 MB file in memory. I need to save that lower-res image back to disk so that it's no larger than the original file, but I can't get it below about 500 KB.

That's where I'm stuck right now - I need to get those lower res images back down to the original 300 or 400 KB, or else I can't fit everything on a 512 MB SD card.

I'll have a look at lossless JPEG if what I'm doing now (a batch job through Photoshop) doesn't work. It seems from the couple of test images I ran through that Photoshop's TIFF LZW compression is far better than IrfanView's...
April 21st, 2004, 12:53 AM
falcon2000

In that case no compression is going to help ... Lossless JPEG or LZW TIFF, etc., b/c when the file is opened, its size in memory will be the same as the original.

Maybe I still don't understand ... but didn't you say, "since it's the uncompressed size which is killing us on the PPC"?

The only way is to ditch pixels ... but you said they are graphics and you need the details.

Obviously, I am still confused.

Another thing I don't understand is that you said each original file is about 10000 x 7000 pixels, which should give you a roughly 200MB file, not the 8 or 9 MB you stated.
April 21st, 2004, 01:10 AM
Tuttle
They're monochrome files - 10,000x7,000 = 70,000,000 pixels at 1 bit each, ie 70,000,000 / 8 = 8,750,000 bytes.
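
The same arithmetic as a quick sketch, for anyone who wants to plug in other sizes or bit depths. It ignores file headers and per-row padding, which is an assumption; real viewers may use slightly more memory.

```python
def uncompressed_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Raw in-memory size of an image, ignoring headers and row padding."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


for label, w, h, bpp in [
    ("original, 1-bit", 10000, 7000, 1),
    ("half size, 1-bit", 5000, 3500, 1),
    ("original, 24-bit", 10000, 7000, 24),
]:
    print(f"{label}: {uncompressed_bytes(w, h, bpp):,} bytes")
```

The 24-bit line is where the "roughly 200MB" figure comes from.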

As for the uncompressed size, we're ditching pixels - I'm halving the resolution. They're still usable after doing that (barely). I'm struggling to save and re-compress that lower-res file so the on-disk file size is at least as small as it used to be.

Basically it's going like this:
- 400 KB file exists on disk.
- File is opened in IrfanView, which reports an 8 MB image.
- Resize function is used to shrink the image to about 5000x3500. IrfanView reports a 2 MB image.
- File is saved as a TIFF with LZW compression. It appears on disk as about a 500 KB file.
Obviously something other than IrfanView's LZW compression was used on the original files. I'm trying to work out what it was. :)
April 21st, 2004, 02:30 AM
falcon2000

So they are 1-bit files ... now it makes perfect sense.

In which case don't even bother with Lossless JPEG b/c it requires at least an 8-bit file.

Sorry, LZW is the only lossless compression you can save TIFF as ... that I know of.

If I find out more I'll let you know.
April 21st, 2004, 05:05 AM
Tuttle

Well, Photoshop seems to have produced smaller files than IrfanView. I don't understand it, but I'm happy to work with it. :D

Thanks for the ideas.
April 21st, 2004, 10:17 AM
falcon2000

There are many varieties of LZW compression ... different codes based on the same basic one.

Sorry that I can't be of more help to your question.
September 20th, 2006, 01:21 AM
Oldspammer

Image resize / LZW compression / GZip / Vectorization

I am a software developer (and slightly knowledgeable in math).

In most of these lossless compression algorithms / schemes there is a parameter for the dictionary size limit. There are also two or three choices of data structure / tree optimization that can be used to organize the compression for things like palettes. It may also be possible to run a multi-pass version of LZW, rather than the usual single pass, so that the optimal content of the compression dictionary is known ahead of time. The dictionary would then be stored first in the file so that the coded data can be expanded properly.

In the very "advanced user"-level interfaces, the user could be offered a choice of what all these parameters are tuned to be. For example, huge dictionary size slows speed, but compresses more. I believe that "Octree" variety of palette type choice often optimizes the colors better / closer to the actual 24-bit true-color and produces a smaller palette, but is a more computationally intensive algorithm that will also slow your compression processing down a bit.

Reference: In Paint Shop Pro 7 with an 8-bit image loaded, select Save As, GIF file type, Select Options button, Select GIF 87, Select Run Optimizer button, Select Colors Tab, select radio button for Octree Palette or Median Cut or Standard Web. Google search Octree palette image OR graphics

The open source GZip algorithm, last I checked in 1998, offers integer settings from 1 to 9 for the compression level. As far as I can tell these settings trade speed for how hard the compressor searches its dictionary (a sliding window that the format caps at 32 KB) rather than setting the dictionary size itself. With "modern day" computers eventually having 8-dual core (16-way) processors and 64 Gigabytes of RAM and 64 Terabytes of disk or more on-line, the dictionary size limit in a newer format could be increased dramatically, well beyond what such a small window allows. Google search man-page gzip compression-level
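
The level setting is easy to try with Python's zlib module, which implements the same DEFLATE compression that gzip uses and accepts levels 0 to 9 (0 means store only). A minimal sketch; the test data below is made up to look vaguely like rows of a 1-bit line drawing and is not taken from any real file.

```python
import zlib

# Synthetic stand-in for a 1-bit drawing: mostly white rows (0xFF bytes)
# with an occasional short black segment. Purely illustrative data.
row_white = b"\xff" * 625
row_line = b"\xff" * 100 + b"\x00" * 4 + b"\xff" * 521
data = (row_white * 49 + row_line) * 40


def deflate_sizes(payload: bytes, levels=(0, 1, 6, 9)) -> dict:
    """Compressed size in bytes for each zlib compression level."""
    return {level: len(zlib.compress(payload, level)) for level in levels}


sizes = deflate_sizes(data)
for level, size in sizes.items():
    print(f"level {level}: {size} bytes (from {len(data)})")
```

On real drawings the gap between levels is usually much smaller than the gap between different algorithms, so treat this only as a way to see what the knob does.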

If Adobe or Irfan View software designers (their managers) chose not to expose in their user interfaces all of these parameter options so as to simplify the UI and make their software easier to use, then you must live with that design choice (blame those responsible for the design choices).

The same thing goes for the lossy JPEG file type. I believe that there are matrices for quantization. These matrices can be set as the standard ones, or they can be optimized for the given image (on a per image basis). The DCT moire pattern noise and compression level for the same sized JPEG file can be better or worse depending on the methods used by the software vendors.
Google search moire
Google search DCT jpeg
Google Search quantization-matrices jpeg optimization wiki

JBIG2 is a more recent image compression development. Instead of having a dictionary entry for stripes of pixels for compression, I believe that it uses 2-dimensional glyphs / symbols for compression. In this way monochrome images like facsimile or images of text or line art can be extremely compressed without losses. Google search JBIG2 for details.

Now for the image re-dimensioning issue...

If you are going to make the images smaller, the more narrow line thickness may have an impact on resizing the images.

If a simplistic / dumb algorithm for resize is used like a 1/N decimation (sampling every N-pixels, where N is an integer) then you could lose lines altogether, or make the lines nearly invisible. A smart resampling uses pixel color interpolation / weighting to determine the output image's pixel colors. Interpolation can be done using convolution (some FFTs and a filter kernel), or using bilinear, or bicubic spline methods. This is done because you want to sample pixels from image co-ordinates that are somewhat fractional in value like 75% (3/4) between some of the original image pixels, and even if you were doing integer co-ordinates, you do not want to do decimation resampling that might skip over smaller details that you don't want left out of the final smaller image. Google search decimation resampling
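
A small sketch of the difference, using a tiny made-up 1-bit image with a single one-pixel-thick line. Instead of bilinear or bicubic interpolation it uses a much cruder stand-in that I am assuming is acceptable for 1-bit art: an output pixel is black if any pixel in the matching block of the original is black. The function names are hypothetical.

```python
def make_drawing(width: int, height: int, line_row: int) -> list:
    """1 = black ink, 0 = white paper; one thin horizontal line."""
    return [[1 if y == line_row else 0 for _ in range(width)]
            for y in range(height)]


def decimate(img: list, n: int) -> list:
    """Keep every n-th pixel in each direction and discard the rest."""
    return [row[::n] for row in img[::n]]


def block_has_ink(img: list, top: int, left: int, n: int) -> bool:
    """True if any pixel in the n-by-n block starting at (top, left) is black."""
    h, w = len(img), len(img[0])
    return any(img[y][x]
               for y in range(top, min(top + n, h))
               for x in range(left, min(left + n, w)))


def shrink_keep_ink(img: list, n: int) -> list:
    """Shrink by n, marking an output pixel black if its block has any ink."""
    h, w = len(img), len(img[0])
    return [[1 if block_has_ink(img, top, left, n) else 0
             for left in range(0, w, n)]
            for top in range(0, h, n)]


def ink(img: list) -> int:
    """Number of black pixels."""
    return sum(sum(row) for row in img)


drawing = make_drawing(8, 8, line_row=3)
print("decimated ink:", ink(decimate(drawing, 2)))         # line falls between samples
print("block-based ink:", ink(shrink_keep_ink(drawing, 2)))  # line survives
```

Decimation samples rows 0, 2, 4 and 6, so the line on row 3 disappears completely, while the block-based version keeps it.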

Even if you use a good smart resize (one where you gradually reduce image size to 80% of original, then that new image by 80% of previous size, and so on) the line thickness may also shrink to nearly invisible.

Because of this you will probably have to
a) convert to 24-bit color, thereby increasing the image pixel's dynamic range, then
b) smart resize by 80% or so,
c) perform an erode filter to brighten / thicken the now thinner and lighter colored lines (because of the dynamic range increased to 24-bit, you don't lose the information as much as you would at lower 8-bit range),
d) repeat steps b and c until the desired size is achieved, then
e) convert back to a lower palette size such as 8-bit, 6-bit, 5-bit, 4-bit, 3-bit, 1-bit.

Note that you may even have to apply c - the erode filter twice or more to brighten / thicken the 24-bit color lines before resampling again (depending on the graphics applications erode filter strength)--apply to suit the image.

This process of lowering the resolution, while maintaining even the thin lines might serve to really badly thicken the already thick lines, messing up the image a bit--so some experimentation may be needed for suitable processing on a per image basis.
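
Here is a rough sketch of steps a) to e) on a toy image, so the idea can be tested without a graphics package. Several things are my own simplifications and not what any particular application does: it works in 8-bit grayscale instead of 24-bit color, it halves the size at each step instead of using 80%, the erode filter is a plain 3x3 minimum filter, and the erode is only applied between resizes, not after the last one. All function names are hypothetical.

```python
def to_gray(bits: list) -> list:
    """Step a: 1-bit (1 = black) to 8-bit gray (0 = black, 255 = white)."""
    return [[0 if p else 255 for p in row] for row in bits]


def halve(gray: list) -> list:
    """Step b: average each 2x2 block (assumes even width and height)."""
    return [[(gray[y][x] + gray[y][x + 1]
              + gray[y + 1][x] + gray[y + 1][x + 1]) // 4
             for x in range(0, len(gray[0]), 2)]
            for y in range(0, len(gray), 2)]


def erode(gray: list) -> list:
    """Step c: 3x3 minimum filter, so dark lines get darker and thicker."""
    h, w = len(gray), len(gray[0])
    return [[min(gray[ny][nx]
                 for ny in range(max(0, y - 1), min(h, y + 2))
                 for nx in range(max(0, x - 1), min(w, x + 2)))
             for x in range(w)]
            for y in range(h)]


def to_bits(gray: list, threshold: int = 128) -> list:
    """Step e: back to 1-bit; anything darker than the threshold is ink."""
    return [[1 if v < threshold else 0 for v in row] for row in gray]


def shrink(bits: list, halvings: int, erode_between: bool) -> list:
    """Steps a to e, with step d being the loop."""
    gray = to_gray(bits)
    for step in range(halvings):
        gray = halve(gray)
        if erode_between and step < halvings - 1:
            gray = erode(gray)
    return to_bits(gray)


# 8x8 test image with a one-pixel-thick horizontal line on row 3.
drawing = [[1 if y == 3 else 0 for _ in range(8)] for y in range(8)]
for use_erode in (False, True):
    small = shrink(drawing, halvings=2, erode_between=use_erode)
    print("erode" if use_erode else "no erode", small)
```

Without the erode step the line fades to light gray and is thresholded away; with it the line survives in the 2x2 result. Applying the erode after the final resize as well turns this tiny image completely black, which is the over-thickening problem mentioned above.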

Google Search erode filter
Google search smart-resample image-size

Of course, I could be remembering this resize processing wrongly. Someone please try it to see.

In theory, another way would involve a vector graphics conversion algorithm from within a more advanced Drawing application like Corel Draw!, or Adobe Illustrator, or Xara. Once vectorized, page resizing should work fine. The drawing is then exported with suitable dpi and pixel dimensions to suit the PDA application limitations.

September 20th, 2006, 01:50 AM
Train

Give Graphics Work Shop Pro a shot.
Program I have been using for years. http://www.mindworkshop.com/alchemy/gwspro.html
September 20th, 2006, 02:00 AM
Oldspammer

Packbits / RLE is lossless. & Huffman encoding too.

I think that I have also seen something named "packbits" and perhaps even Huffman encoding, and there are several CCITT Group-X Facsimile compressions too. Reference: Jasc Paint Shop Pro 7, File-SaveAs, choose TIF, click options button, Select RLE / Packbits or CCITT Group 3 Fax, Huffman Encoding, LZW compression, or Uncompressed.

Google search Packbits+CCITT+3+Fax+Huffman+Encoding+LZW+compression+Uncompressed. This search turns up a link to a page from LEAD Tools, an imaging tools / software development vendor, among other things. The keywords that turn up seem to say that Group 4 Fax and arithmetic encoding are also possible. I remember reading that arithmetic encoding is pretty good because it completely analyses the entire file's bit strings and produces an optimal dictionary on a per file basis. They don't say whether this compression is applied to any one specific format such as TIF, though.

The PDF file fullext.pdf contains the following entry:

TIFF Tagged-Image File. The standard, originally developed by Microsoft and Aldus, is so flexible there is an infinite number of ways to store images. (Raster Format) Below are the main common formats:
TIFF No Compression (1, 4, 8, 24 Bits)
TIFF Huffman (1 Bit)
TIFF Pack Bit (1, 4, 8, 24 Bits)
TIFF LZW compression (1, 4, 8, 24 Bits)
TIFF Fax Group 3 RLE (1 Bit)
TIFF Fax Group 4 RLE (1 Bit)
CCITT RLE
Multi-Image
Tiled
Motorola (MM)
JPEG
ZIP
CMYK
--------------
Certainly of these, Packbits / RLE is lossless. It is used in palettized BMP files, which are lossless. RLE stands for run length encoding. Arithmetic encoding is also lossless. Zip is lossless too.
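
To show why run length encoding suits line drawings, here is a minimal PackBits-style encoder and decoder. It follows the byte layout as I understand it (a header byte of 0 to 127 means copy that many plus one literal bytes, 129 to 255 means repeat the next byte 257 minus header times, 128 is skipped), and the encoder is a simple greedy one written for illustration, not the code any TIFF library actually uses. The sample row is made up.

```python
def packbits_encode(data: bytes) -> bytes:
    """Greedy PackBits-style run length encoder."""
    out = bytearray()
    i, n = 0, len(data)
    while i < n:
        # Length of the run of identical bytes starting at i (capped at 128).
        run = 1
        while i + run < n and run < 128 and data[i + run] == data[i]:
            run += 1
        if run >= 2:
            out.append(257 - run)  # repeat header
            out.append(data[i])
            i += run
        else:
            # Gather literal bytes until a run of two or more begins.
            start = i
            i += 1
            while (i < n and i - start < 128
                   and not (i + 1 < n and data[i] == data[i + 1])):
                i += 1
            out.append(i - start - 1)  # literal header
            out.extend(data[start:i])
    return bytes(out)


def packbits_decode(packed: bytes) -> bytes:
    """Inverse of packbits_encode."""
    out = bytearray()
    i = 0
    while i < len(packed):
        header = packed[i]
        i += 1
        if header < 128:            # literal block of header + 1 bytes
            out.extend(packed[i:i + header + 1])
            i += header + 1
        elif header > 128:          # next byte repeated 257 - header times
            out.extend(packed[i:i + 1] * (257 - header))
            i += 1
        # header == 128 is a no-op
    return bytes(out)


# One made-up scan line: white, a few detail bytes, then white again.
row = b"\xff" * 100 + b"\x00\x12\x34" + b"\xff" * 50
packed = packbits_encode(row)
print(len(row), "bytes ->", len(packed), "bytes")
assert packbits_decode(packed) == row
```

Long white stretches collapse to two bytes each, which is why mostly-empty monochrome pages compress so well even with a scheme this simple.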

Huffman encoding is also lossless. Huffman was a math guy. He found that if a binary tree was made of the dictionary of bit patterns found in a binary stream / file, that an optimal variable length code could be used to access this dictionary tree. These tree traversal codes just mean 0=left sub-tree/node, and 1=right sub-tree/node. When you reach a leaf node, you have the dictionary node with the expansion bit pattern stored. The most frequently used bit pattern is then placed in the tree at the shortest traversal pattern for access, resulting in a very short code to replace a highly repeating pattern. Similarly for the second-most repeated bit pattern stored to the next closest binary tree node. In data compression, Huffman is sometimes applied after using RLE to super compress the data.
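
A minimal sketch of the tree building described above. It treats each byte value as one symbol, which is an assumption made to keep the code short (the fax-style TIFF variants code run lengths with fixed tables instead), and it writes codes as strings of "0" and "1" for readability instead of packing real bits.

```python
import heapq
from collections import Counter


def huffman_codes(data: bytes) -> dict:
    """Map each byte value to its code string; frequent bytes get short codes."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie breaker, {symbol: code so far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)    # two lightest subtrees
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}        # 0 = left branch
        merged.update({s: "1" + c for s, c in right.items()})  # 1 = right branch
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(data: bytes, codes: dict) -> str:
    return "".join(codes[b] for b in data)


def decode(bits: str, codes: dict) -> bytes:
    """Walk the bit string, emitting a symbol whenever a full code is matched."""
    lookup = {code: sym for sym, code in codes.items()}
    out, current = bytearray(), ""
    for bit in bits:
        current += bit
        if current in lookup:
            out.append(lookup[current])
            current = ""
    return bytes(out)


sample = b"\xff" * 90 + b"\x00" * 8 + b"\x0f" * 2   # made-up data, mostly white
codes = huffman_codes(sample)
bits = encode(sample, codes)
print(codes)
print(len(sample) * 8, "bits ->", len(bits), "bits")
assert decode(bits, codes) == sample
```

Because no code is a prefix of another, the decoder never needs separators between symbols.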

Some of the other compressions in the above list are not lossless, such as JPEG, which uses DCT and quantization matrices.
