Project pst-geo-compress 2009-07-30 ======================== Author: Heiko Oberdiek License is LPPL 1.3c, maintainance status: maintained current maintainer: author Files: README (this file) pst-geo-compress.pl pst-geo-decompress.pl Requirements: Ghostscript >= 8.56 (if prediction is used) `pst-geo' (CTAN:graphics/pstricks/contrib/pst-geo/) comes with huge data files: CTAN:graphics/pstricks/contrib/pst-geo/data/data.tgz CTAN:graphics/pstricks/contrib/pst-geo/dataII/dataII.tgz Unpacked the data files consume about 134 MB. The data files (files with extension `.dat') are PostScript files. PostScript supports compression and decompression via filter. The Perl script `pst-geo-compress.pl' compresses the data files using the /FlateEncode filter and adds code to decompresses the file, if read by a PostScript interpreter. Now the space requirements of the data files is about 24 MB. Caveat: Filters /FlateEncode and /FlateDecode require PostScript language level 3 (supported by ghostscript). (LZW compression don't need language level 3, but the compression is less effective.) Size reduction details: * Much unnecessary white space is removed. * Some comments are removed (especially number comments in .dat files of data.tgz), the remaining comments are preserved (especially city names). * /FlateEncode filter. * PNG prediction (method `up') is used for most of the files. Spaces at line ends are added to fill the space up to to the columns count. This improves the prediction, because these files contains long lists of sorted coordinate pairs. The prediction columns count is also the maximal line length, because of a ghostscript bug: Ghostscript has a bug regarding prediction, it is fixed in version 8.56 (2007-03-14): | 2006-12-11T17:30:53.980862Z L. Peter Deutsch | | Fixes bug: the PNG predictor filters produced incorrect data for the last | pixel of each row. (The encoder and decoder had matching bugs, so | encode+decode produced the correct result!) Fixes a diff in PS3 CET | 23-12U-1. By having each line at the same length, the last byte of a row has always the same value (end of line character). This does not trigger this bug (tested with gs 7.07/8.64). After compression the files aren't human-readable any more. The Perl script `pst-geo-decompress.pl' can be used to decompress a data file. Both scripts know option -h that prints a short usage screen. Option --gscmd allows to configure the ghostscript programm call, if the automatic ghostscript isn't sufficient. `pst-geo-compress.pl' is called inside the directory where the data files are present. Options `-1' and `-2' choose between the files from `data.tgz' and `dataII.tgz'. `pst-geo-decompress.pl' expects the data file to decompress. If an output file is given, the result is stored there. Otherwise standard output is used. History: 2009-07-30 v1.0 First release 2009-07-30 v1.1: * Workaround for ghostscript bug added: * Files that are compressed with prediction have equal line length. * Removing comments (except for city data files) * Obsolete option --prediction is therefore removed * Adding standalone test mode (option `--test').