We have been thinking about using general-purpose compression for some time.
Pro:
Con:
Requirements:
Recommendation: It looks like zlib might be the best choice: It is open source with a notice-style license like ICU's, is free of patents, documented in RFCs (1950 & 1951), supported in java.util.zip, used in .jar and .png files, used in Apache, ...

Granularity

Compress whole .dat package.
Pro: Might be best compression because it can optimize across pieces.
Con: Need to decompress all of the data before accessing any of it.

Compress per data item/file, e.g. per resource bundle, except for the header.
Pro: Could be used with any data item/file without modification to its internal format. The general data writing and loading code could handle compression and decompression.
Con: If it is common for only parts of a data item/file to be accessed, such as in resource bundles, then the decompression may read and decompress much more data than necessary, and the per-data-item/file load time increases.

Compress per piece of data, e.g. per string in a resource bundle.
Pro: Does not decompress what's not used, faster bundle loading.
Con: Probably inefficient for short strings (which are most common). Need per-string synchronization for multithreading.
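The size side of the granularity trade-off can be checked with a small experiment. The following is only a sketch, not ICU code: it uses Python's standard zlib module (a binding to the zlib library) on a made-up package. The names package, make_item and encode_item, the NUL-terminated layout, and the sample strings are all assumptions for illustration. The strings are synthetic and deliberately repetitive, so real resource bundles will show different ratios.

```python
import zlib

def make_item(locale):
    # Synthetic, deliberately repetitive strings; real bundles will compress differently.
    return [f"{locale}: error {i} while loading resource bundle" for i in range(40)]

def encode_item(strings):
    # Illustrative layout only: NUL-terminated UTF-8 strings, not the ICU data format.
    return b"".join(s.encode("utf-8") + b"\0" for s in strings)

# Hypothetical stand-in for a .dat package: item name -> list of strings.
package = {locale: make_item(locale) for locale in ("root", "de", "fr")}
items = {name: encode_item(strings) for name, strings in package.items()}

# Option 1: one zlib stream for the whole package.
whole = zlib.compress(b"".join(items.values()), 9)

# Option 2: one zlib stream per data item.
per_item = {name: zlib.compress(data, 9) for name, data in items.items()}

# Option 3: one zlib stream per string.
per_string = {
    name: [zlib.compress(s.encode("utf-8"), 9) for s in strings]
    for name, strings in package.items()
}

sizes = {
    "raw": sum(len(data) for data in items.values()),
    "whole": len(whole),
    "per_item": sum(len(blob) for blob in per_item.values()),
    "per_string": sum(len(blob) for blobs in per_string.values() for blob in blobs),
}
for label, size in sizes.items():
    print(f"{label:>10}: {size} bytes")

# Reading one string: option 2 only inflates the item that contains it,
# whereas option 1 would have to inflate the whole package first.
de_strings = zlib.decompress(per_item["de"]).split(b"\0")[:-1]
print(de_strings[3].decode("utf-8"))
```

On input like this, the per-string total should come out larger than the uncompressed data: every zlib stream carries a 2-byte header and a 4-byte Adler-32 checksum (RFC 1950) in addition to the deflate block overhead, and a short string offers little redundancy to win that back.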