DSL Ideas and Suggestions :: Extensions can be recompressed with `advdef`



Still, I wonder why bother at all for 2% when gzip -9 is nearly as good and takes less time.  If you want maximum compression when using tar -z, you can set the environment variable GZIP="-9"
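A quick illustration of the GZIP variable (paths and payload invented for the example): GNU tar passes the GZIP environment variable through to gzip, so the default and maximum levels can be compared side by side. Newer gzip releases print a deprecation warning for the GZIP variable but still honor it.

```shell
# Build a throwaway payload, then create the same tarball at gzip -1 and -9.
mkdir -p /tmp/gzdemo && cd /tmp/gzdemo
yes "some repetitive extension payload" | head -n 5000 > payload.txt

GZIP="-1" tar -czf fast.tar.gz payload.txt   # fastest, largest
GZIP="-9" tar -czf best.tar.gz payload.txt   # slowest, smallest

ls -l fast.tar.gz best.tar.gz
```

On repetitive data like this the -9 archive comes out no larger than the -1 archive; on already-compressed input the two are often nearly identical, which is the point being argued above.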
Two things:
1. Recompression does not take very long.
2. Usually, I am seeing double-digit size reductions even for extensions of 1MB or less.
e.g.
Code Sample
du install_flash_player_7_linux.tar.gz
999     install_flash_player_7_linux.tar.gz

time advdef -z4 install_flash_player_7_linux.tar.gz
    1017790      980222  96% install_flash_player_7_linux.tar.gz
    1017790      980222  96%

real    0m5.163s
user    0m4.788s
sys     0m0.204s

du install_flash_player_7_linux.tar.gz
963     install_flash_player_7_linux.tar.gz


Filetypes do play a part as well. Fonts, for example, seem to benefit more from the new algorithm.
e.g. (DSL 3.3 '/usr/X11R6/lib/X11/fonts/misc')
Code Sample
/usr/X11R6/lib/X11/fonts/misc# du -cs *.gz
1096    total
/usr/X11R6/lib/X11/fonts/misc# advdef -z4 *.gz
919075      785524  85%

This is important: CPU is a P4 1.7 'Willamette' on PC-133 SDRAM.

And again, the question of how much compression was used in these *.gz files has not been addressed. If they weren't originally compressed at the maximum level (most aren't), the comparison is an unfair one. It's possible that compressing with gzip at the maximum level first, as ^thehatsrule^ demonstrated, would show the size decrease to be consistently insignificant.  Or the results might be more encouraging than that. I stand by my original (and repeated) statement: if it doesn't make the files noticeably smaller (which has yet to be shown), or if it uses noticeably more resources (a P4 won't say much about how it works on a 486), there is very little point in it.

I can understand the desire for better compression techniques, of course, but over the last several years there has been little acceptable improvement. Either the technique is too slow, or the result isn't worth replacing the de facto standard.

Quote (mikshaw @ July 18 2007,21:48)
And again, the question of how much compression was used in these *.gz files has not been addressed. If they weren't originally compressed at the maximum level (most aren't), the comparison is an unfair one. It's possible that compressing with gzip at the maximum level first, as ^thehatsrule^ demonstrated, would show the size decrease to be consistently insignificant.

Agreed. I will post the font-compression comparison again, this time using '-9' on the original files first.

Update:
Code Sample

/usr/X11R6/lib/X11/fonts/misc# gzip -d *.gz
/usr/X11R6/lib/X11/fonts/misc# du -cs *.pcf
5148    total
/usr/X11R6/lib/X11/fonts/misc# gzip -9 *.pcf
/usr/X11R6/lib/X11/fonts/misc# du -cs *.gz
1056    total
/usr/X11R6/lib/X11/fonts/misc# advdef -z4 *.gz
880984      786600  89%
/usr/X11R6/lib/X11/fonts/misc# du -cs *.gz
948     total

I think the reason my package wasn't as small as it could have been is that it already contained many files that were themselves archives.  If I ran advdef on each of those and then compared, there could be a significant difference.  <edit>Tried it out... now there is an 8% total difference in size - around 4 MB</edit>

Since I mainly do my testing in virtualized or emulated environments, I don't think posting system specs would be useful for comparison.

I also tried some ~3 MB binaries, and the recompressed files were usually 4-5% smaller.

Decompression:
I ran some timed tests using busybox's time (I wonder why this was included in the busybox build), and the differences were minor (a few milliseconds), so I concluded that the time taken to decompress should be the same with or without advdef.  I also compared differences in system memory using free (is this a valid use of it?), and the peak values were very close - so memory consumption should be about the same.
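For anyone wanting to repeat the decompression timing, here is a shell-only sketch (file name and sizes invented) that avoids depending on a `time` builtin by using GNU date's nanosecond counter:

```shell
# Build a throwaway archive, then time several decompression runs.
head -c 200000 /dev/zero | gzip -9 > /tmp/timedemo.gz

for i in 1 2 3; do
  start=$(date +%s%N)
  gzip -dc /tmp/timedemo.gz > /dev/null
  end=$(date +%s%N)
  echo "run $i: $(( (end - start) / 1000000 )) ms"
done
```

Running the same loop on an advdef-recompressed copy of the same archive would give the before/after comparison described above.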

So what it boils down to is whether the extension maintainer has the time and the resources to use advdef.
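A maintainer's pass might look roughly like this (the tree under /tmp/ext is hypothetical): walk the extension's build tree, recompress each .gz with the same `advdef -z4` invocation used in the tests above, skip the step entirely if advdef isn't installed, and verify every file afterwards either way.

```shell
# Set up a stand-in extension tree with one gzipped file.
mkdir -p /tmp/ext/usr/share/doc
echo "readme text" | gzip -9 > /tmp/ext/usr/share/doc/readme.gz

find /tmp/ext -name '*.gz' | while read -r f; do
  if command -v advdef >/dev/null 2>&1; then
    advdef -z4 "$f"                  # rewrite in place; keeps the smaller result
  fi
  gzip -t "$f" || echo "CORRUPT: $f" # integrity check regardless
done
```

Because the advdef step is guarded, the same script is safe to run on a box that only has gzip.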

Also, for reference, from the man page:
Quote
4 Limitations
The advdef program cannot be used to recompress huge files because it needs to allocate memory for both the complete compressed and uncompressed data.
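Given that limitation, a maintainer on a low-memory box might want to skip files above some threshold before handing them to advdef. A minimal sketch, with an arbitrary 10 MB cutoff and invented paths:

```shell
# Skip .gz files too large to recompress comfortably in RAM.
LIMIT=$((10 * 1024 * 1024))          # arbitrary 10 MB cutoff
mkdir -p /tmp/limdemo
echo "small payload" | gzip -9 > /tmp/limdemo/small.gz

for f in /tmp/limdemo/*.gz; do
  size=$(stat -c %s "$f")
  if [ "$size" -le "$LIMIT" ]; then
    echo "recompress candidate: $f ($size bytes)"
  else
    echo "skipping (too large): $f ($size bytes)"
  fi
done
```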
