This bot is controlled by Sean Colombo and attempts to do some fixes (namely, image compression).

The avatar is a cropped version of a CC-BY-SA image on Wikimedia Commons.

One of the main reasons for using a bot that uploads new versions of images, instead of just compressing them on the server, is that "lossless" compression often doesn't seem to be truly lossless. A related bug in PageSpeed (which uses many of the same compression programs) can be found here: http://code.google.com/p/page-speed/issues/detail?id=467
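One way to check a "lossless" claim is to decode both the original and the recompressed file and compare the decoded data rather than the file bytes. The following is a minimal sketch of that idea, not the bot's actual code: `is_lossless` and its `decode` parameter are hypothetical names, and `zlib.decompress` only stands in for a real image decoder so that the example runs with the standard library alone.

```python
import zlib
from typing import Callable


def is_lossless(original: bytes, recompressed: bytes,
                decode: Callable[[bytes], bytes]) -> bool:
    """Return True when both files decode to identical data.

    `decode` is a placeholder for a real image decoder that returns raw pixels.
    """
    return decode(original) == decode(recompressed)


# Stand-in demonstration: the same payload deflated at two compression levels.
payload = b"example pixel data " * 1000
fast = zlib.compress(payload, 1)
small = zlib.compress(payload, 9)

same_content = is_lossless(fast, small, zlib.decompress)
```

With a real decoder in place of `zlib.decompress`, any image for which this returns False would be one where the compression changed the picture.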

Real Results

After running the script against our ACTUAL file servers for several months, the first run is almost done. Here are the stats, straight from the database:

 mysql> SELECT NOW() as timestamp, COUNT(*) as numImagesLookedAt, (SUM(img_orig_bytes) - SUM(img_curr_bytes))/1000000000 AS gigabytes_saved, 100-(SUM(img_curr_bytes) * 100)/(SUM(img_orig_bytes)) as average_percent_compressed  FROM img_status;
 +---------------------+-------------------+-----------------+----------------------------+
 | timestamp           | numImagesLookedAt | gigabytes_saved | average_percent_compressed |
 +---------------------+-------------------+-----------------+----------------------------+
 | 2012-09-10 22:50:43 |          10196021 |        193.3332 |                    12.7857 |
 +---------------------+-------------------+-----------------+----------------------------+
 1 row in set (8.70 sec)
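
The arithmetic in the query above can be restated in a few lines of Python. This is only an illustrative sketch: `summarize` is a hypothetical helper, and the input is assumed to be pairs of `(img_orig_bytes, img_curr_bytes)` values like the columns of `img_status`.

```python
def summarize(rows):
    """Mirror the SQL aggregates for (img_orig_bytes, img_curr_bytes) pairs."""
    rows = list(rows)
    orig = sum(o for o, _ in rows)
    curr = sum(c for _, c in rows)
    if orig == 0:
        raise ValueError("no original bytes to compare against")
    return {
        "numImagesLookedAt": len(rows),
        # Decimal gigabytes (10^9 bytes), as in the query.
        "gigabytes_saved": (orig - curr) / 1_000_000_000,
        # Computed from the summed sizes, so large images weigh more.
        "average_percent_compressed": 100 - (curr * 100) / orig,
    }


# Two images saving 10% and 30%: the byte-weighted figure is 25.0,
# whereas the plain mean of the per-image percentages would be 20.
demo = summarize([(1000, 900), (3000, 2100)])
```

Note that `average_percent_compressed` is computed from the summed sizes, so it is the overall percentage of bytes saved rather than the mean of the per-image percentages.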

Test Results

This bot ran a series of tests to figure out the costs and benefits of running it across all or part of Wikia.

TODO: if there's time, point it at the "File:Wiki-background" of a bunch of wikis.

NOTE: There are raw stats (not formatted as well) here: Recent Results

Results

This table shows the results by file type when each image is compressed with both available methods for its type and the better result is kept:

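| Type | Num images | Original size (bytes) | Compressed size (bytes) | Savings (bytes) | Savings percent |
| --- | --- | --- | --- | --- | --- |
| Overall (using best result) | 803,818 | 81,960,965,166 | 71,228,665,347 | 10,732,299,819 | 13.09% |
| PNG (ave both methods) | 953,380 | 69,725,823,502 | 54,965,411,800 | 14,760,411,702 | 21.17% |
| PNG (only using best result) | 478,403 | 35,135,891,403 | 27,083,227,631 | 8,052,663,772 | 22.92% |
| JPEG (ave both methods) | 650,678 | 93,629,526,574 | 89,772,325,777 | 3,857,200,797 | 4.12% |
| JPEG (only using best result) | 325,415 | 46,825,073,763 | 44,145,437,716 | 2,679,636,047 | 5.72% |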
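
The "best result" selection described above can be sketched as follows. This is an assumption about the general shape of the logic, not the bot's actual implementation: `best_result` is a hypothetical function, the compressors are passed in as plain callables, and falling back to the original when nothing shrinks it is a choice made here for illustration.

```python
from typing import Callable, Dict, Tuple


def best_result(data: bytes,
                methods: Dict[str, Callable[[bytes], bytes]]) -> Tuple[str, bytes]:
    """Run every method and return (name, output) for the smallest output.

    Returns ("original", data) when no method produces a smaller result.
    """
    best_name, best_out = "original", data
    for name, compress in methods.items():
        out = compress(data)
        if len(out) < len(best_out):
            best_name, best_out = name, out
    return best_name, best_out


# Hypothetical stand-ins for two real tools (for example optipng and pngcrush).
fake_methods = {
    "method_a": lambda d: d[:80],
    "method_b": lambda d: d[:60],
}
winner, output = best_result(b"x" * 100, fake_methods)
```

In a real run each callable would wrap one of the external tools and return the bytes of the file it produced.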


NOTE: pngcrush was run in a very slow way to get optimal compression (with options: -fix -force -brute).

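| Type | Num images | Original size (bytes) | Compressed size (bytes) | Savings (bytes) | Savings percent |
| --- | --- | --- | --- | --- | --- |
| PNG (optipng) | 478,403 | 35,135,819,685 | 27,428,392,261 | 7,707,427,424 | 21.94% |
| PNG (pngcrush) | 474,825 | 34,569,382,865 | 27,517,882,897 | 7,051,499,968 | 20.40% |
| JPEG (jpegoptim) | 325,415 | 46,825,073,763 | 44,634,546,146 | 2,190,527,617 | 4.68% |
| JPEG (jpegtran) | 325,415 | 46,825,073,763 | 45,156,916,273 | 1,668,157,490 | 3.56% |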


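| Filesize range | Num images | Original size (bytes) | Compressed size (bytes) | Savings (bytes) | Savings percent |
| --- | --- | --- | --- | --- | --- |
| 1M+ | 29,366 | 58,554,010,911 | 52,083,348,078 | 6,470,662,833 | 11.05% |
| 500k - 1M | 42,200 | 29,214,462,252 | 25,123,248,306 | 4,091,213,946 | 14.00% |
| 100k - 500k | 226,352 | 49,619,458,385 | 44,653,946,767 | 4,965,511,618 | 10.01% |
| < 100k | 1,306,140 | 25,967,418,528 | 22,877,194,426 | 3,090,224,102 | 11.90% |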
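
Grouping images into the size ranges above might look like the sketch below. The function name `size_bucket` is hypothetical, and two details are assumptions because the page does not state them: that "k" means 1,000 bytes (pass `k=1024` for the binary interpretation) and that each boundary value belongs to the larger range.

```python
def size_bucket(num_bytes: int, k: int = 1000) -> str:
    """Map an original file size in bytes to one of the four ranges."""
    if num_bytes >= k * k:
        return "1M+"
    if num_bytes >= 500 * k:
        return "500k - 1M"
    if num_bytes >= 100 * k:
        return "100k - 500k"
    return "< 100k"


labels = [size_bucket(n) for n in (50_000, 250_000, 750_000, 2_000_000)]
```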


If the bot re-uploads an image only when the savings reach a minimum percentage, the results are:

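| Min % savings to upload | Num images | Original size (bytes) | Compressed size (bytes) | Savings (bytes) | Savings percent |
| --- | --- | --- | --- | --- | --- |
| No minimum | 803,818 | 81,960,965,166 | 71,228,665,347 | 10,732,299,819 | 13.09% |
| 1% min | 669,432 | 66,373,391,140 | 55,666,557,194 | 10,706,833,946 | 16.13% |
| 5% min | 547,563 | 42,563,121,058 | 32,511,726,715 | 10,051,394,343 | 23.62% |
| 10% min | 486,020 | 31,408,817,580 | 22,147,956,737 | 9,260,860,843 | 29.48% |
| 15% min | 447,266 | 24,781,137,815 | 16,332,885,359 | 8,448,252,456 | 34.09% |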
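
The threshold rule can be expressed as a small filter. The sketch below is illustrative only: the function names are hypothetical, and treating a saving exactly equal to the minimum as qualifying is an assumption, since the page does not say how ties were handled.

```python
from typing import Iterable, Optional, Tuple


def savings_percent(orig_bytes: int, new_bytes: int) -> float:
    """Percentage of the original size removed by compression."""
    return 100 * (orig_bytes - new_bytes) / orig_bytes


def should_upload(orig_bytes: int, new_bytes: int,
                  min_percent: Optional[float] = None) -> bool:
    """Apply the re-upload rule; None means "no minimum"."""
    if min_percent is None:
        return True
    return savings_percent(orig_bytes, new_bytes) >= min_percent


def totals_for_threshold(images: Iterable[Tuple[int, int]],
                         min_percent: Optional[float] = None):
    """Return (num_images, original_bytes, compressed_bytes) for the uploads."""
    kept = [(o, c) for o, c in images if should_upload(o, c, min_percent)]
    return len(kept), sum(o for o, _ in kept), sum(c for _, c in kept)


# Made-up sample: savings of 0.5%, 10% and 30%.
sample = [(1000, 995), (1000, 900), (1000, 700)]
for threshold in (None, 1, 5, 10, 15):
    print(threshold, totals_for_threshold(sample, threshold))
```

In the table above, the 15% rule keeps about 79% of the byte savings of the "no minimum" case (8,448,252,456 of 10,732,299,819) while re-uploading roughly 56% of the images (447,266 of 803,818).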

Raw Data

=============================================================================
==                             == STATS ==                                 ==
=============================================================================
  #IMAGES BEFORE_SIZE AFTER_SIZE SAVED % SAVED
=============================================================================
_OVERALL 1604058 163355350076 144737737577 18617612499  11.40%
best:JPG 325415 46825073763 44145437716 2679636047  5.72%
best:PNG 478403 35135891403 27083227631 8052663772  22.92%
best:overall 803818 81960965166 71228665347 10732299819  13.09%
size:100k-500k 226352 49619458385 44653946767 4965511618  10.01%
size:1M + 29366 58554010911 52083348078 6470662833  11.05%
size:500k-1M 42200 29214462252 25123248306 4091213946  14.00%
size:< 100k 1306140 25967418528 22877194426 3090224102  11.90%
threshold:01 669432 66373391140 55666557194 10706833946  16.13%
threshold:05 547563 42563121058 32511726715 10051394343  23.62%
threshold:10 486020 31408817580 22147956737 9260860843  29.48%
threshold:15 447266 24781137815 16332885359 8448252456  34.09%
type:jpg 650678 93629526574 89772325777 3857200797  4.12%
type:png 953380 69725823502 54965411800 14760411702  21.17%
used:OptiPNG 478403 35135819685 27428392261 7707427424  21.94%
used:PNGcrush 474825 34569382865 27517882897 7051499968  20.40%
used:jpegoptim 325415 46825073763 44634546146 2190527617  4.68%
used:jpegtran 325415 46825073763 45156916273 1668157490  3.56%
=============================================================================
Done.
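
The raw rows above can be parsed and sanity-checked with a short script. This is a sketch under the assumption that every data row ends with the five numeric columns named in the header; `parse_stat_line` is a hypothetical helper. Because some labels contain spaces (for example `size:< 100k`), the row is split from the right.

```python
def parse_stat_line(line: str) -> dict:
    """Split one stats row into its label and five numeric columns."""
    label, images, before, after, saved, percent = line.strip().rsplit(None, 5)
    return {
        "label": label,
        "images": int(images),
        "before": int(before),
        "after": int(after),
        "saved": int(saved),
        "percent": float(percent.rstrip("%")),
    }


row = parse_stat_line(
    "size:< 100k 1306140 25967418528 22877194426 3090224102  11.90%"
)
# Consistency checks: SAVED is BEFORE minus AFTER, and % SAVED follows from it.
saved_matches = row["saved"] == row["before"] - row["after"]
percent_matches = abs(100 * row["saved"] / row["before"] - row["percent"]) < 0.005
```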