From 8940e6ca8628e10b45486fad4572a15b63546eb1 Mon Sep 17 00:00:00 2001 From: DRC Date: Sun, 11 May 2014 09:46:28 +0000 Subject: [PATCH] Provide a more thorough description of the trade-offs between the various DCT/IDCT algorithms, based on new research git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.3.x@1286 632fc199-4ca6-4c93-a231-07263d6284db --- README-turbo.txt | 22 +++++++++++----- cjpeg.1 | 23 +++++++++++----- djpeg.1 | 26 +++++++++++++----- libjpeg.txt | 55 +++++++++++++++++++++++++++++++-------- usage.txt | 68 ++++++++++++++++++++++++++++++++++-------------- 5 files changed, 144 insertions(+), 50 deletions(-) diff --git a/README-turbo.txt b/README-turbo.txt index b81299f..a94ff97 100755 --- a/README-turbo.txt +++ b/README-turbo.txt @@ -419,10 +419,16 @@ details. For the most part, libjpeg-turbo should produce identical output to libjpeg v6b. The one exception to this is when using the floating point DCT/IDCT, in -which case the outputs of libjpeg v6b and libjpeg-turbo are not guaranteed to -be identical (the accuracy of the floating point DCT/IDCT is constant when -using libjpeg-turbo's SIMD extensions, but otherwise, it can depend heavily on -the compiler and compiler settings.) +which case the outputs of libjpeg v6b and libjpeg-turbo can differ for the +following reasons: + +-- The SSE/SSE2 floating point DCT implementation in libjpeg-turbo is ever so + slightly more accurate than the implementation in libjpeg v6b, but not by + any amount perceptible to human vision (generally in the range of 0.01 to + 0.08 dB gain in PSNR.) +-- When not using the SIMD extensions, then the accuracy of the floating point + DCT/IDCT can depend on the compiler and compiler settings. 
+ While libjpeg-turbo does emulate the libjpeg v8 API/ABI, under the hood, it is still using the same algorithms as libjpeg v6b, so there are several specific @@ -430,12 +436,14 @@ cases in which libjpeg-turbo cannot be expected to produce the same output as libjpeg v8: -- When decompressing using scaling factors of 1/2 and 1/4, because libjpeg v8 - implements those scaling algorithms a bit differently than libjpeg v6b does, - and libjpeg-turbo's SIMD extensions are based on the libjpeg v6b behavior. + implements those scaling algorithms differently than libjpeg v6b does, and + libjpeg-turbo's SIMD extensions are based on the libjpeg v6b behavior. -- When using chrominance subsampling, because libjpeg v8 implements this with its DCT/IDCT scaling algorithms rather than with a separate - downsampling/upsampling algorithm. + downsampling/upsampling algorithm. In our testing, the subsampled/upsampled + output of libjpeg v8 is less accurate than that of libjpeg v6b for this + reason. -- When using the floating point IDCT, for the reasons stated above and also because the floating point IDCT algorithm was modified in libjpeg v8a to diff --git a/cjpeg.1 b/cjpeg.1 index 113efd5..b4edf62 100644 --- a/cjpeg.1 +++ b/cjpeg.1 @@ -1,4 +1,4 @@ -.TH CJPEG 1 "18 January 2013" +.TH CJPEG 1 "11 May 2014" .SH NAME cjpeg \- compress an image file to a JPEG file .SH SYNOPSIS @@ -166,14 +166,25 @@ Use integer DCT method (default). .TP .B \-dct fast Use fast integer DCT (less accurate). +In libjpeg-turbo, the fast method is generally about 5-15% faster than the int +method when using the x86/x86-64 SIMD extensions (results may vary with other +SIMD implementations, or when using libjpeg-turbo without SIMD extensions.) +For quality levels of 90 and below, there should be little or no perceptible +difference between the two algorithms. For quality levels above 90, however, +the difference between the fast and the int methods becomes more pronounced. 
+With quality=97, for instance, the fast method incurs generally about a 1-3 dB +loss (in PSNR) relative to the int method, but this can be larger for some +images. Do not use the fast method with quality levels above 97. The +algorithm often degenerates at quality=98 and above and can actually produce a +more lossy image than if lower quality levels had been used. .TP .B \-dct float Use floating-point DCT method. -The float method is very slightly more accurate than the int method, but is -much slower unless your machine has very fast floating-point hardware. Also -note that results of the floating-point method may vary slightly across -machines, while the integer methods should give the same results everywhere. -The fast integer method is much less accurate than the other two. +The float method is mostly a legacy feature. It does not produce significantly +more accurate results than the int method, and it is much slower. The float +method may also give different results on different machines due to varying +roundoff behavior, whereas the integer methods should give the same results on +all machines. .TP .BI \-restart " N" Emit a JPEG restart marker every N MCU rows, or every N MCU blocks if "B" is diff --git a/djpeg.1 b/djpeg.1 index 8bb7d27..d77e7ed 100644 --- a/djpeg.1 +++ b/djpeg.1 @@ -1,4 +1,4 @@ -.TH DJPEG 1 "18 January 2013" +.TH DJPEG 1 "11 May 2014" .SH NAME djpeg \- decompress a JPEG file to an image file .SH SYNOPSIS @@ -115,14 +115,28 @@ Use integer DCT method (default). .TP .B \-dct fast Use fast integer DCT (less accurate). +In libjpeg-turbo, the fast method is generally about 5-15% faster than the int +method when using the x86/x86-64 SIMD extensions (results may vary with other +SIMD implementations, or when using libjpeg-turbo without SIMD extensions.) If +the JPEG image was compressed using a quality level of 85 or below, then there +should be little or no perceptible difference between the two algorithms. 
When +decompressing images that were compressed using quality levels above 85, +however, the difference between the fast and int methods becomes more +pronounced. With images compressed using quality=97, for instance, the fast +method incurs generally about a 4-6 dB loss (in PSNR) relative to the int +method, but this can be larger for some images. If you can avoid it, do not +use the fast method when decompressing images that were compressed using +quality levels above 97. The algorithm often degenerates for such images and +can actually produce a more lossy output image than if the JPEG image had been +compressed using lower quality levels. .TP .B \-dct float Use floating-point DCT method. -The float method is very slightly more accurate than the int method, but is -much slower unless your machine has very fast floating-point hardware. Also -note that results of the floating-point method may vary slightly across -machines, while the integer methods should give the same results everywhere. -The fast integer method is much less accurate than the other two. +The float method is mostly a legacy feature. It does not produce significantly +more accurate results than the int method, and it is much slower. The float +method may also give different results on different machines due to varying +roundoff behavior, whereas the integer methods should give the same results on +all machines. .TP .B \-dither fs Use Floyd-Steinberg dithering in color quantization. diff --git a/libjpeg.txt b/libjpeg.txt index d110738..afc002b 100644 --- a/libjpeg.txt +++ b/libjpeg.txt @@ -3,7 +3,7 @@ USING THE IJG JPEG LIBRARY This file was part of the Independent JPEG Group's software: Copyright (C) 1994-2011, Thomas G. Lane, Guido Vollbeding. Modifications: -Copyright (C) 2010, D. R. Commander. +Copyright (C) 2010, 2014, D. R. Commander. For conditions of distribution and use, see the accompanying README file. 
@@ -886,14 +886,23 @@ J_DCT_METHOD dct_method JDCT_FLOAT: floating-point method JDCT_DEFAULT: default method (normally JDCT_ISLOW) JDCT_FASTEST: fastest method (normally JDCT_IFAST) - The FLOAT method is very slightly more accurate than the ISLOW method, - but may give different results on different machines due to varying - roundoff behavior. The integer methods should give the same results - on all machines. On machines with sufficiently fast FP hardware, the - floating-point method may also be the fastest. The IFAST method is - considerably less accurate than the other two; its use is not - recommended if high quality is a concern. JDCT_DEFAULT and - JDCT_FASTEST are macros configurable by each installation. + In libjpeg-turbo, JDCT_IFAST is generally about 5-15% faster than + JDCT_ISLOW when using the x86/x86-64 SIMD extensions (results may vary + with other SIMD implementations, or when using libjpeg-turbo without + SIMD extensions.) For quality levels of 90 and below, there should be + little or no perceptible difference between the two algorithms. For + quality levels above 90, however, the difference between JDCT_IFAST and + JDCT_ISLOW becomes more pronounced. With quality=97, for instance, + JDCT_IFAST incurs generally about a 1-3 dB loss (in PSNR) relative to + JDCT_ISLOW, but this can be larger for some images. Do not use + JDCT_IFAST with quality levels above 97. The algorithm often + degenerates at quality=98 and above and can actually produce a more + lossy image than if lower quality levels had been used. JDCT_FLOAT is + mostly a legacy feature. It does not produce significantly more + accurate results than the ISLOW method, and it is much slower. The + FLOAT method may also give different results on different machines due + to varying roundoff behavior, whereas the integer methods should give + the same results on all machines. 
J_COLOR_SPACE jpeg_color_space int num_components @@ -1170,8 +1179,32 @@ int actual_number_of_colors Additional decompression parameters that the application may set include: J_DCT_METHOD dct_method - Selects the algorithm used for the DCT step. Choices are the same - as described above for compression. + Selects the algorithm used for the DCT step. Choices are: + JDCT_ISLOW: slow but accurate integer algorithm + JDCT_IFAST: faster, less accurate integer method + JDCT_FLOAT: floating-point method + JDCT_DEFAULT: default method (normally JDCT_ISLOW) + JDCT_FASTEST: fastest method (normally JDCT_IFAST) + In libjpeg-turbo, JDCT_IFAST is generally about 5-15% faster than + JDCT_ISLOW when using the x86/x86-64 SIMD extensions (results may vary + with other SIMD implementations, or when using libjpeg-turbo without + SIMD extensions.) If the JPEG image was compressed using a quality + level of 85 or below, then there should be little or no perceptible + difference between the two algorithms. When decompressing images that + were compressed using quality levels above 85, however, the difference + between JDCT_IFAST and JDCT_ISLOW becomes more pronounced. With images + compressed using quality=97, for instance, JDCT_IFAST incurs generally + about a 4-6 dB loss (in PSNR) relative to JDCT_ISLOW, but this can be + larger for some images. If you can avoid it, do not use JDCT_IFAST + when decompressing images that were compressed using quality levels + above 97. The algorithm often degenerates for such images and can + actually produce a more lossy output image than if the JPEG image had + been compressed using lower quality levels. JDCT_FLOAT is mostly a + legacy feature. It does not produce significantly more accurate + results than the ISLOW method, and it is much slower. The FLOAT method + may also give different results on different machines due to varying + roundoff behavior, whereas the integer methods should give the same + results on all machines. 
boolean do_fancy_upsampling If TRUE, do careful upsampling of chroma components. If FALSE, diff --git a/usage.txt b/usage.txt index 14ab77b..b328a21 100644 --- a/usage.txt +++ b/usage.txt @@ -172,13 +172,28 @@ Switches for advanced users: -dct int Use integer DCT method (default). -dct fast Use fast integer DCT (less accurate). -dct float Use floating-point DCT method. - The float method is very slightly more accurate than - the int method, but is much slower unless your machine - has very fast floating-point hardware. Also note that - results of the floating-point method may vary slightly - across machines, while the integer methods should give - the same results everywhere. The fast integer method - is much less accurate than the other two. + In libjpeg-turbo, the fast method is generally about + 5-15% faster than the int method when using the + x86/x86-64 SIMD extensions (results may vary with other + SIMD implementations, or when using libjpeg-turbo + without SIMD extensions.) For quality levels of 90 and + below, there should be little or no perceptible + difference between the two algorithms. For quality + levels above 90, however, the difference between + the fast and the int methods becomes more pronounced. + With quality=97, for instance, the fast method incurs + generally about a 1-3 dB loss (in PSNR) relative to + the int method, but this can be larger for some images. + Do not use the fast method with quality levels above + 97. The algorithm often degenerates at quality=98 and + above and can actually produce a more lossy image than + if lower quality levels had been used. The float + method is mostly a legacy feature. It does not produce + significantly more accurate results than the int + method, and it is much slower. The float method may + also give different results on different machines due + to varying roundoff behavior, whereas the integer + methods should give the same results on all machines. 
-restart N Emit a JPEG restart marker every N MCU rows, or every N MCU blocks if "B" is attached to the number. @@ -296,13 +311,32 @@ Switches for advanced users: -dct int Use integer DCT method (default). -dct fast Use fast integer DCT (less accurate). -dct float Use floating-point DCT method. - The float method is very slightly more accurate than - the int method, but is much slower unless your machine - has very fast floating-point hardware. Also note that - results of the floating-point method may vary slightly - across machines, while the integer methods should give - the same results everywhere. The fast integer method - is much less accurate than the other two. + In libjpeg-turbo, the fast method is generally about + 5-15% faster than the int method when using the + x86/x86-64 SIMD extensions (results may vary with other + SIMD implementations, or when using libjpeg-turbo + without SIMD extensions.) If the JPEG image was + compressed using a quality level of 85 or below, then + there should be little or no perceptible difference + between the two algorithms. When decompressing images + that were compressed using quality levels above 85, + however, the difference between the fast and int + methods becomes more pronounced. With images + compressed using quality=97, for instance, the fast + method incurs generally about a 4-6 dB loss (in PSNR) + relative to the int method, but this can be larger for + some images. If you can avoid it, do not use the fast + method when decompressing images that were compressed + using quality levels above 97. The algorithm often + degenerates for such images and can actually produce + a more lossy output image than if the JPEG image had + been compressed using lower quality levels. The float + method is mostly a legacy feature. It does not produce + significantly more accurate results than the int + method, and it is much slower. 
The float method may + also give different results on different machines due + to varying roundoff behavior, whereas the integer + methods should give the same results on all machines. -dither fs Use Floyd-Steinberg dithering in color quantization. -dither ordered Use ordered dithering in color quantization. @@ -381,12 +415,6 @@ When producing a color-quantized image, "-onepass -dither ordered" is fast but much lower quality than the default behavior. "-dither none" may give acceptable results in two-pass mode, but is seldom tolerable in one-pass mode. -If you are fortunate enough to have very fast floating point hardware, -"-dct float" may be even faster than "-dct fast". But on most machines -"-dct float" is slower than "-dct int"; in this case it is not worth using, -because its theoretical accuracy advantage is too small to be significant -in practice. - Two-pass color quantization requires a good deal of memory; on MS-DOS machines it may run out of memory even with -maxmemory 0. In that case you can still decompress, with some loss of image quality, by specifying -onepass for -- 2.40.0
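Reviewer sketch (not part of the patch): the guidance added above can be reduced to a small selection rule. The enum and both function names below are hypothetical stand-ins so that the fragment compiles without libjpeg-turbo; a real application would use the J_DCT_METHOD values from jpeglib.h and store the result in the dct_method field of its compress or decompress struct. The thresholds 90 and 85 are the ones quoted in the patch text; treating everything above them as "use the accurate method" is a conservative assumption, not something the patch mandates.

```c
/* Hypothetical stand-in for J_DCT_METHOD (jpeglib.h), used only so that this
 * sketch is self-contained. */
typedef enum { SKETCH_DCT_ISLOW, SKETCH_DCT_IFAST } sketch_dct_method;

/* Compression: the patch reports little or no perceptible difference between
 * the fast and int methods at quality 90 and below, and says not to use the
 * fast method above 97.  This sketch only selects the fast method at
 * quality <= 90 and falls back to the accurate integer method otherwise. */
static sketch_dct_method pick_compress_dct(int quality, int prefer_speed)
{
    if (prefer_speed && quality <= 90)
        return SKETCH_DCT_IFAST;
    return SKETCH_DCT_ISLOW;
}

/* Decompression: the same idea, but the patch puts the threshold at the
 * quality level the image was compressed with, 85 and below.  The caller is
 * assumed to know (or have estimated) that source quality. */
static sketch_dct_method pick_decompress_dct(int source_quality,
                                             int prefer_speed)
{
    if (prefer_speed && source_quality <= 85)
        return SKETCH_DCT_IFAST;
    return SKETCH_DCT_ISLOW;
}
```

The float method is deliberately absent from the sketch, since the patch describes it as a legacy feature that is much slower and not significantly more accurate than the int method.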