Provide a more thorough description of the trade-offs between the various DCT/IDCT...

author DRC <dcommander@users.sourceforge.net>

Sun, 11 May 2014 09:46:28 +0000 (09:46 +0000)

committer DRC <dcommander@users.sourceforge.net>

Sun, 11 May 2014 09:46:28 +0000 (09:46 +0000)
diff --git a/README-turbo.txt b/README-turbo.txt
index b81299f1a0ee22ec908f4aafc0c1a5a821f70ff5..a94ff972a931afb34a9734a67c69ef67424dc152 100755
--- a/README-turbo.txt
+++ b/README-turbo.txt
@@ -419,10 +419,16 @@ details.
  
  For the most part, libjpeg-turbo should produce identical output to libjpeg
  v6b.  The one exception to this is when using the floating point DCT/IDCT, in
-which case the outputs of libjpeg v6b and libjpeg-turbo are not guaranteed to
-be identical (the accuracy of the floating point DCT/IDCT is constant when
-using libjpeg-turbo's SIMD extensions, but otherwise, it can depend heavily on
-the compiler and compiler settings.)
+which case the outputs of libjpeg v6b and libjpeg-turbo can differ for the
+following reasons:
+
+-- The SSE/SSE2 floating point DCT implementation in libjpeg-turbo is ever so
+   slightly more accurate than the implementation in libjpeg v6b, but not by
+   any amount perceptible to human vision (generally in the range of 0.01 to
+   0.08 dB gain in PSNR.)
+-- When not using the SIMD extensions, the accuracy of the floating point
+   DCT/IDCT can depend on the compiler and compiler settings.
+
  
  While libjpeg-turbo does emulate the libjpeg v8 API/ABI, under the hood, it is
  still using the same algorithms as libjpeg v6b, so there are several specific
@@ -430,12 +436,14 @@ cases in which libjpeg-turbo cannot be expected to produce the same output as
  libjpeg v8:
  
  -- When decompressing using scaling factors of 1/2 and 1/4, because libjpeg v8
-   implements those scaling algorithms a bit differently than libjpeg v6b does,
-   and libjpeg-turbo's SIMD extensions are based on the libjpeg v6b behavior.
+   implements those scaling algorithms differently than libjpeg v6b does, and
+   libjpeg-turbo's SIMD extensions are based on the libjpeg v6b behavior.
  
  -- When using chrominance subsampling, because libjpeg v8 implements this
     with its DCT/IDCT scaling algorithms rather than with a separate
-   downsampling/upsampling algorithm.
+   downsampling/upsampling algorithm.  In our testing, the subsampled/upsampled
+   output of libjpeg v8 is less accurate than that of libjpeg v6b for this
+   reason.
  
  -- When using the floating point IDCT, for the reasons stated above and also
     because the floating point IDCT algorithm was modified in libjpeg v8a to
diff --git a/cjpeg.1 b/cjpeg.1
index 113efd5209bb39bc2bf8671ef7d1b822a0690f3f..b4edf62383540b565dd97719a7fc255cd79cbd51 100644
--- a/cjpeg.1
+++ b/cjpeg.1
@@ -1,4 +1,4 @@
-.TH CJPEG 1 "18 January 2013"
+.TH CJPEG 1 "11 May 2014"
  .SH NAME
  cjpeg \- compress an image file to a JPEG file
  .SH SYNOPSIS
@@ -166,14 +166,25 @@ Use integer DCT method (default).
  .TP
  .B \-dct fast
  Use fast integer DCT (less accurate).
+In libjpeg-turbo, the fast method is generally about 5-15% faster than the int
+method when using the x86/x86-64 SIMD extensions (results may vary with other
+SIMD implementations, or when using libjpeg-turbo without SIMD extensions.)
+For quality levels of 90 and below, there should be little or no perceptible
+difference between the two algorithms.  For quality levels above 90, however,
+the difference between the fast and the int methods becomes more pronounced.
+With quality=97, for instance, the fast method incurs generally about a 1-3 dB
+loss (in PSNR) relative to the int method, but this can be larger for some
+images.  Do not use the fast method with quality levels above 97.  The
+algorithm often degenerates at quality=98 and above and can actually produce a
+more lossy image than if lower quality levels had been used.
  .TP
  .B \-dct float
  Use floating-point DCT method.
-The float method is very slightly more accurate than the int method, but is
-much slower unless your machine has very fast floating-point hardware.  Also
-note that results of the floating-point method may vary slightly across
-machines, while the integer methods should give the same results everywhere.
-The fast integer method is much less accurate than the other two.
+The float method is mostly a legacy feature.  It does not produce significantly
+more accurate results than the int method, and it is much slower.  The float
+method may also give different results on different machines due to varying
+roundoff behavior, whereas the integer methods should give the same results on
+all machines.
  .TP
  .BI \-restart " N"
  Emit a JPEG restart marker every N MCU rows, or every N MCU blocks if "B" is
diff --git a/djpeg.1 b/djpeg.1
index 8bb7d278f007aacc3b68d346e906450daa04977a..d77e7edd86ebc4675ae05b752f8bde8c6e6c7069 100644
--- a/djpeg.1
+++ b/djpeg.1
@@ -1,4 +1,4 @@
-.TH DJPEG 1 "18 January 2013"
+.TH DJPEG 1 "11 May 2014"
  .SH NAME
  djpeg \- decompress a JPEG file to an image file
  .SH SYNOPSIS
@@ -115,14 +115,28 @@ Use integer DCT method (default).
  .TP
  .B \-dct fast
  Use fast integer DCT (less accurate).
+In libjpeg-turbo, the fast method is generally about 5-15% faster than the int
+method when using the x86/x86-64 SIMD extensions (results may vary with other
+SIMD implementations, or when using libjpeg-turbo without SIMD extensions.)  If
+the JPEG image was compressed using a quality level of 85 or below, then there
+should be little or no perceptible difference between the two algorithms.  When
+decompressing images that were compressed using quality levels above 85,
+however, the difference between the fast and int methods becomes more
+pronounced.  With images compressed using quality=97, for instance, the fast
+method incurs generally about a 4-6 dB loss (in PSNR) relative to the int
+method, but this can be larger for some images.  If you can avoid it, do not
+use the fast method when decompressing images that were compressed using
+quality levels above 97.  The algorithm often degenerates for such images and
+can actually produce a more lossy output image than if the JPEG image had been
+compressed using lower quality levels.
  .TP
  .B \-dct float
  Use floating-point DCT method.
-The float method is very slightly more accurate than the int method, but is
-much slower unless your machine has very fast floating-point hardware.  Also
-note that results of the floating-point method may vary slightly across
-machines, while the integer methods should give the same results everywhere.
-The fast integer method is much less accurate than the other two.
+The float method is mostly a legacy feature.  It does not produce significantly
+more accurate results than the int method, and it is much slower.  The float
+method may also give different results on different machines due to varying
+roundoff behavior, whereas the integer methods should give the same results on
+all machines.
  .TP
  .B \-dither fs
  Use Floyd-Steinberg dithering in color quantization.
diff --git a/libjpeg.txt b/libjpeg.txt
index d110738017ed2c187164ccc198afb3cd006311de..afc002baa76a3ae5470a7e5028d577d820b91dcd 100644
--- a/libjpeg.txt
+++ b/libjpeg.txt
@@ -3,7 +3,7 @@ USING THE IJG JPEG LIBRARY
  This file was part of the Independent JPEG Group's software:
  Copyright (C) 1994-2011, Thomas G. Lane, Guido Vollbeding.
  Modifications:
-Copyright (C) 2010, D. R. Commander.
+Copyright (C) 2010, 2014, D. R. Commander.
  For conditions of distribution and use, see the accompanying README file.
  
  
@@ -886,14 +886,23 @@ J_DCT_METHOD dct_method
                  JDCT_FLOAT: floating-point method
                  JDCT_DEFAULT: default method (normally JDCT_ISLOW)
                  JDCT_FASTEST: fastest method (normally JDCT_IFAST)
-        The FLOAT method is very slightly more accurate than the ISLOW method,
-        but may give different results on different machines due to varying
-        roundoff behavior.  The integer methods should give the same results
-        on all machines.  On machines with sufficiently fast FP hardware, the
-        floating-point method may also be the fastest.  The IFAST method is
-        considerably less accurate than the other two; its use is not
-        recommended if high quality is a concern.  JDCT_DEFAULT and
-        JDCT_FASTEST are macros configurable by each installation.
+        In libjpeg-turbo, JDCT_IFAST is generally about 5-15% faster than
+        JDCT_ISLOW when using the x86/x86-64 SIMD extensions (results may vary
+        with other SIMD implementations, or when using libjpeg-turbo without
+        SIMD extensions.)  For quality levels of 90 and below, there should be
+        little or no perceptible difference between the two algorithms.  For
+        quality levels above 90, however, the difference between JDCT_IFAST and
+        JDCT_ISLOW becomes more pronounced.  With quality=97, for instance,
+        JDCT_IFAST incurs generally about a 1-3 dB loss (in PSNR) relative to
+        JDCT_ISLOW, but this can be larger for some images.  Do not use
+        JDCT_IFAST with quality levels above 97.  The algorithm often
+        degenerates at quality=98 and above and can actually produce a more
+        lossy image than if lower quality levels had been used.  JDCT_FLOAT is
+        mostly a legacy feature.  It does not produce significantly more
+        accurate results than the ISLOW method, and it is much slower.  The
+        FLOAT method may also give different results on different machines due
+        to varying roundoff behavior, whereas the integer methods should give
+        the same results on all machines.
  
  J_COLOR_SPACE jpeg_color_space
  int num_components
@@ -1170,8 +1179,32 @@ int actual_number_of_colors
  Additional decompression parameters that the application may set include:
  
  J_DCT_METHOD dct_method
-        Selects the algorithm used for the DCT step.  Choices are the same
-        as described above for compression.
+        Selects the algorithm used for the DCT step.  Choices are:
+                JDCT_ISLOW: slow but accurate integer algorithm
+                JDCT_IFAST: faster, less accurate integer method
+                JDCT_FLOAT: floating-point method
+                JDCT_DEFAULT: default method (normally JDCT_ISLOW)
+                JDCT_FASTEST: fastest method (normally JDCT_IFAST)
+        In libjpeg-turbo, JDCT_IFAST is generally about 5-15% faster than
+        JDCT_ISLOW when using the x86/x86-64 SIMD extensions (results may vary
+        with other SIMD implementations, or when using libjpeg-turbo without
+        SIMD extensions.)  If the JPEG image was compressed using a quality
+        level of 85 or below, then there should be little or no perceptible
+        difference between the two algorithms.  When decompressing images that
+        were compressed using quality levels above 85, however, the difference
+        between JDCT_IFAST and JDCT_ISLOW becomes more pronounced.  With images
+        compressed using quality=97, for instance, JDCT_IFAST incurs generally
+        about a 4-6 dB loss (in PSNR) relative to JDCT_ISLOW, but this can be
+        larger for some images.  If you can avoid it, do not use JDCT_IFAST
+        when decompressing images that were compressed using quality levels
+        above 97.  The algorithm often degenerates for such images and can
+        actually produce a more lossy output image than if the JPEG image had
+        been compressed using lower quality levels.  JDCT_FLOAT is mostly a
+        legacy feature.  It does not produce significantly more accurate
+        results than the ISLOW method, and it is much slower.  The FLOAT method
+        may also give different results on different machines due to varying
+        roundoff behavior, whereas the integer methods should give the same
+        results on all machines.
  
  boolean do_fancy_upsampling
          If TRUE, do careful upsampling of chroma components.  If FALSE,
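
Reviewer note: the dct_method guidance added to libjpeg.txt above can be condensed into a small selection helper. The following is only a sketch of one conservative policy, not code from this commit or from libjpeg-turbo. The enum `dct_pick` and the function `pick_dct_method` are hypothetical names, used so that the sketch compiles without jpeglib.h; a real application would map the result to JDCT_ISLOW or JDCT_IFAST and assign it to the `dct_method` field. It also assumes that the caller knows the quality level, which a decompressor generally has to obtain from somewhere other than the library.

```c
#include <stdbool.h>

/* Hypothetical stand-in for J_DCT_METHOD so that this sketch builds without
 * jpeglib.h.  Map PICK_ISLOW to JDCT_ISLOW and PICK_IFAST to JDCT_IFAST. */
typedef enum { PICK_ISLOW, PICK_IFAST } dct_pick;

/* quality:       quality level (1-100) the image is, or was, compressed with
 * decompressing: true when choosing a method for the decompressor
 * prefer_speed:  true if the caller wants the faster method where the text
 *                above says the difference should not be perceptible
 *
 * Thresholds follow the text above: 90 for compression, 85 for
 * decompression.  Above the threshold this sketch always falls back to the
 * accurate integer method, which also keeps it clear of the quality>97 range
 * that the text warns against. */
dct_pick pick_dct_method(int quality, bool decompressing, bool prefer_speed)
{
    int threshold = decompressing ? 85 : 90;

    if (prefer_speed && quality <= threshold)
        return PICK_IFAST;
    return PICK_ISLOW;
}
```

The float method is never returned because the new text describes it as a legacy feature that is slower and not significantly more accurate than the int method.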
diff --git a/usage.txt b/usage.txt
index 14ab77b2fc4db37fd82e9ee327e6cd9e44861da7..b328a21c4e158f49be305cf4a2267f2bf05d4167 100644
--- a/usage.txt
+++ b/usage.txt
@@ -172,13 +172,28 @@ Switches for advanced users:
          -dct int        Use integer DCT method (default).
          -dct fast       Use fast integer DCT (less accurate).
          -dct float      Use floating-point DCT method.
-                        The float method is very slightly more accurate than
-                        the int method, but is much slower unless your machine
-                        has very fast floating-point hardware.  Also note that
-                        results of the floating-point method may vary slightly
-                        across machines, while the integer methods should give
-                        the same results everywhere.  The fast integer method
-                        is much less accurate than the other two.
+                        In libjpeg-turbo, the fast method is generally about
+                        5-15% faster than the int method when using the
+                        x86/x86-64 SIMD extensions (results may vary with other
+                        SIMD implementations, or when using libjpeg-turbo
+                        without SIMD extensions.)  For quality levels of 90 and
+                        below, there should be little or no perceptible
+                        difference between the two algorithms.  For quality
+                        levels above 90, however, the difference between
+                        the fast and the int methods becomes more pronounced.
+                        With quality=97, for instance, the fast method incurs
+                        generally about a 1-3 dB loss (in PSNR) relative to
+                        the int method, but this can be larger for some images.
+                        Do not use the fast method with quality levels above
+                        97.  The algorithm often degenerates at quality=98 and
+                        above and can actually produce a more lossy image than
+                        if lower quality levels had been used.  The float
+                        method is mostly a legacy feature.  It does not produce
+                        significantly more accurate results than the int
+                        method, and it is much slower.  The float method may
+                        also give different results on different machines due
+                        to varying roundoff behavior, whereas the integer
+                        methods should give the same results on all machines.
  
          -restart N      Emit a JPEG restart marker every N MCU rows, or every
                          N MCU blocks if "B" is attached to the number.
@@ -296,13 +311,32 @@ Switches for advanced users:
          -dct int        Use integer DCT method (default).
          -dct fast       Use fast integer DCT (less accurate).
          -dct float      Use floating-point DCT method.
-                        The float method is very slightly more accurate than
-                        the int method, but is much slower unless your machine
-                        has very fast floating-point hardware.  Also note that
-                        results of the floating-point method may vary slightly
-                        across machines, while the integer methods should give
-                        the same results everywhere.  The fast integer method
-                        is much less accurate than the other two.
+                        In libjpeg-turbo, the fast method is generally about
+                        5-15% faster than the int method when using the
+                        x86/x86-64 SIMD extensions (results may vary with other
+                        SIMD implementations, or when using libjpeg-turbo
+                        without SIMD extensions.)  If the JPEG image was
+                        compressed using a quality level of 85 or below, then
+                        there should be little or no perceptible difference
+                        between the two algorithms.  When decompressing images
+                        that were compressed using quality levels above 85,
+                        however, the difference between the fast and int
+                        methods becomes more pronounced.  With images
+                        compressed using quality=97, for instance, the fast
+                        method incurs generally about a 4-6 dB loss (in PSNR)
+                        relative to the int method, but this can be larger for
+                        some images.  If you can avoid it, do not use the fast
+                        method when decompressing images that were compressed
+                        using quality levels above 97.  The algorithm often
+                        degenerates for such images and can actually produce
+                        a more lossy output image than if the JPEG image had
+                        been compressed using lower quality levels.  The float
+                        method is mostly a legacy feature.  It does not produce
+                        significantly more accurate results than the int
+                        method, and it is much slower.  The float method may
+                        also give different results on different machines due
+                        to varying roundoff behavior, whereas the integer
+                        methods should give the same results on all machines.
  
          -dither fs      Use Floyd-Steinberg dithering in color quantization.
          -dither ordered Use ordered dithering in color quantization.
@@ -381,12 +415,6 @@ When producing a color-quantized image, "-onepass -dither ordered" is fast but
  much lower quality than the default behavior.  "-dither none" may give
  acceptable results in two-pass mode, but is seldom tolerable in one-pass mode.
  
-If you are fortunate enough to have very fast floating point hardware,
-"-dct float" may be even faster than "-dct fast".  But on most machines
-"-dct float" is slower than "-dct int"; in this case it is not worth using,
-because its theoretical accuracy advantage is too small to be significant
-in practice.
-
  Two-pass color quantization requires a good deal of memory; on MS-DOS machines
  it may run out of memory even with -maxmemory 0.  In that case you can still
  decompress, with some loss of image quality, by specifying -onepass for