Merge branch '1.4.x'

author DRC <information@libjpeg-turbo.org>

Fri, 5 Feb 2016 00:52:23 +0000 (18:52 -0600)

committer DRC <information@libjpeg-turbo.org>

Fri, 5 Feb 2016 00:52:23 +0000 (18:52 -0600)
diff --cc ChangeLog.txt

index 16b264d4736a8119c2dcb7deeb45ccf17dae6306,101a0669047b2d7654b1fd5719218b927bba56df..505f7664ec8969ee267015b9a510fd5f0993d014
--- 1/ChangeLog.txt
--- 2/ChangeLog.txt
+++ b/ChangeLog.txt
@@@ -48,49 -10,15 +48,58 @@@ between the i386 and x86_64 RPMs (any d
   are not allowed when 32-bit and 64-bit RPMs are installed simultaneously.)
   Since the macro is used only internally, it has been moved into jconfigint.h.
   
- -[2] Fixed an issue in the accelerated Huffman decoder that could have caused
+ +[10] The x86-64 SIMD code can now be disabled at run time by setting the
+ +JSIMD_FORCENONE environment variable to 1 (the other SIMD implementations
+ +already had this capability.)
+ +
+ +[11] Added a new command-line argument to TJBench (-nowrite) that prevents the
+ +benchmark from outputting any images.  This removes any potential operating
+ +system overhead that might be caused by lazy writes to disk and thus improves
+ +the consistency of the performance measurements.
+ +
+ +[12] Added SIMD acceleration for Huffman encoding on SSE2-capable x86 and
+ +x86-64 platforms.  This speeds up the compression of full-color JPEGs by about
+ +10-15% on average (relative to libjpeg-turbo 1.4.x) when using modern Intel and
+ +AMD CPUs.  Additionally, this works around an issue in the clang optimizer that
+ +prevents it (as of this writing) from achieving the same performance as GCC
+ +when compiling the C version of the Huffman encoder
+ +(https://llvm.org/bugs/show_bug.cgi?id=16035). For the purposes of benchmarking
+ +or regression testing, SIMD-accelerated Huffman encoding can be disabled by
+ +setting the JSIMD_NOHUFFENC environment variable to 1.
+ +
+ +[13] Added SIMD acceleration for Huffman encoding on NEON-capable ARM 32-bit
+ +platforms.  This speeds up the compression of full-color JPEGs by about 30% on
+ +average on a Cortex-A9 core (iPhone 4S) and by about 6-7% on average on
+ +Cortex-A53 and Cortex-A57 cores.  For the purposes of benchmarking or
+ +regression testing, SIMD-accelerated Huffman encoding can be disabled by
+ +setting the JSIMD_NOHUFFENC environment variable to 1.
+ +
+ +[14] Added ARM 64-bit (ARMv8) NEON SIMD implementations of the commonly-used
+ +compression algorithms (including the slow integer forward DCT and h2v2 & h2v1
+ +downsampling algorithms, which are not accelerated in the 32-bit NEON
+ +implementation.)  This speeds up the compression of full-color JPEGs by about
+ +75% on average on a Cavium ThunderX processor and by about 2-2.5x on average on
+ +Cortex-A53 and Cortex-A57 cores.
+ +
+ +[15] pkg-config (.pc) scripts are now included for both the libjpeg and
+ +TurboJPEG API libraries on Un*x systems.  Note that if a project's build system
+ +relies on these scripts, then it will not be possible to build that project
+ +with libjpeg or with a prior version of libjpeg-turbo.
+ +
+ +[16] Optimized the ARM 64-bit (ARMv8) NEON SIMD decompression routines to
+ +improve performance on CPUs with in-order pipelines.  This speeds up the
+ +decompression of full-color JPEGs by nearly 2x on average on a Cavium ThunderX
+ +processor and by about 15% on average on a Cortex-A53 core.
+ +
++[17] Fixed an issue in the accelerated Huffman decoder that could have caused
+ the decoder to read past the end of the input buffer when a malformed,
+ specially-crafted JPEG image was being decompressed.  In prior versions of
+ libjpeg-turbo, the accelerated Huffman decoder was invoked (in most cases) only
+ if there were > 128 bytes of data in the input buffer.  However, it is possible
+ to construct a JPEG image in which a single Huffman block is over 430 bytes
+ long, so this version of libjpeg-turbo activates the accelerated Huffman
+ decoder only if there are > 512 bytes of data in the input buffer.
+ 
   
   1.4.2
   =====
diff --cc jdhuff.c

index e3a3f0aac39df79c8eb2fc4e12c8866972c0f006,2ab44a430946b9261ef9337555847008a8f7bdbc..e0495ab82f5a07fba5ca39a0713bc0fb2abcda5b
--- 1/jdhuff.c
--- 2/jdhuff.c
+++ b/jdhuff.c
@@@ -4,9 -4,8 +4,9 @@@
    * This file was part of the Independent JPEG Group's software:
    * Copyright (C) 1991-1997, Thomas G. Lane.
    * libjpeg-turbo Modifications:
-  * Copyright (C) 2009-2011, 2015, D. R. Commander.
+  * Copyright (C) 2009-2011, 2016, D. R. Commander.
- - * For conditions of distribution and use, see the accompanying README file.
+ + * For conditions of distribution and use, see the accompanying README.ijg
+ + * file.
    *
    * This file contains Huffman entropy decoding routines.
    *
diff --cc jmemmgr.c

index 4ddf33ff76c2d197893a708b9196c71215db261e,4b0fcac01fd250b784d9eb9247932f085100d870..73e770f6c1c4e625b5eb047bfef7367d919a5190
--- 1/jmemmgr.c
--- 2/jmemmgr.c
+++ b/jmemmgr.c
@@@ -3,10 -3,9 +3,10 @@@
    *
    * This file was part of the Independent JPEG Group's software:
    * Copyright (C) 1991-1997, Thomas G. Lane.
-  * It was modified by The libjpeg-turbo Project to include only code and
-  * information relevant to libjpeg-turbo.
+  * libjpeg-turbo Modifications:
+  * Copyright (C) 2016, D. R. Commander.
- - * For conditions of distribution and use, see the accompanying README file.
+ + * For conditions of distribution and use, see the accompanying README.ijg
+ + * file.
    *
    * This file contains the JPEG system-independent memory management
    * routines.  This code is usable across a wide variety of machines; most
diff --cc rdppm.c

index bf8ded01380b14b83b9b4456a1fb6e28e47e9348,ebe82acc069c82b57f5da95bd9f6029885d4f28e..f496ab36704e373871cac15502e4a50d276297d6
--- 1/rdppm.c
--- 2/rdppm.c
+++ b/rdppm.c
@@@ -414,11 -414,13 +414,13 @@@ start_input_ppm (j_compress_ptr cinfo, 
       /* On 16-bit-int machines we have to be careful of maxval = 65535 */
       source->rescale = (JSAMPLE *)
         (*cinfo->mem->alloc_small) ((j_common_ptr) cinfo, JPOOL_IMAGE,
-                                   (size_t) (((long) maxval + 1L) * sizeof(JSAMPLE)));
+                                   (size_t) (((long) maxval + 1L) *
+                                             sizeof(JSAMPLE)));
       half_maxval = maxval / 2;
- -    for (val = 0; val <= (INT32) maxval; val++) {
+ +    for (val = 0; val <= (long) maxval; val++) {
         /* The multiplication here must be done in 32 bits to avoid overflow */
-       source->rescale[val] = (JSAMPLE) ((val*MAXJSAMPLE + half_maxval)/maxval);
+       source->rescale[val] = (JSAMPLE) ((val * MAXJSAMPLE + half_maxval) /
+                                         maxval);
       }
     }
   }
diff --cc turbojpeg.c
Simple merge
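
Note on the rdppm.c hunk: the merged loop counts `val` up to `(long) maxval` rather than `(INT32) maxval`. The sketch below is a standalone illustration of that rescale-table computation, not the libjpeg-turbo code itself. `sample_t`, `MAX_SAMPLE`, and `build_rescale_table` are hypothetical stand-ins for `JSAMPLE`, `MAXJSAMPLE`, and the logic in `start_input_ppm`, and it assumes 8-bit samples and plain `malloc` in place of the library's memory manager.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for JSAMPLE and MAXJSAMPLE (8-bit samples assumed). */
typedef unsigned char sample_t;
#define MAX_SAMPLE 255

/*
 * Build a lookup table that maps input values 0..maxval onto 0..MAX_SAMPLE
 * with rounding.  Returns NULL if maxval is outside 1..65535 or if the
 * allocation fails.  The caller frees the table.
 */
sample_t *build_rescale_table(unsigned int maxval)
{
  sample_t *table;
  long val, half_maxval;

  if (maxval == 0 || maxval > 65535U)
    return NULL;

  table = malloc(((size_t) maxval + 1) * sizeof(sample_t));
  if (table == NULL)
    return NULL;

  half_maxval = (long) maxval / 2;

  /* long is at least 32 bits wide, so val can reach 65535 even where int is
   * 16 bits, and val * MAX_SAMPLE (at most 65535 * 255) cannot overflow. */
  for (val = 0; val <= (long) maxval; val++)
    table[val] = (sample_t) ((val * MAX_SAMPLE + half_maxval) / (long) maxval);

  /* Sanity check: the largest input maps to the largest sample value. */
  assert(table[maxval] == MAX_SAMPLE);
  return table;
}
```

A caller would build the table once per image, for example `build_rescale_table(65535U)` for a 16-bit PPM, index it with each raw sample, and `free` it when done.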