From: DRC
Date: Fri, 5 Feb 2016 00:52:23 +0000 (-0600)
Subject: Merge branch '1.4.x'
X-Git-Tag: 1.4.90~33
X-Git-Url: https://granicus.if.org/sourcecode?a=commitdiff_plain;h=55a18d40076b9a70a1c11be94d8e1a4129639bb4;p=libjpeg-turbo

Merge branch '1.4.x'
---

55a18d40076b9a70a1c11be94d8e1a4129639bb4
diff --cc ChangeLog.txt
index 16b264d,101a066..505f766
--- a/ChangeLog.txt
+++ b/ChangeLog.txt
@@@ -48,49 -10,15 +48,58 @@@ between the i386 and x86_64 RPMs (any d
  are not allowed when 32-bit and 64-bit RPMs are installed simultaneously.)
  Since the macro is used only internally, it has been moved into jconfigint.h.
  
 -[2] Fixed an issue in the accelerated Huffman decoder that could have caused
 +[10] The x86-64 SIMD code can now be disabled at run time by setting the
 +JSIMD_FORCENONE environment variable to 1 (the other SIMD implementations
 +already had this capability.)
 +
 +[11] Added a new command-line argument to TJBench (-nowrite) that prevents the
 +benchmark from outputting any images. This removes any potential operating
 +system overhead that might be caused by lazy writes to disk and thus improves
 +the consistency of the performance measurements.
 +
 +[12] Added SIMD acceleration for Huffman encoding on SSE2-capable x86 and
 +x86-64 platforms. This speeds up the compression of full-color JPEGs by about
 +10-15% on average (relative to libjpeg-turbo 1.4.x) when using modern Intel and
 +AMD CPUs. Additionally, this works around an issue in the clang optimizer that
 +prevents it (as of this writing) from achieving the same performance as GCC
 +when compiling the C version of the Huffman encoder
 +(https://llvm.org/bugs/show_bug.cgi?id=16035). For the purposes of benchmarking
 +or regression testing, SIMD-accelerated Huffman encoding can be disabled by
 +setting the JSIMD_NOHUFFENC environment variable to 1.
 +
 +[13] Added SIMD acceleration for Huffman encoding on NEON-capable ARM 32-bit
 +platforms. This speeds up the compression of full-color JPEGs by about 30% on
 +average on a Cortex-A9 core (iPhone 4S) and by about 6-7% on average on
 +Cortex-A53 and Cortex-A57 cores. For the purposes of benchmarking or
 +regression testing, SIMD-accelerated Huffman encoding can be disabled by
 +setting the JSIMD_NOHUFFENC environment variable to 1.
 +
 +[14] Added ARM 64-bit (ARMv8) NEON SIMD implementations of the commonly-used
 +compression algorithms (including the slow integer forward DCT and h2v2 & h2v1
 +downsampling algorithms, which are not accelerated in the 32-bit NEON
 +implementation.) This speeds up the compression of full-color JPEGs by about
 +75% on average on a Cavium ThunderX processor and by about 2-2.5x on average on
 +Cortex-A53 and Cortex-A57 cores.
 +
 +[15] pkg-config (.pc) scripts are now included for both the libjpeg and
 +TurboJPEG API libraries on Un*x systems. Note that if a project's build system
 +relies on these scripts, then it will not be possible to build that project
 +with libjpeg or with a prior version of libjpeg-turbo.
 +
 +[16] Optimized the ARM 64-bit (ARMv8) NEON SIMD decompression routines to
 +improve performance on CPUs with in-order pipelines. This speeds up the
 +decompression of full-color JPEGs by nearly 2x on average on a Cavium ThunderX
 +processor and by about 15% on average on a Cortex-A53 core.
 +
++[17] Fixed an issue in the accelerated Huffman decoder that could have caused
+ the decoder to read past the end of the input buffer when a malformed,
+ specially-crafted JPEG image was being decompressed. In prior versions of
+ libjpeg-turbo, the accelerated Huffman decoder was invoked (in most cases) only
+ if there were > 128 bytes of data in the input buffer. However, it is possible
+ to construct a JPEG image in which a single Huffman block is over 430 bytes
+ long, so this version of libjpeg-turbo activates the accelerated Huffman
+ decoder only if there are > 512 bytes of data in the input buffer.
+ 
  
  1.4.2
  =====
diff --cc jdhuff.c
index e3a3f0a,2ab44a4..e0495ab
--- a/jdhuff.c
+++ b/jdhuff.c
@@@ -4,9 -4,8 +4,9 @@@
   * This file was part of the Independent JPEG Group's software:
   * Copyright (C) 1991-1997, Thomas G. Lane.
   * libjpeg-turbo Modifications:
-  * Copyright (C) 2009-2011, 2015, D. R. Commander.
+  * Copyright (C) 2009-2011, 2016, D. R. Commander.
 - * For conditions of distribution and use, see the accompanying README file.
 + * For conditions of distribution and use, see the accompanying README.ijg
 + * file.
   *
   * This file contains Huffman entropy decoding routines.
   *
diff --cc jmemmgr.c
index 4ddf33f,4b0fcac..73e770f
--- a/jmemmgr.c
+++ b/jmemmgr.c
@@@ -3,10 -3,9 +3,10 @@@
   *
   * This file was part of the Independent JPEG Group's software:
   * Copyright (C) 1991-1997, Thomas G. Lane.
-  * It was modified by The libjpeg-turbo Project to include only code and
-  * information relevant to libjpeg-turbo.
+  * libjpeg-turbo Modifications:
+  * Copyright (C) 2016, D. R. Commander.
 - * For conditions of distribution and use, see the accompanying README file.
 + * For conditions of distribution and use, see the accompanying README.ijg
 + * file.
   *
   * This file contains the JPEG system-independent memory management
   * routines. This code is usable across a wide variety of machines; most
diff --cc rdppm.c
index bf8ded0,ebe82ac..f496ab3
--- a/rdppm.c
+++ b/rdppm.c
@@@ -414,11 -414,13 +414,13 @@@ start_input_ppm (j_compress_ptr cinfo, 
      /* On 16-bit-int machines we have to be careful of maxval = 65535 */
      source->rescale = (JSAMPLE *)
        (*cinfo->mem->alloc_small) ((j_common_ptr) cinfo, JPOOL_IMAGE,
-                                   (size_t) (((long) maxval + 1L) * sizeof(JSAMPLE)));
+                                   (size_t) (((long) maxval + 1L) *
+                                             sizeof(JSAMPLE)));
      half_maxval = maxval / 2;
-     for (val = 0; val <= (INT32) maxval; val++) {
+     for (val = 0; val <= (long) maxval; val++) {
        /* The multiplication here must be done in 32 bits to avoid overflow */
-       source->rescale[val] = (JSAMPLE) ((val*MAXJSAMPLE + half_maxval)/maxval);
+       source->rescale[val] = (JSAMPLE) ((val * MAXJSAMPLE + half_maxval) /
+                                         maxval);
      }
    }
  }
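
Note on the rdppm.c hunk above: the loop fills a lookup table that maps PPM sample values in [0, maxval] onto [0, MAXJSAMPLE], rounding to nearest by adding half_maxval before the division. The following is a minimal standalone sketch of that computation, not the libjpeg-turbo code itself. It assumes 8-bit samples (MAXJSAMPLE = 255), uses malloc() in place of the library's alloc_small pool allocator, and the function name build_rescale_table is hypothetical.

```c
#include <stdlib.h>

/* Assumption: 8-bit samples.  These definitions stand in for the library's
 * own JSAMPLE and MAXJSAMPLE. */
typedef unsigned char JSAMPLE;
#define MAXJSAMPLE 255

/* Hypothetical helper, not part of libjpeg-turbo.  Returns a table of
 * maxval + 1 entries mapping [0, maxval] onto [0, MAXJSAMPLE], rounded to
 * nearest.  The caller frees the result.  Returns NULL if maxval is outside
 * 1..65535 or if allocation fails. */
JSAMPLE *build_rescale_table(unsigned int maxval)
{
  JSAMPLE *table;
  long val, half_maxval;

  if (maxval == 0 || maxval > 65535)
    return NULL;

  /* Size is computed in long so that maxval = 65535 does not wrap where int
   * is only 16 bits wide. */
  table = (JSAMPLE *) malloc((size_t) (((long) maxval + 1L) *
                                       sizeof(JSAMPLE)));
  if (table == NULL)
    return NULL;

  half_maxval = (long) (maxval / 2);
  for (val = 0; val <= (long) maxval; val++) {
    /* long is at least 32 bits, so 65535 * 255 + 32767 cannot overflow. */
    table[val] = (JSAMPLE) ((val * MAXJSAMPLE + half_maxval) /
                            (long) maxval);
  }
  return table;
}
```

With maxval = 255 the table is the identity mapping. With maxval = 65535 the top entry is 255 and the bottom entry is 0, and a caller would index the table with each sample it reads.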