granicus.if.org Git - libjpeg-turbo/log
libjpeg-turbo
8 years agoBuild: Clean up inline keyword detection
DRC [Mon, 5 Dec 2016 22:52:54 +0000 (16:52 -0600)]
Build: Clean up inline keyword detection

Strict C89-conformant compilers don't support the "inline" keyword, but
most of them support "__inline__", and that keyword can be used with the
always_inline attribute as well.  This commit also removes duplicate code
by using a foreach() loop to test the various keywords.
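
As a rough illustration of how a detected keyword might be consumed in C
code, here is a minimal sketch.  The macro name FORCE_INLINE and the
helper clamp_byte() are hypothetical, and the preprocessor checks merely
stand in for whatever the build system detects.

```c
/* Hypothetical macro; in a real build the keyword would come from the
 * build system's detection rather than from these preprocessor checks. */
#if defined(__GNUC__) || defined(__clang__)
#define FORCE_INLINE  __inline__ __attribute__((always_inline))
#elif defined(_MSC_VER)
#define FORCE_INLINE  __forceinline
#else
#define FORCE_INLINE  /* no inline keyword: fall back to a plain function */
#endif

/* Clamps v to the range of an 8-bit sample (0..255). */
static FORCE_INLINE int clamp_byte(int v)
{
  if (v < 0) return 0;
  if (v > 255) return 255;
  return v;
}
```

Because "__inline__" is spelled with double underscores, the GCC/clang
branch should still compile when the compiler is run in strict C89 mode.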

8 years agoPackaging: Use correct name for SRPM spec file
DRC [Sat, 3 Dec 2016 21:51:58 +0000 (15:51 -0600)]
Packaging: Use correct name for SRPM spec file

Per convention, the file should be named {package name}.spec.

8 years agoBuild: Fix issues when building as a Git submodule
DRC [Sat, 3 Dec 2016 20:21:11 +0000 (14:21 -0600)]
Build: Fix issues when building as a Git submodule

- Replace CMAKE_SOURCE_DIR with CMAKE_CURRENT_SOURCE_DIR
- Replace CMAKE_BINARY_DIR with CMAKE_CURRENT_BINARY_DIR
- Don't use "libjpeg-turbo" in any of the package system filenames
  (because CMAKE_PROJECT_NAME will not be the same if building LJT as
  a submodule.)

Closes #122

8 years agoBuild: Fix buglet in output of `make tjtest`
DRC [Sat, 3 Dec 2016 19:38:21 +0000 (13:38 -0600)]
Build: Fix buglet in output of `make tjtest`

8 years agoBuild: Fix regression in AltiVec SIMD detection
DRC [Sat, 3 Dec 2016 21:17:09 +0000 (21:17 +0000)]
Build: Fix regression in AltiVec SIMD detection

Only the SIMD source files should be built with -maltivec.  Otherwise
the detection code will not be compiled in.

8 years agoBuild: Use wrapper script for gas-preprocessor.pl
DRC [Sat, 26 Nov 2016 00:50:11 +0000 (18:50 -0600)]
Build: Use wrapper script for gas-preprocessor.pl

The previous hack (adding ${CMAKE_ASM_COMPILER} to CMAKE_ASM_FLAGS)
didn't work in all cases, because more recent versions of CMake place
the includes ahead of the flags (which meant that the real assembler
wasn't the first argument to gas-preprocessor.pl.)

8 years agoTravis: Fix OS X build
DRC [Thu, 24 Nov 2016 00:44:33 +0000 (18:44 -0600)]
Travis: Fix OS X build

+ migrate to new xcode7.3 image, since xcode7.2 is going away soon.

8 years agoBuild: Fix RPATH handling
DRC [Wed, 23 Nov 2016 23:12:57 +0000 (17:12 -0600)]
Build: Fix RPATH handling

CMAKE_INSTALL_RPATH has to be set before the targets are defined (oops.)

This also explicitly turns on MACOSX_RPATH for the shared libraries
(which is the default with newer versions of CMake but not with 2.8.x.)
The old autotools/libtool build system hard-coded the install name
directory of the OS X shared libraries to libdir, which meant that any
executable that linked against those libraries would also be hard-coded
to look for the libjpeg-turbo libraries in that directory.  @rpath makes
the OS X version of libjpeg-turbo behave like the Linux version, in the
sense that the executables under /opt/libjpeg-turbo/bin will
automatically pick up the libraries under /opt/libjpeg-turbo/lib* by
default, but other executables won't unless they are linked with -rpath.

8 years agoUnified CMake-based build system
DRC [Tue, 15 Nov 2016 14:47:43 +0000 (08:47 -0600)]
Unified CMake-based build system

See #56 for discussion.

Fixes #21, Fixes #29, Fixes #37, Closes #56, Fixes #58, Closes #73
Obviates #82

See also:
https://sourceforge.net/p/libjpeg-turbo/feature-requests/5/
https://sourceforge.net/p/libjpeg-turbo/patches/5/

8 years agoBUILDING.md: NASM 2.10+/YASM 1.2.0+ always needed
DRC [Fri, 18 Nov 2016 14:31:49 +0000 (08:31 -0600)]
BUILDING.md: NASM 2.10+/YASM 1.2.0+ always needed

... for all x86[-64] builds, because we now have both 64-bit and 32-bit
AVX2 SIMD extensions.

8 years agoAdvertise the new AVX2 SIMD extensions
DRC [Wed, 16 Nov 2016 21:55:12 +0000 (15:55 -0600)]
Advertise the new AVX2 SIMD extensions

(our story so far ...)

8 years agoMerge branch 'master' into dev
DRC [Tue, 22 Nov 2016 15:33:19 +0000 (09:33 -0600)]
Merge branch 'master' into dev

8 years agoAppVeyor: Use built-in MSYS2 MinGW compilers
DRC [Tue, 22 Nov 2016 04:58:18 +0000 (22:58 -0600)]
AppVeyor: Use built-in MSYS2 MinGW compilers

AppVeyor already has MinGW32 and MinGW64 flavors of GCC 5.3.0
installed under MSYS2, so there is no need to install our own builds of
MinGW.  MinGW-builds is no longer an active project, and we were getting
occasional timeouts while wgetting those files from SourceForge.
Furthermore, GCC 5.3.0 should produce faster code than GCC 4.8.1.

8 years agoBUILDING.md: Clarifications and wordsmithing
DRC [Mon, 21 Nov 2016 02:59:55 +0000 (20:59 -0600)]
BUILDING.md: Clarifications and wordsmithing

Updated out-of-date information, wordsmithed and clarified many
sections, and generally cleaned up the build recipes (including a
complete overhaul of the iOS recipes.)

8 years agoWindows build: Add an "uninstall" target
DRC [Mon, 21 Nov 2016 01:10:54 +0000 (19:10 -0600)]
Windows build: Add an "uninstall" target

8 years agoBUILDING.md/README.md: Increment libjpeg SO age
DRC [Fri, 18 Nov 2016 15:09:41 +0000 (09:09 -0600)]
BUILDING.md/README.md: Increment libjpeg SO age

Documentation buglet.  This should have been changed in
6ed4d9d11085acd04dc7f2f899848693976dc010 to reflect the addition of
libjpeg API functions in libjpeg-turbo 1.5.

8 years agoREADME.md: Don't use trailing spaces as line break
DRC [Fri, 18 Nov 2016 15:08:52 +0000 (09:08 -0600)]
README.md: Don't use trailing spaces as line break

Makes it easier to maintain this file using editors that automatically
remove trailing spaces.

8 years agoTJBench: Fix regression/-nowrite always enabled
DRC [Wed, 16 Nov 2016 21:08:16 +0000 (15:08 -0600)]
TJBench: Fix regression/-nowrite always enabled

Introduced by eb59b6e72d8098a1f7b8c7e0c710b32eb6f5dc45

8 years agoBUILDING.md: Don't use trailing spaces as line break
DRC [Tue, 15 Nov 2016 15:52:49 +0000 (09:52 -0600)]
BUILDING.md: Don't use trailing spaces as line break

Makes it easier to maintain this file using editors that automatically
remove trailing spaces.

8 years agoCMake build system: Fix the "testclean" target
DRC [Tue, 15 Nov 2016 14:37:04 +0000 (08:37 -0600)]
CMake build system: Fix the "testclean" target

Regression caused by f9134384b728d8943f252b27464d83c4b7b2d159

This commit also makes the "testclean" target clean up the 4:1:1 test
images.  This was implemented in the autotools build system in
1f3635c4969f2319a01c9fe561958815b733227f but was left out of the CMake
build system due to an oversight.

8 years agoDetect AltiVec support on AmigaOS 4
Chris Young [Fri, 18 Nov 2016 19:03:28 +0000 (19:03 +0000)]
Detect AltiVec support on AmigaOS 4

8 years agoMerge branch 'master' into dev
DRC [Fri, 21 Oct 2016 00:17:30 +0000 (19:17 -0500)]
Merge branch 'master' into dev

8 years agoTravis: Deploy to S3 rather than SourceForge
DRC [Thu, 20 Oct 2016 22:55:55 +0000 (17:55 -0500)]
Travis: Deploy to S3 rather than SourceForge

This has the following advantages:
-- It doesn't require checking a private SSH key into the repository.
(With SourceForge, an SSH key is the "keys to the kingdom".)
-- If the S3 key is compromised, it is very easy to revoke it and
generate a new one.
-- The S3 bucket is isolated, so even if it becomes compromised, then
the damage that one could do is limited.
-- It's much easier to manage files through S3's web interface than
through SourceForge.
-- The files are served via HTTPS.
-- Travis fully supports S3 as a deployment target, so this simplifies
.travis.yml somewhat.

8 years agoMerge branch 'master' into dev
DRC [Thu, 20 Oct 2016 06:37:40 +0000 (01:37 -0500)]
Merge branch 'master' into dev

8 years agoTravis: GPG sign Linux binaries/source tarballs
DRC [Thu, 20 Oct 2016 06:01:27 +0000 (01:01 -0500)]
Travis: GPG sign Linux binaries/source tarballs

Since we're still deploying our Linux/macOS CI artifacts to a web server
(specifically SourceForge Project Web Services) that doesn't support
HTTPS, it's a good idea to sign them.  But since the private key has to
be checked into the repository, we use a different key for signing the
pre-releases (per project policy, the private signing keys for our
release binaries are never made available on any public server.)

8 years agoWin: Use YASM if it is in the PATH and NASM isn't
DRC [Tue, 11 Oct 2016 16:58:20 +0000 (11:58 -0500)]
Win: Use YASM if it is in the PATH and NASM isn't

Previously, simd/CMakeLists.txt was hard-coded to use NASM, and it was
necessary to override the NASM variable in order to use YASM.  This
commit changes the behavior such that NASM is still preferred, but YASM
will be used if it is in the PATH and NASM isn't available.  This brings
the actual behavior in line with the behavior described in BUILDING.md.

Based on
https://github.com/xpol/libjpeg-turbo/commit/b0799a1598782799d4876538eddca7ad8438d8a6

Closes #107

8 years agoMerge branch 'master' into dev
DRC [Fri, 7 Oct 2016 17:55:18 +0000 (12:55 -0500)]
Merge branch 'master' into dev

8 years agoTravis: Fix deployment issue (2nd attempt)
DRC [Fri, 7 Oct 2016 17:54:55 +0000 (12:54 -0500)]
Travis: Fix deployment issue (2nd attempt)

8 years agoMerge branch 'master' into dev
DRC [Fri, 7 Oct 2016 10:47:35 +0000 (05:47 -0500)]
Merge branch 'master' into dev

8 years agoTravis: Fix deployment issue
DRC [Fri, 7 Oct 2016 10:34:11 +0000 (05:34 -0500)]
Travis: Fix deployment issue

"Skipping a deployment with the script provider because this branch is
not permitted"

8 years agoFix AppVeyor build on non-master branches
DRC [Fri, 7 Oct 2016 10:07:22 +0000 (05:07 -0500)]
Fix AppVeyor build on non-master branches

buildljt will clone the Git repository into the temp. directory, even if
the repository is really a local sandbox, so we need to specify the
branch.

8 years agoTravis: Use existing sandbox for official builds
DRC [Fri, 7 Oct 2016 10:07:11 +0000 (05:07 -0500)]
Travis: Use existing sandbox for official builds

This eliminates the need to specify the remote repository and branch,
and it prevents the code from being checked out twice.

8 years agoMerge branch 'master' into dev
DRC [Fri, 7 Oct 2016 09:51:29 +0000 (04:51 -0500)]
Merge branch 'master' into dev

8 years agoAdd AppVeyor config for Windows pre-release builds
DRC [Fri, 7 Oct 2016 09:28:02 +0000 (04:28 -0500)]
Add AppVeyor config for Windows pre-release builds

8 years agoTravis CI: Fixes to support AVX2 code
DRC [Wed, 5 Oct 2016 19:42:35 +0000 (14:42 -0500)]
Travis CI: Fixes to support AVX2 code

-- Use trusty for SIMD builds.  Ubuntu 12.04 is still using NASM 2.09.x,
   which isn't new enough to support AVX2.
-- Add a special test for the SSE2 code path, since it is no longer the
   default.

8 years agoMerge branch 'master' into dev
DRC [Wed, 5 Oct 2016 19:41:14 +0000 (14:41 -0500)]
Merge branch 'master' into dev

8 years agoTravis: use correct repo/branch for off. builds
DRC [Wed, 5 Oct 2016 19:36:46 +0000 (14:36 -0500)]
Travis: use correct repo/branch for off. builds

Pass the actual repository and branch that Travis is using into the
buildljt script, so the official builds it generates will come from
the same code base as the other tested builds.

8 years agoFix 'make dist'
DRC [Wed, 5 Oct 2016 18:36:35 +0000 (13:36 -0500)]
Fix 'make dist'

8 years agoTravis CI: Use correct key for this repository
DRC [Wed, 5 Oct 2016 17:38:59 +0000 (12:38 -0500)]
Travis CI: Use correct key for this repository

8 years agoAdd Travis CI config for Un*x pre-release builds
DRC [Sun, 2 Oct 2016 14:13:23 +0000 (09:13 -0500)]
Add Travis CI config for Un*x pre-release builds

8 years agoFix 32-bit non-SIMD FP regression tests
DRC [Tue, 4 Oct 2016 18:41:48 +0000 (13:41 -0500)]
Fix 32-bit non-SIMD FP regression tests

- Introduce a new FLOATTEST value ("387") on Un*x systems that will
  compare the floating point DCT/IDCT algorithms against the expected
  results from the C algorithms when built using 32-bit code and
  -mfpmath=387.
- Extend the Windows regression tests so that they work properly when
  building libjpeg-turbo with 32-bit code and without SIMD, using either
  Visual C++ (tested with 2008, 2010, 2015) or MinGW.

8 years agoFix broken build w/ Visual C++ < 2010
DRC [Tue, 4 Oct 2016 18:25:34 +0000 (13:25 -0500)]
Fix broken build w/ Visual C++ < 2010

Regression introduced by dfefba77520ded5c5fd4864e76352a5f3eb23e74
(Windows doesn't always have stdint.h.)

8 years agoFix broken MIPS build
DRC [Mon, 26 Sep 2016 22:59:14 +0000 (17:59 -0500)]
Fix broken MIPS build

Regression introduced by 9055fb408dcb585ce9392d395e16630d51002152

Fixes #104

8 years agoFix UBSan warning in arithmetic decoder
DRC [Thu, 22 Sep 2016 19:38:51 +0000 (14:38 -0500)]
Fix UBSan warning in arithmetic decoder

Very similar to the ones that were fixed in the Huffman decoders in
8e9cef2e6f5156c4b055a04a8f979b7291fc6b7a.  These are innocuous.

Refer to https://bugzilla.mozilla.org/show_bug.cgi?id=1304567.

8 years agoFix broken build with NDK platforms < android-21
DRC [Thu, 22 Sep 2016 19:19:29 +0000 (14:19 -0500)]
Fix broken build with NDK platforms < android-21

Regression introduced by a09ba29a55b9a43d346421210d94370065eeaf53

Fixes #103

8 years agoBump version to 1.5.2 to prepare for new commits
DRC [Thu, 22 Sep 2016 19:14:05 +0000 (14:14 -0500)]
Bump version to 1.5.2 to prepare for new commits

8 years agoREADME.md: Fix typo
Roberto Civille Rodrigues [Thu, 22 Sep 2016 03:42:25 +0000 (00:42 -0300)]
README.md: Fix typo

Introduced in 17de51835735e319ada5ca139a64227423946a8a

Closes #102

8 years agoMerge branch 'master' into dev
DRC [Tue, 20 Sep 2016 23:09:15 +0000 (18:09 -0500)]
Merge branch 'master' into dev

8 years agoARM64 NEON: Fix another ABI conformance issue 1.5.1
mayeut [Tue, 20 Sep 2016 19:06:24 +0000 (21:06 +0200)]
ARM64 NEON: Fix another ABI conformance issue

Based on
https://github.com/mayeut/libjpeg-turbo/commit/98a5a9dc899aa9265858a3cbe0a96289a31a1322
with wordsmithing by DRC.

In the AArch64 ABI, as in many others, it's forbidden to read/store data
below the stack pointer.  Some SIMD functions were doing just that
(stack pointer misuse) when trying to preserve callee-saved registers,
and this resulted in those registers being restored with incorrect
contents under certain circumstances.

This patch fixes that behavior, and callee-saved registers are now
stored above the stack pointer throughout the function call.  The patch
also removes register saving in places where it is unnecessary for this
ABI, or it makes use of unused scratch registers instead of callee-saved
registers.

Fixes #97.  Closes #101.

Refer also to https://bugzilla.redhat.com/show_bug.cgi?id=1368569

8 years agoBuild: Remove ARMv6 support from 'make iosdmg'
DRC [Tue, 20 Sep 2016 03:47:18 +0000 (22:47 -0500)]
Build: Remove ARMv6 support from 'make iosdmg'

The last iDevice to require ARMv6 was the iPhone 3G, which required iOS
4.2.1 or older.  Our binaries have always required iOS 4.3 or newer,
so I'm not sure if the ARMv6 fork of our binaries was ever useful to
begin with.  In any case, if it ever was useful, it no longer is.  Fat
binaries can still be generated with ARMv6 support by invoking
{build_directory}/pkgscripts/makemacpkg manually.

8 years agoFix out-of-bounds write in partial decomp. feature
DRC [Fri, 9 Sep 2016 02:49:02 +0000 (21:49 -0500)]
Fix out-of-bounds write in partial decomp. feature

Reported by Clang UBSan (refer to
https://bugzilla.mozilla.org/show_bug.cgi?id=1301252 for test image.)
This appears to be a legitimate bug introduced by
3ab68cf563f6edc2608c085f5c8b2d5d5c61157e.  Any component array, such
as first_MCU_col and last_MCU_col, should always be able to accommodate
MAX_COMPONENTS values.  The aforementioned test image had 8 components,
which was not enough to make the out-of-bounds write bust out of the
jpeg_decomp_master struct (and fortunately the memory after last_MCU_col
is an integer used as a boolean, so stomping on it will do nothing other
than change the decoder state.)  I crafted another special image that
has 10 components (the maximum allowable), but that was apparently not
enough to bust out of the allocated memory, either.  Thus, it is
posited that the security threat posed by this bug is either extremely
minimal or non-existent.
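
The sizing rule described above can be sketched as follows.  This is an
illustration only: struct demo_master and set_mcu_cols() are hypothetical
stand-ins, not the library's actual definitions.

```c
#define MAX_COMPONENTS  10  /* maximum number of components per image */

/* Hypothetical stand-in for a decoder state struct.  Every per-component
 * array is sized for MAX_COMPONENTS entries. */
struct demo_master {
  unsigned int first_MCU_col[MAX_COMPONENTS];
  unsigned int last_MCU_col[MAX_COMPONENTS];
};

/* Records the MCU column range for each component.  Returns 0 on success
 * or -1 if the component count does not fit the arrays. */
static int set_mcu_cols(struct demo_master *m, int num_components,
                        unsigned int first, unsigned int last)
{
  int ci;

  if (num_components < 1 || num_components > MAX_COMPONENTS)
    return -1;
  for (ci = 0; ci < num_components; ci++) {
    m->first_MCU_col[ci] = first;
    m->last_MCU_col[ci] = last;
  }
  return 0;
}
```

With arrays of this size, an 8-component or 10-component image stays
within bounds, and anything larger is rejected before any write occurs.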

8 years agoSilence additional UBSan warnings
DRC [Fri, 9 Sep 2016 02:29:58 +0000 (21:29 -0500)]
Silence additional UBSan warnings

NOTE: The jdhuff.c/jdphuff.c warnings should have already been silenced
by 8e9cef2e6f5156c4b055a04a8f979b7291fc6b7a, but apparently I need to
be REALLY clear that I'm trying to do pointer arithmetic rather than
dereference an array.  Grrr...

Refer to:
https://bugzilla.mozilla.org/show_bug.cgi?id=1301250
https://bugzilla.mozilla.org/show_bug.cgi?id=1301256

8 years agoFix unsigned int overflow in libjpeg memory mgr.
DRC [Wed, 7 Sep 2016 21:40:10 +0000 (16:40 -0500)]
Fix unsigned int overflow in libjpeg memory mgr.

When attempting to decode a malformed JPEG image (refer to
https://bugzilla.mozilla.org/show_bug.cgi?id=1295044) with dimensions
61472 x 32800, the maximum_space variable within the
realize_virt_arrays() function will exceed the maximum value of a 32-bit
integer and will wrap around.  The memory manager subsequently fails
with an "Insufficient memory" error (case 4, in alloc_large()), so this
commit simply causes that error to be triggered earlier, before UBSan
has a chance to complain.

Note that this issue did not ever represent an exploitable security
threat, because the POSIX-based memory manager that we use doesn't ever
do anything meaningful with the value of maximum_space.
jpeg_mem_available() simply sets avail_mem = maximum_space, so the
subsequent behavior of the memory manager is the same regardless of
whether maximum_space is correct or not.  This commit simply removes a
UBSan warning in order to make it easier to detect actual security
issues.
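
A minimal sketch of the "fail before wrapping" idea, assuming a 32-bit
accumulator.  The function name add_space_checked() is hypothetical and
this is not the memory manager's actual code.

```c
#include <stdint.h>

/* Adds `bytes` to `*total`, refusing to wrap around.  Returns 0 on
 * success and -1 (leaving *total unchanged) if the sum would not fit in
 * 32 bits. */
static int add_space_checked(uint32_t *total, uint32_t bytes)
{
  if (bytes > UINT32_MAX - *total)
    return -1;
  *total += bytes;
  return 0;
}
```

A caller would treat the -1 result the same way as an allocation
failure, so the error surfaces before any wrapped value is used.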

8 years agoTurboJPEG: Decomp. 4:2:2/4:4:0 JPEGs w/unusual SFs
DRC [Mon, 1 Aug 2016 16:22:24 +0000 (11:22 -0500)]
TurboJPEG: Decomp. 4:2:2/4:4:0 JPEGs w/unusual SFs

Normally, 4:2:2 JPEGs have horizontal x vertical luminance,chrominance
sampling factors of 2x1,1x1, and 4:4:0 JPEGs have horizontal x vertical
luminance,chrominance sampling factors of 1x2,1x1.  However, it is
technically legal to create 4:2:2 JPEGs with sampling factors of
2x2,1x2 and 4:4:0 JPEGs with sampling factors of 2x2,2x1, since the
sums of the products of those sampling factors (2x2 + 1x2 + 1x2 and
2x2 + 2x1 + 2x1) are still <= 10.  The libjpeg API correctly decodes
such images, so the TurboJPEG API should as well.

Fixes #92
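
The arithmetic above can be checked with a small sketch.  struct samp
and blocks_per_mcu() are hypothetical names used only for illustration.

```c
/* One component's horizontal and vertical sampling factors. */
struct samp { int h, v; };

/* Sum of h*v over all components.  The commit message above gives 10 as
 * the upper limit for this sum. */
static int blocks_per_mcu(const struct samp *comps, int ncomps)
{
  int i, sum = 0;

  for (i = 0; i < ncomps; i++)
    sum += comps[i].h * comps[i].v;
  return sum;
}
```

For the usual 4:2:2 layout (2x1,1x1,1x1) the sum is 4, and for both of
the unusual layouts (2x2,1x2,1x2 and 2x2,2x1,2x1) it is 8.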

8 years agoSilence pedantic GCC6 code formatting warnings
DRC [Thu, 14 Jul 2016 18:36:47 +0000 (13:36 -0500)]
Silence pedantic GCC6 code formatting warnings

Apparently it's "misleading" to put two self-contained if statements
on a single line.  Who knew?

8 years agoUse plain upsampling if merged isn't accelerated
DRC [Thu, 14 Jul 2016 01:39:11 +0000 (20:39 -0500)]
Use plain upsampling if merged isn't accelerated

Currently, this only affects ARM, since it is the only platform that
accelerates YCbCr-to-RGB conversion but not merged upsampling.  Even if
"plain" upsampling isn't accelerated, the combination of accelerated
color conversion + unaccelerated plain upsampling is still faster than
the unaccelerated merged upsampling algorithms.

Closes #81

8 years agoImplement h1v2 fancy upsampling
Kornel Lesiński [Sun, 10 Jul 2016 18:01:51 +0000 (19:01 +0100)]
Implement h1v2 fancy upsampling

This allows fancy upsampling to be used when decompressing 4:2:2 images
that have been losslessly rotated or transposed.

(docs and comments added by DRC)

Based on https://github.com/pornel/libjpeg-turbo/commit/f63aca945debde07e7c6476a1f667b71728c3d44

Closes #89

8 years agoAVX2: Perform additional checks for O/S support
DRC [Wed, 13 Jul 2016 21:03:36 +0000 (16:03 -0500)]
AVX2: Perform additional checks for O/S support

cpuid tells us whether the O/S uses extended state management via
XSAVE/XRSTOR, but we have to call xgetbv to verify that it is using
XSAVE/XRSTOR to manage the state of XMM/YMM registers.
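
The check sequence can be sketched in C roughly as follows.  This
assumes GCC or clang on x86 and uses <cpuid.h>; libjpeg-turbo performs
the equivalent test in assembly, and the function names here are
hypothetical.

```c
#include <stdint.h>

#if (defined(__i386__) || defined(__x86_64__)) && \
    (defined(__GNUC__) || defined(__clang__))
#include <cpuid.h>

/* Reads extended control register 0 (XCR0).  Only safe to call once
 * CPUID has reported OSXSAVE; otherwise the instruction faults. */
static uint64_t read_xcr0(void)
{
  uint32_t eax, edx;

  __asm__ volatile ("xgetbv" : "=a" (eax), "=d" (edx) : "c" (0));
  return ((uint64_t)edx << 32) | eax;
}

/* Returns 1 if both the CPU and the O/S allow AVX2 code to run. */
static int avx2_usable(void)
{
  unsigned int eax, ebx, ecx, edx;

  if (__get_cpuid_max(0, 0) < 7)
    return 0;
  __cpuid(1, eax, ebx, ecx, edx);
  if (!(ecx & (1u << 27)))          /* OSXSAVE: O/S uses XSAVE/XRSTOR */
    return 0;
  if ((read_xcr0() & 0x6) != 0x6)   /* XCR0 bits 1-2: XMM and YMM state */
    return 0;
  __cpuid_count(7, 0, eax, ebx, ecx, edx);
  return (ebx & (1u << 5)) != 0;    /* CPUID leaf 7, EBX bit 5: AVX2 */
}
#else
/* Not x86, or an unknown compiler: report AVX2 as unavailable. */
static int avx2_usable(void) { return 0; }
#endif
```

The ordering matters: xgetbv is executed only after the OSXSAVE bit has
been confirmed, so the sketch does not fault on an older O/S.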

8 years agoFix AArch64 ABI conformance issue in SIMD code
DRC [Wed, 13 Jul 2016 17:15:02 +0000 (12:15 -0500)]
Fix AArch64 ABI conformance issue in SIMD code

In the AArch64 ABI, the high (unused) DWORD of a 32-bit argument's
register is undefined, so it was incorrect to use 64-bit
instructions to transfer a JDIMENSION argument in the 64-bit NEON SIMD
functions.  The code worked thus far only because the existing compiler
optimizers weren't smart enough to do anything else with the register in
question, so the upper 32 bits happened to be all zeroes.

The latest builds of Clang/LLVM have a smarter optimizer, and under
certain circumstances, it will attempt to load-combine adjacent 32-bit
integers from one of the libjpeg structures into a single 64-bit integer
and pass that 64-bit integer as a 32-bit argument to one of the SIMD
functions (which is allowed by the ABI, since the upper 32 bits of the
32-bit argument's register are undefined.)  This caused the
libjpeg-turbo regression tests to crash.

This patch tries to use the Wn registers whenever possible.  Otherwise,
it uses a zero-extend instruction to avoid using the upper 32 bits of
the 64-bit registers, which are not guaranteed to be valid for 32-bit
arguments.

Based on https://github.com/sebpop/libjpeg-turbo/commit/1fbae13021eb98f6fffdfaf8678fcdb00b0b04d9

Closes #91.  Refer also to android-ndk/ndk#110 and
https://llvm.org/bugs/show_bug.cgi?id=28393
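
The effect of the fix can be modeled in C, treating a 64-bit integer as
the argument register.  low_word() is a hypothetical helper; the real
change is in the assembly code.

```c
#include <stdint.h>

/* Models receiving a 32-bit argument in a 64-bit register whose upper
 * half holds leftover data.  Only the low 32 bits are defined, so they
 * are zero-extended before use (the effect of reading the Wn view of
 * the register). */
static uint64_t low_word(uint64_t reg)
{
  return (uint64_t)(uint32_t)reg;
}
```

For example, a register holding 0xDEADBEEF00000280 yields 0x280, which
is the value the caller actually passed.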

8 years agoDon't install libturbojpeg.pc if TJPEG disabled
DRC [Wed, 13 Jul 2016 03:21:20 +0000 (22:21 -0500)]
Don't install libturbojpeg.pc if TJPEG disabled

8 years agoAVX2: Verify O/S support for AVX2 before enabling
DRC [Tue, 12 Jul 2016 01:21:46 +0000 (20:21 -0500)]
AVX2: Verify O/S support for AVX2 before enabling

This fixes crashes that would occur when attempting to use
libjpeg-turbo's AVX2 extensions on older O/S's (such as Windows XP or
RHEL 5.)  Even if the CPU supports AVX2, the O/S has to also support
saving/restoring YMM registers when switching contexts.

8 years agoReformat jsimdcpu[-64].asm to improve readability
DRC [Tue, 12 Jul 2016 00:42:37 +0000 (19:42 -0500)]
Reformat jsimdcpu[-64].asm to improve readability

8 years agoMerge branch 'master' into dev
DRC [Mon, 11 Jul 2016 18:11:25 +0000 (13:11 -0500)]
Merge branch 'master' into dev

8 years ago32-bit AVX2 implementation of integer quantization
DRC [Sat, 9 Jul 2016 02:28:48 +0000 (21:28 -0500)]
32-bit AVX2 implementation of integer quantization

8 years ago64-bit AVX2 implementation of integer quantization
DRC [Fri, 8 Jul 2016 18:56:30 +0000 (13:56 -0500)]
64-bit AVX2 implementation of integer quantization

8 years agoAVX2: Avoid expensive AVX-SSE transitions
DRC [Sat, 9 Jul 2016 01:10:24 +0000 (20:10 -0500)]
AVX2: Avoid expensive AVX-SSE transitions

Refer to
https://software.intel.com/sites/default/files/m/d/4/1/d/8/11MC12_Avoiding_2BAVX-SSE_2BTransition_2BPenalties_2Brh_2Bfinal.pdf
for more information.  This eliminates all AVX-SSE transitions detected
with the Intel SDE tool.

8 years ago64-bit AVX2: Fix bug in IS_ALIGNED_AVX() macro
DRC [Fri, 8 Jul 2016 18:00:13 +0000 (13:00 -0500)]
64-bit AVX2: Fix bug in IS_ALIGNED_AVX() macro

32 = 1 << 5, not 1 << 8
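
A sketch of an alignment macro of this kind.  Only the name
IS_ALIGNED_AVX comes from the commit title; IS_ALIGNED and
IS_ALIGNED_SSE are assumed helper names, and the definitions are
illustrative.

```c
#include <stdint.h>

/* True if ptr lies on a (1 << order)-byte boundary. */
#define IS_ALIGNED(ptr, order) \
  ((((uintptr_t)(ptr)) & (((uintptr_t)1 << (order)) - 1)) == 0)

#define IS_ALIGNED_SSE(ptr)  IS_ALIGNED(ptr, 4)  /* 16 bytes = 1 << 4 */
#define IS_ALIGNED_AVX(ptr)  IS_ALIGNED(ptr, 5)  /* 32 bytes = 1 << 5 */
```

With an order of 8, the macro would demand 256-byte alignment, so a
buffer that is correctly aligned to 32 bytes would usually be reported
as unaligned.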

8 years ago32-bit AVX2 impl. of h2v2 & h2v1 merged upsampling
DRC [Fri, 8 Jul 2016 03:04:25 +0000 (22:04 -0500)]
32-bit AVX2 impl. of h2v2 & h2v1 merged upsampling

8 years agoOptimize 64-bit AVX2 h2v2 fancy upsampler
DRC [Fri, 8 Jul 2016 01:36:15 +0000 (20:36 -0500)]
Optimize 64-bit AVX2 h2v2 fancy upsampler

Reduce register usage and eliminate unnecessary mov instructions

8 years ago32-bit AVX2 impl. of h2v2 & h2v1 upsampling
DRC [Fri, 8 Jul 2016 01:34:08 +0000 (20:34 -0500)]
32-bit AVX2 impl. of h2v2 & h2v1 upsampling

(Fancy & Plain)

8 years ago32-bit AVX2 impl. of YCC->RGB color conversion
DRC [Tue, 5 Jul 2016 22:20:20 +0000 (17:20 -0500)]
32-bit AVX2 impl. of YCC->RGB color conversion

8 years ago32-bit AVX2 impl. of h2v2 & h2v1 downsampling
DRC [Tue, 5 Jul 2016 20:50:50 +0000 (15:50 -0500)]
32-bit AVX2 impl. of h2v2 & h2v1 downsampling

8 years ago32-bit AVX2 impl. of RGB->YCC/RGB->Gray color conv
DRC [Tue, 5 Jul 2016 21:21:10 +0000 (16:21 -0500)]
32-bit AVX2 impl. of RGB->YCC/RGB->Gray color conv

8 years agoLinux/PPC: Only enable AltiVec if CPU supports it
DRC [Wed, 6 Jul 2016 16:58:28 +0000 (16:58 +0000)]
Linux/PPC: Only enable AltiVec if CPU supports it

This eliminates "illegal instruction" errors when running libjpeg-turbo
under Linux on PowerPC chips that lack AltiVec support (e.g. the old
7XX/G3 models but also the newer e5500 series.)

8 years agoARM/MIPS: Change the behavior of JSIMD_FORCE*
DRC [Thu, 7 Jul 2016 18:10:30 +0000 (13:10 -0500)]
ARM/MIPS: Change the behavior of JSIMD_FORCE*

The JSIMD_FORCE* environment variables previously meant "force the use
of this instruction set if it is available but others are available as
well", but that did nothing on ARM platforms, since there is only ever
one instruction set available.  Since the ARM and MIPS CPU feature
detection code is less than bulletproof, and since there is only one
SIMD instruction set (currently) supported on those platforms, it makes
sense for the JSIMD_FORCE* environment variables on those platforms to
actually force the use of the SIMD instruction set, thus bypassing the
CPU feature detection code.

This addresses a concern raised in #88 whereby parsing /proc/cpuinfo
didn't work within a QEMU environment.  This at least provides a
workaround, allowing users to force-enable or force-disable SIMD
instructions for ARM and MIPS builds of libjpeg-turbo.
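
The new override behavior can be sketched like this.  The flag
SIMD_NEON, the function apply_force() and its two parameters are
hypothetical; the parameters stand for the values of a force-enable and
a force-disable JSIMD_FORCE* variable as returned by getenv().

```c
#include <stddef.h>
#include <string.h>

#define SIMD_NEON  0x01  /* hypothetical capability flag */

/* Applies force overrides on top of the detected feature set.  Each
 * argument is the value of one environment variable (NULL when unset),
 * and the value "1" means that the override is active. */
static unsigned int apply_force(unsigned int detected,
                                const char *force_neon,
                                const char *force_none)
{
  unsigned int support = detected;

  if (force_neon != NULL && strcmp(force_neon, "1") == 0)
    support = SIMD_NEON;  /* bypass detection: force SIMD on */
  if (force_none != NULL && strcmp(force_none, "1") == 0)
    support = 0;          /* bypass detection: force SIMD off */
  return support;
}
```

The detected value is used only when neither override is set, which is
what makes the variables usable as a workaround when detection fails.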

8 years agoBump version to 1.5.1 to prepare for new commits
DRC [Wed, 6 Jul 2016 16:22:27 +0000 (16:22 +0000)]
Bump version to 1.5.1 to prepare for new commits

8 years agoLay the groundwork for 32-bit AVX2 SIMD support
DRC [Tue, 5 Jul 2016 21:19:26 +0000 (16:19 -0500)]
Lay the groundwork for 32-bit AVX2 SIMD support

8 years ago64-bit AVX2 SIMD: Restore instructive comments
DRC [Fri, 10 Jun 2016 21:03:49 +0000 (16:03 -0500)]
64-bit AVX2 SIMD: Restore instructive comments

This commit adds back instructive comments in the image-space
algorithms, similar to those in the SSE2 code.  These comments make it
easier to follow the flow of data through the algorithms.

8 years ago64-bit AVX2 impl. of h2v2 & h2v1 merged upsampling
DRC [Tue, 31 May 2016 20:19:53 +0000 (15:19 -0500)]
64-bit AVX2 impl. of h2v2 & h2v1 merged upsampling

8 years ago64-bit AVX2 impl. of h2v2 & h2v1 upsampling
DRC [Sun, 29 May 2016 13:09:27 +0000 (08:09 -0500)]
64-bit AVX2 impl. of h2v2 & h2v1 upsampling

(Fancy & Plain)

8 years ago64-bit AVX2 impl. of YCC->RGB color conversion
DRC [Sun, 29 May 2016 11:54:56 +0000 (06:54 -0500)]
64-bit AVX2 impl. of YCC->RGB color conversion

8 years ago64-bit AVX2 impl. of h2v2 & h2v1 downsampling
DRC [Sun, 29 May 2016 00:53:44 +0000 (19:53 -0500)]
64-bit AVX2 impl. of h2v2 & h2v1 downsampling

8 years ago64-bit AVX2 impl. of RGB->Gray color conversion
DRC [Sun, 29 May 2016 00:15:18 +0000 (19:15 -0500)]
64-bit AVX2 impl. of RGB->Gray color conversion

8 years ago64-bit AVX2 impl. of RGB->YCC color conversion
DRC [Sat, 28 May 2016 21:42:44 +0000 (16:42 -0500)]
64-bit AVX2 impl. of RGB->YCC color conversion

8 years agoLay the groundwork for 64-bit AVX2 SIMD support
DRC [Fri, 20 May 2016 15:45:32 +0000 (10:45 -0500)]
Lay the groundwork for 64-bit AVX2 SIMD support

8 years agox86-64 SIMD: Optimize argument collection
DRC [Sun, 29 May 2016 15:51:16 +0000 (10:51 -0500)]
x86-64 SIMD: Optimize argument collection

Expand collect_args/uncollect_args macros so that the number of
arguments can be specified.  This prevents unnecessary push and mov
instructions.

NOTE: On Windows, the push/pop of xmm6 and xmm7 had to be moved to the
other end of the macro to ensure that rsp is aligned on a 16-byte
boundary.

8 years agoBump version to 1.6 alpha1
DRC [Tue, 10 May 2016 01:28:17 +0000 (20:28 -0500)]
Bump version to 1.6 alpha1

(to prepare for new features)

8 years agoReformat SSE/SSE2 SIMD code to improve readability
DRC [Fri, 27 May 2016 21:58:23 +0000 (16:58 -0500)]
Reformat SSE/SSE2 SIMD code to improve readability

8 years ago1.5.0 1.5.0
DRC [Wed, 1 Jun 2016 03:53:17 +0000 (22:53 -0500)]
1.5.0

8 years agoBUILDING.md: More NASM/YASM clarifications
DRC [Wed, 1 Jun 2016 03:48:52 +0000 (22:48 -0500)]
BUILDING.md: More NASM/YASM clarifications

28d1a1300c6be7fc8614ed827eb56cd97cf84e76 introduced the line
"nasm.exe should be in your PATH".  This commit corrects an oversight in
8f1c0a681cd34e8e80ba7b06f356d6080a7172c9 /
e5091f2cf3b6ba747907012146df93df0d01ec85 whereby this line should have
been extended to include yasm.exe.

8 years agoFormat copyright headers more consistently
DRC [Tue, 24 May 2016 15:23:56 +0000 (10:23 -0500)]
Format copyright headers more consistently

The IJG convention is to format copyright notices as:

Copyright (C) YYYY, Owner.

We try to maintain this convention for any code that is part of the
libjpeg API library (with the exception of preserving the copyright
notices from Cendio's code verbatim, since those predate
libjpeg-turbo.)

Note that the phrase "All Rights Reserved" is no longer necessary, since
all Buenos Aires Convention signatories signed onto the Berne Convention
in 2000.  However, our convention is to retain this phrase for any files
that have a self-contained copyright header but to leave it off of any
files that refer to another file for conditions of distribution and use.
For instance, all of the non-SIMD files in the libjpeg API library refer
to README.ijg, and the copyright message in that file contains "All
Rights Reserved", so it is unnecessary to add it to the individual
files.

The TurboJPEG code retains my preferred formatting convention for
copyright notices, which is based on that of VirtualGL (where the
TurboJPEG API originated.)

8 years agoMerge branch '1.4.x'
DRC [Sat, 28 May 2016 23:19:45 +0000 (18:19 -0500)]
Merge branch '1.4.x'

8 years agoBUILDING.txt: Clarify NASM build requirements 1.4.x
DRC [Sat, 28 May 2016 23:08:22 +0000 (18:08 -0500)]
BUILDING.txt: Clarify NASM build requirements

The version requirements only apply to NASM (not YASM.)  Also, 2.11.09
was never actually released (the first release containing the OS X fix
is 2.12.)

8 years agoDon't allow opaque source/dest mgrs to be swapped
DRC [Wed, 11 May 2016 02:04:02 +0000 (21:04 -0500)]
Don't allow opaque source/dest mgrs to be swapped

Calling jpeg_stdio_dest() followed by jpeg_mem_dest(), or jpeg_mem_src()
followed by jpeg_stdio_src(), is dangerous, because the existing opaque
structure would not be big enough to accommodate the new source/dest
manager.  This issue was non-obvious to libjpeg-turbo consumers, since
it was only documented in code comments.  Furthermore, the issue could
also occur if the source/dest manager was allocated by the calling
program, but it was not allocated with enough space to accommodate the
opaque stdio or memory source/dest manager structs.  The safest thing to
do is to throw an error if one of these functions is called when there
is already a source/dest manager assigned to the object and it was
allocated elsewhere.

Closes #78, #79
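
One way to detect an already-attached manager of a different kind is to
compare a function pointer that only the expected manager installs.  The
sketch below uses heavily simplified, hypothetical structs rather than
the real libjpeg ones.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the libjpeg structs. */
struct dest_mgr { void (*init_destination)(void); };
struct compress { struct dest_mgr *dest; };

static void stdio_init(void) {}
static void mem_init(void) {}

static struct dest_mgr stdio_mgr = { stdio_init };
static struct dest_mgr mem_mgr = { mem_init };

/* Attaches mgr to c.  Returns 0 on success, or -1 if a manager of a
 * different kind is already attached. */
static int attach_dest(struct compress *c, struct dest_mgr *mgr)
{
  if (c->dest == NULL)
    c->dest = mgr;
  else if (c->dest->init_destination != mgr->init_destination)
    return -1;
  return 0;
}
```

Re-attaching the same kind of manager succeeds, while switching from the
stdio manager to the memory manager is rejected.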

8 years agoBuild: Add integer version macro to jconfig.h
DRC [Wed, 11 May 2016 00:36:34 +0000 (19:36 -0500)]
Build: Add integer version macro to jconfig.h

This makes it significantly easier to do conditional compilation based
on the libjpeg-turbo version.

Based on:
https://github.com/hasinoff/libjpeg-turbo/commit/e6d5b3e50b8b07488cb7b4d26ab2061685bc6875
https://github.com/hasinoff/libjpeg-turbo/commit/1394a89ba6f3cd8abb556c1b65bac4a5f09760d0

Closes #80
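
A sketch of the kind of conditional compilation this enables.  It
assumes that the macro is named LIBJPEG_TURBO_VERSION_NUMBER and that
1.5.0 is encoded as 1005000; the fallback definition exists only so that
the sketch compiles without the library headers, and have_feature() is a
hypothetical helper.

```c
/* Stand-in so that the sketch compiles on its own.  A real program
 * would get the macro from jconfig.h by including jpeglib.h. */
#ifndef LIBJPEG_TURBO_VERSION_NUMBER
#define LIBJPEG_TURBO_VERSION_NUMBER  1005000  /* assumed: 1.5.0 */
#endif

/* Returns 1 if the headers are new enough for a 1.5-only code path. */
static int have_feature(void)
{
#if LIBJPEG_TURBO_VERSION_NUMBER >= 1005000
  return 1;
#else
  return 0;
#endif
}
```

A single integer comparison replaces any parsing of a version string in
the preprocessor, which is what makes the check easy to write.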

8 years agoBuild: Don't allow jpeg-7+ emul. w/o arith coding
DRC [Tue, 10 May 2016 01:00:46 +0000 (20:00 -0500)]
Build: Don't allow jpeg-7+ emul. w/o arith coding

The jpeg-7/jpeg-8 APIs/ABIs require arithmetic coding, and the jpeg-8
API/ABI requires the memory source/destination manager, so this commit
causes the build system to ignore --with-arith-enc/--without-arith-enc
and --with-arith-dec/--without-arith-dec (and the equivalent CMake
variables-- WITH_ARITH_ENC and WITH_ARITH_DEC) when v7/v8 API/ABI
emulation is enabled.  Furthermore, the CMake build system now ignores
WITH_MEM_SRCDST whenever WITH_JPEG8 is specified (the autotools build
system already did that.)

8 years agoARMv7 SIMD: Fix clang compatibility (Part 2)
mattsarett [Tue, 3 May 2016 14:33:43 +0000 (10:33 -0400)]
ARMv7 SIMD: Fix clang compatibility (Part 2)

GCC does support UAL syntax (strbeq) if the ".syntax unified" directive
is supplied.  This directive is supported by all versions of GCC and
clang going back to 2003, so it should not create any backward
compatibility issues.

Based on https://github.com/mattsarett/libjpeg-turbo/commit/1264349e2fa6f098178c37abfa7b059ad8b405a2

Closes #76

8 years agoARMv7 SIMD: Fix clang compatibility
mattsarett [Mon, 2 May 2016 16:31:51 +0000 (12:31 -0400)]
ARMv7 SIMD: Fix clang compatibility

By design, clang only supports Unified Assembler Language (and not
pre-UAL syntax):
https://llvm.org/bugs/show_bug.cgi?id=23507
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0473c/BABJIHGJ.html

Thus, clang only supports the strbeq instruction and not streqb, but
unfortunately some versions of GCC only support streqb.  Go, go
Gadget #ifdef...

Based on https://github.com/mattsarett/libjpeg-turbo/commit/a82e63aac63f8fa395fa4caad4de6859623ee2e2

Closes #75

8 years agoMerge branch '1.4.x'
DRC [Sun, 1 May 2016 17:07:05 +0000 (12:07 -0500)]
Merge branch '1.4.x'

8 years agoFix CMake fallback BUILD var on non-U.S. machines
DRC [Sun, 1 May 2016 16:42:15 +0000 (11:42 -0500)]
Fix CMake fallback BUILD var on non-U.S. machines

If wmic.exe wasn't available, then CMakeLists.txt would call
"cmd /C date /T" and parse the result in order to set the BUILD
variable.  However, the parser assumed that the date was in MM/DD/YYYY
format, which is not generally the case unless the user's locale is U.S.
English with the default region/language settings for that locale.

This commit modifies CMakeLists.txt such that it uses the
string(TIMESTAMP) function available in CMake 2.8.11 and later to set
the BUILD variable, thus eliminating the need to use wmic.exe or any
other platform-specific hack.

This commit also modifies the build instructions to remove any reference
to CMake 2.6 (which hasn't been supported by our build system since
libjpeg-turbo 1.3.x.)

Closes #74