libx264
Anton Mitrofanov [Thu, 11 Feb 2021 18:10:31 +0000 (21:10 +0300)]
mp4: Add GPAC detection with pkg-config

Martin Storsjö [Mon, 12 Apr 2021 06:54:56 +0000 (09:54 +0300)]
aarch64: Fix the zigzag_interleave_8x8_cavlc_neon function

Use 'cmhs' (which does an unsigned greater or equal comparison)
instead of 'cmhi' (which does an unsigned greater comparison).

This makes sure that dct coeffs with a magnitude of 1 are recognized
in the output nnz buffer.
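
The difference between the two comparisons can be sketched in scalar C. This is only an illustration, not the NEON code from the commit: the helper names `flag_ge` and `flag_gt` are hypothetical, and comparing an unsigned magnitude against a threshold of 1 is an assumption made for the example.

```c
#include <stdint.h>

/* Hypothetical scalar stand-ins for the two NEON comparisons.
 * Both compare an unsigned magnitude against a threshold of 1
 * (an assumption made for this illustration). */

/* Unsigned "greater than or equal", the behavior of cmhs. */
static int flag_ge(uint16_t magnitude)
{
    return magnitude >= 1;
}

/* Unsigned "greater than", the behavior of cmhi. */
static int flag_gt(uint16_t magnitude)
{
    return magnitude > 1;
}
```

With a magnitude of 1, `flag_ge` reports the coefficient as non-zero while `flag_gt` misses it, which matches the failure the commit message describes.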

Henrik Gramner [Wed, 10 Feb 2021 14:40:32 +0000 (15:40 +0100)]
x86inc: Add stack probing on Windows

Large stack allocations on Windows need to use stack probing in order
to guarantee that all stack memory is committed before accessing it.
This is done by ensuring that the guard page(s) at the end of the
currently committed pages are touched prior to any pages beyond that.
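
The idea of touching pages in order can be sketched in C. This is a rough analogy only: the real probing happens in the assembly function prologue, while the hypothetical helper `probe_pages` below walks an ordinary buffer. The 4096-byte page size and all names are assumptions.

```c
#include <stddef.h>

/* Assumed page size for this sketch; real code would query the platform. */
#define PROBE_PAGE_SIZE 4096

/* Hypothetical helper: write one byte in every page of a buffer, starting
 * at the highest address and moving downwards, which is the direction in
 * which the stack grows on x86. Returns the number of writes performed. */
static size_t probe_pages(volatile unsigned char *buf, size_t size)
{
    size_t touched = 0;
    size_t offset = size;

    while (offset > 0) {
        /* Step down by one page, clamping at the start of the buffer. */
        offset = offset > PROBE_PAGE_SIZE ? offset - PROBE_PAGE_SIZE : 0;
        buf[offset] = 0;
        touched++;
    }
    return touched;
}
```

Because each write lands at most one page below the previous one, no page is skipped, which is the property a guard-page scheme depends on.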

Henrik Gramner [Thu, 11 Feb 2021 13:24:27 +0000 (14:24 +0100)]
Silence false positive -Wformat-truncation warning

Anton Mitrofanov [Mon, 8 Feb 2021 18:07:36 +0000 (21:07 +0300)]
Fix MB stats

Bug report by Zhengzhi Duan.

Anton Mitrofanov [Mon, 8 Feb 2021 21:25:32 +0000 (00:25 +0300)]
CI: Update macos URL to vlc-contrib

Anton Mitrofanov [Mon, 1 Feb 2021 19:32:37 +0000 (22:32 +0300)]
Fix VBV overflow check for B-frames

Anton Mitrofanov [Wed, 27 Jan 2021 14:14:55 +0000 (17:14 +0300)]
x86inc: Fix LOAD_MM_PERMUTATION for AVX-512

Anton Mitrofanov [Thu, 21 Jan 2021 20:26:27 +0000 (23:26 +0300)]
Fix PADH alignment

Make pointers to padded buffers aligned both before and after padding.
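
One way to keep a pointer aligned on both sides of the padding is to round the padding width up to a multiple of the alignment. The sketch below shows that arithmetic only. It is not the x264 macro, and the helper names `round_up` and `is_aligned` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Round n up to the next multiple of align. align must be a power of two. */
static size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) & ~(align - 1);
}

/* Return 1 if the address p is a multiple of align (a power of two). */
static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}
```

If `base` is aligned to `align`, then `base + round_up(pad, align)` is aligned as well, whereas `base + pad` is only aligned when `pad` already happens to be a multiple of `align`.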

Anton Mitrofanov [Tue, 26 Jan 2021 17:43:34 +0000 (20:43 +0300)]
Fix alignment of chroma buffer for weightp

In 10-bit mode pixel_asd8 expects 16-byte alignment for pix1 and pix2.

Henrik Gramner [Tue, 26 Jan 2021 01:21:16 +0000 (02:21 +0100)]
Makefile: Drop the -T argument to install

It's not required, and BSD doesn't support it.

Anton Mitrofanov [Thu, 21 Jan 2021 13:41:42 +0000 (16:41 +0300)]
Fix use of nalu_process callback

Broke after the unification of the 8-bit and 10-bit libraries.

Anton Mitrofanov [Tue, 19 Jan 2021 21:20:18 +0000 (00:20 +0300)]
Fix weighting for B-frames

This bug never occurs with the current reference management logic.
Bug report by Lingjiang Fang.

Anton Mitrofanov [Fri, 22 Jan 2021 00:51:23 +0000 (03:51 +0300)]
Fix CAVLC encoding

This bug mainly occurred when encoding at a high bitrate (low QP).
It did not occur when encoding in baseline or main profile.

Anton Mitrofanov [Sun, 24 Jan 2021 13:28:24 +0000 (16:28 +0300)]
Bump dates to 2021

Konstantin Pavlov [Tue, 24 Nov 2020 17:53:11 +0000 (20:53 +0300)]
CI: Move macos to catalina builders

Anton Mitrofanov [Fri, 8 Jan 2021 15:51:45 +0000 (18:51 +0300)]
CI: Update URLs to the latest vlc-contrib

Martin Storsjö [Mon, 26 Oct 2020 08:01:33 +0000 (10:01 +0200)]
gitlab-ci: Add build-only configurations with llvm-mingw targeting armv7/aarch64

Henrik Gramner [Sun, 29 Nov 2020 13:38:07 +0000 (14:38 +0100)]
configure: Fix endianness test when LTO is enabled through CFLAGS

Martin Storsjö [Mon, 26 Oct 2020 08:36:36 +0000 (10:36 +0200)]
gitlab-ci: Remove the unused _PATH variable

It became unused in cde9a93319bea766a92e306d69059c76de970190.

Janne Grunau [Thu, 1 Oct 2020 21:08:37 +0000 (21:08 +0000)]
aarch64/asm: optimize cabac asm

0.5% - 2% overall speedup on
`./x264 --threads X --profile high --preset veryfast --crf 15 -o /dev/null park_joy_420_720p50.y4m`
cabac is responsible for roughly 1/6 of the CPU use.
Branch mispredictions are reduced by 15% to 20%.

cortex-a53: 0.5% faster
cortex-a72: 2% faster
neoverse-n1: 0.9% faster

Janne Grunau [Thu, 1 Oct 2020 20:15:18 +0000 (22:15 +0200)]
aarch64/asm: support offsets in movrel macro

Imported from dav1d.

Janne Grunau [Thu, 1 Oct 2020 23:49:53 +0000 (01:49 +0200)]
aarch64/asm: optimize cabac_encode_terminal with extrinsic knowledge

Approach taken from x86 asm. The overall speedup is negligible.
cabac_encode_terminal is on average twice as fast on cortex-a53 when
encoding with the following command:
./x264 --threads 1 --profile high --preset veryfast --crf 15 -o /dev/null park_joy_420_720p50.y4m

The relative speedup is smaller on cortex-a72/73.

Martin Storsjö [Mon, 26 Oct 2020 07:42:20 +0000 (09:42 +0200)]
Add a missing include of stdlib.h

Since 7ab4c928ef4511ea5753a36a57c3506d9fd5086b, osdep.h contains
calls to malloc/free.

This fixes building with MSVC targeting WinRT.

Damiano Galassi [Thu, 23 Jul 2020 15:23:09 +0000 (17:23 +0200)]
configure: Add Apple Silicon support

Anton Mitrofanov [Thu, 8 Oct 2020 18:16:53 +0000 (21:16 +0300)]
x86: Remove workaround for nasm on macho64

Anton Mitrofanov [Sat, 19 Sep 2020 10:30:28 +0000 (13:30 +0300)]
x86: Fix exhaustive search ME asm

Anton Mitrofanov [Tue, 8 Sep 2020 13:36:24 +0000 (16:36 +0300)]
x86: Fix memory operands for inline asm

Anton Mitrofanov [Fri, 4 Sep 2020 15:00:45 +0000 (18:00 +0300)]
x86: Fix clobbers for inline asm

Henrik Gramner [Sat, 12 Sep 2020 17:24:00 +0000 (19:24 +0200)]
Add support for long filenames on Windows 10

Henrik Gramner [Sat, 12 Sep 2020 17:23:57 +0000 (19:23 +0200)]
mp4: Remove GPAC Windows Unicode compatibility shim

GPAC has native UTF-8 support nowadays.

Also move the compatibility code to input/avs.c since that's the only
remaining code that uses it now.

Henrik Gramner [Sat, 12 Sep 2020 17:23:55 +0000 (19:23 +0200)]
mp4: Fix compiling with recent GPAC versions

Anton Mitrofanov [Sun, 12 Jul 2020 15:53:32 +0000 (18:53 +0300)]
Rename function x264_strdup to x264_param_strdup

Anton Mitrofanov [Sun, 12 Jul 2020 15:12:53 +0000 (18:12 +0300)]
cli: Add info about gpac/lsmash into version info

Anton Mitrofanov [Tue, 14 Jul 2020 13:35:11 +0000 (15:35 +0200)]
configure: Add options for bash-completion install

Anton Mitrofanov [Thu, 2 Jul 2020 20:23:50 +0000 (22:23 +0200)]
Add new API function: x264_param_cleanup

Should be called to free struct members allocated internally by libx264,
e.g. by x264_param_parse.
Partially based on videolan/x264!18 by Derek Buitenhuis.

A. David [Thu, 2 Jul 2020 17:45:50 +0000 (19:45 +0200)]
mp4: Update GPAC support to v0.8.0 or later

Henrik Gramner [Wed, 10 Jun 2020 17:18:08 +0000 (19:18 +0200)]
cli: Install bash autocomplete during 'make install'

Henrik Gramner [Wed, 10 Jun 2020 01:21:24 +0000 (03:21 +0200)]
lavf: Update to the new API for iterating demuxers

Anton Mitrofanov [Tue, 30 Jun 2020 19:28:05 +0000 (22:28 +0300)]
CI: Add lsmash support + Change ffmpeg source

Henrik Gramner [Thu, 2 Jul 2020 01:00:32 +0000 (03:00 +0200)]
Fix compilation with nasm 2.15

Anton Mitrofanov [Sun, 26 Apr 2020 00:19:00 +0000 (03:19 +0300)]
Remove code for non-positive f_ip_factor/f_pb_factor

Currently they are guaranteed to be positive.

JHammler [Mon, 15 Jun 2020 19:57:16 +0000 (21:57 +0200)]
configure: Fix building under the MSYS shell

Sergei Trofimovich [Fri, 5 Jun 2020 17:34:02 +0000 (19:34 +0200)]
configure: allow 'strings' override via STRINGS variable

This allows building x264 on systems where 'strings' or
'${HOST}-strings' does not exist, but llvm-strings exists.

Henrik Gramner [Tue, 9 Jun 2020 19:04:58 +0000 (21:04 +0200)]
x86inc: Fix warnings when using nasm 2.15

Anton Mitrofanov [Sun, 24 May 2020 14:15:35 +0000 (17:15 +0300)]
checkasm: increase float error margin to 1e-5

checkasm10 with seed=511142008 failed on win32 gcc builds.

Anton Mitrofanov [Sun, 24 May 2020 13:35:00 +0000 (16:35 +0300)]
Fix data race

Closes videolan/x264#16.
Bug report by Zu-Ming Jiang.

Anton Mitrofanov [Sat, 25 Apr 2020 23:56:25 +0000 (02:56 +0300)]
Fix undefined behavior: index out of bounds (one more)

last_non_b_pict_type is initialized to -1.
Bug report by Vitaly Buka.
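
A value of -1 used directly as an array index reads before the start of the array. The sketch below shows one generic way to guard such a lookup. It is not the actual fix: `lookup_cost`, `TYPE_COUNT` and the fallback parameter are hypothetical names, and treating -1 as "no value yet" is an assumption.

```c
/* Assumed number of valid picture types for this sketch. */
enum { TYPE_COUNT = 3 };

/* Hypothetical guarded lookup: return the table entry for pict_type, or
 * the fallback when pict_type is outside the table, for example while it
 * still holds its initial value of -1. */
static int lookup_cost(const int table[TYPE_COUNT], int pict_type, int fallback)
{
    if (pict_type < 0 || pict_type >= TYPE_COUNT)
        return fallback;
    return table[pict_type];
}
```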

Anton Mitrofanov [Sat, 25 Apr 2020 21:50:12 +0000 (00:50 +0300)]
Remove use of non-breaking spaces

Anton Mitrofanov [Sat, 25 Apr 2020 21:20:07 +0000 (00:20 +0300)]
Fix file encoding from Windows-1252 to UTF-8

Anton Mitrofanov [Sun, 5 Apr 2020 19:29:32 +0000 (22:29 +0300)]
Fix warning: comparison of integers of different signs [-Wsign-compare]

Anton Mitrofanov [Sun, 5 Apr 2020 18:55:15 +0000 (21:55 +0300)]
Fix error "invalid size of malloc" for 10-bit encodes at i686

Anton Mitrofanov [Sun, 1 Mar 2020 11:44:09 +0000 (14:44 +0300)]
Fix undefined behavior: shift exponent is negative

Anton Mitrofanov [Sun, 1 Mar 2020 11:42:50 +0000 (14:42 +0300)]
Fix undefined behavior: access within misaligned address

Anton Mitrofanov [Sun, 1 Mar 2020 11:17:02 +0000 (14:17 +0300)]
Fix undefined behavior: applying [non-]zero offset to null pointer

Anton Mitrofanov [Sun, 1 Mar 2020 11:00:34 +0000 (14:00 +0300)]
Fix undefined behavior: index out of bounds

Anton Mitrofanov [Sun, 1 Mar 2020 10:38:46 +0000 (13:38 +0300)]
Fix undefined behavior: division by zero

Anton Mitrofanov [Wed, 8 Apr 2020 22:11:23 +0000 (01:11 +0300)]
CI: Fix vlc-contrib URL for windows targets

Anton Mitrofanov [Sat, 29 Feb 2020 19:02:01 +0000 (22:02 +0300)]
Bump dates to 2020

Anton Mitrofanov [Mon, 25 Nov 2019 14:38:57 +0000 (17:38 +0300)]
Check support for force_align_arg_pointer attribute

Closes videolan/x264#9.

Anton Mitrofanov [Mon, 25 Nov 2019 11:58:43 +0000 (14:58 +0300)]
Fix float division by zero when encoding CRF+VBV

Bug report by Sam Panzer.

Anton Mitrofanov [Fri, 15 Nov 2019 00:04:16 +0000 (03:04 +0300)]
Limit maximum supported resolution

Also adds checks for other resolution-dependent buffers.
Closes videolan/x264#10.

Janne Grunau [Fri, 1 Nov 2019 09:00:11 +0000 (10:00 +0100)]
aarch64: Use HAVE_NEON define during CPU detection

Anton Mitrofanov [Thu, 31 Oct 2019 23:45:39 +0000 (02:45 +0300)]
aarch64: Fix compilation with disabled asm

Anton Mitrofanov [Thu, 31 Oct 2019 21:10:22 +0000 (00:10 +0300)]
Export symbols only when building shared library

Anton Mitrofanov [Thu, 31 Oct 2019 20:22:28 +0000 (23:22 +0300)]
Fix compilation of fprofiled shared build

Anton Mitrofanov [Wed, 8 May 2019 16:19:11 +0000 (19:19 +0300)]
Remove CRT objects use between DLL boundaries (origin/HEAD, origin/master)

Fix crash of MSVC builds compiled with --system-libx264 and /MT (default) CRT.

Anton Mitrofanov [Mon, 22 Apr 2019 19:18:01 +0000 (22:18 +0300)]
Fix MSVS build with ./configure --enable-shared --system-libx264

Anton Mitrofanov [Fri, 29 Mar 2019 14:53:14 +0000 (17:53 +0300)]
Mark explicitly DSO public API symbols and hide all other by -fvisibility=hidden

Removes need for -Bsymbolic during linking.
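
The general pattern of marking public symbols explicitly while hiding everything else can be sketched as follows. `MYLIB_API` and the `mylib_*` functions are hypothetical names for illustration, not the identifiers used in the x264 sources, and the sketch assumes a GCC-compatible compiler invoked with -fvisibility=hidden.

```c
/* Hypothetical export macro: expands to the "default" visibility
 * attribute on GCC-compatible compilers and to nothing elsewhere. */
#if defined(__GNUC__)
#  define MYLIB_API __attribute__((visibility("default")))
#else
#  define MYLIB_API
#endif

/* Public entry point: stays exported even with -fvisibility=hidden. */
MYLIB_API int mylib_version(void);

/* Internal helper: not marked, so -fvisibility=hidden keeps it out of
 * the shared library's exported symbols. */
int mylib_internal_helper(int x)
{
    return x * 2;
}

int mylib_version(void)
{
    return mylib_internal_helper(21);
}
```

Calls from `mylib_version` to the hidden helper bind inside the library and cannot be interposed from outside, which is why a linker option such as -Bsymbolic is no longer needed for them.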

Henrik Gramner [Sat, 30 Mar 2019 16:47:25 +0000 (17:47 +0100)]
x86: Perform stack realignment in C instead of assembly

Simplifies a lot of code and avoids having to export public asm functions.

Note that the force_align_arg_pointer function attribute is broken in clang
versions prior to 6.0.1 which may result in crashes, so make sure to either
use a newer clang version or a different compiler.

Anton Mitrofanov [Fri, 12 Jul 2019 12:23:29 +0000 (15:23 +0300)]
Strip git-hash from version in x264.pc (origin/stable)

pkg-config doesn't like spaces in version string.

Anton Mitrofanov [Mon, 8 Jul 2019 12:46:56 +0000 (15:46 +0300)]
Revert r2959: Signal Progressive and Constrained profiles

Some hardware decoders refuse to decode streams with non-zero
constraint_set4_flag/constraint_set5_flag.

Anton Mitrofanov [Fri, 14 Jun 2019 16:57:36 +0000 (19:57 +0300)]
Fix x264_picture_alloc with X264_CSP_I400 colorspace

Anton Mitrofanov [Wed, 8 May 2019 14:52:15 +0000 (17:52 +0300)]
Shut up UBSan about uninitialized data read

Result was never used in that case.

Anton Mitrofanov [Mon, 22 Apr 2019 18:41:43 +0000 (21:41 +0300)]
Fix integer overflow detected by UBSan in --weightp analysis

Bug report by Xuezhi Yan.

Anton Mitrofanov [Fri, 12 Apr 2019 12:40:01 +0000 (15:40 +0300)]
checkasm: Fix heap-buffer-overflow read detected by ASan

Anton Mitrofanov [Fri, 12 Apr 2019 12:38:08 +0000 (15:38 +0300)]
Fix heap-buffer-overflow read detected by ASan with interlaced encoding

Bug report by Hongxu Chen.

Konstantin Pavlov [Tue, 16 Jul 2019 19:38:32 +0000 (22:38 +0300)]
CI: Bump macos target to darwin18

Konstantin Pavlov [Tue, 16 Jul 2019 19:24:46 +0000 (22:24 +0300)]
CI: Use a newer aarch64 image

It now includes pkg-config, so lavf can be detected.

Konstantin Pavlov [Fri, 5 Apr 2019 12:08:29 +0000 (15:08 +0300)]
Added gitlab CI

Supported targets:
 - debian amd64
 - debian aarch64
 - windows 32 bit
 - windows 64 bit
 - macos 64 bit

The tests are run on all supported targets (via wine on windows).

The release jobs are only available on master/stable branches in the
videolan/x264 repository, and must be run manually when a developer
wishes to upload the artifacts.

Henrik Gramner [Thu, 14 Mar 2019 13:31:22 +0000 (14:31 +0100)]
Fix warning in autocomplete.c when compiled with lavf

Anton Mitrofanov [Mon, 5 Jun 2017 23:30:41 +0000 (02:30 +0300)]
Remove compatibility workarounds

This will break decoding with older versions of FFmpeg/Libav.

Anton Mitrofanov [Fri, 9 Nov 2018 15:37:17 +0000 (18:37 +0300)]
Remove h->rc dereferencing where possible

Henrik Gramner [Sat, 16 Feb 2019 20:02:01 +0000 (21:02 +0100)]
x86inc: Add support for GFNI instructions

Henrik Gramner [Sat, 16 Feb 2019 16:57:21 +0000 (17:57 +0100)]
x86inc: Improve warnings for use of unsupported instructions

Warn when the following are used without the appropriate cpuflag:
 * YMM and ZMM registers
 * 'pextrw' with a memory operand
 * GPR instruction set extensions

Henrik Gramner [Thu, 31 Jan 2019 19:42:32 +0000 (20:42 +0100)]
x86inc: Support N_PEXT bit on Mach-O

Allows for marking symbols as having limited global scope, similar to
using 'hidden' symbol visibility on ELF.

Henrik Gramner [Thu, 31 Jan 2019 19:21:43 +0000 (20:21 +0100)]
x86inc: Make 'non-adjacent' default in the TAIL_CALL macro

Henrik Gramner [Thu, 31 Jan 2019 19:17:56 +0000 (20:17 +0100)]
x86inc: Add x86-32 PIC support macros

Henrik Gramner [Thu, 31 Jan 2019 19:11:01 +0000 (20:11 +0100)]
x86inc: Turn 'movsxd' into 'movifnidn' on x86-32

Henrik Gramner [Thu, 31 Jan 2019 19:08:40 +0000 (20:08 +0100)]
Bump dates to 2019

Henrik Gramner [Sun, 1 Jul 2018 18:34:48 +0000 (20:34 +0200)]
cli: Bash autocomplete support

Allows for automatic command line completion for both options and values.

Options such as --input-csp and --input-fmt will dynamically retrieve
supported values from libavformat when compiled with lavf support.

Execute 'source tools/bash-autocomplete.sh' in bash to enable.

Yusuke Nakamura [Mon, 9 Apr 2018 02:01:28 +0000 (11:01 +0900)]
Signal Progressive and Constrained profiles

Progressive High, Constrained High, and Progressive High 10.

Even in Main profile, constraint_set4_flag is now set to 1 if progressive,
and constraint_set5_flag is set to 1 if no B-slices are present.
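
The rule described above can be sketched as a small helper. This is an illustration rather than the code from the commit: `constraint_flags` and the two bit macros are hypothetical names. The bit positions assume the layout of the SPS byte that follows profile_idc, where constraint_set0_flag is the most significant bit and two reserved zero bits come last.

```c
#include <stdint.h>

/* Assumed bit positions within the constraint flags byte of the SPS. */
#define CONSTRAINT_SET4_BIT 0x08
#define CONSTRAINT_SET5_BIT 0x04

/* Hypothetical helper: set constraint_set4_flag for progressive streams
 * and constraint_set5_flag for streams without B-slices. */
static uint8_t constraint_flags(int progressive, int has_b_slices)
{
    uint8_t flags = 0;

    if (progressive)
        flags |= CONSTRAINT_SET4_BIT;
    if (!has_b_slices)
        flags |= CONSTRAINT_SET5_BIT;
    return flags;
}
```

A progressive stream without B-slices gets both bits set, and an interlaced stream with B-slices gets neither.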

Alexandra Hájková [Sat, 8 Sep 2018 07:15:53 +0000 (07:15 +0000)]
ppc: Use xxpermdi in sad_x3/x4 and use macros to avoid redundant code

Luca Barbato [Thu, 6 Sep 2018 10:25:14 +0000 (12:25 +0200)]
ppc: Use the vec_xst_len for partial stores in mc

About a 1% speedup to the overall encoding for --slow.

Luca Barbato [Thu, 6 Sep 2018 10:25:13 +0000 (12:25 +0200)]
ppc: Use vec_splats in mc

No overall speedup, just tidier code.

Luca Barbato [Thu, 23 Aug 2018 08:30:37 +0000 (08:30 +0000)]
ppc: Use the vec_xst_len for partial stores

Seems to give about a 1-2% overall speedup on --slow.

Luca Barbato [Sun, 19 Aug 2018 15:27:55 +0000 (17:27 +0200)]
ppc: Use xxpermdi in VEC_STORE8

About a 2% speedup to the overall encoding for --slow.

Luca Barbato [Sun, 19 Aug 2018 15:27:54 +0000 (17:27 +0200)]
ppc: Use a single store to write the scores for sad_x4_8x8

Yet another use of xxpermdi, another 10% gain.

Luca Barbato [Sun, 19 Aug 2018 15:27:53 +0000 (17:27 +0200)]
ppc: Use xxpermdi to halve the computation in sad_x4_8x8

About 20% faster.

Luca Barbato [Sun, 19 Aug 2018 07:28:42 +0000 (09:28 +0200)]
ppc: Rework satd_4* likewise

Now 4x4 is as slow as C and 4x8 is 2% faster than before.