granicus.if.org Git - libx264/log

Martin Storsjö [Tue, 25 Aug 2015 11:38:10 +0000 (14:38 +0300)]

checkasm: Check the right output range for integral_initXh

These functions write their output into sum+stride, while we previously
only checked [0..stride-8] within the sum array.

This catches the previously broken aarch64 version of these functions.

Also check up until stride-4 elements for init4h.
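
To illustrate why the checked range matters, here is a minimal sketch; it is not x264's checkasm code. It assumes a row-sum routine loosely modelled on the C `integral_init4h`, called with a pointer to the row being written (`sum + stride`), plus a hypothetical helper `output_rows_match` that compares the row the function actually writes.

```c
#include <stdint.h>
#include <string.h>

/* Horizontal 4-pixel running sums (assumed shape of the C reference):
 * `sum` points at the row being written; the row above it is read. */
static void integral_init4h_ref(uint16_t *sum, const uint8_t *pix, intptr_t stride)
{
    int v = pix[0] + pix[1] + pix[2] + pix[3];
    for (intptr_t x = 0; x < stride - 4; x++) {
        sum[x] = (uint16_t)(v + sum[x - stride]);
        v += pix[x + 4] - pix[x];
    }
}

/* Hypothetical check: compare the output row (buffer + stride) over the
 * stride-4 elements that init4h produces, not the untouched first row. */
static int output_rows_match(const uint16_t *a, const uint16_t *b, intptr_t stride)
{
    return memcmp(a + stride, b + stride,
                  (size_t)(stride - 4) * sizeof(uint16_t)) == 0;
}
```

Comparing only the first row of the two buffers would succeed even if one implementation wrote nothing at all, which is how a broken version can slip through.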

Janne Grunau [Thu, 20 Aug 2015 11:55:54 +0000 (13:55 +0200)]

aarch64: Skip deblocking in x264_deblock_h_chroma_422_neon

If the parameters (alpha, beta, tc0[]) indicated that deblocking should
be skipped, every second chroma line was deblocked anyway.

deblock_h_chroma_422_neon: 2259 (before)
deblock_h_chroma_422_neon: 2192 (after)

Janne Grunau [Mon, 17 Aug 2015 14:39:20 +0000 (16:39 +0200)]

aarch64: Optimize various intra_predict asm functions

Make them at least as fast as the compiled C version (tested on
cortex-a53 vs. gcc 4.9.2).

                        C     NEON (before)   NEON (after)
intra_predict_4x4_dc:   260   335             260
intra_predict_4x4_dct:  210   265             200
intra_predict_8x8c_dc:  497   548             493
intra_predict_8x8c_v:   232   309             179 (arm64)
intra_predict_8x16c_dc: 795   830             790

Janne Grunau [Tue, 18 Aug 2015 08:25:10 +0000 (10:25 +0200)]

aarch64: Faster intra_predict_4x4_h

Use multiplication with 0x01010101 for splats.

On a cortex-a53:
                     gcc 4.9.2  llvm 3.6  neon (before)  neon (after)
intra_predict_4x4_h: 162        147       160/155        139/135
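
The splat trick can be sketched in plain C. This is only an illustration of the arithmetic, not the NEON code from the commit; the names `splat_u8` and `predict_4x4_h_sketch` are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Replicate one byte into all four bytes of a 32-bit word.
 * b * 0x01010101 == b + (b << 8) + (b << 16) + (b << 24); no carries
 * occur because each partial product lands in its own byte. */
static uint32_t splat_u8(uint8_t b)
{
    return (uint32_t)b * 0x01010101u;
}

/* Horizontal prediction for a 4x4 block: every row is filled with the
 * pixel immediately to its left (src[-1] within that row). */
static void predict_4x4_h_sketch(uint8_t *src, int stride)
{
    for (int y = 0; y < 4; y++) {
        uint32_t row = splat_u8(src[y * stride - 1]);
        memcpy(src + y * stride, &row, sizeof(row));
    }
}
```

Since all four bytes of the word are identical, the result does not depend on endianness.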

Janne Grunau [Tue, 18 Aug 2015 08:25:09 +0000 (10:25 +0200)]

aarch64: Fix coeff_level_run* macros with LLVM's assembler

LLVM's integrated assembler does not treat symbols as integer constants.

Janne Grunau [Tue, 18 Aug 2015 08:25:08 +0000 (10:25 +0200)]

aarch64: Remove commas LLVM's assembler complains about

Martin Storsjö [Thu, 13 Aug 2015 20:59:31 +0000 (23:59 +0300)]

arm: Implement x264_sub8x16_dct_dc_neon

checkasm timing      Cortex-A7    A8    A9
sub8x16_dct_dc_c          6386  3901  4080
sub8x16_dct_dc_neon       1491   698   917

Martin Storsjö [Thu, 13 Aug 2015 20:59:28 +0000 (23:59 +0300)]

arm: Optimize x264_deblock_h_chroma_neon

Shuffle both chroma components together as a 16 bit unit, and
don't write the unchanged columns (like in x264_deblock_h_luma_neon
and in the aarch64 version of the function).

This causes a minor slowdown for x264_deblock_v_chroma_neon, but
it is negligible compared to the speedup.

checkasm timing      Cortex-A7    A8    A9
deblock_chroma[1]_c         4817  4057  3601
deblock_chroma[1]_neon      1249  716   817   (before)
deblock_chroma[1]_neon      1249  766   845   (after)

deblock_h_chroma_420_c      3699  3275  2830
deblock_h_chroma_420_neon   2068  1414  1400  (before)
deblock_h_chroma_420_neon   1838  1355  1291  (after)

Martin Storsjö [Thu, 13 Aug 2015 20:59:27 +0000 (23:59 +0300)]

aarch64: Remove leftover commented out code

Martin Storsjö [Thu, 13 Aug 2015 20:59:26 +0000 (23:59 +0300)]

aarch64: Simplify the decimate_score functions

After doing a left shift by the number of bits returned by clz,
only bits set to zero can be shifted out, so if the register
was nonzero to start with (which is checked), it can't become
zero here.
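
The argument can be checked with a small C sketch. It assumes a GCC/Clang-style compiler for `__builtin_clzll`; the function name `normalize_nonzero` is hypothetical and is not part of x264.

```c
#include <stdint.h>

/* Shift a nonzero value left by its number of leading zero bits.
 * Only zero bits are shifted out, so the most significant set bit ends
 * up in bit 63 and the result can never be zero. */
static uint64_t normalize_nonzero(uint64_t v)
{
    /* __builtin_clzll is undefined for 0, so the caller must check first. */
    int lz = __builtin_clzll(v);
    return v << lz;
}
```

Because the top bit of the result is always set, a second zero test after the shift is redundant once the input has been checked.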

Martin Storsjö [Thu, 13 Aug 2015 20:59:25 +0000 (23:59 +0300)]

arm: Use aligned loads in x264_coeff_last15_neon

After subtracting 2, the pointer will be aligned.

checkasm timing      Cortex-A7    A8    A9
coeff_last15_c              423   375   230
coeff_last15_neon           350   420   404  (before)
coeff_last15_neon           350   400   394  (after)

Martin Storsjö [Thu, 13 Aug 2015 20:59:24 +0000 (23:59 +0300)]

arm: Simplify x264_predict_8x8c_p_neon

This gets rid of a few unnecessary (and confusing) steps in
calculating the increment to i00.

checkasm timing      Cortex-A7    A8    A9
intra_predict_8x8c_p_c      5525  4732  4755
intra_predict_8x8c_p_neon   1719  1140  1262  (before)
intra_predict_8x8c_p_neon   1663  1142  1255  (after)

Vittorio Giovara [Tue, 15 Sep 2015 13:40:14 +0000 (15:40 +0200)]

lavf: Use the prefixed name for pixel format enum

Janne Grunau [Wed, 2 Sep 2015 22:21:58 +0000 (00:21 +0200)]

aarch64: fix x264_mbtree_propagate_cost_neon

The branch condition caused the loop to execute one more time than
intended. Detected through memory corruption on ARM with the 1:1 port of
the function.

Martin Storsjö [Thu, 13 Aug 2015 20:59:22 +0000 (23:59 +0300)]

aarch64: Fix integral_init4/8h_neon

The stride is the number of uint16_t elements and thus needs
to be shifted.

This issue had gone unnoticed since checkasm didn't actually
verify the output of these functions.
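
The element-versus-byte distinction can be shown in C. This is a sketch of the address arithmetic only, assuming the assembly works on byte addresses; `row_above_by_bytes` is a hypothetical helper, not code from the commit.

```c
#include <stdint.h>

/* Locate the row above `sum` using explicit byte arithmetic, as assembly
 * has to. The stride counts uint16_t elements, so it must be shifted
 * left by one (multiplied by sizeof(uint16_t)) to become a byte offset. */
static uint16_t *row_above_by_bytes(uint16_t *sum, intptr_t stride)
{
    intptr_t byte_stride = stride << 1;
    return (uint16_t *)((unsigned char *)sum - byte_stride);
}
```

In C, `sum - stride` scales by the element size automatically; assembly does not, which is the kind of mistake the commit describes.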

Henrik Gramner [Thu, 27 Aug 2015 17:53:00 +0000 (19:53 +0200)]

x86: Fix integral_init4/8h_avx2

The AVX2 implementation was using the wrong offsets. It went undetected due to
the checkasm test being incorrect.

Mark Webster [Wed, 5 Aug 2015 03:28:17 +0000 (04:28 +0100)]

Simplify inclusion of x264.h in C++ projects

Name all structs to support forward declarations.
Add a conditional extern "C" wrapper in x264.h itself instead of having to
specify it in every location where it's included.
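
The pattern looks roughly like the following header sketch. All names here (`EXAMPLE_API_H`, `example_param_t`, `example_param_default`) are hypothetical and only illustrate the technique; they are not taken from x264.h.

```c
#ifndef EXAMPLE_API_H
#define EXAMPLE_API_H

/* The wrapper lives in the header itself, so C++ users can include it
 * directly without adding their own extern "C" block. */
#ifdef __cplusplus
extern "C" {
#endif

/* A named struct (rather than an anonymous typedef) can be
 * forward-declared elsewhere as `struct example_param_t;`. */
typedef struct example_param_t
{
    int i_width;
    int i_height;
} example_param_t;

void example_param_default(example_param_t *param);

#ifdef __cplusplus
}
#endif

#endif /* EXAMPLE_API_H */

/* Definition, normally placed in a separate .c file. */
void example_param_default(example_param_t *param)
{
    param->i_width = 0;
    param->i_height = 0;
}
```

A C compiler never sees the `extern "C"` lines, while a C++ compiler gets unmangled symbol names for everything declared between them.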

Henrik Gramner [Sun, 16 Aug 2015 19:59:26 +0000 (21:59 +0200)]

checkasm: Properly save rdx/edx in checkasm_call() on x86

If the return value doesn't fit in a single register, rdx/edx can in some
cases be used in addition to rax/eax.

Doesn't affect any of the existing checkasm tests but it's more correct
behavior and it might be useful in the future.

Henrik Gramner [Tue, 11 Aug 2015 15:19:35 +0000 (17:19 +0200)]

x86: Enable SSE2 by default on x86-32

It makes more sense to tune the defaults to benefit the vast majority of users.

Anyone still using a Pentium III for video encoding is of course free to
explicitly set different flags when compiling.

Henrik Gramner [Mon, 10 Aug 2015 20:30:21 +0000 (22:30 +0200)]

msvs/icl: Improve default CFLAGS

Use -fp:fast as a substitute for -ffast-math.
Increase warning level from -W0 to -W1 (the default setting).
Disable -GS (stack cookies) on MSVS. It's disabled by default on ICL.

Henrik Gramner [Wed, 12 Aug 2015 20:23:31 +0000 (22:23 +0200)]

Use a relative $SRCPATH for out-of-tree builds

Fixes out-of-tree MSVS builds on Cygwin.

Henrik Gramner [Sat, 8 Aug 2015 20:26:38 +0000 (22:26 +0200)]

cygwin: Enable MSVS support

`cl -showIncludes` creates absolute Windows paths for some files; attempt
to convert those to Unix paths.

Use relative paths for dependencies located in or below the working directory
in order to mimic the behavior of gcc and to make the paths more readable.

Make the dependency generation script a bit more robust in general.

Henrik Gramner [Sat, 8 Aug 2015 16:34:21 +0000 (18:34 +0200)]

cltostr.sh: Minor fixes

Henrik Gramner [Sat, 8 Aug 2015 10:21:54 +0000 (12:21 +0200)]

Simplify version.sh

Also remove some non-POSIX syntax and improve robustness.

As a bonus the script now runs about 2-3 times faster.

`git rev-list --count` could be used to simplify things even further,
but that functionality was added in git 1.7.2 so keep `wc -l` for now
to maintain compatibility with older git versions.

장영훈 [Fri, 7 Aug 2015 05:43:24 +0000 (14:43 +0900)]

msvs: Fix cl detection in non-English environments

Henrik Gramner [Mon, 3 Aug 2015 19:05:11 +0000 (21:05 +0200)]

x86inc: Sync minor changes from ffmpeg/libav

Henrik Gramner [Wed, 29 Jul 2015 17:30:52 +0000 (19:30 +0200)]

matroska: Add comments for the remaining element names

Henrik Gramner [Wed, 29 Jul 2015 17:30:41 +0000 (19:30 +0200)]

Silence various static analyzer warnings

Those are false positives, but it doesn't hurt to get rid of them.

Henrik Gramner [Sun, 26 Jul 2015 21:13:29 +0000 (23:13 +0200)]

mingw: Enable the tsaware linker flag

Avoids an irrelevant compatibility layer in Terminal Services environments.

https://msdn.microsoft.com/en-us/library/cc834995.aspx

Henrik Gramner [Sun, 26 Jul 2015 21:13:26 +0000 (23:13 +0200)]

msvs: Don't redefine snprintf for VS2015

Visual Studio 2015 has a proper snprintf implementation.

Henrik Gramner [Sun, 26 Jul 2015 21:13:19 +0000 (23:13 +0200)]

msvs: Prefer link.exe from the same directory as cl.exe

/usr/bin/link from coreutils may be located before the MSVS linker in $PATH
which causes linking to fail due to using the wrong binary.

Henrik Gramner [Sun, 26 Jul 2015 22:10:00 +0000 (00:10 +0200)]

frame_dump: check fseek() return value

Henrik Gramner [Sun, 26 Jul 2015 22:08:38 +0000 (00:08 +0200)]

x264_vfprintf: use va_copy

It's undefined behavior to use the same va_list twice.

This most likely didn't cause any issues in practice since the string would
have to be larger than 4 KiB to trigger the fallback path.

Use a workaround for ICL, as it doesn't define va_copy even in C99 mode.
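
The va_copy pattern can be sketched as follows. This is an illustration under stated assumptions, not x264_vfprintf itself: `vformat_dup` and `format_dup` are hypothetical names, and the stack buffer is 16 bytes instead of 4 KiB so the fallback path is easy to trigger.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Format into a small stack buffer first; if the result does not fit,
 * format a second time into a heap buffer of the exact size.
 * Returns a malloc'd string, or NULL on error. */
static char *vformat_dup(const char *fmt, va_list arg)
{
    char small[16];
    va_list arg2;
    va_copy(arg2, arg);   /* first pass consumes the copy, not `arg` */
    int len = vsnprintf(small, sizeof(small), fmt, arg2);
    va_end(arg2);
    if (len < 0)
        return NULL;

    char *out = malloc((size_t)len + 1);
    if (!out)
        return NULL;
    if ((size_t)len < sizeof(small))
        memcpy(out, small, (size_t)len + 1);
    else
        vsnprintf(out, (size_t)len + 1, fmt, arg);   /* fallback pass */
    return out;
}

/* Variadic convenience wrapper around vformat_dup(). */
static char *format_dup(const char *fmt, ...)
{
    va_list arg;
    va_start(arg, fmt);
    char *out = vformat_dup(fmt, arg);
    va_end(arg);
    return out;
}
```

Without the copy, the fallback call would traverse a va_list that the first vsnprintf has already consumed, which is the undefined behavior the commit removes.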

Henrik Gramner [Sun, 26 Jul 2015 22:08:31 +0000 (00:08 +0200)]

param_parse: Fix framerate rounding issues

Marcin Juszkiewicz [Mon, 1 Jun 2015 09:24:45 +0000 (11:24 +0200)]

aarch64: Remove broken CFLAGS in configure

GCC doesn't have an "-arch" switch, but works when that entire line is removed.

Rong Yan [Mon, 20 Jul 2015 08:34:20 +0000 (03:34 -0500)]

ppc: Add little-endian PowerPC support

Rishikesh More [Thu, 18 Jun 2015 12:18:46 +0000 (17:48 +0530)]

mips: MSA quant optimizations

Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>

Rishikesh More [Thu, 18 Jun 2015 12:18:45 +0000 (17:48 +0530)]

mips: MSA predict optimizations

Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>

Rishikesh More [Thu, 18 Jun 2015 12:18:44 +0000 (17:48 +0530)]

mips: MSA pixel optimizations

Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>

Rishikesh More [Thu, 18 Jun 2015 12:18:43 +0000 (17:48 +0530)]

mips: MSA deblock optimizations

Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>

Rishikesh More [Thu, 18 Jun 2015 12:18:42 +0000 (17:48 +0530)]

mips: MSA dct optimizations

Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>

Rishikesh More [Thu, 18 Jun 2015 12:18:40 +0000 (17:48 +0530)]

mips: MSA mc optimizations

Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>

Rishikesh More [Thu, 18 Jun 2015 12:18:38 +0000 (17:48 +0530)]

mips: Common MSA macros

Add macros for load/store, slide, shift, transpose and basic arithmetic
operations required by subsequent patches.

Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>

Rishikesh More [Tue, 12 May 2015 14:08:09 +0000 (19:38 +0530)]

mips: Add MSA support to checkasm

Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>

Kaustubh Raste [Fri, 17 Apr 2015 12:08:58 +0000 (17:38 +0530)]

mips: Initial MSA support

MSA is the MIPS SIMD Architecture.

Add X264_CPU_MSA define.
Update configure to detect MIPS platform and set flags.
CPU-specific gcc options are expected through --extra-cflags.

Sample command line for mips32r5:
./configure --host=mipsel-linux-gnu --cross-prefix=<TOOLCHAIN>/mips-mti-linux-gnu- \
    --extra-cflags="-EL -mips32r5 -msched-weight -mload-store-pairs"

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>

Anton Mitrofanov [Thu, 16 Jul 2015 21:22:29 +0000 (00:22 +0300)]

Limit autodetection of the thread count according to the source height

Anton Mitrofanov [Thu, 16 Jul 2015 16:04:59 +0000 (19:04 +0300)]

Fine-tune the frame size predictors at ratecontrol start

This is an attempt to improve VBV at the start of a video when a lot of
threads delay feedback to the predictors.

Anton Mitrofanov [Thu, 16 Jul 2015 13:15:56 +0000 (16:15 +0300)]

Use forced frame types in slicetype analysis

This should improve MBTree and VBV when a lot of forced frame types are used.

Henrik Gramner [Mon, 1 Dec 2014 21:05:42 +0000 (22:05 +0100)]

x86: SSSE3 and AVX2 implementations of plane_copy_swap

For NV21 input.
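
What such a swap does can be written as a plain C loop. This is a sketch only, not the SSSE3/AVX2 code or x264's C reference: NV21 stores interleaved chroma as V,U pairs while NV12 stores U,V, so converting between them means swapping the two bytes of every pair. The name `plane_copy_swap_sketch` and its parameter layout are assumptions.

```c
#include <stdint.h>

/* Copy an interleaved chroma plane while swapping the two bytes of each
 * pair (V,U -> U,V). `w` is the number of chroma pairs per row and `h`
 * the number of rows; strides are in bytes. */
static void plane_copy_swap_sketch(uint8_t *dst, intptr_t dst_stride,
                                   const uint8_t *src, intptr_t src_stride,
                                   int w, int h)
{
    for (int y = 0; y < h; y++, dst += dst_stride, src += src_stride)
        for (int x = 0; x < 2 * w; x += 2) {
            dst[x]     = src[x + 1];
            dst[x + 1] = src[x];
        }
}
```

A SIMD version performs the same permutation on many pairs at once with a byte shuffle.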

Yu Xiaolei [Fri, 6 Jun 2014 08:05:27 +0000 (16:05 +0800)]

NV21 input support

Eliminates an extra copy when encoding Android camera preview images.

Checkasm test by Janne Grunau.
ARM assembly with improvements from Janne Grunau.

Henrik Gramner [Tue, 23 Jun 2015 15:00:47 +0000 (17:00 +0200)]

deblock: Write combining

Henrik Gramner [Tue, 23 Jun 2015 12:59:59 +0000 (14:59 +0200)]

Get rid of some tabs and trailing whitespace

Henrik Gramner [Sat, 23 May 2015 17:44:16 +0000 (19:44 +0200)]

x86: Experimental nasm support

Enables the use of nasm as an alternative to yasm.

Note that nasm cannot assemble x264 with PIC enabled since it currently doesn't
support [symbol-$$] addressing which is used extensively by x264's PIC code.
This includes all 64-bit Windows and 64-bit OS X builds, even non-shared.

For the above reason nasm is currently intentionally not auto-detected, instead
the assembler must be explicitly specified using "AS=nasm ./configure".

Also drop -O2 from ASFLAGS since it's simply ignored anyway.

Timothy Gu [Tue, 26 May 2015 17:12:42 +0000 (19:12 +0200)]

x86inc: Prevent warnings when using `struc` and `endstruc`

struc and endstruc attempt to revert to the previous section state set by
the SECTION macro.

Use the primitive [SECTION] directive instead of the SECTION macro for the
.note.GNU-stack section to prevent it from being emitted again during endstruc.

Henrik Gramner [Wed, 27 May 2015 19:38:14 +0000 (21:38 +0200)]

x86inc: Drop SECTION_TEXT macro

The .text section is already 16-byte aligned by default on all supported
platforms so `SECTION_TEXT` isn't any different from `SECTION .text`.

Henrik Gramner [Sat, 23 May 2015 11:38:05 +0000 (13:38 +0200)]

x86inc: Disable vpbroadcastq workaround in newer yasm versions

The bug was fixed in 1.3.0, so only perform the workaround in earlier versions.

Henrik Gramner [Sun, 24 May 2015 20:57:00 +0000 (22:57 +0200)]

Prefer Unicode versions of Windows API calls

Just for consistency; doesn't affect behavior.

Henrik Gramner [Sun, 24 May 2015 21:21:20 +0000 (23:21 +0200)]

Get rid of fPIC warnings when compiling a shared library on Windows

PIC is always enabled when compiling for Windows so gcc complains when using
-fPIC since it doesn't do anything.

Henrik Gramner [Sat, 25 Jul 2015 20:42:59 +0000 (22:42 +0200)]

matroska: Write the correct DocTypeVersion when using frame-packing

The StereoMode element is only valid with DocTypeVersion 3 or higher.

Anton Mitrofanov [Fri, 24 Jul 2015 21:21:52 +0000 (00:21 +0300)]

dump_yuv: Fix file handle leak

Anton Mitrofanov [Fri, 24 Jul 2015 21:20:47 +0000 (00:20 +0300)]

mp4: Fix file handle leak

Henrik Gramner [Tue, 23 Jun 2015 22:40:45 +0000 (00:40 +0200)]

flv: Check fseek() and fwrite() return values

Henrik Gramner [Tue, 23 Jun 2015 22:22:56 +0000 (00:22 +0200)]

flv: Fix memory and file handle leaks

Henrik Gramner [Tue, 23 Jun 2015 23:23:35 +0000 (01:23 +0200)]

avs: Fix file handle leak

Henrik Gramner [Tue, 23 Jun 2015 11:38:02 +0000 (13:38 +0200)]

matroska: Fix memory leak

Henrik Gramner [Tue, 23 Jun 2015 11:24:29 +0000 (13:24 +0200)]

rdo: Fix potential CAVLC overflow issues

Henrik Gramner [Tue, 23 Jun 2015 20:08:35 +0000 (22:08 +0200)]

slurp_file: Various minor bug fixes

* Fix unsigned <= 0 check.
* Add additional size sanity check on 32-bit systems.
* Don't read uninitialized data if fread() fails.
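
The three fixes can be illustrated with a small reader. This is a sketch under assumptions, not x264's slurp_file: it takes an already opened stream instead of a filename, and the name `slurp_stream_sketch` is hypothetical.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Read a whole seekable stream into a NUL-terminated malloc'd buffer.
 * Returns NULL for empty streams and on any error. */
static char *slurp_stream_sketch(FILE *fh)
{
    if (fseek(fh, 0, SEEK_END) != 0)
        return NULL;
    /* Keep the size signed so that "<= 0" also catches ftell() errors
     * (-1); the same test on an unsigned type could only detect 0. */
    long size = ftell(fh);
    /* The upper bound guards the size + 1 allocation below; it matters
     * when the size type is wider than size_t, e.g. on 32-bit systems. */
    if (size <= 0 || (uintmax_t)size >= SIZE_MAX)
        return NULL;
    if (fseek(fh, 0, SEEK_SET) != 0)
        return NULL;

    char *buf = malloc((size_t)size + 1);
    if (!buf)
        return NULL;
    /* A short read would leave part of the buffer uninitialized,
     * so treat it as a failure instead of returning the data. */
    if (fread(buf, 1, (size_t)size, fh) != (size_t)size) {
        free(buf);
        return NULL;
    }
    buf[size] = '\0';
    return buf;
}
```

The caller owns the returned buffer and remains responsible for closing the stream.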

Henrik Gramner [Tue, 23 Jun 2015 20:47:53 +0000 (22:47 +0200)]

param_parse: Check strdup() return value

Henrik Gramner [Tue, 23 Jun 2015 13:38:16 +0000 (15:38 +0200)]

param_parse: Fix memory leak

Anton Mitrofanov [Fri, 19 Jun 2015 13:01:12 +0000 (16:01 +0300)]

Add FreeBSD's stdint.h header guard to allowed list

Patch written by Koop Mast <kwm@FreeBSD.org>

Henrik Gramner [Fri, 22 May 2015 17:23:33 +0000 (19:23 +0200)]

x86: Prevent overread of src in plane_copy_interleave

Could only occur in 4:2:2 with height == 1.

Also enable asm for inputs with different U/V strides as long as the strides
have identical signs.

Anton Mitrofanov [Wed, 20 May 2015 20:10:20 +0000 (23:10 +0300)]

checkasm: Fix incorrect memcmp size for ARM architecture

Anton Mitrofanov [Sun, 26 Apr 2015 17:51:05 +0000 (20:51 +0300)]

Fix possible use of uninitialized MVs in lookahead analysis for B-frames

Anton Mitrofanov [Tue, 21 Apr 2015 20:08:19 +0000 (23:08 +0300)]

Catch incorrect usage of libx264 API for delayed frames flushing

Anton Mitrofanov [Sat, 7 Mar 2015 20:00:09 +0000 (23:00 +0300)]

Fix detection of system libx264 configuration

Anton Mitrofanov [Mon, 23 Feb 2015 11:23:18 +0000 (14:23 +0300)]

Cosmetic changes

Anton Mitrofanov [Tue, 30 Dec 2014 23:15:05 +0000 (02:15 +0300)]

Update configure for auto detection of system libx264 configuration

Anton Mitrofanov [Tue, 3 Feb 2015 11:51:28 +0000 (14:51 +0300)]

Add tile format frame packing value

Defined in 2014-02 edition.

Anton Mitrofanov [Tue, 3 Feb 2015 10:39:14 +0000 (13:39 +0300)]

Stricter validation of crop-rect values

Vittorio Giovara [Tue, 20 Jan 2015 16:15:56 +0000 (16:15 +0000)]

Add mono frame packing value

Defined in 2013-04 edition.

Vittorio Giovara [Tue, 20 Jan 2015 15:57:41 +0000 (15:57 +0000)]

Validate frame packing value instead of clipping

Christophe Gisquet [Tue, 3 Feb 2015 19:40:41 +0000 (20:40 +0100)]

x86inc: Correctly warn on use of SSE2 instructions in SSE functions

SSE2 instructions that are XMM-implementations of pre-existing MMX/MMX2
instructions did not issue warnings when used in SSE functions. Handle
it by also checking the register type when such instructions are used.

Christophe Gisquet [Tue, 3 Feb 2015 17:02:30 +0000 (18:02 +0100)]

x86inc: Fix instantiation of YMM registers

Vittorio Giovara [Tue, 20 Jan 2015 16:28:54 +0000 (16:28 +0000)]

matroska: Correctly write display width and height in stereo mode

According to the specifications, when stereo mode is set, these values
represent the single view size.

Kieran Kunhya [Tue, 20 Jan 2015 15:38:00 +0000 (09:38 -0600)]

Use POC type 0 for AVC-Intra

Based on a patch from Capella Systems

Anton Mitrofanov [Sat, 3 Jan 2015 12:46:19 +0000 (15:46 +0300)]

Fix ARCH variable name conflict with BSD ports (bsd.port.mk) read-only variable

Anton Mitrofanov [Sat, 27 Dec 2014 17:35:39 +0000 (20:35 +0300)]

Fix negative percentages in final stats output

They were caused by integer overflow when encoding long UHD video.
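
The commit does not say which counter overflowed, so the following is illustrative arithmetic only. It assumes a per-macroblock counter: a 3840x2160 frame holds 240 * 135 = 32400 macroblocks of 16x16 pixels, so a 32-bit signed total would exceed its range after roughly 66000 frames. The names below are hypothetical.

```c
#include <stdint.h>

/* Macroblocks (16x16) in one 3840x2160 frame: 240 * 135 = 32400. */
#define MBS_PER_UHD_FRAME ((3840 / 16) * (2160 / 16))

/* Accumulate in 64 bits so long encodes cannot wrap the counter. */
static int64_t total_mbs(int64_t frames)
{
    return frames * MBS_PER_UHD_FRAME;
}

/* Compute the percentage in floating point from 64-bit counts. */
static double percent_of(int64_t count, int64_t total)
{
    return total > 0 ? 100.0 * (double)count / (double)total : 0.0;
}
```

For 100000 frames the total is 3240000000, which no longer fits in a signed 32-bit integer; a wrapped counter of that kind would be one way to end up with negative percentages.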

Anton Mitrofanov [Sat, 3 Jan 2015 20:35:23 +0000 (23:35 +0300)]

Bump dates to 2015

Anton Mitrofanov [Mon, 15 Dec 2014 15:49:23 +0000 (18:49 +0300)]

x86: Update intel compiler cpu dispatcher override for new versions of ICC/ICL

Anton Mitrofanov [Tue, 6 Sep 2011 17:53:29 +0000 (21:53 +0400)]

New AQ mode: auto-variance AQ with bias to dark scenes

Also known as --aq-mode 3 or auto-variance AQ modification.

Anton Mitrofanov [Tue, 28 Aug 2012 23:02:27 +0000 (03:02 +0400)]

Improve HRD conformance

Henrik Gramner [Fri, 28 Nov 2014 22:24:56 +0000 (23:24 +0100)]

x86: SSE and AVX implementations of plane_copy

Also remove the MMX2 implementation and fix src overread for height == 1.

Anton Mitrofanov [Mon, 29 Sep 2014 19:26:19 +0000 (23:26 +0400)]

Update to the latest version of gas-preprocessor.pl from http://git.libav.org/?p=gas-preprocessor.git

Contributions by Janne Grunau, Martin Storsjo, Mans Rullgard, David Conrad, Martin Aumuller and others

Janne Grunau [Tue, 18 Nov 2014 23:33:55 +0000 (00:33 +0100)]

aarch64: cabac_encode_{decision,bypass,terminal}_asm

benchmarks on a Nexus 9 (nvidia denver):
101.3 cycles in x264_cabac_encode_decision_c,   67105369 runs, 3495 skips
97.3 cycles in x264_cabac_encode_decision_asm, 67105493 runs, 3371 skips
132.8 cycles in x264_cabac_encode_terminal_c,    1046950 runs, 1626 skips
116.1 cycles in x264_cabac_encode_terminal_asm,  1048424 runs, 152 skips
92.4 cycles in x264_cabac_encode_bypass_c,     16776192 runs, 1024 skips
89.6 cycles in x264_cabac_encode_bypass_asm,   16776453 runs, 763 skips

Cycle counts are not as stable as one would like. The dynamic code
optimisation seems to produce different results for small changes in a
binary. Repeated runs with the same binary produce stable results
though (ignoring the first run).

Janne Grunau [Thu, 6 Nov 2014 08:20:17 +0000 (09:20 +0100)]

checkasm: add cycle counter read for aarch64

Needs kernel support since user space access to the cycle counter is not
allowed on all available AArch64 systems (Android 5 and iOS).

Janne Grunau [Wed, 5 Nov 2014 10:35:13 +0000 (11:35 +0100)]

aarch64: nal_escape_neon

3-4 times faster.

Janne Grunau [Fri, 31 Oct 2014 13:49:04 +0000 (14:49 +0100)]

aarch64: {plane_copy,memcpy_aligned,memzero_aligned}_neon

2-3 times faster than C.

Janne Grunau [Wed, 29 Oct 2014 17:17:48 +0000 (18:17 +0100)]

aarch64: x264_mbtree_propagate_{cost,list}_neon

x264_mbtree_propagate_cost_neon is ~7 times faster.
x264_mbtree_propagate_list_neon is 33% faster.

Janne Grunau [Tue, 21 Oct 2014 13:18:49 +0000 (15:18 +0200)]

aarch64: x264_denoise_dct_neon

3.5 times faster.

Janne Grunau [Mon, 20 Oct 2014 11:12:14 +0000 (13:12 +0200)]

aarch64: x264_coeff_level_run{4,8,15,16}

All functions ~33% faster.
