granicus.if.org Git - libx264/log
libx264
9 years agomips: Add MSA support to checkasm
Rishikesh More [Tue, 12 May 2015 14:08:09 +0000 (19:38 +0530)]
mips: Add MSA support to checkasm

Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>
9 years agomips: Initial MSA support
Kaustubh Raste [Fri, 17 Apr 2015 12:08:58 +0000 (17:38 +0530)]
mips: Initial MSA support

MSA is the MIPS SIMD Architecture.

Add X264_CPU_MSA define.
Update configure to detect MIPS platform and set flags.
CPU-specific gcc options are expected through --extra-cflags.

Sample command line for mips32r5:
    ./configure --host=mipsel-linux-gnu --cross-prefix=<TOOLCHAIN>/mips-mti-linux-gnu-
    --extra-cflags="-EL -mips32r5 -msched-weight -mload-store-pairs"

Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
9 years agoLimit autodetection of threads number according to the source height
Anton Mitrofanov [Thu, 16 Jul 2015 21:22:29 +0000 (00:22 +0300)]
Limit autodetection of threads number according to the source height

9 years agoFine-tune of frame's size predictors at ratecontrol start
Anton Mitrofanov [Thu, 16 Jul 2015 16:04:59 +0000 (19:04 +0300)]
Fine-tune of frame's size predictors at ratecontrol start

This is an attempt to improve VBV at the start of a video when many threads are
used, since they delay feedback for the predictors.

9 years agoUse forced frame types in slicetype analysis
Anton Mitrofanov [Thu, 16 Jul 2015 13:15:56 +0000 (16:15 +0300)]
Use forced frame types in slicetype analysis

This should improve MBTree and VBV when a lot of forced frame types are used.

9 years agox86: SSSE3 and AVX2 implementations of plane_copy_swap
Henrik Gramner [Mon, 1 Dec 2014 21:05:42 +0000 (22:05 +0100)]
x86: SSSE3 and AVX2 implementations of plane_copy_swap

For NV21 input.
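
NV21 differs from NV12 only in the order of the two interleaved chroma bytes, so the copy amounts to a byte swap within each pair. A plain C sketch of that operation follows; the function name and signature are hypothetical and are not taken from x264.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not x264's actual function: copy an interleaved
 * chroma plane while swapping the two bytes of every pair, turning
 * V,U,V,U,... into U,V,U,V,... (or the reverse).
 * `pairs` is the number of chroma pairs per row; strides are in bytes. */
void plane_copy_swap_sketch(uint8_t *dst, ptrdiff_t dst_stride,
                            const uint8_t *src, ptrdiff_t src_stride,
                            int pairs, int height)
{
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < pairs; x++) {
            dst[2 * x]     = src[2 * x + 1];
            dst[2 * x + 1] = src[2 * x];
        }
        dst += dst_stride;
        src += src_stride;
    }
}
```

A SIMD version can do the same swap on many pairs at once with a byte shuffle, which is presumably why SSSE3 is the minimum instruction set named in the commit title.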

9 years agoNV21 input support
Yu Xiaolei [Fri, 6 Jun 2014 08:05:27 +0000 (16:05 +0800)]
NV21 input support

Eliminates an extra copy when encoding Android camera preview images.

Checkasm test by Janne Grunau.
ARM assembly with improvements from Janne Grunau.

9 years agodeblock: Write combining
Henrik Gramner [Tue, 23 Jun 2015 15:00:47 +0000 (17:00 +0200)]
deblock: Write combining

9 years agoGet rid of some tabs and trailing whitespaces
Henrik Gramner [Tue, 23 Jun 2015 12:59:59 +0000 (14:59 +0200)]
Get rid of some tabs and trailing whitespaces

9 years agox86: Experimental nasm support
Henrik Gramner [Sat, 23 May 2015 17:44:16 +0000 (19:44 +0200)]
x86: Experimental nasm support

Enables the use of nasm as an alternative to yasm.

Note that nasm cannot assemble x264 with PIC enabled since it currently doesn't
support [symbol-$$] addressing which is used extensively by x264's PIC code.
This includes all 64-bit Windows and 64-bit OS X builds, even non-shared.

For the above reason nasm is currently intentionally not auto-detected, instead
the assembler must be explicitly specified using "AS=nasm ./configure".

Also drop -O2 from ASFLAGS since it's simply ignored anyway.

9 years agox86inc: Prevent warnings when using `struc` and `endstruc`
Timothy Gu [Tue, 26 May 2015 17:12:42 +0000 (19:12 +0200)]
x86inc: Prevent warnings when using `struc` and `endstruc`

struc and endstruc attempt to revert to the previous section state set by
the SECTION macro.

Use the primitive [SECTION] directive instead of the SECTION macro for the
.note.GNU-stack section to prevent it from being emitted again during endstruc.

9 years agox86inc: Drop SECTION_TEXT macro
Henrik Gramner [Wed, 27 May 2015 19:38:14 +0000 (21:38 +0200)]
x86inc: Drop SECTION_TEXT macro

The .text section is already 16-byte aligned by default on all supported
platforms so `SECTION_TEXT` isn't any different from `SECTION .text`.

9 years agox86inc: Disable vpbroadcastq workaround in newer yasm versions
Henrik Gramner [Sat, 23 May 2015 11:38:05 +0000 (13:38 +0200)]
x86inc: Disable vpbroadcastq workaround in newer yasm versions

The bug was fixed in 1.3.0, so only perform the workaround in earlier versions.

9 years agoPrefer Unicode versions of Windows API calls
Henrik Gramner [Sun, 24 May 2015 20:57:00 +0000 (22:57 +0200)]
Prefer Unicode versions of Windows API calls

Just for consistency, doesn't affect behavior.

9 years agoGet rid of fPIC warnings when compiling a shared library on Windows
Henrik Gramner [Sun, 24 May 2015 21:21:20 +0000 (23:21 +0200)]
Get rid of fPIC warnings when compiling a shared library on Windows

PIC is always enabled when compiling for Windows so gcc complains when using
-fPIC since it doesn't do anything.

9 years agomatroska: Write the correct DocTypeVersion when using frame-packing
Henrik Gramner [Sat, 25 Jul 2015 20:42:59 +0000 (22:42 +0200)]
matroska: Write the correct DocTypeVersion when using frame-packing

The StereoMode element is only valid with DocTypeVersion 3 or higher.

9 years agodump_yuv: Fix file handle leak
Anton Mitrofanov [Fri, 24 Jul 2015 21:21:52 +0000 (00:21 +0300)]
dump_yuv: Fix file handle leak

9 years agomp4: Fix file handle leak
Anton Mitrofanov [Fri, 24 Jul 2015 21:20:47 +0000 (00:20 +0300)]
mp4: Fix file handle leak

9 years agoflv: Check fseek() and fwrite() return values
Henrik Gramner [Tue, 23 Jun 2015 22:40:45 +0000 (00:40 +0200)]
flv: Check fseek() and fwrite() return values

9 years agoflv: Fix memory and file handle leaks
Henrik Gramner [Tue, 23 Jun 2015 22:22:56 +0000 (00:22 +0200)]
flv: Fix memory and file handle leaks

9 years agoavs: Fix file handle leak
Henrik Gramner [Tue, 23 Jun 2015 23:23:35 +0000 (01:23 +0200)]
avs: Fix file handle leak

9 years agomatroska: Fix memory leak
Henrik Gramner [Tue, 23 Jun 2015 11:38:02 +0000 (13:38 +0200)]
matroska: Fix memory leak

9 years agordo: Fix potential CAVLC overflow issues
Henrik Gramner [Tue, 23 Jun 2015 11:24:29 +0000 (13:24 +0200)]
rdo: Fix potential CAVLC overflow issues

9 years agoslurp_file: Various minor bug fixes
Henrik Gramner [Tue, 23 Jun 2015 20:08:35 +0000 (22:08 +0200)]
slurp_file: Various minor bug fixes

 * Fix unsigned <= 0 check.
 * Add additional size sanity check on 32-bit systems.
 * Don't read uninitialized data if fread() fails.
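
The three fixes above can be illustrated with a minimal file-reading sketch. This is a hypothetical reimplementation written for illustration, not x264's slurp_file(); the name slurp_file_sketch and the choice to reject empty files are assumptions.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical sketch, not x264's code: read a whole file into a freshly
 * allocated, NUL-terminated buffer. Returns NULL on any error
 * (in this sketch an empty file also counts as an error). */
char *slurp_file_sketch(const char *filename)
{
    FILE *fh = fopen(filename, "rb");
    if (!fh)
        return NULL;

    char *buf = NULL;
    if (fseek(fh, 0, SEEK_END) == 0) {
        /* Keep the size signed: ftell() reports errors as -1, which an
         * unsigned variable can never reveal through a `<= 0` test. */
        long size = ftell(fh);
        /* Sanity check: size + 1 bytes must be representable in size_t. */
        if (size > 0 && (unsigned long)size < SIZE_MAX &&
            fseek(fh, 0, SEEK_SET) == 0) {
            buf = malloc((size_t)size + 1);
            if (buf) {
                if (fread(buf, 1, (size_t)size, fh) == (size_t)size) {
                    buf[size] = '\0';
                } else {
                    /* Short read: discard the buffer instead of returning
                     * partly uninitialized memory. */
                    free(buf);
                    buf = NULL;
                }
            }
        }
    }
    fclose(fh);
    return buf;
}
```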

9 years agoparam_parse: Check strdup() return value
Henrik Gramner [Tue, 23 Jun 2015 20:47:53 +0000 (22:47 +0200)]
param_parse: Check strdup() return value

9 years agoparam_parse: Fix memory leak
Henrik Gramner [Tue, 23 Jun 2015 13:38:16 +0000 (15:38 +0200)]
param_parse: Fix memory leak

9 years agoAdd FreeBSD's stdint.h header guard to allowed list
Anton Mitrofanov [Fri, 19 Jun 2015 13:01:12 +0000 (16:01 +0300)]
Add FreeBSD's stdint.h header guard to allowed list

Patch written by Koop Mast <kwm@FreeBSD.org>

9 years agox86: Prevent overread of src in plane_copy_interleave
Henrik Gramner [Fri, 22 May 2015 17:23:33 +0000 (19:23 +0200)]
x86: Prevent overread of src in plane_copy_interleave

Could only occur in 4:2:2 with height == 1.

Also enable asm for inputs with different U/V strides as long as the strides
have identical signs.

9 years agocheckasm: Fix incorrect memcmp size for ARM architecture
Anton Mitrofanov [Wed, 20 May 2015 20:10:20 +0000 (23:10 +0300)]
checkasm: Fix incorrect memcmp size for ARM architecture

9 years agoFix possible use of uninitialized MVs in lookahead analysis for B-frames
Anton Mitrofanov [Sun, 26 Apr 2015 17:51:05 +0000 (20:51 +0300)]
Fix possible use of uninitialized MVs in lookahead analysis for B-frames

9 years agoCatch incorrect usage of libx264 API for delayed frames flushing
Anton Mitrofanov [Tue, 21 Apr 2015 20:08:19 +0000 (23:08 +0300)]
Catch incorrect usage of libx264 API for delayed frames flushing

9 years agoFix detection of system libx264 configuration
Anton Mitrofanov [Sat, 7 Mar 2015 20:00:09 +0000 (23:00 +0300)]
Fix detection of system libx264 configuration

9 years agoCosmetic changes
Anton Mitrofanov [Mon, 23 Feb 2015 11:23:18 +0000 (14:23 +0300)]
Cosmetic changes

9 years agoUpdate configure for auto detection of system libx264 configuration
Anton Mitrofanov [Tue, 30 Dec 2014 23:15:05 +0000 (02:15 +0300)]
Update configure for auto detection of system libx264 configuration

9 years agoAdd tile format frame packing value
Anton Mitrofanov [Tue, 3 Feb 2015 11:51:28 +0000 (14:51 +0300)]
Add tile format frame packing value

Defined in 2014-02 edition.

9 years agoStricter validation of crop-rect values
Anton Mitrofanov [Tue, 3 Feb 2015 10:39:14 +0000 (13:39 +0300)]
Stricter validation of crop-rect values

9 years agoAdd mono frame packing value
Vittorio Giovara [Tue, 20 Jan 2015 16:15:56 +0000 (16:15 +0000)]
Add mono frame packing value

Defined in 2013-04 edition.

9 years agoValidate frame packing value instead of clipping
Vittorio Giovara [Tue, 20 Jan 2015 15:57:41 +0000 (15:57 +0000)]
Validate frame packing value instead of clipping
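
The difference between the two approaches can be sketched as follows. Clipping silently maps an out-of-range value to a different, valid one, while validation rejects it so the caller sees the mistake. Function names, the -1 "not set" convention and the max_value parameter are assumptions made for illustration.

```c
/* Illustrative only: names and the accepted range are assumptions.
 * -1 stands for "not set"; 0..max_value are the defined arrangements. */

/* Old-style behaviour: force the value into range without reporting. */
int clip_frame_packing(int value, int max_value)
{
    if (value < -1)
        return -1;
    if (value > max_value)
        return max_value;
    return value;
}

/* Validation: returns 0 if the value is acceptable, -1 otherwise. */
int validate_frame_packing(int value, int max_value)
{
    return (value >= -1 && value <= max_value) ? 0 : -1;
}
```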

9 years agox86inc: Correctly warn on use of SSE2 instructions in SSE functions
Christophe Gisquet [Tue, 3 Feb 2015 19:40:41 +0000 (20:40 +0100)]
x86inc: Correctly warn on use of SSE2 instructions in SSE functions

SSE2 instructions that are XMM-implementations of pre-existing MMX/MMX2
instructions did not issue warnings when used in SSE functions. Handle
it by also checking the register type when such instructions are used.

9 years agox86inc: Fix instantiation of YMM registers
Christophe Gisquet [Tue, 3 Feb 2015 17:02:30 +0000 (18:02 +0100)]
x86inc: Fix instantiation of YMM registers

9 years agomatroska: Correctly write display width and height in stereo mode
Vittorio Giovara [Tue, 20 Jan 2015 16:28:54 +0000 (16:28 +0000)]
matroska: Correctly write display width and height in stereo mode

According to the specifications, when stereo mode is set, these values
represent the single view size.

9 years agoUse POC type 0 for AVC-Intra
Kieran Kunhya [Tue, 20 Jan 2015 15:38:00 +0000 (09:38 -0600)]
Use POC type 0 for AVC-Intra

Based on a patch from Capella Systems

9 years agoFix ARCH variable name conflict with BSD ports (bsd.port.mk) read-only variable
Anton Mitrofanov [Sat, 3 Jan 2015 12:46:19 +0000 (15:46 +0300)]
Fix ARCH variable name conflict with BSD ports (bsd.port.mk) read-only variable

9 years agoFix negative percentages in final stats output
Anton Mitrofanov [Sat, 27 Dec 2014 17:35:39 +0000 (20:35 +0300)]
Fix negative percentages in final stats output

They were caused by integer overflow when encoding long UHD video.
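
As a rough illustration of how quickly this can happen, the sketch below assumes the overflowing quantity is a per-macroblock counter held in a signed 32-bit integer; the commit message does not say which counter was affected, and all function names here are hypothetical.

```c
#include <stdint.h>

/* Number of 16x16 macroblocks in a frame (dimensions rounded up). */
int64_t mbs_per_frame(int width, int height)
{
    return (int64_t)((width + 15) / 16) * ((height + 15) / 16);
}

/* Number of frames after which a signed 32-bit macroblock counter
 * would exceed INT32_MAX (assumed counter type, see above). */
int64_t frames_until_int32_overflow(int width, int height)
{
    return INT32_MAX / mbs_per_frame(width, height) + 1;
}

/* Percentage computed from 64-bit counters, which stay positive for
 * any realistic stream length. */
double percent_of(int64_t part, int64_t total)
{
    return total > 0 ? 100.0 * (double)part / (double)total : 0.0;
}
```

Under that assumption a 3840x2160 stream has 32400 macroblocks per frame, so a 32-bit counter wraps after 66281 frames, well within the length of a feature film.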

9 years agoBump dates to 2015
Anton Mitrofanov [Sat, 3 Jan 2015 20:35:23 +0000 (23:35 +0300)]
Bump dates to 2015

10 years agox86: Update intel compiler cpu dispatcher override for new versions of ICC/ICL
Anton Mitrofanov [Mon, 15 Dec 2014 15:49:23 +0000 (18:49 +0300)]
x86: Update intel compiler cpu dispatcher override for new versions of ICC/ICL

10 years agoNew AQ mode: auto-variance AQ with bias to dark scenes
Anton Mitrofanov [Tue, 6 Sep 2011 17:53:29 +0000 (21:53 +0400)]
New AQ mode: auto-variance AQ with bias to dark scenes

Also known as --aq-mode 3 or auto-variance AQ modification.

10 years agoImprove HRD conformance
Anton Mitrofanov [Tue, 28 Aug 2012 23:02:27 +0000 (03:02 +0400)]
Improve HRD conformance

10 years agox86: SSE and AVX implementations of plane_copy
Henrik Gramner [Fri, 28 Nov 2014 22:24:56 +0000 (23:24 +0100)]
x86: SSE and AVX implementations of plane_copy

Also remove the MMX2 implementation and fix src overread for height == 1.

10 years agoUpdate to the latest version of gas-preprocessor.pl from http://git.libav.org/?p...
Anton Mitrofanov [Mon, 29 Sep 2014 19:26:19 +0000 (23:26 +0400)]
Update to the latest version of gas-preprocessor.pl from http://git.libav.org/?p=gas-preprocessor.git

Contributions by Janne Grunau, Martin Storsjo, Mans Rullgard, David Conrad, Martin Aumuller and others

10 years agoaarch64: cabac_encode_{decision,bypass,terminal}_asm
Janne Grunau [Tue, 18 Nov 2014 23:33:55 +0000 (00:33 +0100)]
aarch64: cabac_encode_{decision,bypass,terminal}_asm

benchmarks on a Nexus 9 (nvidia denver):
101.3 cycles in x264_cabac_encode_decision_c,   67105369 runs, 3495 skips
 97.3 cycles in x264_cabac_encode_decision_asm, 67105493 runs, 3371 skips
132.8 cycles in x264_cabac_encode_terminal_c,    1046950 runs, 1626 skips
116.1 cycles in x264_cabac_encode_terminal_asm,  1048424 runs, 152 skips
 92.4 cycles in x264_cabac_encode_bypass_c,     16776192 runs, 1024 skips
 89.6 cycles in x264_cabac_encode_bypass_asm,   16776453 runs, 763 skips

Cycle counts are not as stable as one would like. The dynamic code
optimisation seems to produce different results for small changes in a
binary. Repeated runs with the same binary produce stable results
though (ignoring the first run).

10 years agocheckasm: add cycle counter read for aarch64
Janne Grunau [Thu, 6 Nov 2014 08:20:17 +0000 (09:20 +0100)]
checkasm: add cycle counter read for aarch64

Needs kernel support since user space access to the cycle counter is not
allowed on all available AArch64 systems (Android 5 and iOS).

10 years agoaarch64: nal_escape_neon
Janne Grunau [Wed, 5 Nov 2014 10:35:13 +0000 (11:35 +0100)]
aarch64: nal_escape_neon

3-4 times faster.
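
For context, NAL escaping is the H.264 emulation prevention step: whenever two zero bytes are followed by a byte of value 3 or less, a 0x03 byte is inserted before it. A scalar C sketch of that rule is given below; it is written for illustration and is not x264's C or NEON code.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar sketch of H.264 emulation prevention (not x264's code).
 * After two consecutive zero bytes, a byte <= 0x03 must be preceded by
 * an inserted 0x03. `dst` must have room for the worst case of
 * len + len / 2 bytes. Returns the number of bytes written. */
size_t nal_escape_sketch(uint8_t *dst, const uint8_t *src, size_t len)
{
    size_t out = 0;
    int zeros = 0; /* consecutive zero bytes written so far */

    for (size_t i = 0; i < len; i++) {
        if (zeros >= 2 && src[i] <= 0x03) {
            dst[out++] = 0x03;
            zeros = 0;
        }
        dst[out++] = src[i];
        zeros = (src[i] == 0) ? zeros + 1 : 0;
    }
    return out;
}
```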

10 years agoaarch64: {plane_copy,memcpy_aligned,memzero_aligned}_neon
Janne Grunau [Fri, 31 Oct 2014 13:49:04 +0000 (14:49 +0100)]
aarch64: {plane_copy,memcpy_aligned,memzero_aligned}_neon

2-3 times faster than C.

10 years agoaarch64: x264_mbtree_propagate_{cost,list}_neon
Janne Grunau [Wed, 29 Oct 2014 17:17:48 +0000 (18:17 +0100)]
aarch64: x264_mbtree_propagate_{cost,list}_neon

x264_mbtree_propagate_cost_neon is ~7 times faster.
x264_mbtree_propagate_list_neon is 33% faster.

10 years agoaarch64: x264_denoise_dct_neon
Janne Grunau [Tue, 21 Oct 2014 13:18:49 +0000 (15:18 +0200)]
aarch64: x264_denoise_dct_neon

3.5 times faster.

10 years agoaarch64: x264_coeff_level_run{4,8,15,16}
Janne Grunau [Mon, 20 Oct 2014 11:12:14 +0000 (13:12 +0200)]
aarch64: x264_coeff_level_run{4,8,15,16}

All functions ~33% faster.

10 years agoaarch64: NEON asm for intra luma deblocking
Janne Grunau [Tue, 14 Oct 2014 17:20:52 +0000 (19:20 +0200)]
aarch64: NEON asm for intra luma deblocking

deblock_luma_intra[0]_neon is 2 times faster,
deblock_luma_intra[1]_neon is ~4 times faster.

10 years agoaarch64: x264_deblock_h_chroma_422_neon
Janne Grunau [Mon, 13 Oct 2014 15:29:22 +0000 (17:29 +0200)]
aarch64: x264_deblock_h_chroma_422_neon

deblock_h_chroma_422 2.5 times faster

10 years agoaarch64: x264_deblock_h_chroma_mbaff_neon
Janne Grunau [Mon, 13 Oct 2014 10:43:50 +0000 (12:43 +0200)]
aarch64: x264_deblock_h_chroma_mbaff_neon

deblock_chroma_420_mbaff_neon  2 times faster

10 years agoaarch64: NEON asm for intra chroma deblocking
Janne Grunau [Fri, 10 Oct 2014 08:29:15 +0000 (10:29 +0200)]
aarch64: NEON asm for intra chroma deblocking

deblock_h_chroma_420_intra, deblock_h_chroma_422_intra and
x264_deblock_h_chroma_intra_mbaff_neon are ~3 times faster.
deblock_chroma_intra[1] is ~4 times faster than C.

10 years agoaarch64: add myself as author to aarch64/mc.h
Janne Grunau [Tue, 2 Sep 2014 08:27:22 +0000 (10:27 +0200)]
aarch64: add myself as author to aarch64/mc.h

10 years agoaarch64: NEON asm for integral init
Janne Grunau [Thu, 14 Aug 2014 13:22:50 +0000 (14:22 +0100)]
aarch64: NEON asm for integral init

integral_init4h_neon and integral_init8h_neon are 3-4 times faster than
C. integral_init8v_neon is 6 times faster and integral_init4v_neon is 10
times faster.

10 years agoaarch64: NEON asm for 8x16c intra prediction
Janne Grunau [Wed, 13 Aug 2014 12:30:53 +0000 (13:30 +0100)]
aarch64: NEON asm for 8x16c intra prediction

Between 10% and 40% faster than C.

10 years agoaarch64: NEON asm for decimate_score
Janne Grunau [Tue, 12 Aug 2014 15:26:10 +0000 (17:26 +0200)]
aarch64: NEON asm for decimate_score

decimate_score15 and 16 are 60% faster, decimate_score64 is 4 times
faster than C.

10 years agoaarch64: implement x264_sub8x16_dct_dc_neon
Janne Grunau [Fri, 8 Aug 2014 10:19:35 +0000 (11:19 +0100)]
aarch64: implement x264_sub8x16_dct_dc_neon

4 times faster than C.

10 years agoaarch64: implement x264_pixel_asd8_neon
Janne Grunau [Thu, 7 Aug 2014 17:46:07 +0000 (19:46 +0200)]
aarch64: implement x264_pixel_asd8_neon

7 times faster than C.

10 years agoaarch64: NEON asm for 4x16 sad, satd and ssd
Janne Grunau [Thu, 7 Aug 2014 14:49:12 +0000 (16:49 +0200)]
aarch64: NEON asm for 4x16 sad, satd and ssd

pixel_sad_4x16_neon: 33% faster than C
pixel_satd_4x16_neon: 5 times faster
pixel_ssd_4x16_neon:  4 times faster
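
For readers unfamiliar with these metrics, the following is a straightforward C reference for two of them on a 4x16 block: SAD sums absolute pixel differences and SSD sums squared differences. The function names and signatures are illustrative assumptions, not x264's own C fallbacks.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Sum of absolute differences over a block 4 pixels wide, 16 high. */
int sad_4x16_ref(const uint8_t *a, ptrdiff_t a_stride,
                 const uint8_t *b, ptrdiff_t b_stride)
{
    int sum = 0;
    for (int y = 0; y < 16; y++) {
        for (int x = 0; x < 4; x++)
            sum += abs(a[x] - b[x]);
        a += a_stride;
        b += b_stride;
    }
    return sum;
}

/* Sum of squared differences over the same 4x16 block. */
int ssd_4x16_ref(const uint8_t *a, ptrdiff_t a_stride,
                 const uint8_t *b, ptrdiff_t b_stride)
{
    int sum = 0;
    for (int y = 0; y < 16; y++) {
        for (int x = 0; x < 4; x++) {
            int d = a[x] - b[x];
            sum += d * d;
        }
        a += a_stride;
        b += b_stride;
    }
    return sum;
}
```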

10 years agoaarch64: implement x264_pixel_ssd_nv12_core_neon
Janne Grunau [Wed, 30 Jul 2014 14:48:25 +0000 (15:48 +0100)]
aarch64: implement x264_pixel_ssd_nv12_core_neon

13 times faster than C.

10 years agoaarch64: implement x264_pixel_vsad_neon
Janne Grunau [Tue, 29 Jul 2014 17:26:11 +0000 (18:26 +0100)]
aarch64: implement x264_pixel_vsad_neon

35 times faster than C.

10 years agoaarch64: NEON asm for missing x264_zigzag_* functions
Janne Grunau [Tue, 29 Jul 2014 10:06:24 +0000 (11:06 +0100)]
aarch64: NEON asm for missing x264_zigzag_* functions

zigzag_scan_4x4_field_neon, zigzag_sub_4x4_field_neon,
zigzag_sub_4x4ac_field_neon, zigzag_sub_4x4_frame_neon,
zigzag_sub_4x4ac_frame_neon more than 2 times faster

zigzag_scan_8x8_frame_neon, zigzag_scan_8x8_field_neon,
zigzag_sub_8x8_field_neon, zigzag_sub_8x8_frame_neon 4-5 times faster

zigzag_interleave_8x8_cavlc_neon 6 times faster

10 years agoaarch64: implement x264_pixel_sa8d_satd_16x16_neon
Janne Grunau [Fri, 25 Jul 2014 10:53:17 +0000 (11:53 +0100)]
aarch64: implement x264_pixel_sa8d_satd_16x16_neon

~20% faster than calling pixel_sa8d_16x16 and pixel_satd_16x16
separately.

10 years agoaarch64: optimize x264_predict_8x8c_dc_left_neon
Janne Grunau [Thu, 14 Aug 2014 21:13:27 +0000 (23:13 +0200)]
aarch64: optimize x264_predict_8x8c_dc_left_neon

25% faster than the previous version.

10 years agox86: Make AVX2 also imply FMA3
Henrik Gramner [Sat, 2 Aug 2014 16:26:18 +0000 (18:26 +0200)]
x86: Make AVX2 also imply FMA3

All CPUs with AVX2 support FMA3 (but not the other way around).

10 years agoSimplify libx264 API usage example
Anton Mitrofanov [Thu, 13 Nov 2014 19:52:00 +0000 (22:52 +0300)]
Simplify libx264 API usage example

10 years agoAvxSynth: Remove a bunch of unused cruft
Henrik Gramner [Fri, 21 Nov 2014 22:47:20 +0000 (23:47 +0100)]
AvxSynth: Remove a bunch of unused cruft

10 years agoFix bugs/typos in motion compensation and cache_load
Anton Mitrofanov [Wed, 3 Dec 2014 19:36:12 +0000 (22:36 +0300)]
Fix bugs/typos in motion compensation and cache_load

Didn't affect output due to the incorrect values either not being used in the
code path or producing equal results compared to the correct values.

Also deduplicate hpel_ref arrays.

10 years agocheckasm: Fix undefined behavior warnings
Anton Mitrofanov [Sun, 30 Nov 2014 20:39:28 +0000 (23:39 +0300)]
checkasm: Fix undefined behavior warnings

10 years agocheckasm: Fix V210 reporting
Henrik Gramner [Sat, 29 Nov 2014 17:47:52 +0000 (18:47 +0100)]
checkasm: Fix V210 reporting

It would previously report FAILED if any of the earlier plane_copy tests failed.

10 years agoSafety check against malicious high bit-depth input which could cause crash
Anton Mitrofanov [Sun, 12 Oct 2014 17:01:53 +0000 (21:01 +0400)]
Safety check against malicious high bit-depth input which could cause crash

10 years agolibx264 API usage example
Anton Mitrofanov [Sun, 12 Oct 2014 16:45:40 +0000 (20:45 +0400)]
libx264 API usage example

10 years agox86: AVX2 high bit-depth var_16x16
Henrik Gramner [Fri, 17 Oct 2014 19:35:42 +0000 (21:35 +0200)]
x86: AVX2 high bit-depth var_16x16

40->27 cycles on Haswell.

10 years agocheckasm: Serialize read_time() calls on x86
Henrik Gramner [Wed, 8 Oct 2014 20:25:35 +0000 (22:25 +0200)]
checkasm: Serialize read_time() calls on x86

Improves the accuracy of benchmarks, especially in short functions.

To quote the Intel 64 and IA-32 Architectures Software Developer's Manual:
"The RDTSC instruction is not a serializing instruction. It does not necessarily
wait until all previous instructions have been executed before reading the counter.
Similarly, subsequent instructions may begin execution before the read operation
is performed. If software requires RDTSC to be executed only after all previous
instructions have completed locally, it can either use RDTSCP (if the processor
supports that instruction) or execute the sequence LFENCE;RDTSC."

RDTSCP would accomplish the same task, but it's only available since Nehalem.

This change makes SSE2 a requirement to run checkasm.
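
The LFENCE;RDTSC sequence from the quoted manual can be written with compiler intrinsics roughly as below. This is a sketch assuming GCC or clang on x86 with SSE2 enabled; the function name read_time_serialized is hypothetical, and the clock() fallback exists only so the example compiles on other targets.

```c
#include <stdint.h>

#if defined(__GNUC__) && defined(__SSE2__)
#include <x86intrin.h>

/* LFENCE waits until earlier instructions have completed locally, then
 * RDTSC reads the time-stamp counter. LFENCE is an SSE2 instruction,
 * hence the SSE2 requirement. */
uint64_t read_time_serialized(void)
{
    _mm_lfence();
    return __rdtsc();
}
#else
#include <time.h>

/* Fallback for non-x86 targets; much coarser than a cycle counter. */
uint64_t read_time_serialized(void)
{
    return (uint64_t)clock();
}
#endif
```

A benchmark would call this immediately before and after the function under test and subtract the two readings.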

10 years agoSupport case-independent string options
Vittorio Giovara [Mon, 29 Sep 2014 17:51:30 +0000 (18:51 +0100)]
Support case-independent string options
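
Matching option strings without regard to case can be sketched in portable C as follows. The helper names and the idea of looking a value up in a NULL-terminated name list are assumptions for illustration; the commit itself may use a different mechanism.

```c
#include <ctype.h>

/* Hypothetical helper: ASCII case-insensitive string equality.
 * Returns 1 if the strings match, 0 otherwise. */
int str_equal_nocase(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b; /* both must end at the same position */
}

/* Returns the index of `arg` in a NULL-terminated list of names,
 * ignoring case, or -1 if there is no match. */
int parse_enum_nocase(const char *arg, const char *const *names)
{
    for (int i = 0; names[i]; i++)
        if (str_equal_nocase(arg, names[i]))
            return i;
    return -1;
}
```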

10 years agoShut up gcc -Wuninitialized warnings
Anton Mitrofanov [Sat, 6 Sep 2014 16:44:49 +0000 (20:44 +0400)]
Shut up gcc -Wuninitialized warnings

10 years agoShut up clang -Wuninitialized warning
Anton Mitrofanov [Fri, 5 Sep 2014 15:43:52 +0000 (19:43 +0400)]
Shut up clang -Wuninitialized warning

10 years agoFix few clang -Wunused-* warnings
Anton Mitrofanov [Fri, 5 Sep 2014 15:30:47 +0000 (19:30 +0400)]
Fix few clang -Wunused-* warnings

10 years agoFix inappropriate instruction use
Anton Mitrofanov [Thu, 28 Aug 2014 16:13:13 +0000 (20:13 +0400)]
Fix inappropriate instruction use

10 years agox264asm: warn when inappropriate instruction used in function with specified cpuflags
Anton Mitrofanov [Thu, 28 Aug 2014 14:38:53 +0000 (18:38 +0400)]
x264asm: warn when inappropriate instruction used in function with specified cpuflags

10 years agoFix VBV with true VFR streams
Anton Mitrofanov [Mon, 1 Sep 2014 21:48:00 +0000 (01:48 +0400)]
Fix VBV with true VFR streams

10 years agoFix VBV
Anton Mitrofanov [Mon, 1 Sep 2014 18:45:00 +0000 (22:45 +0400)]
Fix VBV

10 years agoUpdate to the current lavf API and fix memory leak when using --seek
Anton Mitrofanov [Tue, 29 Jul 2014 23:03:32 +0000 (03:03 +0400)]
Update to the current lavf API and fix memory leak when using --seek

10 years agox86inc: Make INIT_CPUFLAGS support an arbitrary number of cpuflags
Henrik Gramner [Mon, 4 Aug 2014 23:42:55 +0000 (01:42 +0200)]
x86inc: Make INIT_CPUFLAGS support an arbitrary number of cpuflags

Previously there was a limit of two cpuflags.

10 years agox86: Minor pixel_ssim_end4 improvements
Henrik Gramner [Mon, 4 Aug 2014 23:42:51 +0000 (01:42 +0200)]
x86: Minor pixel_ssim_end4 improvements

Reduce the number of vector registers used from 7 to 5.
Eliminate some moves in the AVX implementation.
Avoid bypass delays for transitioning between int and float domains.

10 years agox86: Faster quant_4x4x4
Henrik Gramner [Mon, 4 Aug 2014 23:42:47 +0000 (01:42 +0200)]
x86: Faster quant_4x4x4

Also drop the MMX version instead of doing a bunch of ifdeffery to support it after this change.

10 years agoconfigure: improve cc_check for clang and ICL to not ignore unknown options
Anton Mitrofanov [Sun, 10 Aug 2014 18:46:12 +0000 (22:46 +0400)]
configure: improve cc_check for clang and ICL to not ignore unknown options

10 years agocheckasm: Only call x264_cpu_detect() once
Henrik Gramner [Mon, 4 Aug 2014 23:42:44 +0000 (01:42 +0200)]
checkasm: Only call x264_cpu_detect() once

10 years agoaarch64: deblocking NEON asm
Janne Grunau [Fri, 18 Jul 2014 13:49:10 +0000 (14:49 +0100)]
aarch64: deblocking NEON asm

Deblock chroma/luma are based on libav's h264 aarch64 NEON deblocking
filter which was ported by me from the existing ARM NEON asm. No
additional persons to ask for a relicense.

10 years agoaarch64: intra prediction NEON asm
Janne Grunau [Fri, 18 Jul 2014 08:29:35 +0000 (09:29 +0100)]
aarch64: intra prediction NEON asm

Ported from the ARM NEON asm.

10 years agoaarch64: motion compensation NEON asm
Janne Grunau [Thu, 17 Jul 2014 14:58:44 +0000 (15:58 +0100)]
aarch64: motion compensation NEON asm

Ported from the ARM NEON asm.