granicus.if.org Git - libx264/log
Henrik Gramner [Sat, 8 Aug 2015 10:21:54 +0000 (12:21 +0200)]
Simplify version.sh
Also remove some non-POSIX syntax and improve robustness.
As a bonus the script now runs about 2-3 times faster.
`git rev-list --count` could be used to simplify things even further,
but that functionality was added in git 1.7.2 so keep `wc -l` for now
to maintain compatibility with older git versions.
장영훈 [Fri, 7 Aug 2015 05:43:24 +0000 (14:43 +0900)]
msvs: Fix cl detection in non-English environments
Henrik Gramner [Mon, 3 Aug 2015 19:05:11 +0000 (21:05 +0200)]
x86inc: Sync minor changes from ffmpeg/libav
Henrik Gramner [Wed, 29 Jul 2015 17:30:52 +0000 (19:30 +0200)]
matroska: Add comments for the remaining element names
Henrik Gramner [Wed, 29 Jul 2015 17:30:41 +0000 (19:30 +0200)]
Silence various static analyzer warnings
Those are false positives, but it doesn't hurt to get rid of them.
Henrik Gramner [Sun, 26 Jul 2015 21:13:29 +0000 (23:13 +0200)]
mingw: Enable the tsaware linker flag
Avoids an irrelevant compatibility layer in Terminal Services environments.
https://msdn.microsoft.com/en-us/library/cc834995.aspx
Henrik Gramner [Sun, 26 Jul 2015 21:13:26 +0000 (23:13 +0200)]
msvs: Don't redefine snprintf for VS2015
Visual Studio 2015 has a proper snprintf implementation.
Henrik Gramner [Sun, 26 Jul 2015 21:13:19 +0000 (23:13 +0200)]
msvs: Prefer link.exe from the same directory as cl.exe
/usr/bin/link from coreutils may be located before the MSVS linker in $PATH
which causes linking to fail due to using the wrong binary.
Henrik Gramner [Sun, 26 Jul 2015 22:10:00 +0000 (00:10 +0200)]
frame_dump: check fseek() return value
Henrik Gramner [Sun, 26 Jul 2015 22:08:38 +0000 (00:08 +0200)]
x264_vfprintf: use va_copy
It's undefined behavior to use the same va_list twice.
This most likely didn't cause any issues in practice since the string would
have to be larger than 4 KiB to trigger the fallback path.
Use workaround for ICL as it doesn't define va_copy even for C99.
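To illustrate the va_list problem described above, here is a minimal C sketch. It is not the actual x264_vfprintf code: the names format_alloc and format_alloc_v are hypothetical, and the 16-byte stack buffer stands in for the 4 KiB one so the fallback path is easy to reach.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from x264): format into a small stack buffer
 * first, then fall back to a heap buffer if the output did not fit. */
static char *format_alloc_v(const char *fmt, va_list ap)
{
    char small[16];            /* assumed size, tiny on purpose */
    char *buf = NULL;
    va_list ap2;

    /* The first vsnprintf() leaves ap indeterminate, so take a copy
     * up front for the possible second pass. */
    va_copy(ap2, ap);

    int len = vsnprintf(small, sizeof(small), fmt, ap);
    if (len >= 0) {
        buf = malloc((size_t)len + 1);
        if (buf) {
            if ((size_t)len < sizeof(small))
                memcpy(buf, small, (size_t)len + 1);       /* it fit */
            else
                vsnprintf(buf, (size_t)len + 1, fmt, ap2); /* fallback */
        }
    }
    va_end(ap2);
    return buf;                /* caller frees; NULL on error */
}

static char *format_alloc(const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    char *s = format_alloc_v(fmt, ap);
    va_end(ap);
    return s;
}
```

Reusing ap for the second vsnprintf() call instead of ap2 would be the undefined behavior the commit message refers to.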
Henrik Gramner [Sun, 26 Jul 2015 22:08:31 +0000 (00:08 +0200)]
param_parse: Fix framerate rounding issues
Marcin Juszkiewicz [Mon, 1 Jun 2015 09:24:45 +0000 (11:24 +0200)]
aarch64: Remove broken CFLAGS in configure
GCC doesn't have an "-arch" switch, but works when that entire line is removed.
Rong Yan [Mon, 20 Jul 2015 08:34:20 +0000 (03:34 -0500)]
ppc: Add little-endian PowerPC support
Rishikesh More [Thu, 18 Jun 2015 12:18:46 +0000 (17:48 +0530)]
mips: MSA quant optimizations
Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>
Rishikesh More [Thu, 18 Jun 2015 12:18:45 +0000 (17:48 +0530)]
mips: MSA predict optimizations
Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>
Rishikesh More [Thu, 18 Jun 2015 12:18:44 +0000 (17:48 +0530)]
mips: MSA pixel optimizations
Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>
Rishikesh More [Thu, 18 Jun 2015 12:18:43 +0000 (17:48 +0530)]
mips: MSA deblock optimizations
Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>
Rishikesh More [Thu, 18 Jun 2015 12:18:42 +0000 (17:48 +0530)]
mips: MSA dct optimizations
Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>
Rishikesh More [Thu, 18 Jun 2015 12:18:40 +0000 (17:48 +0530)]
mips: MSA mc optimizations
Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>
Rishikesh More [Thu, 18 Jun 2015 12:18:38 +0000 (17:48 +0530)]
mips: Common MSA macros
Add macros for load/store, slide, shift, transpose and basic arithmetic
operations required by subsequent patches.
Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>
Rishikesh More [Tue, 12 May 2015 14:08:09 +0000 (19:38 +0530)]
mips: Add MSA support to checkasm
Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>
Kaustubh Raste [Fri, 17 Apr 2015 12:08:58 +0000 (17:38 +0530)]
mips: Initial MSA support
MSA is the MIPS SIMD Architecture.
Add X264_CPU_MSA define.
Update configure to detect MIPS platform and set flags.
CPU-specific gcc options are expected through --extra-cflags.
Sample command line for mips32r5:
./configure --host=mipsel-linux-gnu --cross-prefix=<TOOLCHAIN>/mips-mti-linux-gnu- \
            --extra-cflags="-EL -mips32r5 -msched-weight -mload-store-pairs"
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Anton Mitrofanov [Thu, 16 Jul 2015 21:22:29 +0000 (00:22 +0300)]
Limit autodetection of threads number according to the source height
Anton Mitrofanov [Thu, 16 Jul 2015 16:04:59 +0000 (19:04 +0300)]
Fine-tune of frame's size predictors at ratecontrol start
This is an attempt to improve VBV at the start of a video when many threads
are used, since they delay feedback for the predictors.
Anton Mitrofanov [Thu, 16 Jul 2015 13:15:56 +0000 (16:15 +0300)]
Use forced frame types in slicetype analysis
This should improve MBTree and VBV when a lot of forced frame types are used.
Henrik Gramner [Mon, 1 Dec 2014 21:05:42 +0000 (22:05 +0100)]
x86: SSSE3 and AVX2 implementations of plane_copy_swap
For NV21 input.
Yu Xiaolei [Fri, 6 Jun 2014 08:05:27 +0000 (16:05 +0800)]
NV21 input support
Eliminates an extra copy when encoding Android camera preview images.
Checkasm test by Janne Grunau.
ARM assembly with improvements from Janne Grunau.
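For reference, NV21 stores interleaved chroma as V,U pairs while NV12 stores U,V pairs, so converting between them amounts to swapping the two bytes of each pair. The following is a plain C sketch of that idea under stated assumptions: the function name swap_chroma_pairs and its parameters are hypothetical and do not reproduce x264's plane_copy_swap signature.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference routine: copy an interleaved chroma plane while
 * swapping the two bytes of every pair (V,U -> U,V or the reverse).
 * Strides are in bytes; pairs_per_row is the number of chroma samples
 * per row, so each row holds 2 * pairs_per_row bytes. */
static void swap_chroma_pairs(uint8_t *dst, ptrdiff_t dst_stride,
                              const uint8_t *src, ptrdiff_t src_stride,
                              int pairs_per_row, int rows)
{
    for (int y = 0; y < rows; y++) {
        for (int x = 0; x < 2 * pairs_per_row; x += 2) {
            dst[x]     = src[x + 1];
            dst[x + 1] = src[x];
        }
        dst += dst_stride;
        src += src_stride;
    }
}
```

Doing the swap during the copy into the encoder's own frame buffer is what avoids a separate conversion pass over the input.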
Henrik Gramner [Tue, 23 Jun 2015 15:00:47 +0000 (17:00 +0200)]
deblock: Write combining
Henrik Gramner [Tue, 23 Jun 2015 12:59:59 +0000 (14:59 +0200)]
Get rid of some tabs and trailing whitespaces
Henrik Gramner [Sat, 23 May 2015 17:44:16 +0000 (19:44 +0200)]
x86: Experimental nasm support
Enables the use of nasm as an alternative to yasm.
Note that nasm cannot assemble x264 with PIC enabled since it currently doesn't
support [symbol-$$] addressing which is used extensively by x264's PIC code.
This includes all 64-bit Windows and 64-bit OS X builds, even non-shared.
For the above reason nasm is currently intentionally not auto-detected, instead
the assembler must be explicitly specified using "AS=nasm ./configure".
Also drop -O2 from ASFLAGS since it's simply ignored anyway.
Timothy Gu [Tue, 26 May 2015 17:12:42 +0000 (19:12 +0200)]
x86inc: Prevent warnings when using `struc` and `endstruc`
struc and endstruc attempt to revert to the previous section state set by
the SECTION macro.
Use the primitive [SECTION] directive instead of the SECTION macro for the
.note.GNU-stack section to prevent it from being emitted again during endstruc.
Henrik Gramner [Wed, 27 May 2015 19:38:14 +0000 (21:38 +0200)]
x86inc: Drop SECTION_TEXT macro
The .text section is already 16-byte aligned by default on all supported
platforms so `SECTION_TEXT` isn't any different from `SECTION .text`.
Henrik Gramner [Sat, 23 May 2015 11:38:05 +0000 (13:38 +0200)]
x86inc: Disable vpbroadcastq workaround in newer yasm versions
The bug was fixed in 1.3.0, so only perform the workaround in earlier versions.
Henrik Gramner [Sun, 24 May 2015 20:57:00 +0000 (22:57 +0200)]
Prefer Unicode versions of Windows API calls
Just for consistency, doesn't affect behavior.
Henrik Gramner [Sun, 24 May 2015 21:21:20 +0000 (23:21 +0200)]
Get rid of fPIC warnings when compiling a shared library on Windows
PIC is always enabled when compiling for Windows so gcc complains when using
-fPIC since it doesn't do anything.
Henrik Gramner [Sat, 25 Jul 2015 20:42:59 +0000 (22:42 +0200)]
matroska: Write the correct DocTypeVersion when using frame-packing
The StereoMode element is only valid with DocTypeVersion 3 or higher.
Anton Mitrofanov [Fri, 24 Jul 2015 21:21:52 +0000 (00:21 +0300)]
dump_yuv: Fix file handle leak
Anton Mitrofanov [Fri, 24 Jul 2015 21:20:47 +0000 (00:20 +0300)]
mp4: Fix file handle leak
Henrik Gramner [Tue, 23 Jun 2015 22:40:45 +0000 (00:40 +0200)]
flv: Check fseek() and fwrite() return values
Henrik Gramner [Tue, 23 Jun 2015 22:22:56 +0000 (00:22 +0200)]
flv: Fix memory and file handle leaks
Henrik Gramner [Tue, 23 Jun 2015 23:23:35 +0000 (01:23 +0200)]
avs: Fix file handle leak
Henrik Gramner [Tue, 23 Jun 2015 11:38:02 +0000 (13:38 +0200)]
matroska: Fix memory leak
Henrik Gramner [Tue, 23 Jun 2015 11:24:29 +0000 (13:24 +0200)]
rdo: Fix potential CAVLC overflow issues
Henrik Gramner [Tue, 23 Jun 2015 20:08:35 +0000 (22:08 +0200)]
slurp_file: Various minor bug fixes
* Fix unsigned <= 0 check.
* Add additional size sanity check on 32-bit systems.
* Don't read uninitialized data if fread() fails.
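The first two points can be sketched in C as follows. This is an illustrative assumption about the kind of check involved, not the actual slurp_file code; the helper name size_from_offset is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: validate a file size obtained as a signed offset
 * (for example from ftell()) before using it as an allocation size.
 * Returns 0 and stores the size on success, -1 otherwise. */
static int size_from_offset(int64_t raw, size_t *out)
{
    /* Test for errors and empty files while the value is still signed.
     * After conversion to an unsigned type, "<= 0" only catches 0 and a
     * negative error value turns into a huge positive size. */
    if (raw <= 0)
        return -1;

    /* On 32-bit systems size_t may be narrower than the file offset.
     * Reject sizes that cannot be represented, leaving room for one
     * extra byte such as a terminator. */
    if ((uint64_t)raw >= SIZE_MAX)
        return -1;

    *out = (size_t)raw;
    return 0;
}
```

For the third point, the usual pattern is to compare the fread() return value against the requested count and discard the buffer on a short read.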
Henrik Gramner [Tue, 23 Jun 2015 20:47:53 +0000 (22:47 +0200)]
param_parse: Check strdup() return value
Henrik Gramner [Tue, 23 Jun 2015 13:38:16 +0000 (15:38 +0200)]
param_parse: Fix memory leak
Anton Mitrofanov [Fri, 19 Jun 2015 13:01:12 +0000 (16:01 +0300)]
Add FreeBSD's stdint.h header guard to allowed list
Patch written by Koop Mast <kwm@FreeBSD.org>
Henrik Gramner [Fri, 22 May 2015 17:23:33 +0000 (19:23 +0200)]
x86: Prevent overread of src in plane_copy_interleave
Could only occur in 4:2:2 with height == 1.
Also enable asm for inputs with different U/V strides as long as the strides
have identical signs.
Anton Mitrofanov [Wed, 20 May 2015 20:10:20 +0000 (23:10 +0300)]
checkasm: Fix incorrect memcmp size for ARM architecture
Anton Mitrofanov [Sun, 26 Apr 2015 17:51:05 +0000 (20:51 +0300)]
Fix possible use of uninitialized MVs in lookahead analysis for B-frames
Anton Mitrofanov [Tue, 21 Apr 2015 20:08:19 +0000 (23:08 +0300)]
Catch incorrect usage of libx264 API for delayed frames flushing
Anton Mitrofanov [Sat, 7 Mar 2015 20:00:09 +0000 (23:00 +0300)]
Fix detection of system libx264 configuration
Anton Mitrofanov [Mon, 23 Feb 2015 11:23:18 +0000 (14:23 +0300)]
Cosmetic changes
Anton Mitrofanov [Tue, 30 Dec 2014 23:15:05 +0000 (02:15 +0300)]
Update configure for auto detection of system libx264 configuration
Anton Mitrofanov [Tue, 3 Feb 2015 11:51:28 +0000 (14:51 +0300)]
Add tile format frame packing value
Defined in 2014-02 edition.
Anton Mitrofanov [Tue, 3 Feb 2015 10:39:14 +0000 (13:39 +0300)]
Stricter validation of crop-rect values
Vittorio Giovara [Tue, 20 Jan 2015 16:15:56 +0000 (16:15 +0000)]
Add mono frame packing value
Defined in 2013-04 edition.
Vittorio Giovara [Tue, 20 Jan 2015 15:57:41 +0000 (15:57 +0000)]
Validate frame packing value instead of clipping
Christophe Gisquet [Tue, 3 Feb 2015 19:40:41 +0000 (20:40 +0100)]
x86inc: Correctly warn on use of SSE2 instructions in SSE functions
SSE2 instructions that are XMM-implementations of pre-existing MMX/MMX2
instructions did not issue warnings when used in SSE functions. Handle
it by also checking the register type when such instructions are used.
Christophe Gisquet [Tue, 3 Feb 2015 17:02:30 +0000 (18:02 +0100)]
x86inc: Fix instantiation of YMM registers
Vittorio Giovara [Tue, 20 Jan 2015 16:28:54 +0000 (16:28 +0000)]
matroska: Correctly write display width and height in stereo mode
According to the specifications, when stereo mode is set, these values
represent the single view size.
Kieran Kunhya [Tue, 20 Jan 2015 15:38:00 +0000 (09:38 -0600)]
Use POC type 0 for AVC-Intra
Based on a patch from Capella Systems
Anton Mitrofanov [Sat, 3 Jan 2015 12:46:19 +0000 (15:46 +0300)]
Fix ARCH variable name conflict with BSD ports (bsd.port.mk) read-only variable
Anton Mitrofanov [Sat, 27 Dec 2014 17:35:39 +0000 (20:35 +0300)]
Fix negative percentages in final stats output
They were caused by integer overflow when encoding long UHD video.
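A generic way to avoid this class of bug is to keep the counters in 64-bit integers and do the percentage arithmetic in floating point. The sketch below only illustrates that idea; the function name percent_of is hypothetical and the actual x264 fix may differ.

```c
#include <stdint.h>

/* Hypothetical helper: percentage of part relative to total.
 * Using 64-bit counters and double arithmetic means an intermediate
 * such as part * 100 cannot wrap the way it can in a 32-bit int. */
static double percent_of(int64_t part, int64_t total)
{
    if (total <= 0)
        return 0.0;
    return 100.0 * (double)part / (double)total;
}
```

With 32-bit int arithmetic, a count above roughly 21 million already overflows when multiplied by 100, which long high-resolution encodes can reach.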
Anton Mitrofanov [Sat, 3 Jan 2015 20:35:23 +0000 (23:35 +0300)]
Bump dates to 2015
Anton Mitrofanov [Mon, 15 Dec 2014 15:49:23 +0000 (18:49 +0300)]
x86: Update intel compiler cpu dispatcher override for new versions of ICC/ICL
Anton Mitrofanov [Tue, 6 Sep 2011 17:53:29 +0000 (21:53 +0400)]
New AQ mode: auto-variance AQ with bias to dark scenes
Also known as --aq-mode 3 or auto-variance AQ modification.
Anton Mitrofanov [Tue, 28 Aug 2012 23:02:27 +0000 (03:02 +0400)]
Improve HRD conformance
Henrik Gramner [Fri, 28 Nov 2014 22:24:56 +0000 (23:24 +0100)]
x86: SSE and AVX implementations of plane_copy
Also remove the MMX2 implementation and fix src overread for height == 1.
Anton Mitrofanov [Mon, 29 Sep 2014 19:26:19 +0000 (23:26 +0400)]
Update to the latest version of gas-preprocessor.pl from http://git.libav.org/?p=gas-preprocessor.git
Contributions by Janne Grunau, Martin Storsjo, Mans Rullgard, David Conrad, Martin Aumuller and others
Janne Grunau [Tue, 18 Nov 2014 23:33:55 +0000 (00:33 +0100)]
aarch64: cabac_encode_{decision,bypass,terminal}_asm
benchmarks on a Nexus 9 (nvidia denver):
101.3 cycles in x264_cabac_encode_decision_c, 67105369 runs, 3495 skips
97.3 cycles in x264_cabac_encode_decision_asm, 67105493 runs, 3371 skips
132.8 cycles in x264_cabac_encode_terminal_c, 1046950 runs, 1626 skips
116.1 cycles in x264_cabac_encode_terminal_asm, 1048424 runs, 152 skips
92.4 cycles in x264_cabac_encode_bypass_c, 16776192 runs, 1024 skips
89.6 cycles in x264_cabac_encode_bypass_asm, 16776453 runs, 763 skips
Cycle counts are not as stable as one would like. The dynamic code
optimisation seems to produce different results for small changes in a
binary. Repeated runs with the same binary produce stable results
though (ignoring the first run).
Janne Grunau [Thu, 6 Nov 2014 08:20:17 +0000 (09:20 +0100)]
checkasm: add cycle counter read for aarch64
Needs kernel support since user space access to the cycle counter is not
allowed on all available AArch64 systems (Android 5 and iOS).
Janne Grunau [Wed, 5 Nov 2014 10:35:13 +0000 (11:35 +0100)]
aarch64: nal_escape_neon
3-4 times faster.
Janne Grunau [Fri, 31 Oct 2014 13:49:04 +0000 (14:49 +0100)]
aarch64: {plane_copy,memcpy_aligned,memzero_aligned}_neon
2-3 times faster than C.
Janne Grunau [Wed, 29 Oct 2014 17:17:48 +0000 (18:17 +0100)]
aarch64: x264_mbtree_propagate_{cost,list}_neon
x264_mbtree_propagate_cost_neon is ~7 times faster.
x264_mbtree_propagate_list_neon is 33% faster.
Janne Grunau [Tue, 21 Oct 2014 13:18:49 +0000 (15:18 +0200)]
aarch64: x264_denoise_dct_neon
3.5 times faster.
Janne Grunau [Mon, 20 Oct 2014 11:12:14 +0000 (13:12 +0200)]
aarch64: x264_coeff_level_run{4,8,15,16}
All functions ~33% faster.
Janne Grunau [Tue, 14 Oct 2014 17:20:52 +0000 (19:20 +0200)]
aarch64: NEON asm for intra luma deblocking
deblock_luma_intra[0]_neon is 2 times faster,
deblock_luma_intra[1]_neon is ~4 times faster.
Janne Grunau [Mon, 13 Oct 2014 15:29:22 +0000 (17:29 +0200)]
aarch64: x264_deblock_h_chroma_422_neon
deblock_h_chroma_422 2.5 times faster
Janne Grunau [Mon, 13 Oct 2014 10:43:50 +0000 (12:43 +0200)]
aarch64: x264_deblock_h_chroma_mbaff_neon
deblock_chroma_420_mbaff_neon 2 times faster
Janne Grunau [Fri, 10 Oct 2014 08:29:15 +0000 (10:29 +0200)]
aarch64: NEON asm for intra chroma deblocking
deblock_h_chroma_420_intra, deblock_h_chroma_422_intra and
x264_deblock_h_chroma_intra_mbaff_neon are ~3 times faster.
deblock_chroma_intra[1] is ~4 times faster than C.
Janne Grunau [Tue, 2 Sep 2014 08:27:22 +0000 (10:27 +0200)]
aarch64: add myself as author to aarch64/mc.h
Janne Grunau [Thu, 14 Aug 2014 13:22:50 +0000 (14:22 +0100)]
aarch64: NEON asm for integral init
integral_init4h_neon and integral_init8h_neon are 3-4 times faster than
C. integral_init8v_neon is 6 times faster and integral_init4v_neon is 10
times faster.
Janne Grunau [Wed, 13 Aug 2014 12:30:53 +0000 (13:30 +0100)]
aarch64: NEON asm for 8x16c intra prediction
Between 10% and 40% faster than C.
Janne Grunau [Tue, 12 Aug 2014 15:26:10 +0000 (17:26 +0200)]
aarch64: NEON asm for decimate_score
decimate_score15 and 16 are 60% faster, decimate_score64 is 4 times
faster than C.
Janne Grunau [Fri, 8 Aug 2014 10:19:35 +0000 (11:19 +0100)]
aarch64: implement x264_sub8x16_dct_dc_neon
4 times faster than C.
Janne Grunau [Thu, 7 Aug 2014 17:46:07 +0000 (19:46 +0200)]
aarch64: implement x264_pixel_asd8_neon
7 times faster than C.
Janne Grunau [Thu, 7 Aug 2014 14:49:12 +0000 (16:49 +0200)]
aarch64: NEON asm for 4x16 sad, satd and ssd
pixel_sad_4x16_neon: 33% faster than C
pixel_satd_4x16_neon: 5 times faster
pixel_ssd_4x16_neon: 4 times faster
Janne Grunau [Wed, 30 Jul 2014 14:48:25 +0000 (15:48 +0100)]
aarch64: implement x264_pixel_ssd_nv12_core_neon
13 times faster than C.
Janne Grunau [Tue, 29 Jul 2014 17:26:11 +0000 (18:26 +0100)]
aarch64: implement x264_pixel_vsad_neon
35 times faster than C.
Janne Grunau [Tue, 29 Jul 2014 10:06:24 +0000 (11:06 +0100)]
aarch64: NEON asm for missing x264_zigzag_* functions
zigzag_scan_4x4_field_neon, zigzag_sub_4x4_field_neon,
zigzag_sub_4x4ac_field_neon, zigzag_sub_4x4_frame_neon,
zigzag_sub_4x4ac_frame_neon more than 2 times faster
zigzag_scan_8x8_frame_neon, zigzag_scan_8x8_field_neon,
zigzag_sub_8x8_field_neon, zigzag_sub_8x8_frame_neon 4-5 times faster
zigzag_interleave_8x8_cavlc_neon 6 times faster
Janne Grunau [Fri, 25 Jul 2014 10:53:17 +0000 (11:53 +0100)]
aarch64: implement x264_pixel_sa8d_satd_16x16_neon
~20% faster than calling pixel_sa8d_16x16 and pixel_satd_16x16
separately.
Janne Grunau [Thu, 14 Aug 2014 21:13:27 +0000 (23:13 +0200)]
aarch64: optimize x264_predict_8x8c_dc_left_neon
25% faster than the previous version.
Henrik Gramner [Sat, 2 Aug 2014 16:26:18 +0000 (18:26 +0200)]
x86: Make AVX2 also imply FMA3
All CPUs with AVX2 support FMA3 (but not the other way around).
Anton Mitrofanov [Thu, 13 Nov 2014 19:52:00 +0000 (22:52 +0300)]
Simplify libx264 API usage example
Henrik Gramner [Fri, 21 Nov 2014 22:47:20 +0000 (23:47 +0100)]
AvxSynth: Remove a bunch of unused cruft
Anton Mitrofanov [Wed, 3 Dec 2014 19:36:12 +0000 (22:36 +0300)]
Fix bugs/typos in motion compensation and cache_load
Didn't affect output due to the incorrect values either not being used in the
code path or producing equal results compared to the correct values.
Also deduplicate hpel_ref arrays.
Anton Mitrofanov [Sun, 30 Nov 2014 20:39:28 +0000 (23:39 +0300)]
checkasm: Fix undefined behavior warnings
Henrik Gramner [Sat, 29 Nov 2014 17:47:52 +0000 (18:47 +0100)]
checkasm: Fix V210 reporting
It would previously report FAILED if any of the earlier plane_copy tests failed.
Anton Mitrofanov [Sun, 12 Oct 2014 17:01:53 +0000 (21:01 +0400)]
Safety check against malicious high bit-depth input which could cause crash