granicus.if.org Git - libx264/log
Loren Merritt [Sat, 1 Mar 2014 02:57:56 +0000 (02:57 +0000)]
x86inc: free up variable name "n" in global namespace
Henrik Gramner [Wed, 22 Jan 2014 18:09:12 +0000 (19:09 +0100)]
x86: Pass -Worphan-labels to yasm
Makes it easier to detect typos.
Steve Lhomme [Sun, 16 Feb 2014 12:15:09 +0000 (13:15 +0100)]
Write 3D metadata when outputting Matroska
For when --frame-packing is set.
Anton Mitrofanov [Sun, 23 Feb 2014 12:56:03 +0000 (16:56 +0400)]
Don't set chroma_loc_info_present_flag for non-4:2:0
The H.264 spec says it shouldn't be set in these cases.
Fiona Glaser [Mon, 10 Mar 2014 15:42:50 +0000 (08:42 -0700)]
x264.h: fix documentation
The full details of the return values of encoder_encode and encoder_headers
were mistakenly removed a while ago; re-add them.
Anton Mitrofanov [Sun, 23 Feb 2014 11:52:57 +0000 (15:52 +0400)]
Fix pointer cast warning for 64-bit builds
Anton Mitrofanov [Mon, 10 Mar 2014 12:48:02 +0000 (16:48 +0400)]
mbaff: fix mb_field_decoding_flag tracking and simplify allow skip check
Fixes an issue with too many forced non-skips in mbaff+cavlc, as well as
non-deterministic output with mbaff+cavlc+sliced-threads.
Anton Mitrofanov [Sun, 9 Mar 2014 23:22:57 +0000 (03:22 +0400)]
Fix memory overwrite in x264_deblock_h_chroma_mbaff_sse2
Fixes possible corruption with MBAFF+sliced threads.
Fiona Glaser [Sun, 2 Mar 2014 18:09:01 +0000 (10:09 -0800)]
Fix corruption with CAVLC overflow handling in MBAFF+main profile
Probably a regression in r2178.
Anton Mitrofanov [Mon, 10 Mar 2014 17:17:19 +0000 (21:17 +0400)]
Fix checkasm --bench output when nop_cycles is too large
Anton Mitrofanov [Wed, 22 Jan 2014 08:54:49 +0000 (12:54 +0400)]
Really fix quantization factor allocation
Actually allocate less (instead of just initialize less) and fix comments.
Yu Xiaolei [Sun, 23 Feb 2014 12:12:51 +0000 (04:12 -0800)]
Fix build with Android NDK
Android NDK does not expose sched_getaffinity.
Loren Merritt [Thu, 16 Jan 2014 21:34:46 +0000 (13:34 -0800)]
x86inc: speed up compilation with yasm
Work around yasm's inefficiency with handling large numbers of variables
in the global scope.
Kieran Kunhya [Fri, 10 Jan 2014 23:27:33 +0000 (23:27 +0000)]
Add support for AVC-Intra Class 200
James Weaver [Tue, 7 Jan 2014 10:31:58 +0000 (10:31 +0000)]
v210 input support
Assembly based on code by Henrik Gramner and Loren Merritt.
Fiona Glaser [Tue, 21 Jan 2014 21:39:33 +0000 (13:39 -0800)]
Fix quantization factor allocation
We don't need to wastefully allocate quant tables above QP_MAX_SPEC; they're
never used.
Henrik Gramner [Wed, 8 Jan 2014 00:06:56 +0000 (01:06 +0100)]
Avoid some unnecessary memory loads in macroblock_encode
Henrik Gramner [Sun, 5 Jan 2014 14:25:05 +0000 (15:25 +0100)]
Bump dates to 2014
Also update AUTHORS file and my e-mail address in the headers of various files.
Henrik Gramner [Sun, 5 Jan 2014 23:18:31 +0000 (00:18 +0100)]
Remove tools/xyuv.c
It's an old stand-alone application that isn't relevant to x264.
Anton Mitrofanov [Wed, 6 Nov 2013 22:37:23 +0000 (02:37 +0400)]
Use 8x16c wrappers with x86 asm functions for 4:2:2 with high bit depth
Henrik Gramner [Fri, 20 Dec 2013 21:44:28 +0000 (22:44 +0100)]
CLI: Avoid redundant 16-bit upconversions in piped raw input
It's not possible to seek in pipes, so if we want to skip frames we have to read and
discard unused ones. It's pointless to do bit-depth upconversions in those frames.
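A minimal sketch of the read-and-discard approach described above, assuming raw frames of a fixed byte size. The name `skip_frames` and its signature are hypothetical and are not taken from the x264 CLI.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: discard up to `count` raw frames of `frame_size`
 * bytes from a stream that cannot seek (e.g. a pipe). The bytes are read
 * into a scratch buffer and dropped without any bit-depth conversion.
 * Returns the number of whole frames actually discarded. */
int skip_frames(FILE *in, size_t frame_size, int count)
{
    assert(in != NULL && frame_size > 0);
    unsigned char *scratch = malloc(frame_size);
    if (!scratch)
        return 0;
    int skipped = 0;
    while (skipped < count && fread(scratch, 1, frame_size, in) == frame_size)
        skipped++;
    free(scratch);
    return skipped;
}
```

The caller can compare the return value with `count` to detect a stream that ended early.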
Anton Mitrofanov [Fri, 3 Jan 2014 16:06:06 +0000 (20:06 +0400)]
Fix input support from named pipes in Windows
Steve Clark [Wed, 20 Nov 2013 17:40:23 +0000 (21:40 +0400)]
Fix ARM asm compilation with Apple assembler
Anton Mitrofanov [Wed, 13 Nov 2013 15:24:48 +0000 (19:24 +0400)]
Fix uninitialized variable
Caused if the timebase is not specified in stats file. Found by Clang.
Anton Mitrofanov [Sun, 27 Oct 2013 15:27:23 +0000 (19:27 +0400)]
Remove --visualize option.
It probably wasn't used or maintained for the last few years.
Anton Mitrofanov [Tue, 15 Oct 2013 08:32:25 +0000 (12:32 +0400)]
Add L-SMASH support as preferable alternative for MP4-muxing
Kieran Kunhya [Sat, 21 Sep 2013 18:16:12 +0000 (19:16 +0100)]
Add AVC-Intra 1080p50/60 Class 100 parameters
Also add some compatibility fixes.
Fiona Glaser [Mon, 9 Sep 2013 19:37:59 +0000 (12:37 -0700)]
Add --filler option
Allows generation of hard-CBR streams without using NAL HRD.
Useful if you want to be able to reconfigure the bitrate (which you can't do
with NAL HRD on).
Anton Mitrofanov [Sun, 27 Oct 2013 11:22:51 +0000 (15:22 +0400)]
Make x264_encoder_reconfig more threadsafe
Do the reconfig when the next frame's encode begins.
Fixes some rare crashes with frame-threading and encoder_reconfig.
Fiona Glaser [Fri, 25 Oct 2013 00:19:00 +0000 (17:19 -0700)]
chroma-me: take shortcut in BI analysis
~100 cycles faster with subme>=9
Fiona Glaser [Thu, 24 Oct 2013 21:44:43 +0000 (14:44 -0700)]
CRF-max: don't warn if VBV underflow occurs
Only warn if underflow occurs for reasons other than CRF-max, as CRF-max
implies that VBV underflow is desired by the user.
Henrik Gramner [Fri, 18 Oct 2013 20:43:36 +0000 (22:43 +0200)]
x86inc: Make ym# behave the same way as xm#
This makes more sense for future implementations of templates with zmm registers.
Henrik Gramner [Fri, 18 Oct 2013 20:21:38 +0000 (22:21 +0200)]
Use calloc instead of malloc + memset
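As an illustration of the pattern (not the actual x264 call sites), a zeroed array can be requested in one step. The helper name `alloc_zeroed_counts` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate `n` zero-initialized 32-bit counters.
 * Equivalent in effect to malloc(n * sizeof(int32_t)) followed by
 * memset(..., 0, ...), but expressed as a single call. */
int32_t *alloc_zeroed_counts(size_t n)
{
    assert(n > 0);
    return calloc(n, sizeof(int32_t));
}
```

The caller still has to check the result for NULL and release it with free().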
Henrik Gramner [Thu, 10 Oct 2013 14:54:12 +0000 (16:54 +0200)]
Replace gf_malloc with regular malloc in mp4 muxer
It was used as a workaround for a bug that only existed in the GPAC repository
for a few weeks back in 2010. There's no reason to keep it anymore.
Anton Mitrofanov [Tue, 8 Oct 2013 19:20:40 +0000 (23:20 +0400)]
Update to current libav/ffmpeg API
Rafaël Carré [Fri, 25 Oct 2013 14:12:24 +0000 (07:12 -0700)]
version.sh: change to use /bin/sh
Sean McGovern [Wed, 4 Sep 2013 21:15:00 +0000 (14:15 -0700)]
configure: don't generate a git version number if .git isn't present
Martin Storsjo [Tue, 3 Sep 2013 21:56:18 +0000 (14:56 -0700)]
configure: include dependency libs in the Libs pkg-config
If only a static library is built, the user of the library that just
tries to link to the lib using the flags provided by pkg-config
might not know that only a static lib exists and that he'd have to
pass --static to pkg-config to get the internal dependencies to
be able to link the library.
For a shared build, the internal dependencies are kept in Libs.private
as before.
This matches how libav's pkg-config files are generated.
Anton Mitrofanov [Thu, 17 Oct 2013 20:38:06 +0000 (00:38 +0400)]
Fix compilation in case the HAVE_LOG2F check fails spuriously
Anton Mitrofanov [Sat, 12 Oct 2013 08:01:57 +0000 (12:01 +0400)]
Fix compilation of shared library for Windows with original MinGW toolchain
Anton Mitrofanov [Tue, 8 Oct 2013 19:32:37 +0000 (23:32 +0400)]
Fix possible crashes in resize and crop filters with high bitdepth input
Tim Mooney [Tue, 3 Sep 2013 20:43:50 +0000 (13:43 -0700)]
Fix INSTALL in configure for Solaris systems
Henrik Gramner [Tue, 27 Aug 2013 22:50:31 +0000 (00:50 +0200)]
Workaround for FFMS indexing bug
If FFMS_ReadIndex is used with an empty index file it gets stuck in an infinite loop instead of returning NULL
like it's supposed to do on failure. Explicitly check if the file is empty before calling it as a workaround.
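A rough sketch of such an emptiness check, assuming the index is a regular file addressed by path. The function name `index_file_is_empty` is hypothetical, and the call into FFMS itself is deliberately left out.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical pre-check: returns 1 if the file exists but holds no data,
 * 0 if at least one byte can be read, and -1 if it cannot be opened.
 * A caller would only hand the file to the index reader when this
 * returns 0. */
int index_file_is_empty(const char *path)
{
    assert(path != NULL);
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;
    int empty = (fgetc(f) == EOF);
    fclose(f);
    return empty;
}
```

Note that this sketch treats a read error on the first byte the same way as an empty file, which errs on the side of skipping the index.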
Anton Mitrofanov [Mon, 26 Aug 2013 17:20:31 +0000 (21:20 +0400)]
Fix masked access violation in KERNEL32
Caused crashes under gdb in Windows and might cause other unknown problems.
Hiroki Taniura [Sat, 24 Aug 2013 16:18:57 +0000 (01:18 +0900)]
Fix GPAC support on Windows
Henrik Gramner [Sun, 11 Aug 2013 17:50:42 +0000 (19:50 +0200)]
Windows Unicode support
Windows, unlike most other operating systems, uses UTF-16 for Unicode strings while x264 is designed for UTF-8.
This patch does the following in order to handle things like Unicode filenames:
* Keep strings internally as UTF-8.
* Retrieve the CLI command line as UTF-16 and convert it to UTF-8.
* Always use Unicode versions of Windows API functions and convert strings to UTF-16 when calling them.
* Attempt to use legacy 8.3 short filenames for external libraries without Unicode support.
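To illustrate the UTF-16 to UTF-8 step, here is a portable sketch of the conversion. This is an assumption-laden illustration: the patch itself may well rely on the Windows API for this, and `utf16_to_utf8` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert a NUL-terminated UTF-16 string to NUL-terminated UTF-8.
 * Returns the number of bytes written (excluding the NUL), or -1 on an
 * unpaired surrogate or when `dst_size` is too small. */
int utf16_to_utf8(const uint16_t *src, char *dst, size_t dst_size)
{
    assert(src != NULL && dst != NULL);
    size_t out = 0;
    while (*src) {
        uint32_t cp = *src++;
        if (cp >= 0xD800 && cp <= 0xDBFF) {        /* high surrogate */
            if (*src < 0xDC00 || *src > 0xDFFF)
                return -1;                         /* missing low surrogate */
            cp = 0x10000 + ((cp - 0xD800) << 10) + (uint32_t)(*src++ - 0xDC00);
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return -1;                             /* stray low surrogate */
        }
        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (out + need >= dst_size)
            return -1;                             /* keep room for the NUL */
        if (need == 1) {
            dst[out++] = (char)cp;
        } else if (need == 2) {
            dst[out++] = (char)(0xC0 | (cp >> 6));
            dst[out++] = (char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            dst[out++] = (char)(0xE0 | (cp >> 12));
            dst[out++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[out++] = (char)(0x80 | (cp & 0x3F));
        } else {
            dst[out++] = (char)(0xF0 | (cp >> 18));
            dst[out++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            dst[out++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[out++] = (char)(0x80 | (cp & 0x3F));
        }
    }
    if (out >= dst_size)
        return -1;
    dst[out] = '\0';
    return (int)out;
}
```

The reverse direction (UTF-8 to UTF-16 before calling a wide-character API) follows the same structure with the encoding steps swapped.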
Kieran Kunhya [Sat, 20 Jul 2013 17:47:59 +0000 (18:47 +0100)]
AVC-Intra support
This format has been reverse engineered and x264's output has almost exactly
the same bitstream as Panasonic cameras and encoders produce. It therefore does
not comply with SMPTE RP2027 since Panasonic themselves do not comply with
their own specification. It has been tested in Avid, Premiere, Edius and
Quantel.
Parts of this patch were written by Fiona Glaser and some reverse
engineering was done by Joseph Artsimovich.
Henrik Gramner [Mon, 8 Jul 2013 19:06:42 +0000 (12:06 -0700)]
Transparent hugepage support
Combine frame and mb data mallocs into a single large malloc.
Additionally, on Linux systems with hugepage support, ask for hugepages on
large mallocs.
This gives a small performance improvement (~0.2-0.9%) on systems without
hugepage support, as well as a small memory footprint reduction.
On recent Linux kernels with hugepage support enabled (set to madvise or
always), it improves performance up to 4% at the cost of about 7-12% more
memory usage on typical settings.
It may help even more on Haswell and other recent CPUs with improved 2MB page
support in hardware.
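A simplified sketch of asking for hugepages on a large allocation, assuming a 2 MB huge page size and a POSIX system. The name `alloc_maybe_huge` and the threshold are hypothetical and differ from x264's own allocator.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#ifdef __linux__
#include <sys/mman.h>
#endif

#define HUGE_PAGE_SIZE ((size_t)2 << 20)   /* assumption: 2 MB huge pages */

/* Hypothetical allocator: large requests are aligned to the huge page
 * size and, where the kernel offers MADV_HUGEPAGE, the whole huge pages
 * inside the block are marked as candidates for transparent hugepages.
 * Small requests just get 64-byte alignment. Release with free(). */
void *alloc_maybe_huge(size_t size)
{
    assert(size > 0);
    size_t align = size >= HUGE_PAGE_SIZE ? HUGE_PAGE_SIZE : 64;
    void *p = NULL;
    if (posix_memalign(&p, align, size) != 0)
        return NULL;
    assert(((uintptr_t)p & (align - 1)) == 0);
#if defined(__linux__) && defined(MADV_HUGEPAGE)
    if (size >= HUGE_PAGE_SIZE)
        /* Purely advisory: a failure here is ignored. */
        madvise(p, size & ~(HUGE_PAGE_SIZE - 1), MADV_HUGEPAGE);
#endif
    return p;
}
```

The hint only has an effect when the kernel's transparent hugepage setting is "madvise" or "always", which matches the conditions named in the commit message.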
Henrik Gramner [Fri, 5 Jul 2013 19:15:54 +0000 (21:15 +0200)]
x86: SSSE3 implementation of pixel_sad_x3 and pixel_sad_x4
Henrik Gramner [Fri, 5 Jul 2013 19:15:49 +0000 (21:15 +0200)]
x86: Faster AVX2 pixel_sad_x3 and pixel_sad_x4
Diogo Franco [Wed, 24 Jul 2013 01:17:44 +0000 (22:17 -0300)]
configure: Support cygwin64
Derek Buitenhuis [Fri, 9 Aug 2013 17:39:27 +0000 (13:39 -0400)]
x86inc: Check for __OUTPUT_FORMAT__ having a value of "x64"
This is also a valid value for WIN64.
Anton Mitrofanov [Tue, 23 Jul 2013 21:11:50 +0000 (14:11 -0700)]
Fix cases in which intra refresh allowed prediction from disallowed pixels
Anton Mitrofanov [Tue, 6 Aug 2013 21:56:34 +0000 (01:56 +0400)]
Fix a few minor bugs found with a static analyzer
Fiona Glaser [Fri, 12 Jul 2013 23:07:35 +0000 (16:07 -0700)]
Fix AVX2 detection bug with "limit CPUID" enabled in BIOS
Henrik Gramner [Fri, 5 Jul 2013 19:15:43 +0000 (21:15 +0200)]
x86: Remove X264_CPU_SSE_MISALIGN functions
Prevents a crash if the misaligned exception mask bit is cleared for some reason.
Misaligned SSE functions are only used on AMD Phenom CPUs and the benefit is minuscule.
They also require modifying the MXCSR control register and by removing those functions
we can get rid of that complexity altogether.
VEX-encoded instructions also support unaligned memory operands. I tried adding AVX
implementations of all removed functions but there were no performance improvements on
Ivy Bridge. pixel_sad_x3 and pixel_sad_x4 had significant code size reductions though
so I kept them and added some minor cosmetic fixes and tweaks.
Fiona Glaser [Thu, 20 Jun 2013 22:51:39 +0000 (15:51 -0700)]
Tweak i16x16-delta-quant-avoidance code
Don't omit the delta quant if it'd raise the quantizer to do so; this fixes
a rare flickering issue caused by deblocking.
Fiona Glaser [Sun, 9 Jun 2013 16:06:27 +0000 (09:06 -0700)]
x86: faster AVX2 iDCT, AVX deblock_luma_h, deblock_luma_h_intra
Lucien [Mon, 17 Jun 2013 18:28:09 +0000 (18:28 +0000)]
Add new color primaries, transfer characteristics, matrix coefficients
Fiona Glaser [Sat, 1 Jun 2013 00:01:29 +0000 (17:01 -0700)]
Add "--stitchable" option for segmented encoding
Stops x264 from attempting to optimize global stream headers, ensuring that
different segments of a video will have identical headers when used with
identical encoding settings.
Fiona Glaser [Thu, 27 Jun 2013 15:29:06 +0000 (08:29 -0700)]
Interface: if vbv-maxrate < bitrate, set bitrate = vbv-maxrate
This probably makes more sense to the user than setting vbv-maxrate = bitrate,
as before.
Anton Mitrofanov [Tue, 28 May 2013 12:02:42 +0000 (05:02 -0700)]
OpenCL cosmetics
Anton Mitrofanov [Mon, 17 Jun 2013 20:16:33 +0000 (00:16 +0400)]
Fix possible crash when writing very large filler NALUs
Bitstream-reallocation function didn't handle the case of filler.
Loren Merritt [Mon, 17 Jun 2013 18:27:09 +0000 (11:27 -0700)]
Fix build with PIC on some systems
Henrik Gramner [Sun, 2 Jun 2013 16:41:17 +0000 (18:41 +0200)]
Fix potential misalignment crash in AVX2 denoise_dct
Anton Mitrofanov [Mon, 27 May 2013 21:48:15 +0000 (01:48 +0400)]
Fix building with compilers without inline asm support
Also fix crash in high bit depth builds compiled with unaligned stack.
Anton Mitrofanov [Wed, 22 May 2013 18:43:59 +0000 (22:43 +0400)]
Fix compilation with OpenCL on MacOS X
Also fix crash in the case of OpenCL error during encoding.
Anton Mitrofanov [Mon, 6 May 2013 18:51:11 +0000 (22:51 +0400)]
OpenCL support improvement/refactoring
Autoload the OpenCL library so that it's not required to run an OpenCL-enabled
build of x264.
Update X264_BUILD, which should have been changed with the first patch.
Fiona Glaser [Thu, 16 May 2013 20:51:37 +0000 (13:51 -0700)]
x86: shave a few instructions off AVX deblock
Henrik Gramner [Tue, 14 May 2013 16:57:40 +0000 (18:57 +0200)]
x86: AVX2 dequant_4x4_dc
Henrik Gramner [Tue, 14 May 2013 16:53:12 +0000 (18:53 +0200)]
x86: AVX2 high bit-depth dequant
Fiona Glaser [Fri, 10 May 2013 00:20:05 +0000 (17:20 -0700)]
x86-64: 64-bit variant of AVX2 hpel_filter
~5% faster than 32-bit.
Henrik Gramner [Mon, 6 May 2013 16:41:24 +0000 (18:41 +0200)]
x86: AVX2 high bit-depth denoise_dct
28->15 cycles
Also reorder instructions to use fewer registers, 3 cycles faster on Ivy Bridge with 64-bit Windows.
Henrik Gramner [Sat, 4 May 2013 16:48:58 +0000 (18:48 +0200)]
x86: AVX2 high bit-depth quant
quant_4x4: 13->6 cycles
quant_4x4_dc: 14->8 cycles
quant_8x8: 47->24 cycles
quant_4x4x4: 48->25 cycles
Fiona Glaser [Wed, 1 May 2013 21:32:11 +0000 (14:32 -0700)]
x86: AVX2 add16x16_idct_dc
27 -> 19 cycles
Fiona Glaser [Mon, 29 Apr 2013 23:16:54 +0000 (16:16 -0700)]
x86: faster AVX2 quant_4x4x4
10->9 cycles
Fiona Glaser [Sun, 28 Apr 2013 04:03:32 +0000 (21:03 -0700)]
x86: AVX2 intra_sad_x3_8x8c
30->22 cycles
Henrik Gramner [Sun, 28 Apr 2013 09:11:03 +0000 (11:11 +0200)]
x86: AVX2 high bit-depth intra_sad_x3_8x8
43->24 cycles
Fiona Glaser [Wed, 24 Apr 2013 21:22:15 +0000 (14:22 -0700)]
x86: AVX2 deblock strength
30->18 cycles
Henrik Gramner [Wed, 1 May 2013 15:42:48 +0000 (17:42 +0200)]
x86: Faster high bit-depth intra_sad_x3_4x4
20->16 cycles on Ivy Bridge
Fiona Glaser [Wed, 1 May 2013 00:36:46 +0000 (17:36 -0700)]
x86: faster SSSE3 hpel
~7% faster using the pmulhrsw trick from mc_chroma.
Fiona Glaser [Mon, 29 Apr 2013 21:22:23 +0000 (14:22 -0700)]
x86-64: faster SSSE3 trellis
~2% faster trellis.
Fiona Glaser [Fri, 3 May 2013 00:10:26 +0000 (17:10 -0700)]
x86: 32-byte align the stack if possible
Avoids the need for manual 32 byte array alignment on compilers that support
-mpreferred-stack-boundary.
Henrik Gramner [Sat, 11 May 2013 21:39:09 +0000 (23:39 +0200)]
x86inc: Utilize the shadow space on 64-bit Windows
Store XMM6 and XMM7 in the shadow space in functions that clobber them.
This way we don't have to adjust the stack pointer as often,
reducing the number of instructions as well as code size.
Henrik Gramner [Fri, 3 May 2013 21:06:10 +0000 (23:06 +0200)]
x86: Don't use explicitly aligned versions of SAD on AVX CPUs
On modern CPUs movdqu isn't slower than movdqa when used on aligned data and using the same code in both cases saves cache.
This was already done for the high bit-depth AVX2 implementation but the aligned version still exists as dead code so remove that.
Henrik Gramner [Fri, 3 May 2013 18:18:03 +0000 (20:18 +0200)]
x86: Add missing initializations for high bit-depth sad_aligned
Fiona Glaser [Mon, 13 May 2013 23:52:18 +0000 (16:52 -0700)]
x86: add Jaguar CPU detection
Henrik Gramner [Tue, 7 May 2013 15:21:03 +0000 (17:21 +0200)]
x86inc: Remove .rodata kludges
The Mach-O bug was fixed in yasm 0.8.0 and we don't support versions that old.
a.out was superseded by ELF on sane systems a few decades ago.
Henrik Gramner [Sat, 4 May 2013 14:21:32 +0000 (16:21 +0200)]
checkasm: Use 64-bit cycle counters
Prevents overflows that can occur in some cases.
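To show the kind of overflow a 32-bit counter can hit, here is a small sketch that accumulates the same samples at both widths. The function names are hypothetical and this is not the checkasm code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Accumulate cycle samples in a 32-bit total: wraps modulo 2^32. */
uint32_t total_cycles32(const uint32_t *samples, size_t n)
{
    assert(samples != NULL || n == 0);
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += samples[i];
    return sum;
}

/* Accumulate the same samples in a 64-bit total: no wrap in practice. */
uint64_t total_cycles64(const uint32_t *samples, size_t n)
{
    assert(samples != NULL || n == 0);
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += samples[i];
    return sum;
}
```

Two samples of 3,000,000,000 cycles sum to 6,000,000,000 in the 64-bit version, while the 32-bit version wraps around to 1,705,032,704.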
Henrik Gramner [Fri, 10 May 2013 11:55:32 +0000 (13:55 +0200)]
checkasm: Fix stack alignment bug
Fiona Glaser [Wed, 8 May 2013 17:48:41 +0000 (10:48 -0700)]
Fix invalid memcpy in sliced-threads
Likely didn't actually break in practice, but memcpy with src==dst
is incorrect.
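One way to avoid the problem is to skip the copy when source and destination are the same pointer. This is a sketch of that guard, not necessarily the fix applied in the commit, and `copy_unless_same` is a hypothetical name.

```c
#include <assert.h>
#include <string.h>

/* Copy `n` bytes unless `dst` and `src` are the same address, in which
 * case the data is already in place and memcpy is not called at all.
 * Partially overlapping ranges are NOT handled; those need memmove. */
void *copy_unless_same(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    if (dst != src)
        memcpy(dst, src, n);
    return dst;
}
```

The C standard leaves memcpy undefined for overlapping objects, which includes the case where both pointers are identical, even though common implementations happen to tolerate it.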
Fiona Glaser [Mon, 29 Apr 2013 19:14:01 +0000 (12:14 -0700)]
Fix two bugs in slice-min-mbs and slices-max
Slices-max broke slice-max-size when slices-max wasn't used.
Slice-min-mbs broke in rare cases near the end of a threadslice.
Fiona Glaser [Fri, 5 Apr 2013 01:00:23 +0000 (18:00 -0700)]
x86: SSSE3 LUT-based faster coeff_level_run
~2x faster coeff_level_run.
Faster CAVLC encoding: {1%,2%,7%} overall with {superfast,medium,slower}.
Uses the same pshufb LUT abuse trick as in the previous ads_mvs patch.
Fiona Glaser [Mon, 25 Mar 2013 21:03:37 +0000 (14:03 -0700)]
x86-64: BMI2 cabac_residual functions
Fiona Glaser [Wed, 20 Mar 2013 22:08:35 +0000 (15:08 -0700)]
x86: SSSE3 ads_mvs
~55% faster ads in benchasm, ~15-30% in real encoding.
~4% faster "placebo" preset overall.
Henrik Gramner [Tue, 16 Apr 2013 21:27:53 +0000 (23:27 +0200)]
x86: AVX2 pixel_ssd_nv12_core
Henrik Gramner [Tue, 16 Apr 2013 21:27:50 +0000 (23:27 +0200)]
x86: AVX2 high bit-depth pixel_ssd
Henrik Gramner [Tue, 16 Apr 2013 21:27:46 +0000 (23:27 +0200)]
x86: AVX2 high bit-depth pixel_sad_x3/pixel_sad_x4
Also reduce the number of xmm registers used by sse2/ssse3 pixel_sad_x3.
Henrik Gramner [Tue, 16 Apr 2013 21:27:43 +0000 (23:27 +0200)]
x86: AVX2 high bit-depth vsad
Henrik Gramner [Tue, 16 Apr 2013 21:27:39 +0000 (23:27 +0200)]
x86: AVX2 high bit-depth pixel_sad
Also use loops instead of duplicating code; reduces code size by ~10kB with
negligible effect on performance.