granicus.if.org Git - libx264/log
libx264
10 years agoarm: correct x264_mc_chroma_neon function declaration
Janne Grunau [Tue, 1 Apr 2014 20:11:44 +0000 (22:11 +0200)]
arm: correct x264_mc_chroma_neon function declaration

10 years agoarm: do not export every asm function
Janne Grunau [Tue, 1 Apr 2014 20:11:43 +0000 (22:11 +0200)]
arm: do not export every asm function

Based on Libav's libavutil/arm/asm.S. Also prevents having the same
label twice for every function on systems not defining EXTERN_ASM.
Clang's integrated assembler does not like it.

10 years agoarm: move all .macro/.endm to column 0
Janne Grunau [Tue, 1 Apr 2014 20:11:42 +0000 (22:11 +0200)]
arm: move all .macro/.endm to column 0

10 years agoaarch64: require PIC in shared mode
William Grant [Sun, 23 Mar 2014 16:21:52 +0000 (09:21 -0700)]
aarch64: require PIC in shared mode

10 years agoarm: x264_coeff_last8_arm
Janne Grunau [Sun, 16 Mar 2014 16:21:58 +0000 (17:21 +0100)]
arm: x264_coeff_last8_arm

checkasm --bench on a cortex-a9:
coeff_last8_c: 173
coeff_last8_armv6: 151

60 instead of 73 cycles in ~130k runs on the same cpu while encoding.

10 years agoarm: x264_store_interleave_chroma_neon
Janne Grunau [Sat, 15 Mar 2014 19:09:18 +0000 (20:09 +0100)]
arm: x264_store_interleave_chroma_neon

store_interleave_chroma_c: 4036
store_interleave_chroma_neon: 1043

10 years agoarm: x264_plane_copy_interleave_neon
Janne Grunau [Sat, 15 Mar 2014 18:55:50 +0000 (19:55 +0100)]
arm: x264_plane_copy_interleave_neon

plane_copy_interleave_c: 40285
plane_copy_interleave_neon: 10137

10 years agoarm: x264_plane_copy_deinterleave_rgb_neon
Janne Grunau [Sat, 15 Mar 2014 18:21:12 +0000 (19:21 +0100)]
arm: x264_plane_copy_deinterleave_rgb_neon

plane_copy_deinterleave_rgb_c: 31543
plane_copy_deinterleave_rgb_neon: 8312

10 years agoarm: load_deinterleave_chroma_f{dec,enc}_neon
Janne Grunau [Sat, 15 Mar 2014 17:22:49 +0000 (18:22 +0100)]
arm: load_deinterleave_chroma_f{dec,enc}_neon

load_deinterleave_chroma_fdec_c: 4055
load_deinterleave_chroma_fdec_neon: 995
load_deinterleave_chroma_fenc_c: 4071
load_deinterleave_chroma_fenc_neon: 992

10 years agoarm: x264_plane_copy_deinterleave_neon
Janne Grunau [Sat, 15 Mar 2014 16:22:08 +0000 (17:22 +0100)]
arm: x264_plane_copy_deinterleave_neon

plane_copy_deinterleave_c: 42988
plane_copy_deinterleave_neon: 10184

10 years agoarm: implement deblock_strength_neon
Janne Grunau [Sat, 15 Mar 2014 12:29:41 +0000 (13:29 +0100)]
arm: implement deblock_strength_neon

Based on deblock_strength_avx.

checkasm --bench on a cortex-a9:
deblock_strength_c: 14611
deblock_strength_neon: 1848

10 years agoarm: add missing macro instantiation for x264_pixel_avg_4x16_neon
Janne Grunau [Sat, 15 Mar 2014 09:51:11 +0000 (10:51 +0100)]
arm: add missing macro instantiation for x264_pixel_avg_4x16_neon

checkasm --bench on a cortex-a9:
avg_4x16_c: 8910
avg_4x16_neon: 2091

10 years agoarm: implement x264_predict_4x4_v_armv6
Janne Grunau [Thu, 13 Mar 2014 00:02:13 +0000 (01:02 +0100)]
arm: implement x264_predict_4x4_v_armv6

Alone probably not worth it but allows use of predict_4x4_dc|h_armv6
in intra_sad|satd_x3_4x4_neon.

10 years agoppc: fix build on certain PowerPC variants without Altivec
Roland Stigge [Sun, 23 Mar 2014 16:29:37 +0000 (09:29 -0700)]
ppc: fix build on certain PowerPC variants without Altivec

10 years agoOnly add strip option '-s' for linker flags
Anton Mitrofanov [Mon, 21 Apr 2014 20:58:24 +0000 (00:58 +0400)]
Only add strip option '-s' for linker flags

Fixes some build warnings with clang.

10 years agoconfigure: remove an unnecessary option from CFLAGS on OS X
Tsukasa OMOTO [Sat, 15 Mar 2014 07:53:53 +0000 (16:53 +0900)]
configure: remove an unnecessary option from CFLAGS on OS X

Fixes Clang 3.4 compilation on OS X.

10 years agoMacroblock tree overhaul/optimization
Fiona Glaser [Sun, 23 Feb 2014 18:36:55 +0000 (10:36 -0800)]
Macroblock tree overhaul/optimization

Move the second core part of macroblock tree into an assembly function;
SIMD-optimize roughly half of it (for x86). Roughly 25-65% faster mbtree,
depending on content.

Slightly change how mbtree handles the tradeoff between range and precision
for propagation.

Overall a slight (but mostly negligible) effect on SSIM and ~2% faster.

10 years agoarm: use available neon functions for intra_sa8d/sad/satd_x3
Janne Grunau [Wed, 12 Mar 2014 23:05:48 +0000 (00:05 +0100)]
arm: use available neon functions for intra_sa8d/sad/satd_x3

4% faster on main/medium, 15% faster on baseline/superfast on a cortex-a9.

10 years agoarm: implement x264_pixel_var2_8x16_neon
Janne Grunau [Wed, 12 Mar 2014 13:35:31 +0000 (14:35 +0100)]
arm: implement x264_pixel_var2_8x16_neon

checkasm --bench on a cortex-a9:
var2_8x16_c: 5677
var2_8x16_neon: 1421

10 years agoarm: implement x264_pixel_var_8x16_neon
Janne Grunau [Wed, 12 Mar 2014 12:16:00 +0000 (13:16 +0100)]
arm: implement x264_pixel_var_8x16_neon

checkasm --bench on a cortex-a9:
var_8x16_c: 4306
var_8x16_neon: 791

10 years agox86: SSE2 and SSSE3 plane_copy_deinterleave_rgb
Henrik Gramner [Sun, 23 Feb 2014 14:33:48 +0000 (15:33 +0100)]
x86: SSE2 and SSSE3 plane_copy_deinterleave_rgb

About 5.6x faster than C on Haswell.

10 years agox86: Minor mbtree_propagate_cost improvements
Henrik Gramner [Sun, 16 Feb 2014 20:24:54 +0000 (21:24 +0100)]
x86: Minor mbtree_propagate_cost improvements

Reduce the number of registers used from 7 to 6.
Reduce the number of vector registers used by the AVX2 implementation from 8 to 7.
Multiply fps_factor by 1/256 once per frame instead of once per macroblock row.
Use mova instead of movu for dst since it's guaranteed to be aligned.
Some cosmetics.

10 years agox86inc: Support arbitrary stack alignments
Henrik Gramner [Sun, 9 Feb 2014 22:58:04 +0000 (23:58 +0100)]
x86inc: Support arbitrary stack alignments

If the stack is known to be at least 32-byte aligned we can safely store ymm
registers on the stack without doing manual alignment.

Change ALLOC_STACK to always align the stack before allocating stack space for
consistency. Previously alignment would occur either before or after allocating
stack space depending on whether manual alignment was required or not.
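
To illustrate the ordering, here is a minimal sketch in C rather than in x86inc macros. The stack pointer is modelled as an integer that grows downward; it is aligned first and only then decremented by the padded size. The names align_down and alloc_aligned_stack are hypothetical and not part of x86inc, and the real ALLOC_STACK macro may differ in detail.

```c
#include <assert.h>
#include <stdint.h>

/* Round an address down to a power-of-two alignment. */
static uintptr_t align_down(uintptr_t addr, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return addr & ~(alignment - 1);
}

/* Hypothetical model of "align first, then allocate" on a
 * downward-growing stack. Returns the new stack pointer. */
static uintptr_t alloc_aligned_stack(uintptr_t sp, uintptr_t size,
                                     uintptr_t alignment)
{
    /* Step 1: align the incoming stack pointer. */
    sp = align_down(sp, alignment);
    /* Step 2: pad the requested size to a multiple of the alignment
     * so the pointer is still aligned after the subtraction. */
    size = (size + alignment - 1) & ~(alignment - 1);
    return sp - size;
}
```

With a 32-byte alignment, the returned pointer is always a multiple of 32, which is the property needed to store ymm registers without further manual alignment.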

10 years agox86inc: warn if XOP integer FMA instruction emulation is impossible
Anton Mitrofanov [Fri, 14 Feb 2014 11:53:58 +0000 (15:53 +0400)]
x86inc: warn if XOP integer FMA instruction emulation is impossible

Emulation requires a temporary register if arguments 1 and 4 are the same; this
doesn't obey the semantics of the original instruction, so we can't emulate
that in x86inc.

ffmpeg has an x86util emulation for that case; I'll add it if x264's asm ever
needs it.

Also add pmacsdql emulation.

10 years agox86inc: free up variable name "n" in global namespace
Loren Merritt [Sat, 1 Mar 2014 02:57:56 +0000 (02:57 +0000)]
x86inc: free up variable name "n" in global namespace

10 years agox86: Pass -Worphan-labels to yasm
Henrik Gramner [Wed, 22 Jan 2014 18:09:12 +0000 (19:09 +0100)]
x86: Pass -Worphan-labels to yasm

Makes it easier to detect typos.

10 years agoWrite 3D metadata when outputting Matroska
Steve Lhomme [Sun, 16 Feb 2014 12:15:09 +0000 (13:15 +0100)]
Write 3D metadata when outputting Matroska

For when --frame-packing is set.

10 years agoDon't set chroma_loc_info_present_flag for non-4:2:0
Anton Mitrofanov [Sun, 23 Feb 2014 12:56:03 +0000 (16:56 +0400)]
Don't set chroma_loc_info_present_flag for non-4:2:0

The H.264 spec says it shouldn't be set in these cases.

10 years agox264.h: fix documentation
Fiona Glaser [Mon, 10 Mar 2014 15:42:50 +0000 (08:42 -0700)]
x264.h: fix documentation

The full details of the return values of encoder_encode and encoder_headers
were mistakenly removed a while ago; re-add them.

10 years agoFix pointer cast warning for 64-bit builds
Anton Mitrofanov [Sun, 23 Feb 2014 11:52:57 +0000 (15:52 +0400)]
Fix pointer cast warning for 64-bit builds

10 years agombaff: fix mb_field_decoding_flag tracking and simplify allow skip check
Anton Mitrofanov [Mon, 10 Mar 2014 12:48:02 +0000 (16:48 +0400)]
mbaff: fix mb_field_decoding_flag tracking and simplify allow skip check

Fixes an issue with too many forced non-skips in mbaff+cavlc, as well as
non-deterministic output with mbaff+cavlc+sliced-threads.

10 years agoFix memory overwrite in x264_deblock_h_chroma_mbaff_sse2
Anton Mitrofanov [Sun, 9 Mar 2014 23:22:57 +0000 (03:22 +0400)]
Fix memory overwrite in x264_deblock_h_chroma_mbaff_sse2

Fixes possible corruption with MBAFF+sliced threads.

10 years agoFix corruption with CAVLC overflow handling in MBAFF+main profile
Fiona Glaser [Sun, 2 Mar 2014 18:09:01 +0000 (10:09 -0800)]
Fix corruption with CAVLC overflow handling in MBAFF+main profile

Probably a regression in r2178.

10 years agoFix checkasm --bench output when nop_cycles is too large
Anton Mitrofanov [Mon, 10 Mar 2014 17:17:19 +0000 (21:17 +0400)]
Fix checkasm --bench output when nop_cycles is too large

10 years agoReally fix quantization factor allocation
Anton Mitrofanov [Wed, 22 Jan 2014 08:54:49 +0000 (12:54 +0400)]
Really fix quantization factor allocation

Actually allocate less (instead of just initialize less) and fix comments.

10 years agoFix build with Android NDK
Yu Xiaolei [Sun, 23 Feb 2014 12:12:51 +0000 (04:12 -0800)]
Fix build with Android NDK

Android NDK does not expose sched_getaffinity.

10 years agox86inc: speed up compilation with yasm
Loren Merritt [Thu, 16 Jan 2014 21:34:46 +0000 (13:34 -0800)]
x86inc: speed up compilation with yasm

Work around yasm's inefficiency with handling large numbers of variables
in the global scope.

10 years agoAdd support for AVC-Intra Class 200
Kieran Kunhya [Fri, 10 Jan 2014 23:27:33 +0000 (23:27 +0000)]
Add support for AVC-Intra Class 200

10 years agov210 input support
James Weaver [Tue, 7 Jan 2014 10:31:58 +0000 (10:31 +0000)]
v210 input support

Assembly based on code by Henrik Gramner and Loren Merritt.

10 years agoFix quantization factor allocation
Fiona Glaser [Tue, 21 Jan 2014 21:39:33 +0000 (13:39 -0800)]
Fix quantization factor allocation

We don't need to wastefully allocate quant tables above QP_MAX_SPEC; they're
never used.

11 years agoAvoid some unnecessary memory loads in macroblock_encode
Henrik Gramner [Wed, 8 Jan 2014 00:06:56 +0000 (01:06 +0100)]
Avoid some unnecessary memory loads in macroblock_encode

11 years agoBump dates to 2014
Henrik Gramner [Sun, 5 Jan 2014 14:25:05 +0000 (15:25 +0100)]
Bump dates to 2014

Also update AUTHORS file and my e-mail address in the headers of various files.

11 years agoRemove tools/xyuv.c
Henrik Gramner [Sun, 5 Jan 2014 23:18:31 +0000 (00:18 +0100)]
Remove tools/xyuv.c

It's an old stand-alone application that isn't relevant to x264.

11 years agoUse 8x16c wrappers with x86 asm functions for 4:2:2 with high bit depth
Anton Mitrofanov [Wed, 6 Nov 2013 22:37:23 +0000 (02:37 +0400)]
Use 8x16c wrappers with x86 asm functions for 4:2:2 with high bit depth

11 years agoCLI: Avoid redundant 16-bit upconversions in piped raw input
Henrik Gramner [Fri, 20 Dec 2013 21:44:28 +0000 (22:44 +0100)]
CLI: Avoid redundant 16-bit upconversions in piped raw input

It's not possible to seek in pipes, so if we want to skip frames we have to read and
discard unused ones. It's pointless to do bit-depth upconversions in those frames.
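
A minimal sketch of the read-and-discard approach, assuming raw frames of a fixed byte size. skip_frames is a hypothetical helper, not the actual CLI code; the point is that skipped frames are only read into a scratch buffer and never converted.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Skip up to `count` frames of `frame_size` bytes on a stream that
 * cannot seek (e.g. a pipe). Returns the number of whole frames skipped. */
static int skip_frames(FILE *in, size_t frame_size, int count)
{
    assert(in != NULL && frame_size > 0);
    unsigned char *scratch = malloc(frame_size);
    if (!scratch)
        return 0;
    int skipped = 0;
    /* Read each unwanted frame and throw it away; no bit-depth
     * conversion or other processing is done on the data. */
    while (skipped < count &&
           fread(scratch, 1, frame_size, in) == frame_size)
        skipped++;
    free(scratch);
    return skipped;
}
```

A return value smaller than count means the input ended before all requested frames could be skipped.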

11 years agoFix input support from named pipes in Windows
Anton Mitrofanov [Fri, 3 Jan 2014 16:06:06 +0000 (20:06 +0400)]
Fix input support from named pipes in Windows

11 years agoFix ARM asm compilation with Apple assembler
Steve Clark [Wed, 20 Nov 2013 17:40:23 +0000 (21:40 +0400)]
Fix ARM asm compilation with Apple assembler

11 years agoFix uninitialized variable
Anton Mitrofanov [Wed, 13 Nov 2013 15:24:48 +0000 (19:24 +0400)]
Fix uninitialized variable

Caused if the timebase is not specified in stats file. Found by Clang.

11 years agoRemove --visualize option.
Anton Mitrofanov [Sun, 27 Oct 2013 15:27:23 +0000 (19:27 +0400)]
Remove --visualize option.

It probably wasn't used or maintained for the last few years.

11 years agoAdd L-SMASH support as preferable alternative for MP4-muxing
Anton Mitrofanov [Tue, 15 Oct 2013 08:32:25 +0000 (12:32 +0400)]
Add L-SMASH support as preferable alternative for MP4-muxing

11 years agoAdd AVC-Intra 1080p50/60 Class 100 parameters
Kieran Kunhya [Sat, 21 Sep 2013 18:16:12 +0000 (19:16 +0100)]
Add AVC-Intra 1080p50/60 Class 100 parameters

Also add some compatibility fixes.

11 years agoAdd --filler option
Fiona Glaser [Mon, 9 Sep 2013 19:37:59 +0000 (12:37 -0700)]
Add --filler option

Allows generation of hard-CBR streams without using NAL HRD.
Useful if you want to be able to reconfigure the bitrate (which you can't do
with NAL HRD on).

11 years agoMake x264_encoder_reconfig more threadsafe
Anton Mitrofanov [Sun, 27 Oct 2013 11:22:51 +0000 (15:22 +0400)]
Make x264_encoder_reconfig more threadsafe

Do the reconfig when the next frame's encode begins.
Fixes some rare crashes with frame-threading and encoder_reconfig.

11 years agochroma-me: take shortcut in BI analysis
Fiona Glaser [Fri, 25 Oct 2013 00:19:00 +0000 (17:19 -0700)]
chroma-me: take shortcut in BI analysis

~100 cycles faster with subme>=9

11 years agoCRF-max: don't warn if VBV underflow occurs
Fiona Glaser [Thu, 24 Oct 2013 21:44:43 +0000 (14:44 -0700)]
CRF-max: don't warn if VBV underflow occurs

Only warn if underflow occurs for reasons other than CRF-max, as CRF-max
implies that VBV underflow is desired by the user.

11 years agox86inc: Make ym# behave the same way as xm#
Henrik Gramner [Fri, 18 Oct 2013 20:43:36 +0000 (22:43 +0200)]
x86inc: Make ym# behave the same way as xm#

This makes more sense for future implementations of templates with zmm registers.

11 years agoUse calloc instead of malloc + memset
Henrik Gramner [Fri, 18 Oct 2013 20:21:38 +0000 (22:21 +0200)]
Use calloc instead of malloc + memset

11 years agoReplace gf_malloc with regular malloc in mp4 muxer
Henrik Gramner [Thu, 10 Oct 2013 14:54:12 +0000 (16:54 +0200)]
Replace gf_malloc with regular malloc in mp4 muxer

It was used as a workaround for a bug that only existed in the GPAC repository
for a few weeks back in 2010. There's no reason to keep it anymore.

11 years agoUpdate to current libav/ffmpeg API
Anton Mitrofanov [Tue, 8 Oct 2013 19:20:40 +0000 (23:20 +0400)]
Update to current libav/ffmpeg API

11 years agoversion.sh: change to use /bin/sh
Rafaël Carré [Fri, 25 Oct 2013 14:12:24 +0000 (07:12 -0700)]
version.sh: change to use /bin/sh

11 years agoconfigure: don't generate a git version number if .git isn't present
Sean McGovern [Wed, 4 Sep 2013 21:15:00 +0000 (14:15 -0700)]
configure: don't generate a git version number if .git isn't present

11 years agoconfigure: include dependency libs in the Libs pkg-config
Martin Storsjo [Tue, 3 Sep 2013 21:56:18 +0000 (14:56 -0700)]
configure: include dependency libs in the Libs pkg-config

If only a static library is built, the user of the library that just
tries to link to the lib using the flags provided by pkg-config
might not know that only a static lib exists and that he'd have to
pass --static to pkg-config to get the internal dependencies to
be able to link the library.

For a shared build, the internal dependencies are kept in Libs.private
as before.

This matches how libav's pkg-config files are generated.

11 years agoFix compilation if the HAVE_LOG2F check fails spuriously
Anton Mitrofanov [Thu, 17 Oct 2013 20:38:06 +0000 (00:38 +0400)]
Fix compilation if the HAVE_LOG2F check fails spuriously

11 years agoFix compilation of shared library for Windows with original MinGW toolchain
Anton Mitrofanov [Sat, 12 Oct 2013 08:01:57 +0000 (12:01 +0400)]
Fix compilation of shared library for Windows with original MinGW toolchain

11 years agoFix possible crashes in resize and crop filters with high bitdepth input
Anton Mitrofanov [Tue, 8 Oct 2013 19:32:37 +0000 (23:32 +0400)]
Fix possible crashes in resize and crop filters with high bitdepth input

11 years agoFix INSTALL in configure for Solaris systems
Tim Mooney [Tue, 3 Sep 2013 20:43:50 +0000 (13:43 -0700)]
Fix INSTALL in configure for Solaris systems

11 years agoWorkaround for FFMS indexing bug
Henrik Gramner [Tue, 27 Aug 2013 22:50:31 +0000 (00:50 +0200)]
Workaround for FFMS indexing bug

If FFMS_ReadIndex is used with an empty index file it gets stuck in an infinite loop instead of returning NULL
like it's supposed to do on failure. Explicitly check if the file is empty before calling it as a workaround.
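
The emptiness check can be sketched as follows. file_is_empty is a hypothetical helper and not the code from this commit; a caller would skip FFMS_ReadIndex whenever it returns a nonzero value.

```c
#include <assert.h>
#include <stdio.h>

/* Returns 1 if the file exists and contains no bytes, 0 if it has at
 * least one byte, and -1 if it cannot be opened. */
static int file_is_empty(const char *path)
{
    assert(path != NULL);
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;
    /* Reading a single byte avoids relying on fseek/ftell behaviour. */
    int c = fgetc(f);
    fclose(f);
    return c == EOF;
}
```

Treating both 1 and -1 as "do not read the index" keeps the caller on the safe side of the infinite loop.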

11 years agoFix masked access violation in KERNEL32
Anton Mitrofanov [Mon, 26 Aug 2013 17:20:31 +0000 (21:20 +0400)]
Fix masked access violation in KERNEL32

Caused crashes under gdb in Windows and might cause other unknown problems.

11 years agoFix GPAC support on Windows
Hiroki Taniura [Sat, 24 Aug 2013 16:18:57 +0000 (01:18 +0900)]
Fix GPAC support on Windows

11 years agoWindows Unicode support
Henrik Gramner [Sun, 11 Aug 2013 17:50:42 +0000 (19:50 +0200)]
Windows Unicode support

Windows, unlike most other operating systems, uses UTF-16 for Unicode strings while x264 is designed for UTF-8.

This patch does the following in order to handle things like Unicode filenames:
* Keep strings internally as UTF-8.
* Retrieve the CLI command line as UTF-16 and convert it to UTF-8.
* Always use Unicode versions of Windows API functions and convert strings to UTF-16 when calling them.
* Attempt to use legacy 8.3 short filenames for external libraries without Unicode support.
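
As an illustration of the UTF-16 to UTF-8 step, here is a portable sketch. On Windows this conversion would normally go through the Win32 API; utf16_to_utf8 below is a hypothetical stand-in, not the function used by this patch, and it only shows what the conversion has to do (including surrogate pairs).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert a NUL-terminated UTF-16 string to NUL-terminated UTF-8.
 * Returns the number of bytes written (excluding the terminator), or
 * -1 on an unpaired surrogate or if the output buffer is too small. */
static int utf16_to_utf8(const uint16_t *in, char *out, size_t out_size)
{
    assert(in != NULL && out != NULL);
    size_t o = 0;
    while (*in) {
        uint32_t cp = *in++;
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: must be followed by a low surrogate. */
            if (*in < 0xDC00 || *in > 0xDFFF)
                return -1;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (uint32_t)(*in++ - 0xDC00);
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return -1; /* stray low surrogate */
        }
        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (o + need >= out_size)
            return -1; /* keep one byte free for the terminator */
        if (need == 1) {
            out[o++] = (char)cp;
        } else if (need == 2) {
            out[o++] = (char)(0xC0 | (cp >> 6));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            out[o++] = (char)(0xE0 | (cp >> 12));
            out[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        } else {
            out[o++] = (char)(0xF0 | (cp >> 18));
            out[o++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        }
    }
    if (o >= out_size)
        return -1;
    out[o] = '\0';
    return (int)o;
}
```

Keeping UTF-8 as the single internal representation means a conversion like this is only needed at the boundary, when arguments arrive from or are passed to the operating system.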

11 years agoAVC-Intra support
Kieran Kunhya [Sat, 20 Jul 2013 17:47:59 +0000 (18:47 +0100)]
AVC-Intra support

This format has been reverse engineered and x264's output has almost exactly
the same bitstream as Panasonic cameras and encoders produce. It therefore does
not comply with SMPTE RP2027 since Panasonic themselves do not comply with
their own specification. It has been tested in Avid, Premiere, Edius and
Quantel.

Parts of this patch were written by Fiona Glaser and some reverse
engineering was done by Joseph Artsimovich.

11 years agoTransparent hugepage support
Henrik Gramner [Mon, 8 Jul 2013 19:06:42 +0000 (12:06 -0700)]
Transparent hugepage support

Combine frame and mb data mallocs into a single large malloc.
Additionally, on Linux systems with hugepage support, ask for hugepages on
large mallocs.

This gives a small performance improvement (~0.2-0.9%) on systems without
hugepage support, as well as a small memory footprint reduction.

On recent Linux kernels with hugepage support enabled (set to madvise or
always), it improves performance up to 4% at the cost of about 7-12% more
memory usage on typical settings.

It may help even more on Haswell and other recent CPUs with improved 2MB page
support in hardware.
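
A rough sketch of asking for hugepages on large allocations, under stated assumptions: the 2 MB page size and the threshold are illustrative constants, alloc_large is a hypothetical name, and this is not the allocator from this commit. The madvise call is only a hint; the kernel may ignore it, and on non-Linux systems the sketch simply falls back to an aligned allocation.

```c
#include <assert.h>
#include <stdlib.h>
#ifdef __linux__
#include <sys/mman.h>
#endif

/* Assumed values for illustration only. */
#define HUGE_PAGE_SIZE      ((size_t)2 * 1024 * 1024)
#define HUGE_PAGE_THRESHOLD HUGE_PAGE_SIZE

static size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) / align * align;
}

/* Allocate `size` bytes; large requests are aligned to the huge page
 * size and the kernel is asked to back them with hugepages. */
static void *alloc_large(size_t size)
{
    assert(size > 0);
    if (size < HUGE_PAGE_THRESHOLD)
        return malloc(size);
    /* Rounding up to whole huge pages is where the extra memory goes. */
    size_t padded = round_up(size, HUGE_PAGE_SIZE);
    void *p = aligned_alloc(HUGE_PAGE_SIZE, padded);
#if defined(__linux__) && defined(MADV_HUGEPAGE)
    if (p)
        (void)madvise(p, padded, MADV_HUGEPAGE); /* advisory only */
#endif
    return p;
}
```

Memory returned by alloc_large is released with free in both cases. Combining many small allocations into one large one, as described above, is what makes a request big enough to cross the threshold.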

11 years agox86: SSSE3 implementation of pixel_sad_x3 and pixel_sad_x4
Henrik Gramner [Fri, 5 Jul 2013 19:15:54 +0000 (21:15 +0200)]
x86: SSSE3 implementation of pixel_sad_x3 and pixel_sad_x4

11 years agox86: Faster AVX2 pixel_sad_x3 and pixel_sad_x4
Henrik Gramner [Fri, 5 Jul 2013 19:15:49 +0000 (21:15 +0200)]
x86: Faster AVX2 pixel_sad_x3 and pixel_sad_x4

11 years agoconfigure: Support cygwin64
Diogo Franco [Wed, 24 Jul 2013 01:17:44 +0000 (22:17 -0300)]
configure: Support cygwin64

11 years agox86inc: Check for __OUTPUT_FORMAT__ having a value of "x64"
Derek Buitenhuis [Fri, 9 Aug 2013 17:39:27 +0000 (13:39 -0400)]
x86inc: Check for __OUTPUT_FORMAT__ having a value of "x64"

This is also a valid value for WIN64.

11 years agoFix cases in which intra refresh allowed prediction from disallowed pixels
Anton Mitrofanov [Tue, 23 Jul 2013 21:11:50 +0000 (14:11 -0700)]
Fix cases in which intra refresh allowed prediction from disallowed pixels

11 years agoFix a few minor bugs found with a static analyzer
Anton Mitrofanov [Tue, 6 Aug 2013 21:56:34 +0000 (01:56 +0400)]
Fix a few minor bugs found with a static analyzer

11 years agoFix AVX2 detection bug with "limit CPUID" enabled in BIOS
Fiona Glaser [Fri, 12 Jul 2013 23:07:35 +0000 (16:07 -0700)]
Fix AVX2 detection bug with "limit CPUID" enabled in BIOS

11 years agox86: Remove X264_CPU_SSE_MISALIGN functions
Henrik Gramner [Fri, 5 Jul 2013 19:15:43 +0000 (21:15 +0200)]
x86: Remove X264_CPU_SSE_MISALIGN functions

Prevents a crash if the misaligned exception mask bit is cleared for some reason.

Misaligned SSE functions are only used on AMD Phenom CPUs and the benefit is minuscule.
They also require modifying the MXCSR control register and by removing those functions
we can get rid of that complexity altogether.

VEX-encoded instructions also support unaligned memory operands. I tried adding AVX
implementations of all removed functions but there were no performance improvements on
Ivy Bridge. pixel_sad_x3 and pixel_sad_x4 had significant code size reductions though,
so I kept them and added some minor cosmetic fixes and tweaks.

11 years agoTweak i16x16-delta-quant-avoidance code
Fiona Glaser [Thu, 20 Jun 2013 22:51:39 +0000 (15:51 -0700)]
Tweak i16x16-delta-quant-avoidance code

Don't omit the delta quant if it'd raise the quantizer to do so; this fixes
a rare flickering issue caused by deblocking.

11 years agox86: faster AVX2 iDCT, AVX deblock_luma_h, deblock_luma_h_intra
Fiona Glaser [Sun, 9 Jun 2013 16:06:27 +0000 (09:06 -0700)]
x86: faster AVX2 iDCT, AVX deblock_luma_h, deblock_luma_h_intra

11 years agoAdd new color primaries, transfer characteristics, matrix coefficients
Lucien [Mon, 17 Jun 2013 18:28:09 +0000 (18:28 +0000)]
Add new color primaries, transfer characteristics, matrix coefficients

11 years agoAdd "--stitchable" option for segmented encoding
Fiona Glaser [Sat, 1 Jun 2013 00:01:29 +0000 (17:01 -0700)]
Add "--stitchable" option for segmented encoding

Stops x264 from attempting to optimize global stream headers, ensuring that
different segments of a video will have identical headers when used with
identical encoding settings.

11 years agoInterface: if vbv-maxrate < bitrate, set bitrate = vbv-maxrate
Fiona Glaser [Thu, 27 Jun 2013 15:29:06 +0000 (08:29 -0700)]
Interface: if vbv-maxrate < bitrate, set bitrate = vbv-maxrate

This probably makes more sense to the user than setting vbv-maxrate = bitrate,
as before.

11 years agoOpenCL cosmetics
Anton Mitrofanov [Tue, 28 May 2013 12:02:42 +0000 (05:02 -0700)]
OpenCL cosmetics

11 years agoFix possible crash when writing very large filler NALUs
Anton Mitrofanov [Mon, 17 Jun 2013 20:16:33 +0000 (00:16 +0400)]
Fix possible crash when writing very large filler NALUs

Bitstream-reallocation function didn't handle the case of filler.

11 years agoFix build with PIC on some systems
Loren Merritt [Mon, 17 Jun 2013 18:27:09 +0000 (11:27 -0700)]
Fix build with PIC on some systems

11 years agoFix potential misalignment crash in AVX2 denoise_dct
Henrik Gramner [Sun, 2 Jun 2013 16:41:17 +0000 (18:41 +0200)]
Fix potential misalignment crash in AVX2 denoise_dct

11 years agoFix building with compilers without inline asm support
Anton Mitrofanov [Mon, 27 May 2013 21:48:15 +0000 (01:48 +0400)]
Fix building with compilers without inline asm support

Also fix crash in high bit depth builds compiled with unaligned stack.

11 years agoFix compilation with OpenCL on MacOS X
Anton Mitrofanov [Wed, 22 May 2013 18:43:59 +0000 (22:43 +0400)]
Fix compilation with OpenCL on MacOS X

Also fix crash in the case of OpenCL error during encoding.

11 years agoOpenCL support improvement/refactoring
Anton Mitrofanov [Mon, 6 May 2013 18:51:11 +0000 (22:51 +0400)]
OpenCL support improvement/refactoring

Autoload the OpenCL library so that it's not required to run an OpenCL-enabled
build of x264.

Update X264_BUILD, which should have been changed with the first patch.

11 years agox86: shave a few instructions off AVX deblock
Fiona Glaser [Thu, 16 May 2013 20:51:37 +0000 (13:51 -0700)]
x86: shave a few instructions off AVX deblock

11 years agox86: AVX2 dequant_4x4_dc
Henrik Gramner [Tue, 14 May 2013 16:57:40 +0000 (18:57 +0200)]
x86: AVX2 dequant_4x4_dc

11 years agox86: AVX2 high bit-depth dequant
Henrik Gramner [Tue, 14 May 2013 16:53:12 +0000 (18:53 +0200)]
x86: AVX2 high bit-depth dequant

11 years agox86-64: 64-bit variant of AVX2 hpel_filter
Fiona Glaser [Fri, 10 May 2013 00:20:05 +0000 (17:20 -0700)]
x86-64: 64-bit variant of AVX2 hpel_filter

~5% faster than 32-bit.

11 years agox86: AVX2 high bit-depth denoise_dct
Henrik Gramner [Mon, 6 May 2013 16:41:24 +0000 (18:41 +0200)]
x86: AVX2 high bit-depth denoise_dct

28->15 cycles

Also reorder instructions to use fewer registers, 3 cycles faster on Ivy Bridge with 64-bit Windows.

11 years agox86: AVX2 high bit-depth quant
Henrik Gramner [Sat, 4 May 2013 16:48:58 +0000 (18:48 +0200)]
x86: AVX2 high bit-depth quant

quant_4x4: 13->6 cycles
quant_4x4_dc: 14->8 cycles
quant_8x8: 47->24 cycles
quant_4x4x4: 48->25 cycles

11 years agox86: AVX2 add16x16_idct_dc
Fiona Glaser [Wed, 1 May 2013 21:32:11 +0000 (14:32 -0700)]
x86: AVX2 add16x16_idct_dc

27 -> 19 cycles

11 years agox86: faster AVX2 quant_4x4x4
Fiona Glaser [Mon, 29 Apr 2013 23:16:54 +0000 (16:16 -0700)]
x86: faster AVX2 quant_4x4x4

10->9 cycles