granicus.if.org Git - libx264/log
Henrik Gramner [Mon, 12 Oct 2015 19:55:11 +0000 (21:55 +0200)]
x86inc: Use more consistent indentation
Henrik Gramner [Mon, 12 Oct 2015 18:15:18 +0000 (20:15 +0200)]
x86inc: Preserve arguments when allocating stack space
When allocating stack space with a larger alignment than the known stack
alignment a temporary register is used for storing the stack pointer.
Ensure that this isn't one of the registers used for passing arguments.
Henrik Gramner [Sat, 16 Jan 2016 23:25:47 +0000 (00:25 +0100)]
x86inc: Improve FMA instruction handling
* Correctly handle FMA instructions with memory operands.
* Print a warning if FMA instructions are used without the correct cpuflag.
* Simplify the instantiation code.
* Clarify documentation.
Only the last operand in FMA3 instructions can be a memory operand. When
converting FMA4 instructions to FMA3 instructions we can utilize the fact
that multiply is a commutative operation and reorder operands if necessary
to ensure that a memory operand is used only as the last operand.
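The reordering idea can be sketched in C. This is a simplified model, not the x86inc macro logic: `demo_operand`, `demo_fma` and `demo_fma_order_multiplicands` are hypothetical names, and only the case where both multiplicands are trailing source operands is considered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operand model: a name plus whether it is a memory reference. */
typedef struct { const char *name; bool is_mem; } demo_operand;

/* result = mul_a * mul_b + add, with mul_b being the last source operand. */
typedef struct { demo_operand mul_a, mul_b, add; } demo_fma;

/* Make sure a memory multiplicand ends up in the last position.
 * a * b == b * a, so swapping the multiplicands does not change the result.
 * Returns false when both multiplicands are memory operands, which this
 * simplified model treats as not encodable. */
bool demo_fma_order_multiplicands(demo_fma *op)
{
    assert(op != NULL);
    if (op->mul_a.is_mem && op->mul_b.is_mem)
        return false;
    if (op->mul_a.is_mem) {
        demo_operand tmp = op->mul_a;
        op->mul_a = op->mul_b;
        op->mul_b = tmp;
    }
    return true;
}
```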
Henrik Gramner [Sun, 11 Oct 2015 20:31:53 +0000 (22:31 +0200)]
x86inc: Be more verbose in assertion failures
Henrik Gramner [Wed, 30 Sep 2015 21:17:00 +0000 (23:17 +0200)]
x86inc: Make cpuflag() and notcpuflag() return 0 or 1
Makes it possible to use them in arithmetic expressions.
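A rough C analogue of why a strict 0-or-1 result is convenient. The flag values and the `demo_` functions below are hypothetical and are not x86inc or x264 identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits, for illustration only. */
enum { DEMO_CPU_SSE2 = 1 << 3, DEMO_CPU_AVX2 = 1 << 7 };

/* Returns exactly 0 or 1, never the raw masked value. */
int demo_cpuflag(uint32_t cpu, uint32_t flag)
{
    assert(flag != 0);
    return (cpu & flag) != 0;
}

/* Because the result is 0 or 1 it can be used directly in arithmetic:
 * 32 bytes per vector with AVX2, 16 bytes without. */
int demo_vector_bytes(uint32_t cpu)
{
    return 16 + 16 * demo_cpuflag(cpu, DEMO_CPU_AVX2);
}
```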
Henrik Gramner [Fri, 30 Oct 2015 15:55:49 +0000 (16:55 +0100)]
encoder_open: Fix memory leak
Furthermore, the x264_analyse_prepare_costs() and x264_analyse_init_costs()
functions were only used in x264_encoder_open(), so move that entire section
of code to analyse.c as well to simplify things.
Janne Grunau [Wed, 18 Nov 2015 10:08:22 +0000 (11:08 +0100)]
arm: do not fill mc_weight*_neon tabs for HIGH_BIT_DEPTH
The asm is only for 8-bit and function prototypes reflect that. Avoids
numerous warnings with --bit-depth=9/10.
Janne Grunau [Tue, 13 Oct 2015 21:50:11 +0000 (23:50 +0200)]
arm: Eliminate text relocations in asm
Android 6 does not link shared libraries with text relocations.
Make the movrel macro position independent and add movrelx for indirect
loads of external symbols.
Move the function pointer table for the aligned memcpy variants to the
data.rel.ro section on Linux/Android.
Martin Storsjö [Thu, 15 Oct 2015 08:50:33 +0000 (11:50 +0300)]
arm: Don't assume alignment in mbtree_propagate_list_internal where it isn't provided
Janne Grunau [Tue, 13 Oct 2015 21:50:12 +0000 (23:50 +0200)]
arm: Fix checkasm register clobber check on iOS
r9 is a volatile register in the iOS ABI and will therefore not be
preserved by compiled functions like the luma motion compensation.
Add the symbol prefix to the puts() call and use blx since a switch
between arm and thumb mode might be required.
Anton Mitrofanov [Wed, 30 Sep 2015 22:02:16 +0000 (01:02 +0300)]
ppc: Add detection of AltiVec support for FreeBSD
Patch from FreeBSD ports.
Anton Mitrofanov [Mon, 28 Sep 2015 18:07:55 +0000 (21:07 +0300)]
Don't assume 16-byte stack alignment by default on x86-32
Some compilers, depending on the target OS, use 4-byte stack alignment by default.
Explicitly check known good compilers and specific options for stack alignment.
Anton Mitrofanov [Tue, 22 Sep 2015 18:33:07 +0000 (21:33 +0300)]
Fix a few static analyzer performance hints
Anton Mitrofanov [Tue, 22 Sep 2015 17:19:23 +0000 (20:19 +0300)]
Revise the row VBV algorithm
Anton Mitrofanov [Tue, 22 Sep 2015 16:26:25 +0000 (19:26 +0300)]
Fix high bit depth lookahead cost compensation algorithm
Now high bit depth VBV should act more like 8-bit depth one.
Anton Mitrofanov [Tue, 22 Sep 2015 16:05:52 +0000 (19:05 +0300)]
Correctly update the intra row predictor in B-frames
It was previously used but never updated from its initialization value.
Anton Mitrofanov [Tue, 22 Sep 2015 15:58:24 +0000 (18:58 +0300)]
Change the predictors update algorithm
Keep predictor offsets more stable. This should fix VBV misprediction in frames
with a large difference in complexity between the top and bottom parts.
Martin Storsjö [Thu, 3 Sep 2015 06:30:44 +0000 (09:30 +0300)]
arm: Implement x264_mbtree_propagate_{cost, list}_neon
The cost function could be simplified to avoid having to clobber
q4/q5, but this requires reordering instructions, which increases
the total runtime.
checkasm timing Cortex-A7 A8 A9
mbtree_propagate_cost_c 63702 155835 62829
mbtree_propagate_cost_neon 17199 10454 11106
mbtree_propagate_list_c 104203 108949 84532
mbtree_propagate_list_neon 82035 78348 60410
Martin Storsjö [Thu, 3 Sep 2015 06:30:43 +0000 (09:30 +0300)]
x86: Share the mbtree_propagate_list macro with aarch64
This avoids having to duplicate the same code for all architectures
that implement only the internal part of this function in assembler.
Martin Storsjö [Wed, 2 Sep 2015 19:39:51 +0000 (22:39 +0300)]
arm: Implement luma intra deblocking
checkasm timing Cortex-A7 A8 A9
deblock_luma_intra[0]_c 5988 4653 4316
deblock_luma_intra[0]_neon 3103 2170 2128
deblock_luma_intra[1]_c 7119 5905 5347
deblock_luma_intra[1]_neon 2068 1381 1412
This includes extra optimizations by Janne Grunau.
Timings from a separate build, on Exynos 5422:
                            Cortex-A7   A15
deblock_luma_intra[0]_c          6627  3300
deblock_luma_intra[0]_neon       3059  1128
deblock_luma_intra[1]_c          7314  4128
deblock_luma_intra[1]_neon       2038   720
Martin Storsjö [Mon, 31 Aug 2015 19:40:31 +0000 (22:40 +0300)]
arm: Implement some neon 8x16c intra predict functions
checkasm timing Cortex-A7 A8 A9
intra_predict_8x16c_dct_c 862 540 590
intra_predict_8x16c_dct_neon 608 511 657
intra_predict_8x16c_h_c 972 707 719
intra_predict_8x16c_h_neon 722 656 672
intra_predict_8x16c_p_c 10183 9819 8655
intra_predict_8x16c_p_neon 2622 1972 1983
Martin Storsjö [Thu, 27 Aug 2015 21:15:01 +0000 (00:15 +0300)]
arm: Implement x264_plane_copy_neon
checkasm timing Cortex-A7 A8 A9
plane_copy_c 13124 10925 9106
plane_copy_neon 7349 5103 8945
Martin Storsjö [Fri, 28 Aug 2015 06:40:24 +0000 (09:40 +0300)]
checkasm: arm: Check register clobbering
Cast the function pointer to a different type signature, to
be able to use uint64_t as return type (instead of intptr_t) for
those calls that require it.
Use two separate functions, depending on whether neon is available.
Martin Storsjö [Thu, 13 Aug 2015 21:00:57 +0000 (00:00 +0300)]
checkasm: Try different widths for ssd_nv12
To test all codepaths in the aarch64 neon implementation, one at
the very least needs to test with width 8, 16, 24 and 32.
Jerome Duval [Fri, 13 Jun 2014 19:56:27 +0000 (19:56 +0000)]
Haiku support
Add Haiku as supported platform in configure.
Haiku has no nice() function, use the platform specific substitute instead.
Martin Storsjö [Tue, 25 Aug 2015 11:38:20 +0000 (14:38 +0300)]
checkasm: aarch64: Check register clobbering
Disable this on iOS, since it has got a slightly different ABI
for vararg parameters.
Martin Storsjö [Tue, 25 Aug 2015 20:36:45 +0000 (23:36 +0300)]
arm: Implement x264_decimate_score15/16/64_neon
checkasm timing Cortex-A7 A8 A9
decimate_score15_c 764 736 535
decimate_score15_neon 487 494 453
decimate_score16_c 782 727 553
decimate_score16_neon 487 494 521
decimate_score64_c 2361 2597 2011
decimate_score64_neon 1017 802 785
Martin Storsjö [Tue, 25 Aug 2015 20:36:44 +0000 (23:36 +0300)]
arm: Implement chroma intra deblock
checkasm timing Cortex-A7 A8 A9
deblock_chroma_420_intra_mbaff_c 1469 1276 1181
deblock_chroma_420_intra_mbaff_neon 981 717 644
deblock_chroma_intra[1]_c 2954 2402 2321
deblock_chroma_intra[1]_neon 947 581 575
deblock_h_chroma_420_intra_c 2859 2509 2264
deblock_h_chroma_420_intra_neon 1480 1119 1028
deblock_h_chroma_422_intra_c 6211 5030 4792
deblock_h_chroma_422_intra_neon 2894 1990 2077
Martin Storsjö [Tue, 25 Aug 2015 11:38:17 +0000 (14:38 +0300)]
arm: Implement x264_pixel_sa8d_satd_16x16_neon
This requires spilling some registers to the stack,
contrary to the aarch64 version.
checkasm timing Cortex-A7 A8 A9
sa8d_satd_16x16_neon 12936 6365 7492
sa8d_satd_16x16_separate_neon 14841 6605 8324
Martin Storsjö [Tue, 25 Aug 2015 11:38:16 +0000 (14:38 +0300)]
arm: Implement x264_deblock_h_chroma_mbaff_neon
checkasm timing Cortex-A7 A8 A9
deblock_chroma_420_mbaff_c 1944 1706 1526
deblock_chroma_420_mbaff_neon 1210 873 865
Martin Storsjö [Tue, 25 Aug 2015 11:38:15 +0000 (14:38 +0300)]
arm: Implement x264_deblock_h_chroma_422_neon
checkasm timing Cortex-A7 A8 A9
deblock_h_chroma_422_c 6953 6269 5145
deblock_h_chroma_422_neon 3905 2569 2551
Martin Storsjö [Tue, 25 Aug 2015 11:38:14 +0000 (14:38 +0300)]
arm: Implement integral_init4/8h/v_neon
checkasm timing Cortex-A7 A8 A9
integral_init4h_c 10466 8590 6161
integral_init4h_neon 3021 1494 1800
integral_init4v_c 16250 13590 13628
integral_init4v_neon 3473 2073 3291
integral_init8h_c 10100 8275 5705
integral_init8h_neon 4403 2344 2751
integral_init8v_c 6403 4632 4999
integral_init8v_neon 1184 783 1306
Martin Storsjö [Tue, 25 Aug 2015 11:38:13 +0000 (14:38 +0300)]
arm: Implement x264_denoise_dct_neon
checkasm timing Cortex-A7 A8 A9
denoise_dct_c 6604 5510 5858
denoise_dct_neon 1774 1139 1614
Martin Storsjö [Tue, 25 Aug 2015 11:38:12 +0000 (14:38 +0300)]
arm: Add x264_nal_escape_neon
checkasm timing Cortex-A7 A8 A9
nal_escape_c 852758 879566 655497
nal_escape_neon 376831 450678 371673
Martin Storsjö [Tue, 25 Aug 2015 11:38:11 +0000 (14:38 +0300)]
arm: Add neon versions of vsad, asd8 and ssd_nv12_core
These are straight translations of the aarch64 versions.
checkasm timing Cortex-A7 A8 A9
vsad_c 16234 10984 9850
vsad_neon 2132 1020 789
asd8_c 5859 3561 3543
asd8_neon 1407 1279 1250
ssd_nv12_c 608096 591072 426285
ssd_nv12_neon 72752 33549 41347
Martin Storsjö [Tue, 25 Aug 2015 11:38:10 +0000 (14:38 +0300)]
checkasm: Check the right output range for integral_initXh
These functions write their output into sum+stride, while we previously
only checked [0..stride-8] within the sum array.
This catches the previously broken aarch64 version of these functions.
Also check up until stride-4 elements for init4h.
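A minimal sketch of comparing the row that is actually written. `demo_rows_match` and its parameters are hypothetical and this is not the checkasm code; `taps` stands for 4 or 8, so `stride - taps` elements are compared.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Compare the output row starting at sum + stride, which is where
 * integral_initXh-style functions store their results, instead of row 0.
 * Returns 1 when the two rows match. */
int demo_rows_match(const uint16_t *sum_ref, const uint16_t *sum_test,
                    int stride, int taps)
{
    assert(stride > taps);
    return memcmp(sum_ref + stride, sum_test + stride,
                  (size_t)(stride - taps) * sizeof(uint16_t)) == 0;
}
```

A mismatch in row 0 is ignored by this check, while a mismatch in the written row is caught.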
Janne Grunau [Thu, 20 Aug 2015 11:55:54 +0000 (13:55 +0200)]
aarch64: Skip deblocking in x264_deblock_h_chroma_422_neon
If the parameters (alpha, beta, tc0[]) indicated that the deblocking
should have been skipped, every 2nd chroma line would have been deblocked
anyway.
deblock_h_chroma_422_neon: 2259 (before)
deblock_h_chroma_422_neon: 2192 (after)
Janne Grunau [Mon, 17 Aug 2015 14:39:20 +0000 (16:39 +0200)]
aarch64: Optimize various intra_predict asm functions
Make them at least as fast as the compiled C version (tested on
cortex-a53 vs. gcc 4.9.2).
                          C    NEON (before)  NEON (after)
intra_predict_4x4_dc:     260  335            260
intra_predict_4x4_dct:    210  265            200
intra_predict_8x8c_dc:    497  548            493
intra_predict_8x8c_v:     232  309            179 (arm64)
intra_predict_8x16c_dc:   795  830            790
Janne Grunau [Tue, 18 Aug 2015 08:25:10 +0000 (10:25 +0200)]
aarch64: Faster intra_predict_4x4_h
Use multiplication with 0x01010101 for splats.
On a cortex-a53:
                      gcc 4.9.2  llvm 3.6  neon (before)  neon (after)
intra_predict_4x4_h:  162        147       160/155        139/135
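The splat-by-multiplication trick, sketched in C rather than aarch64 assembly. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Replicate one byte into all four bytes of a 32-bit word:
 * v * 0x01010101 == v | v << 8 | v << 16 | v << 24 for v in 0..255. */
uint32_t demo_splat_u8(uint8_t v)
{
    return (uint32_t)v * 0x01010101u;
}

/* Fill one 4-pixel row with the same value using a single 32-bit store. */
void demo_fill_row4(uint8_t *dst, uint8_t v)
{
    assert(dst != NULL);
    uint32_t splat = demo_splat_u8(v);
    memcpy(dst, &splat, sizeof(splat));
}
```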
Janne Grunau [Tue, 18 Aug 2015 08:25:09 +0000 (10:25 +0200)]
aarch64: Fix coeff_level_run* macros with LLVM's assembler
LLVM's integrated assembler does not treat symbols as integer constants.
Janne Grunau [Tue, 18 Aug 2015 08:25:08 +0000 (10:25 +0200)]
aarch64: Remove commas LLVM's assembler complains about
Martin Storsjö [Thu, 13 Aug 2015 20:59:31 +0000 (23:59 +0300)]
arm: Implement x264_sub8x16_dct_dc_neon
checkasm timing Cortex-A7 A8 A9
sub8x16_dct_dc_c 6386 3901 4080
sub8x16_dct_dc_neon 1491 698 917
Martin Storsjö [Thu, 13 Aug 2015 20:59:28 +0000 (23:59 +0300)]
arm: Optimize x264_deblock_h_chroma_neon
Shuffle both chroma components together as a 16 bit unit, and
don't write the unchanged columns (like in x264_deblock_h_luma_neon
and in the aarch64 version of the function).
This causes a minor slowdown for x264_deblock_v_chroma_neon, but
it is negligible compared to the speedup.
checkasm timing Cortex-A7 A8 A9
deblock_chroma[1]_c 4817 4057 3601
deblock_chroma[1]_neon 1249 716 817 (before)
deblock_chroma[1]_neon 1249 766 845 (after)
deblock_h_chroma_420_c 3699 3275 2830
deblock_h_chroma_420_neon 2068 1414 1400 (before)
deblock_h_chroma_420_neon 1838 1355 1291 (after)
Martin Storsjö [Thu, 13 Aug 2015 20:59:27 +0000 (23:59 +0300)]
aarch64: Remove leftover commented out code
Martin Storsjö [Thu, 13 Aug 2015 20:59:26 +0000 (23:59 +0300)]
aarch64: Simplify the decimate_score functions
After doing a left shift by the number of bits returned by clz,
only bits set to zero can be shifted out, so if the register
was nonzero to start with (which is checked), it can't become
zero here.
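The argument can be checked with a small C sketch. It assumes a 32-bit `unsigned int` and the GCC/Clang builtin `__builtin_clz`; `demo_normalize` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Shift x left by its number of leading zero bits. Only zero bits are
 * shifted out, so the highest set bit lands in bit 31 and the result
 * is never zero. __builtin_clz is undefined for 0, hence the assert. */
uint32_t demo_normalize(uint32_t x)
{
    assert(x != 0);
    return x << __builtin_clz(x);
}
```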
Martin Storsjö [Thu, 13 Aug 2015 20:59:25 +0000 (23:59 +0300)]
arm: Use aligned loads in x264_coeff_last15_neon
After subtracting 2, the pointer will be aligned.
checkasm timing Cortex-A7 A8 A9
coeff_last15_c 423 375 230
coeff_last15_neon 350 420 404 (before)
coeff_last15_neon 350 400 394 (after)
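A sketch of the alignment reasoning, under the assumption that the 15 coefficients start one int16_t into a 16-byte aligned block. All names are hypothetical and C11 `alignas` is used.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Returns 1 when p is 16-byte aligned. */
int demo_is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15) == 0;
}

/* Returns 1 when the pointer to the 15 trailing coefficients is
 * misaligned but becomes aligned after stepping back 2 bytes. */
int demo_check_last15_alignment(void)
{
    alignas(16) int16_t block[16] = {0};
    const int16_t *ac = block + 1;                 /* 2 bytes past the aligned start */
    const uint8_t *base = (const uint8_t *)ac - 2; /* subtract 2 bytes */
    assert(base == (const uint8_t *)block);
    return !demo_is_aligned16(ac) && demo_is_aligned16(base);
}
```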
Martin Storsjö [Thu, 13 Aug 2015 20:59:24 +0000 (23:59 +0300)]
arm: Simplify x264_predict_8x8c_p_neon
This gets rid of a few unnecessary (and confusing) steps in
calculating the increment to i00.
checkasm timing Cortex-A7 A8 A9
intra_predict_8x8c_p_c 5525 4732 4755
intra_predict_8x8c_p_neon 1719 1140 1262 (before)
intra_predict_8x8c_p_neon 1663 1142 1255 (after)
Vittorio Giovara [Tue, 15 Sep 2015 13:40:14 +0000 (15:40 +0200)]
lavf: Use the prefixed name for pixel format enum
Janne Grunau [Wed, 2 Sep 2015 22:21:58 +0000 (00:21 +0200)]
aarch64: fix x264_mbtree_propagate_cost_neon
The branch condition caused the loop to execute one time more than
intended. Detected by a memory corruption on arm with the 1 to 1 port of
the function.
Martin Storsjö [Thu, 13 Aug 2015 20:59:22 +0000 (23:59 +0300)]
aarch64: Fix integral_init4/8h_neon
The stride is the number of uint16_t elements and thus needs
to be shifted.
This issue had slipped unnoticed since checkasm didn't actually
verify the output of these functions.
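The element-to-byte conversion looks like this in C. The helper name is hypothetical; in assembly the same scaling is a left shift by one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offset of row `row` when `stride` counts uint16_t elements.
 * Shifting left by 1 is the same as multiplying by sizeof(uint16_t). */
size_t demo_row_byte_offset(int stride, int row)
{
    assert(stride >= 0 && row >= 0);
    return ((size_t)stride << 1) * (size_t)row;
}
```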
Henrik Gramner [Thu, 27 Aug 2015 17:53:00 +0000 (19:53 +0200)]
x86: Fix integral_init4/8h_avx2
The AVX2 implementation was using the wrong offsets. It went undetected due to
the checkasm test being incorrect.
Mark Webster [Wed, 5 Aug 2015 03:28:17 +0000 (04:28 +0100)]
Simplify inclusion of x264.h in C++ projects
Name all structs to support forward declarations.
Add a conditional extern "C" wrapper in x264.h itself instead of having to
specify it in every location where it's included.
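The general pattern, shown with a hypothetical `demo_param_t` instead of the real x264.h declarations:

```c
#include <assert.h>

/* The guard makes the declarations use C linkage when the header is
 * included from C++, and is a no-op when compiled as C. */
#ifdef __cplusplus
extern "C" {
#endif

/* Naming the struct (not only the typedef) lets other headers write
 * `struct demo_param_t;` as a forward declaration. */
typedef struct demo_param_t
{
    int i_width;
    int i_height;
} demo_param_t;

int demo_param_pixels(const demo_param_t *param);

#ifdef __cplusplus
}
#endif

int demo_param_pixels(const demo_param_t *param)
{
    assert(param != 0);
    return param->i_width * param->i_height;
}
```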
Henrik Gramner [Sun, 16 Aug 2015 19:59:26 +0000 (21:59 +0200)]
checkasm: Properly save rdx/edx in checkasm_call() on x86
If the return value doesn't fit in a single register rdx/edx can in some
cases be used in addition to rax/eax.
Doesn't affect any of the existing checkasm tests but it's more correct
behavior and it might be useful in the future.
Henrik Gramner [Tue, 11 Aug 2015 15:19:35 +0000 (17:19 +0200)]
x86: Enable SSE2 by default on x86-32
It makes more sense to tune the defaults to benefit the vast majority of users.
Anyone still using a Pentium III for video encoding is of course free to
explicitly set different flags when compiling.
Henrik Gramner [Mon, 10 Aug 2015 20:30:21 +0000 (22:30 +0200)]
msvs/icl: Improve default CFLAGS
Use -fp:fast as a substitute for -ffast-math.
Increase warning level from -W0 to -W1 (the default setting).
Disable -GS (stack cookies) on MSVS. It's disabled by default on ICL.
Henrik Gramner [Wed, 12 Aug 2015 20:23:31 +0000 (22:23 +0200)]
Use a relative $SRCPATH for out-of-tree builds
Fixes out-of-tree MSVS builds on Cygwin.
Henrik Gramner [Sat, 8 Aug 2015 20:26:38 +0000 (22:26 +0200)]
cygwin: Enable MSVS support
`cl -showIncludes` creates absolute Windows paths for some files, attempt
to convert those to Unix paths.
Use relative paths for dependencies located in or below the working directory
in order to mimic the behavior of gcc and to make the paths more readable.
Make the dependency generation script a bit more robust in general.
Henrik Gramner [Sat, 8 Aug 2015 16:34:21 +0000 (18:34 +0200)]
cltostr.sh: Minor fixes
Henrik Gramner [Sat, 8 Aug 2015 10:21:54 +0000 (12:21 +0200)]
Simplify version.sh
Also remove some non-POSIX syntax and improve robustness.
As a bonus the script now runs about 2-3 times faster.
`git rev-list --count` could be used to simplify things even further,
but that functionality was added in git 1.7.2 so keep `wc -l` for now
to maintain compatibility with older git versions.
장영훈 [Fri, 7 Aug 2015 05:43:24 +0000 (14:43 +0900)]
msvs: Fix cl detection in non-English environments
Henrik Gramner [Mon, 3 Aug 2015 19:05:11 +0000 (21:05 +0200)]
x86inc: Sync minor changes from ffmpeg/libav
Henrik Gramner [Wed, 29 Jul 2015 17:30:52 +0000 (19:30 +0200)]
matroska: Add comments for the remaining element names
Henrik Gramner [Wed, 29 Jul 2015 17:30:41 +0000 (19:30 +0200)]
Silence various static analyzer warnings
Those are false positives, but it doesn't hurt to get rid of them.
Henrik Gramner [Sun, 26 Jul 2015 21:13:29 +0000 (23:13 +0200)]
mingw: Enable the tsaware linker flag
Avoids an irrelevant compatibility layer in Terminal Services environments.
https://msdn.microsoft.com/en-us/library/cc834995.aspx
Henrik Gramner [Sun, 26 Jul 2015 21:13:26 +0000 (23:13 +0200)]
msvs: Don't redefine snprintf for VS2015
Visual Studio 2015 has a proper snprintf implementation.
Henrik Gramner [Sun, 26 Jul 2015 21:13:19 +0000 (23:13 +0200)]
msvs: Prefer link.exe from the same directory as cl.exe
/usr/bin/link from coreutils may be located before the MSVS linker in $PATH
which causes linking to fail due to using the wrong binary.
Henrik Gramner [Sun, 26 Jul 2015 22:10:00 +0000 (00:10 +0200)]
frame_dump: check fseek() return value
Henrik Gramner [Sun, 26 Jul 2015 22:08:38 +0000 (00:08 +0200)]
x264_vfprintf: use va_copy
It's undefined behavior to use the same va_list twice.
This most likely didn't cause any issues in practice since the string would
have to be larger than 4 KiB to trigger the fallback path.
Use workaround for ICL as it doesn't define va_copy even for C99.
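A minimal sketch of the two-pass pattern with va_copy. It is not the x264_vfprintf code: the function names are hypothetical, and it measures the length first instead of trying a fixed-size stack buffer.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Format into a freshly allocated string. Each vsnprintf pass consumes
 * a va_list, so the second pass needs its own copy. */
char *demo_vformat(const char *fmt, va_list arg)
{
    assert(fmt != NULL);
    va_list arg2;
    va_copy(arg2, arg); /* copy before the first pass consumes `arg` */

    /* First pass: measure the formatted length without writing anything. */
    int length = vsnprintf(NULL, 0, fmt, arg);

    char *out = NULL;
    if (length >= 0)
        out = malloc((size_t)length + 1);
    if (out)
        vsnprintf(out, (size_t)length + 1, fmt, arg2); /* second pass uses the copy */

    va_end(arg2);
    return out; /* caller frees; NULL on error */
}

char *demo_format(const char *fmt, ...)
{
    va_list arg;
    va_start(arg, fmt);
    char *out = demo_vformat(fmt, arg);
    va_end(arg);
    return out;
}
```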
Henrik Gramner [Sun, 26 Jul 2015 22:08:31 +0000 (00:08 +0200)]
param_parse: Fix framerate rounding issues
Marcin Juszkiewicz [Mon, 1 Jun 2015 09:24:45 +0000 (11:24 +0200)]
aarch64: Remove broken CFLAGS in configure
GCC doesn't have an "-arch" switch, but works when that entire line is removed.
Rong Yan [Mon, 20 Jul 2015 08:34:20 +0000 (03:34 -0500)]
ppc: Add little-endian PowerPC support
Rishikesh More [Thu, 18 Jun 2015 12:18:46 +0000 (17:48 +0530)]
mips: MSA quant optimizations
Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>
Rishikesh More [Thu, 18 Jun 2015 12:18:45 +0000 (17:48 +0530)]
mips: MSA predict optimizations
Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>
Rishikesh More [Thu, 18 Jun 2015 12:18:44 +0000 (17:48 +0530)]
mips: MSA pixel optimizations
Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>
Rishikesh More [Thu, 18 Jun 2015 12:18:43 +0000 (17:48 +0530)]
mips: MSA deblock optimizations
Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>
Rishikesh More [Thu, 18 Jun 2015 12:18:42 +0000 (17:48 +0530)]
mips: MSA dct optimizations
Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>
Rishikesh More [Thu, 18 Jun 2015 12:18:40 +0000 (17:48 +0530)]
mips: MSA mc optimizations
Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>
Rishikesh More [Thu, 18 Jun 2015 12:18:38 +0000 (17:48 +0530)]
mips: Common MSA macros
Add macros for load/store, slide, shift, transpose and basic arithmetic
operations required by subsequent patches.
Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>
Rishikesh More [Tue, 12 May 2015 14:08:09 +0000 (19:38 +0530)]
mips: Add MSA support to checkasm
Signed-off-by: Rishikesh More <rishikesh.more@imgtec.com>
Kaustubh Raste [Fri, 17 Apr 2015 12:08:58 +0000 (17:38 +0530)]
mips: Initial MSA support
MSA is the MIPS SIMD Architecture.
Add X264_CPU_MSA define.
Update configure to detect MIPS platform and set flags.
CPU-specific gcc options are expected through --extra-cflags.
Sample command line for mips32r5:
./configure --host=mipsel-linux-gnu --cross-prefix=<TOOLCHAIN>/mips-mti-linux-gnu-
--extra-cflags="-EL -mips32r5 -msched-weight -mload-store-pairs"
Signed-off-by: Kaustubh Raste <kaustubh.raste@imgtec.com>
Anton Mitrofanov [Thu, 16 Jul 2015 21:22:29 +0000 (00:22 +0300)]
Limit autodetection of threads number according to the source height
Anton Mitrofanov [Thu, 16 Jul 2015 16:04:59 +0000 (19:04 +0300)]
Fine-tune of frame's size predictors at ratecontrol start
This is an attempt to improve VBV at the start of a video when many threads
are used, which delays feedback for the predictors.
Anton Mitrofanov [Thu, 16 Jul 2015 13:15:56 +0000 (16:15 +0300)]
Use forced frame types in slicetype analysis
This should improve MBTree and VBV when a lot of forced frame types are used.
Henrik Gramner [Mon, 1 Dec 2014 21:05:42 +0000 (22:05 +0100)]
x86: SSSE3 and AVX2 implementations of plane_copy_swap
For NV21 input.
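NV21 stores interleaved chroma as V,U pairs while NV12 stores U,V pairs, so the conversion is a byte swap within each pair. A scalar C sketch of that operation follows; the function name and signature are hypothetical and do not match the x264 prototype.

```c
#include <assert.h>
#include <stdint.h>

/* Copy `pairs` interleaved chroma pairs from src to dst, swapping the
 * two bytes of each pair (VU -> UV, or the reverse). */
void demo_swap_chroma_pairs(uint8_t *dst, const uint8_t *src, int pairs)
{
    assert(pairs >= 0);
    for (int x = 0; x < pairs; x++) {
        dst[2 * x]     = src[2 * x + 1];
        dst[2 * x + 1] = src[2 * x];
    }
}
```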
Yu Xiaolei [Fri, 6 Jun 2014 08:05:27 +0000 (16:05 +0800)]
NV21 input support
Eliminates an extra copy when encoding Android camera preview images.
Checkasm test by Janne Grunau.
ARM assembly with improvements from Janne Grunau.
Henrik Gramner [Tue, 23 Jun 2015 15:00:47 +0000 (17:00 +0200)]
deblock: Write combining
Henrik Gramner [Tue, 23 Jun 2015 12:59:59 +0000 (14:59 +0200)]
Get rid of some tabs and trailing whitespaces
Henrik Gramner [Sat, 23 May 2015 17:44:16 +0000 (19:44 +0200)]
x86: Experimental nasm support
Enables the use of nasm as an alternative to yasm.
Note that nasm cannot assemble x264 with PIC enabled since it currently doesn't
support [symbol-$$] addressing which is used extensively by x264's PIC code.
This includes all 64-bit Windows and 64-bit OS X builds, even non-shared.
For the above reason nasm is currently intentionally not auto-detected, instead
the assembler must be explicitly specified using "AS=nasm ./configure".
Also drop -O2 from ASFLAGS since it's simply ignored anyway.
Timothy Gu [Tue, 26 May 2015 17:12:42 +0000 (19:12 +0200)]
x86inc: Prevent warnings when using `struc` and `endstruc`
struc and endstruc attempt to revert to the previous section state set by
the SECTION macro.
Use the primitive [SECTION] directive instead of the SECTION macro for the
.note.GNU-stack section to prevent it from being emitted again during endstruc.
Henrik Gramner [Wed, 27 May 2015 19:38:14 +0000 (21:38 +0200)]
x86inc: Drop SECTION_TEXT macro
The .text section is already 16-byte aligned by default on all supported
platforms so `SECTION_TEXT` isn't any different from `SECTION .text`.
Henrik Gramner [Sat, 23 May 2015 11:38:05 +0000 (13:38 +0200)]
x86inc: Disable vpbroadcastq workaround in newer yasm versions
The bug was fixed in 1.3.0, so only perform the workaround in earlier versions.
Henrik Gramner [Sun, 24 May 2015 20:57:00 +0000 (22:57 +0200)]
Prefer Unicode versions of Windows API calls
Just for consistency, doesn't affect behavior.
Henrik Gramner [Sun, 24 May 2015 21:21:20 +0000 (23:21 +0200)]
Get rid of fPIC warnings when compiling a shared library on Windows
PIC is always enabled when compiling for Windows so gcc complains when using
-fPIC since it doesn't do anything.
Henrik Gramner [Sat, 25 Jul 2015 20:42:59 +0000 (22:42 +0200)]
matroska: Write the correct DocTypeVersion when using frame-packing
The StereoMode element is only valid with DocTypeVersion 3 or higher.
Anton Mitrofanov [Fri, 24 Jul 2015 21:21:52 +0000 (00:21 +0300)]
dump_yuv: Fix file handle leak
Anton Mitrofanov [Fri, 24 Jul 2015 21:20:47 +0000 (00:20 +0300)]
mp4: Fix file handle leak
Henrik Gramner [Tue, 23 Jun 2015 22:40:45 +0000 (00:40 +0200)]
flv: Check fseek() and fwrite() return values
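The kind of checking meant here, as a generic C sketch. `demo_rewrite_u32` is a hypothetical helper, not a function from the flv muxer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Overwrite 4 bytes at `offset`, then seek back to the end of the file.
 * Returns 0 on success and -1 if any seek or write fails. */
int demo_rewrite_u32(FILE *fp, long offset, const uint8_t bytes[4])
{
    assert(fp != NULL);
    if (fseek(fp, offset, SEEK_SET) != 0)
        return -1;
    if (fwrite(bytes, 1, 4, fp) != 4)
        return -1;
    if (fseek(fp, 0, SEEK_END) != 0)
        return -1;
    return 0;
}
```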
Henrik Gramner [Tue, 23 Jun 2015 22:22:56 +0000 (00:22 +0200)]
flv: Fix memory and file handle leaks
Henrik Gramner [Tue, 23 Jun 2015 23:23:35 +0000 (01:23 +0200)]
avs: Fix file handle leak
Henrik Gramner [Tue, 23 Jun 2015 11:38:02 +0000 (13:38 +0200)]
matroska: Fix memory leak