granicus.if.org Git - libx264/log
Anton Mitrofanov [Sat, 22 Oct 2011 15:41:07 +0000 (19:41 +0400)]
Improve yasm version check
Previous check allowed certain earlier versions that weren't fully compatible.
Fiona Glaser [Tue, 18 Oct 2011 21:30:26 +0000 (14:30 -0700)]
Add fenc prefetching to adaptive quant
Far fewer cache misses, so adaptive quant is faster.
Fiona Glaser [Tue, 18 Oct 2011 21:14:03 +0000 (14:14 -0700)]
Split prefetch_fenc between colorspaces
Add 4:2:2 version.
Fiona Glaser [Wed, 12 Oct 2011 00:04:32 +0000 (17:04 -0700)]
Some more 4:2:2 x86 asm
coeff_last8, coeff_level_run8, var2_8x16, predict_8x16c_dc, satd_4x16, intra_mbcmp_8x16c_x3, deblock_h_chroma_422
Loren Merritt [Tue, 11 Oct 2011 18:12:43 +0000 (18:12 +0000)]
Remove obsolete versions of intra_mbcmp_x3
intra_mbcmp_x3 is unnecessary if x9 exists (SSSE3 and onwards).
Loren Merritt [Mon, 10 Oct 2011 05:42:36 +0000 (05:42 +0000)]
SSSE3/SSE4/AVX 9-way fully merged i8x8 analysis (sa8d_x9)
x86_64 only for now, due to register requirements (like sa8d_x3).
i8x8 analysis cycles (per partition):
penryn    sandybridge  bulldozer
616->600  482->374     418->356   preset=faster
892->632  725->387     598->373   preset=medium
948->650  789->409     673->383   preset=slower
Fiona Glaser [Sat, 1 Oct 2011 02:09:19 +0000 (19:09 -0700)]
SSSE3/SSE4/AVX 9-way fully merged i8x8 analysis (sad_x9)
~3 times faster than current analysis, plus (like intra_sad_x9_4x4) analyzes all modes without shortcuts.
Loren Merritt [Wed, 5 Oct 2011 20:29:21 +0000 (13:29 -0700)]
Merge i4x4 prediction with intra_mbcmp_x9_4x4
Avoids a redundant prediction after analysis.
Fiona Glaser [Wed, 5 Oct 2011 20:17:31 +0000 (13:17 -0700)]
Inline i4x4/i8x8 encode into intra analysis
Larger code size, but faster.
Fiona Glaser [Thu, 22 Sep 2011 00:12:10 +0000 (17:12 -0700)]
Initial XOP and FMA4 support on AMD Bulldozer
~10% faster Hadamard functions (SATD/SA8D/hadamard_ac) plus other improvements.
Mans Rullgard [Tue, 27 Sep 2011 17:14:14 +0000 (21:14 +0400)]
ARM: update NEON chroma deblock functions to NV12 pixel format
Sean McGovern [Mon, 17 Oct 2011 19:45:15 +0000 (12:45 -0700)]
Add /usr/lib/{64/}values-xpg6.o to $LDFLAGS on Solaris
This is required for POSIX.1-2001 compliance.
Sean McGovern [Mon, 17 Oct 2011 19:44:03 +0000 (12:44 -0700)]
Fix linker test for -Bsymbolic
The Solaris linker only accepts -Bsymbolic for objects compiled in dynamic mode (i.e. shared objects), so pass -shared to gcc.
Additionally, for x86_32 unresolved textrels cause a linker error so mark the .text section as 'impure'.
Sean McGovern [Mon, 17 Oct 2011 19:43:28 +0000 (12:43 -0700)]
Add $SOFLAGS to exported SOFLAGS make variable
Henrik Gramner [Sat, 24 Sep 2011 13:56:08 +0000 (15:56 +0200)]
Allow setting a chroma format at compile time
Gives a slight speed increase and significant binary size reduction when only one chroma format is needed.
Harfe Leier [Fri, 30 Sep 2011 19:49:33 +0000 (12:49 -0700)]
Improve profile help
List high422/high444 profiles, and don't show non-high-bit-depth profiles in high bit depth builds.
Yusuke Nakamura [Wed, 19 Oct 2011 18:09:51 +0000 (03:09 +0900)]
Fix infinite loop parsing TDecimate Mode 3 timecode v1 files
Fiona Glaser [Tue, 11 Oct 2011 00:44:31 +0000 (17:44 -0700)]
Fix some integer overflows/signedness errors found by IOC
The only real bug here is in slicetype.c, which may or may not affect real encodes.
Fiona Glaser [Wed, 12 Oct 2011 16:16:32 +0000 (09:16 -0700)]
Fix pixel_var2 with 4:2:2 encoding
Might have caused artifacts or suboptimal chroma compression.
Anton Mitrofanov [Sun, 9 Oct 2011 15:14:16 +0000 (19:14 +0400)]
Fix chroma intra analysis in 4:4:4 lossless mode
Anton Mitrofanov [Sat, 8 Oct 2011 21:13:29 +0000 (01:13 +0400)]
Fix use of uninitialized MVs in sub8x8 RDO
Fabian Greffrath [Sat, 8 Oct 2011 02:04:17 +0000 (19:04 -0700)]
Fix detection of Alpha CPU arch on alphaev67
Fiona Glaser [Wed, 14 Sep 2011 21:53:04 +0000 (14:53 -0700)]
Optimize x86 asm for Intel macro-op fusion
That is, place all loop counter tests right before their conditional jumps.
Fiona Glaser [Mon, 12 Sep 2011 18:51:23 +0000 (11:51 -0700)]
CAVLC: clean up and restructure
Somewhat faster CAVLC and RD bit-counting.
Fiona Glaser [Fri, 9 Sep 2011 00:27:02 +0000 (17:27 -0700)]
CABAC: clean up and restructure
Somewhat faster CABAC and RD bit-counting.
Fiona Glaser [Sun, 4 Sep 2011 09:31:29 +0000 (11:31 +0200)]
Some initial 4:2:2 x86 asm
Henrik Gramner [Fri, 26 Aug 2011 13:57:04 +0000 (15:57 +0200)]
4:2:2 encoding support
Loren Merritt [Mon, 15 Aug 2011 18:18:55 +0000 (18:18 +0000)]
SSSE3/SSE4 9-way fully merged i4x4 analysis (sad/satd_x9)
i4x4 analysis cycles (per partition):
penryn    sandybridge
184-> 75  157-> 54     preset=superfast (sad)
281->165  225->124     preset=faster (satd with early termination)
332->165  263->124     preset=medium
379->165  297->124     preset=slower (satd without early termination)
This is the first code in x264 that intentionally produces different behavior
on different cpus: satd_x9 is implemented only on ssse3+ and checks all intra
directions, whereas the old code (on fast presets) may early terminate after
checking only some of them. There is no systematic difference on slow presets,
though they still occasionally disagree about tiebreaks.
For ease of debugging, add an option "--cpu-independent" to disable satd_x9
and any analogous future code.
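To illustrate why an exhaustive 9-way check and an early-terminating search can pick different modes, here is a minimal sketch. The function names, the cost array, and the threshold rule are all hypothetical; this is not x264's actual analysis code, only the general shape of the two strategies.

```c
#include <limits.h>

enum { NUM_MODES = 9 }; /* nine intra prediction directions */

/* Hypothetical exhaustive search: evaluate every mode and return the
 * index of the cheapest one (lowest index wins ties). */
static int best_mode_exhaustive(const int cost[NUM_MODES])
{
    int best = 0, best_cost = INT_MAX;
    for (int i = 0; i < NUM_MODES; i++) {
        if (cost[i] < best_cost) {
            best_cost = cost[i];
            best = i;
        }
    }
    return best;
}

/* Hypothetical early-terminating search: stop as soon as a mode's cost
 * drops below the threshold, without looking at the remaining modes. */
static int best_mode_early(const int cost[NUM_MODES], int threshold)
{
    int best = 0, best_cost = INT_MAX;
    for (int i = 0; i < NUM_MODES; i++) {
        if (cost[i] < best_cost) {
            best_cost = cost[i];
            best = i;
        }
        if (best_cost < threshold)
            break; /* "good enough": later, cheaper modes are never seen */
    }
    return best;
}
```

With costs {50, 40, 30, 45, 10, 60, 70, 80, 90} and a threshold of 35, the early-terminating version stops at mode 2 while the exhaustive version finds mode 4, which is the kind of divergence the commit message describes.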
Loren Merritt [Mon, 15 Aug 2011 17:43:42 +0000 (17:43 +0000)]
Faster intra_mbcmp_x3 for versions without dedicated asm
Select asm subroutines more intelligently in the wrapper functions.
Loren Merritt [Sat, 13 Aug 2011 19:01:22 +0000 (19:01 +0000)]
Optimize x86 intra_predict_4x4 and 8x8
High bit depth Penryn, Sandybridge cycles:
4x4_ddl: 11->10,  9-> 8
4x4_ddr: 15->13, 12->11
4x4_hd:        , 15->12
4x4_hu:        , 14->13
4x4_vr:  15->14, 14->12
8x8_ddl: 32->19, 19->14
8x8_ddr: 42->19, 21->14
8x8_hd:        , 15->13
8x8_hu:  21->17, 16->12
8x8_vr:  33->19,
8-bit Penryn, Sandybridge cycles:
4x4_ddr: 24->15,
4x4_hd:  24->16,
4x4_hu:  23->15,
4x4_vr:  23->16,
4x4_vl:  10-> 9,
8x8_ddl: 23->15,
8x8_hd:        , 17->14
8x8_hu:        , 15->14
8x8_vr:  20->16, 17->13
Loren Merritt [Sat, 13 Aug 2011 06:44:28 +0000 (06:44 +0000)]
Use realistic alignment for intra pred benchmarks in checkasm
Yusuke Nakamura [Tue, 20 Sep 2011 16:15:38 +0000 (01:15 +0900)]
Fix frame packing SEI with --frame-packing 0
According to the spec, when frame_packing_arrangement_type is equal to 0, quincunx_sampling_flag shall be equal to 1.
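The rule quoted above can be sketched as follows. The struct and function are hypothetical names used only for illustration, and setting the flag to 0 for every other arrangement type is an assumption of this sketch, not something the commit message states.

```c
/* Hypothetical container for the two SEI fields discussed above. */
typedef struct {
    int frame_packing_arrangement_type;
    int quincunx_sampling_flag;
} frame_packing_sei_t;

/* Derive quincunx_sampling_flag from the arrangement type:
 * type 0 requires the flag to be 1; other types get 0 here (assumption). */
static frame_packing_sei_t make_frame_packing_sei(int type)
{
    frame_packing_sei_t sei;
    sei.frame_packing_arrangement_type = type;
    sei.quincunx_sampling_flag = (type == 0);
    return sei;
}
```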
Oka Motofumi [Mon, 5 Sep 2011 02:50:37 +0000 (11:50 +0900)]
Fix install/uninstall shared libs if SYS is WINDOWS/CYGWIN
Reinhard Tartler [Wed, 10 Aug 2011 07:16:46 +0000 (00:16 -0700)]
Add Hurd support to configure
Loren Merritt [Sat, 13 Aug 2011 00:39:35 +0000 (00:39 +0000)]
Optimize x86 intra_satd_x3_*
~7% faster.
Loren Merritt [Fri, 12 Aug 2011 19:13:07 +0000 (19:13 +0000)]
Optimize x86 intra_sa8d_x3_8x8
~40% faster.
Also some other minor asm cosmetics.
Loren Merritt [Fri, 12 Aug 2011 02:15:46 +0000 (02:15 +0000)]
Scale interlaced refs/mvs for mvr predictors
Slightly improves compression and fixes a Valgrind error.
Loren Merritt [Thu, 11 Aug 2011 15:03:12 +0000 (15:03 +0000)]
Optimize predict_8x8_filter and incidentally remove a valgrind false-positive
Anton Mitrofanov [Mon, 15 Aug 2011 08:22:18 +0000 (12:22 +0400)]
Don't override flat SSE2 dequant functions with non-flat AVX ones
Slightly faster.
Loren Merritt [Mon, 8 Aug 2011 13:40:53 +0000 (13:40 +0000)]
Shut up some valgrind false-positives
Fiona Glaser [Tue, 16 Aug 2011 20:02:24 +0000 (13:02 -0700)]
Avoid some unnecessary allocations with B-frames/CABAC off
Fiona Glaser [Tue, 23 Aug 2011 00:07:03 +0000 (17:07 -0700)]
Fix typo in p8x8 RD analysis
Passed wrong idx to trellis.
Anton Mitrofanov [Sat, 20 Aug 2011 22:44:45 +0000 (02:44 +0400)]
Fix invalid memory accesses in x86 lowres_init when width <= 16
Anton Mitrofanov [Mon, 15 Aug 2011 08:03:09 +0000 (12:03 +0400)]
Fix intermediate conversion for YUVJ* pixfmts with 4:4:4 encoding
Henrik Gramner [Sun, 14 Aug 2011 11:39:29 +0000 (13:39 +0200)]
Fix pic_out returned by x264_encoder_encode with 4:4:4
Loren Merritt [Thu, 11 Aug 2011 22:12:26 +0000 (22:12 +0000)]
Fix zeroing of mvr predictors in bskip blocks
Loren Merritt [Thu, 11 Aug 2011 01:33:13 +0000 (01:33 +0000)]
Fix: chroma planes for weightp analysis were not initialized if U early-terminates and V doesn't.
Henrik Gramner [Wed, 10 Aug 2011 18:25:07 +0000 (20:25 +0200)]
Expand borders before chroma weightp analysis
Prevents mc from using uninitialized source pixels.
Henrik Gramner [Wed, 10 Aug 2011 17:29:14 +0000 (19:29 +0200)]
Another 4:4:4 chroma weightp bug fix
Fiona Glaser [Wed, 10 Aug 2011 07:17:26 +0000 (00:17 -0700)]
Fix typo in help
Fiona Glaser [Sat, 6 Aug 2011 17:45:47 +0000 (10:45 -0700)]
Improve support for varying resolution between passes
Should give much better quality, but still doesn't support MB-tree yet.
Also check for the same interlaced options between passes.
Various minor ratecontrol cosmetics.
Loren Merritt [Sun, 7 Aug 2011 22:57:27 +0000 (22:57 +0000)]
asm cosmetics: base-4 constants for shuffles
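As background on why base-4 notation suits shuffles: an x86 shuffle immediate such as the one taken by pshufd packs four 2-bit lane selectors into one byte, so writing it as four base-4 digits makes each selector readable. The helper below is a hypothetical C illustration of that packing, not code from the commit.

```c
#include <stdint.h>

/* Hypothetical helper: pack four 2-bit lane selectors, given most
 * significant digit first, into an 8-bit shuffle immediate. */
static uint8_t shuffle_imm8(unsigned d3, unsigned d2, unsigned d1, unsigned d0)
{
    return (uint8_t)(((d3 & 3u) << 6) | ((d2 & 3u) << 4) |
                     ((d1 & 3u) << 2) |  (d0 & 3u));
}
```

For example, the digits 3,2,1,0 give 0xE4 (the identity shuffle) and 0,1,2,3 give 0x1B (lane reversal).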
Loren Merritt [Wed, 3 Aug 2011 14:58:50 +0000 (14:58 +0000)]
Enable some existing asm functions that were missing function pointers
pixel_ads1_avx, predict_8x8_hd_avx
High bit depth mc_copy_w8_sse2, denoise_dct_avx, prefetch_fenc/ref, and several pixel*sse4.
Loren Merritt [Wed, 3 Aug 2011 14:57:06 +0000 (14:57 +0000)]
Remove some unused, broken, and/or useless functions
Unused frame_sort.
Unused x86_64 dequant_4x4dc_mmx2, predict_8x8_vr_mmx2.
Unused and broken high_depth integral_init*h_sse4, optimize_chroma_*, dequant_flat_*, sub8x8_dct_dc_*, zigzag_sub_*.
Useless high_depth dequant_sse4, dequant_dc_sse4.
Loren Merritt [Wed, 3 Aug 2011 14:56:27 +0000 (14:56 +0000)]
asm cosmetics: merge all the variants of ABS macros
Loren Merritt [Wed, 3 Aug 2011 14:53:29 +0000 (14:53 +0000)]
asm cosmetics part 2
These changes were split out of the cpuflags commit because they change the output executable.
Loren Merritt [Wed, 3 Aug 2011 14:46:41 +0000 (14:46 +0000)]
asm cosmetics: INIT_MMX/XMM/YMM now support a cpuflags argument
Reduces the number of macro args that need to be passed around.
Allows multiple implementations of a given macro (e.g. PALIGNR) to check
cpuflags at the location where the macro is defined, instead of having
to select implementations by %define at toplevel.
Remove INIT_AVX, as it's replaced by "INIT_XMM avx".
This commit does not change the stripped executable.
Loren Merritt [Wed, 3 Aug 2011 14:43:34 +0000 (14:43 +0000)]
Import x86inc.asm patches from libav
Loren Merritt [Wed, 3 Aug 2011 14:42:12 +0000 (14:42 +0000)]
Cosmetics: s/mmxext/mmx2/
Henrik Gramner [Sun, 7 Aug 2011 09:58:36 +0000 (11:58 +0200)]
Fix two bugs in 4:4:4 chroma weightp analysis
Caused slightly worse compression.
Loren Merritt [Wed, 3 Aug 2011 14:40:01 +0000 (14:40 +0000)]
Fix "--asm avx"
Previously required "--asm sse2fast,fastshuffle,sse4.2,avx".
Anton Mitrofanov [Fri, 5 Aug 2011 11:59:20 +0000 (15:59 +0400)]
Re-add support for glibc <2.6, which doesn't have CPU_COUNT
Yasuhiro Ikeda [Mon, 1 Aug 2011 23:59:15 +0000 (08:59 +0900)]
Avoid using deprecated libavformat functions
Replace av_find_stream_info with avformat_find_stream_info.
Now requires libavformat 53.3.0 or newer.
Henrik Gramner [Wed, 27 Jul 2011 00:23:12 +0000 (02:23 +0200)]
Use assembly versions of some deblocking functions in MBAFF
Anton Mitrofanov [Wed, 27 Jul 2011 20:26:27 +0000 (00:26 +0400)]
Move X264_VERSION / X264_POINTVER from config.h to x264_config.h
This makes them available to external programs as part of the public API.
Henrik Gramner [Fri, 29 Jul 2011 18:15:52 +0000 (20:15 +0200)]
Fix padding bug in x264_expand_border_mbpair
Yusuke Nakamura [Fri, 29 Jul 2011 14:39:26 +0000 (23:39 +0900)]
Timecode parsing: Add missing initialization
Fix crash when parsing the timecode file fails before pts is allocated.
Fix detection of user timebase considered to be exceeding H.264 maximum.
Anton Mitrofanov [Thu, 28 Jul 2011 09:37:24 +0000 (13:37 +0400)]
Fix crash with high bitdepth 4:2:0 input
Daniel Kang [Wed, 27 Jul 2011 01:57:39 +0000 (21:57 -0400)]
x86 asm cosmetics
Use FDEC_STRIDEB where appropriate.
Fiona Glaser [Tue, 26 Jul 2011 14:40:23 +0000 (07:40 -0700)]
Fix a bug in lossless sub-8x8 RD
Caused crashes in rare cases with lossless encoding. Regression in 4:4:4.
Fiona Glaser [Tue, 19 Jul 2011 06:10:30 +0000 (23:10 -0700)]
Improved p8x4/4x8 search decision
Use the same thresholding as for p16x8/8x16.
Does p8x4/4x8 search more often, for a small compression improvement.
Dan Larkin [Wed, 13 Jul 2011 17:45:23 +0000 (12:45 -0500)]
Add --subme 11, which disables all early terminations in analysis
Necessary for a future trellis mode decision/motion estimation patch.
Also add the slowest presets to the regression test.
Dan Larkin [Wed, 13 Jul 2011 16:33:48 +0000 (11:33 -0500)]
Some trivial changes to RD thresholds
The output-changing portion of the next patch.
Anton Mitrofanov [Wed, 20 Jul 2011 18:54:43 +0000 (22:54 +0400)]
Allow setting a wider range of chroma QP offsets
This allows use of the full range of chroma QP offsets, even in combination with the automatic psy-based adjustments.
Fiona Glaser [Fri, 15 Jul 2011 20:24:38 +0000 (13:24 -0700)]
Optimize macroblock_deblock_strength, add more early terminations
Fiona Glaser [Fri, 15 Jul 2011 01:23:44 +0000 (18:23 -0700)]
Function-pointerify MBAFF deblocking functions
Fiona Glaser [Thu, 14 Jul 2011 21:04:11 +0000 (14:04 -0700)]
Clean up MBAFF deblocking code
Fiona Glaser [Wed, 13 Jul 2011 00:27:18 +0000 (17:27 -0700)]
Optimize frame_deblock_row
Henrik Gramner [Wed, 20 Jul 2011 20:30:59 +0000 (22:30 +0200)]
Shrink two arrays
Anton Mitrofanov [Mon, 18 Jul 2011 11:20:05 +0000 (15:20 +0400)]
Add support for the new (4:4:4) colorspaces to x264_picture_alloc
Anton Mitrofanov [Wed, 20 Jul 2011 14:06:41 +0000 (18:06 +0400)]
Various cosmetics
Yasuhiro Ikeda [Tue, 12 Jul 2011 14:41:42 +0000 (23:41 +0900)]
Improve configure help
Yasuhiro Ikeda [Tue, 12 Jul 2011 05:46:29 +0000 (14:46 +0900)]
Use $optarg for some configure options
Rafaël Carré [Fri, 15 Jul 2011 01:51:43 +0000 (18:51 -0700)]
Linux x264_cpu_num_processors(): use glibc macros
The cpu_set_t structure is considered opaque.
Also handle sched_getaffinity() error case if "cpusetsize is smaller than the size of the affinity mask used by the kernel."
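A minimal sketch of the approach described above, assuming Linux with glibc. The function name and the fallback value of 1 are assumptions of this sketch, not x264's actual code; the manual loop covers glibc versions that lack CPU_COUNT.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* needed for the CPU_* macros and sched_getaffinity */
#endif
#include <sched.h>

/* Hypothetical helper: count the CPUs this process may run on, using only
 * the glibc macros so that cpu_set_t stays opaque. */
static int count_available_cpus(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    /* Can fail if cpu_set_t is smaller than the kernel's affinity mask. */
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return 1; /* conservative fallback chosen for this sketch */
#ifdef CPU_COUNT
    return CPU_COUNT(&set);
#else
    /* Older glibc without CPU_COUNT: test each bit through CPU_ISSET. */
    int n = 0;
    for (int i = 0; i < CPU_SETSIZE; i++)
        n += CPU_ISSET(i, &set) ? 1 : 0;
    return n;
#endif
}
```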
Anton Mitrofanov [Thu, 14 Jul 2011 13:02:43 +0000 (17:02 +0400)]
Fix spurious "stream properties changed" with --seek option on some inputs
Anton Mitrofanov [Fri, 15 Jul 2011 11:06:37 +0000 (15:06 +0400)]
Fix use of deprecated libavcodec functions
Replace avcodec_open with avcodec_open2. Now requires libavcodec 53.6.0 or newer.
Kieran Kunhya [Wed, 13 Jul 2011 19:25:40 +0000 (20:25 +0100)]
Fix nalu_process callback with HRD
Anton Mitrofanov [Wed, 13 Jul 2011 11:55:38 +0000 (15:55 +0400)]
Fix incorrect chroma swap for some input pixfmts
Problem occurred if pixfmt of lavf/ffms input was PIX_FMT_RGB24 or PIX_FMT_YUV444P.
Anton Mitrofanov [Tue, 28 Jun 2011 17:39:09 +0000 (21:39 +0400)]
Fix resize filter crash with YUVJ* input pixfmt
xvidfan [Thu, 23 Jun 2011 01:46:14 +0000 (18:46 -0700)]
RGB encoding support
Much less efficient than YUV444, but easy to support using the YUV444 framework.
Fiona Glaser [Wed, 22 Jun 2011 10:32:53 +0000 (03:32 -0700)]
4:4:4 encoding support
Fiona Glaser [Mon, 20 Jun 2011 23:20:21 +0000 (16:20 -0700)]
Properly weight slice header lambda in chroma weightp analysis
Daniel Kang [Sun, 3 Jul 2011 21:32:00 +0000 (17:32 -0400)]
Better x86 high bit depth predict_8x8c_p
Avoid the need to check for corner cases by reordering arithmetic.
Also make a minor optimization to high bit depth predict_16x16_p.
Fiona Glaser [Thu, 23 Jun 2011 18:54:42 +0000 (11:54 -0700)]
Eliminate extra layer of indirection for sps/pps references
Also remove poc type 1 support (it didn't work anyway) to reduce sps size.
Fiona Glaser [Sun, 10 Jul 2011 02:21:00 +0000 (19:21 -0700)]
Fix SSIM calculation with sliced threads
Anton Mitrofanov [Sat, 9 Jul 2011 19:57:44 +0000 (23:57 +0400)]
Avoid possible NaNs in B-frame output stats
Rémi Denis-Courmont [Thu, 30 Jun 2011 21:07:43 +0000 (14:07 -0700)]
ARM: do not override the toolchain default for FPU ABI
Steven Walters [Fri, 24 Jun 2011 00:29:01 +0000 (20:29 -0400)]
Fix link errors with libswscale/libavutil as shared libraries
Steven Walters [Sat, 18 Jun 2011 18:12:34 +0000 (14:12 -0400)]
Fix deprecation in libavformat usage
Replace av_open_input_file with avformat_open_input. Now requires libavformat 53.2.0 or newer.
Anton Mitrofanov [Wed, 8 Jun 2011 21:34:14 +0000 (01:34 +0400)]
Fix various issues with VBV+threads
Eliminate the race condition with interframe row predictors and threads.
Recalculate frame_size_estimated at the end of a frame, for improved update_vbv_plan.
Some cosmetics.