granicus.if.org Git - libx264/log

Fiona Glaser [Sat, 19 Sep 2009 16:50:59 +0000 (09:50 -0700)]

Fix integer overflow in 2-pass VBV
Bug caused slight undersizing in 2-pass mode in some cases.
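
The commit message does not show the affected expression, so the following is only a sketch of this class of bug, with hypothetical function names and numbers: a product of two 32-bit quantities (here a bitrate and a duration) wraps unless one operand is widened to 64 bits before the multiplication.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not x264 code: total bit budget for a clip.
 * Widening to 64 bits *before* multiplying keeps the product exact. */
int64_t demo_total_bits(int32_t bitrate_bps, int32_t duration_s)
{
    assert(bitrate_bps >= 0 && duration_s >= 0);
    return (int64_t)bitrate_bps * duration_s;
}

/* The same product computed in 32-bit unsigned arithmetic.
 * It silently wraps modulo 2^32 once the true result no longer fits. */
uint32_t demo_total_bits_wrapping(uint32_t bitrate_bps, uint32_t duration_s)
{
    return bitrate_bps * duration_s;
}
```

For a 5 Mbit/s, two-hour clip the exact budget is 36,000,000,000 bits, while the 32-bit version yields a much smaller wrapped value, so anything sized from it comes out too small.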

Fiona Glaser [Fri, 18 Sep 2009 21:28:31 +0000 (14:28 -0700)]

Fix bug with various bizarre commandline combinations and mbtree
Second pass would have mbtree on even though the first pass didn't (and thus encoding would immediately fail).

Fiona Glaser [Thu, 17 Sep 2009 20:02:02 +0000 (13:02 -0700)]

Add intra prediction modes to output stats
Also eliminate some NANs in stat output with intra-only encoding.
Marginal speedup: disable stat calculation if log level is below X264_LOG_INFO.
Various minor cosmetics.

Fiona Glaser [Thu, 17 Sep 2009 04:34:48 +0000 (21:34 -0700)]

Overhaul syntax in muxers.c/matroska.c
The inconsistent syntax in these files has finally come to an end.

Fiona Glaser [Thu, 17 Sep 2009 03:00:00 +0000 (20:00 -0700)]

Major API change: encapsulate NALs within libx264
libx264 now returns NAL units instead of raw data. x264_nal_encode is no longer a public function.
See x264.h for full documentation of changes.
New parameter: b_annexb, on by default. If disabled, startcodes are replaced by sizes as in mp4.
x264's VBV now works on a NAL level, taking into account escape codes.
VBV will also take into account the bit cost of SPS/PPS, but only if b_repeat_headers is set.
Add an overhead tracking system to VBV to better predict the constant overhead of frames (headers, NALU overhead, etc).
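
To make the difference between the two b_annexb settings concrete, here is a minimal sketch of NAL framing. It is not libx264's implementation: demo_nal_frame is a hypothetical name, and the sketch assumes a 4-byte startcode and a 4-byte big-endian size field. It also shows the escape codes that the NAL-level VBV now has to account for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative only, not libx264 code. Writes one framed NAL unit to `out`
 * and returns the number of bytes written. `out` must have room for
 * 4 + len + len / 2 bytes (worst case: one escape byte per two input bytes). */
size_t demo_nal_frame(const uint8_t *payload, size_t len, int annexb, uint8_t *out)
{
    assert(payload && out);
    size_t n = 4;   /* bytes 0..3 hold the startcode or size, filled in last */
    int zeros = 0;  /* consecutive zero bytes written so far */

    for (size_t i = 0; i < len; i++) {
        /* Escape code: 00 00 followed by 00..03 gets an 03 byte inserted. */
        if (zeros >= 2 && payload[i] <= 0x03) {
            out[n++] = 0x03;
            zeros = 0;
        }
        out[n++] = payload[i];
        zeros = payload[i] == 0x00 ? zeros + 1 : 0;
    }

    if (annexb) {
        /* Annex B style: startcode 00 00 00 01 */
        out[0] = 0x00; out[1] = 0x00; out[2] = 0x00; out[3] = 0x01;
    } else {
        /* mp4 style: big-endian size of the escaped NAL unit */
        uint32_t size = (uint32_t)(n - 4);
        out[0] = (uint8_t)(size >> 24);
        out[1] = (uint8_t)(size >> 16);
        out[2] = (uint8_t)(size >> 8);
        out[3] = (uint8_t)size;
    }
    return n;
}
```

A 4-byte payload 65 00 00 01 grows to 5 bytes after escaping, so either framing produces 9 bytes in total; only the first four bytes differ.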

Fiona Glaser [Mon, 14 Sep 2009 19:30:38 +0000 (12:30 -0700)]

Add missing fclose for mbtree input statsfile on second pass
Bug report by VFRmaniac

Fiona Glaser [Mon, 14 Sep 2009 18:07:23 +0000 (11:07 -0700)]

Improve progress indicator behavior
The progress indicator now tracks output frames rather than input frames.

Fiona Glaser [Mon, 14 Sep 2009 10:21:14 +0000 (03:21 -0700)]

Update yasm configure check
lzcnt apparently requires yasm 0.6.2.

Fiona Glaser [Sun, 13 Sep 2009 08:02:37 +0000 (01:02 -0700)]

Make MV costs global instead of static
Fixes some extremely rare threading race conditions and makes the code cleaner.
Downside: slightly higher memory usage when calling multiple encoders from the same application.

Fiona Glaser [Sat, 12 Sep 2009 00:30:14 +0000 (17:30 -0700)]

Don't print scenecut message multiple times in verbose mode
Occurred mostly with b-adapt 2.

Fiona Glaser [Thu, 10 Sep 2009 09:55:21 +0000 (02:55 -0700)]

Optimize rounding of luma and chroma DC coefficients
Reduce bitrate mostly-losslessly at low quantizers.
In some rare cases, bitrate reduction may be as high as 10%.
Luma rounding optimization (helps much less than chroma) requires trellis.

Steven Walters [Wed, 9 Sep 2009 19:19:40 +0000 (12:19 -0700)]

Fix crash if encoder_close is called before delayed frames are flushed
Also no longer flush frames when ctrl-Cing x264, so x264 will close faster.

Fiona Glaser [Sun, 6 Sep 2009 21:55:48 +0000 (14:55 -0700)]

Improve x264 help
Now has three help options: --help, --longhelp, and --fullhelp.
--help only shows the most basic options; most users should not need more than these.
Add usage examples.
Fix typo in a comment.

Fiona Glaser [Sun, 6 Sep 2009 02:22:21 +0000 (19:22 -0700)]

Factor out a redundant RD call in qpel-RD
Fixes a problem that was supposed to be fully fixed in r1238, but wasn't.

Fiona Glaser [Sun, 6 Sep 2009 01:56:18 +0000 (18:56 -0700)]

Fix RD early-skip
Small quality improvement and speedup; early-skip was broken by r1214.

Fiona Glaser [Sun, 6 Sep 2009 01:55:46 +0000 (18:55 -0700)]

Faster CAVLC mb header writing for B macroblocks

David Conrad [Wed, 2 Sep 2009 23:14:59 +0000 (16:14 -0700)]

Compile fixes for pre-ARMv6T2 and/or PIC

Steven Walters [Wed, 2 Sep 2009 19:33:50 +0000 (12:33 -0700)]

Change priority handling on some OSs
Instead of setting the lookahead thread to max priority, lower all the other threads' priorities instead.
This is particularly useful when the "max priority" is "realtime", as in Windows, which can cause some problems.

Steven Walters [Wed, 2 Sep 2009 01:46:51 +0000 (18:46 -0700)]

Threaded lookahead
Move lookahead into a separate thread, set to higher priority than the other threads, for optimal performance.
Reduces the amount that lookahead bottlenecks encoding, greatly increasing performance with lookahead-intensive settings (e.g. b-adapt 2) on many-core CPUs.
Buffer size can be controlled with --sync-lookahead, which defaults to auto (threads+bframes buffer size).
Note that this buffer is separate from the rc-lookahead value.
Note also that this does not split lookahead itself into multiple threads yet; this may be added in the future.
Additionally, split frames into "fdec" and "fenc" frame types and keep the two separate.
This split greatly reduces memory usage, which helps compensate for the larger lookahead size.
Extremely special thanks to Michael Kazmier and Alex Giladi of Avail Media, the original authors of this patch.

Fiona Glaser [Tue, 1 Sep 2009 18:36:54 +0000 (11:36 -0700)]

Force a link error in case of incompatible API
This is because the number of bug reports due to miscompiled ffmpeg builds is reaching critical mass.
The name of x264_encoder_open is now #defined based on the current X264_BUILD.
Note that this changes the calling convention required for dlopen, but not for ordinary calls to x264_encoder_open.
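
The renaming trick can be sketched with token pasting. Everything prefixed demo_ or DEMO_ below is a hypothetical stand-in, including the build number 75; the actual macro names are in x264.h. The point is that the function's real symbol name embeds the build number, so a caller compiled against a different header version references a symbol that does not exist and fails at link time.

```c
#include <assert.h>

#define DEMO_BUILD 75   /* hypothetical API version number */

/* Two levels of macros so DEMO_BUILD is expanded to 75 before pasting. */
#define DEMO_GLUE1(x, y) x##y
#define DEMO_GLUE2(x, y) DEMO_GLUE1(x, y)
#define demo_encoder_open DEMO_GLUE2(demo_encoder_open_, DEMO_BUILD)

/* Stringification helpers, used only to display the resulting symbol name. */
#define DEMO_STR1(x) #x
#define DEMO_STR2(x) DEMO_STR1(x)

/* Source code says demo_encoder_open, but the symbol that is actually
 * defined (and that callers link against) is demo_encoder_open_75. */
int demo_encoder_open(int width, int height)
{
    assert(width > 0 && height > 0);
    return width * height;   /* placeholder for a real handle */
}

/* Returns the versioned symbol name as a string. */
const char *demo_open_symbol(void)
{
    return DEMO_STR2(demo_encoder_open);
}
```

Ordinary callers keep writing demo_encoder_open(...) and pick up the versioned name through the header. Code that looks the function up by string at runtime has to ask for the versioned name instead, which matches the dlopen caveat in the message.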

Fiona Glaser [Tue, 1 Sep 2009 05:44:45 +0000 (22:44 -0700)]

Get rid of "CBR" descriptor from qcomp
Though technically accurate in some vague way, I have never actually seen this option used correctly; rather, it has been used by hundreds of people who can't read the documentation and believe that qcomp=0 is what should be used for CBR encoding.

Loren Merritt [Sun, 30 Aug 2009 20:49:07 +0000 (20:49 +0000)]

Faster me=tesa
But it still spends all too much time in me_search_ref rather than asm.

Fiona Glaser [Mon, 31 Aug 2009 13:36:41 +0000 (06:36 -0700)]

Multi-slice encoding support
Slicing support is available through three methods (which can be mixed):
--slices sets a number of slices per frame and ensures rectangular slices (required for Blu-ray). Overridden by either of the following options:
--slice-max-mbs sets a maximum number of macroblocks per slice.
--slice-max-size sets a maximum slice size, in bytes (includes NAL overhead).
Implement macroblock re-encoding support to allow highly accurate slice size limitation. Might be useful for other things in the future, too.
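
As a rough illustration of the first two options, the sketch below computes slice layouts from frame dimensions. The helper names and the rounding rule are assumptions for illustration, not x264's code: one function splits whole macroblock rows evenly among a fixed number of slices (which keeps slices rectangular), the other counts how many slices a per-slice macroblock cap implies.

```c
#include <assert.h>

/* Hypothetical helper: number of 16x16 macroblocks along one dimension. */
int demo_mb_count(int pixels)
{
    assert(pixels > 0);
    return (pixels + 15) / 16;
}

/* First macroblock row of slice i when mb_rows rows are divided into
 * n slices made of whole rows. i == n gives the end of the last slice. */
int demo_slice_first_row(int mb_rows, int n, int i)
{
    assert(mb_rows > 0 && n > 0 && i >= 0 && i <= n);
    return (mb_rows * i + n / 2) / n;
}

/* Number of slices needed when each slice may hold at most max_mbs
 * macroblocks (ceiling division). */
int demo_slices_for_max_mbs(int width, int height, int max_mbs)
{
    assert(max_mbs > 0);
    int total = demo_mb_count(width) * demo_mb_count(height);
    return (total + max_mbs - 1) / max_mbs;
}
```

A 1920x1080 frame has 120x68 = 8160 macroblocks. Four row-aligned slices start at rows 0, 17, 34 and 51, while a cap of 2000 macroblocks per slice needs five slices. A byte-size cap cannot be planned this way in advance, which is presumably why it relies on the macroblock re-encoding mentioned above.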

Fiona Glaser [Sun, 30 Aug 2009 00:09:55 +0000 (17:09 -0700)]

Fix a valgrind warning in b-adapt 2

Loren Merritt [Sat, 29 Aug 2009 10:31:08 +0000 (10:31 +0000)]

fix asm symbols for oprofile (regression in r1221)

Anton Mitrofanov [Fri, 28 Aug 2009 22:07:12 +0000 (15:07 -0700)]

Fix bug in intra analysis in B-frames
i8x8/i4x4 never got analysed when fast_intra was toggled and RD was off; up to a 2-3% quality improvement in non-RD mode.
With this bug dating back to r369, this is probably the second-oldest bug ever fixed in x264.

Anton Mitrofanov [Fri, 28 Aug 2009 21:56:44 +0000 (14:56 -0700)]

Fix bug in b16x16 qpel RD
Incorrect cost was used to initialize the search.

Fiona Glaser [Thu, 27 Aug 2009 22:21:22 +0000 (15:21 -0700)]

Check minimum chroma QP in addition to luma QP during CQM init
Correctly error out if the implied minimum chroma QP is too low.
Add missing emms to checkasm macroblock_tree_propagate test.

Fiona Glaser [Thu, 27 Aug 2009 21:16:45 +0000 (14:16 -0700)]

Faster mbtree propagate and x264_log2, less memory usage
Avoid an int->float conversion with a small table.
Change lowres_inter_types to a bitfield; cut its size by 75%.
Somewhat lower memory usage with lots of bframes.
Make log2/exp2 tables global to avoid duplication.
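
A 75% size reduction is what one gets by storing four 2-bit values per byte instead of one value per byte. The sketch below shows that packing in general terms; the layout and the demo_ names are assumptions, not the actual lowres_inter_types representation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to hold n two-bit entries: a quarter of one byte per entry. */
size_t demo_packed_size(size_t n)
{
    return (n + 3) / 4;
}

/* Store a 2-bit value (0..3) at position idx. */
void demo_set_type(uint8_t *buf, size_t idx, unsigned type)
{
    assert(buf && type <= 3);
    unsigned shift = (unsigned)(idx & 3) * 2;
    unsigned cleared = buf[idx >> 2] & ~(3u << shift);  /* clear old value */
    buf[idx >> 2] = (uint8_t)(cleared | (type << shift));
}

/* Read back the 2-bit value at position idx. */
unsigned demo_get_type(const uint8_t *buf, size_t idx)
{
    assert(buf);
    return (buf[idx >> 2] >> ((idx & 3) * 2)) & 3u;
}
```

For the 8160 macroblocks of a 1080p frame this takes 2040 bytes rather than 8160, at the cost of a shift and a mask per access.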

Fiona Glaser [Thu, 27 Aug 2009 03:30:47 +0000 (20:30 -0700)]

Fix keyint=1 + VBV + rc-lookahead

Fiona Glaser [Thu, 27 Aug 2009 03:16:10 +0000 (20:16 -0700)]

Faster x264_exp2fix8
22->13 cycles on Core 2 with mfpmath=sse

Loren Merritt [Thu, 27 Aug 2009 06:05:57 +0000 (06:05 +0000)]

compile x86 with fpmath=sse by default

David Conrad [Tue, 25 Aug 2009 00:17:41 +0000 (17:17 -0700)]

ARM configure: enable NEON-related options by default
When compiling for ARM, x264 will compile by default for Cortex A8 unless specified otherwise.
To compile for pre-ARMv6, --disable-asm is required.

Fiona Glaser [Mon, 24 Aug 2009 10:28:11 +0000 (03:28 -0700)]

2-pass VBV fixes
Properly run slicetype frame cost with 2pass + MB-tree.
Slash the VBV rate tolerance in 2-pass mode; increasing it made sense for the highly reactive 1-pass VBV algorithm, but not for 2-pass.
2-pass's planned frame sizes are guaranteed to be reasonable, since they are based on a real first pass, while 1-pass's, based on lookahead SATD, cannot always be trusted.

David Conrad [Mon, 24 Aug 2009 08:38:42 +0000 (01:38 -0700)]

GSOC merge part 8: ARM NEON intra prediction assembly functions (partial)
4x4 dc/h/ddr/ddl, 8x8 dc/h, 8x8c h/v, 16x16 dc/h/v

David Conrad [Mon, 24 Aug 2009 08:10:30 +0000 (01:10 -0700)]

GSOC merge part 7: ARM NEON deblock assembly functions (partial)
Originally written for ffmpeg by Mans Rullgard; ported by David.
Luma and chroma inter deblocking; no intra yet.

David Conrad [Mon, 24 Aug 2009 07:58:42 +0000 (00:58 -0700)]

GSOC merge part 6: ARM NEON quant assembly functions (partial)
(de)quant 4x4, (de)quant 8x8, (de)quant DC, coeff_last

David Conrad [Sun, 23 Aug 2009 09:03:48 +0000 (02:03 -0700)]

GSOC merge part 5: ARM NEON dct assembly functions
(i)dct4x4dc, (i)dct4x4, (i)dct8x8, (i)dct_dc, zigzag_scan_frame_4x4

David Conrad [Sun, 23 Aug 2009 08:35:10 +0000 (01:35 -0700)]

GSOC merge part 4: ARM NEON mc assembly functions
prefetch, memcpy_aligned, memzero_aligned, avg, mc_luma, get_ref, mc_chroma, hpel_filter, frame_init_lowres

David Conrad [Sun, 23 Aug 2009 06:55:29 +0000 (23:55 -0700)]

GSOC merge part 3: ARM NEON pixel assembly functions
SAD, SADX3/X4, SSD, SATD, SA8D, Hadamard_AC, VAR, VAR2, SSIM

David Conrad [Sun, 23 Aug 2009 06:40:33 +0000 (23:40 -0700)]

GSOC merge part 2: ARM stack alignment
Neither GCC nor ARMCC supports 16-byte stack alignment, despite the fact that NEON loads require it.
These macros only work for arrays, but fortunately that covers almost all instances of stack alignment in x264.

David Conrad [Fri, 21 Aug 2009 03:44:09 +0000 (20:44 -0700)]

Fix unaligned accesses in bitstream writer
Fixes x264 on CPUs with no unaligned access support (e.g. SPARC).
Improves performance marginally on CPUs with penalties for unaligned stores (e.g. some x86).
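
The message does not say how the writer was changed, so the following is only a generic illustration of the problem, with hypothetical helper names: casting a byte pointer to uint32_t* and storing through it can trap on strict-alignment CPUs, whereas byte-wise stores work at any address.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative only, not the x264 bitstream writer. Stores a 32-bit value
 * in big-endian order using byte accesses, which are valid at any address. */
void demo_store_be32(uint8_t *dst, uint32_t v)
{
    assert(dst);
    dst[0] = (uint8_t)(v >> 24);
    dst[1] = (uint8_t)(v >> 16);
    dst[2] = (uint8_t)(v >> 8);
    dst[3] = (uint8_t)v;
}

/* Matching load, also safe for unaligned addresses. */
uint32_t demo_load_be32(const uint8_t *src)
{
    assert(src);
    return ((uint32_t)src[0] << 24) | ((uint32_t)src[1] << 16) |
           ((uint32_t)src[2] << 8)  |  (uint32_t)src[3];
}
```

A writer that cares about speed would more likely keep its output pointer aligned and flush whole words, which avoids both the trap and the split-store penalty; the byte-wise version is simply the most portable baseline.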

Fiona Glaser [Thu, 20 Aug 2009 20:08:25 +0000 (13:08 -0700)]

Fix bug in calculation of I-frame costs with AQ.

David Conrad [Thu, 20 Aug 2009 00:03:02 +0000 (17:03 -0700)]

GSOC merge part 1: Framework for ARM assembly optimizations
x264 will detect which ARM core it's building for and only build NEON asm if the target is ARMv6 or above, then enable NEON at runtime.

David Conrad [Wed, 19 Aug 2009 23:18:36 +0000 (16:18 -0700)]

Fix a bug in checkasm and two OSX fixes
MC chroma checkasm test could crash in some situations
Remove -lmx, as it's not needed and the iPhone doesn't have it.
Remove unused sqrtf emulation; it breaks if math.h is included.

Fiona Glaser [Wed, 19 Aug 2009 08:49:47 +0000 (01:49 -0700)]

Improve QPRD
Always check the last macroblock's QP, even if the normal search doesn't reach it.
Raise the failure threshold when moving towards the last macroblock's QP.
0.2-1% improved compression.

Fiona Glaser [Wed, 19 Aug 2009 04:53:28 +0000 (21:53 -0700)]

Fix MB-tree with keyint<3
Also slightly improve VBV keyint handling.

Fiona Glaser [Wed, 19 Aug 2009 02:25:45 +0000 (19:25 -0700)]

Fix bug in VBV lookahead + no MB-tree
I-frames need to have VBV lookahead run on them as well.

Fiona Glaser [Wed, 19 Aug 2009 01:37:26 +0000 (18:37 -0700)]

Add support for frame-accurate parameter changes
Parameter structs can now be passed with individual frames.
The previous method would only change the parameter of what was currently being encoded, which due to delay might be very far from an intended exact frame.
Also add support for changing aspect ratio. Only works in a stream with repeating headers and requires the caller to force an IDR to ensure instant effect.

Fiona Glaser [Tue, 18 Aug 2009 22:46:26 +0000 (15:46 -0700)]

Fix x264_encoder_reconfig with multithreading
New behavior: reconfigging the encoder will result in changes being applied
to each of the encoding threads as they finish encoding the current frame.

Fiona Glaser [Sun, 16 Aug 2009 10:29:49 +0000 (03:29 -0700)]

Fix two bugs in QPRD
QPRD could in some cases force blocks to skip when they shouldn't be skipped (~+0.01 dB).
Force QPRD to abide by qpmin/qpmax restrictions.

Fiona Glaser [Sun, 16 Aug 2009 02:02:31 +0000 (19:02 -0700)]

Lookahead VBV
Use the large-scale lookahead capability introduced in MB-tree for ratecontrol purposes.
(Does not require MB-tree, however.)
Greatly improved quality and compliance in 1-pass VBV mode, especially in CBR; +2 dB OPSNR or more in some cases.
Fix some other bugs in VBV, which should improve non-lookahead mode as well.
Change the tolerance algorithm in row VBV to allow for more significant mispredictions when buffer is nearly full.
Note that due to the fixing of an extremely long-standing bug (>1 year), bitrates may change by nontrivial amounts in CRF without MB-tree.

Fiona Glaser [Fri, 14 Aug 2009 14:20:07 +0000 (07:20 -0700)]

Fix bug in b-adapt 1
B-adapt 1 didn't use more than MAX(1,bframes-1) B-frames when MB-tree was off.

Fiona Glaser [Fri, 14 Aug 2009 00:13:33 +0000 (17:13 -0700)]

Fix a potential failure in VBV
If VBV does underflow, ratecontrol could be permanently broken for the rest of the clip.
Revert part of the previous VBV changes to fix this.

Anton Mitrofanov [Thu, 13 Aug 2009 21:40:21 +0000 (21:40 +0000)]

new API function x264_encoder_delayed_frames.
fix x264cli on streams whose total length is less than the encoder latency.
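
Why short streams are a special case can be seen in a toy model of an encoder with latency. Everything below is a hypothetical simulation, not the libx264 API: the encoder buffers frames until its latency is filled, so a clip shorter than the latency produces no output until the caller flushes, and a delayed-frame count tells the caller when flushing is done.

```c
#include <assert.h>

/* Toy model of an encoder that holds back `latency` frames. */
typedef struct {
    int latency;    /* frames buffered before output starts */
    int buffered;   /* frames currently held inside the encoder */
} demo_encoder_t;

/* Feed one frame (has_input = 1) or flush (has_input = 0).
 * Returns 1 if a frame was output, 0 otherwise. */
int demo_encode(demo_encoder_t *e, int has_input)
{
    assert(e);
    if (has_input) {
        if (e->buffered < e->latency) {
            e->buffered++;      /* still filling the delay buffer */
            return 0;
        }
        return 1;               /* steady state: one in, one out */
    }
    if (e->buffered > 0) {
        e->buffered--;          /* flushing drains one delayed frame */
        return 1;
    }
    return 0;
}

/* Stand-in for a "how many frames are still inside?" query. */
int demo_delayed_frames(const demo_encoder_t *e)
{
    assert(e);
    return e->buffered;
}

/* Encode a whole clip and return the number of frames output. */
int demo_encode_clip(int n_frames, int latency)
{
    demo_encoder_t e = { latency, 0 };
    int out = 0;
    for (int i = 0; i < n_frames; i++)
        out += demo_encode(&e, 1);
    while (demo_delayed_frames(&e) > 0)     /* flush until empty */
        out += demo_encode(&e, 0);
    return out;
}
```

With a latency of 10, a 3-frame clip emits all three frames during the flush loop. A caller that stops flushing after the first call that returns nothing, rather than checking the delayed count, would risk dropping them.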

Fiona Glaser [Thu, 13 Aug 2009 21:12:26 +0000 (14:12 -0700)]

Add no-mbtree to fprofile (and fix pyramid in fprofile)

Fiona Glaser [Sun, 9 Aug 2009 23:06:52 +0000 (16:06 -0700)]

Don't print a warning about direct=auto in 2pass when B-frames are off

Loren Merritt [Thu, 13 Aug 2009 05:02:59 +0000 (05:02 +0000)]

fix lowres padding, which failed to extrapolate the right side for some resolutions.
fix a buffer overread in x264_mbtree_propagate_cost_sse2. no effect on actual behavior, only theoretical correctness.
fix x264_slicetype_frame_cost_recalculate on I-frames, which previously used all 0 mb costs.
shut up a valgrind warning in predict_8x8_filter_mmx.

Loren Merritt [Sun, 9 Aug 2009 04:00:36 +0000 (04:00 +0000)]

simd part of x264_macroblock_tree_propagate.
1.6x faster on conroe.

Loren Merritt [Sat, 8 Aug 2009 14:53:27 +0000 (14:53 +0000)]

MB-tree fixes:
AQ was applied inconsistently, with some AQed costs compared to other non-AQed costs. Strangely enough, fixing this increases SSIM on some sources but decreases it on others. More investigation needed.
Account for weighted bipred.
Reduce memory, increase precision, simplify, and early terminate.

Fiona Glaser [Sun, 9 Aug 2009 00:51:01 +0000 (17:51 -0700)]

Add missing free()s for new data allocated for MB-tree
Eliminates a memory leak.

Fiona Glaser [Sat, 8 Aug 2009 19:53:06 +0000 (12:53 -0700)]

Fix keyframe insertion with MB-tree and no B-frames

Fiona Glaser [Sat, 8 Aug 2009 18:26:36 +0000 (11:26 -0700)]

Fix MP4 output (bug in malloc checking patch)

Steven Walters [Fri, 7 Aug 2009 23:18:01 +0000 (16:18 -0700)]

Gracefully terminate in the case of a malloc failure
Fuzz tests show that all mallocs appear to be checked correctly now.
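
A common C pattern for this kind of graceful failure is to check every allocation and funnel all error paths through a single cleanup label. The sketch below uses a hypothetical demo_frame_t with two buffers; it illustrates the pattern only and is not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-frame data with two separately allocated buffers. */
typedef struct {
    int     *costs;
    uint8_t *types;
} demo_frame_t;

/* Safe on NULL and on partially constructed frames, since free(NULL)
 * is a no-op and the struct starts out zeroed. */
void demo_frame_free(demo_frame_t *f)
{
    if (!f)
        return;
    free(f->costs);
    free(f->types);
    free(f);
}

/* Returns a new frame, or NULL if any allocation fails. */
demo_frame_t *demo_frame_new(size_t n_mbs)
{
    demo_frame_t *f = calloc(1, sizeof(*f));
    if (!f)
        return NULL;
    /* Reject sizes whose byte count would be zero or overflow size_t. */
    if (n_mbs == 0 || n_mbs > SIZE_MAX / sizeof(*f->costs))
        goto fail;
    f->costs = malloc(n_mbs * sizeof(*f->costs));
    if (!f->costs)
        goto fail;
    f->types = malloc(n_mbs);
    if (!f->types)
        goto fail;
    return f;
fail:
    demo_frame_free(f);   /* releases whatever was allocated so far */
    return NULL;
}
```

The caller then only has to test the return value for NULL and propagate the error upward instead of crashing on a later dereference.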

Anton Mitrofanov [Fri, 7 Aug 2009 17:44:13 +0000 (10:44 -0700)]

Fix a potential infinite loop in QPfile parsing on Windows
ftell doesn't seem to work properly on Windows in text mode.

Fiona Glaser [Fri, 7 Aug 2009 17:31:16 +0000 (10:31 -0700)]

Fix delay calculation with multiple threads
Delay frames for threading don't actually count as part of lookahead.

Fiona Glaser [Fri, 7 Aug 2009 06:09:46 +0000 (23:09 -0700)]

Add "veryslow" preset
Apparently some people are actually *using* placebo, so I've added this preset to bridge the gap.

Fiona Glaser [Wed, 5 Aug 2009 00:46:33 +0000 (17:46 -0700)]

Macroblock-tree ratecontrol
On by default; can be turned off with --no-mbtree.
Uses a large lookahead to track temporal propagation of data and weight quality accordingly.
Requires a very large separate statsfile (2 bytes per macroblock) in multi-pass mode.
Doesn't work with b-pyramid yet.
Note that MB-tree inherently measures quality differently from the standard qcomp method, so bitrates produced by CRF may change somewhat.
This makes the "medium" preset a bit slower. Accordingly, make "fast" slower as well, and introduce a new preset "faster" between "fast" and "veryfast".
All presets "fast" and above will have MB-tree on.
Add a new option, --rc-lookahead, to control the distance MB tree looks ahead to perform propagation analysis.
Default is 40; larger values will be slower and require more memory but give more accurate results.
This value will be used in the future to control ratecontrol lookahead (VBV).
Add a new option, --no-psy, to disable all psy optimizations that don't improve PSNR or SSIM.
This disables psy-RD/trellis, but also other more subtle internal psy optimizations that can't be controlled directly via external parameters.
Quality improvement from MB-tree is about 2-70% depending on content.
Strength of MB-tree adjustments can be tweaked using qcompress; higher values mean lower MB-tree strength.
Note that MB-tree may perform slightly suboptimally on fades; this will be fixed by weighted prediction, which is coming soon.
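
The idea of tracking temporal propagation can be sketched as follows. This is a deliberately simplified model, and the formula is an assumption for illustration rather than the code in this commit: a block that is predicted well from its reference (inter cost far below intra cost) passes most of its own importance, plus whatever later frames passed to it, back to that reference. The second helper just restates the "2 bytes per macroblock" statsfile figure as arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified, hypothetical propagation step for one macroblock.
 *   intra_cost   : cost of coding the block without references
 *   inter_cost   : cost of coding it from its reference
 *   propagate_in : importance already passed to this block by future frames
 * Returns the amount to credit to the referenced block. */
int64_t demo_mbtree_propagate(int intra_cost, int inter_cost, int propagate_in)
{
    assert(intra_cost > 0 && inter_cost >= 0 && propagate_in >= 0);
    if (inter_cost > intra_cost)
        inter_cost = intra_cost;    /* prediction cannot be worse than intra */
    int64_t total = (int64_t)intra_cost + propagate_in;
    return total * (intra_cost - inter_cost) / intra_cost;
}

/* Per-frame statsfile size at 2 bytes per 16x16 macroblock. */
int demo_mbtree_stats_bytes(int width, int height)
{
    assert(width > 0 && height > 0);
    return 2 * ((width + 15) / 16) * ((height + 15) / 16);
}
```

In this model a block that inter prediction does not help passes nothing back, while a perfectly predicted block passes everything back, so blocks that many future frames depend on accumulate importance and can be given higher quality. For 1080p the statsfile grows by 16320 bytes per frame.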

Fiona Glaser [Tue, 4 Aug 2009 03:52:30 +0000 (20:52 -0700)]

Various 1-pass VBV tweaks
Make predictors have an offset in addition to a multiplier.
This primarily fixes issues in sources with lots of extremely static scenes, such as anime and CGI.
We tried linear regressions, but they were very unreliable as predictors.
Also allow VBV to be slightly more aggressive in raising QPs to avoid not having enough bits left in some situations.
Up to 1 dB improvement on some clips.
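
The effect of adding an offset to a multiplicative predictor can be shown with a small model. The structure and formula below are assumptions for illustration, not x264's ratecontrol code: with a multiplier alone, a frame whose measured complexity is near zero is predicted to cost almost nothing, even though headers and other fixed overhead still consume bits.

```c
#include <assert.h>

/* Hypothetical frame-size predictor: bits = (coeff * complexity + offset) / qscale */
typedef struct {
    double coeff;   /* multiplier applied to the complexity estimate */
    double offset;  /* constant term for cost that does not scale with complexity */
} demo_predictor_t;

double demo_predict_bits(const demo_predictor_t *p, double complexity, double qscale)
{
    assert(p && qscale > 0.0);
    return (p->coeff * complexity + p->offset) / qscale;
}
```

With coeff = 2 and offset = 100, a completely static frame (complexity 0) at qscale 1 is predicted at 100 bits rather than 0, which is the kind of case the message associates with anime and CGI sources.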

Fiona Glaser [Wed, 29 Jul 2009 03:41:27 +0000 (20:41 -0700)]

Fix another 10L in QPRD
An entry in subpel_iterations was missing.
I have no idea how QPRD was working at all without this change.

Fiona Glaser [Tue, 28 Jul 2009 08:16:23 +0000 (01:16 -0700)]

Update help and cleanup in ratecontrol.c
Deal with some out-of-date information.

Loren Merritt [Tue, 28 Jul 2009 07:16:31 +0000 (07:16 +0000)]

15% faster refine_bidir_satd, 10% faster refine_bidir_rd (or less with trellis=2)
re-roll a loop (saves 44KB code size, which is the cause of most of this speed gain)
don't re-mc mvs that haven't changed

Fiona Glaser [Tue, 28 Jul 2009 04:03:00 +0000 (21:03 -0700)]

Faster bidir_rd plus some bugfixes
Cache chroma MC during refine_bidir_rd and use both the luma and chroma caches to skip MC in macroblock_encode.
Fix incorrect call to rd_cost_part; refine_bidir_rd output was incorrect for i8>0.
Remove some redundant clips.
~12% faster refine_bidir_rd.

Fiona Glaser [Mon, 27 Jul 2009 11:45:03 +0000 (04:45 -0700)]

Add "fastdecode" tune option
It does what it says it does.

Fiona Glaser [Sun, 26 Jul 2009 19:20:09 +0000 (12:20 -0700)]

Fix two bugs in QPRD
fprofile settings now actually fprofile QPRD.
Don't use i_mbrd before initializing it.

Fiona Glaser [Sun, 26 Jul 2009 10:03:12 +0000 (03:03 -0700)]

Fix 10l in QPRD
Trellis used wrong lambda with trellis=1

Fiona Glaser [Sun, 26 Jul 2009 05:31:06 +0000 (22:31 -0700)]

Fix a nondeterminism with threads and subme>7
Also add a few more checks to eliminate the need for spel_border.

Fiona Glaser [Thu, 23 Jul 2009 19:20:39 +0000 (12:20 -0700)]

Add QPRD support as subme=10
Refactor trellis lambda selection to be done in analyse_init instead of in trellis.
This will allow for easier adaptation of lambda later on; for now it allows constant lambda across variable QPs.
QPRD is only available with adaptive quantization enabled and generally improves SSIM and visual quality.
Additionally, weight the SSD values from RD based on the relative QP offset for chroma; helps visually at high QPs where chroma has a lower QP than luma.
This fixes some visual artifacts created by QPRD at high QPs.
Note that this generally hurts PSNR and SSIM, and so is only on when psy-RD is on.

Fiona Glaser [Wed, 22 Jul 2009 02:56:21 +0000 (19:56 -0700)]

SSSE3 cachesplit workaround for avg2_w16
Palignr-based solution for the most commonly used qpel function.
1-1.5% faster overall on Core 2 chips.

Loren Merritt [Wed, 22 Jul 2009 20:20:52 +0000 (20:20 +0000)]

shut up valgrind warnings in trellis

Anton Mitrofanov [Sat, 18 Jul 2009 23:30:18 +0000 (16:30 -0700)]

New AQ algorithm option
"Auto-variance" uses log(var)^2 instead of log(var) and attempts to adapt strength per-frame.
Generates significantly better SSIM; on by default with --tune ssim.
Whether it generates visually better quality is still up for debate.
Available as --aq-mode 2.

Fiona Glaser [Wed, 15 Jul 2009 19:43:35 +0000 (12:43 -0700)]

Cacheline-split SSSE3 chroma MC
~70% faster chroma MC on 32-bit Conroe
Also slightly faster SSSE3 intra_sad_8x8c

Fiona Glaser [Sun, 12 Jul 2009 19:07:01 +0000 (12:07 -0700)]

Improve documentation of qp/crf options

Fiona Glaser [Fri, 10 Jul 2009 02:02:57 +0000 (19:02 -0700)]

Merge array_non_zero into zigzag_sub
Faster lossless, cleaner code.
SSSE3 version of zigzag_sub_4x4_field, faster lossless interlaced coding.

James Darnley [Thu, 9 Jul 2009 18:25:55 +0000 (11:25 -0700)]

Fix bug in reference frame autoadjustment
For some types of input file, x264 did the adjustment before width/height were known.

Fiona Glaser [Tue, 7 Jul 2009 18:13:39 +0000 (11:13 -0700)]

Fix fprofile settings to match changes in defaults
Also add b-adapt 2 to fprofile.

Fiona Glaser [Fri, 3 Jul 2009 09:33:44 +0000 (02:33 -0700)]

Slightly faster dequant_flat assembly
Eliminate some redundant shifts.

Fiona Glaser [Thu, 2 Jul 2009 04:14:57 +0000 (21:14 -0700)]

Totally new preset system for x264.c (not libx264), new defaults
Other new features include "tune" and "profile" settings; see --help for more details.
Unlike most other settings, "preset" and "tune" act before all other options.
However, "profile" acts afterwards, overriding all other options.
Our defaults have also changed: new defaults are --subme 7 --bframes 3 --8x8dct --no-psnr --no-ssim --threads auto --ref 3 --mixed-refs --trellis 1 --weightb --crf 23 --progress.
Users will hopefully find these changes to greatly improve usability.

Fiona Glaser [Wed, 1 Jul 2009 23:33:12 +0000 (16:33 -0700)]

Update Gabriel's email address in AUTHORS

Fiona Glaser [Tue, 30 Jun 2009 22:20:32 +0000 (15:20 -0700)]

Early termination for chroma encoding
Faster chroma encoding by terminating early if heuristics indicate that the block will be DC-only.
This works because the vast majority of inter chroma blocks have no coefficients at all, and those that do are almost always DC-only.
Add two new helper DSP functions for this: dct_dc_8x8 and var2_8x8. mmx/sse2/ssse3 versions of each.
Early termination is disabled at very low QPs due to it not being useful there.
Performance increase is ~1-2% without trellis, up to 5-6% with trellis=2.
Increase is greater with lower bitrates.

David Conrad [Fri, 26 Jun 2009 20:09:44 +0000 (13:09 -0700)]

Fix bug in checkasm
frame_init_lowres_core check didn't check the C plane.
However, all x86 and PPC assembly was correct regardless of the unit test being incorrect.

Fiona Glaser [Wed, 24 Jun 2009 21:39:15 +0000 (14:39 -0700)]

Add subpartition cost for sub-8x8 blocks
Improves sub-p8x8 mode decision.

Fiona Glaser [Wed, 24 Jun 2009 20:24:18 +0000 (13:24 -0700)]

Yet more CABAC and CAVLC optimizations
Also clean up a lot of pointless code duplication in CAVLC MV coding.

Fiona Glaser [Sat, 20 Jun 2009 01:49:55 +0000 (18:49 -0700)]

Various CABAC optimizations and cleanups
Faster CABAC CBF context calculation for inter blocks.
Add x264_constant_p(), will probably be useful in the future as well.
Simpler subpartition functions.
Clean up and optimize mvd_cpn a bit more.
Various other minor optimizations.

David Wolstencroft [Sat, 20 Jun 2009 19:42:55 +0000 (21:42 +0200)]

AltiVec version of frame_init_lowres_core. 22.4x faster than C on PPC7450 and 25x on PPC970MP.

Fiona Glaser [Fri, 19 Jun 2009 23:03:18 +0000 (16:03 -0700)]

MMX CABAC mvd sum calculation
Faster CABAC mvd coding.

Fiona Glaser [Fri, 19 Jun 2009 23:02:39 +0000 (16:02 -0700)]

Faster MV prediction
Smaller code size, plus I get to use goto.

Fiona Glaser [Wed, 10 Jun 2009 17:37:01 +0000 (10:37 -0700)]

Fix potential crash in checkasm
ssim_end4_sse2 requires aligned sums