Jingning Han [Wed, 16 Oct 2013 19:43:03 +0000 (12:43 -0700)]
Make memory alloc in pick_mode_context bsize aware
This commit makes the buffer allocation of zcoeff_blk array in
pick_mode_context block size aware. It calculates the number of
4x4 blocks in the partition and assigns the memory space accordingly.
The allocation (and the matching deallocation) is done once for each
encoding pass. It allows smaller buffers to be copied when possible.
For football at 600kbps, the runtimes improve by about 1%:
speed 1, 45961ms -> 45472ms
speed 2, 23863ms -> 23598ms
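As an illustration of the block-size-aware sizing described above, here is a minimal sketch in C. The struct and function names (pick_mode_ctx_sketch, num_4x4_blocks, ctx_alloc, ctx_free) are hypothetical and are not the identifiers used in the codec. The sketch also assumes one byte-sized flag per 4x4 block.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: number of 4x4 blocks covering a partition whose
 * width and height are given in pixels (both multiples of 4). */
static size_t num_4x4_blocks(int width, int height) {
  assert(width >= 4 && height >= 4);
  assert(width % 4 == 0 && height % 4 == 0);
  return (size_t)(width / 4) * (size_t)(height / 4);
}

/* Hypothetical stand-in for the per-partition mode context. */
typedef struct {
  uint8_t *zcoeff_blk; /* assumed: one flag per 4x4 block */
  size_t num_4x4_blk;  /* number of entries in zcoeff_blk */
} pick_mode_ctx_sketch;

/* Allocate only as many flags as the partition needs.
 * Returns 0 on success, -1 on allocation failure. */
static int ctx_alloc(pick_mode_ctx_sketch *ctx, int width, int height) {
  ctx->num_4x4_blk = num_4x4_blocks(width, height);
  ctx->zcoeff_blk =
      (uint8_t *)calloc(ctx->num_4x4_blk, sizeof(*ctx->zcoeff_blk));
  return ctx->zcoeff_blk != NULL ? 0 : -1;
}

/* Release the buffer and reset the bookkeeping. */
static void ctx_free(pick_mode_ctx_sketch *ctx) {
  free(ctx->zcoeff_blk);
  ctx->zcoeff_blk = NULL;
  ctx->num_4x4_blk = 0;
}
```

With this layout, copying the flags of a 16x8 partition touches 8 bytes instead of the 256 bytes a 64x64-sized buffer would require.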
* changes:
Use a separate MODE_INFO stream for each tile column
Get rid of "this_mi", use "mi_8x8[0]" everywhere instead
Make the static_segmentation feature work again
Get rid of "this_mi", use "mi_8x8[0]" everywhere instead
The only case where they were intentionally pointing to different
structures was in mbgraph, and this didn't have the expected behavior
because both of these pointers are used interchangeably throughout the code.
Dmitry Kovalev [Wed, 16 Oct 2013 22:11:42 +0000 (15:11 -0700)]
Adding get_band_translate() function.
Moving code that gets band_translate array from get_scan_and_band()
function to get_band_translate() function. Renaming get_scan_and_band() to
get_scan().
This should be similar to what x264 does with --aq-mode 1.
It works well with clips like parkjoy and touhou
(http://x264.nl/developers/Dark_Shikari/LosslessTouhou.mkv).
At low bitrates, the segmentation signaling overhead may negate the
benefits of this feature.
(PGW) Default changed to feature OFF to allow provisional merge.
Change-Id: I938abf9bb487e1d4ad3b0264ea03d9826275c70b
Updated the encoder to handle frames that are coded
intra-only. Intra-only frames must be non-showable,
that is, the "show frame" flag must be set to 0 in
the frame header.
Tested by forcing the ARF frames to be coded intra-
only.
Note: The rate control code will need to be modified
to account for intra-only frames better than they
are currently handled.
Jingning Han [Mon, 14 Oct 2013 23:03:23 +0000 (16:03 -0700)]
Re-design all-zero-coeff block index buffer use
Use the zcoeff_blk buffer of PICK_MODE_CONTEXT to store the indexes
of all-zero-coeff block of the current best mode. Remove the temporary
buffer best_zcoeff_blk defined in the rate-distortion optimization
loop. This improves encoding speed by about 0.5% at all speed
settings.
Jingning Han [Fri, 11 Oct 2013 18:26:32 +0000 (11:26 -0700)]
Move token_cache from cost_coeffs to MACROBLOCK
This commit moves the token_cache buffer into the macroblock struct, instead
of defining it as a local variable in cost_coeffs. This avoids repeatedly
re-allocating memory space in the rate-distortion optimization loop.
The runtime at speed 0 is reduced:
bus 2000kbps, 161692ms to 159951ms
football 600kbps, 229505ms to 225821ms
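The pattern described above, where a scratch buffer lives in a long-lived struct rather than in the function that is called inside the hot loop, might look roughly like the following sketch. Everything here is hypothetical: macroblock_sketch, MAX_COEFFS_SKETCH, and count_nonzero_tokens are illustrative names, the buffer size is an assumption, and the function body is a toy stand-in rather than the real cost computation.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_COEFFS_SKETCH 1024 /* assumed upper bound on coefficients */

/* Hypothetical stand-in for the macroblock struct: the scratch buffer is
 * a member, so it is set up once with the struct and reused by every call. */
typedef struct {
  uint8_t token_cache[MAX_COEFFS_SKETCH];
} macroblock_sketch;

/* Toy replacement for a cost function: it borrows the scratch buffer from
 * the caller's struct instead of declaring its own local array. */
static int count_nonzero_tokens(macroblock_sketch *x, const int16_t *coeffs,
                                int n) {
  int i, count = 0;
  assert(n >= 0 && n <= MAX_COEFFS_SKETCH);
  for (i = 0; i < n; ++i) {
    x->token_cache[i] = (uint8_t)(coeffs[i] != 0);
    count += x->token_cache[i];
  }
  return count;
}
```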
Yunqing Wang [Sat, 12 Oct 2013 01:57:22 +0000 (18:57 -0700)]
Adjust icc compiler options
"-no-prec-div" option helps codec performance, so it was added back.
"-no-intel-extensions" was added to suppress link warning #10237.
"-use-asm" is deprecated, so it was removed.
Dmitry Kovalev [Fri, 11 Oct 2013 23:25:50 +0000 (16:25 -0700)]
Adding TREE_SIZE macro + cleanup.
Using TREE_SIZE for the following trees:
vp9_intra_mode_tree
vp9_inter_mode_tree
vp9_partition_tree
vp9_switchable_interp_tree
vp9_mv_joint_tree
vp9_mv_class_tree
vp9_mv_class0_tree
vp9_mv_fp_tree
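For context, a size macro of this kind can be sketched as follows. The definition shown is an assumption, not quoted from the commit: it relies on a full binary tree with n leaves having n - 1 internal nodes, each stored as a pair of array entries. The example tree, its symbols, and walk_tree are hypothetical.

```c
#include <assert.h>

/* Assumed definition: n leaves -> n - 1 internal nodes -> 2 * (n - 1)
 * array entries, because each internal node stores two branches. */
#define TREE_SIZE(leaf_count) (2 * (leaf_count) - 2)

typedef signed char tree_index_sketch;

/* Hypothetical four-symbol alphabet. */
enum { SYM_A, SYM_B, SYM_C, SYM_D, SYM_COUNT };

/* Entries <= 0 are leaves (the negated symbol); entries > 0 are the index
 * of the next node pair. */
static const tree_index_sketch example_tree[TREE_SIZE(SYM_COUNT)] = {
  -SYM_A, 2,      /* node 0: bit 0 -> A, bit 1 -> node at index 2 */
  -SYM_B, 4,      /* node 1: bit 0 -> B, bit 1 -> node at index 4 */
  -SYM_C, -SYM_D  /* node 2: bit 0 -> C, bit 1 -> D */
};

/* Follow the given bits from the root until a leaf is reached and return
 * the decoded symbol. */
static int walk_tree(const tree_index_sketch *tree, const int *bits,
                     int nbits) {
  tree_index_sketch i = 0;
  int n = 0;
  do {
    assert(n < nbits);
    i = tree[i + bits[n++]];
  } while (i > 0);
  return -i;
}
```

Sizing the array from the leaf count keeps the declaration and the symbol enum in step: adding a symbol without extending the table leaves zero-filled trailing entries rather than an out-of-bounds read.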
Dmitry Kovalev [Fri, 11 Oct 2013 17:47:22 +0000 (10:47 -0700)]
Replacing {VP9_COEF, MODE}_UPDATE_PROB with DIFF_UPDATE_PROB.
Values of MODE_UPDATE_PROB and VP9_COEF_UPDATE_PROB are equal, so replacing
them with one constant. Inlining appropriate arguments for functions:
vp9_cond_prob_diff_update (encoder)
vp9_diff_update_prob (decoder)
Deb Mukherjee [Fri, 11 Oct 2013 00:24:55 +0000 (17:24 -0700)]
Change in rddiv parameter to make it a power of 2
Converts the constant rddiv parameter to 128 (from 100) and
implements RDCOST with bit-shift rather than multiplication.
Other parameters are also adjusted to roughly keep the same
balance between Rate and Distortion.
There is a slight speed-up of about 0.5-1% (at speed 0) as
tested on football_cif.
There is a slight change in performance due to small change
in the parameters.
derfraw300: +0.033%
stdhdraw250: +0.102%
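To illustrate why a power-of-2 rddiv helps, the sketch below contrasts a multiplication-based cost with a shift-based one. The exact form of RDCOST in the codebase is not shown in this message, so the formula (rate scaled by rdmult with rounding, plus distortion scaled by rddiv) and the names rdcost_mul, rdcost_shift, and RDDIV_BITS_SKETCH are assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define RDDIV_BITS_SKETCH 7 /* 1 << 7 == 128, the new rddiv value */

/* Assumed earlier form: distortion is scaled by an arbitrary rddiv
 * (for example 100), which needs a 64-bit multiplication. */
static int64_t rdcost_mul(int rdmult, int rddiv, int rate, int64_t dist) {
  return ((128 + (int64_t)rate * rdmult) >> 8) + rddiv * dist;
}

/* Assumed new form: with rddiv fixed at 128 the distortion term becomes a
 * left shift by 7 bits. */
static int64_t rdcost_shift(int rdmult, int rate, int64_t dist) {
  assert(dist >= 0);
  return ((128 + (int64_t)rate * rdmult) >> 8) + (dist << RDDIV_BITS_SKETCH);
}
```

For rddiv = 128 the two forms return the same value, so the change in coding results comes from moving the constant from 100 to 128 and rebalancing the other parameters, not from the shift itself.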
Paul Wilkins [Mon, 7 Oct 2013 18:20:10 +0000 (19:20 +0100)]
Experimental rate control change.
When the codec in VBR (or cq) mode hits its max q limits and is
struggling to hit a target bandwidth, the bit target per frame collapses.
In the first instance normal frames cap out at the maximum allowed
Q and then the ARF and GFs do the same. This latter behavior is not
generally desirable as GFs and ARFs are only effective from a quality
and data rate perspective if they have at least some level of -Q delta
compared to the surrounding frames.
In this patch I define a separate max Q for GFs and ARFs that is
derived from but somewhat lower than that defined for normal frames.
In effect there is a minimum Q delta that will always be available for
GFs and ARFs regardless of the target rate and MAXQ setting.
This may of course mean that the absolute lowest rate obtainable for
a given clip is somewhat higher.
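One way to picture the separate GF/ARF limit is the sketch below. The message does not say how the lower limit is derived, so the fixed-offset rule, the clamp to the minimum Q, and the name gf_arf_max_q are purely illustrative assumptions.

```c
#include <assert.h>

/* Hypothetical derivation: keep the GF/ARF ceiling min_delta steps below
 * the ceiling for normal frames, but never below the minimum Q. */
static int gf_arf_max_q(int max_q, int min_q, int min_delta) {
  int q;
  assert(min_q <= max_q && min_delta >= 0);
  q = max_q - min_delta;
  return q < min_q ? min_q : q;
}
```

With max_q = 255 and a delta of 20, normal frames may reach Q 255 while GFs and ARFs stop at 235, which preserves a gap even when the rate target cannot be met.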