Paul Wilkins [Wed, 31 Mar 2021 15:58:51 +0000 (16:58 +0100)]
Change calculation of rd multiplier.
Change the way the rd multiplier is adjusted for Q and frame type.
Previously in VP9 the rd multiplier was adjusted based on crude Q bins
and whether the frame was a key frame or inter frame.
The Q bins create some problems as they potentially introduce
discontinuities in the RD curve. For example, rate rising with a
stepwise increase in Q instead of falling. As such, in AV1 they
have been removed.
A further issue was identified when examining the first round of
results from the Vizier project. Here the multiplier for each Q bin
and each frame type was optimized for a training set, for various video
formats, using single point encodes at the appropriate YT rates.
These initial results appeared to show a trend for increased rd
multiplier at higher Q for key frames. This fits with intuition as in
this encoding context a higher Q indicates that a clip is harder to
encode and frames less well predicted. However, the situation
appeared to reverse for inter frames with higher rd multipliers
chosen at low Q.
My initial suspicion was that this was a result of over fitting, but on
closer analysis I realized that this may be more related to frame type
within the broader inter frame classification. Specifically frames coded
at low Q are predominantly ARF frames, for the mid Q bin there will
likely be a mix of ARF and normal inter frames, and for the high Q bin
the frames will almost exclusively be normal inter frames from difficult
content.
ARF frames are inherently less well predicted than other inter frames,
being further apart and not having access to as many prediction modes.
We also know from previous work that ARF frames have a higher
incidence of INTRA coding and may well behave more like key frames
in this context.
This patch replaces the bin based approach with a linear function
that applies a small but smooth Q based adjustment. It also splits
ARF frames and normal inter frames into separate categories.
With this done, the number of parameters that will be exposed for the
next round of Vizier training is reduced from 7 to 3 (one adjustment
factor each for inter, ARF and key frames).
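The difference between the bin based and linear approaches can be
sketched as follows. This is an illustration only, not the libvpx
code: the bin values, the (base, slope) pairs and the function names
are all made up, and only the 0-255 Q index range and the 0-64 /
65-128 bin edges reflect real usage.

```python
MAX_Q_INDEX = 255  # VP9 Q index range is 0..255


def binned_rd_mult(q_index):
    """Old style: one multiplier per coarse Q bin (values are hypothetical)."""
    if q_index <= 64:
        return 4.0
    if q_index <= 128:
        return 4.5
    return 5.0


# Hypothetical (base, slope) pair per frame category, for illustration only.
LINEAR_PARAMS = {
    "inter": (4.0, 0.5),
    "arf": (4.25, 0.5),
    "key": (4.5, 1.0),
}


def linear_rd_mult(q_index, frame_type, factor=1.0):
    """New style: smooth linear Q adjustment, scaled by one factor per category."""
    base, slope = LINEAR_PARAMS[frame_type]
    return factor * (base + slope * q_index / MAX_Q_INDEX)


def largest_step(fn):
    """Largest change in the multiplier between two adjacent Q index values."""
    return max(abs(fn(q + 1) - fn(q)) for q in range(MAX_Q_INDEX))


print(largest_step(binned_rd_mult))
print(largest_step(lambda q: linear_rd_mult(q, "key")))
```

The binned version jumps by a whole bin step when Q crosses a bin
edge, while the linear version never changes by more than slope / 255
between adjacent Q values, so it cannot introduce a kink in the RD
curve at a bin boundary.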
This patch gives net BDRATE gains for our test sets even with the
baseline / default factors as follows: (% BDRATE change in overall
PSNR and SSIM, -ve is better)
Paul Wilkins [Mon, 22 Mar 2021 19:45:10 +0000 (19:45 +0000)]
Convert Vizier RD parameters to normalized factors
This patch converts the Vizier custom RD multipliers to factors
that adjust each RD multiplier either side of its default value, where
a factor of 1.0 will give the previous default behavior.
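A minimal sketch of the normalization, assuming the factor is a
plain multiplicative scale on the default value. The names and the
default of 4.0 are hypothetical and are not taken from the patch.

```python
def to_factor(custom_mult, default_mult):
    """Express a custom RD multiplier as a factor relative to its default."""
    return custom_mult / default_mult


def apply_factor(default_mult, factor=1.0):
    """Recover the effective multiplier; a factor of 1.0 keeps the default."""
    return default_mult * factor


default_mult = 4.0  # hypothetical default value
factor = to_factor(6.0, default_mult)
print(factor, apply_factor(default_mult, factor))
```

Expressed this way, a trained value reads directly as "so much above
or below the default", regardless of what the default happens to be.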
Ultimately I would like to replace the multiple RD multipliers
triggered at different Q thresholds (eg, low, medium, high q)
with a function that adjusts the rd behavior smoothly as Q
changes.
Vizier could then be presented with a single adjustment control
for each of key frame and inter frame rd.
The current behavior is problematic.
Firstly having hard threshold Q values at which rd behavior changes
may cause anomalies in the rate distortion curve, where in some
situations, raising Q, for example, may not cause the expected drop
in rate and rise in distortion, because we have crossed a threshold
where the rate distortion multiplier changes sharply and this alters
the balance of bits spent in the prediction and residual parts of the
signal.
Having a single value that is used for a range of Q index values
(e.g. 0-64, 65-128) may also cause problems and over-fitting in
the context of the Vizier ML project. This project tries to optimize
the values for each Q range, for various YT formats, but does so
by analyzing the results of single point encodes on a set of clips.
For a given format all the clips are encoded with the same parameters
(target rate etc.) so there is likely to be clustering with regard to the
Q values used. For example, the training set may give a new value
for the Q range 0-64 but most of the data points used may have Q
close to 64.
It will likely require several iterations working with the Vizier team
to get this right. This patch just gives an initial framework for
testing.
Paul Wilkins [Wed, 10 Mar 2021 14:39:43 +0000 (14:39 +0000)]
Change SR_diff calculation and representation
This patch changes the way prediction decay is calculated.
We expect that frames that are further from an ALT-REF frame (or Golden
Frame) will be less well predicted by that ALT-REF frame. As such it is
desirable that they should contribute less to the boost calculation used
to assign bits to the ALT_REF.
This code looks at the reduction in prediction quality between the last
frame and the second reference frame (usually two frames old). We make
the assumption that we can accumulate this to get a proxy for the likely
loss of prediction quality over multiple frames.
Previously the calculation looked at the absolute difference in the
coded errors. The issue here is that the meaning of a unit difference
is not the same for very complex frames as it is for easy frames.
In this patch we scale the decay value based on how the error difference
compares to the overall frame complexity as represented by the intra
coding error.
This was tuned experimentally to give test results that
were approximately neutral for our various test sets. There was
a slight drop in Overall PSNR but a consistent improvement in
SSIM. This balance may be improved with further tuning as it is
noteworthy that it was much better on the hd_res set.
Results (Overall PSNR, SSIM -ve better) for low_res, ugc360, midres2,
ugc480P and hd_res are as follows:
As part of this adjustment the contribution of motion amplitude was
removed.
This patch also changes the control mechanism that will be exposed
on the command line for use by the Vizier project. The control is now
a linear factor which defaults to 1.0, where values < 1.0 mean a lower
decay rate and values > 1.0 mean an increased decay rate.
This presents a more easily understandable interface for use in
optimizing the decay behavior for various formats, where it is clear
what a passed in value means relative to the default.
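The scaled decay and the linear control factor can be sketched as
below. This is a hedged illustration under stated assumptions rather
than the patch itself: the function names, the clamping to [0, 1]
and the multiplicative accumulation are all assumptions, and the
error values are invented.

```python
def prediction_retained(coded_error, sr_coded_error, intra_error,
                        decay_factor=1.0):
    """Fraction of prediction quality retained for one frame, in [0, 1].

    coded_error:    error when predicting from the last frame
    sr_coded_error: error when predicting from the second reference
    intra_error:    intra coding error, used as the complexity measure
    decay_factor:   linear control; > 1.0 decays faster, < 1.0 slower
    """
    if intra_error <= 0.0:
        return 1.0
    # Loss of prediction quality when the older reference is used.
    sr_diff = max(sr_coded_error - coded_error, 0.0)
    # Scale by frame complexity so that a unit of difference means the
    # same thing for easy and for very complex frames.
    relative_diff = min(sr_diff / intra_error, 1.0)
    return max(1.0 - decay_factor * relative_diff, 0.0)


def accumulated_retention(frames, decay_factor=1.0):
    """Accumulate the per-frame values over a run of frames."""
    retained = 1.0
    for coded_error, sr_coded_error, intra_error in frames:
        retained *= prediction_retained(coded_error, sr_coded_error,
                                        intra_error, decay_factor)
    return retained


easy = (100.0, 150.0, 1000.0)      # absolute difference of 50
hard = (1000.0, 1500.0, 10000.0)   # absolute difference of 500
print(prediction_retained(*easy), prediction_retained(*hard))
```

The two example frames differ tenfold in absolute error difference
but give the same result once the difference is taken relative to
the intra error, which is the point of the change.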
With the new decay mechanism the current values for various formats
are almost certainly wrong and we still need to define sensible upper
and lower bounds for use during future training.
Paul Wilkins [Tue, 9 Mar 2021 15:11:41 +0000 (15:11 +0000)]
Vizier: Added in experimental max KF boost values.
Added the experimental max per frame KF boost values derived from
the Vizier experiments.
These are still all off by default.
When enabled I expect these to cause significant regression as they
fluctuate wildly and in a way that makes no sense from format to format.
I suspect these values reflect over fitting, perhaps from a subset of
training clips with more frequent mid chunk key frames and/or short key
frame groups.
Also fixed incorrect value for gf boost for one format.
Experiment to moderate these values and use different values for first
and subsequent KF groups to follow.
Paul Wilkins [Tue, 9 Mar 2021 14:47:25 +0000 (14:47 +0000)]
Vizier: Add in field for min kf frame boost.
Added kf_frame_min_boost field to hold the minimum per frame
boost in key frame boost calculations. Replaces hard wired value.
To be used in conjunction with and tied to the maximum value.
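One way to read the min and max per frame boost together is as a
simple clamp. The sketch below assumes exactly that; the function
name and the numeric values are hypothetical, and the actual key
frame boost calculation may combine the two bounds differently.

```python
def clamp_kf_frame_boost(raw_boost, min_boost, max_boost):
    """Limit a per frame key frame boost to the range [min_boost, max_boost]."""
    if min_boost > max_boost:
        raise ValueError("min_boost must not exceed max_boost")
    return max(min_boost, min(raw_boost, max_boost))


# Hypothetical bounds, for illustration only.
print(clamp_kf_frame_boost(10.0, min_boost=40.0, max_boost=120.0))
print(clamp_kf_frame_boost(500.0, min_boost=40.0, max_boost=120.0))
```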
Paul Wilkins [Tue, 9 Mar 2021 14:07:48 +0000 (14:07 +0000)]
Vizier: Add defaults for > 1080P
Previous code did not have sensible defaults for larger image formats.
Added defaults for Vizier RD parameters for sizes > 1080P and changed
the first pass parameters for large formats to use the 1080P values.
No supplied value for rd_mult_q_sq_key_high_qp case yet so set to
old hard wired default value.
If the Vizier parameters were enabled the lack of sensible defaults
caused a large regression for 2K clips in one of our test sets.
Paul Wilkins [Thu, 4 Mar 2021 17:10:09 +0000 (17:10 +0000)]
Further integration for Vizier.
Further integration of Vizier adjustable parameters.
This patch connects up additional configurable two pass rate control
parameters for the Vizier project. This still needs to be connected up
to a command line interface and at the moment should still be using
default values that match previous behavior.
Do not submit until verified that defaults are all working correctly.
Paul Wilkins [Wed, 3 Mar 2021 16:45:42 +0000 (16:45 +0000)]
Add fields into RC for Vizier ML experiments.
This patch adds fields into the RC data structure for the Vizier project.
The added fields allow control of some extra rate control parameters
and rate distortion.
This patch also adds functions to initialize the various parameters
though many are not yet used / wired in and for now all are set to
default values. Ultimately many will be set through new command
line options.
Marco Paniconi [Thu, 4 Feb 2021 06:09:24 +0000 (22:09 -0800)]
svc: Fix an existing unittest for flexible mode
The flag update_pattern_ was being set to 0
(because it was set before reset) instead of 1.
And the example flexible mode pattern was not setting a
non-reference frame on the top temporal, top spatial layer.
Previous parser assumed that the header would not exceed
80 characters. However, with latest FFMPEG changes, the header
of Y4M files can exceed this limit.
New parser can parse an arbitrarily long header, as long as each
tag is 255 characters or fewer.
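The tag by tag approach can be sketched as follows. This is not the
libvpx parser (which is written in C); it is a small Python model of
the same idea, with a hypothetical function name. It relies only on
the Y4M convention that header tags are separated by spaces and the
header ends at the first newline.

```python
import io

MAX_TAG_LEN = 255  # per tag limit stated in the commit message


def read_y4m_header_tags(stream):
    """Read header tags one at a time, with no limit on total header length."""
    tags = []
    current = bytearray()
    while True:
        byte = stream.read(1)
        if not byte:
            raise ValueError("unexpected end of file in Y4M header")
        if byte in (b" ", b"\n"):
            # A separator closes the tag collected so far, if any.
            if current:
                tags.append(current.decode("ascii"))
                current = bytearray()
            if byte == b"\n":
                return tags
        else:
            if len(current) >= MAX_TAG_LEN:
                raise ValueError("Y4M header tag exceeds 255 characters")
            current.append(byte[0])


# Example header well over 80 characters because of extra comment tags.
extra_tags = b" ".join(b"X" + b"a" * 50 for _ in range(5))
header = b"YUV4MPEG2 W352 H288 F30:1 Ip A1:1 C420jpeg " + extra_tags + b"\nFRAME\n"
print(read_y4m_header_tags(io.BytesIO(header))[:7])
```

Because only one tag is buffered at a time, the total header length
no longer matters; the only bound is the size of a single tag.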
Previous parser assumed that the header would not exceed
80 characters. However, with latest FFMPEG changes, the header
of Y4M files can exceed this limit.
New parser can parse up to ~200 characters. Arbitrary parsing in
future commit.
James Zern [Wed, 27 Jan 2021 02:03:17 +0000 (18:03 -0800)]
sad_test: fix compilation w/gcc 4.8.5
use a #define for kDataAlignment as it's used with DECLARE_ALIGNED
(__attribute__((aligned(n)))) and this version under CentOS is more
strict over integer constants:
../vpx_ports/mem.h:18:72: error: requested alignment is not an integer constant
#define DECLARE_ALIGNED(n, typ, val) typ val __attribute__((aligned(n)))
* changes:
Add return to vp9_extrc_update_encodeframe_result
Add status in vp9_extrc_get_encodeframe_decision
Return status in vp9_extrc_send_firstpass_stats
Return status in vp9_extrc_create/init/delete
Seen with arm-linux-gnueabihf-gcc-8 (8.3.0 & 8.4.0)
Without reworking the code or adding an additional branch this warning
cannot be silenced otherwise. The loopfilter is only called when needed
for a block so these output pixels will be set.
Marco Paniconi [Thu, 12 Nov 2020 07:11:16 +0000 (23:11 -0800)]
vp9: Allow for disabling loopfilter per spatial layer
For SVC: add parameter to the control SET_SVC_PARAMS to
allow for disabling the loopfilter per spatial layer.
Note this svc setting will override the setting via
VP9E_SET_DISABLE_LOOPFILTER (which should only be used
for non-SVC).
Add unittest to handle both SVC (spatial or temporal layers)
and non-SVC (single layer) case.
Cheng Chen [Thu, 5 Nov 2020 23:26:54 +0000 (15:26 -0800)]
Accumulate frame tpl stats and pass through rate control api
Tpl stats are computed at the beginning of encoding the altref
frame. We aggregate tpl stats of all blocks for every frame of
the current group of pictures.
After the altref frame is encoded, the tpl stats are passed through
the encode frame result to the external environment.
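The aggregation step can be sketched as below. The field names
(intra_cost, inter_cost) and the record layout are assumptions made
for illustration; the real tpl stats structure in libvpx may carry
different fields.

```python
from dataclasses import dataclass


@dataclass
class BlockTplStats:
    frame_index: int  # position of the frame within the group of pictures
    intra_cost: int   # hypothetical per block field
    inter_cost: int   # hypothetical per block field


def accumulate_frame_tpl_stats(blocks, gop_size):
    """Sum per block stats into one total per frame of the group of pictures."""
    totals = [{"intra_cost": 0, "inter_cost": 0} for _ in range(gop_size)]
    for block in blocks:
        frame_total = totals[block.frame_index]
        frame_total["intra_cost"] += block.intra_cost
        frame_total["inter_cost"] += block.inter_cost
    return totals


blocks = [
    BlockTplStats(0, 100, 80),
    BlockTplStats(0, 50, 20),
    BlockTplStats(1, 70, 10),
]
print(accumulate_frame_tpl_stats(blocks, gop_size=2))
```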
Change-Id: I2284f8cf9c45d35ba02f3ea45f0187edbbf48294
Add a comment to vp9_args to point out that bitdeptharg and
inbitdeptharg do not have a corresponding entry in vp9_arg_ctrl_map and
must be listed at the end of vp9_args.
Angie Chiang [Thu, 15 Oct 2020 19:05:37 +0000 (12:05 -0700)]
Add unit test for vp9_ext_ratectrl
Fix three bugs along the way.
1) Call vp9_extrc_send_firstpass_stats() after vp9_extrc_create()
2) Pass in model pointer in vp9_extrc_create()
3) Free frame_stats buffer in vp9_extrc_delete()
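The call ordering that these three fixes enforce can be modelled
with a small mock. This is not the libvpx API: the class, its method
signatures and the status codes are hypothetical stand-ins for
vp9_extrc_create(), vp9_extrc_send_firstpass_stats() and
vp9_extrc_delete(), written only to show the lifecycle.

```python
STATUS_OK = 0     # hypothetical status codes
STATUS_ERROR = 1


class MockExternalRateControl:
    """Toy model of the create / send stats / delete lifecycle."""

    def __init__(self):
        self.model = None
        self.frame_stats = None
        self.ready = False

    def create(self, model, num_frames):
        # Fix 2: the model must actually be handed over at creation.
        if model is None:
            return STATUS_ERROR
        self.model = model
        self.frame_stats = [None] * num_frames
        self.ready = True
        return STATUS_OK

    def send_firstpass_stats(self, stats):
        # Fix 1: only valid once create() has succeeded.
        if not self.ready or len(stats) > len(self.frame_stats):
            return STATUS_ERROR
        self.frame_stats[:len(stats)] = stats
        return STATUS_OK

    def delete(self):
        if not self.ready:
            return STATUS_ERROR
        # Fix 3: release the stats buffer together with the model.
        self.frame_stats = None
        self.model = None
        self.ready = False
        return STATUS_OK


rc = MockExternalRateControl()
print(rc.send_firstpass_stats([1.0]))  # called too early
print(rc.create(model=object(), num_frames=2))
print(rc.send_firstpass_stats([1.0, 2.0]))
print(rc.delete())
```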