x86inc.asm's cglobal macro is frequently used to declare more
arguments than the function actually has. Normally, this is
done to acquire an alias to a register that would correspond to
that positional function argument if it existed. This is safe
when used in this manner.
In the case fixed here, however, the alias is used to temporarily
store addresses obtained through the GOT in memory. Because those
extra arguments don't actually exist, those stores corrupt the
caller's stack frame.
SSE2/VpxHBDSubpelVarianceTest.Ref is a test that may fail as a
result.
As a simple fix, the space allocated to actual arguments that have
already been loaded into registers is reused.
This avoids having to allocate extra space for local variables.
Jingning Han [Mon, 17 Sep 2018 18:46:17 +0000 (11:46 -0700)]
Add a frame_index entry to RefCntBuffer
This entry will only be effectively used at the encoder side.
Adding it to the RefCntBuffer data structure would help make the
associated logic a lot simpler. Its effect on the decoder side
would be explicitly sent through the bit-stream.
James Zern [Wed, 22 Aug 2018 21:03:54 +0000 (14:03 -0700)]
cosmetics: normalize include guards
use the recommended format [1] of:
<PROJECT>_<PATH>_<FILE>_H_
[1] https://google.github.io/styleguide/cppguide.html#The__define_Guard
"All header files should have #define guards to prevent multiple
inclusion. The format of the symbol name should be
<PROJECT>_<PATH>_<FILE>_H_."
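To illustrate the format quoted above, here is a minimal C sketch that derives a guard name from a project name and a relative header path. The helper make_include_guard is hypothetical and not part of libvpx; the path in the usage note is only an illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not part of libvpx): builds a guard of the form
 * <PROJECT>_<PATH>_<FILE>_H_ from a project name and a relative header
 * path. Returns 0 on success, -1 if the output buffer is too small. */
int make_include_guard(const char *project, const char *path, char *out,
                       size_t out_size) {
  const char *parts[2];
  size_t n = 0;
  int i;
  assert(project != NULL && path != NULL && out != NULL);
  parts[0] = project;
  parts[1] = path;
  for (i = 0; i < 2; ++i) {
    const char *p;
    for (p = parts[i]; *p != '\0'; ++p) {
      unsigned char c = (unsigned char)*p;
      if (n + 1 >= out_size) return -1;
      /* Letters and digits are upper-cased; '/', '.', '-' become '_'. */
      out[n++] = isalnum(c) ? (char)toupper(c) : '_';
    }
    if (n + 1 >= out_size) return -1;
    /* Separator after the project name, trailing '_' after "H". */
    out[n++] = '_';
  }
  out[n] = '\0';
  return 0;
}
```

With project "VPX" and path "vp9/common/vp9_blockd.h", the helper writes VPX_VP9_COMMON_VP9_BLOCKD_H_ into the buffer.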
Jingning Han [Thu, 13 Sep 2018 05:51:19 +0000 (22:51 -0700)]
Remove deprecated first_inter_index
With the refactoring of the logic that determines whether a frame needs
recode runs to adapt to the target bit-rate, the variable
first_inter_index is no longer in effective use. Hence remove it.
* changes:
Remove some deprecated FRAME_UPDATE_TYPE elements.
Remove some deprecated constants.
Remove unused rate control data elements
Remove extra_arf_allowed.
Jingning Han [Tue, 11 Sep 2018 17:50:49 +0000 (10:50 -0700)]
Simplify vp9_frame_type_qdelta()
Make direct use of frame type in the available VP9_COMMON structure.
Eliminate the need to map through rf_level to fetch the frame type.
This change doesn't alter the coding stats. It simplifies the
vp9_frame_type_qdelta() function logic and removes unnecessary
reference to rf_level.
Jingning Han [Mon, 10 Sep 2018 18:55:10 +0000 (11:55 -0700)]
Rework two_pass_first_group_inter()
This function is used in part to decide whether to trigger the recode loop
for the first normal P frame in a GOP. Rework its design logic to
support the GOP with multi-layer ARF. Allow recode when there is
a transition from ARF/OVERLAY/USE_BUF to normal P frame.
The overall coding performance for multi-ARF gets slightly better
(less than 0.1% for show_existing_frame case). Tested on a few
clips, the encoding speed remains similar too. This change primarily
serves to help integration of multi-layer ARF and dual-ARF systems.
Jingning Han [Fri, 7 Sep 2018 23:37:42 +0000 (16:37 -0700)]
Separate frame context index for GOP layers
Use separate frame context index to code frames at different layers.
The maximum index cap is set as 3. This improves the compression
performance of multi-layer ARF by 0.15% across the test sets.
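A minimal sketch of the capped index selection described above. The helper name and the direct mapping from layer depth to context index are assumptions made for illustration; only the cap of 3 comes from the commit message.

```c
#include <assert.h>

/* Cap quoted in the commit message. */
#define MAX_FRAME_CONTEXT_IDX 3

/* Hypothetical helper: picks the frame context index for a frame from its
 * layer depth in the GOP. Depth 0 is taken to be the base layer, which is
 * an assumed convention. */
int frame_context_idx_for_layer(int layer_depth) {
  assert(layer_depth >= 0);
  return layer_depth < MAX_FRAME_CONTEXT_IDX ? layer_depth
                                             : MAX_FRAME_CONTEXT_IDX;
}
```

Frames deeper than the cap share the last context index, so the number of contexts stays bounded regardless of the ARF layer depth.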
Paul Wilkins [Thu, 6 Sep 2018 14:56:01 +0000 (15:56 +0100)]
Fix rate control bug with recode all.
This patch fixes a rate control bug that can manifest if the recode
loop is activated for all frame types. Specifically things go wrong when the
recode loop is used on an overlay frame that has a rate target of 0 bits.
The patch prevents adjustment of the active worst quality and repeat recode
loops for overlay frames.
The bug showed up during artificial experiments on re-distribution of bits in
ARF groups but does not activate in any current encode profile, as even
best quality does not currently allow recodes for all frames.
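The guard described above might look roughly like the following simplified sketch. The FrameRateInfo structure, the should_recode function and the overshoot threshold are all hypothetical stand-ins, not the libvpx rate control code; the sketch only shows that overlay frames are excluded before any rate comparison is made.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified frame description; not the libvpx structures. */
typedef struct {
  int is_overlay;     /* nonzero for an overlay frame */
  int target_bits;    /* rate target for the frame */
  int projected_bits; /* size produced by the last encode attempt */
} FrameRateInfo;

/* Returns nonzero if another recode pass should be attempted because the
 * frame overshot its target by more than overshoot_pct_limit percent. */
int should_recode(const FrameRateInfo *f, int overshoot_pct_limit) {
  long long limit;
  assert(f != NULL && overshoot_pct_limit >= 0);
  /* Overlay frames may have a target of 0 bits, so any output would count
   * as a huge overshoot. Skip the recode decision for them entirely. */
  if (f->is_overlay) return 0;
  if (f->target_bits <= 0) return 0;
  limit = (long long)f->target_bits * (100 + overshoot_pct_limit) / 100;
  return f->projected_bits > limit;
}
```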
Jingning Han [Fri, 7 Sep 2018 17:25:07 +0000 (10:25 -0700)]
Fork auto-alt-ref control
Temporarily fork the auto-alt-ref control meaning. When it is set
to be 1, use single layer ARF as baseline. The value 2 would enable
dual ARF system. Any number above it would trigger automatic multi-
layer ARFs.
We would gradually refactor and integrate dual ARF and multi-layer
ARF systems next, and eventually make auto-alt-ref directly control
the layer depth.
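The temporary meaning of the control value can be summarized in a small C sketch. The enum and function names are hypothetical, and treating 0 as "alt-ref disabled" is an assumption that is not stated in the commit message.

```c
#include <assert.h>

/* Hypothetical names for the modes described in the commit message. */
typedef enum {
  ARF_DISABLED,     /* assumption: 0 disables alt-ref frames */
  ARF_SINGLE_LAYER, /* 1: single layer ARF baseline */
  ARF_DUAL,         /* 2: dual ARF system */
  ARF_MULTI_LAYER   /* above 2: automatic multi-layer ARFs */
} ArfMode;

/* Maps the auto-alt-ref control value to the mode it selects. */
ArfMode arf_mode_from_auto_alt_ref(int auto_alt_ref) {
  assert(auto_alt_ref >= 0);
  if (auto_alt_ref == 0) return ARF_DISABLED;
  if (auto_alt_ref == 1) return ARF_SINGLE_LAYER;
  if (auto_alt_ref == 2) return ARF_DUAL;
  return ARF_MULTI_LAYER;
}
```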
Jingning Han [Fri, 7 Sep 2018 17:20:56 +0000 (10:20 -0700)]
Extend auto-alt-ref parameter range
Extend the upper limit from 2 (dual ARFs) to maximum ARF layers.
This would later allow --auto-alt-ref to directly control the
ARF layer depth.
Jingning Han [Fri, 31 Aug 2018 22:54:10 +0000 (15:54 -0700)]
Adaptive ARF factor decision
Re-count the factors to decide bit boost factor for the
intermediate layer ARFs. Make the gfu_boost factor assigned to
each ARF adapt to its local factors.
This and the recursive change 5bfe9eb together improve the
multi-layer ARF compression performance:
Jingning Han [Thu, 30 Aug 2018 04:30:35 +0000 (21:30 -0700)]
Recursive rate allocation for multi-layer ARF coding
Recursively calculate the rate boost for the ARF frames at the
given layer depth from the remaining available bit resource after
the prior layer ARFs' consumption.
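The recursion can be sketched as follows. This is a simplified illustration under stated assumptions: each layer is given one budget rather than per-ARF boosts, and the one-quarter share taken at each depth is a made-up constant, not the factor used in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: assigns a boost budget to each ARF layer from the
 * bits left over after the shallower layers have taken their share.
 * Fills boost_bits[depth .. max_depth - 1] and returns the bits that remain
 * once every layer has been served. */
int allocate_arf_bits(int remaining_bits, int depth, int max_depth,
                      int *boost_bits) {
  int layer_bits;
  assert(remaining_bits >= 0 && depth >= 0 && boost_bits != NULL);
  if (depth >= max_depth) return remaining_bits;
  /* Assumed share: a quarter of whatever is still available. */
  layer_bits = remaining_bits / 4;
  boost_bits[depth] = layer_bits;
  return allocate_arf_bits(remaining_bits - layer_bits, depth + 1, max_depth,
                           boost_bits);
}
```

Starting from 1600 bits and three layers, this assigns 400, 300 and 225 bits to depths 0, 1 and 2 and leaves 675 bits for the remaining frames.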
Jingning Han [Tue, 28 Aug 2018 21:08:02 +0000 (14:08 -0700)]
Enable adaptive rate allocation for multi-layer ARFs
Increase the bit allocation for the intermediate layer ARFs. The
current strategy assigns higher offset to the lower layer ARFs.
The needed budget is borrowed from the base layer ARF allocation.
Jingning Han [Tue, 28 Aug 2018 19:43:34 +0000 (12:43 -0700)]
Increase encoder buffer for multi-layer ARFs
When multi-layer ARF mode is enabled, increase the encoder buffer
to account for the situation where several ARFs are coded together
in a frame packet.
Paul Wilkins [Mon, 3 Sep 2018 15:48:02 +0000 (16:48 +0100)]
Fix short first kf bug.
This change is in response to the quality issue in b/112953058.
The quality regression observed is a result of a bug that manifested
because of a very short key frame group at the start of a chunk.
The group was so short that it was less than the minimum allowed
length of an ARF group, so the initial group was coded as a GF only
group. However, group length was not set correctly and the result
was a frame coded with a target of 0 bits.
This causes two problems:
Firstly, there is one very poor frame, which caused the issue to be raised.
Secondly, that one frame obviously overshoots its 0 target very heavily,
and this has the effect of moving the needle significantly in terms of the
adaptive rate control (specifically the estimate of bits per macro block
used to estimate the active Q range). Consequently there is undershoot
for most of the rest of the chunk and the overall rate ends up much lower
than the target (14Mb/s vs a target of 22Mb/s). (The sharp drop in the
overall rate is also noted in the issue.)
Paul Wilkins [Mon, 3 Sep 2018 15:12:22 +0000 (16:12 +0100)]
Revert "Revert "Prevent double application of min rate in two pass.""
This rate control bug in the original patch is not the underlying cause
of the quality regression but simply unmasked a problem which stems
from applying 0 bits to the last frame in a short KF group at the start
of a chunk.
Marco Paniconi [Mon, 3 Sep 2018 05:17:32 +0000 (22:17 -0700)]
vp9-svc: Fix condition for pattern constraints
For fixed/non-flexible SVC mode: on non-key spatial
enhancement layers modify constraint on the inter-layer
prediction to include the first_spatial_layer_to_encode.
Marco Paniconi [Fri, 31 Aug 2018 22:42:19 +0000 (15:42 -0700)]
vp9-svc: Add first_spatial_layer_to_encode per superframe
VP9E_SET_SVC_LAYER_ID sets the first spatial layer to
encode per superframe, so add this parameter to the svc encoder.
This is needed, for example, to properly set is_key_frame for
spatial layers when the base spatial layer is skipped.
Hui Su [Tue, 17 Jul 2018 05:05:19 +0000 (22:05 -0700)]
ML based rectangular partition search pruning
Add an ML model to predict if rectangular partition search can be skipped
without much coding loss. This model is enabled for speed 0 low bitdepth
only.
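As a rough illustration of how such a pruning decision can be wired in, the sketch below uses a plain linear scorer as a stand-in. The function name, features, weights and threshold are all hypothetical, and the model in the actual patch may be of a different form; only the speed 0 and low bitdepth gating comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: returns nonzero if the rectangular partition search
 * may be skipped. The score is bias + sum(weights[i] * features[i]); the
 * search is skipped when the score exceeds the threshold. */
int can_skip_rect_partition(int speed, int is_high_bitdepth,
                            const float *features, const float *weights,
                            size_t n, float bias, float threshold) {
  float score = bias;
  size_t i;
  assert(features != NULL && weights != NULL);
  /* The model is only enabled for speed 0 and low bitdepth. */
  if (speed != 0 || is_high_bitdepth) return 0;
  for (i = 0; i < n; ++i) score += weights[i] * features[i];
  return score > threshold;
}
```

Raising the threshold makes the pruning more conservative, trading encoding speed for less risk of coding loss.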