When built with -fsanitize=address,undefined, a number of tests,
such as ByteAlignmentTest.SwitchByteAlignment, produce runtime
errors about unaligned 4-byte loads/stores. While normally not
really a problem, this does technically violate the language
standard, and it is easy to fix in a standard-conforming way using
memcpy, which does not produce inferior code.
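A minimal sketch of the memcpy approach described above. The helper
names are illustrative assumptions and are not taken from the patch.
With optimizations enabled, compilers commonly lower a fixed-size
memcpy like this to a single load or store on targets that allow
unaligned access, which is why no penalty is expected.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helpers (names are illustrative, not from the patch).

// Read a 32-bit value from an address that may not be 4-byte aligned.
static inline uint32_t load_unaligned_u32(const void *p) {
  uint32_t v;
  std::memcpy(&v, p, sizeof(v));  // well-defined for any alignment
  return v;
}

// Write a 32-bit value to an address that may not be 4-byte aligned.
static inline void store_unaligned_u32(void *p, uint32_t v) {
  std::memcpy(p, &v, sizeof(v));
}
```

Dereferencing a misaligned uint32_t pointer directly is what the
sanitizer reports. Going through memcpy keeps the same byte layout
without forming such a pointer.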
When running tests built with
-fsanitize=undefined and --disable-optimizations,
the sanitizer will emit errors of the following general form:
runtime error: member call on address 0xxxxxxxxx which does not
point to an object of type 'WithParamInterface'
0xxxxxxxxx: note: object has invalid vptr
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ...
^~~~~~~~~~~~~~~~~~~~~~~
invalid vptr
This can be traced to calls to WithParamInterface<T>::GetParam before
the object argument has been initialized. Although GetParam only
accesses static data, it is a non-static member function, so calling
it on an uninitialized object has undefined behaviour.
The patch makes GetParam a static member function.
The alternative - if the pull request is denied - would be to
modify all parameterized tests to have them derive from
::libvpx_test::CodecTestWith*Params as the first base class.
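A simplified sketch of why a static GetParam avoids the problem.
ParamMixin, FirstBase and Fixture are hypothetical stand-ins and not
the real googletest or libvpx classes. The fixture reads the parameter
while an earlier base class is still being initialized, which is only
well-defined because the accessor needs no object.

```cpp
#include <cassert>

// Simplified stand-in for a parameter mixin (not the googletest class).
template <typename T>
class ParamMixin {
 public:
  // Static: only touches static data, so no object is required.
  static const T &GetParam() {
    assert(parameter_ != nullptr);
    return *parameter_;
  }
  static void SetParam(const T *p) { parameter_ = p; }

 private:
  static const T *parameter_;
};

template <typename T>
const T *ParamMixin<T>::parameter_ = nullptr;

// Hypothetical first base class that wants the parameter at construction.
struct FirstBase {
  explicit FirstBase(int v) : value(v) {}
  int value;
};

// FirstBase is initialized before the ParamMixin<int> subobject, so a
// non-static GetParam() would be invoked on a base that has not been
// initialized yet. The static version does not depend on it.
struct Fixture : FirstBase, ParamMixin<int> {
  Fixture() : FirstBase(GetParam()) {}
};
```

The alternative mentioned above, deriving from the parameter class
first, would instead change the base class order so that the mixin is
initialized before anything calls into it.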
x86inc.asm's cglobal macro is frequently used to declare more
arguments than the function actually has. Normally, this is
done to acquire an alias to a register that would correspond to
that positional function argument if it existed. This is safe
when used in this manner.
In the case fixed here, however, the alias is used to temporarily
store addresses obtained through the GOT in memory. Because those
extra arguments don't actually exist, those stores corrupt the
caller's stack frame.
SSE2/VpxHBDSubpelVarianceTest.Ref is a test that may fail as a
result.
As a simple fix, the space allocated to actual arguments that have
already been loaded into registers is reused.
This avoids having to allocate extra space for local variables.
Jingning Han [Mon, 17 Sep 2018 21:17:36 +0000 (14:17 -0700)]
Update frame index per buffer at encoder
Update the frame index counting from key frame offset for all
the processed frames at the encoder. This would allow encoder to
automatically decide frame sign bias next.
Jingning Han [Mon, 17 Sep 2018 18:46:17 +0000 (11:46 -0700)]
Add a frame_index entry to RefCntBuffer
This entry will only be effectively used at the encoder side.
Adding it to the RefCntBuffer data structure would help make the
associated logic a lot simpler. Its effect on the decoder side
would be explicitly sent through the bit-stream.
James Zern [Wed, 22 Aug 2018 21:03:54 +0000 (14:03 -0700)]
cosmetics: normalize include guards
use the recommended format [1] of:
<PROJECT>_<PATH>_<FILE>_H_
[1] https://google.github.io/styleguide/cppguide.html#The__define_Guard
"All header files should have #define guards to prevent multiple
inclusion. The format of the symbol name should be
<PROJECT>_<PATH>_<FILE>_H_."
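For illustration, a guard in this format might look as follows. The
header path vp9/common/vp9_example.h, the VPX project prefix and the
function inside are assumptions made up for this sketch.

```cpp
// Hypothetical header located at vp9/common/vp9_example.h.
#ifndef VPX_VP9_COMMON_VP9_EXAMPLE_H_
#define VPX_VP9_COMMON_VP9_EXAMPLE_H_

// Declarations go between the #define and the closing #endif, so a
// second inclusion of this file expands to nothing.
inline int vp9_example_answer() { return 42; }

#endif  // VPX_VP9_COMMON_VP9_EXAMPLE_H_
```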
Jingning Han [Thu, 13 Sep 2018 05:51:19 +0000 (22:51 -0700)]
Remove deprecated first_inter_index
With the refactoring of the logic that determines if a frame needs
re-code runs to adapt to the target bit-rate, the variable
first_inter_index is no longer in effective use. Hence remove it.
* changes:
Remove some deprecated FRAME_UPDATE_TYPE elements.
Remove some deprecated constants.
Remove unused rate control data elements
Remove extra_arf_allowed.
Always allocate cpi->common.postproc_state.limits using unscaled width.
With ./configure --enable-pic --enable-decode-perf-tests
--enable-encode-perf-tests --enable-encode-perf-tests
--enable-vp9-highbitdepth --enable-better-hw-compatibility
--enable-internal-stats --enable-postproc --enable-vp9-postproc
--enable-error-concealment --enable-coefficient-range-checking
--enable-postproc-visualizer --enable-multi-res-encodin
--enable-vp9-temporal-denoising --enable-webm-io --enable-libyuv
segfaults tend to occur in VP9/DatarateOnePassCbrSvcSingleBR.* tests.
This is an analogue to issue
https://bugs.chromium.org/p/webm/issues/detail?id=1374
where a buffer allocated using a scaled width is reused after scaling
back to the original size. Unfortunately, in this case the unscaled
width doesn't appear to be known in the immediate context of the
allocation, so the signature of vp9_post_proc_frame needs to be
changed to provide that information in order to apply a fix similar
to the one in #1374.
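A rough sketch of the idea, not the actual libvpx code. PostprocState
and EnsureLimits are hypothetical names. The point is that a buffer
that is allocated once and then reused must be sized from the unscaled
width, so the caller has to pass that width in.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the post-processing state.
struct PostprocState {
  std::vector<unsigned char> limits;
};

// Allocate the limits buffer on first use. The size comes from the
// unscaled width supplied by the caller, not from the width of the
// (possibly scaled) frame being processed at that moment.
inline void EnsureLimits(PostprocState *state, int unscaled_width) {
  if (state->limits.empty()) {
    state->limits.assign(static_cast<std::size_t>(unscaled_width), 0);
  }
}
```

If the buffer were sized from a scaled width of, say, half the
original, a later frame at the original size would index past its end.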
Jingning Han [Tue, 11 Sep 2018 17:50:49 +0000 (10:50 -0700)]
Simplify vp9_frame_type_qdelta()
Make direct use of frame type in the available VP9_COMMON structure.
Eliminate the need to map through rf_level to fetch the frame type.
This change doesn't alter the coding stats. It simplifies the
vp9_frame_type_qdelta() function logic and removes unnecessary
reference to rf_level.
Jingning Han [Mon, 10 Sep 2018 18:55:10 +0000 (11:55 -0700)]
Rework two_pass_first_group_inter()
This function is used in part to decide whether to trigger the recode
loop for the first normal P frame in a GOP. Rework its design logic to
support the GOP with multi-layer ARF. Allow recode when there is
a transition from ARF/OVERLAY/USE_BUF to normal P frame.
The overall coding performance for multi-ARF gets slightly better
(less than 0.1% for show_existing_frame case). Tested on a few
clips, the encoding speed remains similar too. This change primarily
serves to help integration of multi-layer ARF and dual-ARF systems.
Jingning Han [Fri, 7 Sep 2018 23:37:42 +0000 (16:37 -0700)]
Separate frame context index for GOP layers
Use separate frame context index to code frames at different layers.
The maximum index cap is set as 3. This improves the compression
performance of multi-layer ARF by 0.15% across the test sets.
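One possible reading of the cap, as a sketch. It assumes the context
index simply follows the layer depth until it reaches 3. The helper
and constant names are hypothetical and the actual mapping in the
encoder may differ.

```cpp
#include <algorithm>

// Cap taken from the commit message above.
constexpr int kMaxFrameContextIndex = 3;

// Hypothetical helper: map a GOP layer depth (0 = base layer) to a
// frame context index, assuming one context per layer up to the cap.
inline int FrameContextIndexForLayer(int layer_depth) {
  return std::min(std::max(layer_depth, 0), kMaxFrameContextIndex);
}
```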
Paul Wilkins [Thu, 6 Sep 2018 14:56:01 +0000 (15:56 +0100)]
Fix rate control bug with recode all.
This patch fixes a rate control bug that can manifest if the recode
loop is activated for all frame types. Specifically things go wrong when the
recode loop is used on an overlay frame that has a rate target of 0 bits.
The patch prevents adjustment of the active worst quality and repeat recode
loops for overlay frames.
The bug showed up during artificial experiments on re-distribution of bits in
ARF groups but does not activate in any current encode profile, as even
best quality does not currently allow recodes for all frames.
Jingning Han [Fri, 7 Sep 2018 17:25:07 +0000 (10:25 -0700)]
Fork auto-alt-ref control
Temporarily fork the auto-alt-ref control meaning. When it is set
to be 1, use single layer ARF as baseline. The value 2 would enable
dual ARF system. Any number above it would trigger automatic multi-
layer ARFs.
We would gradually refactor and integrate dual ARF and multi-layer
ARF systems next, and eventually make auto-alt-ref directly control
the layer depth.
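The temporary value mapping described above can be sketched as below.
The enum and function names are hypothetical, and treating 0 as
"disabled" is an assumption that the message does not state.

```cpp
// Hypothetical names for the ARF configurations in the message above.
enum class ArfMode { kDisabled, kSingleLayer, kDualArf, kMultiLayer };

// Interpret an auto-alt-ref value under the temporary forked meaning.
inline ArfMode ArfModeFromAutoAltRef(int auto_alt_ref) {
  if (auto_alt_ref <= 0) return ArfMode::kDisabled;  // assumption
  if (auto_alt_ref == 1) return ArfMode::kSingleLayer;
  if (auto_alt_ref == 2) return ArfMode::kDualArf;
  return ArfMode::kMultiLayer;  // any value above 2
}
```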
Jingning Han [Fri, 7 Sep 2018 17:20:56 +0000 (10:20 -0700)]
Extend auto-alt-ref parameter range
Extend the upper limit from 2 (dual ARFs) to maximum ARF layers.
This would later allow --auto-alt-ref to directly control the
ARF layer depth.
Jingning Han [Fri, 31 Aug 2018 22:54:10 +0000 (15:54 -0700)]
Adaptive ARF factor decision
Re-count the factors to decide bit boost factor for the
intermediate layer ARFs. Make the gfu_boost factor assigned to
each ARF adapt to its local factors.
This and the recursive change 5bfe9eb together improve the
multi-layer ARF compression performance:
Jingning Han [Thu, 30 Aug 2018 04:30:35 +0000 (21:30 -0700)]
Recursive rate allocation for multi-layer ARF coding
Recursively calculate the rate boost for the ARF frames at the
given layer depth from the remaining available bit resource after
the prior layer ARFs consumption.
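A toy sketch of the recursive shape only. It assumes each layer takes
a fixed percentage of whatever bits remain after the layers before it.
That fixed percentage, like the function name and parameters, is an
assumption made for illustration and is not how the encoder derives
its boost.

```cpp
#include <vector>

// Hypothetical: give the ARFs at `layer` a share of the bits that are
// still available, then recurse into the next layer with the rest.
inline void AllocateArfLayerBits(long long remaining_bits, int layer,
                                 int max_layer, int percent,
                                 std::vector<long long> *out) {
  if (layer > max_layer) return;
  const long long layer_bits = remaining_bits * percent / 100;
  out->push_back(layer_bits);
  AllocateArfLayerBits(remaining_bits - layer_bits, layer + 1, max_layer,
                       percent, out);
}
```

Each deeper layer therefore sees a smaller budget, because the budget
it draws from has already been reduced by the layers above it.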