Jingning Han [Wed, 26 Sep 2018 23:34:16 +0000 (16:34 -0700)]
Adapt GOP size threshold to the allowed layer depth
Increase the total prediction error budget linearly with the
allowed ARF layer depth. This in general improves the compression
performance, but does hit corner cases on a few clips at very
low bit-rate range (corresponding to 26 - 28 dB range). To mitigate
such problem, we temporarily work around this problem by limiting
the first GOP size to be ~8 so as to not drain up the bit resource.
The overall compression performance improvements over the current
multi-layer ARF system in speed 0 are:
Jingning Han [Mon, 24 Sep 2018 20:53:53 +0000 (13:53 -0700)]
Use layer dependent gfu_boost factor
When multi-layer ARF is enabled, use the corresponding gfu_boost
factor assigned to each ARF to compute the best_quality_index
adjustment. This on average improves the coding performance by
0.2% for lowres and hdres, 0.4% for ntflx2k. It seems this change
will only affect a small group of clips, e.g., pamphlet, bowing,
mobcal_720p, etc., which tend to gain 4-5%, whereas the rest
clips remain largely identical coding statistics.
With --enable-better-hw-compatibility an access to array element -1
can be observed for VP9/ActiveMapTest.Test/0
../vp9/encoder/vp9_rdopt.c:3938:53: runtime error:
index -1 out of bounds for type 'RefBuffer [3]'
There doesn't seem anything that would prevent ref_frame from being 0.
If there is no reference frame it can probably be assumed that it
isn't scaled.
When built with -fsanitizer=address,undefined a number of tests,
such as ByteAlignmentTest.SwitchByteAlignment or
ByteAlignmentTest.SwitchByteAlignment produce runtime errors about
unaligned 4-byte loads/stores. While normally not really a problem,
this does technically violate the language and it is eays to fix in
a standard conforming way using memcpy which does not produce
inferior code.
Jingning Han [Thu, 20 Sep 2018 17:03:27 +0000 (10:03 -0700)]
Generalize encoder comp_var_ref setting
Generalize the encoder comp_fixed_ref and comp_var_ref assignments.
Make it fully support 2 fwd + 1 bwd and 1 fwd + 2 bwd settings
that VP9 decoder allows.
When running tests built with
-fsanitize=undefined and--disable-optimizations
the sanitizer will emit errors of the following general form:
runtime error: member call on address 0xxxxxxxxx which does not
point to an object of type 'WithParamInterface'
0xxxxxxxxx: note: object has invalid vptr
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ...
^~~~~~~~~~~~~~~~~~~~~~~
invalid vptr
This can be traced to calls to WithParamInterface<T>::GetParam before
the object argument has been initialized. Although GetParam only
accesses static data it is a non-static member function. This causes
that call to have undefined behaviour.
The patch makes GetParam a static member function.
The alternative - if the pull request is denied - would be to
modify all parameterized tests to have them derive from
::libvpx_test::CodecTestWith*Params as the first base class.
Jingning Han [Tue, 18 Sep 2018 22:08:53 +0000 (15:08 -0700)]
Skip RD check for compound modes that have same sign bias
The compound mode can only be run between two reference frames
with different sign bias flags. Skip the search over same sign
bias reference frames in the rate-distortion optimization.
x86inc.asm's cglobal macro is frequently used to declare more
arguments than the function actually has. Normally, this is
done to aquire an alias to a register that would correspond to
that positional function argument if it existed. This is safe
when used in this manner.
In the case fixed here, however, the alias is used to temporarily
store adresses obtained through the GOT in memory. Because those
extra arguments don't actually exist, those stores corrupt the
callers stack frame.
SSE2/VpxHBDSubpelVarianceTest.Ref is a test that may fail as a
result.
To simply fix the space allocated to actual arguments that have
been loaded into registers already is reused.
This avoids having to allocate extra space for local variables.
Jingning Han [Mon, 17 Sep 2018 21:17:36 +0000 (14:17 -0700)]
Update frame index per buffer at encoder
Update the frame index counting from key frame offset for all
the processed frames at the encoder. This would allow encoder to
automatically decide frame sign bias next.
Jingning Han [Mon, 17 Sep 2018 18:46:17 +0000 (11:46 -0700)]
Add a frame_index entry to RefCntBuffer
This entry will only be effectively used at the encoder side.
Adding it to the RefCntBuffer data structure would help make the
associated logic a lot simpler. Its effect on the decoder side
would be explicitly sent through the bit-stream.
James Zern [Wed, 22 Aug 2018 21:03:54 +0000 (14:03 -0700)]
cosmetics: normalize include guards
use the recommended format [1] of:
<PROJECT>_<PATH>_<FILE>_H_
[1] https://google.github.io/styleguide/cppguide.html#The__define_Guard
"All header files should have #define guards to prevent multiple
inclusion. The format of the symbol name should be
<PROJECT>_<PATH>_<FILE>_H_."
Jingning Han [Thu, 13 Sep 2018 05:51:19 +0000 (22:51 -0700)]
Remove deprecated first_inter_index
With the refactoring of logics that determines if a frame needs
re-code runs to adapt to the target bit-rate, the variable
first_inter_index is no longer in effect use. Hence remove it.
* changes:
Remove some deprecated FRAME_UPDATE_TYPE elements.
Remove some deprecated constants.
Remove unused rate control data elements
Remove extra_arf_allowed.
Always allocate cpi->common.postproc_state.limits using unscaled width.
With ./configure --enable-pic --enable-decode-perf-tests
--enable-encode-perf-tests --enable-encode-perf-tests
--enable-vp9-highbitdepth --enable-better-hw-compatibility
--enable-internal-stats --enable-postproc --enable-vp9-postproc
--enable-error-concealment --enable-coefficient-range-checking
--enable-postproc-visualizer --enable-multi-res-encodin
--enable-vp9-temporal-denoising --enable-webm-io --enable-libyuv
segfaults tend to occur in VP9/DatarateOnePassCbrSvcSingleBR.* tests.
This is an analogue to issue
https://bugs.chromium.org/p/webm/issues/detail?id=1374
where a buffer allocated using a scaled width is reused after scaling
back to the original size. Unfortunately, in this case the unscaled
width doesn't appear to be known in the immediated context of the
allocation, so the the signature of vp9_post_proc_frame needs to be
changed to provide that information in order to provide a similar fix
as in #1374.