Marco Paniconi [Thu, 28 Jun 2018 23:08:55 +0000 (16:08 -0700)]
vp9-svc: Adjust threshold for early exit on golden
Use avg_frame_low_motion to reduce or turn off this
early exit for higher-motion content. This recovers some
quality on higher-motion clips and keeps the same exit
threshold for low-motion clips.
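A minimal sketch of the idea, not the code from this commit. It assumes avg_frame_low_motion is a 0..100 score where a higher value means less motion. The helper name golden_exit_thresh and the cutoffs 60 and 40 are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: scale the golden early-exit threshold by motion level.
 * avg_frame_low_motion is assumed to be a 0..100 score (higher = less motion).
 * The cutoffs 60 and 40 are illustrative, not the values used in libvpx. */
static int64_t golden_exit_thresh(int64_t base_thresh,
                                  int avg_frame_low_motion) {
  if (avg_frame_low_motion >= 60) return base_thresh;      /* low motion: unchanged */
  if (avg_frame_low_motion >= 40) return base_thresh >> 1; /* medium motion: reduced */
  return 0;                                                /* high motion: exit off */
}
```

With a threshold of 0 no block can pass the early-exit test, which is one way to model "turn off" for high-motion content.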
Marco Paniconi [Wed, 27 Jun 2018 20:04:18 +0000 (13:04 -0700)]
vp9-svc: Set avg_frame_low_motion for lower layers.
The avg_frame_low_motion metric is only computed on the
top spatial layer, and since it's part of the layer context
struct, it needs to be written to all lower spatial layers for
consistency.
Small/minor change in metrics.
Change-Id: I92a001c37aeb332e613212288b13a2ed9745af88
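The propagation described above could look roughly like the sketch below. LayerContext here is a trimmed-down, hypothetical stand-in for the real layer context struct, and propagate_low_motion is an illustrative name. Only avg_frame_low_motion comes from the commit message.

```c
/* Hypothetical, trimmed-down layer context holding only the metric. */
typedef struct {
  int avg_frame_low_motion;
} LayerContext;

/* Copy the metric computed on the top spatial layer to every lower layer,
 * so all layer contexts agree. layers[] is indexed by spatial layer id. */
static void propagate_low_motion(LayerContext *layers,
                                 int num_spatial_layers) {
  const int top = num_spatial_layers - 1;
  for (int sl = 0; sl < top; ++sl)
    layers[sl].avg_frame_low_motion = layers[top].avg_frame_low_motion;
}
```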
Marco Paniconi [Wed, 27 Jun 2018 19:26:09 +0000 (12:26 -0700)]
vp9-svc: Fix to early golden exit nonrd-pickmode
For SVC: apply the sse_zeromv early exit also to
the case where golden is second temporal reference.
Set the thresh_svc_golden threshold for this case.
This reduces the encode time for the case where golden
is the second temporal reference for SVC.
Change-Id: I8c0c87dd746579d3c4f5e983c7f9dd0a1e1476e0
Marco Paniconi [Mon, 25 Jun 2018 04:44:29 +0000 (21:44 -0700)]
vp9: Add lower Q limit to cyclic refresh usage.
Disable the cyclic refresh for very low average Q.
This reduces encoded bitrate for static slides after the
quality has ramped up well enough (low Q). And as the
cyclic refresh is not needed at low Q in most cases, this
has minimal/no effect on quality on RTC set.
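A minimal sketch of such a gate, assuming the decision is a simple comparison of the average quantizer index against a floor. MIN_AVG_Q_FOR_REFRESH and use_cyclic_refresh are hypothetical names, and the cutoff value is illustrative rather than the one chosen in libvpx.

```c
#include <stdbool.h>

/* Hypothetical floor on the average quantizer index; not the libvpx value. */
#define MIN_AVG_Q_FOR_REFRESH 20

/* Cyclic refresh is skipped once the average Q is already very low,
 * since quality has ramped up and extra refresh bits are mostly wasted. */
static bool use_cyclic_refresh(int avg_q) {
  return avg_q >= MIN_AVG_Q_FOR_REFRESH;
}
```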
Marco Paniconi [Mon, 25 Jun 2018 05:00:58 +0000 (22:00 -0700)]
vp9: Fixes for lossless mode for real-time mode.
Fixes to nonrd coding mode for lossless mode: keep
skip_txfm to 0 (no skip) and disable the encoder breakout.
This makes the encoding lossless when that mode is selected
for real-time (nonrd pickmode).
Also disable the cyclic refresh for lossless mode.
Zoe Liu [Fri, 22 Jun 2018 02:26:32 +0000 (19:26 -0700)]
Add reference frame update flags for hierarchical
Previous CLs have implemented the construction of the hierarchical
structure at the encoder side. This CL defines and configures the
corresponding flags that guide the reference frame update according to
the constructed hierarchical structure.
Zoe Liu [Thu, 21 Jun 2018 23:28:15 +0000 (16:28 -0700)]
Add extra altref option for hierarchical structure
This CL hooks up the implemented hierarchical structure
construction, together with its corresponding bitrate allocation
functionality, to the definition of a GF group.
Currently the hierarchical structure is off by default. Hence this CL
has no impact on coding performance.
Fix mingw builds for x86_32 by updating past:
https://chromium.googlesource.com/libyuv/libyuv/+/8fa02df3c0591754958a50
Pick up upstream fixes for clang 5 builds with --disable-optimizations.
Disable libyuv by default when building for msa. We have not been able
to update libyuv because of build issues with mips. This can be
revisited when we update the mips compiler used in Jenkins.
Zoe Liu [Wed, 20 Jun 2018 01:11:08 +0000 (18:11 -0700)]
Add bit allocation for hierarchical layer
This CL migrates the bit allocation scheme from libaom and combines the
scheme for hierarchical layer with the updated scheme in libvpx that
uses a modified scheme to calculate the target bitrate per frame.
Johann [Wed, 20 Jun 2018 20:10:54 +0000 (13:10 -0700)]
libyuv: remove problematic functions
These fail to build with clang on 32 bit with
--disable-optimizations
Upstream libyuv has addressed these and we will get updated
versions on the next roll. At the moment, we don't use
libyuv for copying alpha data and so this is a quick fix.
Marco Paniconi [Tue, 12 Jun 2018 18:50:29 +0000 (11:50 -0700)]
vp9-svc: Add support for spatial layer sync frames.
Add encoder control to allow application to insert
spatial layer sync frame. The sync frame disables
temporal prediction for that spatial layer.
This is useful for RTC application to have receiver
start decoding a higher spatial layer, without inserting
a key frame on base spatial layer.
If the layer sync is requested on the base spatial layer
this then forces a key frame, otherwise it only disables
the temporal reference for that spatial layer, allowing
temporal prediction to continue for the other layers.
Although the temporal prediction is disabled and reset
on a layer sync frame, the inter-layer prediction for the
sync frame is enabled on INTER frames. So the meaning of
INTER_LAYER_PRED_OFF_NONKEY is modified to mean disable
inter-layer prediction on non-key and non-sync frames.
Added unittest for inserting layer sync frames.
Bump up ABI version.
Change-Id: Id458acc400a77c853551f125c4e7b6d001991f03
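The per-layer rules in this message can be summarised in a small decision function. This is a self-contained sketch of the described behaviour, not the encoder's implementation. LayerDecision, decide_layer, and the boolean parameters are hypothetical. Only INTER_LAYER_PRED_OFF_NONKEY is a name taken from the text.

```c
#include <stdbool.h>

typedef struct {
  bool force_key_frame;      /* sync on the base layer forces a key frame */
  bool use_temporal_ref;     /* layer may predict from earlier frames */
  bool use_inter_layer_pred; /* layer may predict from the layer below */
} LayerDecision;

/* pred_off_nonkey models the INTER_LAYER_PRED_OFF_NONKEY setting: with it,
 * inter-layer prediction stays on only for key frames and sync frames. */
static LayerDecision decide_layer(int spatial_layer, bool sync_requested,
                                  bool is_key_frame, bool pred_off_nonkey) {
  LayerDecision d;
  d.force_key_frame = sync_requested && spatial_layer == 0;
  const bool key = is_key_frame || d.force_key_frame;
  /* A sync frame drops temporal prediction for this layer only. */
  d.use_temporal_ref = !key && !sync_requested;
  /* The base layer has no lower layer to predict from. */
  d.use_inter_layer_pred =
      spatial_layer > 0 && (!pred_off_nonkey || key || sync_requested);
  return d;
}
```

Calling decide_layer once per spatial layer shows the key property: a sync request on layer 1 leaves the temporal references of layer 0 untouched.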
Jingning Han [Wed, 30 May 2018 20:31:08 +0000 (13:31 -0700)]
Refactor partition mode cost calculation
Compute the coding block partition mode cost as additional rdcost
to the cumulative rate-distortion cost from each coding block. This
changes the coding performance slightly due to the rounding error.
The compression performance change is neutral.
Hui Su [Tue, 12 Jun 2018 18:56:09 +0000 (11:56 -0700)]
Improve the partition search breakout speed feature
Use a linear model to make partition search breakout decisions.
Currently the model is tuned for large quantizers and small resolutions.
So it is only used when q-index is larger than 200 and frame
width/height is smaller than 720. Also it's not yet supported for high
bit depth.
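A rough sketch of the gating conditions and of a linear decision rule. It reads "width/height smaller than 720" as the smaller frame dimension being below 720, which is an assumption. The function names, feature vector, weights, and bias are placeholders, not the tuned model from this commit.

```c
#include <stdbool.h>

/* Gate for the model-based breakout, following the conditions in the text.
 * Assumption: the size check applies to min(width, height). */
static bool model_breakout_allowed(int q_index, int width, int height,
                                   bool high_bitdepth) {
  const int min_dim = width < height ? width : height;
  return !high_bitdepth && q_index > 200 && min_dim < 720;
}

/* Hypothetical linear model: break out of the partition search when the
 * weighted sum of the features plus a bias is positive. */
static bool linear_breakout(const float *features, const float *weights,
                            int n, float bias) {
  float score = bias;
  for (int i = 0; i < n; ++i) score += features[i] * weights[i];
  return score > 0.0f;
}
```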
Tested speed 1 and 2 on lowres and midres. Compression performance is
neutral. At low bitrates, encoding speedup is up to 50% for speed 1;
up to 30% for speed 2.
Some sample numbers:
Luc Trudeau [Wed, 13 Jun 2018 19:24:54 +0000 (15:24 -0400)]
[VSX] Optimize PROCESS16 macro
The PROCESS16 macro now uses 8-bit lanes instead of 16-bit lanes.
SADTest Speed Test (POWER8 Model 2.1)
16x8 Old VSX time = 16.7 ms, new VSX time = 9.1 ms [1.8x]
16x16 Old VSX time = 15.7 ms, new VSX time = 7.9 ms [2.0x]
16x32 Old VSX time = 14.4 ms, new VSX time = 7.2 ms [2.0x]
32x16 Old VSX time = 14.0 ms, new VSX time = 7.4 ms [1.9x]
32x32 Old VSX time = 13.4 ms, new VSX time = 6.5 ms [2.0x]
32x64 Old VSX time = 12.7 ms, new VSX time = 6.3 ms [2.0x]
64x32 Old VSX time = 12.6 ms, new VSX time = 6.3 ms [2.0x]
64x64 Old VSX time = 12.7 ms, new VSX time = 6.2 ms [2.0x]
Zoe Liu [Thu, 14 Jun 2018 00:33:57 +0000 (17:33 -0700)]
Unify frame_index in defining GF group structure
The following changes are made to the GF group structure definition in firstpass:
1. Remove the redundant alt_frame_index;
2. Replace hard-coded index values with the frame_index variable.
Luc Trudeau [Wed, 13 Jun 2018 17:39:04 +0000 (13:39 -0400)]
VSX Version of SAD8xN
VSX versions of the SAD functions of width 8.
SADTest Speed Test (POWER8 Model 2.1)
8x4 C time = 68.7 ms (±0.3 ms), VSX time = 31.8 ms (±0.1 ms) [2.2x]
8x8 C time = 55.6 ms (±0.3 ms), VSX time = 18.3 ms (±0.1 ms) [3.0x]
8x16 C time = 46.5 ms (±0.1 ms), VSX time = 15.6 ms (±0.1 ms) [3.0x]
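For reference, the operation being vectorised is a sum of absolute differences over a block 8 pixels wide. The scalar sketch below shows that computation. It is an illustrative baseline with a hypothetical name (sad8xh), not the libvpx C reference or the VSX code.

```c
#include <stdint.h>
#include <stdlib.h>

/* Scalar SAD over an 8-wide block of the given height.
 * Strides are in bytes and may exceed the block width. */
static unsigned int sad8xh(const uint8_t *src, int src_stride,
                           const uint8_t *ref, int ref_stride, int height) {
  unsigned int sad = 0;
  for (int y = 0; y < height; ++y) {
    for (int x = 0; x < 8; ++x) sad += (unsigned int)abs(src[x] - ref[x]);
    src += src_stride;
    ref += ref_stride;
  }
  return sad;
}
```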
Luc Trudeau [Wed, 13 Jun 2018 17:36:17 +0000 (13:36 -0400)]
Add Speed Tests for the SADTest test suite.
Speed tests are added for the SADTest test suite. These tests use the
AbstractBench and print the median run time of SAD operations. Speed
tests are disabled by default.
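Reporting the median rather than the mean makes a benchmark less sensitive to occasional slow runs. The sketch below shows one way to take the median of a set of run times. It does not reproduce AbstractBench, and the names cmp_double and median_ms are hypothetical.

```c
#include <stdlib.h>

/* qsort comparator for doubles, ascending. */
static int cmp_double(const void *a, const void *b) {
  const double x = *(const double *)a;
  const double y = *(const double *)b;
  return (x > y) - (x < y);
}

/* Sorts times_ms in place and returns the median; n must be greater than 0.
 * For an even count the two middle values are averaged. */
static double median_ms(double *times_ms, size_t n) {
  qsort(times_ms, n, sizeof(*times_ms), cmp_double);
  return (n % 2) ? times_ms[n / 2]
                 : 0.5 * (times_ms[n / 2 - 1] + times_ms[n / 2]);
}
```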