Jerome Jiang [Mon, 13 Aug 2018 18:01:31 +0000 (11:01 -0700)]
vp9: fix memory alloc for adaptive_rd_thresh_row_mt.
When the feature is enabled and the memory has not been allocated,
allocate it. There was a case where the speed feature changed in the
middle of the stream but the number of tiles stayed the same, so the
memory was not re-allocated. Another case is where the speed for the
base layer is different from that of the higher quality layers (same
resolution). Removed the speed constraints forcing the base layer to
use the same speed setting.
Thus the memory for adaptive_rd_thresh_row_mt stayed NULL but the
feature was enabled.
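The allocate-on-demand check described above could look roughly like the following. This is a minimal sketch, not the libvpx code: TileSketch, its field name, and ensure_row_mt_thresh() are hypothetical names introduced for illustration only.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for per-tile encoder state; not the libvpx struct. */
typedef struct {
  int *row_thresh; /* NULL until the buffer has been allocated */
} TileSketch;

/* Allocates the per-row threshold buffer only when the feature is enabled
 * and the buffer is still missing. An existing buffer is left untouched.
 * Returns 0 on success (or nothing to do), -1 if the allocation fails. */
static int ensure_row_mt_thresh(TileSketch *tile, int feature_enabled,
                                size_t num_entries) {
  assert(tile != NULL);
  if (!feature_enabled || tile->row_thresh != NULL) return 0;
  tile->row_thresh = (int *)calloc(num_entries, sizeof(*tile->row_thresh));
  return tile->row_thresh != NULL ? 0 : -1;
}
```

Keying the check on the pointer itself, rather than on a change in tile count or speed, covers both cases from the message, since either one leaves the pointer NULL while the feature is on.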
Marco Paniconi [Sat, 11 Aug 2018 19:59:40 +0000 (12:59 -0700)]
vp9-svc: Fix to updated SET_SVC_REF_FRAME_CONFIG control
Add flag to separate two cases of bypass (flexible) SVC mode:
using the SET_SVC_REF_FRAME_CONFIG control vs. passing in the
frame_flags in the vpx_encode (only used for temporal layers).
This fixes failures in the Datarate Temporal layer test,
introduced in commit: a66da31
Marco Paniconi [Wed, 8 Aug 2018 21:01:26 +0000 (14:01 -0700)]
vp9: Allow for overshoot detection for non-screen CBR mode.
For CBR real-time mode: refactor usage of speed feature to
handle overshoot on slide/scene change. Add 2 modes to indicate
how slide/scene change is processed for re-setting Q/rate control.
Keep the speed setting to 1 for speed >= 5, otherwise set to 0.
Video content and screen content are now handled in a similar way,
though with different thresholds.
Some fixes to thresholds and reset: correct the reset of the buffer
level to the optimal level for each temporal layer, if the scene
change frame will be encoded at max_q.
Also increase the min_thresh for video mode (non-screen content):
this is to avoid scene change detection in cases like large
lighting changes or camera focus changes. The increase in min_thresh
also makes it more robust to a sudden increase in noise level.
Marco Paniconi [Thu, 9 Aug 2018 16:34:05 +0000 (09:34 -0700)]
vp9-svc: Fix for scene detection for SVC
For spatial layers: use the correct mi_cols/rows in the
scene detection. The scene detection for spatial layers
is only called once per superframe, but we were using the wrong
mi_cols/rows (those for the base spatial layer were being used).
Also increase frame_since_key threshold to account for spatial
layers.
James Zern [Wed, 8 Aug 2018 03:07:09 +0000 (20:07 -0700)]
loop_filter_rows_mt: use sb_rows to limit workers
Previously, if the number of tiles decreased within a clip and there
were fewer super block rows than workers, the mi_row calculation would
cause rows to be skipped. The num_workers stored is the max allocated
amount; use sb_rows to limit the active ones if the row count is
smaller, as additional threads will provide no benefit.
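The worker limit described above amounts to taking the minimum of the allocated worker count and the superblock row count. A minimal sketch follows; the function names are hypothetical, and the shift of 3 assumes 8 mode-info rows per 64x64 superblock.

```c
#include <assert.h>

/* Number of superblock rows needed to cover mi_rows mode-info rows,
 * rounding up. Assumes 8 mode-info rows per superblock (shift of 3). */
static int sb_rows_from_mi_rows(int mi_rows) {
  assert(mi_rows >= 0);
  return (mi_rows + 7) >> 3;
}

/* Caps the number of active loop filter workers at the number of
 * superblock rows; extra workers would have no row to process. */
static int active_lf_workers(int allocated_workers, int mi_rows) {
  const int sb_rows = sb_rows_from_mi_rows(mi_rows);
  return allocated_workers < sb_rows ? allocated_workers : sb_rows;
}
```

For example, with 8 allocated workers and 16 mode-info rows there are only 2 superblock rows, so only 2 workers would be activated.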
Marco Paniconi [Fri, 3 Aug 2018 17:45:41 +0000 (10:45 -0700)]
vp9: Add screen-content mode to overshoot detection.
For real-time 1 pass mode: overshoot detection and max_Q
reset should only be for screen-content mode.
This fixes some failures in the 1 pass VBR tests, from
the commit: 2fae9991
Marco Paniconi [Fri, 3 Aug 2018 16:20:55 +0000 (09:20 -0700)]
vp9: Adjust qp_thresh on slide change overshoot detection
For real-time screen-content mode: increase the
qp_thresh for max_Q setting on slide changes.
This will make bitrate spikes less likely on slide changes.
Marco Paniconi [Thu, 2 Aug 2018 16:22:58 +0000 (09:22 -0700)]
vp9: Disable re_encode_overshoot feature for speed >= 6.
For real-time screen content mode: for speed >= 6 disable
the re_encode_overshoot feature. This means for speed >= 6
the Q and rate control are reset on slide changes based on
the scene/slide detection and the current Q (and not on a
first pass encoded frame at current Q).
This reduces encode time on slide changes, but may be less
accurate in deciding when to reset/max-out the Q.
Hui Su [Wed, 1 Aug 2018 22:43:05 +0000 (15:43 -0700)]
Handle partition cost better in RD search
Take partition cost into consideration during rectangular partition
mode search.
Compression change is neutral. Encoding speed can be a little faster
at low quality settings. With QP=55 at speed 0, average speed up over
15 midres sequences is about 2.7%.
Jingning Han [Tue, 31 Jul 2018 16:43:17 +0000 (09:43 -0700)]
Use mesh full pixel motion search to build the source ARF
Append mesh search to the diamond shape search to refine
the full pixel motion estimation for source ARF generation.
It improves the average compression performance.
Marco Paniconi [Tue, 31 Jul 2018 22:28:58 +0000 (15:28 -0700)]
vp9: Clamp tx_size in model_rd_large
For nonrd_pickmode: add clamp/check to make
sure tx_size is not set lower than 8X8
in the model_rd_large function (which is only
called for big block sizes).
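A clamp of this kind can be sketched as below. The enum and function are local, hypothetical stand-ins that only mirror the naming style of transform sizes; they are not taken from the libvpx headers.

```c
#include <assert.h>

/* Local stand-in for transform sizes, ordered from smallest to largest. */
typedef enum {
  SK_TX_4X4 = 0,
  SK_TX_8X8 = 1,
  SK_TX_16X16 = 2,
  SK_TX_32X32 = 3
} TxSizeSketch;

/* Raises any transform size below 8X8 up to 8X8; larger sizes pass
 * through unchanged. */
static TxSizeSketch clamp_tx_size_min_8x8(TxSizeSketch tx_size) {
  assert(tx_size >= SK_TX_4X4 && tx_size <= SK_TX_32X32);
  return tx_size < SK_TX_8X8 ? SK_TX_8X8 : tx_size;
}
```

With this helper, SK_TX_4X4 maps to SK_TX_8X8 and every other size is returned as-is.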
Marco Paniconi [Mon, 30 Jul 2018 21:32:54 +0000 (14:32 -0700)]
vp9: Add scene change detection flag to cyclic refresh setup
Disable cyclic refresh on slide/scene change frames. It was already
disabled on the re-encode for the slide change, but this change
makes sure it's always disabled on a detected slide change (which
may not be re-encoded at high Q).
Martin Storsjo [Sat, 28 Jul 2018 05:02:27 +0000 (08:02 +0300)]
arm: Consistently use unified syntax for asm
The ".syntax unified" directives in a few source files aren't valid
ADS assembly directives, and they break compilation for Windows,
since ads2armasm_ms.pl doesn't handle them.
Explicitly add them via ads2gas.pl and ads2gas_apple.pl instead,
and tweak one instruction to be valid unified syntax.
Jingning Han [Thu, 26 Jul 2018 23:41:55 +0000 (16:41 -0700)]
Use diamond search to build tpl model and arf frames
Use diamond search for full pixel motion estimation to build
the temporal dependency model and the source arf frame. This gives
better full pixel motion estimation accuracy. It improves the
compression performance.