Marco Paniconi [Thu, 19 Apr 2018 00:15:16 +0000 (17:15 -0700)]
vp9: Rate control fix for CBR mode.
For CBR mode: modify the qp clamping to allow q to respond
faster to overshoot. Can reduce some spurious overshoot events
observed in screen content coding.
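The message does not say how the clamp was changed. As a rough illustration only, one way such a relaxation could look is sketched below: q is normally held inside the range spanned by the two previous frames, and the upper bound is skipped while the encoder is overshooting. The function name, its parameters and the relaxation rule are all assumptions, not the libvpx code.

```c
/* Hypothetical helper, not the libvpx implementation.
 * q_prev1 and q_prev2 are the quantizer indices of the two previous frames;
 * overshoot is nonzero when the last frame exceeded its target size. */
int clamp_q_cbr(int q, int q_prev1, int q_prev2, int overshoot) {
  const int lo = q_prev1 < q_prev2 ? q_prev1 : q_prev2;
  const int hi = q_prev1 < q_prev2 ? q_prev2 : q_prev1;
  if (q < lo) return lo;
  /* Assumed relaxation: skip the upper clamp while overshooting so q can
   * rise immediately instead of being held at the previous maximum. */
  if (q > hi && !overshoot) return hi;
  return q;
}
```

With previous values 40 and 50, a requested q of 60 is held at 50 in the normal case and passed through unchanged when overshoot is set.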
Marco Paniconi [Mon, 16 Apr 2018 23:15:05 +0000 (16:15 -0700)]
vp9: Changes for scene detection overshoot and SVC.
Refactor the scene detection for 1 pass cbr to allow the
scene detection to be checked once per superframe (on the base layer),
using the full resolution sources.
If scene change is detected: check for re-encoding due to
large overshoot for all spatial layers within the superframe.
Add speed feature to control the re-encode step.
Keep the re-encode step on for now.
Small change in nonrd_pickmode to remove the possible skip of golden
reference for SVC, when the high_source_sad is set for the superframe.
Change only affects SVC encoding with screen-content mode enabled.
                      Old       New
Variance 64x64 time:  1145 ms   797 ms
Variance 64x32 time:  1200 ms   831 ms
Variance 32x32 time:  1228 ms   1135 ms
Variance 32x16 time:  1374 ms   1491 ms
Variance 16x16 time:  1688 ms   1571 ms
                      sse2      avx2
Variance 32x64 time:  1645 ms   957 ms
Variance 16x32 time:  2031 ms   1243 ms
Variance 16x8 time:   3071 ms   2275 ms
                      Old       New
Variance 64x64 time:  197 ms    143 ms
Variance 64x32 time:  200 ms    146 ms
Variance 32x64 time:  203 ms    140 ms
Variance 32x32 time:  214 ms    152 ms
Variance 32x16 time:  243 ms    153 ms
Variance 16x32 time:  234 ms    197 ms
Variance 16x16 time:  205 ms    205 ms
Variance 16x8 time:   228 ms    222 ms
Variance 8x16 time:   228 ms    232 ms
Variance 8x8 time:    282 ms    240 ms
Variance 8x4 time:    506 ms    341 ms
Variance 4x8 time:    518 ms    415 ms
Variance 4x4 time:    604 ms    628 ms
Observed vp9 encoder speed up when encoding a 720p video.
Martin Storsjo [Sat, 14 Apr 2018 20:40:46 +0000 (23:40 +0300)]
configure: Test linking pthreads before using it
This avoids enabling pthreads if only pthreads-w32 is available.
pthreads-w32 provides pthread.h but has a link library with a
different name (libpthreadGC2.a).
Generally, always using win32 threads when on windows would be
sensible.
However, libstdc++ can be configured to use pthreads (winpthreads), and
in these cases, standard C++ headers can pollute the namespace with
pthreads declarations, which break the win32 threads headers that
declare similar symbols - leading us to prefer pthreads on windows
whenever available (see d167a1ae and bug 1132).
Marco Paniconi [Fri, 6 Apr 2018 16:17:46 +0000 (09:17 -0700)]
vp9-svc: Hybrid search on spatial layers whose base is key.
For spatial layers whose base is a key frame, i.e., when
svc.layer_context[cpi->svc.temporal_layer_id].is_key_frame = 1,
allow for hybrid search, similar to what we do on key frames.
For small blocks (<= 8x8) rd-based intra search will be used,
otherwise non-rd pick mode is used.
Feature is controlled by nonrd_keyframe, which is set to 1
for now on non-base spatial layers, so this change currently
has no effect.
Small change only when inter-layer prediction is off, as we now
call vp9_pick_intra_mode instead of vp9_pick_inter_mode on key frame.
But this change is very small/insignificant.
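The selection rule described above can be sketched roughly as follows. The enum and function are hypothetical stand-ins, not libvpx types; only the nonrd_keyframe flag name and the 8x8 cutoff come from the message. The sketch assumes that a set nonrd_keyframe disables the hybrid path entirely, which matches the note that the change currently has no effect.

```c
/* Hypothetical search kinds; not the libvpx enum. */
typedef enum { SEARCH_RD_INTRA, SEARCH_NONRD_PICKMODE } search_kind;

/* Choose the intra search for a block on a frame whose base layer is a key
 * frame. width and height are the block dimensions in pixels. */
search_kind choose_key_search(int width, int height, int nonrd_keyframe) {
  /* Assumed: with nonrd_keyframe set, non-rd pick mode is always used. */
  if (nonrd_keyframe) return SEARCH_NONRD_PICKMODE;
  /* Hybrid search: rd-based intra search for blocks up to 8x8. */
  if (width <= 8 && height <= 8) return SEARCH_RD_INTRA;
  return SEARCH_NONRD_PICKMODE;
}
```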
Reason for revert: We need to do this on all key frames in the stream (not just the first one). Will make another cleaner change for this.
Original change's description:
> vp9-svc: Fix to first superframe when inter_layer is off.
>
> When the application selects the setting INTER_LAYER_PRED_OFF
> each spatial stream should be decodeable separately.
> For this we need to force key frames on all spatial layers
> on the first superframe.
>
> In order to maintain the quality at the beginning of the stream
> the active_worst for spatial layer of the second superframe is set
> to the last_QP of the corresponding spatial layer of the first superframe.
> Also make sure nonrd_keyframe is set for non-base spatial layers.
>
> Change only affects SVC mode with number_spatial_layers > 1 and
> svc->disable_inter_layer_pred == INTER_LAYER_PRED_OFF.
> And only affects first and second frame of sequence.
>
> Change-Id: I8ee9a0873ab1d3a02515774571f719617771ad41
Marco Paniconi [Wed, 4 Apr 2018 23:24:39 +0000 (16:24 -0700)]
vp9-svc: Fix to first superframe when inter_layer is off.
When the application selects the setting INTER_LAYER_PRED_OFF
each spatial stream should be decodeable separately.
For this we need to force key frames on all spatial layers
on the first superframe.
In order to maintain the quality at the beginning of the stream
the active_worst for spatial layer of the second superframe is set
to the last_QP of the corresponding spatial layer of the first superframe.
Also make sure nonrd_keyframe is set for non-base spatial layers.
Change only affects SVC mode with number_spatial_layers > 1 and
svc->disable_inter_layer_pred == INTER_LAYER_PRED_OFF.
And only affects first and second frame of sequence.
Marco Paniconi [Thu, 5 Apr 2018 22:54:17 +0000 (15:54 -0700)]
vp9-svc: Fix to disable cyclic refresh on key superframes.
Cyclic refresh is disabled on key frames, but we did not
disable it for spatial layers whose base is a key frame
(i.e., on a key-superframe).
This fix means a somewhat lower frame-level QP will generally be
used for those spatial layers whose base is a key frame,
which will generally mean slightly better quality for the
key-superframes.
Johann [Tue, 27 Mar 2018 17:41:54 +0000 (10:41 -0700)]
ios configure: quiet shell warning
Generating file lists on a non-mac with:
--target=x86-iphonesimulator-gcc --enable-external-build
the lack of xcrun would cause a warning to print:
libvpx/build/make/configure.sh: line 1397: [: : integer expression expected
Marco Paniconi [Tue, 3 Apr 2018 22:50:19 +0000 (15:50 -0700)]
vp9-svc: Fix in choose_partitioning for different scaling.
In the SVC encoder LAST ref frame should be the last temporal
reference at the same resolution. This is the case for the default/fixed
patterns, but may not be the case for arbitrary pattern in flexible mode.
Add check that the LAST reference frame has same resolution as the current frame.
If the reference scale for LAST is different from current treat the current
frame as key frame just for the purpose of superblock partitioning.
This avoids potential segfault in vp9_int_pro_motion_estimation() for different
scaled reference.
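A minimal sketch of the added check, assuming the resolution comparison is a plain width/height match. The function and its parameters are hypothetical and do not mirror the libvpx data structures.

```c
#include <stdbool.h>

/* Hypothetical check: returns true when superblock partitioning should
 * treat the current frame as a key frame, either because it is one or
 * because LAST has a different resolution than the current frame. */
bool treat_as_key_for_partitioning(bool is_key_frame, int cur_width,
                                   int cur_height, int last_width,
                                   int last_height) {
  if (is_key_frame) return true;
  /* A differently scaled LAST must not reach the motion-estimation path. */
  return last_width != cur_width || last_height != cur_height;
}
```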
paulwilkins [Thu, 29 Mar 2018 11:52:15 +0000 (12:52 +0100)]
Add extra case to wq_err_divisor()
Add extra case for 360P and smaller.
This hurts a little in psnr for the derf cif set but helps a little
in terms of average rate accuracy. Most clips come in a little
smaller with this patch.
Marco Paniconi [Wed, 28 Mar 2018 19:09:54 +0000 (12:09 -0700)]
vp9-svc: Modify logic for frame dropping with spatial layers.
In the constrained framedrop mode for svc: modify the buffer check
condition relative to (non-zero) dropmark to include upper spatial layers,
in addition to the current spatial layer.
But keep the single layer check if the buffer goes below zero, since
in this case (buffer underflow) we should force drop of that layer
regardless of upper layers.
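The two checks can be sketched as below. The struct and function are hypothetical. The message does not say how the per-layer results are combined, so the sketch assumes a drop is triggered when any of the current or upper layers is at or below its own non-zero drop mark; requiring all of them to be below would be an equally plausible reading.

```c
#include <stdbool.h>

/* Hypothetical per-layer rate control state. */
typedef struct {
  long buffer_level;
  long drop_mark; /* 0 disables the drop-mark check for this layer */
} layer_state;

/* Decide whether spatial layer cur should be dropped. */
bool should_drop_layer(const layer_state *layers, int num_layers, int cur) {
  /* Buffer underflow: single-layer check, drop regardless of upper layers. */
  if (layers[cur].buffer_level < 0) return true;
  /* Drop-mark check over the current and all upper spatial layers
   * (assumed rule: any layer at or below its mark triggers the drop). */
  for (int i = cur; i < num_layers; ++i) {
    if (layers[i].drop_mark > 0 &&
        layers[i].buffer_level <= layers[i].drop_mark) {
      return true;
    }
  }
  return false;
}
```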
James Zern [Wed, 28 Mar 2018 19:42:27 +0000 (12:42 -0700)]
test: use testing::*tuple instead of std::tr1
googletest imports tuple into testing to allow for compatibility across
C++ versions where tuple may be in std::tr1 or std. Fixes deprecation
warnings under Visual Studio 2017.
Marco Paniconi [Mon, 26 Mar 2018 17:53:38 +0000 (10:53 -0700)]
vp9-svc: Allow for setting frame drop thresholds per layer.
Add encoder control to set the frame drop thresholds per
spatial layer, and add a frame drop mode: 0 = per-layer drop,
and 1 = constrained drop mode (a drop on a given layer forces
drops to all upper layers).
Default is mode 0 (per-layer dropping).
Implementation for mode 1 will come in subsequent change.
If the control is not used, then the spatial layer frame
drop thresholds (water mark) are all equal and set to the value
given by the encoder config (oxcf->drop_frames_water_mark).
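The default behaviour can be sketched as follows. The types, names and layer limit are hypothetical and are not the libvpx control interface; only the two mode values and the fallback to drop_frames_water_mark come from the message.

```c
#define MAX_SPATIAL_LAYERS 5 /* hypothetical limit */

typedef enum { DROP_PER_LAYER = 0, DROP_CONSTRAINED = 1 } drop_mode;

typedef struct {
  drop_mode mode;
  int thresh[MAX_SPATIAL_LAYERS]; /* per-layer water mark */
} drop_config;

/* Defaults when the control is never used: mode 0 and every layer set to
 * the single water mark from the encoder config. */
void init_drop_config(drop_config *cfg, int drop_frames_water_mark) {
  cfg->mode = DROP_PER_LAYER;
  for (int i = 0; i < MAX_SPATIAL_LAYERS; ++i) {
    cfg->thresh[i] = drop_frames_water_mark;
  }
}

/* Apply the control: override the mode and the first num_layers marks. */
void set_drop_config(drop_config *cfg, drop_mode mode, const int *thresh,
                     int num_layers) {
  cfg->mode = mode;
  for (int i = 0; i < num_layers && i < MAX_SPATIAL_LAYERS; ++i) {
    cfg->thresh[i] = thresh[i];
  }
}
```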
Even on x86_64, emms has to be called if the x87 state has
been clobbered - the calling code (either within libvpx or
in a caller outside of libvpx) may be using the x87 instructions,
even though use of them isn't all that common on x86_64.
Martin Storsjö [Fri, 23 Mar 2018 19:49:40 +0000 (19:49 +0000)]
Merge changes from topic "llvm-mingw"
* changes:
configure: Add an arm64-win64-gcc target
test: Check for ARCH_X86_64 in addition to _WIN64
configure: Add an armv7-win32-gcc target
ads2gas: Add a -noelf option
Reason for revert: x87 instruction usage might not be as
clear cut as I would like. At the very least, llvm mingw
builds appear to be having issues with emms.
Original change's description:
> remove fldcw/fstcw from Win64 builds
>
> _MCW_PC (Precision control) is not supported on x64:
> https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/control87-controlfp-control87-2
>
> The x87 FPU is not used on Win64 or ARM so setting the x87 control word
> is not necessary. The SSE/SSE2 and ARM FPUs don't have a precision
> control - the precision is embedded in each instruction - so the need to
> set the control word is also gone.
Johann [Mon, 5 Mar 2018 21:48:35 +0000 (13:48 -0800)]
remove fldcw/fstcw from Win64 builds
_MCW_PC (Precision control) is not supported on x64:
https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/control87-controlfp-control87-2
The x87 FPU is not used on Win64 or ARM so setting the x87 control word
is not necessary. The SSE/SSE2 and ARM FPUs don't have a precision
control - the precision is embedded in each instruction - so the need to
set the control word is also gone.
Martin Storsjo [Wed, 21 Mar 2018 12:15:13 +0000 (14:15 +0200)]
configure: Add an arm64-win64-gcc target
This configuration doesn't require any extra custom settings, since
it only uses neon intrinsics that are handled automatically by the
compiler (no external assembly).
Martin Storsjo [Wed, 21 Mar 2018 12:12:04 +0000 (14:12 +0200)]
configure: Add an armv7-win32-gcc target
This builds for windows on arm, with llvm-mingw. The target triplet
is named -gcc since that's how similar existing targets are named,
even though it technically runs clang (via frontends named
"$CROSS-gcc").
Assemble using $CC -c since there's no standalone assembler
available (except perhaps llvm-mc).
Jerome Jiang [Tue, 20 Mar 2018 22:39:02 +0000 (15:39 -0700)]
vp9 svc frame drop: enable adaptive rd for row mt.
adaptive_rd_threshold_mt is set to 1 when speed >= 7 for SVC.
QVGA in SVC uses speed 5, which sets adaptive_rd_threshold_mt to 0.
If VGA or HD is dropped for the last superframe, the flag is still 0
when the encoder is destroyed, so the memory is never released.
Linfeng Zhang [Thu, 22 Mar 2018 01:00:40 +0000 (18:00 -0700)]
Fix a strict-overflow warning
Compiler -- gcc (Debian 7.3.0-5) 7.3.0
./libvpx/vp9/encoder/vp9_denoiser.c:374:9: assuming signed overflow
does not occur when assuming that (X + c) < X is always false
[-Wstrict-overflow]
for (j = 0; j < xmis; j++) {
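The message does not show the fix that was applied. For background on the warning itself: a comparison of the form (X + c) < X only becomes true through signed overflow, which is undefined in C, so the compiler may assume it is always false. A well-defined way to ask the same question, shown here as a general sketch unrelated to the denoiser code, is to compare against the limits before adding. The function name is hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true when x + c would overflow an int. Unlike (x + c) < x, this
 * never performs the overflowing addition, so its behaviour is defined. */
bool add_would_overflow(int x, int c) {
  if (c > 0) return x > INT_MAX - c;
  if (c < 0) return x < INT_MIN - c;
  return false;
}
```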