Yue Chen [Thu, 18 Feb 2016 20:35:14 +0000 (12:35 -0800)]
Optimizing obmc rd decision by checking the real rd cost
Instead of using model_rd_for_sb() to estimate the cost and make the
decision on bmc/obmc, we use super_block_yrd/uvrd() to calculate and
compare the real rd costs of bmc and obmc.
Average bit-rate reduction(%) of obmc experiment:
derflr/derfhd/hevcmr/hevchd
2.353/TBD/TBD/TBD
Before the optimization, the coding gain was:
1.582/1.109/1.600/1.164
Note: there is still some mysterious bug because that compared to
the previous version, the performance at low bit rate drops a lot.
Yaowu Xu [Mon, 22 Feb 2016 20:58:38 +0000 (12:58 -0800)]
Add shift stage in FASTSSIM computation
This commits adds a shift stage for FASTSSIM computaton when source
bit depth is different from working bit depth, to make sure metric
results are calculated in bit_depth consistent with source.
Yaowu Xu [Mon, 22 Feb 2016 18:22:42 +0000 (10:22 -0800)]
Add shift stage for PSNRHVS computation
This commit adds the ability to shift down the working buffer when
source bit_depth is different than working bit_depth. It does so
by shift down to be consistent with source bit_depth.
Jingning Han [Fri, 19 Feb 2016 19:53:24 +0000 (11:53 -0800)]
Unify motion vector cost system
This commit unifies the motion vector cost buffers for full pixel
and sub-pixel motion search. The new motion vector coding system
provides 0.5% coding gains for 720p and above sequences and 0.2%
for lower resolution sets.
Angie Chiang [Sat, 20 Feb 2016 03:31:38 +0000 (19:31 -0800)]
Fix 12 TAP convolution bug
Priviously, we do 12-tap interpolation even there is no sub pixel,
This could cause a bug becuase decoder doesn't extend border when there
is no sub pixel. In this situation, if we still do interpolation, we
will access the border extension which doesn't exist and cause a
memory error
Yue Chen [Tue, 16 Feb 2016 20:22:11 +0000 (12:22 -0800)]
Fixing a bug in obmc prediction in the rd loop
This bug made the rd loop use one-side obmc (compound of the current
predictor and the predictors of the left mi's, while the above ones
are ignored by mistake) to determine whether to use obmc. This fix
improved the compression performance by ~0.6% on different test sets.
Coding gain (%) of obmc experiment on derflr/derfhd/hevcmr/hevchd:
1.568/TBD/1.628/TBD
Marco [Tue, 16 Feb 2016 16:40:23 +0000 (08:40 -0800)]
vp9-real-time mode: Fix condition for allowing reference masking.
Add frame-level condition for reference masking: under external or
internal dynamic resize, allow for reference masking if none of
the references have been scaled.
Peviously, reference masking was turned off for the stream if dynamic
resize feature was enabled or an external resize event occurred.
reference_masking gives speed up with little/no loss in compression.
For speed 7 on rtc set: encoding time decreases by about 5-7%,
avgPSNR/SSIM goes down ~0.2%.
Jingning Han [Thu, 11 Feb 2016 20:36:49 +0000 (12:36 -0800)]
Fix tsan error in VP9 sub8x8 intra mode search
This commit fixes issue 1141. The issue was triggered in multi-tile
encoding. The change properly saves and restores the block context
information in the real-time mode selection process. It removes
several redundant memcpy operations in sub8x8 intra block mode search.
Geza Lore [Fri, 12 Feb 2016 16:04:35 +0000 (16:04 +0000)]
Add optimized vpx_sum_squares_2d_i16 for vp10.
Using this we can eliminate large numbers of calls to predict intra,
and is also faster than most of the variance functions it replaces.
This is an equivalence transform so coding performance is unaffected.
Encoder speedup is approx 7% when var_tx, super_tx and ext_tx are all
enabled.
Yue Chen [Wed, 27 Jan 2016 22:18:53 +0000 (14:18 -0800)]
Overlapped block motion compensation experiment
In this experiment, an obmc inter prediction mode is enabled for
>= 8X8 inter blocks. When the obmc flag is on, the regular block-
based motion compensation will be refined by using predictors of
the above and left blocks.
Fixed some compatibility issues with vp9_highbitdepth, supertx,
ref_mv, and ext_interp.
Coding gain (%) on derflr/hevcmr/hevchd
OBMC:
1.047/1.022/0.708
OBMC + SUPERTX:
1.652/1.616/1.137
SUPERTX:
0.862/0.779/0.630
Adds a wiener filter based restoration scheme in loop which can
be optionally selected instead of the bilateral filter.
The LMMSE filter generated per frame is a separable symmetric 7
tap filter. Three parameters for each of horizontal and vertical
filters are transmitted in the bitstream. The fourth parameter
is obtained assuming the sum is normalized to 1.
Also integerizes the bilateral filters, along with other
refactoring necessary in order to support the new switchable
restoration type framework.
derflr: -0.75% BDRATE
[A lot of videos still prefer bilateral, however since many frames
now use the simpler separable filter, the decoding speed is
much better].
Further experiments to follow, related to replacing the bilateral.
James Zern [Fri, 12 Feb 2016 02:09:33 +0000 (18:09 -0800)]
vp10_receive_raw_frame: add missing setjmp
allocations done within this function are protected with
vpx_internal_error; adding the setjmp fixes a crash in
vp10_lookahead_push() under low memory conditions.
James Zern [Fri, 12 Feb 2016 02:05:31 +0000 (18:05 -0800)]
vp9_receive_raw_frame: add missing setjmp
allocations done within this function are protected with
vpx_internal_error; adding the setjmp fixes a crash in
vp9_lookahead_push() under low memory conditions.
Yaowu Xu [Fri, 5 Feb 2016 21:03:47 +0000 (13:03 -0800)]
Enable computing PSNRHVS for hbd build
This commit adds computation of PSNRHVS for highbitdepth build, it
also adds tests to make sure the calculation of psnrhvs metric for
10 and 12 bit correct.
Jingning Han [Thu, 11 Feb 2016 17:41:06 +0000 (09:41 -0800)]
Align rate-distortion cost metric for chroma compoments
This commit aligns the rate-distortion metrics for both luma and
chroma components in super transform rate-distortion optimization.
It improves the coding gains due to var-tx and supertx experiments
by 0.2% for high resolution test sets.
Alex Converse [Wed, 10 Feb 2016 22:12:26 +0000 (14:12 -0800)]
Port switch to 9-bit rate cost to vp10.
Brings the following commits to vp10: 269428e Tie the bit cost scale to a define. d13385c Switch to 9-bit rate cost constants built on a 256 probability denominator. ad43a73 Fix a signed overflow in vp9 motion cost. 1c9b091 Fix some interger overflow errors fac947d Restore previous motion search bit-error scale.
Marco [Thu, 11 Feb 2016 00:59:09 +0000 (16:59 -0800)]
vp9-resize: Force reference masking off for external dynamic-resizing.
An issue exists with reference_masking in non-rd pickmode for spatial
scaling. It was kept off for internal dynamic resizing and svc, this
change is to keep it off also for external dynamic resizing.
Update to external resize test, and update TODO to re-enable this
at frame level when references have same scale as source.
Yaowu Xu [Fri, 5 Feb 2016 19:30:12 +0000 (11:30 -0800)]
Enable computing of FastSSIM for HBD build
This commit adds the computation of fastSSIM for highbitdepth build,
it also modifies the hbdmetric test to be more generic and applicable
for fastSSIM.
The 255 used for calculating ssim constants c1 and c2 is not exactly
scaled by 4x and 16x to 1023 and 4095, therefore requries the metric
test to have a thresold more tolerant than 0, currently at 0.03dB.
Yue Chen [Wed, 10 Feb 2016 23:53:08 +0000 (15:53 -0800)]
Adding the config tag for the OBMC experiment
obmc: We add an obmc prediction mode at superblock level.
When it is enabled, predictors of the above and left blocks
are used to refine the regular block-based motion compensation.
Marco [Wed, 10 Feb 2016 23:18:26 +0000 (15:18 -0800)]
vp9 resize_test: Enable resize_allowed in real-time ExternalResize test.
For dynamic resizing (whether the new codec size is determined internally
or externally set by user), we should for now keep rc.resize_allowed enabled.
This prevent the use of referene_masking for real-time mode
(in set_rt_speed_feature()).