- Integrate 5 flip transform types for each 4x4, 8x8, and 16x16
block, for experiment, EXT_TX.
- Encoder speed improves about 12%-15%.
- Update the unit tests for bit-exact result against C.
This commit fixes issue 1141. The issue was triggered in multi-tile
encoding. The change properly saves and restores the block context
information in the real-time mode selection process. It removes
several redundant memcpy operations in sub8x8 intra block mode
search.
Increases number of wedges for smaller block and removes
wedge coding mode for blocks larger than 32x32.
Also adds various other enhancements for subsequent experimentation,
including adding provision for multiple smoothing functions
(though one is used currently), adds a speed feature that decides
the sign for interinter wedges using a fast mechanism, and refactors
wedge representations.
lowres: -2.651% BDRATE
Most of the gain is due to increase in codebook size for 8x8 - 16x16.
Angie Chiang [Mon, 16 May 2016 18:12:18 +0000 (18:12 +0000)]
Merge changes I6aa75c66,Id5f0fade,I368d365e,Ibaf7b00b into nextgenv2
* changes:
Refactor and add flip unit test to vp10_inv_txfm2d_test.cc
Add flip feature to vp10_inv_txfm2d.c
add unit test for highbd flip transform
Refactor vp10_fwd_txfm2d_test.cc
Yi Luo [Fri, 13 May 2016 17:08:13 +0000 (10:08 -0700)]
HBD inverse HT 4x4 SSE4.1 optimization
- Tx_type: DCT_DCT, DCT_ADST, ADST_DCT, ADST_ADST.
- Encoder overall instruction count drops 2.91%.
- Decoder overall instruction count drops 1.01%.
- Add unit test to test bit-exact result against C.
Jingning Han [Thu, 12 May 2016 21:30:23 +0000 (14:30 -0700)]
Refactor the inter predictor for supertx
This commit unifies the inter predictor used in supertx at both
encoder and decoder sides. It removes the redundant decoder
implementations related to border extension.
Jingning Han [Thu, 12 May 2016 19:50:12 +0000 (12:50 -0700)]
Unify the use of inter predictor in encoder and decoder
This commit unifies the inter predictor used in the encoder and
decoder side for super-tx experiment. This resolves an enc/dec
mismatch found in nextgenv2 nightly-run unit test.
Yunqing Wang [Thu, 5 May 2016 23:42:57 +0000 (16:42 -0700)]
Add decoder APIs and unit tests in tile-coding experiment
In the tile-coding experiment,
1. In tile decoder, added 2 set control APIs:
VP10_SET_DECODE_TILE_ROW and VP10_SET_DECODE_TILE_COL. It allowed
users to set the range of decoding at frame level.
2. Added a unit test while tile-coding experiment is on. It tested
both tile encoder and decoder to make sure the encoded frame
can be decoded as a whole frame or as independent tiles.
Jingning Han [Wed, 11 May 2016 22:50:58 +0000 (15:50 -0700)]
Fix highbd masked variance function declaration
Fix the variable type mismatch between highbd_calc_masked_var_t and
the actual function definiton. This clears the related compiler
warnings in highbd with ext-inter experiment.
Weighted single motion search is implemented for obmc predictor.
When NEWMV mode is used, to determine the MV for the current block,
we run weighted motion search to compare the weighted prediction
with (source - weighted prediction using neighbors' MVs), in which
the distortion is the actual prediction error of obmc prediction.
Coding gain: 0.404/0.425/0.366 for lowres/midres/hdres
Speed impact: +14% encoding time
(obmc w/o mv search 13%-> obmc w/ mv search 27%)
Sarah Parker [Thu, 5 May 2016 23:35:27 +0000 (16:35 -0700)]
Move new quant experiment in quant_common.c from nextgen
NEW_QUANT allows bin widths to be modified as a factor of the nominal
quantization step size. This adds functions to get dequantization
values based on the dequantization offset and 3 knots for a single
quantization profile.
Geza Lore [Tue, 10 May 2016 10:13:10 +0000 (11:13 +0100)]
Break tile row dependencies.
When not using ext-tile, there were still dependencies between tile
rows due to various tools (eg intra predictors) relying on the above
row or above mode info, which can be in the above tile. This is now
broken (the same way as it was when ext-tile is enabled) by fixing
the appropriate predicates.
Geza Lore [Mon, 9 May 2016 15:09:41 +0000 (16:09 +0100)]
Fix interintra predictor buffer overflow.
When constructing the intra predictor for rectangular interintra blocks,
the last row/column of the first square is copied back into the source
image (which is the current reconstructed image buffer) before
predicting the second square. The code used to use the height instead
of width for vertical rectangles, and vice versa for horizontal
rectangles, leading to overwriting the block on the right/below. This
leads to an encode/decode mismatch if the right/below block is in a
different tile and is encoded before the current block, which did happen
with multi-threaded encoding tests. This is now fixed.
Yunqing Wang [Mon, 9 May 2016 21:20:50 +0000 (14:20 -0700)]
Refine VP10 REFRESH_FRAME_CONTEXT_MODE
In VP10, REFRESH_FRAME_CONTEXT_OFF mode is only set when the error
resillient mode is on. Instead of being used to decide how to update
the frame contexts, it is used to decide if or not to reset the
frame contexts.
To verify, ran borg test on lowres set. The result is neutral.
Overall PSNR: -0.006%; SSIM: -0.006%.
Zoe Liu [Mon, 9 May 2016 18:16:28 +0000 (11:16 -0700)]
Turn on the use of upsampled refs for ext-refs
Without this patch, the experiment of ext-refs showed almost no coding
gains compared to the baseline. This is because when ext-refs is on, the
use of upsampled reference is off.
With this patch, the ext-refs experiment works with the upsampled
references and shows coding gains in Overall PSNR as follows, with ~5%
slow down for encoding time:
Note that the previous patch a912c6ec314d816767a4c3eb4e5e1bddcc4c1186
that "Make LAST_FRAME always point to the newly coded frame in ext-refs"
made ext-refs work with the upsampled refereces.
Yi Luo [Mon, 9 May 2016 17:36:40 +0000 (10:36 -0700)]
HBD hybrid transform 16x16 SSE4.1 optimization
- Tx_type: DCT_DCT, DCT_ADST, ADST_DCT, ADST_ADST.
- Update vp10_fht16x16_test.cc to do bit-exact test against
latest C version.
- HBD encoder speed improves ~1.8%.
Zoe Liu [Fri, 6 May 2016 00:32:51 +0000 (17:32 -0700)]
Make LAST_FRAME always point to the newly coded frame in ext-refs
This patch changes the encoder only for the ext-refs experiment. For
each newly coded frame to refresh the LAST_FRAME, the decoder is
notified that the LAST4_FRAME is to be refreshed, and read out the
updated reference frame buffer vitural indexes for the next coded
frame in a way that:
LAST4_FRAME => LAST_FRAME,
LAST_FRAME => LAST2_FRAME,
LAST2_FRAME => LAST3_FRAME, and
LAST3_FRAME => LAST4_FRAME.
Compared against the original ext-refs experiment in TOT, a small gain
is achieved in overall PSNR:
lowres Avg: -0.154
lowres BDRate: -0.044
Yi Luo [Thu, 5 May 2016 23:32:12 +0000 (16:32 -0700)]
Normalize naming/testing convention in vp10_fht8x8_test.cc
Use clear and correct type/function names.
Add ASM_REGISTER_STATE_CHECK wrapper for SSE4.1 function.
Conform macro EXPECT_EQ(expected, actual) convention.