Jingning Han [Fri, 29 May 2015 06:16:35 +0000 (23:16 -0700)]
Allow vpxdec to produce target tile reconstruction
This commit allows the vpxdec to produce the target tile
reconstruction as its output. When the provided tile index is -1,
the entire corresponding row/column tiles will be reconstructed.
If the tile index is over the upper limit, the decoder will decode
left-most/bottom tile.
Jingning Han [Wed, 27 May 2015 17:48:26 +0000 (10:48 -0700)]
Enable selective key frame tile decoding
This commit allows the decoder to decode selective tiles according
to decoder configuration settings. To decode a single tile out of
the provided key frame bit-stream (test_kf.webm), set compiler
configuration:
Julia Robson [Thu, 21 May 2015 16:53:14 +0000 (17:53 +0100)]
Intrabc high bit depth functionality
Fixing compile errors when intrabc and vp9_highbitdepth are both enabled.
Error arose because intrabc uses vp9_setup_scale_factors_for_frame()
which takes an additional argument (use_highbitdepth) when
vp9_highbitdepth is enabled.
With this patch, the ZEROMV mode is overloaded to represent
a single global dominant motion using one of three models:
1. True zero translation motion (as before)
2. A translation motion different from 0
3. A Rotation-zoom affine model where the predictor is warped
The actual model used is indicated at the frame level for
each reference frame.
A metric that computes the ratio of the error with a global
non-zero model to the error for zero motion, is used to
determine on the encoder side whether to use one of the two
non-zero models or not.
Jingning Han [Tue, 26 May 2015 19:29:49 +0000 (12:29 -0700)]
Make the tile coding syntax support large scale tile decoding
This commit makes the bit-stream syntax support fast selective tile
decoding in a large scale tile array. It reduces the computational
complexity of computing the target tile offset in the bit-stream
from quadratic to linear scale, while maintaining relatively small
stack space requirement (in the order of 1024 bytes instead of 1M
bytes). The overhead cost due to tile separation remains identical.
Zoe Liu [Thu, 7 May 2015 09:03:39 +0000 (02:03 -0700)]
Refined the mv ref candidate selection
Now this is an on-going work on the re-work on the motion vector
references for all Inter coding modes. Currently implementation on
sub8x8 and BLOCK_8X8 is done. More work will be added along the way.
Essetial ideas include:
(1) Added new nearestmv algorithm through adaptive median search, out of
the four nearest neighbors: TOP, LEFT, TOPLEFT, and TOPRIGHT;
(2) Added new sheme for sub8x8 to obtain the mv ref candidates,
specially, mv ref candidates are obtained depending on the sub8x8 mode
of the current block as well as the sub8x8 mode of the neighboring
block;
(3) Added top right corner mv ref candidate whenever it is available;
The adding of the top right mv ref has showed potential in helping such
video clips as bridge_far_cif.
Julia Robson [Thu, 21 May 2015 09:45:15 +0000 (10:45 +0100)]
Loop_postfilter high bit depth functionality
Adding a high bit depth version of the function loop_bilateral_filter()
and ensuring the appropriate version of this function is called when
vp9_highbitdepth is enabled.
If a non-skipped block has all transform blocks with only 0 data, then
decoder infers skip flag. This affects the loopfilter. No real encoder
would do this though, so it is pointless. Also, it causes headaches in
HW implmentations as the loop filter cannot proceed until all TX blocks
in the block have been checked. There could be up to 768 of them in
64x64 4:4:4 with 4x4 transform.
Jingning Han [Thu, 21 May 2015 19:44:29 +0000 (12:44 -0700)]
Enable arbitrary tile size support
This commit allows the encoder to process tile coding per 64x64
block. The supported upper limit of tile resolution is the minimum
of frame size and 4096 in each dimension. To turn on, set
--experiment --row-tile
and compile.
It overwrite the old --tile-columns and --tile-rows configurations.
These two parameters now tell the encoder the width and height of
tile in the unit of 64x64 block. For example,
--tile-columns=1 --tile-rows=1
will make a tile contains a single 64x64 block.
Jingning Han [Wed, 20 May 2015 17:15:25 +0000 (10:15 -0700)]
Refactor internal tile_info variables to support 1024 tiles
Move the 2D tile info arrays as global variables. This resolves
the local function stack overflow issue due to excessively large
tile info variables. This allows the internal operation to support
up to 1024 row and column tiles.
Jingning Han [Wed, 20 May 2015 00:17:28 +0000 (17:17 -0700)]
Support up to 64 row tile coding
This commit allows the codec to use up to row tiles (optionally
in combination with up to 64 column tiles per row tile). The
minimum tile size is set to be 256x256 pixel block.
Jingning Han [Tue, 19 May 2015 22:21:46 +0000 (15:21 -0700)]
Rename variables in tile info decoding
The max and min tile number reference should be used to support
both row and column tiles. This commit renames the previous col
prefix to avoid confusion.
Julia Robson [Mon, 18 May 2015 10:52:44 +0000 (11:52 +0100)]
Fix mismatch in handling of 8x4/4x8 blocks with supertx
Test VP9/EndToEndTestLarge.EndtoEndPSNRTest/1 (422 stream) failed when
supertx enabled. This was because 4x8 and 8x4 blocks were not being
split into 4x4s during tokenization in the encoder. This patch
uses vp9_foreach_transformed_block() to fix this.
Zoe Liu [Sat, 11 Apr 2015 00:08:31 +0000 (17:08 -0700)]
Cleaned mv search code and added a few fixes on the experiments
Besides code cleaning, this patch contains 3 fixes:
(1) Fixed the COMPOUND_MODES for the NEW_NEWMV mode;
(2) Fixed the joint search when the NEAR_FORNEWMV mode (in NEWMVREF)
is being evaluated;
(3) Fixed the WEDGE_PARTITION when the NEAR_FORNEWMV mode (in NEWMVREF)
is being evaluated.
(4) Adjusted the entropy probability value for NEAR_FORNEW mode.
On derflr turning on all 14 experiments (except for global-motion), the
average gain w.r.t. PSNR is +0.07%:
Maximum on bridge_far_cif: +1.02%
Minimum on hallmonitor_cif: -0.16%
hui su [Tue, 21 Apr 2015 21:10:19 +0000 (14:10 -0700)]
Optimize entropy coding of non-transform tokens
Use separate token probabilities and counters for non-transform
blocks (pixel domain) . Initial probabilities are trained with screen_content
clips. On screen_content, it improves coding performance by about
2% (from +16.4% to +18.45%).
The initial probabilities are not optimized for natural videos. So it should
not be used for natural videos. Set FOR_SCREEN_CONTENT as 0/1 to specify
whether or not to enable this patch.
Implements a first version of global motion where the
existing ZEROMV mode is converted to a translation only
global motion mode.
A lot of the code for supporting a rotation-zoom affine
model is also incorporated.
WIP.
Peter de Rivaz [Thu, 11 Dec 2014 15:54:23 +0000 (15:54 +0000)]
Corrected optimization of 8x8 DCT code
The 8x8 DCT uses a fast version whenever possible.
There was a mistake in the checking code which
meant sometimes the fast version was used when it
was not safe to do so.
Peter de Rivaz [Mon, 10 Nov 2014 16:17:49 +0000 (16:17 +0000)]
Fixed idct16x16_10 highbitdepth transform
In the case when there are only non-zero coefficients
in the first 4x4 block a special routine is called.
The highbitdepth optimized version of this routine
examined the wrong positions when deciding whether
to call an assembler or C inverse transform.
Peter de Rivaz [Fri, 24 Oct 2014 07:37:39 +0000 (08:37 +0100)]
Refactored idct routines and headers
This change is made in preparation for a
subsequent patch which adds acceleration
for the highbitdepth transform functions.
The highbitdepth transform functions attempt
to use 16/32bit sse instructions where possible,
but fallback to using the C implementations if
potential overflow is detected. For this reason
the dct routines are made global so they can be
called from the acceleration functions in the
subsequent patch.
Zoe Liu [Thu, 30 Apr 2015 00:47:45 +0000 (17:47 -0700)]
Changed nearmv for one of the sub8x8 partitions
It is a minor change, but the essential idea is to use the mv of the
top right block as the nearmv for the bottom left partition in the
sub8x8 block. The change is under the experiment of NEWMVREF.
When all 13 experiments are on (except for INTRABC), the gain is +0.05%:
Worse on bowing_cif: -0.17%
Best on foreman_cif: +0.42%; and bridge_far_cif: +0.40%
The total 13 experiments achieved a gain of +6.97% against base.
Alex Converse [Thu, 30 Apr 2015 19:52:36 +0000 (12:52 -0700)]
tx_skip: Avoid undefined shift behavior.
vp9_quantize_rect did illegal shifts but didn't use the results.
The shift |a << b| is unfortunately undefined if |a < 0|, but the
more verbose |a * (1 << b)| generates the same machine code.
James Zern [Sat, 14 Mar 2015 01:49:03 +0000 (18:49 -0700)]
usage.dox: fix doxygen warnings in 1.8.x
use \li to denote list items with \if.
fixes the following likely visible in <1.8.3:
usage.dox: warning: Invalid list item found
usage.dox: warning: End of list marker found without any preceding list items
* Uses double for RD cost computation to guard against overflow
for large resolution frames.
* Use previous frame's filter level to code the level better.
* Change precision of the filter parameters.
* Allow spatial variance for x and y to be different
Change-Id: I1669f65eb0ab1e8519962954c92d59e04f1277b7
derflr: +0.556% (a little up from before)