Ronald S. Bultje [Sat, 17 Nov 2012 06:26:12 +0000 (22:26 -0800)]
Remove special-case inline detokenization in b_pred reconstruction.
Just like for all other block modes, b_pred tokens can be read together
before starting macroblock reconstruction. This removes special cases
for b_pred in decode_macroblock() and allows decode_coefs_4x4() to be
made static in detokenize.c.
While at it, remove the redundant handling and checking of plane_type
and block_index (i) in decode_coefs_4x4(). Since the function is static,
and is called only from decode_mb_tokens_4x4(), we don't need to worry
that the arguments ever go out of sync.
Paul Wilkins [Fri, 16 Nov 2012 16:31:32 +0000 (16:31 +0000)]
Further experimentation with the mode context
Experiments with a larger set of contexts and some
clean up to replace magic numbers regarding the
number of contexts.
The starting values and rate of backwards adaptation
are still suspect and based on a small set of tests.
Added forwards adjustment of probabilities.
The net result of adding the new context and forward
update is small compared to the old context from the
legacy find_near function. (down a little on derf but
up by a similar amount for HD)
HOWEVER.... with the new context and forward update
the impact of disabling the reverse update (which may be
necessary in some use cases to facilitate parallel decoding)
is hugely reduced.
For the old context without forward update, the impact of
turning off reverse update (Experiment was with SB off) was
Derf -0.9, Yt -1.89, ythd -2.75 and sthd -8.35. The impact was
mainly at low data rates.
With the new context and forward update enabled the impact
for all the test sets was no more than 0.5-1% (again most at
the low end).
Yaowu Xu [Fri, 16 Nov 2012 14:31:53 +0000 (06:31 -0800)]
changed mv candidate search for superblocks
added additional motion vectors in the close neighborhood of a superblock
to the list of candidate motion vectors, and removed a couple that are
further away.
The change helped the std-hd set by about 0.8% (all metrics), with a smaller
gain for the derf set.
Deb Mukherjee [Wed, 7 Nov 2012 14:50:25 +0000 (06:50 -0800)]
Compound inter-intra experiment
A patch on compound inter-intra prediction.
In compound inter-intra prediction, a new predictor for
16x16 inter coded MBs is obtained by combining a single
inter predictor with a 16x16 intra predictor, in a manner
that the weight varies with distance from the top/left
boundary. The current search strategy is to combine the best
inter mode with the best intra mode obtained independently.
Results so far:
derf +0.31%
yt +0.32%
std-hd +0.35%
hd +0.42%
It is conceivable that the results would improve somewhat
with a more thorough search strategy where all intra modes
are searched given the best mv, or even a joint search for
the best mv and the best intra mode.
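The distance-weighted combination described above can be sketched as follows. This is only an illustration: the linear weight ramp, the 6-bit weight precision, and the names intra_weight() and blend_inter_intra() are assumptions made for this example, not the weights or functions used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLK 16  /* 16x16 macroblock */

/* Hypothetical intra weight out of 64. It is largest next to the top/left
 * boundary (where intra prediction tends to be most reliable) and shrinks
 * linearly with the distance to the nearer of the two boundaries. */
static int intra_weight(int row, int col) {
  int d = row < col ? row : col;
  assert(row >= 0 && row < BLK && col >= 0 && col < BLK);
  return 48 - 2 * d;  /* 48 at the boundary, 18 at distance 15 */
}

/* Blend one 16x16 inter predictor with one 16x16 intra predictor.
 * All three buffers hold BLK * BLK samples in raster order. */
static void blend_inter_intra(const uint8_t *inter, const uint8_t *intra,
                              uint8_t *dst) {
  int r, c;
  assert(inter != NULL && intra != NULL && dst != NULL);
  for (r = 0; r < BLK; ++r) {
    for (c = 0; c < BLK; ++c) {
      const int w = intra_weight(r, c);
      const int i = r * BLK + c;
      /* Weighted average with rounding; the two weights sum to 64. */
      dst[i] = (uint8_t)((w * intra[i] + (64 - w) * inter[i] + 32) >> 6);
    }
  }
}
```

With this ramp the intra predictor dominates near the top/left edge and the inter predictor dominates toward the bottom-right corner.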
Yaowu Xu [Thu, 15 Nov 2012 17:55:36 +0000 (09:55 -0800)]
changed asm obj output filenames in MSVC build
this commit changes asm file compilation in the MSVC build to use an
individually customized build command line for each input file, with the
object filename specified explicitly. This allows object filenames to be
prefixed with the path name, and avoids name collisions at link time
John Koleszar [Wed, 31 Oct 2012 20:13:19 +0000 (13:13 -0700)]
make: flatten object file directories
Rather than building an object file directory hierarchy matching the
source tree's layout, rename the object files so that the object
file name contains the path in the source file tree. The intent here
is to allow two files in different parts of the source tree to have
the same name and still not collide when put into an ar archive.
John Koleszar [Wed, 14 Nov 2012 17:51:23 +0000 (09:51 -0800)]
detokenize: use SEG_LVL_EOB feature consistently
Update decode_coefs() to break when c >= eob, since it's possible that
c starts the loop from 1 and eob is 0. The loop won't terminate in that
case.
Add new get_eob() function to consistently clamp the eob based on the
segment level EOB and the block size. It's possible to code a segment
level EOB that's greater than the block size, and that leads to an
out of bounds access.
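A minimal sketch of the two fixes described above, assuming a plain coefficient count per block. The names clamp_eob() and count_visited() are hypothetical and do not reproduce the actual get_eob() or decode_coefs() signatures.

```c
#include <assert.h>

/* Hypothetical stand-in for the clamping step: limit a segment-level EOB
 * to the number of coefficients the block can actually hold. */
static int clamp_eob(int seg_eob, int block_coefs) {
  assert(block_coefs > 0);
  if (seg_eob < 0) return 0;
  return seg_eob > block_coefs ? block_coefs : seg_eob;
}

/* Count how many coefficient positions a decode loop would visit when it
 * starts at first_coef (0 or 1) and stops at eob. */
static int count_visited(int first_coef, int eob) {
  int c;
  int visited = 0;
  for (c = first_coef;; ++c) {
    /* An equality test (c == eob) would never fire for c = 1, eob = 0,
     * so the loop would not stop at the end of block. */
    if (c >= eob) break;
    ++visited;
  }
  return visited;
}
```

For a 4x4 block, clamp_eob(300, 16) yields 16, and count_visited(1, 0) returns 0 instead of running past the block.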
Ronald S. Bultje [Tue, 13 Nov 2012 20:09:02 +0000 (12:09 -0800)]
Don't use hybrid transform (ADST) for superblocks.
This is in line with other cases where we disable ADST if prediction
size and transform size don't match. Before this patch, the RD loop
would use ADST for superblocks, but frame encoding/decoding wouldn't.
John Koleszar [Tue, 13 Nov 2012 23:20:40 +0000 (15:20 -0800)]
Don't write recon.yuv by default
CONFIG_DEBUG was turning on some code to dump the reconstructed frame
to a buffer from within the decoder. Move this code to a more specific
debugging define.
Paul Wilkins [Mon, 12 Nov 2012 15:09:25 +0000 (15:09 +0000)]
New inter mode context
This change is a fix / extension of the newbestrefmv
experiment. As such it is presented without IFDEF.
The change creates a new context for coding inter modes
in vp9_find_mv_refs(). This replaces the context that
was previously calculated in vp9_find_near_mvs().
The new context is unoptimized and not necessarily
any better at this stage (results pending), but eliminates
the need for a legacy call to vp9_find_near_mvs().
Based on numbers from Scott, this could help decode
speed by several %.
In a later patch I will add support for forward update of
context (assuming this helps) and refine the context as
necessary.
Deb Mukherjee [Tue, 9 Oct 2012 20:19:15 +0000 (13:19 -0700)]
New b-intra mode where direction is contextual
Preliminary patch on a new 4x4 intra mode B_CONTEXT_PRED where the
dominant direction from the context is used to encode. Various decoder
changes are needed to support decoding of B_CONTEXT_PRED in conjunction
with hybrid transforms since the scan order and tokenization depends on
the actual direction of prediction obtained from the context. Currently
the traditional directional modes are used in conjunction with the
B_CONTEXT_PRED, which also seems to provide the best results.
Packing Altref along with succeeding frame and length encoding frames
The altref frame is packed along with the next P frame, so that
outside of the codec there are now only two types of frames, P and I.
Also, it is now one frame in and one frame out with respect to the
codec. Apart from that, all the frames are length encoded, with the
length of each frame appended to the frame itself. There are
two categories of frames and each of them will look as follows:
- Packed frames (an altref along with the succeeding P frame)
  - altref_frame_data | altref_length | frame_data | length
- Unpacked frames (all frames other than the above)
  - frame_data | length
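Because each length is appended after its frame data, a reader would walk such a buffer from the end. The sketch below shows that idea only. The 4-byte little-endian length field and the name split_last_frame() are assumptions for illustration; the commit text does not state the width or byte order of the length field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LEN_BYTES 4  /* assumed width of the trailing length field */

/* Locate the last length-suffixed frame in buf[0..size).
 * On success returns 0 and stores the payload offset and length.
 * Any bytes before *payload_off belong to an earlier packed frame. */
static int split_last_frame(const uint8_t *buf, size_t size,
                            size_t *payload_off, size_t *payload_len) {
  const uint8_t *p;
  uint32_t len;
  assert(buf != NULL && payload_off != NULL && payload_len != NULL);
  if (size < LEN_BYTES) return -1;
  p = buf + size - LEN_BYTES;
  /* Assumed little-endian byte order. */
  len = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
        ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
  if (len > size - LEN_BYTES) return -1;  /* corrupt or truncated */
  *payload_len = len;
  *payload_off = size - LEN_BYTES - len;
  return 0;
}
```

For an unpacked frame the first call leaves an offset of 0. For a packed frame the offset is non-zero, and a second call on buf[0..offset) would recover the altref data in the same way.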
Also split superblock handling code out of decode_macroblock() into
a new function decode_superblock(), for easier readability.
Derf +0.05%, HD +0.2%, STDHD +0.1%. We can likely get further gains
by allowing mb_skip_coeff to be selected for a subset of the complete SB
or something along those lines, because although this change allows
coding smaller transforms for bigger predictors, it increases the
overhead of coding EOBs to skip the parts where the residual is
near-zero, and thus the overall gain is not as high as we'd expect.
Yunqing Wang [Wed, 7 Nov 2012 00:06:22 +0000 (16:06 -0800)]
Optimize 16x16 dequant and idct
As suggested by Yaowu, simplified 16x16 dequant and idct. In the decoder,
after the detokenization step, we know the number of non-zero dct
coefficients (eobs) in a macroblock. The idct calculation can be skipped or
simplified based on eobs, which improves decoder performance.
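One way to picture the eob-based shortcut is a small dispatch on the eob value. This is a sketch under assumptions: the three paths and their thresholds, and the names idct_path and choose_idct_path(), are hypothetical and not taken from the patch.

```c
#include <assert.h>

/* Hypothetical reconstruction paths for a 16x16 block. */
typedef enum {
  IDCT_SKIP,     /* no coefficients: residual is zero, nothing to add */
  IDCT_DC_ONLY,  /* only the DC coefficient: residual is one constant */
  IDCT_FULL      /* general case: run the full inverse transform */
} idct_path;

/* Pick a path from the end-of-block position of a 16x16 block. */
static idct_path choose_idct_path(int eob) {
  assert(eob >= 0 && eob <= 256);
  if (eob == 0) return IDCT_SKIP;
  if (eob == 1) return IDCT_DC_ONLY;
  return IDCT_FULL;
}
```

The cheap paths matter because many blocks carry few or no coefficients after quantization, so the full 16x16 transform can often be avoided.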
John Koleszar [Wed, 7 Nov 2012 00:59:01 +0000 (16:59 -0800)]
Rough merge of master into experimental
Creates a merge between the master and experimental branches. Fixes a
number of conflicts in the build system to allow *either* VP8 or VP9
to be built. Specifically either:
VP9 still exports its symbols and files as VP8, so that will be
resolved in the next commit.
Unit tests are broken in VP9, but this isn't a new issue. They are
fixed upstream on origin/experimental as of this writing, but rebasing
this merge proved difficult, so will tackle that in a second merge
commit.
James Zern [Wed, 7 Nov 2012 00:58:11 +0000 (16:58 -0800)]
Fix variance (signed integer) overflow
In the variance calculations the difference is summed and later squared.
When the sum exceeds sqrt(2^31), its square no longer fits in a signed
32-bit integer; the value is treated as negative when it is shifted,
which gives incorrect results.
To fix this we force the multiplication to be unsigned.
The alternative fix is to shift sum down by 4 before multiplying.
However that will reduce precision.
For 16x16 blocks the maximum sum is 65280 and sqrt(2^31) is 46340 (and
change).
This change is based on:
1698234 Missed some variance casts
fea3556 Fix variance overflow
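The unsigned multiplication can be sketched as below for the 16x16 case. The function name variance16x16_from_sums() is hypothetical, and the sketch takes the already accumulated sum and sum of squared differences as inputs rather than pixel data.

```c
#include <assert.h>
#include <stdint.h>

/* Variance of a 16x16 block (256 samples) from the sum of differences and
 * the sum of squared differences: sse - sum^2 / 256. */
static uint32_t variance16x16_from_sums(int sum, uint32_t sse) {
  /* |sum| is at most 256 * 255 = 65280 for 8-bit samples. */
  assert(sum >= -65280 && sum <= 65280);
  /* 65280^2 = 4261478400 exceeds INT32_MAX but fits in 32 unsigned bits.
   * Multiplying as unsigned keeps the product well defined; for negative
   * sums the wrapped product still equals sum^2 because sum^2 < 2^32. */
  return sse - (((uint32_t)sum * (uint32_t)sum) >> 8);
}
```

At the extreme, where every difference is 255, the sum is 65280 and the sum of squares is 256 * 255^2 = 16646400, so the variance comes out as 0, as expected for a constant difference.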
James Zern [Tue, 6 Nov 2012 02:13:04 +0000 (18:13 -0800)]
fix test builds
s/([vV][pP])8/$19/
additionally dct.h was removed; declare the _c functions that are used
in the tests. the TODO for conversion to parameterized tests still
remains.
Yaowu Xu [Mon, 5 Nov 2012 22:22:59 +0000 (14:22 -0800)]
silence a lot of MSVC compiler warnings
there are still a couple of types of warning left, which are related to
double constants assigned to float type. As those will be addressed
by the conversion of the transforms to integer versions, this commit
leaves them untouched.
James Zern [Mon, 5 Nov 2012 20:50:16 +0000 (12:50 -0800)]
rdopt: fix use of uninitialized value in addition
rd_pick_intra4x4mby_modes / rd_pick_intra8x8mby_modes would both use the
input value of 'rate_y' in the return calculation. In many places this
value is uninitialized. Remove the unneeded sum.
Yunqing Wang [Fri, 2 Nov 2012 20:06:51 +0000 (13:06 -0700)]
Fix eobs data type
The block sizes for decoding tokens are up to 16x16, which means
eobs is within [0, 256]. Using (signed) char is not enough. Changed
eobs data type to unsigned short to fix the problem.
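The range argument can be checked with the limits from <limits.h>. The helper names below are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <limits.h>

/* Largest possible eob for a w x h transform block: one past the last
 * coefficient position, i.e. the coefficient count. */
static int max_eob(int w, int h) {
  assert(w > 0 && h > 0);
  return w * h;
}

/* Nonzero if the value is representable as a signed char. */
static int fits_in_schar(int v) { return v >= SCHAR_MIN && v <= SCHAR_MAX; }

/* Nonzero if the value is representable as an unsigned short. */
static int fits_in_ushort(int v) { return v >= 0 && v <= USHRT_MAX; }
```

A 16x16 block gives max_eob(16, 16) == 256, which is out of range for a signed char and also one more than an 8-bit unsigned char can hold, while an unsigned short covers it comfortably.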