Yaowu Xu [Fri, 26 Oct 2012 16:14:15 +0000 (09:14 -0700)]
Improves subpixel reference mv evaluation
Previously, when evaluating reference motion vectors, MVs were always
rounded to integer pixel positions before SADs were calculated. This
commit takes into account the subpixel portion of the MVs, and uses
bilinear interpolation to produce reference pixel values at subpixel
positions. In addition, SSE is used in place of SAD. Pixels used are
16x2 above and 2x16 to the left.
This commit intends to test the potential of this line of work in
terms of compression improvement. The change would obviously increase
decoder complexity significantly.
Test results
std-hd: 1.738% (avg), 1.779% (glb), 1.663% (ssim)
derf: 0.472% (avg), 0.477% (glb), 0.418% (ssim)
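The evaluation described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function names, the 1/8-pel MV units, the rounding, and the frame layout (a list of rows) are all assumptions made for the example. It samples the reference frame with bilinear interpolation and sums squared errors over a 16x2 strip above and a 2x16 strip to the left of the block.

```python
def bilinear_sample(ref, x8, y8):
    """Sample ref (a list of rows) at a position given in 1/8-pel units.

    Assumes the position lies inside the frame; the 1/8-pel unit is an
    assumption of this sketch, not taken from the commit.
    """
    h, w = len(ref), len(ref[0])
    xi, xf = x8 >> 3, x8 & 7          # integer and fractional parts
    yi, yf = y8 >> 3, y8 & 7
    x1 = min(xi + 1, w - 1)           # clamp the right/bottom neighbours
    y1 = min(yi + 1, h - 1)
    top = ref[yi][xi] * (8 - xf) + ref[yi][x1] * xf
    bot = ref[y1][xi] * (8 - xf) + ref[y1][x1] * xf
    # The four weights sum to 64, so add 32 to round and shift by 6.
    return (top * (8 - yf) + bot * yf + 32) >> 6


def border_sse(cur, ref, bx, by, mv_x8, mv_y8, size=16, border=2):
    """SSE between the border pixels of the block at (bx, by) in cur and
    the same pixels in ref displaced by the candidate MV."""
    above = [(x, y) for y in range(by - border, by)
             for x in range(bx, bx + size)]        # 16x2 strip above
    left = [(x, y) for y in range(by, by + size)
            for x in range(bx - border, bx)]       # 2x16 strip to the left
    sse = 0
    for x, y in above + left:
        pred = bilinear_sample(ref, x * 8 + mv_x8, y * 8 + mv_y8)
        diff = cur[y][x] - pred
        sse += diff * diff
    return sse
```

A candidate MV with a smaller border SSE would be ranked as a better reference. On a frame that is a half-pixel shift of a linear gradient, the half-pel candidate scores 0 while either integer rounding does not, which is the effect the commit is after.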
Paul Wilkins [Thu, 25 Oct 2012 12:58:21 +0000 (13:58 +0100)]
Explicit MV reference experiment.
Coding and costing of mv reference signal.
Issues in updating the MV ref with COMPANDED_MVREF_THRESH remain
to be resolved. Ideally the MV precision should be defined based
on the absolute MV magnitude, not on the MV ref magnitude as it is now.
Update to mv counts moved into bitstream.c because otherwise
if the motion reference is changed at the last minute the encoder
and decoder get out of step in terms of the counts used to update
entropy probs.
The code works on a few test clips, but there are no results yet on benefit
vs signaling cost, and the rd loop has not been tuned to test lower cost
alternatives based on the available reference values.
Patch 3: Added a check to make sure we don't pick a reference
that would give rise to an uncodeable / out of range residual.
Patch 6-7: Attempt to rebase. OK to submit but best to leave flag off for now.
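The Patch 3 check could look something like the sketch below. Everything here is hypothetical and for illustration only: the function name, the MAX_RESIDUAL limit, and the use of the absolute difference as a stand-in for the real signaling cost are assumptions, not details from the commit.

```python
MAX_RESIDUAL = 1023  # hypothetical limit on a codeable MV component residual


def pick_mv_ref(mv, candidates, max_residual=MAX_RESIDUAL):
    """Return the index of the cheapest candidate reference whose residual
    (mv - ref) is in range, or None if no candidate is usable.

    mv and each candidate are (row, col) tuples.
    """
    best, best_cost = None, None
    for idx, ref in enumerate(candidates):
        d_row, d_col = mv[0] - ref[0], mv[1] - ref[1]
        # Skip references that would leave an out-of-range residual.
        if abs(d_row) > max_residual or abs(d_col) > max_residual:
            continue
        cost = abs(d_row) + abs(d_col)  # stand-in for the real bit cost
        if best_cost is None or cost < best_cost:
            best, best_cost = idx, cost
    return best
```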
Scott LaVarnway [Fri, 26 Oct 2012 00:24:50 +0000 (17:24 -0700)]
Faster 8t filtering
Quickly modified the ssse3 sixtap filters to support eight taps. For the test
clip used, a 23+% boost in decoder performance was seen. We can
revisit later and improve further.
John Koleszar [Fri, 19 Oct 2012 22:35:36 +0000 (15:35 -0700)]
coef_probs: remove duplicate read/update code
Refactor per-transform copy & paste into the common functions
update_coef_probs_common() and read_coef_probs_common(). The dry-run and
bit-writing loops in the encoder are still obvious candidates to be made
common, but they start to diverge a bit in the next commit, so are left
as-is for now.
Yunqing Wang [Wed, 24 Oct 2012 16:14:36 +0000 (09:14 -0700)]
Fix "_FORTIFY_SOURCE" redefined warning
On Ubuntu 12.04, we got the following warning message:
<command-line>:0:0: warning: "_FORTIFY_SOURCE" redefined
[enabled by default]
<built-in>:0:0: note: this is the location of the previous definition
This was already fixed in the VP8 configure file. Made the same change in
the experimental branch to stop this warning.
Deb Mukherjee [Mon, 22 Oct 2012 21:43:01 +0000 (14:43 -0700)]
Merging in the Switchable interp experiment
There is a macro DEFAULT_INTERP_FILTER defined in encoder/onyx_if.c that
is set as EIGHTTAP for now - so SWITCHABLE is not really used. Ideally,
this should be SWITCHABLE but that would make the encoder quite a bit slower.
We will change the default filter to SWITCHABLE once we find a faster way to
search for switchable filters.
Ronald S. Bultje [Mon, 22 Oct 2012 19:54:39 +0000 (12:54 -0700)]
Merge changes I02e7f64a,Ide954b00,Idc8b5977 into experimental
* changes:
Fix another typo in 4x4-transform-for-i8x8-intra-pred coeff contexts.
8x8 transform support in splitmv.
Use SPLITMV_PARTITIONING instead of a plain integer type.
Ronald S. Bultje [Mon, 22 Oct 2012 18:49:00 +0000 (11:49 -0700)]
8x8 transform support in splitmv.
For splitmv, where partitioning is 8x16, 16x8 or 8x8, this patch
uses the 8x8 transform (instead of the 4x4) if txfm_mode is
ALLOW_8X8 or ALLOW_16X16. For TX_MODE_SELECT, splitmv can indicate
which of the 2 transform sizes (4x4 or 8x8) it wants to use.
Gains (with hybridtx4x4/8x8/16x16 and tx_select experiments
enabled) on derf: +0.9%, HD: +0.4%, STD/HD: +0.8% (SSIM or overall
PSNR, both metrics show similar improvements).
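The selection rule described above can be written down as a small sketch. The function name and the string constants are illustrative; treating the 4x4 partitioning and any other txfm_mode (for example a 4x4-only mode) as falling back to the 4x4 transform is an assumption drawn from the wording of the message, not from the patch itself.

```python
TX_4X4, TX_8X8 = "TX_4X4", "TX_8X8"


def splitmv_txfm_size(partitioning, txfm_mode, selected=TX_4X4):
    """Pick the transform size for a splitmv macroblock.

    partitioning: "16x8", "8x16", "8x8" or "4x4"
    txfm_mode:    frame-level transform mode
    selected:     per-MB choice, only consulted under TX_MODE_SELECT
    """
    # Partitions smaller than 8x8 keep the 4x4 transform (assumption).
    if partitioning not in ("16x8", "8x16", "8x8"):
        return TX_4X4
    if txfm_mode in ("ALLOW_8X8", "ALLOW_16X16"):
        return TX_8X8
    if txfm_mode == "TX_MODE_SELECT":
        return selected            # splitmv signals 4x4 or 8x8 itself
    return TX_4X4                  # any other mode falls back to 4x4
```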
Deb Mukherjee [Fri, 19 Oct 2012 22:12:12 +0000 (15:12 -0700)]
Allow B_VL_PRED & B_LD_PRED modes with Superblocks
Allows B_VL_PRED & B_LD_PRED modes to be used for all blocks
within a MB in B_PRED mode. These modes were temporarily
disabled with super-block coding.
Scott LaVarnway [Fri, 19 Oct 2012 22:52:12 +0000 (15:52 -0700)]
sse2 intrinsic version of vp8_mbloop_filter_vertical_edge()
First sse2 version of vp8_mbloop_filter_vertical_edge(). For now,
intrinsics are being used until the bitstream is finalized. This function
will be revisited later for further performance improvements.
For the test clip used, a 34+% decoder performance improvement
was seen. This will vary depending on material.
John Koleszar [Thu, 18 Oct 2012 23:27:30 +0000 (16:27 -0700)]
calculate probs consistently
There were several different methods for calculating bitstream
probabilities in use. Consolidate these into a pair of functions,
get_prob() and get_binary_prob().
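The message only names the two functions, so the bodies below are an assumption: a plausible sketch of how an 8-bit bool-coder probability could be derived from counts, with rounding, a neutral value for empty counts, and clipping to the range 1..255. The clip_prob() helper is hypothetical.

```python
def clip_prob(p):
    """Clamp to 1..255 so neither branch ever has zero probability."""
    return max(1, min(255, p))


def get_prob(num, den):
    """Probability of the event counted by num out of den, scaled to 8 bits."""
    if den == 0:
        return 128                      # no data: assume 50/50
    return clip_prob((num * 256 + (den >> 1)) // den)


def get_binary_prob(n0, n1):
    """Probability of a 0 bit given counts of zeros (n0) and ones (n1)."""
    return get_prob(n0, n0 + n1)
```

For example, under these assumptions get_binary_prob(1, 3) gives 64, that is, a zero is expected about a quarter of the time.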
John Koleszar [Thu, 18 Oct 2012 21:34:53 +0000 (14:34 -0700)]
lint-hunks: better support for working tree
When run with no arguments, report warnings in the diff between the
working tree and HEAD. With arguments, report warnings in the diff
between the named commit and its parents.
John Koleszar [Wed, 17 Oct 2012 23:47:38 +0000 (16:47 -0700)]
Remove bc, bc2 from pbi,cpi,xd
Pass the bool coder to be used explicitly. This avoids cases where two
different bool coders can be addressed from the same function. Also be
more consistent with bool coder variable naming, start to standardize
on 'bc'.
Deb Mukherjee [Mon, 15 Oct 2012 23:41:41 +0000 (16:41 -0700)]
Some cleanups and fixes.
Separates the logic on transform type selection previously spread out
over a number of files into a separate function. Currently the tx_type
field in b_mode_info is not used, but still left in there to eventually
use for signaling the transform type in the bitstream.
Also, now for tx_type = DCT_DCT, the regular integer DCT is used, as
opposed to the floating point DCT used in conjunction with hybrid
transform.
Results change somewhat due to the transform change, but are within
reasonable limits. The hd/std-hd sets are slightly up, while derf/yt
are slightly down.
Scott LaVarnway [Thu, 18 Oct 2012 21:29:26 +0000 (14:29 -0700)]
sse2 intrinsic version of vp8_mbloop_filter_horizontal_edge()
First sse2 version of vp8_mbloop_filter_horizontal_edge(). For now,
intrinsics are being used until the bitstream is finalized. This function
will be revisited later for further performance improvements.
For the test clip used, a 31+% decoder performance improvement
was seen. This will vary depending on material.
John Koleszar [Thu, 18 Oct 2012 04:43:18 +0000 (21:43 -0700)]
lint-hunks: exit status for only affected lines
Prior to this patch, if there were any lint errors, this script would
exit with an error, even if those errors were not in the hunks being
tested by this script. With this change, an error is returned only if
lint lines are actually printed for the affected hunks.
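The behavior can be illustrated with a short sketch. This is not the lint-hunks script itself: the function names and the data shapes (warnings as tuples, hunks as line ranges per file) are assumptions chosen for the example.

```python
def filter_lint(warnings, hunks):
    """Keep only the warnings that fall inside a changed hunk.

    warnings: list of (filename, line, message)
    hunks:    dict mapping filename to a list of (start, end) line ranges
    """
    kept = []
    for filename, line, message in warnings:
        for start, end in hunks.get(filename, []):
            if start <= line <= end:
                kept.append((filename, line, message))
                break
    return kept


def exit_status(warnings, hunks):
    """Print the affected warnings and return 1 only if any were printed."""
    affected = filter_lint(warnings, hunks)
    for filename, line, message in affected:
        print("%s:%d: %s" % (filename, line, message))
    return 1 if affected else 0
```

A warning on an untouched line of a modified file is dropped by filter_lint(), so it no longer affects the exit status.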
John Koleszar [Wed, 17 Oct 2012 16:38:13 +0000 (09:38 -0700)]
Move remaining per-frame data into partition 0
This commit moves a bit of data that ended up packed with the
modes/mv/residual partition during the change to interleaved encoding
into partition 0 where it belongs.
John Koleszar [Tue, 16 Oct 2012 20:52:39 +0000 (13:52 -0700)]
Interleave modes/residual per macroblock
Packs the bitstream with each mb's residual following its mode/mv
information.
TODO: There are still a few fields that should be packed into partition
0 but are included in partition 1, due to them being serialized from
write_kfmodes/pack_inter_mode_mvs, which execute after the first
partition is finalized. These need to be separated out into a separate
function, similar to mb_mode_mv_init() in decodemv.c.
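The difference in packing order can be shown with a toy sketch. The function names and the tuple representation of bitstream elements are illustrative assumptions; only the ordering reflects what the message describes.

```python
def pack_separate(mb_indices):
    """Old order: all mode/mv data first, then all residuals."""
    return ([("mode_mv", i) for i in mb_indices] +
            [("residual", i) for i in mb_indices])


def pack_interleaved(mb_indices):
    """New order: each macroblock's residual follows its mode/mv data."""
    out = []
    for i in mb_indices:
        out.append(("mode_mv", i))
        out.append(("residual", i))
    return out
```

With interleaving, a decoder can fully reconstruct one macroblock before it reads anything belonging to the next.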
John Koleszar [Tue, 16 Oct 2012 21:08:40 +0000 (14:08 -0700)]
Force interleaved decoding
Rather than decoding all modes/mvs separately, decode them per MB. This
forces the mode which was already used from the CONFIG_NEWBESTREFMV and
CONFIG_SUPERBLOCKS experiments, and is a precursor to changing to
interleaved encoding.