Paul Wilkins [Thu, 25 Oct 2012 12:58:21 +0000 (13:58 +0100)]
Explicit MV reference experiment.
Coding and costing of mv reference signal.
Issues in updating the MV ref with COMPANDED_MVREF_THRESH are still
to be resolved. Ideally the MV precision should be defined based
on the absolute MV magnitude, not, as now, on the MV ref magnitude.
Update to mv counts moved into bitstream.c because otherwise
if the motion reference is changed at the last minute the encoder
and decoder get out of step in terms of the counts used to update
entropy probs.
Code is working on a few test clips, but there are no results yet on benefit
vs. signaling cost, and no tuning of the rd loop to test lower cost alternatives
based on the available reference values.
Patch 3: Added a check to make sure we don't pick a reference
that would give rise to an uncodeable / out of range residual.
Patch 6-7: Attempt to rebase. OK to submit but best to leave flag off for now.
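The range check mentioned under Patch 3 might look roughly like the sketch below. This is an illustration only: the `mv_t` type, the `MAX_MV_RESIDUAL` limit, the function names, and the simple absolute-difference cost are all hypothetical and are not taken from the patch.

```c
#include <stdlib.h>

/* Hypothetical motion vector type, for illustration only. */
typedef struct { int row, col; } mv_t;

/* Hypothetical limit on the magnitude of a codeable residual component. */
#define MAX_MV_RESIDUAL 1023

/* Returns 1 if (mv - ref) fits in the assumed codeable range. */
static int residual_is_codeable(mv_t mv, mv_t ref) {
  return abs(mv.row - ref.row) <= MAX_MV_RESIDUAL &&
         abs(mv.col - ref.col) <= MAX_MV_RESIDUAL;
}

/* Picks the candidate with the smallest residual magnitude (a stand-in
 * for a real rate cost), skipping candidates whose residual would be
 * out of range. Returns the candidate index, or -1 if none is usable. */
static int pick_mv_ref(mv_t mv, const mv_t *cands, int n) {
  int best = -1;
  int best_cost = 0;
  for (int i = 0; i < n; ++i) {
    if (!residual_is_codeable(mv, cands[i])) continue;
    int cost = abs(mv.row - cands[i].row) + abs(mv.col - cands[i].col);
    if (best < 0 || cost < best_cost) {
      best = i;
      best_cost = cost;
    }
  }
  return best;
}
```

A caller would fall back to some default reference when the function returns -1.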
Scott LaVarnway [Fri, 26 Oct 2012 00:24:50 +0000 (17:24 -0700)]
Faster 8t filtering
Quickly modified the ssse3 sixtap filters to support eight taps. For the test
clip used, a 23+% boost in decoder performance was seen. We can
revisit later and improve further.
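For reference, a scalar eight-tap row filter can be sketched as below. This is not the ssse3 code from the commit. The function names, the 7-bit tap precision (taps summing to 128), and the 3-before/4-after tap alignment are assumptions made for illustration.

```c
#include <stdint.h>

#define FILTER_TAPS 8
#define FILTER_SHIFT 7 /* assumption: taps sum to 128 */

/* Clamp an intermediate value to the 8-bit pixel range. */
static uint8_t clamp_pixel(int v) {
  return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Filters `width` pixels of one row. The caller must guarantee 3 valid
 * pixels before src[0] and 4 valid pixels after src[width - 1]. */
static void filter_row_8tap(const uint8_t *src, uint8_t *dst, int width,
                            const int16_t taps[FILTER_TAPS]) {
  for (int x = 0; x < width; ++x) {
    int sum = 1 << (FILTER_SHIFT - 1); /* rounding offset */
    for (int k = 0; k < FILTER_TAPS; ++k)
      sum += src[x + k - 3] * taps[k];
    /* Negative sums clamp to 0 before shifting, avoiding a right
     * shift of a negative value. */
    dst[x] = sum < 0 ? 0 : clamp_pixel(sum >> FILTER_SHIFT);
  }
}
```

A SIMD version computes the same sums for several output pixels at once, which is where a speedup over scalar code would come from.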
John Koleszar [Fri, 19 Oct 2012 22:35:36 +0000 (15:35 -0700)]
coef_probs: remove duplicate read/update code
Refactor per-transform copy & paste into the common functions
update_coef_probs_common() and read_coef_probs_common(). The dry-run and
bit-writing loops in the encoder are still obvious candidates to be made
common, but they start to diverge a bit in the next commit, so are left
as-is for now.
Yunqing Wang [Wed, 24 Oct 2012 16:14:36 +0000 (09:14 -0700)]
Fix "_FORTIFY_SOURCE" redefined warning
On Ubuntu 12.04, we got the following warning message:
<command-line>:0:0: warning: "_FORTIFY_SOURCE" redefined
[enabled by default]
<built-in>:0:0: note: this is the location of the previous definition
This was already fixed in the VP8 configure file. Made the same change in
the experimental branch to stop this warning.
Deb Mukherjee [Mon, 22 Oct 2012 21:43:01 +0000 (14:43 -0700)]
Merging in the Switchable interp experiment
There is a macro DEFAULT_INTERP_FILTER defined in encoder/onyx_if.c that
is set as EIGHTTAP for now - so SWITCHABLE is not really used. Ideally,
this should be SWITCHABLE but that would make the encoder quite a bit slower.
We will change the default filter to SWITCHABLE once we find a faster way to
search for switchable filters.
Ronald S. Bultje [Mon, 22 Oct 2012 19:54:39 +0000 (12:54 -0700)]
Merge changes I02e7f64a,Ide954b00,Idc8b5977 into experimental
* changes:
Fix another typo in 4x4-transform-for-i8x8-intra-pred coeff contexts.
8x8 transform support in splitmv.
Use SPLITMV_PARTITIONING instead of a plain integer type.
Ronald S. Bultje [Mon, 22 Oct 2012 18:49:00 +0000 (11:49 -0700)]
8x8 transform support in splitmv.
For splitmv, where partitioning is 8x16, 16x8 or 8x8, this patch
uses the 8x8 transform (instead of the 4x4) if txfm_mode is
ALLOW_8X8 or ALLOW_16X16. For TX_MODE_SELECT, splitmv can indicate
which of the 2 transform sizes (4x4 or 8x8) it wants to use.
Gains (with hybridtx4x4/8x8/16x16 and tx_select experiments
enabled) on derf: +0.9%, HD: +0.4%, STD/HD: +0.8% (SSIM or overall
PSNR, both metrics show similar improvements).
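The selection rule described above can be sketched as follows. The enum definitions, the `ONLY_4X4` value, the `PART_*` names, and the function name are assumptions made for this illustration. Returning the 4x4 transform for a 4x4 partitioning is likewise an inference, not something stated in the commit.

```c
/* Hypothetical definitions modelled on the names in the commit message. */
typedef enum { ONLY_4X4, ALLOW_8X8, ALLOW_16X16, TX_MODE_SELECT } txfm_mode_t;
typedef enum { TX_4X4, TX_8X8 } tx_size_t;
typedef enum { PART_16X8, PART_8X16, PART_8X8, PART_4X4 } split_part_t;

/* Chooses the transform size for a splitmv macroblock. `signalled` is the
 * size indicated for this macroblock and is only consulted when the frame
 * uses TX_MODE_SELECT. */
static tx_size_t splitmv_tx_size(txfm_mode_t mode, split_part_t part,
                                 tx_size_t signalled) {
  /* Assumption: a 4x4 partitioning cannot hold an 8x8 transform. */
  if (part == PART_4X4) return TX_4X4;

  switch (mode) {
    case ALLOW_8X8:
    case ALLOW_16X16:
      return TX_8X8;     /* 8x8 instead of 4x4 */
    case TX_MODE_SELECT:
      return signalled;  /* per-macroblock choice of 4x4 or 8x8 */
    default:
      return TX_4X4;
  }
}
```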
Deb Mukherjee [Fri, 19 Oct 2012 22:12:12 +0000 (15:12 -0700)]
Allow B_VL_PRED & B_LD_PRED modes with Superblocks
Allows B_VL_PRED & B_LD_PRED modes to be used for all blocks
within a MB in B_PRED mode. These modes were temporarily
disabled with super-block coding.
Scott LaVarnway [Fri, 19 Oct 2012 22:52:12 +0000 (15:52 -0700)]
sse2 intrinsic version of vp8_mbloop_filter_vertical_edge()
First sse2 version of vp8_mbloop_filter_vertical_edge(). For now,
intrinsics are being used until the bitstream is finalized. This function
will be revisited later for further performance improvements.
For the test clip used, a 34+% decoder performance improvement
was seen. This will vary depending on material.
John Koleszar [Thu, 18 Oct 2012 23:27:30 +0000 (16:27 -0700)]
calculate probs consistently
There were several different methods for calculating bitstream
probabilities in use. Consolidate these into a pair of functions,
get_prob() and get_binary_prob().
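One plausible shape for such a pair of helpers is sketched below. Only the names get_prob() and get_binary_prob() come from the commit. The signatures, the `prob_t` type, the 8-bit scale, the rounding, the [1, 255] clamp, and the 128 fallback for an empty count are assumptions.

```c
#include <stdint.h>

typedef uint8_t prob_t; /* assumed 8-bit probability, scaled to 1..255 */

/* Keep probabilities inside [1, 255] so neither branch becomes impossible. */
static prob_t clip_prob(int p) {
  return (prob_t)(p > 255 ? 255 : (p < 1 ? 1 : p));
}

/* Probability of an event seen `num` times out of `den` trials, rounded
 * to the nearest 1/256. Falls back to 128 (even odds) with no data. */
static prob_t get_prob(int num, int den) {
  if (den <= 0) return 128;
  return clip_prob((int)(((int64_t)num * 256 + (den >> 1)) / den));
}

/* Probability of branch 0, given counts for branch 0 and branch 1. */
static prob_t get_binary_prob(int n0, int n1) {
  return get_prob(n0, n0 + n1);
}
```

Routing every call site through one rounding and clamping rule helps the encoder and decoder derive identical probabilities from identical counts.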
John Koleszar [Thu, 18 Oct 2012 21:34:53 +0000 (14:34 -0700)]
lint-hunks: better support for working tree
When run with no arguments, report warnings in the diff between the
working tree and HEAD. With arguments, report warnings in the diff
between the named commit and its parents.
John Koleszar [Wed, 17 Oct 2012 23:47:38 +0000 (16:47 -0700)]
Remove bc, bc2 from pbi,cpi,xd
Pass the bool coder to be used explicitly. This avoids cases where two
different bool coders can be addressed from the same function. Also be
more consistent with bool coder variable naming, start to standardize
on 'bc'.
Deb Mukherjee [Mon, 15 Oct 2012 23:41:41 +0000 (16:41 -0700)]
Some cleanups and fixes.
Separates the logic on transform type selection previously spread out
over a number of files into a separate function. Currently the tx_type
field in b_mode_info is not used, but it is left in place so that it can
eventually be used for signaling the transform type in the bitstream.
Also, now for tx_type = DCT_DCT, the regular integer DCT is used, as
opposed to the floating point DCT used in conjunction with hybrid
transform.
Results change somewhat due to the transform change, but are within
reasonable limits. The hd/std-hd sets are slightly up, while derf/yt
are slightly down.
Scott LaVarnway [Thu, 18 Oct 2012 21:29:26 +0000 (14:29 -0700)]
sse2 intrinsic version of vp8_mbloop_filter_horizontal_edge()
First sse2 version of vp8_mbloop_filter_horizontal_edge(). For now,
intrinsics are being used until the bitstream is finalized. This function
will be revisited later for further performance improvements.
For the test clip used, a 31+% decoder performance improvement
was seen. This will vary depending on material.
John Koleszar [Thu, 18 Oct 2012 04:43:18 +0000 (21:43 -0700)]
lint-hunks: exit status for only affected lines
Prior to this patch, if there were any lint errors, this script would
exit with an error, even if those errors were not in the hunks being
tested by this script. With this change, an error is returned only if
lint lines are actually printed.
John Koleszar [Wed, 17 Oct 2012 16:38:13 +0000 (09:38 -0700)]
Move remaining per-frame data into partition 0
This commit moves a bit of data that ended up packed with the
modes/mv/residual partition during the change to interleaved encoding
into partition 0 where it belongs.
John Koleszar [Tue, 16 Oct 2012 20:52:39 +0000 (13:52 -0700)]
Interleave modes/residual per macroblock
Packs the bitstream with each mb's residual following its mode/mv
information.
TODO: There are still a few fields that should be packed into partition
0 but are included in partition 1, due to them being serialized from
write_kfmodes/pack_inter_mode_mvs, which execute after the first
partition is finalized. These need to be separated out into a separate
function, similar to mb_mode_mv_init() in decodemv.c.
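The change in packing order can be pictured with the toy sketch below, which records the sequence in which per-macroblock items would be written. It does not model the bool coder or the partitions. The `rec` type and both function names are hypothetical.

```c
#include <stddef.h>

/* Toy record of what was written, and for which macroblock. */
typedef enum { REC_MODE_MV, REC_RESIDUAL } rec_kind;
typedef struct { rec_kind kind; int mb; } rec;

/* Separate order: mode/mv info for every macroblock, then every residual. */
static size_t order_separate(int num_mbs, rec *out) {
  size_t n = 0;
  for (int mb = 0; mb < num_mbs; ++mb) out[n++] = (rec){REC_MODE_MV, mb};
  for (int mb = 0; mb < num_mbs; ++mb) out[n++] = (rec){REC_RESIDUAL, mb};
  return n;
}

/* Interleaved order: each macroblock's residual follows its mode/mv info. */
static size_t order_interleaved(int num_mbs, rec *out) {
  size_t n = 0;
  for (int mb = 0; mb < num_mbs; ++mb) {
    out[n++] = (rec){REC_MODE_MV, mb};
    out[n++] = (rec){REC_RESIDUAL, mb};
  }
  return n;
}
```

In both functions `out` must have room for `2 * num_mbs` records.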
John Koleszar [Tue, 16 Oct 2012 21:08:40 +0000 (14:08 -0700)]
Force interleaved decoding
Rather than decoding all modes/mvs separately, decode them per MB. This
forces the mode which was already used by the CONFIG_NEWBESTREFMV and
CONFIG_SUPERBLOCKS experiments, and is a precursor to changing to
interleaved encoding.
Ronald S. Bultje [Mon, 15 Oct 2012 20:49:45 +0000 (13:49 -0700)]
Remove mode_rdopt from MB_MODE_INFO.
The variable is essentially a duplicate of mode for RD-only purposes.
Removing it gives identical results, and saves 4 bytes per macroblock
(i.e. 32.5kB for a 1080p HD video encode).
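The order of magnitude of that saving can be checked with a small calculation. The sketch assumes 16x16 macroblocks, frame dimensions rounded up to a multiple of 16, and no extra border entries in the mode info array. The function names are hypothetical.

```c
#include <stddef.h>

/* Number of 16x16 macroblocks needed to cover a frame, rounding up. */
static size_t macroblocks_per_frame(int width, int height) {
  size_t cols = (size_t)(width + 15) / 16;
  size_t rows = (size_t)(height + 15) / 16;
  return cols * rows;
}

/* Bytes saved per frame when each macroblock's info shrinks by
 * `bytes_per_mb` bytes. */
static size_t bytes_saved(int width, int height, size_t bytes_per_mb) {
  return macroblocks_per_frame(width, height) * bytes_per_mb;
}
```

For 1920x1080 this gives 120 x 68 = 8160 macroblocks, so 4 bytes each comes to 32640 bytes, roughly in line with the figure quoted above. The exact number depends on how the mode info array is padded.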
Ronald S. Bultje [Mon, 15 Oct 2012 17:52:13 +0000 (10:52 -0700)]
Add a new token stuffing function vp8_stuff_mb().
This way a caller doesn't need to implement the logic for which (and how
many) tokens to write out to stuff one macroblock worth of EOBs. Make
the actual function implementations static, since they are now only used
in tokenize.c; also do some minor stylistic changes so it follows the
style guide a little more closely; use PLANE_TYPE where appropriate,
remove old (stale) frame_type function arguments; hardcode plane type
where only a single one is possible (2nd order DC or U/V EOB stuffing);
support stuffing 8x8/4x4 transform EOBs with no 2nd order DC.