Ronald S. Bultje [Mon, 22 Oct 2012 19:54:39 +0000 (12:54 -0700)]
Merge changes I02e7f64a,Ide954b00,Idc8b5977 into experimental
* changes:
Fix another typo in 4x4-transform-for-i8x8-intra-pred coeff contexts.
8x8 transform support in splitmv.
Use SPLITMV_PARTITIONING instead of a plain integer type.
Ronald S. Bultje [Mon, 22 Oct 2012 18:49:00 +0000 (11:49 -0700)]
8x8 transform support in splitmv.
For splitmv, where partitioning is 8x16, 16x8 or 8x8, this patch
uses the 8x8 transform (instead of the 4x4) if txfm_mode is
ALLOW_8X8 or ALLOW_16X16. For TX_MODE_SELECT, splitmv can indicate
which of the 2 transform sizes (4x4 or 8x8) it wants to use.
Gains (with hybridtx4x4/8x8/16x16 and tx_select experiments
enabled) on derf: +0.9%, HD: +0.4%, STD/HD: +0.8% (SSIM or overall
PSNR, both metrics show similar improvements).
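The selection rule described in this commit can be sketched as below. This is a minimal illustration, not the code from the patch: the enum layouts, the ONLY_4X4 value, the PART_* names, the splitmv_tx_size() helper and its signalled_8x8 argument are all assumptions made for the example, as is the choice to keep the 4x4 transform for 4x4 partitioning.

```c
#include <assert.h>

/* Hypothetical stand-ins for the codec's types; names and values are assumed. */
typedef enum { TX_4X4, TX_8X8, TX_16X16 } TX_SIZE;
typedef enum { ONLY_4X4, ALLOW_8X8, ALLOW_16X16, TX_MODE_SELECT } TXFM_MODE;
typedef enum { PART_16X8, PART_8X16, PART_8X8, PART_4X4 } SPLITMV_PART;

/* Pick the transform size for a splitmv macroblock.
 * signalled_8x8 is only consulted under TX_MODE_SELECT and stands for the
 * per-macroblock choice between the two sizes (4x4 or 8x8). */
static TX_SIZE splitmv_tx_size(TXFM_MODE txfm_mode, SPLITMV_PART part,
                               int signalled_8x8) {
  assert(part >= PART_16X8 && part <= PART_4X4);

  /* Assumption: 4x4 partitioning always keeps the 4x4 transform. */
  if (part == PART_4X4) return TX_4X4;

  switch (txfm_mode) {
    case ALLOW_8X8:
    case ALLOW_16X16:
      /* 8x16, 16x8 and 8x8 partitions use the 8x8 transform. */
      return TX_8X8;
    case TX_MODE_SELECT:
      return signalled_8x8 ? TX_8X8 : TX_4X4;
    default:
      return TX_4X4;
  }
}
```

Note that ALLOW_16X16 still maps to 8x8 here, since no splitmv partition in the list above covers a full 16x16 block.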
Deb Mukherjee [Fri, 19 Oct 2012 22:12:12 +0000 (15:12 -0700)]
Allow B_VL_PRED & B_LD_PRED modes with Superblocks
Allows B_VL_PRED & B_LD_PRED modes to be used for all blocks
within a MB in B_PRED mode. These modes were temporarily
disabled with super-block coding.
Scott LaVarnway [Fri, 19 Oct 2012 22:52:12 +0000 (15:52 -0700)]
sse2 intrinsic version of vp8_mbloop_filter_vertical_edge()
First sse2 version of vp8_mbloop_filter_vertical_edge(). For now,
intrinsics are being used until the bitstream is finalized. This function
will be revisited later for further performance improvements.
For the test clip used, a 34+% decoder performance improvement
was seen. This will vary depending on material.
John Koleszar [Thu, 18 Oct 2012 23:27:30 +0000 (16:27 -0700)]
calculate probs consistently
There were several different methods for calculating bitstream
probabilities in use. Consolidate these into a pair of functions,
get_prob() and get_binary_prob().
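A pair of helpers along these lines might look like the sketch below. Only the names get_prob() and get_binary_prob() come from the commit message; the bodies, the clip_prob() helper, the rounding, and the 128 fallback for an empty count are assumptions chosen to illustrate the idea of an 8-bit probability clamped to [1, 255].

```c
#include <assert.h>

typedef unsigned char vp8_prob;

/* Clamp to [1, 255]; assumed here because an 8-bit bool-coder probability
 * of 0 would be unusable. */
static vp8_prob clip_prob(int p) {
  return (vp8_prob)(p > 255 ? 255 : (p < 1 ? 1 : p));
}

/* Probability of an event seen num times out of den, scaled to 8 bits and
 * rounded to nearest. Falls back to 128 (one half) when there are no
 * observations. Large counts could overflow int in num * 256. */
static vp8_prob get_prob(int num, int den) {
  assert(num >= 0 && den >= 0);
  if (den == 0) return 128;
  return clip_prob((num * 256 + (den >> 1)) / den);
}

/* Probability of the "0" branch given counts for both branches. */
static vp8_prob get_binary_prob(int n0, int n1) {
  return get_prob(n0, n0 + n1);
}
```

With these definitions, callers that previously scaled and clamped counts by hand would pass the raw counts and let one place decide the rounding and clamping.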
John Koleszar [Thu, 18 Oct 2012 21:34:53 +0000 (14:34 -0700)]
lint-hunks: better support for working tree
When run with no arguments, report warnings in the diff between the
working tree and HEAD. With arguments, report warnings in the diff
between the named commit and its parents.
John Koleszar [Wed, 17 Oct 2012 23:47:38 +0000 (16:47 -0700)]
Remove bc, bc2 from pbi,cpi,xd
Pass the bool coder to be used explicitly. This avoids cases where two
different bool coders can be addressed from the same function. Also be
more consistent with bool coder variable naming and start to standardize
on 'bc'.
Deb Mukherjee [Mon, 15 Oct 2012 23:41:41 +0000 (16:41 -0700)]
Some cleanups and fixes.
Separates the logic on transform type selection previously spread out
over a number of files into a separate function. Currently the tx_type
field in b_mode_info is not used, but still left in there to eventually
use for signaling the transform type in the bitstream.
Also, now for tx_type = DCT_DCT, the regular integer DCT is used, as
opposed to the floating point DCT used in conjunction with hybrid
transform.
Results change somewhat due to the transform change, but are within
reasonable limits. The hd/std-hd sets are slightly up, while derf/yt
are slightly down.
Scott LaVarnway [Thu, 18 Oct 2012 21:29:26 +0000 (14:29 -0700)]
sse2 intrinsic version of vp8_mbloop_filter_horizontal_edge()
First sse2 version of vp8_mbloop_filter_horizontal_edge(). For now,
intrinsics are being used until the bitstream is finalized. This function
will be revisited later for further performance improvements.
For the test clip used, a 31+% decoder performance improvement
was seen. This will vary depending on material.
John Koleszar [Thu, 18 Oct 2012 04:43:18 +0000 (21:43 -0700)]
lint-hunks: exit status for only affected lines
Prior to this patch, if there were any lint errors, this script would
exit with an error, even if those errors were not in the hunks being
tested by this script. This change makes it so that an error is
returned only if any lint lines are printed.
John Koleszar [Wed, 17 Oct 2012 16:38:13 +0000 (09:38 -0700)]
Move remaining per-frame data into partition 0
This commit moves a bit of data that ended up packed with the
modes/mv/residual partition during the change to interleaved encoding
into partition 0 where it belongs.
John Koleszar [Tue, 16 Oct 2012 20:52:39 +0000 (13:52 -0700)]
Interleave modes/residual per macroblock
Packs the bitstream with each mb's residual following its mode/mv
information.
TODO: There are still a few fields that should be packed into partition
0 but are included in partition 1, due to them being serialized from
write_kfmodes/pack_inter_mode_mvs, which execute after the first
partition is finalized. These need to be separated out into a separate
function, similar to mb_mode_mv_init() in decodemv.c.
John Koleszar [Tue, 16 Oct 2012 21:08:40 +0000 (14:08 -0700)]
Force interleaved decoding
Rather than decoding all modes/mvs separately, decode them per MB. This
forces the mode which was already used from the CONFIG_NEWBESTREFMV and
CONFIG_SUPERBLOCKS experiments, and is a precursor to changing to
interleaved encoding.
Ronald S. Bultje [Mon, 15 Oct 2012 20:49:45 +0000 (13:49 -0700)]
Remove mode_rdopt from MB_MODE_INFO.
The variable is essentially a duplicate of mode for RD-only purposes.
Removing it gives identical results, and saves 4 bytes per macroblock
(i.e. 32.5kB for a 1080p HD video encode).
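The arithmetic behind that figure can be checked with a short sketch. It assumes the frame is covered by whole 16x16 macroblocks (dimensions rounded up) and ignores any border entries the encoder may allocate; mb_count() and bytes_saved() are hypothetical helpers, not functions from the codebase.

```c
#include <assert.h>

/* Number of 16x16 macroblocks needed to cover a frame, rounding each
 * dimension up to a whole macroblock (assumption for this example). */
static long mb_count(int width, int height) {
  assert(width > 0 && height > 0);
  long cols = (width + 15) / 16;
  long rows = (height + 15) / 16;
  return cols * rows;
}

/* Total bytes saved when each macroblock shrinks by bytes_per_mb. */
static long bytes_saved(int width, int height, int bytes_per_mb) {
  return mb_count(width, height) * bytes_per_mb;
}
```

For 1920x1080 this gives 120 x 68 = 8160 macroblocks, so 4 bytes each comes to 32640 bytes, in line with the roughly 32.5kB quoted above.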
Ronald S. Bultje [Mon, 15 Oct 2012 17:52:13 +0000 (10:52 -0700)]
Add a new token stuffing function vp8_stuff_mb().
This way a caller doesn't need to implement the logic for which (and how
many) tokens to write out to stuff one macroblock worth of EOBs. Make
the actual function implementations static, since they are now only used
in tokenize.c; also do some minor stylistic changes so it follows the
style guide a little more closely; use PLANE_TYPE where appropriate,
remove old (stale) frame_type function arguments; hardcode plane type
where only a single one is possible (2nd order DC or U/V EOB stuffing);
support stuffing 8x8/4x4 transform EOBs with no 2nd order DC.
Ronald S. Bultje [Sun, 14 Oct 2012 22:29:56 +0000 (15:29 -0700)]
Add and consistently use PLANE_TYPE.
Change the macros PLANE_TYPE_{Y_NO_DC,Y2,UV,Y_WITH_DC} to a typed enum,
and use this typed enum consistently across all places where relevant.
In places where the type is implied (e.g. in functions that only handle
second order planes or chroma planes), remove it as a function argument
and instead hardcode the proper enum in the code directly.
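The macro-to-enum change can be pictured as follows. The four enumerator names come from the macros named above; their numeric values, and the first_coeff_index() and chroma_first_coeff_index() helpers, are assumptions made up for this illustration.

```c
#include <assert.h>

/* Typed enum replacing the old PLANE_TYPE_* macros (values assumed). */
typedef enum {
  PLANE_TYPE_Y_NO_DC = 0,
  PLANE_TYPE_Y2,
  PLANE_TYPE_UV,
  PLANE_TYPE_Y_WITH_DC
} PLANE_TYPE;

/* Hypothetical helper: a luma block whose DC lives in the second-order
 * block starts coding at coefficient 1; every other plane type starts
 * at coefficient 0. */
static int first_coeff_index(PLANE_TYPE type) {
  assert(type >= PLANE_TYPE_Y_NO_DC && type <= PLANE_TYPE_Y_WITH_DC);
  return type == PLANE_TYPE_Y_NO_DC ? 1 : 0;
}

/* Hypothetical chroma-only helper: the plane type is implied, so it is
 * hardcoded here instead of being passed in as an argument. */
static int chroma_first_coeff_index(void) {
  return first_coeff_index(PLANE_TYPE_UV);
}
```

Taking a PLANE_TYPE parameter rather than a plain int documents the intent at each call site and lets some compilers and lint tools flag stray integer arguments.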
Ronald S. Bultje [Sat, 13 Oct 2012 18:46:21 +0000 (11:46 -0700)]
Merge duplicate loops in tokenization code.
Also merge the three occurrences of 4x4 chroma block writing into a
single function, and call that function instead of duplicating the
4x4 chroma tokenization code in 3 places.
Ronald S. Bultje [Sat, 13 Oct 2012 16:27:54 +0000 (09:27 -0700)]
Minor refactoring in encodeintra.c.
Merge code blocks for different transform sizes; use MACROBLOCKD as a
temp variable where that leads to smaller overall source code; remove
duplicate code under #if CONFIG_HYBRIDTRANSFORM/#else blocks. Some style
changes to make it follow the style guide a little better.
Ronald S. Bultje [Sat, 13 Oct 2012 15:15:51 +0000 (08:15 -0700)]
Remove duplicate or unused code in encoder/encodemb.c.
Also make some minor stylistic changes to bring the code closer to
the style guide. Remove distinction between inter and intra transform
functions, since both do exactly the same thing except for the check
against SPLITMV for the second-order transform. Remove some commented
out debug code. Remove 8x8/16x16 transform code in encode_inter16x16y(),
since the first-pass only uses 4x4 anyway.
Ronald S. Bultje [Sat, 13 Oct 2012 05:42:06 +0000 (22:42 -0700)]
Remove duplicate or unused code in encoder/quantize.c.
Also make some minor stylistic changes to bring the code closer to
the style guide. Remove checks against i8x8/bpred in the mb-codepath,
since these do individual block reconstruction and thus don't go through
this codepath.
Ronald S. Bultje [Fri, 12 Oct 2012 01:19:20 +0000 (18:19 -0700)]
Remove reverting of tx-select if only a single txfm-size is used.
Entropy coding takes care of this anyway, and this causes changes to
the txfm size assigned to skip blocks, which can affect the loopfilter
output, thus causing encoder/decoder mismatches.
John Koleszar [Fri, 12 Oct 2012 05:15:33 +0000 (22:15 -0700)]
consolidate update_mb_segmentation_map data
The update_mb_segmentation_map flag was being signalled earlier than
other data dependent on that flag. Consolidate this data so it's
parsed within the same if-scope as the flag is originally parsed in.
Results: derf (vanilla or +hybridtx) +0.2% and (+hybrid16x16
or +tx16x16) +0.7%-0.8%; HD (vanilla or +hybridtx) +0.1-0.2%
and (+hybrid16x16 or +tx16x16) +1.4%, STD/HD (vanilla or +hybridtx)
about even, and (+hybrid16x16 or +tx16x16) +0.8-1.0%.
Paul Wilkins [Fri, 5 Oct 2012 10:16:46 +0000 (11:16 +0100)]
Fix SIMD unsafe use of floating point.
This commit fixes unsafe simd / floating point interactions arising
from the current hybrid and 16x16 transform implementation.
These led to a raft of bugs and issues when the project was
built using VS2008 for Win32 though they did not show up with
the unix builds.
Gerrit makes a meal out of presenting the fix but all I have actually
done is indent the body of each function that uses floating point by
one level and bracket with emms instructions using the function
vp8_clear_system_state(). See below.
    function () {
      vp8_clear_system_state();
      {
        ... function body
      }
      vp8_clear_system_state();
    }
This is almost certainly over the top in terms of number of emms
instructions but is a temporary measure pending implementation of
integer variants of each function to replace the floating point.
Limited testing suggests that this fixes the problems that arose for
Win32 VS2008 when the hybrid or 16x16 transforms were enabled.
Deb Mukherjee [Mon, 10 Sep 2012 05:42:35 +0000 (22:42 -0700)]
Entropy coding for hybrid transform
Separates the entropy coding context models for 4x4, 8x8 and 16x16
ADST variants.
There is a small improvement for HD (hd/std-hd) by about 0.1-0.2%.
Results on derf/yt are about the same, probably because there are not
enough statistics.
Results may improve somewhat once the initial probability tables are
updated for the hybrid transforms which is coming soon.