Paul Wilkins [Fri, 7 Oct 2011 15:58:28 +0000 (16:58 +0100)]
Segmentation Features;
Only encode sign bit for feature data that can have a sign.
Tweaks to the test segmentation rules so that it now actually gives
a net benefit on the derf set of about 0.4% though much higher
on some clips at the low end.
Paul Wilkins [Wed, 5 Oct 2011 10:26:00 +0000 (11:26 +0100)]
Further segment feature extensions.
This quite large check in includes the following:
Merge in some code from Ronald (mbgraph.c) that scans a Gf/arf group.
This is used as a basis for a simple segmentation for the normal frames
in a gf/arf group. This code also uses satd functions from Yaowu.
Adds functionality for coding the latest possible position of an EOB for
blocks in the segment. (Currently 0-15 only, hence just for 4x4 dct).
Where the EOB position is 0 this acts like "skip" and the normal coding
of skip at the per mb level is disabled.
Added functions (seg_common.c) for setting and reading segment feature
elements. These may want to be optimized away at some point but while the
mecahnism is in a state of flux they provide a single location for making
changes and keep things a bit cleaner.
This is still proof of concept code. Currently the tested feature set:-
Yaowu Xu [Sat, 8 Oct 2011 22:48:53 +0000 (15:48 -0700)]
fixed a decoder bug
When 8x8 transform is enabled, the decoder does an extra reconstruct
on MBs that are coded using 8x8. This commit fixed the logic around
the decoding of mb encoded with 8x8 transform.
Paul Wilkins [Fri, 30 Sep 2011 15:45:16 +0000 (16:45 +0100)]
Segment coding of mode and reference frame.
Proof of concept test code that encodes mode and reference
frame data at the segment level.
Decode-able bit stream but some issues not yet resolved.
As it this helps a little on a couple of clips but hurts on most as
the basis for segmentation is unsound.
To build and test, configure with
--enable-experimental --enable-segfeatures
Rd and Rm registers should be different in 'mul'. This register
combination results in unpredictable behaviour. GCC will give
a warning and RVCT an error in this case.
Restriction applies only to armv5 targets and not for armv6 and above.
Stefan Holmer [Tue, 6 Sep 2011 12:34:36 +0000 (14:34 +0200)]
Fix necessary for input partitions iface to match the RTP profile
These changes fixes a glitch between the RTP profile and the input
partitions interface. Since there's no way for the user to know the
actual number of partitions, the decoder have to read the
multi_token_paritition bits also when input partitions mode is
enabled.
Included are also a couple of fixes for issues with independent
partitions and uninitialized memory reads.
Yaowu Xu [Wed, 31 Aug 2011 19:01:58 +0000 (12:01 -0700)]
enable selecting&transmitting to for intra mode entropy
This commit added a 3 bit index to the bitstream, the index is used to
look into the intra mode coding entropy context table. The commit uses
the mode stats to calculate the cost of transmitting modes using 8
possible entropy distributions, and selects the distribution that
provides the lowest cost to do the actual mode coding.
Initial test show this provides additional .2%~.3% gain over quantizer
adaptive intra mode coding. So the adaptive intra mode coding provides
a total of .5%(psnr) to .6% gain(ssim) combined for all-key-encoding
To build and test, configure with
--enable-experimental --enable-qimode
Yaowu Xu [Thu, 4 Aug 2011 23:30:27 +0000 (16:30 -0700)]
add quantizer adaptive intra mb mode encoding
make intra mode coding entropy distribution adaptive to baseQindex, an
encoding test on hd clips with all key frame shows universal gain on
all clips in both .2%(psnr) and (ssim).3%.
To build and test, configure with
--enable-experimental --enable-qimode
Yaowu Xu [Thu, 4 Aug 2011 23:30:27 +0000 (16:30 -0700)]
add 8x8 intra prediction modes
Patch 1 to Patch 3 is an initial implementation of 8x8 intra prediction
modes, here are with the following assumptions:
a. 8x8 has 4 prediction modes DC, H, V and TM
b. UV 4x4 block use the same mode as corresponding 8x8 area
c. i8x8 modes are enabled for key frame only for now
Patch 4:
d. removed debug code from previous patches
Patch 5:
e. added stats code to collect entropy stats and further cleaned up
Patch 6:
f. changed mode stats code to collect finer stats of modes
Patch 7:
g. normalized i8x8 modes distribution to total at 256 (8bits).
Patch 8:
h. fixed a bug in decoder and removed debug printf output.
Patch 9:
i. more cleanups to address paul's comment
Patch 10:
j. messy rebase/merges to bring the commit up to date.
Tests on HD clips encoded with all key frame showing consistent gain
on all clips and all metrics:~0.5%(psnr) and 0.6%(ssim):
http://www.corp.google.com/~yaowu/no_crawl/i8x8hd_allkey_fixedq.html
To build and test, configure with:
--enable-experimental --enable-i8x8
Scott LaVarnway [Wed, 24 Aug 2011 18:42:26 +0000 (14:42 -0400)]
Removed bmi copy to/from BLOCKD
for SPLITMV and B_PRED modes. Modified code to use the bmi
found in mode_info_context instead of BLOCKD. On the decode
side, the uvmvs are calculated only when required, instead of
every macroblock. This is WIP. (bmi should eventually be
removed from BLOCKD)
Small performance gains noticed for RT encodes and decodes.(VGA)
Fritz Koenig [Mon, 22 Aug 2011 22:29:41 +0000 (15:29 -0700)]
Use local labels for jumps/loops in x86 assembly.
Prepend . to local labels in assembly code. This
allows non unique labels within a file. Also
makes profiling information more informative
by keeping the function name with the loop name.
Fritz Koenig [Mon, 22 Aug 2011 19:36:28 +0000 (12:36 -0700)]
Reclassify optimized ssim calculations as SSE2.
Calculations were incorrectly classified as either
SSE3 or SSSE3. Only using SSE2 instructions.
Cleanup function names and make non-RTCD code work
as well.
Fritz Koenig [Fri, 19 Aug 2011 15:51:27 +0000 (08:51 -0700)]
Reclasify optimized ssim calculations as SSE2.
Calculations were incorrectly classified as either
SSE3 or SSSE3. Only using SSE2 instructions.
Cleanup function names and make non-RTCD code work
as well.
Alpha Lam [Tue, 9 Aug 2011 19:59:45 +0000 (20:59 +0100)]
Copy less when active map is in use
When active map is specified and the current frame is not a key frame,
golden frame nor a altref frame then copy only those active regions.
This significantly reduces encoding time by as much as 19% on the test
system where realtime encoding is used. This is particularly useful
when the frame size is large (e.g. 2560x1600) and there's only a few
action macroblocks.
Paul Wilkins [Wed, 17 Aug 2011 13:14:23 +0000 (14:14 +0100)]
Small boost to every other frame.
Instead of a single mid GF boost apply a few extra bits to
every other frame. This gives a very small average metrics
improvement on both derf and YT sets.
John Koleszar [Fri, 12 Aug 2011 18:51:36 +0000 (14:51 -0400)]
Revert "Improved 1-pass CBR rate control"
This reverts commit b5ea2fbc2c1554769848774c836aad262af95072. Further
testing showed noticable keyframe popping in some cases, reverting this
for now to give time for a proper fix.