Pengchong Jin [Tue, 5 Aug 2014 21:39:06 +0000 (14:39 -0700)]
Directly split the block in partition search
This patch allows the encoder to split the block directly
in partition search and thereby skip searching NONE. It
computes a score that measures whether the 16x16 motion vectors
from the first pass in the current block are consistent with
each other. If they are inconsistent and we have enough Q
to encode, split the block directly and skip searching NONE.
This feature is under the flag CONFIG_FP_MB_STATS. In speed 2,
it gives a further speedup of 3-8% on sample yt clips
compared to the previous version under the same flag. Overall,
the features under the flag give a 7-15% speedup on typical yt
clips at up to 6000kbps data rate. The speedup at very high
data rates is not significant.
For hard stdhd clips:
park_joy_1080p @ 15000kbps: 504541ms -> 506293ms (-0.35%)
pedestrian_area_1080p @ 2000kbps: 326610ms -> 290090ms (+11.2%)
The compression performance using the features under the flag:
derf: -0.068%
yt: -0.189%
hd: -0.318%
stdhd: -0.183%
To use the feature, set CONFIG_FP_MB_STATS and turn on
cpi->use_fp_mb_stats.
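The direct-split decision described above can be sketched roughly as follows. This is only an illustration, not the code in the patch: the Mv16 type, the L1-distance score, the threshold parameter, and the enough_q flag are all assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical first-pass motion vector for one 16x16 block. */
typedef struct {
  int row;
  int col;
} Mv16;

/* Assumed score: sum of L1 distances between every vector and the first
 * one. A score of 0 means all vectors in the block are identical. */
static int mv_inconsistency_score(const Mv16 *mvs, int count) {
  int score = 0;
  assert(count > 0);
  for (int i = 1; i < count; ++i) {
    score += abs(mvs[i].row - mvs[0].row) + abs(mvs[i].col - mvs[0].col);
  }
  return score;
}

/* Split directly (and skip NONE) only when the vectors disagree by more
 * than the hypothetical threshold and the caller reports enough Q. */
static int should_split_directly(const Mv16 *mvs, int count, int enough_q,
                                 int threshold) {
  return enough_q && mv_inconsistency_score(mvs, count) > threshold;
}
```

A caller would gather the first-pass vectors covered by the current block, then consult should_split_directly() before deciding whether to search NONE.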
Tom Finegan [Thu, 31 Jul 2014 05:05:48 +0000 (22:05 -0700)]
vp9_spatial_svc_encoder.sh: Disable existing tests, add a test that works.
- vp9_spatial_svc_encoder.c no longer supports the -m parameter that
has been used in the example test. Tests using -m have been disabled.
- Added a basic test that appears to work as of commit 3249f26ff85e2bfe148167ce80df53643a89a2d2.
- Minor style clean up.
Tom Finegan [Thu, 31 Jul 2014 03:47:55 +0000 (20:47 -0700)]
vpxdec.sh: Refactor vpxdec().
- Split vpxdec wrapper function into vpxdec() and vpxdec_pipe().
- Remove hard coded --noblit and --summary arguments from
the wrappers in favor of shifting off the first argument (the
input file) and passing all remaining parameters to vpxdec.
- Add --noblit and --summary args to existing tests, and update the
pipe input test to use vpxdec_pipe().
Pengchong Jin [Wed, 30 Jul 2014 02:49:47 +0000 (19:49 -0700)]
Early termination after partition NONE is done in RD.
This patch allows the encoder to skip the search for partition
SPLIT, HORZ, VERT after the search for partition NONE is done
in RD optimization. It uses the first pass block-wise statistics
to make the decision. If all 16x16 blocks in the current partition
have zero motion and small residues in the first pass statistics,
and the partition has a small difference variance, further
partition search is skipped.
For speed 2 setting, experiments on general youtube clips show that
the speedup varies from 1% - 10%, 5% on average. On the performance
side in PSNR, derf 0.004%, yt -0.059%, hd -0.106%, stdhd 0.032%.
For hard stdhd clips:
park_joy_1080p, 502952 ms -> 503307 ms (-0.07%)
pedestrian_area_1080p, 227049 ms -> 220531 ms (+3%)
This feature is under the compilation flag CONFIG_FP_MB_STATS and
it is off in current setting.
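A minimal sketch of the early-termination test described above, assuming one stats byte per 16x16 block. The flag bit values, the function name, and the variance threshold are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits within the per-block first-pass stats byte. */
enum { SK_MOTION_ZERO = 0x01, SK_ERROR_SMALL = 0x02 };

/* Returns 1 when the SPLIT/HORZ/VERT search can be skipped: every 16x16
 * block shows zero motion and a small residue, and the difference
 * variance of the partition is below the (assumed) threshold. */
static int can_skip_further_partitions(const uint8_t *stats, int count,
                                       unsigned diff_var,
                                       unsigned var_thresh) {
  const uint8_t need = SK_MOTION_ZERO | SK_ERROR_SMALL;
  assert(count > 0);
  if (diff_var >= var_thresh) return 0;
  for (int i = 0; i < count; ++i) {
    if ((stats[i] & need) != need) return 0;
  }
  return 1;
}
```

The check is deliberately conservative: a single block with motion or a large residue keeps the full partition search enabled.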
Jingning Han [Tue, 29 Jul 2014 16:50:03 +0000 (09:50 -0700)]
Chessboard pattern partition search
This commit enables a chessboard pattern constrained partition
search for 720p and above resolutions. The scheme applies stricter
partition search to alternate blocks based on their above/left
neighboring blocks' partition range, as well as that of the
collocated blocks in the previous frame. It is currently turned
on at 16x16 block size level. The chessboard pattern is flipped
per coding frame.
The speed 3 runtime is reduced:
park_joy_1080p, 652832 ms -> 607738 ms (7% speed-up)
pedestrian_area_1080p, 215998 ms -> 200589 ms (8% speed-up)
The compression performance is changed:
hd -0.223%
stdhd -0.295%
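The alternating pattern can be illustrated with the sketch below. It assumes block coordinates in units of 16x16 blocks and assumes that blocks whose parity matches the frame's pattern index are the constrained ones; both helper names and that parity choice are hypothetical.

```c
/* Pattern index alternates between 0 and 1 on successive frames, which
 * flips the chessboard per coding frame. */
static int chessboard_index(int frame_index) { return frame_index & 1; }

/* Returns 1 if the block at (block_row, block_col) falls on the
 * constrained color of the chessboard for this frame. */
static int is_constrained_block(int block_row, int block_col,
                                int frame_index) {
  return ((block_row + block_col) & 1) == chessboard_index(frame_index);
}
```

Because the index flips each frame, a block that gets the stricter search in one frame gets the regular search in the next.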
James Zern [Wed, 30 Jul 2014 02:21:24 +0000 (19:21 -0700)]
Merge "Makefile: add -mstackrealign to CFLAGS on OS/2"
This prevents SIGSEGV of test_libvpx.
Change-Id: I788743841469f4141bc8d29b1d1a8683cb00655c
James Zern [Tue, 22 Jul 2014 01:27:58 +0000 (18:27 -0700)]
vp9_cx_iface: defer compressed data buffer alloc
Currently the only way to know if multiple alt-refs are enabled is to
inspect the encoder instance.
Deferring the allocation reduces its size by 75% when not using
multiple alt-refs.
Johann [Tue, 29 Jul 2014 18:28:23 +0000 (11:28 -0700)]
Require armv6/media when building armv7
When building with runtime CPU detection, assume that armv7 targets
can be relied upon to have at least armv6 support. This may allow
dead code detectors to remove some _c functions.
Jingning Han [Fri, 25 Jul 2014 23:43:27 +0000 (16:43 -0700)]
Use frame index directly in get_chessboard_index
The get_chessboard_index() function used to take the entire
VP9_COMMON struct pointer to retrieve the chessboard pattern index.
This CL makes it take the frame index directly.
Pengchong Jin [Mon, 28 Jul 2014 23:04:36 +0000 (16:04 -0700)]
Remove the redundant index computation in the first pass
Remove the redundant index computation when storing the first
pass block-wise statistics. Currently, a single byte is
allocated for each 16x16 block, and all the frame statistics
saved during the first pass will be kept in memory for use
in the second pass. For a 1920x1080 300-frame clip, it will
take about 2.3 MB memory. This feature is off in current
setting.
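The memory estimate above can be checked with a short calculation. The sketch assumes one byte per 16x16 block and frame dimensions rounded up to whole blocks; the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed to keep one stats byte per 16x16 block for every frame.
 * Partial blocks at the right and bottom edges count as whole blocks. */
static size_t fp_mb_stats_bytes(int width, int height, int frames) {
  assert(width > 0 && height > 0 && frames > 0);
  const size_t mb_cols = (size_t)(width + 15) / 16;
  const size_t mb_rows = (size_t)(height + 15) / 16;
  return mb_cols * mb_rows * (size_t)frames;
}
```

For 1920x1080 this gives 120 x 68 = 8160 blocks per frame, so 300 frames need 2,448,000 bytes, which is about 2.3 MB.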
Jim Bankoski [Mon, 28 Jul 2014 15:37:25 +0000 (08:37 -0700)]
Fix reference frame size restrictions.
The issue was introduced by commit g9f37d14, which added explicit
restrictions on reference-frame scale factors. The restriction
is checked against aligned-by-8 frame dimensions, not against
the original ones. So, for example, a 35×35 frame can actually refer
to a 70×70 frame, but the new check won't allow this. It will
compare 35 vs 72 (not 70), so the 2x downscale limit is exceeded.
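The 35 vs 72 comparison can be reproduced with the sketch below. It models only the 2x downscale limit mentioned above, and both helper names are hypothetical rather than the encoder's own.

```c
/* Round a dimension up to the next multiple of 8. */
static int align_to_8(int v) { return (v + 7) & ~7; }

/* Assumed form of the limit: the reference may be at most twice as
 * large as the current frame in the given dimension. */
static int within_2x_downscale(int ref_dim, int cur_dim) {
  return ref_dim <= 2 * cur_dim;
}
```

With the original dimensions, within_2x_downscale(70, 35) passes. With the aligned reference dimension, align_to_8(70) is 72 and the same check against 35 fails, which is the rejection described in this commit.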