Tom Finegan [Thu, 31 Jul 2014 05:05:48 +0000 (22:05 -0700)]
vp9_spatial_svc_encoder.sh: Disable existing tests, add a test that works.
- vp9_spatial_svc_encoder.c no longer supports the -m parameter that
has been used in the example test. Tests using -m have been disabled.
- Added a basic test that appears to work as of commit 3249f26ff85e2bfe148167ce80df53643a89a2d2.
- Minor style clean up.
Tom Finegan [Thu, 31 Jul 2014 03:47:55 +0000 (20:47 -0700)]
vpxdec.sh: Refactor vpxdec().
- Split vpxdec wrapper function into vpxdec() and vpxdec_pipe().
- Remove hard coded --noblit and --summary arguments from
the wrappers in favor of shifting off the first argument (the
input file) and passing all remaining parameters to vpxdec.
- Add --noblit and --summary args to existing tests, and update the
pipe input test to use vpxdec_pipe().
Pengchong Jin [Wed, 30 Jul 2014 02:49:47 +0000 (19:49 -0700)]
Early termination after partition NONE is done in RD.
This patch allows the encoder to skip the search for partition
SPLIT, HORZ, VERT after the search for partition NONE is done
in RD optimization. It uses the first pass block-wise statistics
to make the decision. If all 16x16 blocks in the current partition
have zero motion and small residues from the first pass statistics,
and it has small difference variance, further partition search is
skipped.
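The decision described above can be sketched in C as follows. This is only an illustration: the flag names, the one-byte-per-16x16-block layout and the variance threshold are assumptions, not the encoder's actual CONFIG_FP_MB_STATS code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-16x16-block flags taken from first pass statistics. */
#define FP_ZERO_MOTION 0x01u
#define FP_SMALL_RESIDUE 0x02u

/* Returns 1 when the SPLIT/HORZ/VERT search may be skipped after NONE:
 * every 16x16 block in the partition has zero motion and a small residue,
 * and the difference variance is below a (hypothetical) threshold. */
static int skip_further_partition_search(const uint8_t *stats, int stride,
                                         int rows, int cols,
                                         unsigned int diff_variance,
                                         unsigned int variance_thresh) {
  int r, c;
  assert(stats != 0 && rows > 0 && cols > 0 && stride >= cols);
  if (diff_variance >= variance_thresh) return 0;
  for (r = 0; r < rows; ++r) {
    for (c = 0; c < cols; ++c) {
      const uint8_t s = stats[r * stride + c];
      if (!(s & FP_ZERO_MOTION) || !(s & FP_SMALL_RESIDUE)) return 0;
    }
  }
  return 1;
}
```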
For speed 2 setting, experiments on general youtube clips show that
the speedup varies from 1% - 10%, 5% on average. On the performance
side in PSNR, derf 0.004%, yt -0.059%, hd -0.106%, stdhd 0.032%.
For hard stdhd clips:
park_joy_1080p, 502952 ms -> 503307 ms (-0.07%)
pedestrian_area_1080p, 227049 ms -> 220531 ms (+3%)
This feature is under the compilation flag CONFIG_FP_MB_STATS and
it is off in current setting.
Jingning Han [Tue, 29 Jul 2014 16:50:03 +0000 (09:50 -0700)]
Chessboard pattern partition search
This commit enables a chessboard pattern constrained partition
search for 720p and above resolutions. The scheme applies stricter
partition search to alternate blocks based on their above/left
neighboring blocks' partition range, as well as that of the
collocated blocks in the previous frame. It is currently turned
on at 16x16 block size level. The chessboard pattern is flipped
per coding frame.
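A minimal sketch of how such a chessboard selection could look. The function names and the choice of which parity receives the stricter search are assumptions made for illustration, and block positions are taken in units of 16x16 blocks.

```c
#include <assert.h>

/* The pattern is flipped on every coded frame, so only the frame parity
 * matters. */
static int chessboard_index(int frame_index) { return frame_index & 1; }

/* Returns 1 if the block at (block_row, block_col) gets the constrained
 * partition search in this frame, 0 if it gets the regular search. */
static int use_constrained_search(int frame_index, int block_row,
                                  int block_col) {
  assert(frame_index >= 0 && block_row >= 0 && block_col >= 0);
  return ((block_row + block_col + chessboard_index(frame_index)) & 1) != 0;
}
```

Neighbouring blocks always land on opposite colours, and a given block changes colour from one frame to the next.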
The speed 3 runtime is reduced:
park_joy_1080p, 652832 ms -> 607738 ms (7% speed-up)
pedestrian_area_1080p, 215998 ms -> 200589 ms (8% speed-up)
The compression performance is changed:
hd -0.223%
stdhd -0.295%
James Zern [Wed, 30 Jul 2014 02:21:24 +0000 (19:21 -0700)]
Merge "Makefile: add -mstackrealign to CFLAGS on OS/2"
This prevents SIGSEGV of test_libvpx.
Change-Id: I788743841469f4141bc8d29b1d1a8683cb00655c
James Zern [Tue, 22 Jul 2014 01:27:58 +0000 (18:27 -0700)]
vp9_cx_iface: defer compressed data buffer alloc
currently the only way to know if multiple alt-refs are enabled is to
inspect the encoder instance.
this reduces the size of the allocation by 75% when not using multiple
alt-refs
Jingning Han [Fri, 25 Jul 2014 23:43:27 +0000 (16:43 -0700)]
Use frame index directly in get_chessboard_index
get_chessboard_index() used to take the entire VP9_COMMON struct
pointer to retrieve the chessboard pattern index. This CL makes it
take the frame index directly.
Pengchong Jin [Mon, 28 Jul 2014 23:04:36 +0000 (16:04 -0700)]
Remove the redundant index computation in the first pass
Remove the redundant index computation when storing the first
pass block-wise statistics. Currently, a single byte is
allocated for each 16x16 block, and all the frame statistics
saved during the first pass will be kept in memory for use
in the second pass. For a 1920x1080 300-frame clip, it will
take about 2.3 MB memory. This feature is off in current
setting.
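The memory figure can be checked with a short calculation. The helper below is hypothetical and assumes that partial 16x16 blocks at the frame edge are rounded up to a whole block.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed to keep one byte per 16x16 block for every frame. */
static size_t fp_mb_stats_bytes(int width, int height, int frames) {
  size_t cols, rows;
  assert(width > 0 && height > 0 && frames > 0);
  cols = ((size_t)width + 15) / 16;  /* 1920 -> 120 */
  rows = ((size_t)height + 15) / 16; /* 1080 -> 68 (67.5 rounded up) */
  return cols * rows * (size_t)frames;
}
```

For 1920x1080 and 300 frames this gives 120 * 68 * 300 = 2448000 bytes, which is about 2.3 MB.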
Jim Bankoski [Mon, 28 Jul 2014 15:37:25 +0000 (08:37 -0700)]
Fix reference frame size restrictions.
The issue was introduced by commit g9f37d14, which added explicit
restrictions on reference-frame scale factors. The restriction
is checked against aligned-by-8 frame dimensions, not against the
original ones. So, for example, a 35×35 frame can actually refer
to a 70×70 frame, but the new check won't allow this. It will
compare 35 vs 72 (not 70), so the 2x downscale limit will be exceeded.
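The 35 vs 72 example can be reproduced with the sketch below. Both helpers are hypothetical and model only the 2x downscale limit mentioned here, not the full set of scale-factor checks.

```c
#include <assert.h>

/* Round a dimension up to the next multiple of 8. */
static int align8(int v) { return (v + 7) & ~7; }

/* Returns 1 if the reference dimension is at most twice the current one. */
static int within_2x_downscale(int ref_dim, int cur_dim) {
  assert(ref_dim > 0 && cur_dim > 0);
  return ref_dim <= 2 * cur_dim;
}
```

With the original dimensions, within_2x_downscale(70, 35) passes. With the aligned reference dimension, within_2x_downscale(align8(70), 35) compares 72 against 70 and fails.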
Jingning Han [Fri, 25 Jul 2014 14:08:23 +0000 (07:08 -0700)]
Fix rd_pick_partition search loop for 4x4 blocks
The partition search for 4x4 blocks takes unnecessary steps to
reconstruct pixels and an extra partition type update. This commit
removes such operations. No visible compression/speed difference.
Thanks to Yue (yuec@) for finding this issue.
Johann [Thu, 24 Jul 2014 16:32:01 +0000 (09:32 -0700)]
Remove neon version of vp8 extend borders
The code fails the unit test. Speed comparisons to the C are invalid
because the code frequently didn't correctly extend the right and
bottom portions of the frame.
Reduce maximum frame size on ARM devices to avoid OOM
Jingning Han [Thu, 24 Jul 2014 22:31:32 +0000 (15:31 -0700)]
Fix potential ioc issue in vp9_get_prob for 4K above sizes
This commit turns on the existing vp9_get_prob function that uses
64-bit arithmetic in the intermediate step. It fixes the ioc issue
for frame sizes above 4K (issue 828).
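A sketch of why the intermediate step needs 64 bits, assuming the probability is computed as num * 256 / den with rounding. The names get_prob_64 and clip_prob_sketch are illustrative and are not the library's own functions.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a probability to the range [1, 255]. */
static uint8_t clip_prob_sketch(int64_t p) {
  return (uint8_t)(p > 255 ? 255 : (p < 1 ? 1 : p));
}

/* num * 256 exceeds INT32_MAX once num is larger than about 8.3 million,
 * which large frames can reach, so the product is formed in 64 bits. */
static uint8_t get_prob_64(unsigned int num, unsigned int den) {
  if (den == 0) return 128;
  assert(num <= den);
  return clip_prob_sketch(((int64_t)num * 256 + (den >> 1)) / den);
}
```

For num = 10000000 and den = 20000000 the product is 2560000000, which does not fit in a signed 32-bit integer; the 64-bit version still returns 128.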
Johann [Wed, 16 Jul 2014 09:30:21 +0000 (02:30 -0700)]
Set and use uv_crop_[width|height]
Ensure consistent border extension by rounding uv_crop_* at image
creation time. Where it was rounded previously, problems could arise
with the right and bottom extensions.
When padding = 32, y_width = 64, and y_crop_width = 63:
(padding + width - crop_width + 1) / 2
32 + 64 - 63 + 1 should equal 32 *but*
32 + 1 + 1 equals 34 giving a right buffer of 17 instead of 16.
By calculating uv_crop_* earlier we round up at the appropriate time and
for the same values:
(y_crop_width + 1) / 2
(63 + 1) / 2
32
(padding / 2) + uv_width - uv_crop_width
16 + 32 - 32
16
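The two ways of computing the right chroma border can be compared directly. The helper names are hypothetical, and 4:2:0 subsampling with an even y_width is assumed, so uv_width is y_width / 2.

```c
#include <assert.h>

/* Old approach: combine luma padding and crop slack, then halve. */
static int right_border_old(int padding, int y_width, int y_crop_width) {
  assert(padding >= 0 && y_crop_width > 0 && y_width >= y_crop_width);
  return (padding + y_width - y_crop_width + 1) / 2;
}

/* New approach: round uv_crop_width up first, then work in chroma units. */
static int right_border_new(int padding, int y_width, int y_crop_width) {
  int uv_width, uv_crop_width;
  assert(padding >= 0 && y_crop_width > 0 && y_width >= y_crop_width);
  uv_width = y_width / 2;
  uv_crop_width = (y_crop_width + 1) / 2;
  return padding / 2 + uv_width - uv_crop_width;
}
```

For padding = 32, y_width = 64 and y_crop_width = 63, the old form gives 17 and the new form gives 16.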
A previous change, https://gerrit.chromium.org/gerrit/#/c/70632,
introduced a size validation for reference frames to ensure the
input stream is a valid VP9 stream. However, the logic requiring
all reference frames to have a valid size turned out to be too strict.
In this commit, we modify the validation to require that one of the
reference frames has valid dimensions. In addition, the decoder
reports an error whenever it detects the use of a reference frame
with an invalid scaling ratio.
James Zern [Thu, 24 Jul 2014 21:55:19 +0000 (14:55 -0700)]
rtcd.pl: check for auto_help availability
'auto_help' was added to Getopt::Long in 2.33
this isn't strictly necessary as an unrecognized option (--help) will
issue a warning and then print the usage
Adrian Grange [Thu, 24 Jul 2014 20:37:47 +0000 (13:37 -0700)]
Fix allocation of context buffers on frame resize
The patch:
https://gerrit.chromium.org/gerrit/#/c/70814/
changed the test that determined whether the context
frame buffers needed to be reallocated or not.
The code checked for a change in total frame area
to signal the need to reallocate context buffers.
However, the above_context buffer needs to be
resized if only the width of the frame has increased.
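A small sketch of the failure mode, using hypothetical helper names. A test based on frame area misses a resize that keeps the area constant while making the frame wider.

```c
#include <assert.h>

/* Area-based test, modelled here as "the total frame area changed". */
static int area_changed(int old_w, int old_h, int new_w, int new_h) {
  assert(old_w > 0 && old_h > 0 && new_w > 0 && new_h > 0);
  return new_w * new_h != old_w * old_h;
}

/* The above_context buffer spans the frame width, so it must grow
 * whenever the width grows, regardless of the height. */
static int above_context_needs_resize(int old_w, int new_w) {
  assert(old_w > 0 && new_w > 0);
  return new_w > old_w;
}
```

Going from 64x32 to 128x16 leaves the area unchanged, so area_changed() reports 0 even though above_context_needs_resize() reports 1.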
Jingning Han [Wed, 23 Jul 2014 19:02:52 +0000 (12:02 -0700)]
Remove redundant argument entry in handle_inter_mode
The value of mode_excluded has been properly set in
vp9_rd_pick_inter_mode_sb(). It is redundant to send it in
handle_inter_mode() and re-set the value again.
Jingning Han [Wed, 23 Jul 2014 18:47:56 +0000 (11:47 -0700)]
Use the chessboard pattern pred search in newmv mode
This commit extends the chessboard pattern prediction filter search.
If the above and left blocks have the same prediction filter type,
the encoder will skip the prediction filter type search and use the
reference one.
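The neighbour test can be sketched as below. The function name, the availability flags and the integer filter codes are assumptions for illustration only.

```c
#include <assert.h>

/* Returns 1 and stores the shared filter type in *filter when the above
 * and left blocks both exist and agree, so the filter search can be
 * skipped. Returns 0 and leaves *filter untouched otherwise. */
static int reuse_neighbor_filter(int have_above, int above_filter,
                                 int have_left, int left_filter,
                                 int *filter) {
  assert(filter != 0);
  if (have_above && have_left && above_filter == left_filter) {
    *filter = above_filter;
    return 1;
  }
  return 0;
}
```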
The overall chessboard pattern prediction filter type search reduces
speed 3 runtime for hard clips. Experiments on park joy at 1080p
and 15000 kbps show that the runtime goes from 723265 ms to 65832 ms,
i.e., about 10% speed-up. Compression performance wise, it affects
the coding quality by