James Zern [Wed, 30 Jul 2014 02:21:24 +0000 (19:21 -0700)]
Merge "Makefile: add -mstackrealign to CFLAGS on OS/2\r\r This prevents SIGSEGV of test_libvpx.\r\r Change-Id: I788743841469f4141bc8d29b1d1a8683cb00655c\r"
James Zern [Tue, 22 Jul 2014 01:27:58 +0000 (18:27 -0700)]
vp9_cx_iface: defer compressed data buffer alloc
currently the only way to know if multiple alt-refs are enabled is to
inspect the encoder instance.
this reduces the size of the allocation by 75% when not using multiple
alt-refs
Jingning Han [Fri, 25 Jul 2014 23:43:27 +0000 (16:43 -0700)]
Use frame index directly in get_chessboard_index
The get_chessboard_index() used to call the entire VP9_COMMON
struct pointer to retrieve the chessboard pattern index. This cl
makes it call the frame index directly.
Pengchong Jin [Mon, 28 Jul 2014 23:04:36 +0000 (16:04 -0700)]
Remove the redundant index computation in the first pass
Remove the redundant index computation when store the first
pass block-wise statistics. Currently, a single byte is
allocated for a 16x16 blocks, and all the frame statistics
saved during the first pass will be kept in memory for use
in the second pass. For a 1920x1080 300-frame clip, it will
take about 2.3 MB memory. This feature is off in current
setting.
Jim Bankoski [Mon, 28 Jul 2014 15:37:25 +0000 (08:37 -0700)]
Fix reference frame size restrictions.
The issue was introduced by commit g9f37d14 with adding explicit
restrictions on reference-frame scale factors. The restriction
is checked against aligned-by-8 frame dimensions, not against
original ones. So, for example, frame of 35×35 actually can refer
to frame of 70×70, but the new check won't allow this. It will
compare 35 vs 72 (not 70), so 2x downscale limit will be exceeded.
Jingning Han [Fri, 25 Jul 2014 14:08:23 +0000 (07:08 -0700)]
Fix rd_pick_partition search loop for 4x4 blocks
The partition search for 4x4 blocks takes unnecessary steps to
reconstruct pixels and an extra partition type update. This commit
removes such operations. No visible compression/speed difference.
Thanks to Yue (yuec@) for finding this issue.
Johann [Thu, 24 Jul 2014 16:32:01 +0000 (09:32 -0700)]
Remove neon version of vp8 extend borders
The code fails the unit test. Speed comparisons to the C are invalid
because the code frequently didn't correctly extend the right and
bottom portions of the frame.
Reduce maximum frame size on ARM devices to avoid OOM
Jingning Han [Thu, 24 Jul 2014 22:31:32 +0000 (15:31 -0700)]
Fix potential ioc issue in vp9_get_prob for 4K above sizes
This commit turns on the existing vp9_get_prob function using
64 bit in the intermediate step. It fixes the ioc issue for 4K
above frame sizes (issue 828).
Johann [Wed, 16 Jul 2014 09:30:21 +0000 (02:30 -0700)]
Set and use uv_crop_[width|height]
Ensure consistent border extension by rounding uv_crop_* at image
creation time. Where it was rounded problems could arise with the right
and bottom extensions.
When padding = 32, y_width = 64, and y_crop_width = 63:
(padding + width - crop_width + 1) / 2
32 + 64 - 63 + 1 should equal 32 *but*
32 + 1 + 1 equals 34 giving a right buffer of 17 instead of 16.
By calculating uv_crop_* earlier we round up at the appropriate time and
for the same values:
(y_crop_width + 1) / 2
63 + 1 / 2
64
(padding / 2) + uv_width - uv_crop_width
16 + 16 - 16
16
A previous change, https://gerrit.chromium.org/gerrit/#/c/70632,
introduced a size validation for reference frames to insuare the
input stream is a valid VP9 stream. However, the logic requiring
all reference frames have valid size turned out to be too strict.
In this commit, we modify the validation to require one of the
reference frame has valid dimension. In addition, the decoder
reports error whenever it detects the use of reference frame
with invalid scalig ratio.
James Zern [Thu, 24 Jul 2014 21:55:19 +0000 (14:55 -0700)]
rtcd.pl: check for auto_help availability
'auto_help' was added to Getopt::Long in 2.33
this isn't strictly necessary as an unrecognized option (--help) will
issue a warning and then print the usage
Adrian Grange [Thu, 24 Jul 2014 20:37:47 +0000 (13:37 -0700)]
Fix allocation of context buffers on frame resize
The patch:
https://gerrit.chromium.org/gerrit/#/c/70814/
changed the test that determined whether the context
frame buffers needed to be reallocated or not.
The code checked for a change in total frame area
to signal the need to reallocate context buffers.
However, the above_context buffer needs to be
resized i:xf only the width of the frame has increased.
Jingning Han [Wed, 23 Jul 2014 19:02:52 +0000 (12:02 -0700)]
Remove redundant argument entry in handle_inter_mode
The value of mode_excluded has been properly set in
vp9_rd_pick_inter_mode_sb(). It is redundant to send it in
handle_inter_mode() and re-set the value again.
Jingning Han [Wed, 23 Jul 2014 18:47:56 +0000 (11:47 -0700)]
Use the chessboard pattern pred search in newmv mode
This commit extends the chessboard pattern prediction filter search.
If the above and left blocks have the same prediction filter type,
the encoder will skip the prediction filter type search and use the
reference one.
The overall chessboard pattern prediction filter type search reduces
speed 3 runtime for hard clips. Experiments on park joy at 1080p
and 15000 kbps show that the runtime goes from 723265 ms to 65832 ms,
i.e., about 10% speed-up. Compression performance wise, it affects
the coding quality by
Jingning Han [Tue, 22 Jul 2014 23:32:20 +0000 (16:32 -0700)]
Enable chessboard inter prediction filter type search
This commit enables a chessboard pattern prediction filter type
search scheme for rate-distortion optimization speed-up. For the
inferred motion vector modes, the encoder can re-use its above/left
neighbor blocks' prediction filter type and skip a full test on
all possible filter types. Such operation is turned on/off
alternatively in a chessboard manner.
It is turned on in speed 3. For test clip pedestrian 1080p, the
runtime is reduced from 231500 ms -> 221700 ms. The compression
performance is changed:
derf: -0.147%
yt: -0.134%
hd: -0.079%
stdhd: -0.220%
VP9FrameSizeTestsLarge exposed an integer overflow in the VP9 encoder,
for now reduce the size to allow the tests to clear and prevent further
regressions.
4096x4096 -> 4096x2160
this should be restored after the bug is fixed:
https://code.google.com/p/webm/issues/detail?id=828
Tim Kopp [Tue, 22 Jul 2014 21:00:11 +0000 (14:00 -0700)]
VP9 denoiser bugfix in debugging code.
When OUTPUT_YUV_DENOISED is enabled the encoder outputs the uncompressed,
denoised video to a separate file. Moved the point at which the file is
written to in order to avoid an extra blank frame at the beginning of the video.
Marco Paniconi [Tue, 22 Jul 2014 18:06:00 +0000 (11:06 -0700)]
vp8: Set default denoiser_decision to copy for UV channel.
Since the UV decision to denoise is based on Y, we need to set
the default/initial denoiser decision_u/v to COPY_BLOCK,
to make sure if no uv_denoiser is applied we still update
(uv)running_avg with source.
Marco Paniconi [Mon, 21 Jul 2014 16:58:54 +0000 (09:58 -0700)]
vp8 denoiser fix: Update denoised altref on key frame.
On a key frame, the denoised-running_avg for all references
frames should be updated with the source.
The altref denoised-running_avg was not being updated on key frame,
this fixes that.
Jingning Han [Mon, 21 Jul 2014 23:22:56 +0000 (16:22 -0700)]
Turn on adaptive pred filter scheme for sub8x8 below 720p
For sequences of resolution below 720p, the encoder will check
intra prediction modes and inter prediction modes from LAST_FRAME.
This commit turns on adaptive prediction filter scheme for sub8x8
blocks, where inter prediction modes are enabled. For the test
sequence bus at CIF, the speed 2 runtime goes down from 17879 ms
to 16783 ms, i.e., 6% speed up. The compression performance of
derf set is down by -0.128%.
Moved call to vp9_clear_system_state() to a proper location
The commit moved a call to vp9_clear_system_state() to a correct
location, i.e. prior function calls using floating point numbers.
This was to fix a mismatch mmx code and sse2 version, where a
floating point number used in adjust_frame_rate(cpi) gets NAN due
to mmx registers being in wrong state.