Johann [Thu, 24 Jul 2014 16:32:01 +0000 (09:32 -0700)]
Remove neon version of vp8 extend borders
The code fails the unit test. Speed comparisons to the C code are invalid
because the neon code frequently didn't correctly extend the right and
bottom portions of the frame.
Reduce maximum frame size on ARM devices to avoid OOM
Johann [Wed, 16 Jul 2014 09:30:21 +0000 (02:30 -0700)]
Set and use uv_crop_[width|height]
Ensure consistent border extension by rounding uv_crop_* at image
creation time. Where it was rounded previously, problems could arise
with the right and bottom extensions.
When padding = 32, y_width = 64, and y_crop_width = 63:
(padding + width - crop_width + 1) / 2
32 + 64 - 63 + 1 should equal 32 *but*
32 + 1 + 1 equals 34 giving a right buffer of 17 instead of 16.
By calculating uv_crop_* earlier we round up at the appropriate time and
for the same values:
(y_crop_width + 1) / 2
(63 + 1) / 2
32
(padding / 2) + uv_width - uv_crop_width
16 + 32 - 32
16
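The rounding difference described in this message can be checked with a small sketch. The helper names are hypothetical and not taken from libvpx, and the sketch assumes 4:2:0 subsampling, so each chroma dimension is the luma dimension halved and rounded up.

```c
/* Hypothetical helpers (not libvpx API) reproducing the arithmetic in the
 * commit message. Assumes 4:2:0 subsampling. */

/* Old approach: rounding happens while the border is being computed. */
static int uv_right_border_late_rounding(int padding, int y_width,
                                         int y_crop_width) {
  return (padding + y_width - y_crop_width + 1) / 2;
}

/* New approach: uv_crop_width is rounded up first, then the border is
 * derived from the already-rounded chroma sizes. */
static int uv_right_border_early_rounding(int padding, int y_width,
                                          int y_crop_width) {
  const int uv_width = (y_width + 1) / 2;
  const int uv_crop_width = (y_crop_width + 1) / 2;
  return (padding / 2) + uv_width - uv_crop_width;
}
```

For padding = 32, y_width = 64, and y_crop_width = 63, the first helper yields 17 and the second yields 16, matching the values quoted in the message.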
Jingning Han [Wed, 23 Jul 2014 19:02:52 +0000 (12:02 -0700)]
Remove redundant argument entry in handle_inter_mode
The value of mode_excluded has been properly set in
vp9_rd_pick_inter_mode_sb(). It is redundant to send it in
handle_inter_mode() and re-set the value again.
Jingning Han [Wed, 23 Jul 2014 18:47:56 +0000 (11:47 -0700)]
Use the chessboard pattern pred search in newmv mode
This commit extends the chessboard pattern prediction filter search.
If the above and left blocks have the same prediction filter type,
the encoder will skip the prediction filter type search and use the
reference one.
The overall chessboard pattern prediction filter type search reduces
speed 3 runtime for hard clips. Experiments on park joy at 1080p
and 15000 kbps show that the runtime goes from 723265 ms to 65832 ms,
i.e., about 10% speed-up. Compression performance wise, it affects
the coding quality by
Jingning Han [Tue, 22 Jul 2014 23:32:20 +0000 (16:32 -0700)]
Enable chessboard inter prediction filter type search
This commit enables a chessboard pattern prediction filter type
search scheme for rate-distortion optimization speed-up. For the
inferred motion vector modes, the encoder can re-use its above/left
neighbor blocks' prediction filter type and skip a full test on
all possible filter types. This shortcut is turned on and off
alternately in a chessboard manner.
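A minimal sketch of the chessboard idea follows. It is not the libvpx implementation: the function names, the FILTER_UNKNOWN sentinel, the choice of which cells run the full search, and the fallback when the neighbours disagree are all assumptions made for illustration.

```c
/* Hypothetical sketch, not the libvpx code. */
enum { FILTER_UNKNOWN = -1 };

/* Cells where (row + col) is odd always run the full filter search;
 * which parity searches is an arbitrary assumption. */
static int is_full_search_cell(int mi_row, int mi_col) {
  return (mi_row + mi_col) & 1;
}

/* Returns a filter type to reuse, or FILTER_UNKNOWN when the caller
 * should test all filter types. Reuse is only attempted on the
 * "shortcut" cells, and only when both neighbours agree (assumption). */
static int pick_filter_shortcut(int mi_row, int mi_col, int above_filter,
                                int left_filter) {
  if (is_full_search_cell(mi_row, mi_col)) return FILTER_UNKNOWN;
  if (above_filter != FILTER_UNKNOWN && above_filter == left_filter)
    return above_filter;
  return FILTER_UNKNOWN;
}
```

Because neighbouring cells alternate between searching and reusing, a block that reuses a filter type always has neighbours that were allowed a full search.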
It is turned on in speed 3. For test clip pedestrian 1080p, the
runtime is reduced from 231500 ms to 221700 ms. The compression
performance is changed:
derf: -0.147%
yt: -0.134%
hd: -0.079%
stdhd: -0.220%
VP9FrameSizeTestsLarge exposed an integer overflow in the VP9 encoder,
for now reduce the size to allow the tests to clear and prevent further
regressions.
4096x4096 -> 4096x2160
this should be restored after the bug is fixed:
https://code.google.com/p/webm/issues/detail?id=828
Tim Kopp [Tue, 22 Jul 2014 21:00:11 +0000 (14:00 -0700)]
VP9 denoiser bugfix in debugging code.
When OUTPUT_YUV_DENOISED is enabled the encoder outputs the uncompressed,
denoised video to a separate file. Moved the point at which the file is
written to in order to avoid an extra blank frame at the beginning of the video.
Marco Paniconi [Tue, 22 Jul 2014 18:06:00 +0000 (11:06 -0700)]
vp8: Set default denoiser_decision to copy for UV channel.
Since the UV decision to denoise is based on Y, we need to set
the default/initial denoiser decision_u/v to COPY_BLOCK,
so that (uv)running_avg is still updated with the source when
no uv_denoiser is applied.
Marco Paniconi [Mon, 21 Jul 2014 16:58:54 +0000 (09:58 -0700)]
vp8 denoiser fix: Update denoised altref on key frame.
On a key frame, the denoised-running_avg for all reference
frames should be updated with the source.
The altref denoised-running_avg was not being updated on key frames;
this patch fixes that.
Jingning Han [Mon, 21 Jul 2014 23:22:56 +0000 (16:22 -0700)]
Turn on adaptive pred filter scheme for sub8x8 below 720p
For sequences of resolution below 720p, the encoder will check
intra prediction modes and inter prediction modes from LAST_FRAME.
This commit turns on adaptive prediction filter scheme for sub8x8
blocks, where inter prediction modes are enabled. For the test
sequence bus at CIF, the speed 2 runtime goes down from 17879 ms
to 16783 ms, i.e., a 6% speed-up. The compression performance on
the derf set changes by -0.128%.
Moved call to vp9_clear_system_state() to a proper location
This commit moves a call to vp9_clear_system_state() to the correct
location, i.e., prior to function calls that use floating-point numbers.
This fixes a mismatch between the mmx and sse2 versions, where a
floating-point number used in adjust_frame_rate(cpi) becomes NaN because
the mmx registers are in the wrong state.
This patch adds back code that checks that the frame
size lies within defined bounds. The check was inadvertently
removed by a previous patch:
https://gerrit.chromium.org/gerrit/#/c/70814/
Deb Mukherjee [Mon, 14 Jul 2014 20:31:29 +0000 (13:31 -0700)]
Use custom mkstemp() to fix Win issue in y4m_test
Uses mkstemp(), with the directory being the same as the test data
directory, to create a temporary output file. On Windows, the
GetTempFileNameA() function is used instead.
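A minimal sketch of such a wrapper is shown below. The helper name, signature, and file-name prefix are hypothetical and not taken from the patch; only mkstemp() and GetTempFileNameA() are the real platform calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#if defined(_WIN32)
#include <windows.h>
#else
#include <stdlib.h>
#include <unistd.h>
#endif

/* Hypothetical helper (illustrative only). Creates an empty temporary file
 * inside `dir` and stores its path in `path`. Returns 0 on success and -1
 * on failure. */
static int create_temp_file_in(const char *dir, char *path,
                               size_t path_size) {
#if defined(_WIN32)
  char buf[MAX_PATH];
  /* With uUnique == 0, GetTempFileNameA picks a name and creates the file. */
  if (GetTempFileNameA(dir, "tmp", 0, buf) == 0) return -1;
  if (strlen(buf) + 1 > path_size) {
    DeleteFileA(buf);
    return -1;
  }
  strcpy(path, buf);
  return 0;
#else
  int fd;
  const int n = snprintf(path, path_size, "%s/tmp_XXXXXX", dir);
  if (n < 0 || (size_t)n >= path_size) return -1;
  fd = mkstemp(path); /* replaces XXXXXX in place and creates the file */
  if (fd == -1) return -1;
  close(fd);
  return 0;
#endif
}
```

The caller is responsible for removing the file once the test is done, for example with remove(path).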
Deb Mukherjee [Fri, 18 Jul 2014 10:06:07 +0000 (03:06 -0700)]
Fix FrameSizeTestsLarge unit-test on 32-bit arch.
If the img allocation failed, the test used to crash on 32-bit
architectures. This patch adds a null check on img in
FillFrame. Also, if the first frame initialization has not been
performed, VPX_CODEC_ERROR is expected to be returned rather than
VPX_CODEC_OK.
Deb Mukherjee [Tue, 15 Jul 2014 08:54:29 +0000 (01:54 -0700)]
Separates profile 2 into 2 profiles 2 and 3
Separates the HBD profile into two profiles (2 and 3), consistent with the
highbitdepth branch. This patch is ported from the original highbitdepth
branch patch: https://gerrit.chromium.org/gerrit/#/c/70460/
Two of the invalid file tests needed to be updated.
Adrian Grange [Thu, 10 Jul 2014 22:35:51 +0000 (15:35 -0700)]
Modified frame buffer handling
This patch is the first step toward simplifying the
frame buffer handling.
The final goal is to have a common frame buffer handling
framework for both encoder and decoder that incorporates
the existing ability to use externally allocated memory.
Deb Mukherjee [Wed, 16 Jul 2014 16:37:13 +0000 (09:37 -0700)]
Adds support for raw yuv files for 422/444
Adds support for raw yuv inputs in 422/444 sampling for use
in profiles 1 and 3.
New options added to vpxenc are:
--i422 and --i444, which are to be used in conjunction with
--width, --height, and --fps for proper raw yuv handling.
A new option is added to vpxdec:
--rawvideo, which enforces raw yuv video output for the
bit-stream decoded irrespective of 420, 422 or 444 sampling.
The existing options --i420 and --yv12
are specialized for use only for 420 content.
Paul Wilkins [Wed, 16 Jul 2014 10:21:27 +0000 (11:21 +0100)]
Changes to rd balance and multi-arf bug fix.
2 pass only change to the calculation of rd mult based on Q.
Make a small adjustment based on frame type and also
replace the adjustment based on iifactor with one based
on the ambient GF/ARF boost level.
Also fix a multi arf bug / issue.
Overall these changes give a slight improvement in ssim
but hurt psnr a little.
Fix show_existing_frame not decreasing frame buffer ref counter.
The issue was introduced by commit g7c43fb6. If the current frame
is repeated from the existing-ref pool, the frame buffer ref counter
is not decreased, so the buffer isn't released. At some point the
decoder fails because it cannot allocate a new frame buffer.
Added a test vector to verify that the condition will not
recur later. Test vector was generated by the code in this patch:
https://gerrit.chromium.org/gerrit/#/c/70862/
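The failure mode in the show_existing_frame fix can be illustrated with a toy reference-counted pool. This is a sketch under stated assumptions, not the decoder's code: the pool type, its size, and all function names are hypothetical, and the flag merely switches between the leaking and the corrected behaviour.

```c
/* Toy model (hypothetical, not libvpx code) of a reference-counted
 * frame buffer pool. */
#define POOL_SIZE 2

typedef struct {
  int ref_count[POOL_SIZE];
} ToyBufferPool;

/* Returns the index of a free buffer and takes one reference on it,
 * or -1 when every buffer is still referenced. */
static int toy_get_free_buffer(ToyBufferPool *pool) {
  int i;
  for (i = 0; i < POOL_SIZE; ++i) {
    if (pool->ref_count[i] == 0) {
      pool->ref_count[i] = 1;
      return i;
    }
  }
  return -1;
}

/* Drops one reference on buffer idx. */
static void toy_release(ToyBufferPool *pool, int idx) {
  if (pool->ref_count[idx] > 0) --pool->ref_count[idx];
}

/* Shows an already decoded frame. A reference is taken for output; when
 * release_after_show is 0 that reference is never dropped, which models
 * the leak described in the commit message. */
static void toy_show_existing(ToyBufferPool *pool, int idx,
                              int release_after_show) {
  ++pool->ref_count[idx];
  /* ... the frame would be handed to the caller here ... */
  if (release_after_show) toy_release(pool, idx);
}
```

In the leaking variant, a buffer that was shown once keeps a non-zero count even after its owner releases it, so the pool runs out of free buffers one allocation earlier than it should.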