Jingning Han [Tue, 24 Jun 2014 17:45:31 +0000 (10:45 -0700)]
Enable real-time version reference motion vector search
This commit enables a fast reference motion vector search scheme.
It checks the nearest top and left neighboring blocks to decide the
most probable predicted motion vector. If it finds the two have
the same motion vectors, it then skip finding exterior range for
the second most probable motion vector, and correspondingly skips
the check for NEARMV.
The runtime of speed -5 goes down
pedestrian at 1080p 29377 ms -> 27783 ms
vidyo at 720p 11830 ms -> 10990 ms
i.e., 6%-8% speed-up.
For rtc set, the compression performance
goes down by about -1.3% for both speed -5 and -6.
Paul Wilkins [Thu, 26 Jun 2014 08:48:31 +0000 (09:48 +0100)]
Fix quality regression for multi arf off case.
Bug introduced during multiple iterations on: I3831*
gf_group->arf_update_idx[] cannot currently be used
to select the arf buffer index if buffer flipping on overlays
is enabled (still currently the case when multi arf OFF).
Jingning Han [Wed, 25 Jun 2014 18:41:49 +0000 (11:41 -0700)]
Make non-RD intra mode search txfm size dependent
This commit fixes the potential issue in the non-RD mode decision
flow that only checks part of the block to estimate the cost. It
was due to the use of fixed transform size, in replacing the
largest transform block size. This commit enables per transform
block cost estimation of the intra prediction mode in the non-RD
mode decision.
hkuang [Tue, 24 Jun 2014 21:22:09 +0000 (14:22 -0700)]
Revert "Revert 3 patches from Hangyu to get Chrome to build:"
This patch reverts the previous revert from Jim and also add a
variable user_priv in the FrameWorker to save the user_priv
passed from the application. In the decoder_get_frame function,
the user_priv will be binded with the img. This change is needed
or it will fail the unit test added here:
https://gerrit.chromium.org/gerrit/#/c/70610/
Yunqing Wang [Fri, 20 Jun 2014 17:11:34 +0000 (10:11 -0700)]
Reuse inter prediction result in real-time speed 6
In real-time speed 6, no partition search is done. The inter
prediction results got from picking mode can be reused in the
following encoding process. A speed feature reuse_inter_pred_sby
is added to only enable the resue in speed 6.
This patch doesn't change encoding result. RTC set tests showed
that the encoding speed gain is 2% - 5%.
Adrian Grange [Wed, 18 Jun 2014 21:34:24 +0000 (14:34 -0700)]
Fix test on maximum downscaling limits
There is a normative scaling range of (x1/2, x16)
for VP9. This patch fixes the maximum downscaling
tests that are applied in the convolve function.
The code used a maximum downscaling limit of x1/5
for historic reasons related to the scalable
coding work. Since the downsampling in this
application is non-normative it will revert to
using a separate non-normative scaler.
Paul Wilkins [Tue, 17 Jun 2014 14:31:24 +0000 (15:31 +0100)]
Fix some bugs in multi-arf
Fix some bugs relating to the use of buffers
in the overlay frames.
Fix bug where a mid sequence overlay was
propagating large partition and transform sizes into
the subsequent frame because of :-
sf->last_partitioning_redo_frequency > 1 and
sf->tx_size_search_method == USE_LARGESTALL
Paul Wilkins [Mon, 9 Jun 2014 15:25:31 +0000 (16:25 +0100)]
Experiment for mid group second arf.
This patch implements a mechanism for inserting a second
arf at the mid position of arf groups.
It is currently disabled by default using the flag multi_arf_enabled.
Results are currently down somewhat in initial testing if
multi-arf is enabled. Most of the loss is attributable to the
fact that code to preserve the previous golden frame
(in the arf buffer) in cases where we are coding an overlay
frame, is currently disabled in the multi-arf case.
Adrian Grange [Mon, 16 Jun 2014 23:22:28 +0000 (16:22 -0700)]
Allocate buffers based on correct chroma format
The encoder currently allocates frame buffers before
it establishes what the chroma sub-sampling factor is,
always allocating based on the 4:4:4 format.
This patch detects the chroma format as early as
possible allowing the encoder to allocate buffers of
the correct size.
Future patches will change the encoder to allocate
frame buffers on demand to further reduce the memory
profile of the encoder and rationalize the buffer
management in the encoder and decoder.
Jim Bankoski [Mon, 23 Jun 2014 14:04:57 +0000 (07:04 -0700)]
error check vp9 superframe parsing
This patch insures that the last byte of a chunk that contains a
valid superframe marker byte, actually has a proper superframe index.
If not it returns an error.
As part of doing that the file : vp90-2-15-fuzz-flicker.webm now fails
to decode properly and moves to the invalid file test from the test
vector suite.
hkuang [Tue, 17 Jun 2014 01:23:06 +0000 (18:23 -0700)]
Introduce FrameWorker for decoding.
When decoding in serial mode, there will be only
one FrameWorker doing decoding. When decoding in
parallel mode, there will be several FrameWorkers
doing decoding in parallel.
The code properly catches an invalid stream but seg faults instead of
returning an error due to a buffer not having been initialized. This
code fixes that.
Jim Bankoski [Fri, 20 Jun 2014 20:52:06 +0000 (13:52 -0700)]
Validate error checking code in decoder.
This patch adds a mechanism for insuring error checking on invalid files
by creating a unit test that runs the decoder and tests that the error
code matches what's expected on each frame in the decoder.
Disabled for now as this unit test will segfault with existing code.
Jim Bankoski [Thu, 19 Jun 2014 22:10:30 +0000 (15:10 -0700)]
fix peek_si to enable 1 byte show existing frames.
The test for this is in test vector code ( show existing frames will
fail ). I can't check it in disabled as I'm changing the generic
test code to do this:
Jingning Han [Thu, 19 Jun 2014 16:24:09 +0000 (09:24 -0700)]
Allow key frame more flexibility in mode search
This commit allows the key frame to search through more prediction
modes and more flexible block sizes. No speed change observed. The
coding performance for rtc set is improved by 1.7% for speed -5 and
3.0% for speed -6.
hkuang [Fri, 13 Jun 2014 21:14:02 +0000 (14:14 -0700)]
Add superframe support for frame parallel decoding.
A superframe is a bunch of frames that bundled as one frame. It is mostly
used to combine one or more non-displayable frames and one displayable frame.
For frame parallel decoding, libvpx decoder will only support decoding one
normal frame or a super frame with superframe index.
If an application pass a superframe without superframe index or a chunk
of displayable frames without superframe index to libvpx decoder, libvpx
will not decode it in frame parallel mode. But libvpx decoder still could
decode it in serial mode.