Tom Finegan [Thu, 10 Jul 2014 22:17:05 +0000 (15:17 -0700)]
sh tests: Add support for running tested programs within another program.
Specifying the --prefix command line arg executes all test programs within the
context of the prefix string, which is assigned to VPX_TEST_PREFIX.
All test functions updated to include VPX_TEST_PREFIX in their eval command.
James Zern [Thu, 10 Jul 2014 04:02:02 +0000 (21:02 -0700)]
tests: add API_REGISTER_STATE_CHECK
used to wrap API functions to ensure full environment consistency as
opposed to the renamed ASM_REGISTER_STATE_CHECK which is used with
assembly functions.
currently checks the FPU tag word in x86/x86_64 gcc builds to ensure
emms has been called.
if a thread was still doing work when End() was called there'd be a race
on worker->status_. in these cases, however, the specific value is
meaningless as it would be >= OK and the thread would have been shut
down properly, but we'll check 'impl_' instead to avoid any potential
TSan/DRD reports.
Yunqing Wang [Thu, 10 Jul 2014 16:19:03 +0000 (09:19 -0700)]
Refactor vp9_diamond_search_sad function
Currently, vp9_diamond_search_sadx4() is only called when sse3 is
enabled, which is improper since sse2 optimization of sdx4df
functions are available. Changed to always use
vp9_diamond_search_sadx4().
Combined non-rd motion searchs into a single function
This commit combined the full pel and sub pel motion search into a
single function to avoid code duplication. The commit does not change
encoder outputs.
Jingning Han [Mon, 7 Jul 2014 19:08:40 +0000 (12:08 -0700)]
Re-design quantization process for 32x32 transform block
This commit enables a new quantization process for 32x32 2D-DCT
transform coefficient blocks. It improves the compression
performance of speed 5 by 1.4%. The overall compression gains of
speed 5 due to the new quantization scheme is 4.7%. It also includes
the SSSE3 implementation of the 32x32 quantization process.
Adrian Grange [Mon, 9 Jun 2014 22:22:17 +0000 (15:22 -0700)]
Fix decoder handling of intra-only frames
This patch fixes bug 633:
https://code.google.com/p/webm/issues/detail?id=633
The first decoded frame does not have to be a keyframe,
it could be an inter-frame that is coded intra-only.
This patch fixes the handling of intra-only frames.
A test vector has also been added that encodes 3
intra-only frames at the start of the clip. The
test vector was generated using the code in the
following patch:
https://gerrit.chromium.org/gerrit/#/c/70680/
Tim Kopp [Mon, 7 Jul 2014 18:35:27 +0000 (11:35 -0700)]
Vp9 denoiser MC bugfix
In the previous version, only certain buffers in the macroblockd were saved and
the restored. In this version, all of the buffers are saved and restored. The
code was then rolled into a loop for readability.
Also contains a tiny fix for when the -DOUTPUT_YUV_DENOISED flag is used.
Prepare for frame parallel decoding, the reference count buffers
need to be protected by mutex. Move vp9_thread.* to common
folder so that those buffers could use cross-platform mutex
from vp9_thread.*.
Alex Converse [Tue, 1 Jul 2014 20:02:05 +0000 (13:02 -0700)]
Cleanup motion search speed features.
* Replace max_step_search_steps with constant MAX_MVSEARCH_STEPS
* Fold (reduce_first_step_size + speed > 5) into reduce_first_step_size
replacing uses of reduce_first_step_size that don't add the speed
check with zero.
Alex Converse [Wed, 2 Jul 2014 19:36:48 +0000 (12:36 -0700)]
Split vp9_rdopt into vp9_rdopt and vp9_rd.
vp9_rdopt is for making rd optimal mode decisions. vp9_rd is for all
other rd related routines. Anything used outside of making an rd optimal
decision belongs in rd.
Yunqing Wang [Wed, 2 Jul 2014 19:16:27 +0000 (12:16 -0700)]
Fix rd threshold overflow issue
Moved the threshold adjustment before reference flag checking,
which could set the threshold to INT_MAX for disabled reference
frame, and cause overflow if the adjustment is done after that.
Tim Kopp [Tue, 17 Jun 2014 20:04:39 +0000 (13:04 -0700)]
VP9 denoiser implemented FILTER_BLOCK case
Renamed updating_running_avg() to filter(). Extended function with the rest of
the filter procedure. Made all of the empirically-determined constants used in
VP8 into functions so they can be tweaked more easily.
Tim Kopp [Tue, 1 Jul 2014 18:05:16 +0000 (11:05 -0700)]
VP9 denoising enabled by noise_sensitivity param
As in VP8.
Currently, this parameter is set with the VP8E_SET_NOISE_SENSITIVITY flag.
The flag was not renamed so that we don't break the interface for webrtc. This
should probably be changed at some point in the future.
Added a speed feature controlling a motion search parameter
This commit added a speed feature to control the step_param used in
full pixel motion search. The intention is to reduced the search
steps for high speed real time coding.
James Zern [Wed, 2 Jul 2014 02:02:15 +0000 (19:02 -0700)]
vpxdec: add --keep-going option
for debugging purposes.
continues decoding after receiving a decode error. will still exit with
an error after the current loop, ignoring remaining --loops
Jingning Han [Tue, 1 Jul 2014 23:10:44 +0000 (16:10 -0700)]
Re-design quantization process
This commit re-designs the quantization process for transform
coefficient blocks of size 4x4 to 16x16. It improves compression
performance for speed 7 by 3.85%. The SSSE3 version for the
new quantization process is included.
The average runtime of the 8x8 block quantization is reduced
from 285 cycles -> 255 cycles, i.e., over 10% faster.
Pengchong Jin [Mon, 30 Jun 2014 16:52:27 +0000 (09:52 -0700)]
Store/read 16x16 block statistics obtained from the first pass
Add a conditional compile flag for this feature. Also add a
switch to enable the encoder to use these statistics in the
second pass. Currently, the switch is turned off.
Yunqing Wang [Tue, 1 Jul 2014 19:18:27 +0000 (12:18 -0700)]
Elevate NEWMV mode checking threshold in real time
The current threshold is knid of low, and in many cases NEWMV
mode is checked but not picked as the best mode. This patch
added a speed feature to increase NEWMV threshold, so that
less partition mode checking goes to check NEWMV. This feature
is enabled for speed 6 and 7.
Rtc set borg tests showed:
1. Speed 6, overall psnr: -0.088%, ssim: -1.339%;
Average speedup on rtc set is 11.1%.
2. Speed 7, overall psnr: -0.505%, ssim: -2.320%
Average speedup on rtc set is 12.9%.