James Zern [Thu, 8 Dec 2016 21:02:30 +0000 (13:02 -0800)]
idct16x16_add_neon: fix arm visual studio builds
after: 2d3d95f enable vpx_idct16x16_256_add_neon in hbd builds
reorder INCLUDEs and fix indent of IF/ENDIFs
remove vpx_config.asm to avoid multiple symbol definitions in windows
builds and shift idct_neon.asm.S to the top to allow use of
CONFIG_VP9_HIGHBITDEPTH in the export list.
Yunqing Wang [Wed, 7 Dec 2016 18:00:36 +0000 (10:00 -0800)]
Remove an unused first pass statistic
One of the first pass stats "new_mv_count" is no longer used in VP9,
and is removed. This also makes it easy to implement a multi-threaded
first pass. This change doesn't affect the coding performance, which
has been verified by borg tests.
Linfeng Zhang [Wed, 7 Dec 2016 19:34:00 +0000 (11:34 -0800)]
Update TEST_P(PartialIDctTest, RunQuantCheck)
1. Use correct projections when copying real dct/quant outputs.
2. Remove local random number generator and combine loops.
3. Quantization with minimum allowed step sizes instead of maximum.
This may generate larger inputs.
James Zern [Wed, 30 Nov 2016 03:47:50 +0000 (19:47 -0800)]
idct16x16,NEON: rm output_stride from pass1 fns
vpx_idct16x16_256_add_neon_pass1, vpx_idct16x16_10_add_neon:
this was a constant 8 in all cases, meaning the results are stored
contiguously; this allows the number of stores to be reduced.
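To illustrate why a constant stride equal to the row length permits fewer stores, here is a minimal scalar sketch. It is not the NEON code from this commit; the function names and the row length constant are hypothetical stand-ins chosen for illustration.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical row length: 8 coefficients per row, matching the constant
// stride of 8 described in the commit message.
const int kRowLen = 8;

// General form: every row is written at a caller-supplied stride, so each
// row needs its own store.
void store_rows_strided(const int16_t *src, int16_t *dst, int rows,
                        int output_stride) {
  for (int r = 0; r < rows; ++r) {
    std::memcpy(dst + r * output_stride, src + r * kRowLen,
                kRowLen * sizeof(*src));
  }
}

// When the stride always equals the row length, the rows sit back to back
// in memory, so the whole block can be written in a single pass.
void store_rows_contiguous(const int16_t *src, int16_t *dst, int rows) {
  std::memcpy(dst, src, rows * kRowLen * sizeof(*src));
}
```

With output_stride fixed at 8, both functions produce the same buffer, which is the property that lets the stride parameter be dropped.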
Marco [Mon, 5 Dec 2016 20:05:35 +0000 (12:05 -0800)]
vp9: Adjust the weight factor for segment rate cost for aq-mode=3.
Use the segment weight factor based on the target (cr->percent_refresh)
if it is less than the current estimate (average of past usage and target).
Small improvement at low bitrates.
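The selection rule described above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the parameter names, and the representation of both quantities as fractions are hypothetical and are not taken from the VP9 source.

```cpp
#include <algorithm>

// Hypothetical helper. Both arguments are assumed to be fractions of the
// frame (for example 0.25 for 25%).
//   target_fraction:     weight implied by the refresh target
//   past_usage_fraction: weight observed from past segment usage
double segment_weight(double target_fraction, double past_usage_fraction) {
  // Current estimate: average of past usage and target.
  const double estimate = 0.5 * (past_usage_fraction + target_fraction);
  // Use the target-based weight when it is below the estimate.
  return std::min(target_fraction, estimate);
}
```

For instance, with a target of 0.25 and past usage of 0.75 the estimate is 0.5, so the target-based weight of 0.25 is used.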
James Zern [Fri, 25 Nov 2016 01:51:10 +0000 (17:51 -0800)]
build/make/Android.mk: correct rtcd template var refs
the expansion of findstring and rtcd_dep_template_CONFIG_ASM_ABIS needs
to be deferred until the block is parsed as makefile syntax rather than
eval time where rtcd_dep_template_CONFIG_ASM_ABIS will be unset. this
ensures vpx_config.asm is properly created.
* changes:
Update vpx_idct4x4_16_add_neon() to pass SingleExtremeCoeff test
Refine 8-bit 4x4 idct NEON intrinsics
Add idct speed test.
Update partial_idct_test.cc to support high bitdepth
James Zern [Thu, 24 Nov 2016 00:49:19 +0000 (16:49 -0800)]
Android.mk,armv7: fix idct_neon.asm.S creation
force this to be created before any other .S files. this change
additionally removes the file from the source list as it doesn't need to
be compiled on its own.
Marco [Tue, 22 Nov 2016 00:37:32 +0000 (16:37 -0800)]
vp9: Adjust cyclic refresh parameters for low bitrates.
Increase the motion threshold and qp-delta for segment#2 boost.
This can increase frame drops at low bitrates, but generally gives
better spatial quality.
Only affects real-time mode with aq-mode=3, at very low bitrates.
Jerome Jiang [Sat, 19 Nov 2016 01:07:20 +0000 (17:07 -0800)]
Cover more filter levels in unit tests for post proc.
For some filter levels, the C/MSA output doesn't match SSE2. Some of the unit
tests are disabled. They will be re-enabled when the C/MSA funcs are fixed.
BUG=webm:1321
James Zern [Wed, 23 Nov 2016 07:03:12 +0000 (23:03 -0800)]
use storage.googleapis for testdata download
replace downloads.webmproject.org with the canonical
storage.googleapis.com/... form. this appears less likely to fail when
dealing with multiple concurrent connections.
Marco [Tue, 22 Nov 2016 18:10:06 +0000 (10:10 -0800)]
vp9: Use more aggressive skip when short_circuit_low_temp_var = 1.
Use the same feature as https://chromium-review.googlesource.com/#/c/411327/,
but allow it to be used for speed = 6 and 7, where
short_circuit_low_temp_var = 1.
Speed up of ~2-3% for speed 7, with little/no loss in compression.
Jim Bankoski [Tue, 22 Nov 2016 15:31:04 +0000 (07:31 -0800)]
vp9-tests : split VpxEncoderThreadTest into two tests.
VpxEncoderThreadTest was taking a very long time for some runs and
timing out a lot. This is an attempt to split the test into runs
that can be run nightly (speeds 2 through 9) and runs that can
be run weekly (speeds 0-1).
Yaowu Xu [Mon, 21 Nov 2016 18:49:56 +0000 (10:49 -0800)]
Add validation of frame_parallel_decoding_mode
This is a boolean value that is written into bitstream, any value other
than 0 or 1 could have led to unexpected behavior. This commit fixes the
issue by adding validation of the value to make sure it is boolean.
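A minimal sketch of this kind of boolean range check is shown below. The struct, enum, and function names are hypothetical and only illustrate the idea; they are not the libvpx validation code.

```cpp
// Hypothetical configuration holder; only the field under discussion is shown.
struct EncoderConfigSketch {
  int frame_parallel_decoding_mode;
};

// Hypothetical status codes for the sketch.
enum StatusSketch { kStatusOk, kStatusInvalidParam };

// A flag that is written to the bitstream as a single bit must be 0 or 1.
bool is_valid_bool_flag(int value) { return value == 0 || value == 1; }

// Reject the configuration before encoding if the flag is out of range.
StatusSketch validate_config(const EncoderConfigSketch &cfg) {
  if (!is_valid_bool_flag(cfg.frame_parallel_decoding_mode)) {
    return kStatusInvalidParam;
  }
  return kStatusOk;
}
```

Rejecting the value up front keeps any later code that treats the field as a single bit from seeing values such as 2 or -1.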
Jerome Jiang [Tue, 15 Nov 2016 18:37:12 +0000 (10:37 -0800)]
vp9: Speed 8: More aggressive golden skip for low res.
Add a new, more aggressive short circuit: short_circuit_low_temp_var = 3 to skip
golden of any mode when variance is lower than threshold for low res.
This change only affects speed = 8, low resolution.
Metrics for avgPSNR/SSIM on rtc_derf (low resolution) show loss of
0.27/0.31%.
On Nexus 6, the encoding time is reduced by ~2.3% on average across all
low-res clips.
James Zern [Sat, 12 Nov 2016 20:59:34 +0000 (12:59 -0800)]
partial_idct_test: use <limits> for int16_min/max
this removes the need for __STDC_LIMIT_MACROS which is defined in
vpx_integer.h, but may be preceded by earlier includes of stdint.h;
fixes build with the r13 ndk
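As an illustration of the approach, the int16_t range can be taken from std::numeric_limits, which does not depend on whether __STDC_LIMIT_MACROS was defined before the first include of stdint.h. The constant names and the clamp helper below are hypothetical, not the identifiers used in partial_idct_test.

```cpp
#include <stdint.h>

#include <limits>

// Range of int16_t without relying on the INT16_MIN/INT16_MAX macros.
const int16_t kInt16Min = std::numeric_limits<int16_t>::min();
const int16_t kInt16Max = std::numeric_limits<int16_t>::max();

// Hypothetical helper: saturate a wider value to the int16_t range.
int16_t clamp_to_int16(int32_t v) {
  if (v < kInt16Min) return kInt16Min;
  if (v > kInt16Max) return kInt16Max;
  return static_cast<int16_t>(v);
}
```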
Jerome Jiang [Mon, 14 Nov 2016 18:22:00 +0000 (10:22 -0800)]
vp9: Speed 8: Turn off 4x4avg for low-res non-key frames.
This change only affects speed = 8 for low resolutions.
Metrics for avgPSNR/SSIM on rtc_derf (low resolutions) show loss of
0.5/0.6%.
On Nexus 6, the encoding time is reduced by ~5.9% on average across all
low-res clips.
Visually little/no change on rtc_derf clips.
Jingning Han [Sat, 12 Nov 2016 00:10:01 +0000 (16:10 -0800)]
Enable asymptotic closed-loop encoding decision
This commit enables asymptotic closed-loop encoding decision for
the key frame and alternate reference frame. It follows the regular
rate control scheme, but leaves out additional iteration on the
updated frame level probability model. It is enabled for speed 0.
James Zern [Tue, 8 Nov 2016 04:22:22 +0000 (20:22 -0800)]
enable vpx_idct32x32_34_add_neon in hbd builds
replace load_and_transpose_s16_8x8() in idct32_6_neon() with a separate
load_tran_low_to_s16() and transpose_s16_8x8(). the combined function is
used in idct32_8_neon() where the input is the correctly sized output
from the earlier stage.
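The split into a separate load step and a separate transpose step can be sketched in scalar code as below. This is an illustration only: the functions are hypothetical scalar stand-ins for the NEON helpers named above, and the 32-bit coefficient typedef is an assumption made for the high-bitdepth case.

```cpp
#include <stdint.h>

// Assumption for this sketch: coefficients are 32 bits wide, as in a
// high-bitdepth build.
typedef int32_t tran_low_sketch_t;

// Step 1 (hypothetical scalar stand-in): narrow an 8x8 block of wide
// coefficients to int16_t.
void load_tran_low_to_s16_sketch(const tran_low_sketch_t *in, int16_t *out) {
  for (int i = 0; i < 64; ++i) out[i] = static_cast<int16_t>(in[i]);
}

// Step 2 (hypothetical scalar stand-in): transpose an 8x8 int16_t block
// in place. This step takes input that is already 16 bits wide.
void transpose_s16_8x8_sketch(int16_t *m) {
  for (int r = 0; r < 8; ++r) {
    for (int c = r + 1; c < 8; ++c) {
      const int16_t tmp = m[r * 8 + c];
      m[r * 8 + c] = m[c * 8 + r];
      m[c * 8 + r] = tmp;
    }
  }
}
```

Keeping the narrowing load separate means the transpose can also be applied directly to data that is already int16_t, such as the output of an earlier stage.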