Jingning Han [Wed, 6 Jul 2016 19:46:48 +0000 (12:46 -0700)]
Use precise context to estimate coeff rate cost
Use the precise context to estimate the zero token cost in trellis
optimization process. This improves the speed 0 coding performance
by 0.15% for lowres and 0.1% for midres. It improves the speed 1
coding performance by 0.2% for midres and hdres.
Jingning Han [Wed, 6 Jul 2016 17:05:51 +0000 (10:05 -0700)]
Enable uniform quantization with trellis optimization in speed 0
This commit allows the inter prediction residual to use uniform
quantization followed by trellis coefficient optimization in
speed 0. It improves the coding performance by
Jingning Han [Wed, 6 Jul 2016 16:58:22 +0000 (09:58 -0700)]
Refactor coeff_cost() function
Move the operations that update the context buffers outside this
function. The coeff_cost() takes all input as const value and returns
the coefficient cost.
This makes preparation for the next coefficient optimization CLs.
Jingning Han [Fri, 1 Jul 2016 19:20:45 +0000 (12:20 -0700)]
Remove txfrm_block_to_raster_xy() from vp9 encoder
The transform block row and column positions are always available
outside the callees. There is no need to re-compute these values
again. This approach has been used by the decoder. This commit
removes txfrm_block_to_raster_xy() function.
Yunqing Wang [Sat, 2 Jul 2016 00:58:02 +0000 (17:58 -0700)]
Make set_reference control API work in VP9
Moved the API patch from NextGenv2. An example was included.
To try it, for example, run the following command:
$ examples/vpx_cx_set_ref vp9 352 288 in.yuv out.ivf 4 30
paulwilkins [Thu, 30 Jun 2016 12:38:57 +0000 (13:38 +0100)]
Fix error in get_ul_intra_threshold() for 10/12 bit.
The scaling of the threshold for 10 and 12 bit here appears
to be in the wrong direction. For 10 and 12 bit we expect sse
values to be higher and hence the threshold used should be
scaled up not down.
James Zern [Thu, 30 Jun 2016 03:37:12 +0000 (20:37 -0700)]
convolve_test: fix byte offsets in hbd build
CONVERT_TO_BYTEPTR(x) was corrected in: 003a9d2 Port metric computation changes from nextgenv2
to use the more common (x) within the expansion. offsets should occur
after converting the pointer to the desired type.
JackyChen [Wed, 29 Jun 2016 23:22:12 +0000 (16:22 -0700)]
vp9: Change the scheme for modeling rd for 32x32 on newmv_last mode.
For real time CBR mode, use model_rd_for_sb_y for 32x32 if the mode is
newmv last, which is less aggressive in skipping transform and
quantization, to avoid quality regression in some conditions.
Linfeng Zhang [Thu, 9 Jun 2016 20:46:09 +0000 (13:46 -0700)]
Update vpx subpixel 1d filter ssse3 asm
Speed test shows the new vertical filters have degradation on Celeron
Chromebook. Added "X86_SUBPIX_VFILTER_PREFER_SLOW_CELERON" to control
the vertical filters activated code. Now just simply active the code
without degradation on Celeron. Later there should be 2 set of vertical
filters ssse3 functions, and let jump table to choose based on CPU type.
paulwilkins [Wed, 3 Feb 2016 15:37:32 +0000 (15:37 +0000)]
Add experimental spatial de-noise filter on key frames.
For forced key frames in particular this helps to make them
blend better with the surrounding frames where noise tends
to be suppressed by a combination of quantization and alt
ref filtering.
Currently disabled by default under and IFDEF flag pending
wider testing.
jackychen [Tue, 28 Jun 2016 22:38:18 +0000 (15:38 -0700)]
vp9 postproc: Bug fix and code clean.
Bug fix: The crash is caused by not allocating buffer for prev_mip in
postproc_state and prev_mip in postproc_state is only used for MFQE,
ohter postproc modules, deblocking and etc., should not use it.
James Zern [Wed, 22 Jun 2016 19:18:49 +0000 (12:18 -0700)]
*.asm: normalize label format
add a trailing ':', though it's optional with the tools we support, it's
more common to use it to mark a label. this also quiets the
orphan-labels warning with nasm/yasm.
jackychen [Thu, 23 Jun 2016 16:40:49 +0000 (09:40 -0700)]
vp9: Increase thr_var for 32x32 blocks in var-based partitioning.
For real-time mode, increase variance threshold for 32x32 blocks in
var-based partitioning for resolution >= 720p, so that it is more
likely to stay at 32x32 for high resolution which accelerates the
encoding speed with little/no PSNR drop.
PSNR effect on different speed settings:
speed 8 rtc: 0.02 overall PSNR drop, 0.285% SSIM drop
speed 7 rtc: 0.196% overall PSNR increase, 0.066% SSIM increase
speed 5 rtc_derf: no effect.
Speed up:
gips_motion_WHD, 1mbps: 2.5% faster on speed 7, 2.6% faster on speed8
gips_stat_WHD, 1mbps: 4.6% faster on speed 7, 5.6% faster on speed8
James Zern [Sat, 25 Jun 2016 18:13:02 +0000 (11:13 -0700)]
vp9_pickmode: revert rd modeling change for hbd
Avoids a segfault in high-bitdepth builds.
This restores the condition to its state prior to: 7991241 vp9: Change the scheme for modeling rd for bsize 32x32.
jackychen [Fri, 24 Jun 2016 16:39:27 +0000 (09:39 -0700)]
vp9: Change the scheme for modeling rd for bsize 32x32.
For real-time CBR mode, use model_rd_for_sb_y_large instead of
model_rd_for_sb_y for 32x32 block. In the former model, transform
might be skipped more aggressively in some condtions, which speeds
up encoding time with only a little PSNR/SSIM drop on rtc test set.
No obvious visual quality regression.
PSNR effect on different speed settings:
speed 8 rtc: 0.129% overall PSNR drop, 0.137% SSIM drop
speed 7 rtc: 0.135% overall PSNR drop, 0.062% SSIM drop
speed 5 rtc_derf: 0.105% overall PSNR drop, 0.095% SSIM drop
Speed up:
gips_motion_WHD, 1mbps: 3.29% faster on speed 7, 2.56% faster on speed8
gips_stat_WHD, 1mbps: 2.17% faster on speed 7, 1.62% faster on speed8