granicus.if.org Git - libvpx/log
Paul Wilkins [Wed, 16 Aug 2017 18:25:29 +0000 (18:25 +0000)]
Merge "Fix corrupt arf groups due to low "lag_in_frames""
Linfeng Zhang [Wed, 16 Aug 2017 16:36:37 +0000 (16:36 +0000)]
Merge changes I08b562b6,Ia275940a,I51106e90
* changes:
Add vpx_highbd_idct32x32_{34, 135, 1024}_add_{sse2, sse4_1}
Update highbd idct x86 optimizations.
Update 32x32 idct sse2 and ssse3 optimizations.
paulwilkins [Wed, 16 Aug 2017 13:07:24 +0000 (14:07 +0100)]
Fix corrupt arf groups due to low "lag_in_frames"
Having a very small value for "lag_in_frames" can result in
corrupt arf groups including displayed frames that update
the arf buffer and fake overlay frames that are not in fact
overlays of real arfs but are nevertheless starved of bits.
Leaving lag_in_frames at the default of 25 for these 5 frame two
pass VBR tests should now give rise to a valid ARF coding pattern
as follows:- K(ey), A(rf), N(ormal), N, N, O(verlay).
This change is part of a response to BUG=webm:1454 where broken
arf groups interacted badly with a change that corrects for large rate
misses. However, it may still in some cases increase encode time by
virtue of the fact that the unit test now codes a correct coding pattern
with "hidden" ARF frames.
Change-Id: Ifd0246a4c1d0be247247c754024d7a4ed5f66a6b
Paul Wilkins [Wed, 16 Aug 2017 13:01:38 +0000 (13:01 +0000)]
Merge "Fix for encoder slowdown (for speeds >= 3)"
paulwilkins [Mon, 14 Aug 2017 15:11:34 +0000 (16:11 +0100)]
Fix for encoder slowdown (for speeds >= 3)
Some clips in the nightly unit tests exhibit a significant encoder slowdown, which
appears to bisect to Change-Id: I692311a709ccdb6003e705103de9d05b59bf840a.
The above change allowed for emergency iterations of the recode loop and
adjustment of the Q range if there is a large rate miss.
This patch disables the above adaptation for cases of cpu_speed >= 3 or more
specifically where cpi->sf.recode_loop >= ALLOW_RECODE_KFARFGF.
For speeds >= 3 the code does not currently run a dummy bit pack operation
inside the recode loop. Without this dummy pack operation there is no up to
date estimate of the current frame's size to use as a basis for assessing the
requirement for a recode. In practice it was using the previous frame's size (or 0
for the first frame) which could cause odd behavior.
If we require the emergency rate correction added in Change-Id: I6923.. for
the higher speed settings it will be necessary to enable the dummy pack
which will in turn hurt encode speed.
BUG=webm:1454
Change-Id: I4fb3c6062ca9508325a6f31582f8e80f1a9b126f
Jerome Jiang [Tue, 15 Aug 2017 18:28:54 +0000 (18:28 +0000)]
Merge "Clean up writing YUV files for debug purpose."
Marco Paniconi [Tue, 15 Aug 2017 17:53:08 +0000 (17:53 +0000)]
Merge "vp9: Denoiser fix: use correct bsize for skin detection."
Jerome Jiang [Mon, 14 Aug 2017 20:57:51 +0000 (13:57 -0700)]
Clean up writing YUV files for debug purpose.
Change legacy vp8/9_write_yuv_frame to vpx_write_yuv_files.
Delete some flags that can be enabled during build.
To enable writing denoised YUV, use the following command line:
CFLAGS='-DOUTPUT_YUV_DENOISED' ./configure --enable-vp9-temporal-denoising
For skinmap, use CFLAGS='-DOUTPUT_YUV_SKINMAP'
Change-Id: I236974ac8b3cf279d20c4dc7f6162d8b480b6528
Johann Koenig [Tue, 15 Aug 2017 17:37:59 +0000 (17:37 +0000)]
Merge changes I1f1edeaa,I89313cac
* changes:
quantize: silence unsigned overflow warning
quantize test: quiet overflow warning
Marco [Tue, 15 Aug 2017 17:01:09 +0000 (10:01 -0700)]
vp9: Denoiser fix: use correct bsize for skin detection.
Change-Id: I9d201fa3a4b00ebd147b57ed519fab8d59b0a802
Johann [Tue, 15 Aug 2017 16:48:24 +0000 (09:48 -0700)]
quantize: silence unsigned overflow warning
The result of the xor operation is unsigned. If coeff was negative,
this results in an unsigned value - INT_MIN.
Change-Id: I1f1edeaa6de1f4c68b848e8a82a666d390b749f0
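For illustration only, the sign handling described above can be sketched with the common sign-mask idiom. This is a minimal sketch, not the libvpx code: abs_via_sign_mask() and apply_sign() are hypothetical names, and the sketch assumes a two's complement int and inputs well away from INT_MIN. Keeping every operand signed avoids the unsigned wraparound that the sanitizer reports.

```c
/* Hypothetical helper (not from libvpx): branchless absolute value. */
static int abs_via_sign_mask(int coeff) {
  /* 0 for non-negative input, -1 (all bits set) for negative input. */
  const int sign = -(coeff < 0);
  /* All operands stay signed, so there is no unsigned wraparound. */
  return (coeff ^ sign) - sign;
}

/* Hypothetical helper: re-apply a sign mask (0 or -1) to a magnitude. */
static int apply_sign(int magnitude, int sign) {
  return (magnitude ^ sign) - sign;
}
```

A quantizer would typically take the magnitude first, quantize it, and then restore the sign with the same mask.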
Scott LaVarnway [Tue, 15 Aug 2017 15:35:33 +0000 (15:35 +0000)]
Merge "vp9: strip temporal filter code"
Johann [Tue, 15 Aug 2017 15:28:09 +0000 (08:28 -0700)]
quantize test: quiet overflow warning
Promote the result of RandRange to signed
Change-Id: I89313cace3bcbe9af96946bef00b6857fc48b128
Paul Wilkins [Tue, 15 Aug 2017 14:57:56 +0000 (14:57 +0000)]
Merge "Patch relating to Issue 1456."
Paul Wilkins [Tue, 15 Aug 2017 14:57:22 +0000 (14:57 +0000)]
Merge "Enable emergency fast Q adaptation for VBR test case."
Linfeng Zhang [Tue, 15 Aug 2017 00:05:22 +0000 (17:05 -0700)]
Add vpx_highbd_idct32x32_{34, 135, 1024}_add_{sse2, sse4_1}
BUG=webm:1412
Change-Id: I08b562b60fa85fbc2fec1c15c323a3444b44618f
Linfeng Zhang [Mon, 14 Aug 2017 23:47:24 +0000 (16:47 -0700)]
Update highbd idct x86 optimizations.
BUG=webm:1412
Change-Id: Ia275940af7d7d8637e9a851a9e39d655bfbe4069
Linfeng Zhang [Thu, 10 Aug 2017 22:17:48 +0000 (15:17 -0700)]
Update 32x32 idct sse2 and ssse3 optimizations.
Change-Id: I51106e90344035452621c49a6e1be7d5276b6c70
Scott LaVarnway [Thu, 10 Aug 2017 23:19:18 +0000 (16:19 -0700)]
vp9: strip temporal filter code
when CONFIG_REALTIME_ONLY is enabled.
BUG=webm:1446
Change-Id: Id547783ec75383966c40ab5cf6abb4a0f7984f52
Johann Koenig [Mon, 14 Aug 2017 20:52:52 +0000 (20:52 +0000)]
Merge changes I4b4beab1,I02f74dec
* changes:
quantize test: check skip_block
quantize test: use negative input
Johann Koenig [Mon, 14 Aug 2017 20:46:22 +0000 (20:46 +0000)]
Merge "temporal filter test: adjust inputs and runtime"
Jerome Jiang [Mon, 14 Aug 2017 18:55:42 +0000 (11:55 -0700)]
vp9 svc: Fix the stats output when sl = 1.
Actual frame size and bitrate are both 0 when using the SVC sample encoder
with sl = 1 because the stats are set in parse_superframe_index, which
does not calculate them properly when sl = 1 since there is no superframe.
Use pkt->data.frame.sz instead when sl = 1.
Change-Id: I93f5e98a4c779e32b007e1564ba5396af9e34ad6
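The fallback described above might look roughly like the following. This is a sketch under stated assumptions: reported_frame_size() and its parameters are hypothetical and stand in for the sample encoder's stats code; only pkt->data.frame.sz comes from the commit message.

```c
#include <stddef.h>

/* Hypothetical helper (not from libvpx): choose the size used for stats.
 * superframe_layer_sz stands for a size parsed from the superframe index,
 * packet_sz stands for pkt->data.frame.sz. */
static size_t reported_frame_size(int spatial_layers,
                                  size_t superframe_layer_sz,
                                  size_t packet_sz) {
  /* With a single spatial layer there is no superframe index to parse,
   * so fall back to the packet size. */
  if (spatial_layers == 1) return packet_sz;
  return superframe_layer_sz;
}
```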
Scott LaVarnway [Mon, 14 Aug 2017 18:01:44 +0000 (18:01 +0000)]
Merge "vp9: strip mb graph code"
Johann [Tue, 28 Mar 2017 22:19:55 +0000 (15:19 -0700)]
temporal filter test: adjust inputs and runtime
Use input with a narrow range because the filter only applies when the
frames are similar.
Run CompareReferenceRandom more times. Especially before narrowing the
input range, the filter frequently did not apply.
Change-Id: Ie249bedf6d0d33dfa5884611cb1835788e418b38
James Zern [Mon, 14 Aug 2017 16:31:14 +0000 (09:31 -0700)]
disable SSSE3/VP9QuantizeTest* in hbd builds
this test fails with the configuration similar to the assembly prior to:
d52cb5972 quantize: copy ssse3 optimizations to intrinsics
BUG=webm:1458
Change-Id: Idc5c0b84c0598259fc49609a9f0756de531d3baf
Scott LaVarnway [Fri, 11 Aug 2017 19:24:33 +0000 (12:24 -0700)]
vp9: strip mb graph code
when CONFIG_REALTIME_ONLY is enabled.
BUG=webm:1446
Change-Id: I4b1b8e9a456830ba1b1bd3a8882e038d37ee7903
Johann [Fri, 11 Aug 2017 17:44:36 +0000 (10:44 -0700)]
Rename vp8 quantize file
BUG=webm:1457
Change-Id: Ie8fae018ad8417724fde087055b90228850d631d
Jerome Jiang [Fri, 11 Aug 2017 00:54:35 +0000 (00:54 +0000)]
Merge "vp9 SVC: Fix the denoiser frame buffer management."
Jerome Jiang [Mon, 7 Aug 2017 23:32:26 +0000 (16:32 -0700)]
vp9 SVC: Fix the denoiser frame buffer management.
Change the denoiser frame buffer management for SVC to more generally
handle the layer patterns in SVC (where last is not always refreshed).
This change is only for SVC with denoising and is bitexact.
Change-Id: Ic2b146a924cdf6e7114609158afa3d4880fe3fae
Linfeng Zhang [Thu, 10 Aug 2017 20:25:18 +0000 (20:25 +0000)]
Merge "Clean highbd idct x86 code with inline functions"
Johann Koenig [Thu, 10 Aug 2017 15:42:49 +0000 (15:42 +0000)]
Merge "neon: vpx_quantize_b_32x32"
Johann Koenig [Thu, 10 Aug 2017 15:42:20 +0000 (15:42 +0000)]
Merge "quantize: copy ssse3 optimizations to intrinsics"
paulwilkins [Tue, 8 Aug 2017 11:01:46 +0000 (12:01 +0100)]
Patch relating to Issue 1456.
Testing of 4k videos encoded with a fixed arbitrary chunking interval
uncovered a bug whereby, if a chunk ends 1 frame before a real scene cut,
the next chunk may be encoded with two consecutive key frames at the start,
with the first being assigned 0 bits.
This fix ensures that where there is a key frame group of length 1 it is
assigned at least 1 frame's worth of bits, not 0.
See also patch Change-Id: I692311a709ccdb6003e705103de9d05b59bf840a
which by virtue of allowing fast adaptation of Q made this bug more visible.
BUG=webm:1456
Change-Id: Ic9e016cb66d489b829412052273238975dc6f6ab
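As a rough illustration of the floor described above, and not the actual rate control code: kf_group_bits_floor() and all of its parameters are hypothetical names chosen for this sketch.

```c
#include <stdint.h>

/* Hypothetical helper (not from libvpx): make sure a key frame group of
 * length 1 is never left with fewer bits than one frame's worth. */
static int64_t kf_group_bits_floor(int group_length, int64_t computed_bits,
                                   int64_t per_frame_bits) {
  if (group_length == 1 && computed_bits < per_frame_bits) {
    return per_frame_bits;
  }
  /* Longer groups keep whatever the allocator computed. */
  return computed_bits;
}
```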
Linfeng Zhang [Wed, 9 Aug 2017 00:39:04 +0000 (17:39 -0700)]
Clean highbd idct x86 code with inline functions
Created inline functions highbd_butterfly_cospi16_sse2()
and highbd_butterfly_cospi16_sse4_1()
BUG=webm:1412
Change-Id: Icbc53a73712b6207379872a5e88d0a4d09e2322a
Marco Paniconi [Tue, 8 Aug 2017 23:08:10 +0000 (23:08 +0000)]
Merge "vp9: Partition logic adjustment for speed 6 feature."
Johann [Tue, 8 Aug 2017 21:21:58 +0000 (14:21 -0700)]
quantize test: check skip_block
Not all sizes were tested previously. Only 4x4 and 32x32
Change-Id: I4b4beab1b92a810a097a7306de04cc9e0e260315
Johann [Tue, 8 Aug 2017 21:19:56 +0000 (14:19 -0700)]
quantize test: use negative input
coeff contains signed values.
Change-Id: I02f74decf30379a28122169ab3e844d0f3bd7d23
Johann [Tue, 8 Aug 2017 21:05:16 +0000 (14:05 -0700)]
neon: vpx_quantize_b_32x32
With skip block the neon is about twice as fast as C.
The neon has no shortcut for coeff < zbin so it always takes the
same amount of time. Even if the C can take the shortcut, it is over
twice as fast in neon. If it can't, that gap increases to over 10x.
BUG=webm:1426
Change-Id: I400722146c1b5a5f6289f67d85fd642463d2bfc6
Johann [Thu, 3 Aug 2017 17:22:07 +0000 (10:22 -0700)]
quantize: copy ssse3 optimizations to intrinsics
Fairly minor differences from sse2. pabsw and psignw are the big gains.
Also re-uses some values in eob calculation to avoid an extra pcmp.
Fixes test failures in HBD and OS X builds.
Allows using it in 32bit builds, where it is about 40% faster than sse2.
Substantially faster than the assembly for skip_block. 10-20% faster the
rest of the time.
Change-Id: If783bb3567e561e47667e10133b9c84414a334e2
Marco [Tue, 8 Aug 2017 17:34:47 +0000 (10:34 -0700)]
vp9: Partition logic adjustment for speed 6 feature.
When adapt_partition_source_sad is enabled (currently only at
speed 6 for resoln <= 360p): use lower subsize (8x8 instead of 16x16)
for nonrd_select_partition on 32X32 blocks.
And force avoiding rectangular partition checks in
nonrd_pick_partition for speed >= 6.
Small increase ~0.5 in metrics for speed 6 on rtc_derf,
no change in speed.
Change-Id: Id751bc8f7573634571b2d6f5e29627cd5cebccae
Linfeng Zhang [Tue, 8 Aug 2017 00:37:02 +0000 (17:37 -0700)]
Update 32x32 idct sse2 funcs, add partial case 135
Change-Id: I2b9add83f6fd8f9138fed3bec04a59877a237a6a
Linfeng Zhang [Fri, 4 Aug 2017 00:50:03 +0000 (17:50 -0700)]
Rename highbd_multiplication_and_add_xx() to highbd_butterfly_xx()
in idct x86 code
Change-Id: I5159499a73a5c1b680516f6ca9c3d84f00c35083
Linfeng Zhang [Fri, 4 Aug 2017 00:46:21 +0000 (17:46 -0700)]
Replace multiplication_and_add() with butterfly() in idct x86 code
Change-Id: I266e45a3d75a5357c7d6e6f20ab5c6fdbfe4982e
Linfeng Zhang [Fri, 4 Aug 2017 00:42:54 +0000 (17:42 -0700)]
Update butterfly() in idct x86 optimizations.
Change-Id: Ic73e03bab9fdc085146f52094014db4af36ad701
Linfeng Zhang [Thu, 3 Aug 2017 00:48:40 +0000 (17:48 -0700)]
Add vpx_highbd_idct16x16_{10, 38, 256}_add_sse4_1
BUG=webm:1412
Change-Id: I8877c986b4042f7b8e33f5674c86700675a0e4ca
Linfeng Zhang [Fri, 4 Aug 2017 22:29:19 +0000 (15:29 -0700)]
Update for loop increment of idct x86 functions
Change-Id: Ided7895eaf41d5bc9d64fe536a17f5a078da68d4
Linfeng Zhang [Fri, 4 Aug 2017 22:10:12 +0000 (15:10 -0700)]
Update high bitdepth 16x16 idct x86 code
Prepare for high bitdepth 16x16 idct sse4.1 code.
Only moves and renames functions.
BUG=webm:1412
Change-Id: Ie056fe4494b1f299491968beadcef990e2ab714a
Johann Koenig [Fri, 4 Aug 2017 20:34:50 +0000 (20:34 +0000)]
Merge "quantize test: consolidate sizes"
Johann [Wed, 2 Aug 2017 17:24:19 +0000 (10:24 -0700)]
quantize test: consolidate sizes
Pass a max txfm size parameter and combine the base quantize
test with the 32x32 test.
Change-Id: I72ddf020fe6888e864ea9f3642ee2d9a8e48a04b
Scott LaVarnway [Fri, 4 Aug 2017 14:48:46 +0000 (07:48 -0700)]
vpx_dsp: merge avx2 variance files
BUG=webm:1404
Change-Id: Ieb8f85c3811b05df78722cb41eeb1166966ceec4
Kaustubh Raste [Fri, 4 Aug 2017 05:26:56 +0000 (10:56 +0530)]
Fix mips dspr2 6 tap filter clobber list
Change-Id: Ib7c07e6ce00a5c7e59113b16e6661a8369f9e646
Linfeng Zhang [Fri, 4 Aug 2017 01:16:35 +0000 (01:16 +0000)]
Merge "Rewrite vpx_idct16x16_{10,256}_add_sse2() and add case 38 function"
Scott LaVarnway [Thu, 3 Aug 2017 23:17:09 +0000 (23:17 +0000)]
Merge "vpx_dsp: Use correct check for halfpel in"
Linfeng Zhang [Wed, 2 Aug 2017 23:28:13 +0000 (16:28 -0700)]
Rewrite vpx_idct16x16_{10,256}_add_sse2() and add case 38 function
BUG=webm:1412
Change-Id: I945f0fb6807b8948747243794dc7352b959221f7
Linfeng Zhang [Thu, 3 Aug 2017 20:51:02 +0000 (20:51 +0000)]
Merge changes I76727df0,I66297d78,I1d000c6b
* changes:
Extract inlined 16x16 idct sse2 code into header file
Add transpose_32bit_8x4() sse2 optimization
Update x86 idct optimization
Scott LaVarnway [Wed, 2 Aug 2017 19:19:19 +0000 (12:19 -0700)]
vpx_dsp: Use correct check for halfpel in
vpx_sub_pixel_variance32xh_avx2() and
vpx_sub_pixel_avg_variance32xh_avx2
see:
17fae3a Change to use correct check for halfpel
Change-Id: Ib0741c5c2fd011e9650ca62b76009f1b59fdbe4c
paulwilkins [Tue, 1 Aug 2017 16:06:29 +0000 (17:06 +0100)]
Enable emergency fast Q adaptation for VBR test case.
Enable fast adaptation of Q when there is a large overshoot
for the #ifdef AGGRESSIVE_VBR test case.
AGGRESSIVE_VBR is not currently enabled by default.
Change-Id: I7240bb6589795964b6b0b66df4468e4f21504e0f
Yunqing Wang [Thu, 3 Aug 2017 00:03:10 +0000 (00:03 +0000)]
Merge "Force the bit exactness in the first pass"
Linfeng Zhang [Wed, 2 Aug 2017 23:17:43 +0000 (16:17 -0700)]
Extract inlined 16x16 idct sse2 code into header file
Will be called by high bitdepth functions.
Change-Id: I76727df00941b5a27adceaba8347f275475fcd8c
Linfeng Zhang [Wed, 2 Aug 2017 23:15:58 +0000 (16:15 -0700)]
Add transpose_32bit_8x4() sse2 optimization
Change-Id: I66297d78b38db718cfe3ebb8ea972f5a72c17955
Yunqing Wang [Wed, 2 Aug 2017 22:47:09 +0000 (15:47 -0700)]
Force the bit exactness in the first pass
Originally, for the purpose of keeping a fast first pass, the first-pass
stats between row_mt_mode = 0 and row_mt_mode = 1 were not bit exact, but
the difference was so small that it did not cause a mismatch between the
final bitstreams. However, if the encoder changes, this minor difference
may cause a mismatch. Thus, this patch always forces the first pass to
be bit exact.
BUG=webm:1453
Change-Id: I2b67cf529dee81f660f9d9e7fe9a60ea3c7b12b8
Johann Koenig [Wed, 2 Aug 2017 21:16:35 +0000 (21:16 +0000)]
Merge "quantize test: add speed comparison"
Marco [Fri, 30 Jun 2017 15:51:31 +0000 (08:51 -0700)]
vp8: Drop due to overshoot for non-screen content.
For 1 pass CBR mode:
Apply the logic for dropping (and re-adjusting rate control)
due to large overshoot to the case of non-screen content when
drop_frames_allowed is enabled.
For the non-screen content case: add additional condition that
rate correction factor is close to minimum state, and flag to
constrain the frequency of the dropping.
Also handle the case of temporal layers and multi-res encoding.
Add some flags/counters to the layer context for temporal layers.
For multi-res: drop due to overshoot is checked on lowest stream,
and if overshoot is detected we force drops on all upper streams
for that frame.
This feature is to avoid large frame sizes on big content
changes following a low content period.
No change in behavior for screen_content_mode = 2.
Change-Id: I797ab236cbbf3b15cad439e9a227fbebced632e6
Scott LaVarnway [Wed, 2 Aug 2017 19:08:10 +0000 (19:08 +0000)]
Merge "vpxdsp: variance_impl_avx2.c cleanup"
Johann [Thu, 27 Jul 2017 21:14:20 +0000 (14:14 -0700)]
quantize test: add speed comparison
Test some possible scenarios.
Change-Id: I1a612e7153b31756be66390ceea55877856d5a33
Scott LaVarnway [Tue, 25 Jul 2017 20:26:46 +0000 (13:26 -0700)]
vpxdsp: variance_impl_avx2.c cleanup
BUG=webm:1404
Change-Id: I8d8498009e5ef7bf1137e4ff16ec81738a020b02
shiyou yin [Wed, 2 Aug 2017 01:08:43 +0000 (01:08 +0000)]
Merge "loongson mmi configuration patch."
Linfeng Zhang [Tue, 1 Aug 2017 00:46:20 +0000 (17:46 -0700)]
Update x86 idct optimization
Move constant coefficients preparation into inline function.
Change-Id: I1d000c6b161794c8828ff70768439b767e2afea1
Linfeng Zhang [Tue, 1 Aug 2017 21:39:39 +0000 (21:39 +0000)]
Merge "Rewrite vpx_highbd_idct8x8_{12,64}_add_sse2"
Johann Koenig [Tue, 1 Aug 2017 16:44:31 +0000 (16:44 +0000)]
Merge "neon: vpx_quantize_b"
Paul Wilkins [Tue, 1 Aug 2017 08:58:36 +0000 (08:58 +0000)]
Merge "Respond more rapidly to excessive local overshoot."
Marco Paniconi [Tue, 1 Aug 2017 02:48:13 +0000 (02:48 +0000)]
Merge "vp9: Adjust noise estimation for 360p."
Marco [Tue, 1 Aug 2017 00:06:14 +0000 (17:06 -0700)]
vp9: Adjust noise estimation for 360p.
Change-Id: Ib76875232491b14f7114061e8e913e87004427a0
Linfeng Zhang [Mon, 31 Jul 2017 23:36:13 +0000 (16:36 -0700)]
Rewrite vpx_highbd_idct8x8_{12,64}_add_sse2
This replaces commit aa1c4cd, which has a bug and was reverted in
commit 3c73e58.
The bug is caused by rounding -step1[5] in highbd_idct8x8_12_half1d().
Change-Id: I37b3a5f0d91815f2dc570209091dc6626fd178a8
James Zern [Mon, 31 Jul 2017 22:43:41 +0000 (22:43 +0000)]
Merge "highbd_inv_txfm_sse4: make << of neg. val a multiply"
Johann [Thu, 27 Jul 2017 20:25:38 +0000 (13:25 -0700)]
neon: vpx_quantize_b
With skip block or coeff < zbin it is about twice as fast as C.
If most coeff values are > zbin it is about 10-15x as fast as C.
BUG=webm:1426
Change-Id: I5d3c007b014a372d5ef0882b39bb48983b4131c7
YinShiyou [Fri, 23 Jun 2017 08:26:30 +0000 (16:26 +0800)]
loongson mmi configuration patch.
enable loongson mmi optimization: ../configure --enable-mmi
Change-Id: I7792c3adeac1d5b573917d7857bba6c1cc05fea5
Marco Paniconi [Mon, 31 Jul 2017 14:58:15 +0000 (14:58 +0000)]
Merge "Revert "Revert "vp9: Speed feature to adapt partition based on source_sad."""
Marco [Sat, 29 Jul 2017 02:11:53 +0000 (19:11 -0700)]
vp9: Fix denoising condition when pickmode partition is used.
When the superblock partition is based on the nonrd-pickmode,
we need to avoid the denoising. Current condition was based on
the speed level. This change is to make the condition at the
superblock level, as the switch in partitioning may be done at
sb level based on source_sad (e.g., in speed 6).
Change-Id: I12ece4f60b93ed34ee65ff2d6cdce1213c36de04
Jerome Jiang [Mon, 31 Jul 2017 01:57:44 +0000 (18:57 -0700)]
Revert "Revert "vp9: Speed feature to adapt partition based on source_sad.""
This reverts commit c9266b85476aadf078238b7bde3c36bf7953e11c.
Disable source_sad when resolution > 1080P. The test should
pass now.
BUG=webm:1452
Change-Id: I72dde88e66590ff9e41da5e5dd83f5550a83f082
James Zern [Sun, 30 Jul 2017 19:48:28 +0000 (12:48 -0700)]
highbd_inv_txfm_sse4: make << of neg. val a multiply
left shifting a negative value is undefined; quiets a ubsan warning.
this is applied to a constant, no change in the generated code.
Change-Id: I595f0ff7904ef025e07bb80234293d958dc9f254
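A minimal sketch of the substitution described above, assuming a plain int operand: scale_pow2() is a hypothetical name, not a function in highbd_inv_txfm_sse4. In C, left shifting a negative value is undefined, while multiplying by a power of two is defined as long as the product fits in the type.

```c
/* Hypothetical helper (not from libvpx): scale a possibly negative value
 * by 2^bits without shifting a negative operand. */
static int scale_pow2(int value, int bits) {
  /* (1 << bits) shifts a positive constant, which is well defined for
   * small bits; the multiply then carries the sign of value. */
  return value * (1 << bits);
}
```

When the operand is a compile-time constant, as in the commit, a compiler is expected to fold either form to the same value.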
James Zern [Sun, 30 Jul 2017 03:26:10 +0000 (03:26 +0000)]
Merge "Revert "vp9: Speed feature to adapt partition based on source_sad.""
James Zern [Sat, 29 Jul 2017 18:34:57 +0000 (11:34 -0700)]
Revert "vp9: Speed feature to adapt partition based on source_sad."
This reverts commit 064fc570ff8399536563e3846500fd99b273b034.
This causes an assertion failure in vp9_mcomp.c when running
gtest_filter=VP9/MotionVectorTestLarge.OverallTest/41:
`mv->col >= -((1 << (11 + 1 + 2)) - 1) && mv->col < ((1 << (11 + 1 + 2)) - 1)'
Change-Id: I449e777bf18b661cb3f1d82253610c55c51687f6
James Zern [Sat, 29 Jul 2017 18:07:01 +0000 (11:07 -0700)]
Revert "Rewrite vpx_highbd_idct8x8_{12,64}_add_sse2"
This reverts commit aa1c4cd140007ea5b4be99732fbb23d1fd8cf2b5.
This fails the following tests with extreme input coefficients:
SSE2/InvTrans8x8DCT.CompareReference/0
SSE2/InvTrans8x8DCT.CompareReference/2
previously the optimized path was skipped in this range
Change-Id: I9af015a46eba96208834a219fafd651d37556a80
Marco Paniconi [Sat, 29 Jul 2017 01:46:58 +0000 (01:46 +0000)]
Merge "vp9: Adjust logic in source sad for screen content."
Marco Paniconi [Sat, 29 Jul 2017 01:45:19 +0000 (01:45 +0000)]
Merge "vp9: Speed feature to adapt partition based on source_sad."
Jerome Jiang [Fri, 28 Jul 2017 23:34:04 +0000 (16:34 -0700)]
vp9: Adjust logic in source sad for screen content.
Change-Id: I917d106f4c95ea44e413e23881f6303982e1a6a3
Marco [Fri, 28 Jul 2017 17:29:12 +0000 (10:29 -0700)]
vp9: Speed feature to adapt partition based on source_sad.
Move the source_sad feature to speed 6 (from speed 7), and
add speed feature to switch from the variance-based partition
to reference_partition (which uses nonrd-pickmode for bsize selection)
if source_sad is high.
Currently used only for speed 6 for resoln <= 360p.
About 4-5% improvement on 360p in RTC set.
Some speed slowdown, but still ~30% faster than speed 5.
Change-Id: Ib0330ee5fe9fdd2608aed91359a2a339d967491c
Urvang Joshi [Fri, 28 Jul 2017 22:57:22 +0000 (15:57 -0700)]
Remove the DP version of vp9_optimize_b().
The greedy version was already enabled by default here:
https://chromium-review.googlesource.com/c/546848/
And the speed+compression gains from greedy version were already
mentioned here:
https://chromium-review.googlesource.com/c/531675/
Change-Id: Iad9f7d03490c845ad1e230af028c9d39edddca97
Linfeng Zhang [Fri, 28 Jul 2017 22:43:00 +0000 (22:43 +0000)]
Merge changes Ia0e20f5f,I28150789,I35df041b,I221dff34
* changes:
Update vpx_idct16x16_10_add_sse2()
Add vpx_idct16x16_38_add_sse2()
Rewrite vpx_highbd_idct8x8_{12,64}_add_sse2
Refactor highbd idct 4x4 and 8x8 x86 functions
James Zern [Fri, 28 Jul 2017 08:21:28 +0000 (01:21 -0700)]
Revert "quantize ssse3: declare all variables"
This reverts commit 03f5e300d69d368290305e19cc66bac8b0ea1ff8.
This causes test failures under OSX:
SSSE3/VP9QuantizeTest.EOBCheck/0
SSSE3/VP9QuantizeTest.OperationCheck/0
Change-Id: I122732717ead1f7af5b04c529a6948e382e5e59b
Linfeng Zhang [Fri, 21 Jul 2017 21:56:42 +0000 (14:56 -0700)]
Update vpx_idct16x16_10_add_sse2()
Change-Id: Ia0e20f5fa47382af5785221eebb05212b40bd35c
Linfeng Zhang [Thu, 20 Jul 2017 23:53:19 +0000 (16:53 -0700)]
Add vpx_idct16x16_38_add_sse2()
Change-Id: I28150789feadc0b63d2fadc707e48971b41f9898
Linfeng Zhang [Fri, 30 Jun 2017 23:55:17 +0000 (16:55 -0700)]
Rewrite vpx_highbd_idct8x8_{12,64}_add_sse2
BUG=webm:1412
Change-Id: I35df041b757d42278ac7a5cdbd909e8ffcee1455
Linfeng Zhang [Fri, 30 Jun 2017 20:55:38 +0000 (13:55 -0700)]
Refactor highbd idct 4x4 and 8x8 x86 functions
BUG=webm:1412
Change-Id: I221dff34dd5f71b390b5e043d0a137ccb0a01dec
Johann Koenig [Thu, 27 Jul 2017 21:18:35 +0000 (21:18 +0000)]
Merge "quantize ssse3: declare all variables"
Jerome Jiang [Thu, 27 Jul 2017 20:24:08 +0000 (20:24 +0000)]
Merge "vp8: Remove isolated skin & non skin blocks."
Jerome Jiang [Wed, 19 Jul 2017 20:02:53 +0000 (13:02 -0700)]
vp8: Remove isolated skin & non skin blocks.
Neutral on RTC metrics and speed on Pixel.
Change-Id: I26b907483fe133e6e4c1009d147631f0d0e0f2fb
James Zern [Wed, 26 Jul 2017 03:13:49 +0000 (20:13 -0700)]
inv_txfm_{sse2,ssse3}: clear conversion warnings
visual studio reports tran_high_t (int64) -> short in calls to
_mm_set1_epi16
Change-Id: Icb8d1baee77ad3d45edb1477a443d3e648f0b745
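One common way to clear this kind of warning is an explicit narrowing cast at the call site. The sketch below shows only the cast, without the intrinsic itself; to_epi16_arg() is a hypothetical name, and the int64_t parameter stands in for tran_high_t as described in the message.

```c
#include <stdint.h>

/* Hypothetical helper (not from libvpx): narrow a 64-bit constant to the
 * 16-bit type expected by a call such as _mm_set1_epi16. The explicit cast
 * tells the compiler the narrowing is intended; the caller must make sure
 * the value fits in 16 bits. */
static int16_t to_epi16_arg(int64_t c) {
  return (int16_t)c;
}
```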
James Zern [Wed, 26 Jul 2017 03:11:09 +0000 (20:11 -0700)]
highbd_idct*_sse*.c: clear conversion warnings
visual studio reports tran_high_t (int64) -> int in calls to
_mm_setr_epi32
Change-Id: Ic2247c8e3800991202151790d78bd94c4f4aed05