granicus.if.org Git - libvpx/log

commit | commitdiff | tree

Johann [Thu, 3 Aug 2017 17:22:07 +0000 (10:22 -0700)]

quantize: copy ssse3 optimizations to intrinsics

Fairly minor differences from sse2. pabsw and psignw are the big gains.
Also re-uses some values in eob calculation to avoid an extra pcmp.

Fixes test failures in HBD and OS X builds.

Allows using it in 32bit builds, where it is about 40% faster than sse2.

Substantially faster than the assembly for skip_block. 10-20% faster the
rest of the time.

Change-Id: If783bb3567e561e47667e10133b9c84414a334e2
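
For readers unfamiliar with the two instructions named above, here is a minimal scalar sketch in C of what pabsw and psignw do per 16-bit lane. The one-coefficient quantizer is hypothetical and much simpler than the real vpx_quantize_b (no zbin, round, or quant_shift); it only shows the pattern of quantizing the magnitude and then restoring the sign.

```c
#include <stdint.h>

/* Scalar model of pabsw: per-lane absolute value of a 16-bit word.
 * Like the instruction, INT16_MIN is returned unchanged. */
static int16_t abs_word(int16_t x) {
  if (x == INT16_MIN) return INT16_MIN;
  return (int16_t)(x < 0 ? -x : x);
}

/* Scalar model of psignw: negate, zero, or keep `value` depending on
 * whether `control` is negative, zero, or positive. */
static int16_t sign_word(int16_t value, int16_t control) {
  if (control < 0) return (int16_t)-value;
  if (control == 0) return 0;
  return value;
}

/* Hypothetical one-coefficient quantizer (assumes step > 0): work on the
 * magnitude, then copy the sign of the input back onto the result. */
static int16_t quantize_one(int16_t coeff, int16_t step) {
  const int16_t magnitude = abs_word(coeff);
  const int16_t level = (int16_t)(magnitude / step);
  return sign_word(level, coeff);
}
```

In the SSSE3 path each of the two helpers would be a single instruction applied to eight lanes at once, which is presumably where the gain over sse2 comes from.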

commit | commitdiff | tree

Johann Koenig [Fri, 4 Aug 2017 20:34:50 +0000 (20:34 +0000)]

Merge "quantize test: consolidate sizes"

commit | commitdiff | tree

Johann [Wed, 2 Aug 2017 17:24:19 +0000 (10:24 -0700)]

quantize test: consolidate sizes

Pass a max txfm size parameter and combine the base quantize
test with the 32x32 test.

Change-Id: I72ddf020fe6888e864ea9f3642ee2d9a8e48a04b

commit | commitdiff | tree

Scott LaVarnway [Fri, 4 Aug 2017 14:48:46 +0000 (07:48 -0700)]

vpx_dsp: merge avx2 variance files

BUG=webm:1404

Change-Id: Ieb8f85c3811b05df78722cb41eeb1166966ceec4

commit | commitdiff | tree

Kaustubh Raste [Fri, 4 Aug 2017 05:26:56 +0000 (10:56 +0530)]

Fix mips dspr2 6 tap filter clobber list

Change-Id: Ib7c07e6ce00a5c7e59113b16e6661a8369f9e646

commit | commitdiff | tree

Linfeng Zhang [Fri, 4 Aug 2017 01:16:35 +0000 (01:16 +0000)]

Merge "Rewrite vpx_idct16x16_{10,256}_add_sse2() and add case 38 function"

commit | commitdiff | tree

Scott LaVarnway [Thu, 3 Aug 2017 23:17:09 +0000 (23:17 +0000)]

Merge "vpx_dsp: Use correct check for halfpel in"

commit | commitdiff | tree

Linfeng Zhang [Wed, 2 Aug 2017 23:28:13 +0000 (16:28 -0700)]

Rewrite vpx_idct16x16_{10,256}_add_sse2() and add case 38 function

BUG=webm:1412

Change-Id: I945f0fb6807b8948747243794dc7352b959221f7

commit | commitdiff | tree

Linfeng Zhang [Thu, 3 Aug 2017 20:51:02 +0000 (20:51 +0000)]

Merge changes I76727df0,I66297d78,I1d000c6b

* changes:
  Extract inlined 16x16 idct sse2 code into header file
  Add transpose_32bit_8x4() sse2 optimization
  Update x86 idct optimization

commit | commitdiff | tree

Scott LaVarnway [Wed, 2 Aug 2017 19:19:19 +0000 (12:19 -0700)]

vpx_dsp: Use correct check for halfpel in

vpx_sub_pixel_variance32xh_avx2() and
vpx_sub_pixel_avg_variance32xh_avx2

see:
17fae3a Change to use correct check for halfpel

Change-Id: Ib0741c5c2fd011e9650ca62b76009f1b59fdbe4c

commit | commitdiff | tree

Yunqing Wang [Thu, 3 Aug 2017 00:03:10 +0000 (00:03 +0000)]

Merge "Force the bit exactness in the first pass"

commit | commitdiff | tree

Linfeng Zhang [Wed, 2 Aug 2017 23:17:43 +0000 (16:17 -0700)]

Extract inlined 16x16 idct sse2 code into header file

Will be called by high bitdepth functions.

Change-Id: I76727df00941b5a27adceaba8347f275475fcd8c

commit | commitdiff | tree

Linfeng Zhang [Wed, 2 Aug 2017 23:15:58 +0000 (16:15 -0700)]

Add transpose_32bit_8x4() sse2 optimization

Change-Id: I66297d78b38db718cfe3ebb8ea972f5a72c17955

commit | commitdiff | tree

Yunqing Wang [Wed, 2 Aug 2017 22:47:09 +0000 (15:47 -0700)]

Force the bit exactness in the first pass

Originally, to keep the first pass fast, the first-pass stats for
row_mt_mode = 0 and row_mt_mode = 1 were not bit exact, but the
difference was small enough that it did not cause a mismatch between the
final bitstreams. However, if the encoder changes, this minor difference
may cause a mismatch. Thus, this patch always forces the first pass to
be bit exact.

BUG=webm:1453

Change-Id: I2b67cf529dee81f660f9d9e7fe9a60ea3c7b12b8

commit | commitdiff | tree

Johann Koenig [Wed, 2 Aug 2017 21:16:35 +0000 (21:16 +0000)]

Merge "quantize test: add speed comparison"

commit | commitdiff | tree

Marco [Fri, 30 Jun 2017 15:51:31 +0000 (08:51 -0700)]

vp8: Drop due to overshoot for non-screen content.

For 1 pass CBR mode:
Apply the logic for dropping (and re-adjusting rate control)
due to large overshoot to the case of non-screen content when
drop_frames_allowed is enabled.

For the non-screen content case: add additional condition that
rate correction factor is close to minimum state, and flag to
constrain the frequency of the dropping.

Also handle the case of temporal layers and multi-res encoding.
Add some flags/counters to the layer context for temporal layers.
For multi-res: drop due to overshoot is checked on lowest stream,
and if overshoot is detected we force drops on all upper streams
for that frame.

This feature avoids large frame sizes on big content changes
that follow a low-content period.

No change in behavior for screen_content_mode = 2.

Change-Id: I797ab236cbbf3b15cad439e9a227fbebced632e6

commit | commitdiff | tree

Scott LaVarnway [Wed, 2 Aug 2017 19:08:10 +0000 (19:08 +0000)]

Merge "vpxdsp: variance_impl_avx2.c cleanup"

commit | commitdiff | tree

Johann [Thu, 27 Jul 2017 21:14:20 +0000 (14:14 -0700)]

quantize test: add speed comparison

Test some possible scenarios.

Change-Id: I1a612e7153b31756be66390ceea55877856d5a33

commit | commitdiff | tree

Scott LaVarnway [Tue, 25 Jul 2017 20:26:46 +0000 (13:26 -0700)]

vpxdsp: variance_impl_avx2.c cleanup

BUG=webm:1404

Change-Id: I8d8498009e5ef7bf1137e4ff16ec81738a020b02

commit | commitdiff | tree

shiyou yin [Wed, 2 Aug 2017 01:08:43 +0000 (01:08 +0000)]

Merge "loongson mmi configuration patch."

commit | commitdiff | tree

Linfeng Zhang [Tue, 1 Aug 2017 00:46:20 +0000 (17:46 -0700)]

Update x86 idct optimization

Move constant coefficients preparation into inline function.

Change-Id: I1d000c6b161794c8828ff70768439b767e2afea1

commit | commitdiff | tree

Linfeng Zhang [Tue, 1 Aug 2017 21:39:39 +0000 (21:39 +0000)]

Merge "Rewrite vpx_highbd_idct8x8_{12,64}_add_sse2"

commit | commitdiff | tree

Johann Koenig [Tue, 1 Aug 2017 16:44:31 +0000 (16:44 +0000)]

Merge "neon: vpx_quantize_b"

commit | commitdiff | tree

Paul Wilkins [Tue, 1 Aug 2017 08:58:36 +0000 (08:58 +0000)]

Merge "Respond more rapidly to excessive local overshoot."

commit | commitdiff | tree

Marco Paniconi [Tue, 1 Aug 2017 02:48:13 +0000 (02:48 +0000)]

Merge "vp9: Adjust noise estimation for 360p."

commit | commitdiff | tree

Marco [Tue, 1 Aug 2017 00:06:14 +0000 (17:06 -0700)]

vp9: Adjust noise estimation for 360p.

Change-Id: Ib76875232491b14f7114061e8e913e87004427a0

commit | commitdiff | tree

Linfeng Zhang [Mon, 31 Jul 2017 23:36:13 +0000 (16:36 -0700)]

Rewrite vpx_highbd_idct8x8_{12,64}_add_sse2

This replaces commit aa1c4cd, which has a bug and was reverted in
commit 3c73e58.

The bug is caused by rounding -step1[5] in highbd_idct8x8_12_half1d().

Change-Id: I37b3a5f0d91815f2dc570209091dc6626fd178a8
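
A small sketch of why rounding a negated value can be a problem. It assumes a round-half-up shift by 14 bits, similar in spirit to the rounding used in the idct code; the names are hypothetical and this is not claimed to be the exact failing path in highbd_idct8x8_12_half1d().

```c
#include <stdint.h>

#define SKETCH_ROUND_BITS 14 /* assumed scale of the cosine constants */

/* Round-half-up right shift. Assumes an arithmetic right shift for
 * negative inputs, which is what common compilers provide. */
static int64_t round_shift(int64_t x) {
  return (x + ((int64_t)1 << (SKETCH_ROUND_BITS - 1))) >> SKETCH_ROUND_BITS;
}

/* Negate first, then round: an exact half rounds toward zero. */
static int64_t negate_then_round(int64_t x) { return round_shift(-x); }

/* Round first, then negate: an exact half rounds away from zero. */
static int64_t round_then_negate(int64_t x) { return -round_shift(x); }
```

With x = 8192 (exactly one half at this scale) the two orders give 0 and -1, so an optimized path that negates before rounding can differ by one from a reference that negates after.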

commit | commitdiff | tree

James Zern [Mon, 31 Jul 2017 22:43:41 +0000 (22:43 +0000)]

Merge "highbd_inv_txfm_sse4: make << of neg. val a multiply"

commit | commitdiff | tree

Johann [Thu, 27 Jul 2017 20:25:38 +0000 (13:25 -0700)]

neon: vpx_quantize_b

With skip block or coeff < zbin it is about twice as fast as C.

If most coeff values are > zbin it is about 10-15x as fast as C.

BUG=webm:1426

Change-Id: I5d3c007b014a372d5ef0882b39bb48983b4131c7

commit | commitdiff | tree

YinShiyou [Fri, 23 Jun 2017 08:26:30 +0000 (16:26 +0800)]

loongson mmi configuration patch.

enable loongson mmi optimization: ../configure --enable-mmi

Change-Id: I7792c3adeac1d5b573917d7857bba6c1cc05fea5

commit | commitdiff | tree

Marco Paniconi [Mon, 31 Jul 2017 14:58:15 +0000 (14:58 +0000)]

Merge "Revert "Revert "vp9: Speed feature to adapt partition based on source_sad."""

commit | commitdiff | tree

Marco [Sat, 29 Jul 2017 02:11:53 +0000 (19:11 -0700)]

vp9: Fix denoising condition when pickmode partition is used.

When the superblock partition is based on the nonrd-pickmode,
we need to avoid the denoising. Current condition was based on
the speed level. This change is to make the condition at the
superblock level, as the switch in partitioning may be done at
sb level based on source_sad (e.g., in speed 6).

Change-Id: I12ece4f60b93ed34ee65ff2d6cdce1213c36de04

commit | commitdiff | tree

Jerome Jiang [Mon, 31 Jul 2017 01:57:44 +0000 (18:57 -0700)]

Revert "Revert "vp9: Speed feature to adapt partition based on source_sad.""

This reverts commit c9266b85476aadf078238b7bde3c36bf7953e11c.

Disable source_sad when resolution > 1080P. The test should
pass now.

BUG=webm:1452

Change-Id: I72dde88e66590ff9e41da5e5dd83f5550a83f082

commit | commitdiff | tree

James Zern [Sun, 30 Jul 2017 19:48:28 +0000 (12:48 -0700)]

highbd_inv_txfm_sse4: make << of neg. val a multiply

left shifting a negative value is undefined; quiets a ubsan warning.
this is applied to a constant, no change in the generated code.

Change-Id: I595f0ff7904ef025e07bb80234293d958dc9f254
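
A minimal illustration of the replacement described above, using a hypothetical helper. Left shifting a negative signed value is undefined in C, while multiplying by a power of two is well defined as long as the result fits.

```c
/* Scales `value` by 2^shift without shifting a possibly negative value.
 * Assumes 0 <= shift < 31 and that the product fits in int.
 * Undefined alternative for negative input: value << shift. */
static int scale_by_pow2(int value, int shift) {
  return value * (1 << shift);
}
```

When the operand is a compile-time constant, as in the commit, a compiler will typically fold either form to the same constant.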

commit | commitdiff | tree

James Zern [Sun, 30 Jul 2017 03:26:10 +0000 (03:26 +0000)]

Merge "Revert "vp9: Speed feature to adapt partition based on source_sad.""

commit | commitdiff | tree

James Zern [Sat, 29 Jul 2017 18:34:57 +0000 (11:34 -0700)]

Revert "vp9: Speed feature to adapt partition based on source_sad."

This reverts commit 064fc570ff8399536563e3846500fd99b273b034.

This causes an assertion failure in vp9_mcomp.c when running
gtest_filter=VP9/MotionVectorTestLarge.OverallTest/41:
`mv->col >= -((1 << (11 + 1 + 2)) - 1) && mv->col < ((1 << (11 + 1 + 2))
- 1)'

Change-Id: I449e777bf18b661cb3f1d82253610c55c51687f6
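
The assertion quoted above bounds a motion vector component by (1 << (11 + 1 + 2)) - 1 = 16383. A sketch of the same check as a standalone predicate, with hypothetical names:

```c
#define MV_RANGE_BITS_SKETCH (11 + 1 + 2) /* exponent taken from the assertion text */

/* Returns 1 when `component` satisfies the quoted assertion:
 * -16383 <= component < 16383. */
static int mv_component_in_range(int component) {
  const int limit = (1 << MV_RANGE_BITS_SKETCH) - 1;
  return component >= -limit && component < limit;
}
```

Note that the bound is asymmetric: -16383 passes but +16383 does not.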

commit | commitdiff | tree

James Zern [Sat, 29 Jul 2017 18:07:01 +0000 (11:07 -0700)]

Revert "Rewrite vpx_highbd_idct8x8_{12,64}_add_sse2"

This reverts commit aa1c4cd140007ea5b4be99732fbb23d1fd8cf2b5.

This fails the following tests with extreme input coefficients:
SSE2/InvTrans8x8DCT.CompareReference/0
SSE2/InvTrans8x8DCT.CompareReference/2

previously the optimized path was skipped in this range

Change-Id: I9af015a46eba96208834a219fafd651d37556a80

commit | commitdiff | tree

Marco Paniconi [Sat, 29 Jul 2017 01:46:58 +0000 (01:46 +0000)]

Merge "vp9: Adjust logic in source sad for screen content."

commit | commitdiff | tree

Marco Paniconi [Sat, 29 Jul 2017 01:45:19 +0000 (01:45 +0000)]

Merge "vp9: Speed feature to adapt partition based on source_sad."

commit | commitdiff | tree

Jerome Jiang [Fri, 28 Jul 2017 23:34:04 +0000 (16:34 -0700)]

vp9: Adjust logic in source sad for screen content.

Change-Id: I917d106f4c95ea44e413e23881f6303982e1a6a3

commit | commitdiff | tree

Marco [Fri, 28 Jul 2017 17:29:12 +0000 (10:29 -0700)]

vp9: Speed feature to adapt partition based on source_sad.

Move the source_sad feature to speed 6 (from speed 7), and
add speed feature to switch from the variance-based partition
to reference_partition (which uses nonrd-pickmode for bsize selection)
if source_sad is high.

Currently used only for speed 6 for resoln <= 360p.
About 4-5% improvement on 360p in RTC set.
Some speed slowdown, but still ~30% faster than speed 5.

Change-Id: Ib0330ee5fe9fdd2608aed91359a2a339d967491c

commit | commitdiff | tree

Urvang Joshi [Fri, 28 Jul 2017 22:57:22 +0000 (15:57 -0700)]

Remove the DP version of vp9_optimize_b().

The greedy version was already enabled by default here:
https://chromium-review.googlesource.com/c/546848/

And the speed+compression gains from greedy version were already
mentioned here:
https://chromium-review.googlesource.com/c/531675/

Change-Id: Iad9f7d03490c845ad1e230af028c9d39edddca97

commit | commitdiff | tree

Linfeng Zhang [Fri, 28 Jul 2017 22:43:00 +0000 (22:43 +0000)]

Merge changes Ia0e20f5f,I28150789,I35df041b,I221dff34

* changes:
  Update vpx_idct16x16_10_add_sse2()
  Add vpx_idct16x16_38_add_sse2()
  Rewrite vpx_highbd_idct8x8_{12,64}_add_sse2
  Refactor highbd idct 4x4 and 8x8 x86 functions

commit | commitdiff | tree

James Zern [Fri, 28 Jul 2017 08:21:28 +0000 (01:21 -0700)]

Revert "quantize ssse3: declare all variables"

This reverts commit 03f5e300d69d368290305e19cc66bac8b0ea1ff8.

This causes test failures under OSX:
SSSE3/VP9QuantizeTest.EOBCheck/0
SSSE3/VP9QuantizeTest.OperationCheck/0

Change-Id: I122732717ead1f7af5b04c529a6948e382e5e59b

commit | commitdiff | tree

Linfeng Zhang [Fri, 21 Jul 2017 21:56:42 +0000 (14:56 -0700)]

Update vpx_idct16x16_10_add_sse2()

Change-Id: Ia0e20f5fa47382af5785221eebb05212b40bd35c

commit | commitdiff | tree

Linfeng Zhang [Thu, 20 Jul 2017 23:53:19 +0000 (16:53 -0700)]

Add vpx_idct16x16_38_add_sse2()

Change-Id: I28150789feadc0b63d2fadc707e48971b41f9898

commit | commitdiff | tree

Linfeng Zhang [Fri, 30 Jun 2017 23:55:17 +0000 (16:55 -0700)]

Rewrite vpx_highbd_idct8x8_{12,64}_add_sse2

BUG=webm:1412

Change-Id: I35df041b757d42278ac7a5cdbd909e8ffcee1455

commit | commitdiff | tree

Linfeng Zhang [Fri, 30 Jun 2017 20:55:38 +0000 (13:55 -0700)]

Refactor highbd idct 4x4 and 8x8 x86 functions

BUG=webm:1412

Change-Id: I221dff34dd5f71b390b5e043d0a137ccb0a01dec

commit | commitdiff | tree

Johann Koenig [Thu, 27 Jul 2017 21:18:35 +0000 (21:18 +0000)]

Merge "quantize ssse3: declare all variables"

commit | commitdiff | tree

Jerome Jiang [Thu, 27 Jul 2017 20:24:08 +0000 (20:24 +0000)]

Merge "vp8: Remove isolated skin & non skin blocks."

commit | commitdiff | tree

Jerome Jiang [Wed, 19 Jul 2017 20:02:53 +0000 (13:02 -0700)]

vp8: Remove isolated skin & non skin blocks.

Neutral on RTC metrics and speed on Pixel.

Change-Id: I26b907483fe133e6e4c1009d147631f0d0e0f2fb

commit | commitdiff | tree

James Zern [Wed, 26 Jul 2017 03:13:49 +0000 (20:13 -0700)]

inv_txfm_{sse2,ssse3}: clear conversion warnings

visual studio reports tran_high_t (int64) -> short in calls to
_mm_set1_epi16

Change-Id: Icb8d1baee77ad3d45edb1477a443d3e648f0b745

commit | commitdiff | tree

James Zern [Wed, 26 Jul 2017 03:11:09 +0000 (20:11 -0700)]

highbd_idct*_sse*.c: clear conversion warnings

visual studio reports tran_high_t (int64) -> int in calls to
_mm_setr_epi32

Change-Id: Ic2247c8e3800991202151790d78bd94c4f4aed05

commit | commitdiff | tree

James Zern [Tue, 25 Jul 2017 23:40:21 +0000 (16:40 -0700)]

vpx_variance16x16_sse2: correct cast order

allow the right shift to operate on 64-bits, this matches the rest of
the implementations

previously:
b0f1ae147 vpx_get16x16var_avx2: correct cast order

Change-Id: I632ee5e418f3f9b30e79ecd05588eb172b0783aa
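
A sketch of the cast order described above, using a hypothetical helper that takes an already computed sse and sum for a 16x16 block (256 pixels, hence the shift by 8). The square of the sum is formed and shifted in 64 bits and only narrowed afterwards.

```c
#include <stdint.h>

/* variance = sse - sum^2 / 256, with the multiply and the right shift
 * both performed on a 64-bit value before narrowing to 32 bits. */
static uint32_t variance16x16_from_sums(uint32_t sse, int sum) {
  return sse - (uint32_t)(((int64_t)sum * sum) >> 8);
}
```

For example, an all-255 block against an all-0 block has sum = 65280 and sse = 16646400, and sum * sum no longer fits in a signed 32-bit int, which is why the widening has to happen before the multiply.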

commit | commitdiff | tree

James Zern [Mon, 24 Jul 2017 23:29:44 +0000 (16:29 -0700)]

vpx_get16x16var_avx2: correct cast order

allow the right shift to operate on 64-bits, this matches the rest of
the implementations

missed in:
6acd061aa variance_avx2: sync variance functions with c-code

Change-Id: Icae436b881251ccb9f9ed64fcbf8d358c58a4617

commit | commitdiff | tree

James Zern [Sat, 22 Jul 2017 20:01:49 +0000 (13:01 -0700)]

set_var_thresh_from_histogram: prevent negative variance

For 8-bit the subtrahend is small enough to fit into uint32_t.

For 10/12-bit apply:
63a37d16f Prevent negative variance

previously:
47b9a0912 Resolve -Wshorten-64-to-32 in highbd variance.
c0241664a Resolve -Wshorten-64-to-32 in variance.

Change-Id: I181c85f0b9a03da37c2e8b89482d48aa3dbc0aee
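
A minimal sketch of the clamping idea, assuming (as in the high bitdepth variance code, to my understanding) that sse and sum may have been rounded independently, so the subtrahend can exceed sse. The function name and parameters are hypothetical.

```c
#include <stdint.h>

/* Computes sse - sum^2 / 2^log2_count in 64 bits and clamps a negative
 * result to zero instead of letting it wrap around as unsigned. */
static uint32_t variance_clamped(uint32_t sse, int64_t sum, int log2_count) {
  const int64_t var = (int64_t)sse - ((sum * sum) >> log2_count);
  return var >= 0 ? (uint32_t)var : 0;
}
```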

commit | commitdiff | tree

Marco [Thu, 20 Jul 2017 20:43:55 +0000 (13:43 -0700)]

vp8: Fix compile warning in vp8_multi_resolution_encoder.c

Change-Id: I49c960179dfc1902aa5e5c99915789878c06bc3d

commit | commitdiff | tree

Johann Koenig [Thu, 20 Jul 2017 19:46:05 +0000 (19:46 +0000)]

Merge "quantize test: promote RandRange() result to signed"

commit | commitdiff | tree

Johann Koenig [Thu, 20 Jul 2017 19:45:59 +0000 (19:45 +0000)]

Merge "quantize test: lowbd functions do not pass in highbd"

commit | commitdiff | tree

Jerome Jiang [Thu, 20 Jul 2017 16:58:01 +0000 (16:58 +0000)]

Merge "vp9: Removed unused skin detection function."

commit | commitdiff | tree

Johann [Wed, 19 Jul 2017 21:33:00 +0000 (14:33 -0700)]

quantize test: promote RandRange() result to signed

Avoid unsigned overflow warning:
unsigned integer overflow: 19974 - 32703 cannot be represented in type
'unsigned int'

Change-Id: Ifebee014342e4c6f3b53306c0cad6ae0b465ac12
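
The warning above comes from subtracting two unsigned values where the second is larger. A sketch of the fix with a hypothetical helper: convert both operands to a signed type before subtracting (this assumes both values fit in int).

```c
/* Signed difference of two unsigned random values; each operand is
 * promoted to int first so a negative result is representable. */
static int signed_difference(unsigned int a, unsigned int b) {
  return (int)a - (int)b;
}
```

With the values from the warning, 19974 - 32703 evaluates to -12729 instead of wrapping around.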

commit | commitdiff | tree

Johann [Wed, 19 Jul 2017 21:20:13 +0000 (14:20 -0700)]

quantize test: lowbd functions do not pass in highbd

qcoeff output looks OK but dqcoeff is no good.

BUG=webm:1448

Change-Id: I07211db8a8b74f1f45fdd059852e2de0e5ee18fd

commit | commitdiff | tree

Johann Koenig [Thu, 20 Jul 2017 15:17:26 +0000 (15:17 +0000)]

Merge "quantize test: eob is output"

commit | commitdiff | tree

Johann Koenig [Wed, 19 Jul 2017 21:35:57 +0000 (21:35 +0000)]

Merge "Earmark extra space for VSX."

commit | commitdiff | tree

Jerome Jiang [Wed, 19 Jul 2017 21:30:21 +0000 (21:30 +0000)]

Merge "Roll libwebm: Fix android build failure with NDK r15b."

commit | commitdiff | tree

Johann [Tue, 18 Jul 2017 21:20:14 +0000 (14:20 -0700)]

quantize test: eob is output

eob values are generated by the function.

Change-Id: I8ce92100e83022bff99888a5a7e6ef378c49fda3

commit | commitdiff | tree

Han Shen [Wed, 12 Jul 2017 19:56:19 +0000 (12:56 -0700)]

Earmark extra space for VSX.

Backend specific optimization for PPC VSX reads 16 bytes, whereas arm neon /
sse2 only read <= 8 bytes. Although the extra bytes read are never actually
used, that does not justify reading past the end of the buffer. Fixed by
allocating more when building for VSX. This was reported by asan.

Also note - PPC does have an instruction that loads 64-bit content from
memory: lxsdx loads one 64-bit doubleword (whereas lxvd2x loads two 64-bit
doublewords). However, we only have "vec_vsx_ld" builtins that map to lxvd2x,
and no builtins for lxsdx. The only way to access lxsdx is through inline
assembly, which does not fit well in the original paradigm.

Refer:
  vsx:
    vpx_tm_predictor_4x4_vsx @ third_party/libvpx/git_root/vpx_dsp/ppc/intrapred_vsx.c
  neon:
    vpx_tm_predictor_4x4_neon @ third_party/libvpx/git_root/vpx_dsp/arm/intrapred_neon_asm.asm
  sse2:
    tm_predictor_4x4 @ third_party/libvpx/git_root/vpx_dsp/x86/intrapred_sse2.asm

BUG=b/63112600

Tested:
  asan tests passed.

Change-Id: I5f74b56e35c05b67851de8b5530aece213f2ce9d

commit | commitdiff | tree

Johann Koenig [Wed, 19 Jul 2017 20:34:13 +0000 (20:34 +0000)]

Merge "variance: call C comp_avg_pred"

commit | commitdiff | tree

Jerome Jiang [Mon, 17 Jul 2017 20:59:14 +0000 (13:59 -0700)]

Roll libwebm: Fix android build failure with NDK r15b.

BUG=webm:1447

Change-Id: I8defe45cb94eb9c209ba72ce446786f24c14c0b8

commit | commitdiff | tree

Jerome Jiang [Tue, 18 Jul 2017 21:52:04 +0000 (14:52 -0700)]

vp9: Removed unused skin detection function.

Change-Id: I6702b7b11aa4ac9aac5fd54deef4377cdcb29c64

commit | commitdiff | tree

Jerome Jiang [Tue, 18 Jul 2017 21:30:04 +0000 (21:30 +0000)]

Merge "vp9: Allocate alt-ref in denoiser for SVC."

commit | commitdiff | tree

Jerome Jiang [Tue, 18 Jul 2017 20:48:32 +0000 (20:48 +0000)]

Merge "vp9: Remove isolated skin & non-skin blocks."

commit | commitdiff | tree

Johann Koenig [Tue, 18 Jul 2017 20:42:39 +0000 (20:42 +0000)]

Merge changes I62c2e313,Ibd7a0337,I94e1d886

* changes:
  quantize test: test sse2 and avx optimizations
  quantize test: extend arrays
  quantize test: restrict and correct input

commit | commitdiff | tree

Johann [Fri, 14 Jul 2017 18:29:32 +0000 (11:29 -0700)]

variance: call C comp_avg_pred

Keep optimized code out of the reference implementation. This matches
the style of the other sub calls.

Change-Id: I3da6acd4f2c647b029c420e22ac9410a18259689

commit | commitdiff | tree

Jerome Jiang [Mon, 17 Jul 2017 23:29:16 +0000 (16:29 -0700)]

vp9: Allocate alt-ref in denoiser for SVC.

When SVC is used, allocate alt-ref in denoiser.

Change-Id: I1b17221b55b9444cd23b97d481b54ff8d296d857

commit | commitdiff | tree

Johann [Tue, 18 Jul 2017 19:32:57 +0000 (12:32 -0700)]

quantize ssse3: declare all variables

Copy missing line from avx implementation.

Change-Id: I9755c5b4d4034867de6fa9f741c24bf49dce3a27

commit | commitdiff | tree

Johann [Tue, 18 Jul 2017 17:06:23 +0000 (10:06 -0700)]

quantize test: test sse2 and avx optimizations

ssse3 does not pass either of the tests.

avx 32x32 does not pass.

Change-Id: I62c2e31336fd2327327afaa0da896ad79a3def44

commit | commitdiff | tree

Jerome Jiang [Tue, 11 Jul 2017 18:31:01 +0000 (11:31 -0700)]

vp9: Remove isolated skin & non-skin blocks.

0.007% regression on rtc and 0.004% gain on rtc_derf.
1 thread on QVGA,VGA and HD has ~0.2% speed regression while 2 threads has
~0.2% speed gain on Google Pixel.

Change-Id: Ia4a6ec904df670d7001e35e070b01e34149d23dc

commit | commitdiff | tree

Johann [Tue, 18 Jul 2017 16:55:45 +0000 (09:55 -0700)]

quantize test: extend arrays

Officially the quant structures are 8 elements, with one dc element and
7 repeated ac elements. The low bit depth optimizations take advantage
of this to fill the xmm registers. The high bit depth version manually
duplicates the values.

If all the optimizations were unified, the structure sizes could be
greatly reduced.

Change-Id: Ibd7a0337a7832ce2a1a05ee433c310077e1059ae
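
A sketch of the layout described above: one dc value followed by seven copies of the ac value, so that eight 16-bit entries fill one xmm register. The helper and macro names are hypothetical and this is not the test's actual setup code.

```c
#include <stdint.h>

#define QUANT_ENTRIES_SKETCH 8 /* 8 x 16 bits = one 128-bit register */

/* Fills a quant row as {dc, ac, ac, ac, ac, ac, ac, ac}. */
static void fill_quant_row(int16_t row[QUANT_ENTRIES_SKETCH], int16_t dc,
                           int16_t ac) {
  row[0] = dc;
  for (int i = 1; i < QUANT_ENTRIES_SKETCH; ++i) row[i] = ac;
}
```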

commit | commitdiff | tree

Johann [Tue, 18 Jul 2017 16:40:45 +0000 (09:40 -0700)]

quantize test: restrict and correct input

Use only valid values for quantize inputs. These were determined by
looping over vp9_init_quantizer and looking for max and min values.

This allows extending the test to the low bit depth functions which were
not designed to handle all possible inputs but only valid inputs.

Change-Id: I94e1d8863a49ac227845b65c6b50130e10e6319e

commit | commitdiff | tree

Marco [Tue, 18 Jul 2017 16:15:13 +0000 (09:15 -0700)]

vp9: Disable usage of sb_use_mv_part for SVC.

To fix valgrind issues with SVC tests.
SVC encoding uses prune_evenmore, which is causing an uninitialized value.

Will re-enable later when issue is resolved.

Change-Id: I257ff878cf78197ddd813db056582a4d5fe94f44

commit | commitdiff | tree

Marco [Mon, 17 Jul 2017 23:04:04 +0000 (16:04 -0700)]

vp9: Fix to setting content_state for real-time mode.

When content_state_sb is set to LowVarHighSumdiff, don't reset
it to VeryHighSad. Visually better on clips with strong lighting changes.

Small/negligible change in RTC metrics and speed.

Change-Id: I20c383e3c4cf8d1149de5f9260449c0b7cf7c6aa

commit | commitdiff | tree

Marco [Thu, 13 Jul 2017 21:49:39 +0000 (14:49 -0700)]

vp9: Reuse motion from choose_partitioning in NEWMV search.

When int_pro_motion_estimation is done for superblock in
choose_partitioning, use it to avoid the full_pixel_search
for NEWMV mode, if bsize is >= 32X32.

For speed > 7.
Small/neutral change on RTC metrics.
~1-2% speedup on arm on high motion clip.

Change-Id: I3cfe6833ff4bf75d4afa83eaf058ad45729de85b

commit | commitdiff | tree

James Zern [Sat, 15 Jul 2017 18:37:10 +0000 (18:37 +0000)]

Merge "fix 'make exampletest' w/CONFIG_REALTIME_ONLY"

commit | commitdiff | tree

Jerome Jiang [Fri, 14 Jul 2017 20:45:33 +0000 (13:45 -0700)]

vp9: Compute skin only for blocks eligible for noise estimation.

Change-Id: Iddcb83a5968db57cfd312c5bc44b2a226a2a3264

commit | commitdiff | tree

Marco [Thu, 13 Jul 2017 23:09:11 +0000 (16:09 -0700)]

vp9: Adjust minmax threshold for variance partitioning.

Only affects speed 7. Improvement on high motion clips.

Change-Id: Ibddb68fed9c63207df29ffd790f9205b1cecf687

commit | commitdiff | tree

Johann [Thu, 13 Jul 2017 16:14:37 +0000 (09:14 -0700)]

quantize test: use Buffer

Although the low bitdepth functions are identical (excepting the need
for larger intermediate values) they do not pass these tests. This
improves the error output to aid debugging.

Simplify buffer usage with Buffer and remove unnecessarily aligned
variables.

eob is a single element and never written using aligned instructions.

BUG=webm:1426

Change-Id: Ic95789a135cf1e8a3846d85270f2b818f6ec7e35

commit | commitdiff | tree

James Zern [Thu, 13 Jul 2017 17:47:20 +0000 (10:47 -0700)]

fix 'make exampletest' w/CONFIG_REALTIME_ONLY

for tests that aren't explicitly testing 2-pass behavior use --passes=1
with this configuration

Change-Id: I6a1520ecc65d0f626486604310af29dacb9f197f

commit | commitdiff | tree

James Zern [Wed, 12 Jul 2017 23:30:04 +0000 (23:30 +0000)]

Merge "remove vp9_firstpass.c w/CONFIG_REALTIME_ONLY"

commit | commitdiff | tree

Johann Koenig [Wed, 12 Jul 2017 20:15:00 +0000 (20:15 +0000)]

Merge "sad4d neon: 64x[32,64]"

commit | commitdiff | tree

Marco Paniconi [Wed, 12 Jul 2017 19:13:05 +0000 (19:13 +0000)]

Merge "vp9: Fix to SVC and denoising for fixed pattern case."

commit | commitdiff | tree

Johann Koenig [Wed, 12 Jul 2017 15:01:30 +0000 (15:01 +0000)]

Merge changes Ibf5e61dc,I44b48512,I7de2500c,I5081b5ce

* changes:
  sad4d neon: 32x[16,32,64]
  sad4d neon: 16x[8,16,32]
  sad4d neon: 8x[4,8,16]
  sad4d neon: 4x4, 4x8

commit | commitdiff | tree

Johann [Tue, 11 Jul 2017 16:15:09 +0000 (09:15 -0700)]

sad4d neon: 64x[32,64]

Rewrite 64x64.

BUG=webm:1425

Change-Id: I336bf5a3aa4b783389c10b16a50f0f559346ecbf

commit | commitdiff | tree

Johann [Tue, 11 Jul 2017 14:39:28 +0000 (07:39 -0700)]

sad4d neon: 32x[16,32,64]

Rewrite 32x32. Use half the accumulator registers.

BUG=webm:1425

Change-Id: Ibf5e61dc4ba15056102aef8495f4a02c668c5d13

commit | commitdiff | tree

Johann [Tue, 11 Jul 2017 14:22:26 +0000 (07:22 -0700)]

sad4d neon: 16x[8,16,32]

Rewrite 16x16. Use half the accumulator registers.

BUG=webm:1425

Change-Id: I44b48512b1e3629505d83c2645e800f53878ccc2

commit | commitdiff | tree

Johann [Tue, 11 Jul 2017 14:01:12 +0000 (07:01 -0700)]

sad4d neon: 8x[4,8,16]

BUG=webm:1425

Change-Id: I7de2500cca4b621f21478c4b0333c56d76dbc9a4

commit | commitdiff | tree

Johann [Tue, 11 Jul 2017 12:44:23 +0000 (05:44 -0700)]

sad4d neon: 4x4, 4x8

BUG=webm:1425

Change-Id: I5081b5ce131821d590c53ac1206a94f50cb8b468
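
For context, a sad4d function computes the sum of absolute differences between one source block and four reference blocks in a single call. Below is a plain C sketch for the 4x4 case with hypothetical names; it mirrors the general idea only and is not the NEON code from this series.

```c
#include <stdint.h>

/* Sum of absolute differences between two 4x4 blocks of 8-bit pixels. */
static uint32_t sad4x4_sketch(const uint8_t *src, int src_stride,
                              const uint8_t *ref, int ref_stride) {
  uint32_t sad = 0;
  for (int y = 0; y < 4; ++y) {
    for (int x = 0; x < 4; ++x) {
      const int a = src[y * src_stride + x];
      const int b = ref[y * ref_stride + x];
      sad += (uint32_t)(a > b ? a - b : b - a);
    }
  }
  return sad;
}

/* Compares one source block against four references that share a stride
 * and writes the four results to sad_out. */
static void sad4x4x4d_sketch(const uint8_t *src, int src_stride,
                             const uint8_t *const refs[4], int ref_stride,
                             uint32_t sad_out[4]) {
  for (int i = 0; i < 4; ++i) {
    sad_out[i] = sad4x4_sketch(src, src_stride, refs[i], ref_stride);
  }
}
```

A SIMD version can load each source row once and reuse it for all four references, which is the main advantage of the 4d form over four separate SAD calls.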

commit | commitdiff | tree

Urvang Joshi [Wed, 12 Jul 2017 00:08:56 +0000 (00:08 +0000)]

Merge "Remove the token state array from greedy optimize_b."

commit | commitdiff | tree

James Zern [Sat, 8 Jul 2017 04:42:44 +0000 (21:42 -0700)]

remove vp9_firstpass.c w/CONFIG_REALTIME_ONLY

BUG=webm:1446

Change-Id: I6e0ea9342c715d354c641109737172afa649b85b

commit | commitdiff | tree

Urvang Joshi [Tue, 11 Jul 2017 20:05:29 +0000 (13:05 -0700)]

Remove the token state array from greedy optimize_b.

Reduces memory usage, and speeds up encoding for some difficult clips.
No impact on output or metrics.

Ported from aomedia patch:
https://aomedia-review.googlesource.com/c/14501

Change-Id: I26ec69af8336f9e80da486a1cfbfc89a3596954d
