granicus.if.org Git - libvpx/log
libvpx
7 years ago vp8: [loongson] optimize copymem with mmi
Shiyou Yin [Wed, 6 Sep 2017 09:57:16 +0000 (17:57 +0800)]
vp8: [loongson] optimize copymem with mmi

1. vp8_copy_mem16x16_mmi
2. vp8_copy_mem8x8_mmi
3. vp8_copy_mem8x4_mmi

Change-Id: I3de29a11fa7402df0e48bbb944440b1e66498a65

7 years ago Merge "vp9: Modify pickmode early exit for ARF in 1pass."
Marco Paniconi [Thu, 21 Sep 2017 01:33:12 +0000 (01:33 +0000)]
Merge "vp9: Modify pickmode early exit for ARF in 1pass."

7 years ago vp9: Modify pickmode early exit for ARF in 1pass.
Marco [Wed, 20 Sep 2017 21:55:31 +0000 (14:55 -0700)]
vp9: Modify pickmode early exit for ARF in 1pass.

Add the condition frames_since_golden > 0 to the
early exit check for ARF usage in nonrd_pickmode.
This improves the quality of the first frame following an ARF, where
frames_since_golden = 0.

Small/neutral gain in metrics for speed 6, neutral change in speed.

Only affects when USE_ALTREF_FOR_ONE_PASS is enabled.

Change-Id: I82e73e6ff6fc849e5ca5448563cb8a0515fe0cdc
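
A minimal sketch of the guard described above. Only the `frames_since_golden > 0` term comes from the commit message; the struct, the function name, and the `other_conditions` placeholder are hypothetical and are not taken from the libvpx source.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the rate-control state. Only the
 * frames_since_golden counter is named in the commit message. */
typedef struct {
  int frames_since_golden;
} SketchRateControl;

/* Returns true when the ARF early exit may be taken.
 * `other_conditions` stands for whatever checks already existed
 * (they are not shown in the log). The added term keeps the first
 * frame after an ARF, where the counter is 0, from exiting early. */
static bool sketch_allow_arf_early_exit(const SketchRateControl *rc,
                                        bool other_conditions) {
  return other_conditions && rc->frames_since_golden > 0;
}
```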

7 years ago Merge "Bug fix: fadst4() in vp9/encoder/vp9_dct.c"
James Zern [Wed, 20 Sep 2017 21:12:45 +0000 (21:12 +0000)]
Merge "Bug fix: fadst4() in vp9/encoder/vp9_dct.c"

7 years ago Bug fix: fadst4() in vp9/encoder/vp9_dct.c
Linfeng Zhang [Wed, 20 Sep 2017 16:18:04 +0000 (09:18 -0700)]
Bug fix: fadst4() in vp9/encoder/vp9_dct.c

A new bug was introduced in a80bdfd "Change sinpi_{1,2,3,4}_9 from
tran_high_t to int16_t". Reverted the change in this file.

BUG=webm:1450

Failed test C/TransHT.AccuracyCheck/26.

Change-Id: Id001f57aad811803ef7d367d2b2bc008d8499991

7 years ago Merge "vp9: Modify simple_block_yrd condition for SVC"
Marco Paniconi [Wed, 20 Sep 2017 16:42:31 +0000 (16:42 +0000)]
Merge "vp9: Modify simple_block_yrd condition for SVC"

7 years ago Merge "vpxdsp: [x86] add highbd_d63_predictor functions"
Scott LaVarnway [Wed, 20 Sep 2017 11:39:28 +0000 (11:39 +0000)]
Merge "vpxdsp: [x86] add highbd_d63_predictor functions"

7 years ago temporal_filter_apply_sse2.asm: add ':' to label
James Zern [Wed, 20 Sep 2017 01:59:11 +0000 (18:59 -0700)]
temporal_filter_apply_sse2.asm: add ':' to label

quiets nasm warning:
label alone on a line without a colon might be in error

BUG=webm:1462

Change-Id: I660407ca60e8c9a810dba9d76afb65852029a29c

7 years ago vpxdsp: [x86] add highbd_d63_predictor functions
Scott LaVarnway [Wed, 13 Sep 2017 01:01:31 +0000 (18:01 -0700)]
vpxdsp: [x86] add highbd_d63_predictor functions

C vs SSE2 speed gains:
_4x4 : ~2.94x

C vs SSSE3 speed gains:
_8x8 : ~8.69x
_16x16 : ~6.32x
_32x32 : ~5.33x

BUG=webm:1411

Change-Id: I2c35b527eac2229f17aaa9d118fb601e7195efe4

7 years ago vp9: Modify simple_block_yrd condition for SVC
Marco [Tue, 19 Sep 2017 22:19:41 +0000 (15:19 -0700)]
vp9: Modify simple_block_yrd condition for SVC

Modify simple_block_yrd condition in nonrd_pickmode for SVC:
allow it to be used also on base temporal_layer, only when
spatial_layer > 1 and block size < 32x32.

Speed up of about ~2% for 3 layer SVC, with little/negligible
loss in quality.

Change-Id: I7734bdae51cf51f22b96f6b2b27da20ea1d84344

7 years ago Merge "Add datarate test for frame_parallel_decoding mode off."
Marco Paniconi [Tue, 19 Sep 2017 22:31:08 +0000 (22:31 +0000)]
Merge "Add datarate test for frame_parallel_decoding mode off."

7 years ago vp9: Fix condition for limiting ARF 1 pass vbr.
Marco [Tue, 19 Sep 2017 18:00:40 +0000 (11:00 -0700)]
vp9: Fix condition for limiting ARF 1 pass vbr.

Fix the setting to frames_till_gf_update_due, and
adjust the limit value.
Only affects when USE_ALTREF_FOR_ONE_PASS is enabled.

Neutral change to metrics and speed for ytlive.

Change-Id: I266d9a00b36221bc8602fa2746d4e8a8f7d4dfae

7 years ago Merge "vp9: Adjustments for ARF usage in 1 pass vbr."
Marco Paniconi [Tue, 19 Sep 2017 16:29:19 +0000 (16:29 +0000)]
Merge "vp9: Adjustments for ARF usage in 1 pass vbr."

7 years ago vp9: Adjustments for ARF usage in 1 pass vbr.
Marco [Tue, 19 Sep 2017 00:30:49 +0000 (17:30 -0700)]
vp9: Adjustments for ARF usage in 1 pass vbr.

Only when USE_ALTREF_FOR_ONE_PASS is enabled (off by default).
Force fixed partition to 64x64 when is_src_alt_ref_frame is true,
and don't force early exit for some modes in nonrd_pickmode
for ARF noshow frames.

Small gain ~0.2% on ytlive metrics for speed 6.
Neutral speed difference.

Change-Id: I27eb6622d0453c09a06ccdc3b16368762474d11d

7 years ago Change sinpi_{1,2,3,4}_9 from tran_high_t to int16_t
Linfeng Zhang [Tue, 12 Sep 2017 22:24:54 +0000 (15:24 -0700)]
Change sinpi_{1,2,3,4}_9 from tran_high_t to int16_t

Add "typedef int16_t tran_coef_t;"

BUG=webm:1450

Change-Id: I67866f104898d1dda8989e1abdaf6983fe324154

7 years ago Merge "cosmetics: vp9_rtcd_defs.pl"
Linfeng Zhang [Mon, 18 Sep 2017 16:23:33 +0000 (16:23 +0000)]
Merge "cosmetics: vp9_rtcd_defs.pl"

7 years ago Merge "vp8: [loongson] optimize dequantize with mmi"
Shiyou Yin [Fri, 15 Sep 2017 23:53:40 +0000 (23:53 +0000)]
Merge "vp8: [loongson] optimize dequantize with mmi"

7 years ago Add datarate test for frame_parallel_decoding mode off.
Marco [Fri, 15 Sep 2017 18:35:53 +0000 (11:35 -0700)]
Add datarate test for frame_parallel_decoding mode off.

Add datarate test, for both VBR and CBR mode, with the
frame_parallel_decoding mode disabled (and error_resilience off).

Change-Id: I54feec3248a68ecff4bef8d9a31bb1616fab77df

7 years ago Merge "Fix bug in intra mode rd penalty."
Paul Wilkins [Fri, 15 Sep 2017 15:43:29 +0000 (15:43 +0000)]
Merge "Fix bug in intra mode rd penalty."

7 years ago Merge "mips msa clean-up msa macros"
Kaustubh Raste [Fri, 15 Sep 2017 01:27:02 +0000 (01:27 +0000)]
Merge "mips msa clean-up msa macros"

7 years ago Merge "vp9_scale_test: add C config"
James Zern [Fri, 15 Sep 2017 00:27:58 +0000 (00:27 +0000)]
Merge "vp9_scale_test: add C config"

7 years ago Merge "Revert "Specialize 4 to 3 scaling in vp9_scale_and_extend_frame_c()""
James Zern [Fri, 15 Sep 2017 00:27:41 +0000 (00:27 +0000)]
Merge "Revert "Specialize 4 to 3 scaling in vp9_scale_and_extend_frame_c()""

7 years ago Merge "VP9 level targeting: add a new AUTO mode"
Hui Su [Thu, 14 Sep 2017 21:02:38 +0000 (21:02 +0000)]
Merge "VP9 level targeting: add a new AUTO mode"

7 years ago vp9_scale_test: add C config
James Zern [Thu, 14 Sep 2017 20:08:04 +0000 (13:08 -0700)]
vp9_scale_test: add C config

Change-Id: I9dfe8255d1c096d246bf9719729f57dbae779ffc

7 years ago Revert "Specialize 4 to 3 scaling in vp9_scale_and_extend_frame_c()"
James Zern [Thu, 14 Sep 2017 20:06:40 +0000 (13:06 -0700)]
Revert "Specialize 4 to 3 scaling in vp9_scale_and_extend_frame_c()"

This reverts commit afee58f2c4159172f5340f2c7d3e8041cfa0eb91.

This causes an ~8x slowdown for 4:3 scaling in the C code.

Change-Id: I60a7ead12dc4ec1548b1b12cfe4b0be42ef04e0e

7 years ago VP9 level targeting: add a new AUTO mode
Hui Su [Thu, 10 Aug 2017 22:05:20 +0000 (15:05 -0700)]
VP9 level targeting: add a new AUTO mode

In the new AUTO mode, restrict the minimum alt-ref interval and max column
tiles adaptively based on picture size, while not applying any rate control
constraints.

This mode aims to produce encodings that fit into levels corresponding to
the source picture size, with minimal loss of compression quality. However, the
bitstream is not guaranteed to be level compatible, e.g., the average bitrate
may exceed the level limit.

BUG=b/64451920

Change-Id: I02080b169cbbef4ab2e08c0df4697ce894aad83c

7 years ago vp8: [loongson] optimize dequantize with mmi
Shiyou Yin [Wed, 6 Sep 2017 03:30:25 +0000 (11:30 +0800)]
vp8: [loongson] optimize dequantize with mmi

1. vp8_dequantize_b_mmi
2. vp8_dequant_idct_add_mmi

Change-Id: I505f8afb7a444173392b325906e6a4f420f00709

7 years ago vp8: [loongson] optimize idctllm with mmi
Shiyou Yin [Wed, 6 Sep 2017 00:51:21 +0000 (08:51 +0800)]
vp8: [loongson] optimize idctllm with mmi

1. vp8_short_idct4x4llm_mmi
2. vp8_short_inv_walsh4x4_mmi
3. vp8_dc_only_idct_add_mmi

Change-Id: I616923681e79d78607a4988608fc39df77b093f4

7 years ago mips msa clean-up msa macros
Kaustubh Raste [Thu, 14 Sep 2017 06:59:19 +0000 (12:29 +0530)]
mips msa clean-up msa macros

Removed inline for GP load-store in case of (__mips_isa_rev >= 6)
Created one define LD_V for vector load and ST_V for vector store

Change-Id: Ifec3570fa18346e39791b0dd622892e5c18bd448

7 years ago cosmetics: vp9_rtcd_defs.pl
Linfeng Zhang [Tue, 12 Sep 2017 20:19:55 +0000 (13:19 -0700)]
cosmetics: vp9_rtcd_defs.pl

Change-Id: I1bf57824e07fa4f8b3b5574984117f2bd7a1c086

7 years ago Merge "Specialize 4 to 3 scaling in vp9_scale_and_extend_frame_c()"
Linfeng Zhang [Wed, 13 Sep 2017 17:21:45 +0000 (17:21 +0000)]
Merge "Specialize 4 to 3 scaling in vp9_scale_and_extend_frame_c()"

7 years ago Merge "Revert "Revert "quantize avx: copy 32x32 implementation"""
Johann Koenig [Wed, 13 Sep 2017 14:44:53 +0000 (14:44 +0000)]
Merge "Revert "Revert "quantize avx: copy 32x32 implementation"""

7 years ago Merge "Optimize mips msa vp9 average mc functions"
Kaustubh Raste [Wed, 13 Sep 2017 06:02:49 +0000 (06:02 +0000)]
Merge "Optimize mips msa vp9 average mc functions"

7 years ago Merge "vp8: [loongson] optimize loopfilter with mmi"
Shiyou Yin [Wed, 13 Sep 2017 01:05:46 +0000 (01:05 +0000)]
Merge "vp8: [loongson] optimize loopfilter with mmi"

7 years ago Revert "Revert "quantize avx: copy 32x32 implementation""
Johann [Tue, 12 Sep 2017 21:09:42 +0000 (14:09 -0700)]
Revert "Revert "quantize avx: copy 32x32 implementation""

This reverts commit 8c42237bb200253931c49e2c530838f3a877dd65.

Because ssse3 code is used for the reference, the qcoeff and dqcoeff
reference buffers must be aligned.

Original change's description:
> quantize avx: copy 32x32 implementation
>
> Ensure avx and ssse3 stay in sync by testing them against each other.
>
> Change-Id: I699f3b48785c83260825402d7826231f475f697c

Change-Id: Ieeef11b9406964194028b0d81d84bcb63296ae06
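
The alignment requirement mentioned above can be sketched as follows. This is an illustration only: the helper names are hypothetical, the 16-byte figure is the usual requirement for aligned 128-bit SSE loads and stores, and the 16-bit element type is an assumption rather than the type used by the actual test.

```c
#include <stdint.h>
#include <stdlib.h>

/* Returns nonzero when `p` sits on an `alignment`-byte boundary. */
static int sketch_is_aligned(const void *p, size_t alignment) {
  return ((uintptr_t)p % alignment) == 0;
}

/* Allocates `count` 16-bit coefficients on a 16-byte boundary.
 * C11 aligned_alloc() wants the size to be a multiple of the
 * alignment, so the byte count is rounded up first.
 * The caller releases the buffer with free(). */
static int16_t *sketch_alloc_coeff_buffer(size_t count) {
  size_t bytes = count * sizeof(int16_t);
  bytes = (bytes + 15) & ~(size_t)15;
  return (int16_t *)aligned_alloc(16, bytes);
}
```

A reference buffer obtained from plain `malloc()` may happen to be aligned on some platforms, but nothing guarantees it, which is why an explicit aligned allocation is shown here.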

7 years ago Specialize 4 to 3 scaling in vp9_scale_and_extend_frame_c()
Linfeng Zhang [Tue, 12 Sep 2017 18:37:04 +0000 (11:37 -0700)]
Specialize 4 to 3 scaling in vp9_scale_and_extend_frame_c()

Scale 3x3 block instead of 16x16 block in each loop.

Benefits:
1. Reduced number of different phase_scaler from 16 to 3. Optimization code
   will be smaller and faster.
2. The maximum phase_scaler drifting will be reduced from 5/16 to 1/24.
   (The drifting is 1/(3*16) in each step.)

BUG=webm:1419

Change-Id: Ibb9242a629ddb03e1ff93b859bece738255e698c
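
The drift figures above can be checked with a short calculation. The sketch assumes that the 4/3-pel advance per output pixel is truncated to 21/16 pel and that the position is re-synchronised at the start of every block; those assumptions and the function name are mine, not taken from the source. Positions are counted in 1/48 pel so that both the exact and the truncated step are integers.

```c
/* 1/48 pel is the smallest unit in which both 4/3 pel (exact step)
 * and 1/16 pel (phase resolution) are whole numbers. */
enum { kUnitsPerPel = 48 };

/* Largest gap, in 1/48 pel, between the exact source position and
 * the position reached with the truncated step inside one block of
 * `block_size` output pixels. */
static int sketch_max_drift_48ths(int block_size) {
  const int exact_step = 4 * kUnitsPerPel / 3;                /* 64 */
  const int trunc_step = (16 * 4 / 3) * (kUnitsPerPel / 16);  /* 21 * 3 = 63 */
  int max_drift = 0;
  for (int i = 0; i < block_size; ++i) {
    const int drift = i * exact_step - i * trunc_step;  /* i / 48 pel */
    if (drift > max_drift) max_drift = drift;
  }
  return max_drift;
}
```

Under these assumptions a 16-pixel block drifts by up to 15/48 = 5/16 pel, while a 3-pixel block drifts by at most 2/48 = 1/24 pel, matching the numbers in the message.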

7 years ago Optimize mips msa vp9 average mc functions
Kaustubh Raste [Tue, 12 Sep 2017 10:05:07 +0000 (15:35 +0530)]
Optimize mips msa vp9 average mc functions

Load the specific destination loads instead of vector load

Change-Id: I65ca13ae8f608fad07121fef848e2a18f54171fe

7 years ago Merge "vpxdsp: [x86] add highbd_d207_predictor functions"
Scott LaVarnway [Mon, 11 Sep 2017 22:32:23 +0000 (22:32 +0000)]
Merge "vpxdsp: [x86] add highbd_d207_predictor functions"

7 years ago Add 4 to 1 scaling NEON optimization
Linfeng Zhang [Thu, 7 Sep 2017 19:50:36 +0000 (12:50 -0700)]
Add 4 to 1 scaling NEON optimization

BUG=webm:1419

Change-Id: If82a93935d2453e61b7647aae70983db1740bec7

7 years ago vpxdsp: [x86] add highbd_d207_predictor functions
Scott LaVarnway [Wed, 6 Sep 2017 17:08:03 +0000 (10:08 -0700)]
vpxdsp: [x86] add highbd_d207_predictor functions

C vs SSE2 speed gains:
_4x4 : ~2.31x

C vs SSSE3 speed gains:
_8x8 : ~4.73x
_16x16 : ~10.88x
_32x32 : ~4.80x

BUG=webm:1411

Change-Id: I0bac29db261079181ddabc6814bd62c463109caf

7 years ago vp8: [loongson] optimize loopfilter with mmi
Shiyou Yin [Fri, 8 Sep 2017 01:42:51 +0000 (09:42 +0800)]
vp8: [loongson] optimize loopfilter with mmi

1. vp8_loop_filter_horizontal_edge_mmi
2. vp8_loop_filter_vertical_edge_mmi
3. vp8_mbloop_filter_horizontal_edge_mmi
4. vp8_mbloop_filter_vertical_edge_mmi
5. vp8_loop_filter_simple_horizontal_edge_mmi
6. vp8_loop_filter_simple_vertical_edge_mmi

Change-Id: Ie34bbff3a16cff64e39a50798afd2b7dac9bcdc3

7 years ago intrapred: sync highbd_d63_predictor w/d63_
James Zern [Sat, 9 Sep 2017 02:20:07 +0000 (19:20 -0700)]
intrapred: sync highbd_d63_predictor w/d63_

8/16/32: ~6%/~18%/~33% faster

previously:
7012ba639 vp9_reconintra: simplify d63_predictor

BUG=webm:1411

Change-Id: Ie775f3a4f7fd74df44754e65686d826a51c2cdc2

7 years ago vpx_mem: make vpx_memset16 inline
James Zern [Sat, 9 Sep 2017 01:57:08 +0000 (18:57 -0700)]
vpx_mem: make vpx_memset16 inline

Change-Id: Ibb2cab930c95836e6d6e66300c33e7d08e4474d4

7 years ago intrapred: sync highbd_d45_predictor w/d45_
James Zern [Sat, 9 Sep 2017 01:52:01 +0000 (18:52 -0700)]
intrapred: sync highbd_d45_predictor w/d45_

8/16/32: ~19%/~54%/~75.5% faster

previously:
acc481eaa vp9_reconintra: simplify d45_predictor

BUG=webm:1411

Change-Id: Ie8340b0c5070ae640f124733f025e4e749b660d8

7 years ago Merge changes I9ec438aa,I99c954ff
James Zern [Fri, 8 Sep 2017 19:23:40 +0000 (19:23 +0000)]
Merge changes I9ec438aa,I99c954ff

* changes:
  Update convolve functions' assertions
  Add 2 to 1 scaling NEON optimization

7 years ago Fix bug in intra mode rd penalty.
paulwilkins [Tue, 29 Aug 2017 20:08:08 +0000 (13:08 -0700)]
Fix bug in intra mode rd penalty.

The intra mode rd penalty was implemented as a rate penalty.
Code was added to scale the penalty according to block size but
this was not done correctly for the SB level or sub 8x8.

The code did a weird double scaling in regard to bit depth that
has been removed. Given that it is a rate penalty the bit depth
should not matter.

This bug fix improves average metrics on our standard test
sets by about 0.1%.

Change-Id: I7cf81b66aad0cda389fe234f47beba01c7493b1e

7 years ago vpx_scale_test.h: remove #if from inside macro
James Zern [Fri, 8 Sep 2017 07:06:25 +0000 (00:06 -0700)]
vpx_scale_test.h: remove #if from inside macro

fixes visual studio error

Change-Id: I86206f17ca951b15e247c1b92561847d8c21ec7a

7 years ago Merge "vp8: [loongson] optimize sixtap predict with mmi"
Shiyou Yin [Fri, 8 Sep 2017 00:59:31 +0000 (00:59 +0000)]
Merge "vp8: [loongson] optimize sixtap predict with mmi"

7 years ago Merge "vpxdsp: [loongson] optimize sad functions with mmi"
Shiyou Yin [Fri, 8 Sep 2017 00:55:14 +0000 (00:55 +0000)]
Merge "vpxdsp: [loongson] optimize sad functions with mmi"

7 years ago Update convolve functions' assertions
Linfeng Zhang [Wed, 6 Sep 2017 19:01:07 +0000 (12:01 -0700)]
Update convolve functions' assertions

So that 4 to 1 frame scaling can call them.

Change-Id: I9ec438aa63b923ba164ad3c59d7ecfa12789eab5

7 years ago Add 2 to 1 scaling NEON optimization
Linfeng Zhang [Tue, 5 Sep 2017 22:07:00 +0000 (15:07 -0700)]
Add 2 to 1 scaling NEON optimization

BUG=webm:1419

Change-Id: I99c954ffa50a62ccff2c4ab54162916141826d9b

7 years ago Refactor convolve8 NEON functions
Linfeng Zhang [Tue, 5 Sep 2017 21:48:17 +0000 (14:48 -0700)]
Refactor convolve8 NEON functions

Change-Id: I4ac576875c91fee7cb150d298fae4a2c156d374c

7 years ago Add ScaleFrameTest
Linfeng Zhang [Tue, 5 Sep 2017 21:38:45 +0000 (14:38 -0700)]
Add ScaleFrameTest

Move class VpxScaleBase to new file test/vpx_scale_test.h.
Add new file test/vp9_scale_test.cc with ScaleFrameTest.

BUG=webm:1419

Change-Id: Iec2098eafcef99b94047de525e5da47bcab519c1

7 years ago Merge "Remove get_filter_base() and get_filter_offset() in convolve"
Linfeng Zhang [Wed, 6 Sep 2017 22:39:15 +0000 (22:39 +0000)]
Merge "Remove get_filter_base() and get_filter_offset() in convolve"

7 years ago Merge "vpxdsp: [x86] add highbd_dc_128_predictor functions"
Scott LaVarnway [Wed, 6 Sep 2017 21:53:32 +0000 (21:53 +0000)]
Merge "vpxdsp: [x86] add highbd_dc_128_predictor functions"

7 years ago Remove support for stdatomic.h.
Peter Boström [Wed, 6 Sep 2017 15:48:42 +0000 (11:48 -0400)]
Remove support for stdatomic.h.

This header doesn't build on g++ v6 as it's a C and not C++ header
(_Atomic is not a keyword in C++11). Since the C and C++ invocations
cannot be guaranteed to point to the same underlying atomic_int
implementation, remove support for them and use compiler intrinsics
instead.

BUG=webm:1461

Change-Id: Ie1cd6759c258042efc87f51f036b9aa53e4ea9d5
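
A minimal sketch of what "compiler intrinsics instead" can look like, assuming a GCC- or Clang-compatible compiler. The `__atomic_store_n` and `__atomic_load_n` builtins are real; the wrapper type and function names are hypothetical and are not claimed to match the ones used in libvpx. Other compilers would need their own equivalents.

```c
/* Hypothetical wrapper type: a plain struct compiles the same way
 * from C and C++ translation units, unlike _Atomic from stdatomic.h. */
typedef struct {
  volatile int value;
} sketch_atomic_int;

/* Publishes `v`; writes made before this call become visible to a
 * thread that later observes `v` through sketch_load_acquire(). */
static void sketch_store_release(sketch_atomic_int *a, int v) {
  __atomic_store_n(&a->value, v, __ATOMIC_RELEASE);
}

/* Reads the current value with acquire ordering. */
static int sketch_load_acquire(sketch_atomic_int *a) {
  return __atomic_load_n(&a->value, __ATOMIC_ACQUIRE);
}
```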

7 years ago Remove get_filter_base() and get_filter_offset() in convolve
Linfeng Zhang [Mon, 28 Aug 2017 17:35:43 +0000 (10:35 -0700)]
Remove get_filter_base() and get_filter_offset() in convolve

so that the convolve functions are independent of table alignment.

Change-Id: Ieab132a30d72c6e75bbe9473544fbe2cf51541ee

7 years ago vpxdsp: [x86] add highbd_dc_128_predictor functions
Scott LaVarnway [Tue, 5 Sep 2017 14:52:36 +0000 (07:52 -0700)]
vpxdsp: [x86] add highbd_dc_128_predictor functions

C vs SSE2 speed gains:
_4x4 : ~7.64x
_8x8 : ~16.60x
_16x16 : ~8.15x
_32x32 : ~5.05x

BUG=webm:1411

Change-Id: If165d419711cfda901bd428a05ca1560a009e62e

7 years ago vp8: [loongson] optimize sixtap predict with mmi
Shiyou Yin [Sat, 2 Sep 2017 16:40:37 +0000 (00:40 +0800)]
vp8: [loongson] optimize sixtap predict with mmi

1. vp8_sixtap_predict16x16_mmi
2. vp8_sixtap_predict8x8_mmi
3. vp8_sixtap_predict8x4_mmi
4. vp8_sixtap_predict4x4_mmi

Change-Id: I186669d1a1d998a0f3ba3a548e25eee8b52c251b

7 years ago vpxdsp: [loongson] optimize sad functions with mmi
Shiyou Yin [Sat, 2 Sep 2017 07:46:38 +0000 (15:46 +0800)]
vpxdsp: [loongson] optimize sad functions with mmi

1. vpx_sadWxH_c
2. vpx_sadWxH_avg_c
3. vpx_sadWxHx3_c
4. vpx_sadWxHx8_c
5. vpx_sadWxHx4d_c

Change-Id: Ie13161e3d73a052ea6ea7bac9cfadf55598fea7a

7 years ago test,Android.mk: export gtest include path
James Zern [Fri, 1 Sep 2017 03:07:01 +0000 (20:07 -0700)]
test,Android.mk: export gtest include path

fixes test file builds

Change-Id: Iaa725ad95d56cf77d9fef8994981a80102e9a966

7 years ago apply clang-format
clang-format [Mon, 28 Aug 2017 01:26:24 +0000 (18:26 -0700)]
apply clang-format

Change-Id: If4c3e8a396d0fcb304f407b44e28cac3219f038c

7 years ago .clang-format: update to 4.0.1
James Zern [Mon, 28 Aug 2017 01:22:04 +0000 (18:22 -0700)]
.clang-format: update to 4.0.1

based on Google style with the following differences:

3a4
> # Generated with clang-format 4.0.1
13c14
< AllowShortCaseLabelsOnASingleLine: false
---
> AllowShortCaseLabelsOnASingleLine: true
23c24
< BraceWrapping:
---
> BraceWrapping:
43c44
< ConstructorInitializerAllOnOneLineOrOnePerLine: true
---
> ConstructorInitializerAllOnOneLineOrOnePerLine: false
46,47c47,48
< Cpp11BracedListStyle: true
< DerivePointerAlignment: true
---
> Cpp11BracedListStyle: false
> DerivePointerAlignment: false
51c52
< IncludeCategories:
---
> IncludeCategories:
78c79
< PointerAlignment: Left
---
> PointerAlignment: Right
80c81
< SortIncludes:    true
---
> SortIncludes:    false

Change-Id: Ibc0ef87a516b8eae88d426dfdd7624be57e7b87c

7 years ago Merge "Prevent data race from low-pass filter."
Peter Boström [Fri, 1 Sep 2017 05:37:51 +0000 (05:37 +0000)]
Merge "Prevent data race from low-pass filter."

7 years ago Merge "inv_txfm_vsx: fix loads in high-bitdepth"
James Zern [Fri, 1 Sep 2017 03:09:49 +0000 (03:09 +0000)]
Merge "inv_txfm_vsx: fix loads in high-bitdepth"

7 years ago Prevent data race from low-pass filter.
Peter Boström [Thu, 31 Aug 2017 21:33:59 +0000 (14:33 -0700)]
Prevent data race from low-pass filter.

Makes main thread wait for the filter level to be picked to avoid a race
between the LPF thread and update_reference_frames(). This also
re-enables the failing tests under thread_sanitizer where this data race
was detected.

BUG=webm:1460

Change-Id: I7f5797142ea0200394309842ce3e91a480be4fbc

7 years ago Merge "Add atomics to vp8 synchronization primitives."
Peter Boström [Fri, 1 Sep 2017 01:36:22 +0000 (01:36 +0000)]
Merge "Add atomics to vp8 synchronization primitives."

7 years ago Add atomics to vp8 synchronization primitives.
Peter Boström [Fri, 25 Aug 2017 22:48:11 +0000 (15:48 -0700)]
Add atomics to vp8 synchronization primitives.

Fixes an issue on iPad Pro 10.5 (and probably other places) where threads
are not properly synchronized. On x86 this data race was benign because
load and store instructions are atomic; the accesses were atomic in
practice, as the program hasn't been observed to be miscompiled.

Such guarantees are not made outside x86, and real problems manifested
where libvpx reliably reproduced a broken bitstream for even just the
initial keyframe. This was detected in WebRTC where this device started
using multithreading (as its CPU count is higher than earlier devices,
where the problem did not manifest as single-threading was used in
practice).

This issue was not detected under thread-sanitizer bots as mutexes were
conditionally used under this platform to simulate the protected read
and write semantics that were in practice provided on x86 platforms.

This change also removes several mutexes, so encoder/decoder state is
lighter-weight after this change and we do not need to initialize so
many mutexes (this was done even on non-thread-sanitizer platforms where
they were unused).

Change-Id: If41fcb0d99944f7bbc8ec40877cdc34d672ae72a

7 years ago Merge "vpxdsp: [x86] add highbd_dc_left_predictor functions"
Scott LaVarnway [Thu, 31 Aug 2017 21:34:27 +0000 (21:34 +0000)]
Merge "vpxdsp: [x86] add highbd_dc_left_predictor functions"

7 years ago Merge "vp9: Skip testing duplicate zero mv in nonrd-pickmode."
Jerome Jiang [Thu, 31 Aug 2017 17:16:19 +0000 (17:16 +0000)]
Merge "vp9: Skip testing duplicate zero mv in nonrd-pickmode."

7 years ago vp9: Skip testing duplicate zero mv in nonrd-pickmode.
Jerome Jiang [Tue, 29 Aug 2017 20:36:34 +0000 (13:36 -0700)]
vp9: Skip testing duplicate zero mv in nonrd-pickmode.

Neutral on rtc set for speed 8. Neutral on ytlive for speed 5.

Saves some computation cycles but no speed gain observed on Pixel.

Change-Id: I34c4642cd543aa89c5b9c4bff6b7113577c64c91

7 years ago inv_txfm_vsx: fix loads in high-bitdepth
James Zern [Thu, 31 Aug 2017 06:47:56 +0000 (23:47 -0700)]
inv_txfm_vsx: fix loads in high-bitdepth

vec_vsx_ld -> load_tran_low

Change-Id: Id3144cdd528d2d406a515e5812e2ea9e4db64bf1

7 years ago Merge "Revert "Re-enable disabled tests under TSan.""
Jerome Jiang [Thu, 31 Aug 2017 01:52:42 +0000 (01:52 +0000)]
Merge "Revert "Re-enable disabled tests under TSan.""

7 years ago Revert "Re-enable disabled tests under TSan."
Jerome Jiang [Wed, 30 Aug 2017 23:44:21 +0000 (23:44 +0000)]
Revert "Re-enable disabled tests under TSan."

This reverts commit df9ce12259a4e866feeb580d2e0cf9648f60d3b5.

Reason for revert:

Re-enabled tests still fail tsan in high bitdepth.

Original change's description:
> Re-enable disabled tests under TSan.
>
> These tests point to an already-fixed bug, this should no longer have a
> data race.
>
> BUG=webm:1049
>
> Change-Id: Iaedc5db8df99362bdc501b70ff7fdebf8756fdb8

TBR=jzern@google.com,pbos@chromium.org,builds@webmproject.org

# Not skipping CQ checks because original CL landed > 1 day ago.

Bug: webm:1049
Change-Id: I232f1f7726bf795b301abfb2e07cad6756642e53

7 years ago vpxdsp: [x86] add highbd_dc_left_predictor functions
Scott LaVarnway [Wed, 30 Aug 2017 16:13:14 +0000 (09:13 -0700)]
vpxdsp: [x86] add highbd_dc_left_predictor functions

C vs SSE2 speed gains:
_4x4 : ~6.49x
_8x8 : ~10.82x
_16x16 : ~7.61x
_32x32 : ~5.29x

BUG=webm:1411

Change-Id: Ibc30c50cb7139049bf05298010803499e6ef949b

7 years ago Merge "vpxdsp: [x86] add highbd_dc_top_predictor functions"
Scott LaVarnway [Wed, 30 Aug 2017 11:25:07 +0000 (11:25 +0000)]
Merge "vpxdsp: [x86] add highbd_dc_top_predictor functions"

7 years ago vpxdsp: [x86] add highbd_dc_top_predictor functions
Scott LaVarnway [Tue, 29 Aug 2017 18:25:32 +0000 (11:25 -0700)]
vpxdsp: [x86] add highbd_dc_top_predictor functions

C vs SSE2 speed gains:
_4x4 : ~7.39x
_8x8 : ~11.36x
_16x16 : ~8.68x
_32x32 : ~4.33x

BUG=webm:1411

Change-Id: I7f1487cd1531d4e7f0fbb4596fed3bfb72a59d58

7 years ago Merge "vp9: Speed 8: Enable skip_encode_sb"
Jerome Jiang [Tue, 29 Aug 2017 16:45:09 +0000 (16:45 +0000)]
Merge "vp9: Speed 8: Enable skip_encode_sb"

7 years ago Merge "Re-enable disabled tests under TSan."
Peter Boström [Tue, 29 Aug 2017 15:42:39 +0000 (15:42 +0000)]
Merge "Re-enable disabled tests under TSan."

7 years ago Merge "vpxdsp: [x86] add highbd_h_predictor functions"
Scott LaVarnway [Tue, 29 Aug 2017 14:05:13 +0000 (14:05 +0000)]
Merge "vpxdsp: [x86] add highbd_h_predictor functions"

7 years ago vpxdsp: [x86] add highbd_h_predictor functions
Scott LaVarnway [Mon, 28 Aug 2017 14:26:08 +0000 (07:26 -0700)]
vpxdsp: [x86] add highbd_h_predictor functions

C vs SSE2 speed gains:
_4x4 : ~8.12x
_8x8 : ~9.71x
_16x16 : ~8.21x
_32x32 : ~5.0x

BUG=webm:1422

Change-Id: I5e8a1ed4db7b8dc539b3e2a728b0b34d8b4b1993

7 years ago vp9: Speed 8: Enable skip_encode_sb
Jerome Jiang [Tue, 29 Aug 2017 00:05:48 +0000 (17:05 -0700)]
vp9: Speed 8: Enable skip_encode_sb

Neutral in borg tests.

Some clips show 3-4% speed gain on 2 threads on Pixel.

Change-Id: Ic959f34e44892a854551de6e9a3d9ec819ffed00

7 years ago Re-enable disabled tests under TSan.
Peter Boström [Mon, 28 Aug 2017 23:23:16 +0000 (16:23 -0700)]
Re-enable disabled tests under TSan.

These tests point to an already-fixed bug, so there should no longer be a
data race.

BUG=webm:1049

Change-Id: Iaedc5db8df99362bdc501b70ff7fdebf8756fdb8

7 years ago vp9: Remove resolution condition for using source_sad in speed 6.
Jerome Jiang [Mon, 28 Aug 2017 19:48:19 +0000 (12:48 -0700)]
vp9: Remove resolution condition for using source_sad in speed 6.

Rev d147771 fixed the test failure. So remove the resolution condition
for using source_sad in speed 6.

BUG=webm:1452

Change-Id: I1efba97e1ef5bd4de5f886299f6fcb907187abcd

7 years ago Merge "vp9: Speed 6 adapt_partition for live/vbr usage."
Marco Paniconi [Fri, 25 Aug 2017 22:00:08 +0000 (22:00 +0000)]
Merge "vp9: Speed 6 adapt_partition for live/vbr usage."

7 years ago Merge "vp9: SVC: Modify mv search condition in speed features."
Marco Paniconi [Fri, 25 Aug 2017 21:46:35 +0000 (21:46 +0000)]
Merge "vp9: SVC: Modify mv search condition in speed features."

7 years ago vp9: Speed 6 adapt_partition for live/vbr usage.
Marco [Mon, 21 Aug 2017 23:39:56 +0000 (16:39 -0700)]
vp9: Speed 6 adapt_partition for live/vbr usage.

Enable adapt_partition for vbr mode for speed 6.
This allows the usage of the pickmode-based partition
(used in speed 5), but only selectively for superblocks
with high source sad, otherwise the faster variance based
partition scheme is used.

For speed 6 on ytlive set: avgPSNR/SSIM metrics up by ~0.6%,
several clips up by ~1.5%. Small/negligible decrease in speed.

Change-Id: I12f3efef6b3e059391de330fdbe5a44c2587f1f8
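
The selection rule described above can be sketched as a per-superblock choice. Everything named here is hypothetical: the enum, the function, and the threshold parameter are illustrative, and the log does not say how "high source sad" is measured or what threshold is used.

```c
#include <stdint.h>

/* The two partition schemes mentioned in the commit message. */
typedef enum {
  SKETCH_VARIANCE_PARTITION, /* faster, variance based */
  SKETCH_PICKMODE_PARTITION  /* slower, pickmode based (as in speed 5) */
} SketchPartitionScheme;

/* Picks the slower pickmode-based partition only for superblocks
 * whose source SAD exceeds `high_sad_threshold` (an assumed input);
 * every other superblock keeps the faster variance-based scheme. */
static SketchPartitionScheme sketch_choose_partition(
    uint64_t source_sad, uint64_t high_sad_threshold) {
  return source_sad > high_sad_threshold ? SKETCH_PICKMODE_PARTITION
                                         : SKETCH_VARIANCE_PARTITION;
}
```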

7 years ago Merge "Revert "quantize avx: copy 32x32 implementation""
Marco Paniconi [Fri, 25 Aug 2017 18:20:31 +0000 (18:20 +0000)]
Merge "Revert "quantize avx: copy 32x32 implementation""

7 years ago vp9: SVC: Modify mv search condition in speed features.
Marco [Fri, 25 Aug 2017 16:59:57 +0000 (09:59 -0700)]
vp9: SVC: Modify mv search condition in speed features.

For SVC at speed >= 7: only use the improved mv search
on base spatial layer, if top layer resolution is above 640x360.

~2.3% speedup
Small/negligible loss in avgPSNR metrics on rtc set.

Change-Id: Iaef75a57ebf1c248931bc1aa28d20b7fecac1851

7 years ago Revert "quantize avx: copy 32x32 implementation"
Marco Paniconi [Fri, 25 Aug 2017 16:56:08 +0000 (16:56 +0000)]
Revert "quantize avx: copy 32x32 implementation"

This reverts commit f60d1dcd3de46f72bafc5eeef481bd1a4e203301.

Reason for revert:
Failures in AVX/VP9QuantizeTest in nightly tests.

Original change's description:
> quantize avx: copy 32x32 implementation
>
> Ensure avx and ssse3 stay in sync by testing them against each other.
>
> Change-Id: I699f3b48785c83260825402d7826231f475f697c

TBR=slavarnway@google.com,johannkoenig@google.com,builds@webmproject.org

Change-Id: Ibd38636212269328317dd0721be9d25452113d1c
No-Presubmit: true
No-Tree-Checks: true
No-Try: true

7 years ago Merge "vpx_dsp:loongson optimize vpx_varianceWxH_c,vpx_sub_pixel_varianceWxH_c and...
Shiyou Yin [Fri, 25 Aug 2017 06:44:02 +0000 (06:44 +0000)]
Merge "vpx_dsp:loongson optimize vpx_varianceWxH_c,vpx_sub_pixel_varianceWxH_c and vpx_sub_pixel_avg_varianceWxH_c with mmi."

7 years ago Merge "vp9: Adjust 16x16 split threshold for variance partition"
Marco Paniconi [Thu, 24 Aug 2017 22:26:43 +0000 (22:26 +0000)]
Merge "vp9: Adjust 16x16 split threshold for variance partition"

7 years ago Make sure diff is present at configure time.
Tom Finegan [Thu, 24 Aug 2017 19:11:48 +0000 (12:11 -0700)]
Make sure diff is present at configure time.

This avoids an endless build loop at vpx_version.h
creation time when diff is not present.

Change-Id: I16ae386dbdaf14f9a2b85e4c5d1aaa6c08f52a45

7 years ago Merge "quantize avx: copy 32x32 implementation"
Johann Koenig [Thu, 24 Aug 2017 18:55:03 +0000 (18:55 +0000)]
Merge "quantize avx: copy 32x32 implementation"

7 years ago vpx_dsp:loongson optimize vpx_varianceWxH_c,vpx_sub_pixel_varianceWxH_c and vpx_sub_p...
Shiyou Yin [Thu, 24 Aug 2017 15:11:58 +0000 (23:11 +0800)]
vpx_dsp:loongson optimize vpx_varianceWxH_c,vpx_sub_pixel_varianceWxH_c and vpx_sub_pixel_avg_varianceWxH_c with mmi.

Change-Id: Ia576a721df6312329b599c31cfe1fb1267a9f174

7 years ago vp9: Adjust 16x16 split threshold for variance partition
Marco [Thu, 24 Aug 2017 17:36:27 +0000 (10:36 -0700)]
vp9: Adjust 16x16 split threshold for variance partition

For speeds < 7, increase threshold that controls the split
of 16x16->8x8 blocks, for resolutions 720p and higher.

Minor change for speed 5 (since it uses reference partition scheme
which only uses variance partition as first step).
For speed 6: ~0.5% increase in avgPSNR/SSIM metrics on ytlive set.
No change in speed.

Change-Id: I5126580973201538d8ca26a9256b93c4d11d685b

7 years ago Merge "quantize test: skip block was removed"
Johann Koenig [Thu, 24 Aug 2017 17:43:10 +0000 (17:43 +0000)]
Merge "quantize test: skip block was removed"

7 years ago quantize avx: copy 32x32 implementation
Johann [Wed, 23 Aug 2017 20:59:33 +0000 (13:59 -0700)]
quantize avx: copy 32x32 implementation

Ensure avx and ssse3 stay in sync by testing them against each other.

Change-Id: I699f3b48785c83260825402d7826231f475f697c

7 years ago quantize ssse3: copy implementation to intrinsics
Johann [Wed, 16 Aug 2017 20:10:59 +0000 (13:10 -0700)]
quantize ssse3: copy implementation to intrinsics

Still does not pass tests. Does match the previous assembly, although
saving the sign before multiplying is dubious.

Change-Id: Ia163f18c755aba542d6e93f7bf7343184660df5a

7 years ago quantize test: skip block was removed
Johann [Thu, 24 Aug 2017 14:21:42 +0000 (07:21 -0700)]
quantize test: skip block was removed

Change-Id: I1d93698bc27529b0544d79dd7b9fe37afa51ef87