granicus.if.org Git - libvpx/log

Scott LaVarnway [Wed, 20 Sep 2017 12:21:23 +0000 (05:21 -0700)]

vpxdsp: [x86] add highbd_d117_predictor functions

C vs SSE2 speed gains:
_4x4 : ~2.04x

C vs SSSE3 speed gains:
_8x8 : ~2.82x
_16x16 : ~5.93x
_32x32 : ~2.79x

BUG=webm:1411

Change-Id: I31d949695991c067dac89d91e0bed3e666c94993

Marco [Thu, 28 Sep 2017 17:47:34 +0000 (10:47 -0700)]

Set rc->high_source_sad = 0 before scene detection.

Only has an effect when sf->use_altref_onepass is enabled,
as in that case scene detection is skipped for non-show frames
and so high_source_sad does not get reset to 0.

No change in metrics or speed.

Change-Id: I421f066d239341449c18826089e1810b9fc5967f
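
A minimal sketch of the pattern this change describes, not the libvpx code itself: the flag is cleared unconditionally before the detection step, so a frame that skips detection cannot inherit a stale value. The struct, the function names, and the threshold test are hypothetical stand-ins; only the field name high_source_sad comes from the commit message.

```c
/* Hypothetical stand-in for the rate-control state. */
typedef struct {
  int high_source_sad;
} RateControlSketch;

/* Hypothetical detection step: flags a frame whose source SAD exceeds
 * a threshold. The real analysis is more involved. */
static void detect_scene_change(RateControlSketch *rc, int source_sad,
                                int threshold) {
  rc->high_source_sad = source_sad > threshold;
}

/* Clear the flag first, then run detection only for shown frames.
 * A non-show frame therefore always leaves the flag at 0. */
static void prepare_frame(RateControlSketch *rc, int show_frame,
                          int source_sad, int threshold) {
  rc->high_source_sad = 0;
  if (show_frame) detect_scene_change(rc, source_sad, threshold);
}
```

Without the unconditional reset, the second call in a show/non-show sequence would keep whatever the previous frame set.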

Marco Paniconi [Thu, 28 Sep 2017 16:52:28 +0000 (16:52 +0000)]

Merge "vp9: Modification to adapt the ARF usage for 1 pass vbr"

Marco [Thu, 21 Sep 2017 17:59:33 +0000 (10:59 -0700)]

vp9: Modification to adapt the ARF usage for 1 pass vbr

Add stats for past ARF usage, and use it to disable
ARF usage based on some conditions.

Overall improvement on ytlive set, reduces the regression
on the problem clips for this feature.

Only has an effect when sf->use_altref_onepass is enabled
(currently off by default).

Change-Id: I66267f227ea132dc86acb730e9882f85bead2cdb

Marco [Wed, 27 Sep 2017 21:49:58 +0000 (14:49 -0700)]

Add use_svc condition to the scene detection in 1 pass.

Scene detection is not currently used in SVC 1 pass code.
Speedup of ~0.4%.

Change-Id: I0ab769300919de710cd2da1402014fa3f22a1f86

Marco Paniconi [Wed, 27 Sep 2017 20:42:48 +0000 (20:42 +0000)]

Merge "Revert "Remove the speed condition on scene detection in 1 pass code.""

Scott LaVarnway [Wed, 27 Sep 2017 20:40:21 +0000 (20:40 +0000)]

Merge "vpxdsp: [x86] add highbd_d153_predictor functions"

Marco Paniconi [Wed, 27 Sep 2017 19:42:48 +0000 (19:42 +0000)]

Revert "Remove the speed condition on scene detection in 1 pass code."

This reverts commit 535b7b915ae5574db2f95632243cc5bee865f02e.

This is actually used in CBR to reset the rate control if high source sad is detected.

Original change's description:
> Remove the speed condition on scene detection in 1 pass code.
>
> Scene detection is used for VBR mode and for screen_content mode.
>
> It was also enabled for CBR mode via the speed condition,
> but currently the analysis in the scene detection is not used
> in CBR mode (similar computations are done locally at superblock level
> when the source_sad feature is enabled).
>
> For 1 pass code.
> No change in behavior. Small speed gain, ~0.5%.
>
> Change-Id: I59991d7ef2af320bea7af4b907596e057affa42f

TBR=marpan@google.com,builds@webmproject.org,jianj@google.com

Change-Id: Ib4e6b02047f75632503e7b0fc870af97fa9291c3
No-Presubmit: true
No-Tree-Checks: true
No-Try: true

James Zern [Wed, 27 Sep 2017 19:39:11 +0000 (19:39 +0000)]

Merge "fix signed integer overflow of idct"

James Zern [Wed, 27 Sep 2017 18:37:20 +0000 (18:37 +0000)]

Merge "vp9_dx_iface: Stop using iter parameter incorrectly"

Linfeng Zhang [Tue, 26 Sep 2017 19:33:40 +0000 (12:33 -0700)]

fix signed integer overflow of idct

Exposed by fuzz test in high bitdepth.
The bug was introduced in commit 64653fa.

BUG=webm:1466

Change-Id: Idd77d5c6a60efb9241471611ce1aba0646cb6ff5
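
For illustration, a hedged sketch of how a fixed-point multiply of this kind can overflow a 32-bit signed int, and how widening before the multiply avoids it. This is not the code touched by the fix: the helper names are hypothetical, and the 14-bit scale with the coefficient 11585 (cos(pi/4) rounded at that scale) is an assumption chosen to resemble a typical transform constant.

```c
#include <stdint.h>

#define CONST_BITS 14 /* assumed fixed-point scale: 2^14 */

/* round(cos(pi/4) * 2^14); used here only as a representative coefficient. */
static const int16_t kCoefSketch = 11585;

/* Returns 1 if x * coef would not fit in a 32-bit signed int.
 * The check itself is done in 64 bits, so it never overflows. */
static int product_overflows_int32(int32_t x, int16_t coef) {
  const int64_t product = (int64_t)x * coef;
  return product > INT32_MAX || product < INT32_MIN;
}

/* Widen to 64 bits before multiplying, then round and shift back.
 * Right-shifting a negative value is implementation-defined in C;
 * common compilers perform an arithmetic shift. */
static int64_t mul_round_shift(int32_t x, int16_t coef) {
  const int64_t product = (int64_t)x * coef;
  return (product + (1 << (CONST_BITS - 1))) >> CONST_BITS;
}
```

With these assumptions, an input of 200000 already produces a product above INT32_MAX, which is the kind of value a fuzzer can reach in high bitdepth.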

Scott LaVarnway [Wed, 27 Sep 2017 17:06:14 +0000 (10:06 -0700)]

vpxdsp: [x86] add highbd_d153_predictor functions

C vs SSE2 speed gains:
_4x4 : ~1.95x

C vs SSSE3 speed gains:
_8x8 : ~3.30x
_16x16 : ~5.67x
_32x32 : ~3.87x

BUG=webm:1411

Change-Id: Ib483989b25614aa89b635e8c087d0879a5d71904

Marco [Wed, 27 Sep 2017 17:11:24 +0000 (10:11 -0700)]

Remove the speed condition on scene detection in 1 pass code.

Scene detection is used for VBR mode and for screen_content mode.

It was also enabled for CBR mode via the speed condition,
but currently the analysis in the scene detection is not used
in CBR mode (similar computations are done locally at superblock level
when the source_sad feature is enabled).

For 1 pass code.
No change in behavior. Small speed gain, ~0.5%.

Change-Id: I59991d7ef2af320bea7af4b907596e057affa42f

Vignesh Venkatasubramanian [Mon, 25 Sep 2017 23:56:04 +0000 (16:56 -0700)]

vp9_dx_iface: Stop using iter parameter incorrectly

'iter' parameter is being checked for NULL in every call to
decoder_get_frame which is quite pointless because it is always
going to be NULL unless the application changed it. The code works
as described only because vp9_get_raw_frame returns -1 on all
subsequent calls after the first.

Change-Id: Ic736b9e8fe36fc1430fc11d6a9b292be02497248
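
A hedged sketch of the calling pattern involved, using a mock decoder rather than libvpx. The application initializes the iterator to NULL and loops until the getter returns NULL; in this sketch the getter ignores the iterator and relies on its own "frame ready" state to return each frame once. All type and function names are hypothetical, and this is one possible design, not necessarily the one the change adopts.

```c
#include <stddef.h>

typedef const void *frame_iter_t; /* hypothetical opaque iterator */

typedef struct {
  int width, height;
} ImageSketch;

typedef struct {
  ImageSketch img;
  int frame_ready; /* set by decode, cleared when the frame is fetched */
} DecoderSketch;

/* Returns the pending frame once, then NULL until another is decoded.
 * The iterator is accepted for API compatibility but not consulted. */
static ImageSketch *get_frame(DecoderSketch *dec, frame_iter_t *iter) {
  (void)iter;
  if (!dec->frame_ready) return NULL;
  dec->frame_ready = 0;
  return &dec->img;
}

/* Typical application loop: start with a NULL iterator, drain all frames. */
static int drain_frames(DecoderSketch *dec) {
  frame_iter_t iter = NULL;
  int count = 0;
  while (get_frame(dec, &iter) != NULL) ++count;
  return count;
}
```

Because termination depends on decoder state rather than on the iterator value, the loop ends correctly even if the application never touches the iterator.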

Linfeng Zhang [Wed, 27 Sep 2017 16:12:48 +0000 (16:12 +0000)]

Merge "Add vpx_scaled_2d_neon()"

Jerome Jiang [Wed, 27 Sep 2017 01:26:59 +0000 (01:26 +0000)]

Merge "Add unit test to expose vp8 bug when width is set odd."

Shiyou Yin [Wed, 27 Sep 2017 00:49:28 +0000 (00:49 +0000)]

Merge "vp8: [loongson] optimize copymen with mmi"

Jerome Jiang [Thu, 21 Sep 2017 17:52:20 +0000 (10:52 -0700)]

Add unit test to expose vp8 bug when width is set odd.

BUG=b/64710201

Change-Id: Ia518af5494a42e80949cf1165244fbed59606cf7

Marco [Tue, 26 Sep 2017 22:47:14 +0000 (15:47 -0700)]

Remove the speed condition in setting compute_source_sad.

The speed condition is not needed; the feature can be used at any
speed in 1 pass code.

Change-Id: I878ef3f63a075302eda48c0343fa243c80aab9ba

Marco [Tue, 26 Sep 2017 17:18:43 +0000 (10:18 -0700)]

Replace flag USE_ALTREF_FOR_ONE_PASS with speed feature.

To be used for 1 pass VBR.
Off by default in speed features.

Change-Id: I5d6110d6d191990db526fe68ec9715379a4d1754

Marco Paniconi [Tue, 26 Sep 2017 16:28:30 +0000 (16:28 +0000)]

Merge "SVC: Add setting for max_intra_rate_pct in sample encoder."

Linfeng Zhang [Tue, 19 Sep 2017 23:55:35 +0000 (16:55 -0700)]

Add vpx_scaled_2d_neon()

BUG=webm:1419

Change-Id: I39c8033734562efc0ac0e28e7f06fa05130f9b96

Linfeng Zhang [Tue, 26 Sep 2017 16:10:46 +0000 (16:10 +0000)]

Merge changes Ib9105462,Idfac00ed,If8d8a0e2

* changes:
  cosmetics: NEON scaling code
  Refactor convolve NEON code
  Refactor convolve code

Shiyou Yin [Wed, 6 Sep 2017 09:57:16 +0000 (17:57 +0800)]

vp8: [loongson] optimize copymen with mmi

1. vp8_copy_mem16x16_mmi
2. vp8_copy_mem8x8_mmi
3. vp8_copy_mem8x4_mmi

Change-Id: I3de29a11fa7402df0e48bbb944440b1e66498a65

Marco [Mon, 25 Sep 2017 20:36:25 +0000 (13:36 -0700)]

SVC: Add setting for max_intra_rate_pct in sample encoder.

Set it as default to 900.

Change-Id: Id2d990925dccff1f6762411c66ea95973440c92f

Scott LaVarnway [Mon, 25 Sep 2017 11:34:14 +0000 (11:34 +0000)]

Merge "vpxdsp: [x86] add highbd_d45_predictor functions"

Scott LaVarnway [Wed, 30 Aug 2017 16:27:44 +0000 (09:27 -0700)]

vpxdsp: [x86] add highbd_d45_predictor functions

C vs SSSE3 speed gains:
_4x4 : ~2.45x
_8x8 : ~10.61x
_16x16 : ~11.34x
_32x32 : ~6.36x

BUG=webm:1411

Change-Id: Ic91389a4f1a8ad093f498afe53765b897fb9be09

James Zern [Fri, 22 Sep 2017 07:35:55 +0000 (07:35 +0000)]

Merge changes If59743aa,Ib046fe28,Ia2345752

* changes:
  Remove the unnecessary cast of (int16_t)cospi_{1...31}_64
  Remove the unnecessary upcasts of (int)cospi_{1...31}_64
  Change cospi_{1...31}_64 from tran_high_t to tran_coef_t

Andrew Lewis [Thu, 21 Sep 2017 08:50:53 +0000 (08:50 +0000)]

Merge "Comma-separate VP9 encoder tmp.stt output"

Marco Paniconi [Thu, 21 Sep 2017 01:33:12 +0000 (01:33 +0000)]

Merge "vp9: Modify pickmode early exit for ARF in 1pass."

Marco [Wed, 20 Sep 2017 21:55:31 +0000 (14:55 -0700)]

vp9: Modify pickmode early exit for ARF in 1pass.

Add the condition frames_since_golden > 0 to the
early exit check for ARF usage in nonrd_pickmode.
This improves the quality of the first frame following an ARF, where
frames_since_golden = 0.

Small/neutral gain in metrics for speed 6, neutral change in speed.

Only has an effect when USE_ALTREF_FOR_ONE_PASS is enabled.

Change-Id: I82e73e6ff6fc849e5ca5448563cb8a0515fe0cdc

Linfeng Zhang [Mon, 18 Sep 2017 16:33:31 +0000 (09:33 -0700)]

Remove the unnecessary cast of (int16_t)cospi_{1...31}_64

BUG=webm:1450

Change-Id: If59743aafe99226e0ec67ab5d20678ce25f53ab8

Linfeng Zhang [Wed, 13 Sep 2017 20:05:47 +0000 (13:05 -0700)]

Remove the unnecessary upcasts of (int)cospi_{1...31}_64

BUG=webm:1450

Change-Id: Ib046fe28caec5b9ebdc9d0152df7c54ff4266858

Linfeng Zhang [Wed, 13 Sep 2017 00:13:17 +0000 (17:13 -0700)]

Change cospi_{1...31}_64 from tran_high_t to tran_coef_t

The unnecessary upcast to (int) will be cleaned later.

BUG=webm:1450

Change-Id: Ia234575206d5a74540526924b06ed3939322d063

James Zern [Wed, 20 Sep 2017 21:12:45 +0000 (21:12 +0000)]

Merge "Bug fix: fadst4() in vp9/encoder/vp9_dct.c"

Linfeng Zhang [Wed, 20 Sep 2017 16:18:04 +0000 (09:18 -0700)]

Bug fix: fadst4() in vp9/encoder/vp9_dct.c

A new bug was introduced in a80bdfd "Change sinpi_{1,2,3,4}_9 from
tran_high_t to int16_t". Reverted the change in this file.

BUG=webm:1450

Failed test C/TransHT.AccuracyCheck/26.

Change-Id: Id001f57aad811803ef7d367d2b2bc008d8499991

Marco Paniconi [Wed, 20 Sep 2017 16:42:31 +0000 (16:42 +0000)]

Merge "vp9: Modify simple_block_yrd condition for SVC"

Scott LaVarnway [Wed, 20 Sep 2017 11:39:28 +0000 (11:39 +0000)]

Merge "vpxdsp: [x86] add highbd_d63_predictor functions"

James Zern [Wed, 20 Sep 2017 01:59:11 +0000 (18:59 -0700)]

temporal_filter_apply_sse2.asm: add ':' to label

quiets nasm warning:
label alone on a line without a colon might be in error

BUG=webm:1462

Change-Id: I660407ca60e8c9a810dba9d76afb65852029a29c

Linfeng Zhang [Tue, 19 Sep 2017 23:39:17 +0000 (16:39 -0700)]

cosmetics: NEON scaling code

Change-Id: Ib91054622c1f09c4ca523bc6837d7d8ab9f03618

Linfeng Zhang [Tue, 19 Sep 2017 23:14:56 +0000 (16:14 -0700)]

Refactor convolve NEON code

Rename a couple of hbd static functions.
Move the position of NEON function convolve8_4().

Change-Id: Idfac00edf2e99cdd8e0a73b9f895402f60be6349

Linfeng Zhang [Tue, 19 Sep 2017 23:23:14 +0000 (16:23 -0700)]

Refactor convolve code

Extract a couple of static functions into their caller functions.

Change-Id: If8d8a0e217fba6b402d2a79ede13b5b444ff08a0

Scott LaVarnway [Wed, 13 Sep 2017 01:01:31 +0000 (18:01 -0700)]

vpxdsp: [x86] add highbd_d63_predictor functions

C vs SSE2 speed gains:
_4x4 : ~2.94x

C vs SSSE3 speed gains:
_8x8 : ~8.69x
_16x16 : ~6.32x
_32x32 : ~5.33x

BUG=webm:1411

Change-Id: I2c35b527eac2229f17aaa9d118fb601e7195efe4

Marco [Tue, 19 Sep 2017 22:19:41 +0000 (15:19 -0700)]

vp9: Modify simple_block_yrd condition for SVC

Modify simple_block_yrd condition in nonrd_pickmode for SVC:
allow it to be used also on base temporal_layer, only when
spatial_layer > 1 and block size < 32x32.

Speedup of ~2% for 3-layer SVC, with negligible
loss in quality.

Change-Id: I7734bdae51cf51f22b96f6b2b27da20ea1d84344

Marco Paniconi [Tue, 19 Sep 2017 22:31:08 +0000 (22:31 +0000)]

Merge "Add datarate test for frame_parallel_decoding mode off."

Marco [Tue, 19 Sep 2017 18:00:40 +0000 (11:00 -0700)]

vp9: Fix condition for limiting ARF 1 pass vbr.

Fix the setting to frames_till_gf_update_due, and
adjust the limit value.
Only has an effect when USE_ALTREF_FOR_ONE_PASS is enabled.

Neutral change to metrics and speed for ytlive.

Change-Id: I266d9a00b36221bc8602fa2746d4e8a8f7d4dfae

Marco Paniconi [Tue, 19 Sep 2017 16:29:19 +0000 (16:29 +0000)]

Merge "vp9: Adjustments for ARF usage in 1 pass vbr."

Marco [Tue, 19 Sep 2017 00:30:49 +0000 (17:30 -0700)]

vp9: Adjustments for ARF usage in 1 pass vbr.

Only when USE_ALTREF_FOR_ONE_PASS is enabled (off by default).
Force fixed partition to 64x64 when is_src_alt_ref_frame is true,
and don't force early exit for some modes in nonrd_pickmode
for ARF noshow frames.

Small gain ~0.2% on ytlive metrics for speed 6.
Neutral speed difference.

Change-Id: I27eb6622d0453c09a06ccdc3b16368762474d11d

Linfeng Zhang [Tue, 12 Sep 2017 22:24:54 +0000 (15:24 -0700)]

Change sinpi_{1,2,3,4}_9 from tran_high_t to int16_t

Add "typedef int16_t tran_coef_t;"

BUG=webm:1450

Change-Id: I67866f104898d1dda8989e1abdaf6983fe324154

Linfeng Zhang [Mon, 18 Sep 2017 16:23:33 +0000 (16:23 +0000)]

Merge "cosmetics: vp9_rtcd_defs.pl"

Shiyou Yin [Fri, 15 Sep 2017 23:53:40 +0000 (23:53 +0000)]

Merge "vp8: [loongson] optimize dequantize with mmi"

Marco [Fri, 15 Sep 2017 18:35:53 +0000 (11:35 -0700)]

Add datarate test for frame_parallel_decoding mode off.

Add datarate test, for both VBR and CBR mode, with the
frame_parallel_decoding mode disabled (and error_resilience off).

Change-Id: I54feec3248a68ecff4bef8d9a31bb1616fab77df

Paul Wilkins [Fri, 15 Sep 2017 15:43:29 +0000 (15:43 +0000)]

Merge "Fix bug in intra mode rd penalty."

Kaustubh Raste [Fri, 15 Sep 2017 01:27:02 +0000 (01:27 +0000)]

Merge "mips msa clean-up msa macros"

James Zern [Fri, 15 Sep 2017 00:27:58 +0000 (00:27 +0000)]

Merge "vp9_scale_test: add C config"

James Zern [Fri, 15 Sep 2017 00:27:41 +0000 (00:27 +0000)]

Merge "Revert "Specialize 4 to 3 scaling in vp9_scale_and_extend_frame_c()""

Hui Su [Thu, 14 Sep 2017 21:02:38 +0000 (21:02 +0000)]

Merge "VP9 level targeting: add a new AUTO mode"

James Zern [Thu, 14 Sep 2017 20:08:04 +0000 (13:08 -0700)]

vp9_scale_test: add C config

Change-Id: I9dfe8255d1c096d246bf9719729f57dbae779ffc

James Zern [Thu, 14 Sep 2017 20:06:40 +0000 (13:06 -0700)]

Revert "Specialize 4 to 3 scaling in vp9_scale_and_extend_frame_c()"

This reverts commit afee58f2c4159172f5340f2c7d3e8041cfa0eb91.

This causes a ~8x slowdown for 4:3 scaling in the C code.

Change-Id: I60a7ead12dc4ec1548b1b12cfe4b0be42ef04e0e

Hui Su [Thu, 10 Aug 2017 22:05:20 +0000 (15:05 -0700)]

VP9 level targeting: add a new AUTO mode

In the new AUTO mode, restrict the minimum alt-ref interval and max column
tiles adaptively based on picture size, while not applying any rate control
constraints.

This mode aims to produce encodings that fit into levels corresponding to
the source picture size, with minimal loss of compression quality. However, the
bitstream is not guaranteed to be level-compatible, e.g., the average bitrate
may exceed the level limit.

BUG=b/64451920

Change-Id: I02080b169cbbef4ab2e08c0df4697ce894aad83c

Shiyou Yin [Wed, 6 Sep 2017 03:30:25 +0000 (11:30 +0800)]

vp8: [loongson] optimize dequantize with mmi

1. vp8_dequantize_b_mmi
2. vp8_dequant_idct_add_mmi

Change-Id: I505f8afb7a444173392b325906e6a4f420f00709

Shiyou Yin [Wed, 6 Sep 2017 00:51:21 +0000 (08:51 +0800)]

vp8: [loongson] optimize idctllm with mmi

1. vp8_short_idct4x4llm_mmi
2. vp8_short_inv_walsh4x4_mmi
3. vp8_dc_only_idct_add_mmi

Change-Id: I616923681e79d78607a4988608fc39df77b093f4

Kaustubh Raste [Thu, 14 Sep 2017 06:59:19 +0000 (12:29 +0530)]

mips msa clean-up msa macros

Removed inline for GP load-store in case of (__mips_isa_rev >= 6)
Created one define LD_V for vector load and ST_V for vector store

Change-Id: Ifec3570fa18346e39791b0dd622892e5c18bd448

Linfeng Zhang [Tue, 12 Sep 2017 20:19:55 +0000 (13:19 -0700)]

cosmetics: vp9_rtcd_defs.pl

Change-Id: I1bf57824e07fa4f8b3b5574984117f2bd7a1c086

Linfeng Zhang [Wed, 13 Sep 2017 17:21:45 +0000 (17:21 +0000)]

Merge "Specialize 4 to 3 scaling in vp9_scale_and_extend_frame_c()"

Andrew Lewis [Wed, 13 Sep 2017 15:26:40 +0000 (16:26 +0100)]

Comma-separate VP9 encoder tmp.stt output

Also add column headings so that the output can still be parsed if the
set of headers changes later.

Change-Id: I4beaf266521e093db4acf5f715b18fdfb7e3d1cd
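
A small sketch of the output style described above: a header row naming the columns, followed by comma-separated values, so a parser can look columns up by name. The record type and the column names (frame, bits, psnr) are hypothetical; the actual tmp.stt columns are not listed in the commit message.

```c
#include <stdio.h>

/* Hypothetical per-frame record; the real statistics differ. */
typedef struct {
  int frame;
  int bits;
  double psnr;
} FrameStatSketch;

/* Writes the column headings once, at the top of the file. */
static int format_header(char *buf, size_t size) {
  return snprintf(buf, size, "frame,bits,psnr\n");
}

/* Writes one comma-separated row in the same column order. */
static int format_row(char *buf, size_t size, const FrameStatSketch *s) {
  return snprintf(buf, size, "%d,%d,%.2f\n", s->frame, s->bits, s->psnr);
}
```

A reader that maps header names to column indices keeps working if columns are later added or reordered.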

Johann Koenig [Wed, 13 Sep 2017 14:44:53 +0000 (14:44 +0000)]

Merge "Revert "Revert "quantize avx: copy 32x32 implementation"""

Kaustubh Raste [Wed, 13 Sep 2017 06:02:49 +0000 (06:02 +0000)]

Merge "Optimize mips msa vp9 average mc functions"

Shiyou Yin [Wed, 13 Sep 2017 01:05:46 +0000 (01:05 +0000)]

Merge "vp8: [loongson] optimize loopfilter with mmi"

Johann [Tue, 12 Sep 2017 21:09:42 +0000 (14:09 -0700)]

Revert "Revert "quantize avx: copy 32x32 implementation""

This reverts commit 8c42237bb200253931c49e2c530838f3a877dd65.

Because ssse3 code is used for the reference, the qcoeff and dqcoeff
reference buffers must be aligned.

Original change's description:
> quantize avx: copy 32x32 implementation
>
> Ensure avx and ssse3 stay in sync by testing them against each other.
>
> Change-Id: I699f3b48785c83260825402d7826231f475f697c

Change-Id: Ieeef11b9406964194028b0d81d84bcb63296ae06

Linfeng Zhang [Tue, 12 Sep 2017 18:37:04 +0000 (11:37 -0700)]

Specialize 4 to 3 scaling in vp9_scale_and_extend_frame_c()

Scale 3x3 block instead of 16x16 block in each loop.

Benefits:
1. Reduced number of different phase_scaler from 16 to 3. Optimization code
will be smaller and faster.
2. The maximum phase_scaler drifting will be reduced from 5/16 to 1/24.
(The drifting is 1/(3*16) in each step.)

BUG=webm:1419

Change-Id: Ibb9242a629ddb03e1ff93b859bece738255e698c
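
The figures in the benefits list can be checked with a short calculation. The sketch below assumes positions are tracked in 1/16-pixel units and that the exact 4:3 step of 64/3 sixteenths is truncated to 21, which matches the stated per-step drift of 1/(3*16) pixel. The function and macro names are hypothetical.

```c
#define STEP_Q4 21 /* assumed step: floor(64 / 3) in 1/16-pixel units */

/* Largest accumulated drift within one block, in units of 1/48 pixel.
 * Exact position of output i is i * 4/3 px = 64 * i / 48 px;
 * the truncated position is i * 21 / 16 px = 63 * i / 48 px. */
static int max_drift_in_48ths(int outputs_per_block) {
  int max_drift = 0;
  for (int i = 0; i < outputs_per_block; ++i) {
    const int exact = i * 64;
    const int approx = i * STEP_Q4 * 3;
    if (exact - approx > max_drift) max_drift = exact - approx;
  }
  return max_drift;
}

/* Number of distinct 1/16-pixel phases used within one block. */
static int distinct_phases(int outputs_per_block) {
  int seen[16] = { 0 };
  int count = 0;
  for (int i = 0; i < outputs_per_block; ++i) {
    const int phase = (i * STEP_Q4) & 15;
    if (!seen[phase]) {
      seen[phase] = 1;
      ++count;
    }
  }
  return count;
}
```

For 16 outputs per block this gives 15/48 = 5/16 pixel and 16 phases; for 3 outputs per block it gives 2/48 = 1/24 pixel and 3 phases, in line with the numbers above.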

Kaustubh Raste [Tue, 12 Sep 2017 10:05:07 +0000 (15:35 +0530)]

Optimize mips msa vp9 average mc functions

Use specific destination loads instead of vector loads.

Change-Id: I65ca13ae8f608fad07121fef848e2a18f54171fe

Scott LaVarnway [Mon, 11 Sep 2017 22:32:23 +0000 (22:32 +0000)]

Merge "vpxdsp: [x86] add highbd_d207_predictor functions"

Linfeng Zhang [Thu, 7 Sep 2017 19:50:36 +0000 (12:50 -0700)]

Add 4 to 1 scaling NEON optimization

BUG=webm:1419

Change-Id: If82a93935d2453e61b7647aae70983db1740bec7

Scott LaVarnway [Wed, 6 Sep 2017 17:08:03 +0000 (10:08 -0700)]

vpxdsp: [x86] add highbd_d207_predictor functions

C vs SSE2 speed gains:
_4x4 : ~2.31x

C vs SSSE3 speed gains:
_8x8 : ~4.73x
_16x16 : ~10.88x
_32x32 : ~4.80x

BUG=webm:1411

Change-Id: I0bac29db261079181ddabc6814bd62c463109caf

Shiyou Yin [Fri, 8 Sep 2017 01:42:51 +0000 (09:42 +0800)]

vp8: [loongson] optimize loopfilter with mmi

1. vp8_loop_filter_horizontal_edge_mmi
2. vp8_loop_filter_vertical_edge_mmi
3. vp8_mbloop_filter_horizontal_edge_mmi
4. vp8_mbloop_filter_vertical_edge_mmi
5. vp8_loop_filter_simple_horizontal_edge_mmi
6. vp8_loop_filter_simple_vertical_edge_mmi

Change-Id: Ie34bbff3a16cff64e39a50798afd2b7dac9bcdc3

James Zern [Sat, 9 Sep 2017 02:20:07 +0000 (19:20 -0700)]

intrapred: sync highbd_d63_predictor w/d63_

8/16/32: ~6%/~18%/~33% faster

previously:
7012ba639 vp9_reconintra: simplify d63_predictor

BUG=webm:1411

Change-Id: Ie775f3a4f7fd74df44754e65686d826a51c2cdc2

James Zern [Sat, 9 Sep 2017 01:57:08 +0000 (18:57 -0700)]

vpx_mem: make vpx_memset16 inline

Change-Id: Ibb2cab930c95836e6d6e66300c33e7d08e4474d4

James Zern [Sat, 9 Sep 2017 01:52:01 +0000 (18:52 -0700)]

intrapred: sync highbd_d45_predictor w/d45_

8/16/32: ~19%/~54%/~75.5% faster

previously:
acc481eaa vp9_reconintra: simplify d45_predictor

BUG=webm:1411

Change-Id: Ie8340b0c5070ae640f124733f025e4e749b660d8

James Zern [Fri, 8 Sep 2017 19:23:40 +0000 (19:23 +0000)]

Merge changes I9ec438aa,I99c954ff

* changes:
  Update convolve functions' assertions
  Add 2 to 1 scaling NEON optimization

paulwilkins [Tue, 29 Aug 2017 20:08:08 +0000 (13:08 -0700)]

Fix bug in intra mode rd penalty.

The intra mode rd penalty was implemented as a rate penalty.
Code was added to scale the penalty according to block size but
this was not done correctly for the SB level or sub 8x8.

The code did a weird double scaling in regard to bit depth that
has been removed. Given that it is a rate penalty the bit depth
should not matter.

This bug fix improves average metrics on our standard test
sets by about 0.1%

Change-Id: I7cf81b66aad0cda389fe234f47beba01c7493b1e

James Zern [Fri, 8 Sep 2017 07:06:25 +0000 (00:06 -0700)]

vpx_scale_test.h: remove #if from inside macro

fixes visual studio error

Change-Id: I86206f17ca951b15e247c1b92561847d8c21ec7a

Shiyou Yin [Fri, 8 Sep 2017 00:59:31 +0000 (00:59 +0000)]

Merge "vp8: [loongson] optimize sixtap predict with mmi"

Shiyou Yin [Fri, 8 Sep 2017 00:55:14 +0000 (00:55 +0000)]

Merge "vpxdsp: [loongson] optimize sad functions with mmi"

Linfeng Zhang [Wed, 6 Sep 2017 19:01:07 +0000 (12:01 -0700)]

Update convolve functions' assertions

So that 4 to 1 frame scaling can call them.

Change-Id: I9ec438aa63b923ba164ad3c59d7ecfa12789eab5

Linfeng Zhang [Tue, 5 Sep 2017 22:07:00 +0000 (15:07 -0700)]

Add 2 to 1 scaling NEON optimization

BUG=webm:1419

Change-Id: I99c954ffa50a62ccff2c4ab54162916141826d9b

Linfeng Zhang [Tue, 5 Sep 2017 21:48:17 +0000 (14:48 -0700)]

Refactor convolve8 NEON functions

Change-Id: I4ac576875c91fee7cb150d298fae4a2c156d374c

Linfeng Zhang [Tue, 5 Sep 2017 21:38:45 +0000 (14:38 -0700)]

Add ScaleFrameTest

Move class VpxScaleBase to new file test/vpx_scale_test.h.
Add new file test/vp9_scale_test.cc with ScaleFrameTest.

BUG=webm:1419

Change-Id: Iec2098eafcef99b94047de525e5da47bcab519c1

Linfeng Zhang [Wed, 6 Sep 2017 22:39:15 +0000 (22:39 +0000)]

Merge "Remove get_filter_base() and get_filter_offset() in convolve"

Scott LaVarnway [Wed, 6 Sep 2017 21:53:32 +0000 (21:53 +0000)]

Merge "vpxdsp: [x86] add highbd_dc_128_predictor functions"

Peter Boström [Wed, 6 Sep 2017 15:48:42 +0000 (11:48 -0400)]

Remove support for stdatomic.h.

This header doesn't build on g++ v6 as it's a C and not C++ header
(_Atomic is not a keyword in C++11). Since the C and C++ invocations
cannot be guaranteed to point to the same underlying atomic_int
implementation, remove support for them and use compiler intrinsics
instead.

BUG=webm:1461

Change-Id: Ie1cd6759c258042efc87f51f036b9aa53e4ea9d5
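
A hedged sketch of what "use compiler intrinsics instead" can look like. The commit message does not say which intrinsics were chosen; this example assumes the GCC/Clang __atomic builtins, which compile the same way from C and C++ translation units. The wrapper type and function names are hypothetical.

```c
/* Hypothetical wrapper so callers cannot access the value non-atomically
 * by accident. Requires a compiler providing the __atomic builtins
 * (GCC or Clang); other toolchains would need a different backend. */
typedef struct {
  int value;
} atomic_int_sketch;

/* Store with release ordering: earlier writes become visible to a
 * thread that later observes this value with an acquire load. */
static void atomic_store_release_sketch(atomic_int_sketch *a, int v) {
  __atomic_store_n(&a->value, v, __ATOMIC_RELEASE);
}

/* Load with acquire ordering, pairing with the release store above. */
static int atomic_load_acquire_sketch(atomic_int_sketch *a) {
  return __atomic_load_n(&a->value, __ATOMIC_ACQUIRE);
}
```

Because the builtins are language-neutral, C and C++ code sharing such a header operate on the same underlying representation, which is the guarantee the message says stdatomic.h could not provide.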

Linfeng Zhang [Mon, 28 Aug 2017 17:35:43 +0000 (10:35 -0700)]

Remove get_filter_base() and get_filter_offset() in convolve

so that the convolve functions are independent of table alignment.

Change-Id: Ieab132a30d72c6e75bbe9473544fbe2cf51541ee

Scott LaVarnway [Tue, 5 Sep 2017 14:52:36 +0000 (07:52 -0700)]

vpxdsp: [x86] add highbd_dc_128_predictor functions

C vs SSE2 speed gains:
_4x4 : ~7.64x
_8x8 : ~16.60x
_16x16 : ~8.15x
_32x32 : ~5.05x

BUG=webm:1411

Change-Id: If165d419711cfda901bd428a05ca1560a009e62e

Shiyou Yin [Sat, 2 Sep 2017 16:40:37 +0000 (00:40 +0800)]

vp8: [loongson] optimize sixtap predict with mmi

1. vp8_sixtap_predict16x16_mmi
2. vp8_sixtap_predict8x8_mmi
3. vp8_sixtap_predict8x4_mmi
4. vp8_sixtap_predict4x4_mmi

Change-Id: I186669d1a1d998a0f3ba3a548e25eee8b52c251b

Shiyou Yin [Sat, 2 Sep 2017 07:46:38 +0000 (15:46 +0800)]

vpxdsp: [loongson] optimize sad functions with mmi

1. vpx_sadWxH_c
2. vpx_sadWxH_avg_c
3. vpx_sadWxHx3_c
4. vpx_sadWxHx8_c
5. vpx_sadWxHx4d_c

Change-Id: Ie13161e3d73a052ea6ea7bac9cfadf55598fea7a

James Zern [Fri, 1 Sep 2017 03:07:01 +0000 (20:07 -0700)]

test,Android.mk: export gtest include path

fixes test file builds

Change-Id: Iaa725ad95d56cf77d9fef8994981a80102e9a966

clang-format [Mon, 28 Aug 2017 01:26:24 +0000 (18:26 -0700)]

apply clang-format

Change-Id: If4c3e8a396d0fcb304f407b44e28cac3219f038c

James Zern [Mon, 28 Aug 2017 01:22:04 +0000 (18:22 -0700)]

.clang-format: update to 4.0.1

based on Google style with the following differences:

3a4
> # Generated with clang-format 4.0.1
13c14
< AllowShortCaseLabelsOnASingleLine: false
---
> AllowShortCaseLabelsOnASingleLine: true
23c24
< BraceWrapping:
---
> BraceWrapping:
43c44
< ConstructorInitializerAllOnOneLineOrOnePerLine: true
---
> ConstructorInitializerAllOnOneLineOrOnePerLine: false
46,47c47,48
< Cpp11BracedListStyle: true
< DerivePointerAlignment: true
---
> Cpp11BracedListStyle: false
> DerivePointerAlignment: false
51c52
< IncludeCategories:
---
> IncludeCategories:
78c79
< PointerAlignment: Left
---
> PointerAlignment: Right
80c81
< SortIncludes: true
---
> SortIncludes: false

Change-Id: Ibc0ef87a516b8eae88d426dfdd7624be57e7b87c

Peter Boström [Fri, 1 Sep 2017 05:37:51 +0000 (05:37 +0000)]

Merge "Prevent data race from low-pass filter."

James Zern [Fri, 1 Sep 2017 03:09:49 +0000 (03:09 +0000)]

Merge "inv_txfm_vsx: fix loads in high-bitdepth"
