libvpx
7 years agoVP9: Add greedy version of av1_optimize_b().
Urvang Joshi [Thu, 8 Jun 2017 21:51:01 +0000 (14:51 -0700)]
VP9: Add greedy version of av1_optimize_b().

This was ported from the greedy version in AV1, written by Dake He
(dkhe@google.com).
See:
https://aomedia.googlesource.com/aom/+/master/av1/encoder/encodemb.c#137

Greedy version is disabled by default, but can be picked by setting
USE_GREEDY_OPTIMIZE_B to 1.
To be enabled by default later.

This is both faster and better in terms of compression.

Compression Improvement:
------------------------
lowres: -0.119
midres: -0.064
hdres:  -0.405

Speed Improvement:
------------------
(Based on encode time of 3 videos of different difficulties at
3 different target bitrates)
With --cpu-used=0: 0.38% to 5.55% faster
With --cpu-used=1: 0.24% to 2.79% faster
With --cpu-used=2: 0.29% to 1.46% faster

Change-Id: Ia7a23b3b244ad8eb253ac9e43cd03c5e021d2635

7 years agoMerge changes Ibf9d120b,I341399ec,Iaa5dd63b,Id59865fd
Linfeng Zhang [Thu, 15 Jun 2017 17:57:50 +0000 (17:57 +0000)]
Merge changes Ibf9d120b,I341399ec,Iaa5dd63b,Id59865fd

* changes:
  Update high bitdepth load_input_data() in x86
  Clean array_transpose_{4X8,16x16,16x16_2) in x86
  Remove array_transpose_8x8() in x86
  Convert 8x8 idct x86 macros to inline functions

7 years agoMerge "vp8: Adjust the pred_err threhsold for drop on overshoot."
Marco Paniconi [Wed, 14 Jun 2017 15:59:54 +0000 (15:59 +0000)]
Merge "vp8: Adjust the pred_err threhsold for drop on overshoot."

7 years agoUpdate high bitdepth load_input_data() in x86
Linfeng Zhang [Tue, 13 Jun 2017 23:53:53 +0000 (16:53 -0700)]
Update high bitdepth load_input_data() in x86

BUG=webm:1412

Change-Id: Ibf9d120b80c7d3a7637e79e123cf2f0aae6dd78c

7 years agoClean array_transpose_{4X8,16x16,16x16_2) in x86
Linfeng Zhang [Mon, 12 Jun 2017 23:23:53 +0000 (16:23 -0700)]
Clean array_transpose_{4X8,16x16,16x16_2) in x86

Change-Id: I341399ecbde37065375ea7e63511a26bfc285ea0

7 years agoRemove array_transpose_8x8() in x86
Linfeng Zhang [Mon, 12 Jun 2017 22:45:50 +0000 (15:45 -0700)]
Remove array_transpose_8x8() in x86

Duplicate of transpose_16bit_8x8()

Change-Id: Iaa5dd63b5cccb044974a65af22c90e13418e311f

7 years agoConvert 8x8 idct x86 macros to inline functions
Linfeng Zhang [Mon, 12 Jun 2017 21:03:37 +0000 (14:03 -0700)]
Convert 8x8 idct x86 macros to inline functions

Change-Id: Id59865fd6c453a24121ce7160048d67875fc67ce

7 years agovp8_skin_detection: add 'vp8_' prefix to public fns
James Zern [Tue, 13 Jun 2017 01:08:20 +0000 (18:08 -0700)]
vp8_skin_detection: add 'vp8_' prefix to public fns

BUG=webm:1438

Change-Id: I5feb31c254d02e116e624cfe702e73ba5a1f7aca

7 years agorename vp8/common/skin_detection.[hc] -> vp8_*
James Zern [Tue, 13 Jun 2017 00:54:21 +0000 (17:54 -0700)]
rename vp8/common/skin_detection.[hc] -> vp8_*

some build systems have trouble with duplicate basenames.
vpx_dsp/skin_detection.[hc] were added in:
658e85425 Merge skin detection code in vp8/9.

BUG=webm:1438

Change-Id: Ieaa70b40bda409ec23e6d179b47a930ac6243b05

7 years agovp8: Adjust the pred_err threhsold for drop on overshoot.
Marco [Mon, 12 Jun 2017 16:51:47 +0000 (09:51 -0700)]
vp8: Adjust the pred_err threhsold for drop on overshoot.

Change-Id: Ica2a09ac87160936b6f7bd01f167f464ea3ac41c

7 years agoMerge "vp9 level targeting: more strict constraint on min_gf_interval"
Hui Su [Mon, 12 Jun 2017 16:38:02 +0000 (16:38 +0000)]
Merge "vp9 level targeting: more strict constraint on min_gf_interval"

7 years agoMerge "Remove duplication on vp8/9_write_yuv_frame."
Jerome Jiang [Sat, 10 Jun 2017 04:50:19 +0000 (04:50 +0000)]
Merge "Remove duplication on vp8/9_write_yuv_frame."

7 years agovp9: SVC: Use prune_evenemore only for non_reference.
Marco [Sat, 10 Jun 2017 00:48:03 +0000 (17:48 -0700)]
vp9: SVC: Use prune_evenemore only for non_reference.

Set subpel prune_evenmore only for non_reference frames,
instead of all TL > 0 frames. Gain some quality back at
cost of small speed loss (~1-2%).

Change only affects SVC encoding at speed >= 7.

Change-Id: I5b9f51e51dccfd7050521a66996176b0415ca3f9

7 years agoRemove duplication on vp8/9_write_yuv_frame.
Jerome Jiang [Thu, 8 Jun 2017 23:07:02 +0000 (16:07 -0700)]
Remove duplication on vp8/9_write_yuv_frame.

Change-Id: Ib3546032a27c715bf509c0e24d26a189bc829da8

7 years agoMerge "idct_test: don't use std::nothrow anymore"
Johann Koenig [Fri, 9 Jun 2017 20:42:39 +0000 (20:42 +0000)]
Merge "idct_test: don't use std::nothrow anymore"

7 years agoMerge "buffer.h: allow declaring an alignment"
Johann Koenig [Fri, 9 Jun 2017 20:42:21 +0000 (20:42 +0000)]
Merge "buffer.h: allow declaring an alignment"

7 years agoMerge "Remove some dead code. Coverity CID 1310058"
Johann Koenig [Fri, 9 Jun 2017 20:41:56 +0000 (20:41 +0000)]
Merge "Remove some dead code. Coverity CID 1310058"

7 years agoidct_test: don't use std::nothrow anymore
Johann [Thu, 8 Jun 2017 17:09:23 +0000 (10:09 -0700)]
idct_test: don't use std::nothrow anymore

But still check for NULL before calling Init()

Change-Id: I2bf2887e1064c9103d29c542d20365c0aea75d76

7 years agobuffer.h: allow declaring an alignment
Johann [Tue, 6 Jun 2017 22:15:47 +0000 (15:15 -0700)]
buffer.h: allow declaring an alignment

x86 simd register operations generally prefer and may require 16 byte
alignment.

Change-Id: I73ce577a90dc66af60743c5727c36f23200950ba
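Illustration of the alignment note above: a minimal sketch of handing out a 16-byte-aligned pointer from a plain malloc() block. The AlignedBlock type and both helpers are hypothetical names for this example, not the buffer.h interface.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper type, for illustration only. */
typedef struct {
  void *raw;     /* pointer returned by malloc(), needed for free() */
  uint8_t *data; /* aligned pointer inside the raw block */
} AlignedBlock;

/* Returns 1 on success, 0 on bad alignment or allocation failure. */
static int aligned_block_init(AlignedBlock *b, size_t size, size_t alignment) {
  uintptr_t addr;
  /* The alignment must be a nonzero power of two. */
  if (alignment == 0 || (alignment & (alignment - 1)) != 0) return 0;
  if (size > SIZE_MAX - (alignment - 1)) return 0;
  /* Over-allocate so an aligned address always fits inside the block. */
  b->raw = malloc(size + alignment - 1);
  if (b->raw == NULL) return 0;
  /* Round the address up to the next multiple of the alignment. */
  addr = ((uintptr_t)b->raw + alignment - 1) & ~(uintptr_t)(alignment - 1);
  b->data = (uint8_t *)addr;
  return 1;
}

static void aligned_block_free(AlignedBlock *b) {
  free(b->raw);
  b->raw = NULL;
  b->data = NULL;
}
```

The raw pointer is kept because free() must receive the address malloc() returned, not the rounded-up one.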

7 years agoRemove some dead code. Coverity CID 1310058
Sylvestre Ledru [Fri, 9 Jun 2017 12:08:18 +0000 (14:08 +0200)]
Remove some dead code. Coverity CID 1310058

Change-Id: I1186cf1dd8cde42f5970928f43edfc852298289d

7 years agoMerge "vp8_decode_frame: fix oob read on truncated key frame"
James Zern [Thu, 8 Jun 2017 23:17:49 +0000 (23:17 +0000)]
Merge "vp8_decode_frame: fix oob read on truncated key frame"

7 years agovp8_decode_frame: fix oob read on truncated key frame
James Zern [Thu, 8 Jun 2017 03:46:13 +0000 (20:46 -0700)]
vp8_decode_frame: fix oob read on truncated key frame

the check for error correction being disabled was overriding the data
length checks. this avoids returning incorrect information (width /
height) for the decoded frame which could result in inconsistent sizes
returned to an application, causing it to read beyond the bounds of
the frame allocation.

BUG=webm:1443
BUG=b/62458770

Change-Id: I063459674e01b57c0990cb29372e0eb9a1fbf342
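Illustration of the fix described above: a minimal sketch in which the length check runs unconditionally, before any error-tolerance flag is consulted, so a truncated key frame never yields a width or height. The function name and the ec_enabled parameter are hypothetical; this is not the vp8_decode_frame() code. The layout assumed is the VP8 key frame header: a 3-byte frame tag, the start code 0x9d 0x01 0x2a, then two 16-bit little-endian fields whose low 14 bits carry width and height.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parser. Returns 1 and fills width/height on success;
 * returns 0 and leaves the outputs untouched otherwise. */
static int read_keyframe_size(const uint8_t *data, size_t size,
                              int ec_enabled, int *width, int *height) {
  /* Length check first, regardless of ec_enabled: 3 (tag) + 3 (start
   * code) + 4 (dimensions) bytes must be present. */
  if (size < 10) return 0;

  /* A bad start code is only tolerated when error concealment is on. */
  if (data[3] != 0x9d || data[4] != 0x01 || data[5] != 0x2a) {
    if (!ec_enabled) return 0;
  }

  *width = (data[6] | (data[7] << 8)) & 0x3fff;
  *height = (data[8] | (data[9] << 8)) & 0x3fff;
  return 1;
}
```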

7 years agoRevert "buffer.h: use size_t"
Johann [Thu, 8 Jun 2017 17:20:21 +0000 (10:20 -0700)]
Revert "buffer.h: use size_t"

This reverts commit f08581c1d010ea95b8cfae686b5c0a64b32519f9.

type conversion warnings abound.

Change-Id: I41d4c0e7a388e1008bdbc55fefda4bbca3f89f00

7 years agoMerge "Merge skin detection code in vp8/9."
Jerome Jiang [Thu, 8 Jun 2017 16:35:59 +0000 (16:35 +0000)]
Merge "Merge skin detection code in vp8/9."

7 years agoMerge "fdct16x16 neon optimization"
Johann Koenig [Thu, 8 Jun 2017 15:19:35 +0000 (15:19 +0000)]
Merge "fdct16x16 neon optimization"

7 years agoMerge skin detection code in vp8/9.
Jerome Jiang [Mon, 5 Jun 2017 18:09:05 +0000 (11:09 -0700)]
Merge skin detection code in vp8/9.

BUG=webm:1438

Change-Id: Ie3dc034c7dbb498a0b088a767b1936ddeed4df14

7 years agovp9 level targeting: more strict constraint on min_gf_interval
hui su [Wed, 7 Jun 2017 22:50:36 +0000 (15:50 -0700)]
vp9 level targeting: more strict constraint on min_gf_interval

min_gf_interval should be no less than min_altref_distance + 1,
as the encoder may produce bitstream with alt-ref distance being
min_gf_interval - 1.

BUG=b/38450599

Change-Id: Ifb733daa643ebc668d1b23e1ce92db94b66dabe8
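Illustration of the constraint above: since the alt-ref distance can be as small as min_gf_interval - 1, keeping that distance at or above min_altref_distance requires min_gf_interval >= min_altref_distance + 1. A minimal sketch, with a hypothetical function name:

```c
/* Hypothetical helper: raise min_gf_interval to the smallest value that
 * keeps (min_gf_interval - 1) >= min_altref_distance. */
static int constrain_min_gf_interval(int min_gf_interval,
                                     int min_altref_distance) {
  const int lower_bound = min_altref_distance + 1;
  return min_gf_interval < lower_bound ? lower_bound : min_gf_interval;
}
```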

7 years agofdct16x16 neon optimization
Johann [Thu, 25 May 2017 17:02:34 +0000 (10:02 -0700)]
fdct16x16 neon optimization

Roughly 2x speedup. Since the only change for HBD is to store(), the
improvement appears to hold there as well.

BUG=webm:1424

Change-Id: I15b813d50deb2e47b49a6b0705945de748e83c19

7 years agoMerge "vp9: SVC: Enable simple_block_yrd for temporal layers."
Marco Paniconi [Wed, 7 Jun 2017 21:12:13 +0000 (21:12 +0000)]
Merge "vp9: SVC: Enable simple_block_yrd for temporal layers."

7 years agoMerge changes Iade45f69,I18d90658,Ieca3f1ef
Johann Koenig [Wed, 7 Jun 2017 19:20:15 +0000 (19:20 +0000)]
Merge changes Iade45f69,I18d90658,Ieca3f1ef

* changes:
  buffer.h: add num_elements_
  buffer.h: zero-init all values
  buffer.h: use size_t

7 years agovp9: SVC: Enable simple_block_yrd for temporal layers.
Marco [Wed, 7 Jun 2017 18:33:40 +0000 (11:33 -0700)]
vp9: SVC: Enable simple_block_yrd for temporal layers.

Enable simple_block_yrd for temporal enhancement layers (TL > 0).
And remove block size condition for SVC mode.
Only affects speed >= 7 SVC.

Speedup ~3-4%.
avgPSNR regression on RTC for (3 spatial, 3 temporal) layers: ~1%.

Change-Id: Iff4fc191623b71c69cd373e7c0823385e7ac67ed

7 years agobuffer.h: add num_elements_
Johann [Tue, 6 Jun 2017 20:17:07 +0000 (13:17 -0700)]
buffer.h: add num_elements_

raw_size_ was being incorrectly computed and used

Change-Id: Iade45f69964c567ffb258880f26006a96ae5a30d

7 years agobuffer.h: zero-init all values
Johann [Wed, 7 Jun 2017 18:27:26 +0000 (11:27 -0700)]
buffer.h: zero-init all values

Change-Id: I18d90658bcd4365d49adcadd6954090b3b399aa8

7 years agobuffer.h: use size_t
Johann [Tue, 6 Jun 2017 19:58:15 +0000 (12:58 -0700)]
buffer.h: use size_t

Change-Id: Ieca3f1ef23cd1d7b844ea3ecb054007ed280b04f

7 years agovp9: SVC: Enable row-mt in sample encoder.
Marco [Wed, 7 Jun 2017 17:27:19 +0000 (10:27 -0700)]
vp9: SVC: Enable row-mt in sample encoder.

Change-Id: I4b51043cb3f5955efe947fe4685aed4a21adb8bd

7 years agoMerge "ppc: Add vpx_sadnxmx4d_vsx for n,m = {8, 16, 32 ,64}"
James Zern [Tue, 6 Jun 2017 23:52:39 +0000 (23:52 +0000)]
Merge "ppc: Add vpx_sadnxmx4d_vsx for n,m = {8, 16, 32 ,64}"

7 years agoMerge "vp9: SVC: Adjust some speed settings for SVC speed >= 7."
Marco Paniconi [Tue, 6 Jun 2017 23:07:44 +0000 (23:07 +0000)]
Merge "vp9: SVC: Adjust some speed settings for SVC speed >= 7."

7 years agovp9: SVC: Adjust some speed settings for SVC speed >= 7.
Marco [Tue, 6 Jun 2017 22:28:06 +0000 (15:28 -0700)]
vp9: SVC: Adjust some speed settings for SVC speed >= 7.

Keep the 1/4subpel for all frames, use SUBPEL_TREE_PRUNED_EVENMORE
for all temporal enhancement layer frames.

Change-Id: Ibc681acbb6fc75b7b3c57fc483fcb11d591dfc9a

7 years agobuffer.h: split out init
Johann [Tue, 6 Jun 2017 19:48:01 +0000 (12:48 -0700)]
buffer.h: split out init

Change-Id: Idfbd2e01714ca9d00525c5aeba78678b43fb0287

7 years agobuffer.h: Use T for values
Johann [Tue, 6 Jun 2017 19:05:14 +0000 (12:05 -0700)]
buffer.h: Use T for values

Change-Id: I2da4110e843b6e361028b921c24b6ca2ea9077d9

7 years agoInitialize cost_list all to INT_MAX.
Jerome Jiang [Tue, 6 Jun 2017 17:13:34 +0000 (10:13 -0700)]
Initialize cost_list all to INT_MAX.

It is initialized to be { INT_MAX, 0, ... } in ffe0f9b.
No effect on encoders.
Make it consistent with other initializations.

BUG=webm:1440

Change-Id: Ie2a180d93626b55914c8c4255e466a1986d2b922
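Illustration of the note above: in C, an initializer such as { INT_MAX } sets only the first element and zero-fills the rest, which is how an array ends up as { INT_MAX, 0, ... }. A minimal sketch of filling every entry instead; the helper name and the array length used in any caller are assumptions for this example.

```c
#include <limits.h>

/* Hypothetical helper: set every entry of cost_list to INT_MAX. */
static void init_cost_list(int *cost_list, int num_entries) {
  int i;
  for (i = 0; i < num_entries; ++i) cost_list[i] = INT_MAX;
}
```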

7 years agovp9_mcomp,get_cost_surf_min: quiet conversion warning
James Zern [Tue, 6 Jun 2017 05:52:58 +0000 (22:52 -0700)]
vp9_mcomp,get_cost_surf_min: quiet conversion warning

visual studio will warn if a 32-bit shift is implicitly converted to 64.
in this case integer storage is enough for the result.
since:
f3a9ae5ba Fix ubsan failure in vp9_mcomp.c.

Change-Id: I7e0e199ef8d3c64e07b780c8905da8c53c1d09fc
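Illustration of the warning above: the expression 1 << bits is computed in int, so assigning it to a 64-bit variable widens a 32-bit result. A minimal sketch of the two ways to make the intent explicit; both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Keep the result in an int when the value is known to fit. */
static int shift_in_int(int bits) {
  assert(bits >= 0 && bits < 31); /* 1 << 31 would overflow a 32-bit int */
  return 1 << bits;
}

/* Widen the operand first when a 64-bit result is really needed. */
static int64_t shift_in_int64(int bits) {
  assert(bits >= 0 && bits < 63);
  return (int64_t)1 << bits;
}
```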

7 years agoMerge "Fix valgrind failure on uninitialized variables."
Jerome Jiang [Tue, 6 Jun 2017 03:47:30 +0000 (03:47 +0000)]
Merge "Fix valgrind failure on uninitialized variables."

7 years agoMerge "ppc: Add vpx_sad64/32/16x64/32/16_avg_vsx"
James Zern [Tue, 6 Jun 2017 02:19:41 +0000 (02:19 +0000)]
Merge "ppc: Add vpx_sad64/32/16x64/32/16_avg_vsx"

7 years agoFix valgrind failure on uninitialized variables.
Jerome Jiang [Mon, 5 Jun 2017 18:41:02 +0000 (11:41 -0700)]
Fix valgrind failure on uninitialized variables.

BUG=webm:1440

Change-Id: I7074e42bdfa8dd25f11bbb3f2ab1b41d6f4c12e4

7 years agoFix ubsan failure in vp9_mcomp.c.
Jerome Jiang [Fri, 2 Jun 2017 20:50:08 +0000 (13:50 -0700)]
Fix ubsan failure in vp9_mcomp.c.

Change-Id: Iff1dea1fe9d4ea1d3fc95ea736ddf12f30e6f48d

7 years agovp9: SVC: Force subpel search off under certain conditions.
Marco [Fri, 26 May 2017 18:36:45 +0000 (11:36 -0700)]
vp9: SVC: Force subpel search off under certain conditions.

For SVC 1 pass non-rd mode:
Force subpel search off for SVC for non-reference frames
under motion threshold.

Add flag to svc context to indicate if the frame is not used
as a reference.

Little/no quality loss, ~2% speedup.

Change-Id: Ic433c44b514d19d08b28f80ff05231dc943b28e9

7 years agoMerge "vp9: Speed >8: Set subpel_search_method for low motion."
Marco Paniconi [Thu, 1 Jun 2017 23:56:53 +0000 (23:56 +0000)]
Merge "vp9: Speed >8: Set subpel_search_method for low motion."

7 years agovp9: Speed >8: Set subpel_search_method for low motion.
Marco [Thu, 1 Jun 2017 23:07:15 +0000 (16:07 -0700)]
vp9: Speed >8: Set subpel_search_method for low motion.

Speed >=8: for resolutions above CIF, and for low motion content,
set subpel_search_method to SUBPEL_TREE_PRUNED_EVENMORE.

Small speed gain (~2%) on vga clips,
RTC metrics up by ~2-3% on average.

Change-Id: Ie26ba0264589652f92dfe74308740debf94cf0cc

7 years agovp8 skin detection: Fix visual studio build failure.
Jerome Jiang [Thu, 1 Jun 2017 20:46:46 +0000 (13:46 -0700)]
vp8 skin detection: Fix visual studio build failure.

Change-Id: I510b755550ebbfa2aaf9b974920d7f1c6454a845

7 years agoFix corruption in skin map debugging output yuv.
Jerome Jiang [Wed, 31 May 2017 22:56:29 +0000 (15:56 -0700)]
Fix corruption in skin map debugging output yuv.

For both vp8 and vp9.

BUG=webm:1437

Change-Id: Ifd06f68a876ade91cc2cc27c574c4641b77cce28

7 years agovp8: Clean up skin detection.
Jerome Jiang [Wed, 31 May 2017 21:57:10 +0000 (14:57 -0700)]
vp8: Clean up skin detection.

Use only the average of center 2x2 pixels in vp8.

Change-Id: I2b23ff19a90827226273e0fca49e90c734eda59b
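Illustration of the "center 2x2" average above: a minimal sketch that averages the four pixels around the middle of a square block. The function name, the block-size parameter, and the round-to-nearest choice are assumptions for this example, not taken from the vp8 source.

```c
#include <stdint.h>

/* Hypothetical helper: rounded average of the four pixels at the center
 * of a bsize x bsize block (bsize even and at least 2). */
static int center_2x2_avg(const uint8_t *buf, int stride, int bsize) {
  const int offset = bsize / 2 - 1;
  const uint8_t *c = buf + offset * stride + offset;
  return (c[0] + c[1] + c[stride] + c[stride + 1] + 2) >> 2;
}
```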

7 years agoMerge "comp_avg_pred neon: used by sub pixel avg variance"
Johann Koenig [Wed, 31 May 2017 18:17:27 +0000 (18:17 +0000)]
Merge "comp_avg_pred neon: used by sub pixel avg variance"

7 years agoMerge "Write skin map of vp8 skin detection for debug."
Jerome Jiang [Wed, 31 May 2017 16:37:07 +0000 (16:37 +0000)]
Merge "Write skin map of vp8 skin detection for debug."

7 years agoMerge "Update vpx_highbd_idct4x4_16_add_sse2()"
Linfeng Zhang [Wed, 31 May 2017 15:56:19 +0000 (15:56 +0000)]
Merge "Update vpx_highbd_idct4x4_16_add_sse2()"

7 years agocomp_avg_pred neon: used by sub pixel avg variance
Johann [Wed, 3 May 2017 21:58:52 +0000 (14:58 -0700)]
comp_avg_pred neon: used by sub pixel avg variance

BUG=webm:1423

Change-Id: I33de537f238f58f89b7a6c1c2d6e8110de4b8804

7 years agoWrite skin map of vp8 skin detection for debug.
Jerome Jiang [Fri, 26 May 2017 23:20:49 +0000 (16:20 -0700)]
Write skin map of vp8 skin detection for debug.

Change-Id: Ica1b4e918aa759cd0ce65920f9d88452bbf9e3b4

7 years agoUpdate vpx_highbd_idct4x4_16_add_sse2()
Linfeng Zhang [Mon, 22 May 2017 23:04:05 +0000 (16:04 -0700)]
Update vpx_highbd_idct4x4_16_add_sse2()

BUG=webm:1412

Change-Id: I26e4b34ae9bc1ae80c24f56d740d737a95f1ab84

7 years agoMerge "comp_avg_pred: alignment"
Johann Koenig [Tue, 30 May 2017 16:21:05 +0000 (16:21 +0000)]
Merge "comp_avg_pred: alignment"

7 years agoMerge "remove DECLARE_ALIGNED from neon code"
Johann Koenig [Tue, 30 May 2017 15:58:16 +0000 (15:58 +0000)]
Merge "remove DECLARE_ALIGNED from neon code"

7 years agocomp_avg_pred: alignment
Johann [Tue, 30 May 2017 14:46:43 +0000 (07:46 -0700)]
comp_avg_pred: alignment

x86 requires 16 byte alignment for some vector loads/stores.

arm does not have the same requirement.

The asserts are still in avg_pred_sse2.c. This just removes them from
the common code.

Change-Id: Ic5175c607a94d2abf0b80d431c4e30c8a6f731b6

7 years agoMerge "Fix vp8 race when build --enable-vp9-highbitdepth."
Jerome Jiang [Tue, 30 May 2017 05:47:44 +0000 (05:47 +0000)]
Merge "Fix vp8 race when build --enable-vp9-highbitdepth."

7 years agoremove DECLARE_ALIGNED from neon code
Johann [Fri, 26 May 2017 17:41:57 +0000 (10:41 -0700)]
remove DECLARE_ALIGNED from neon code

Unlike x86, neon only requires type alignment when loading into vectors.

Change-Id: I7bbbe4d51f78776e499ce137578d8c0effdbc02f

7 years agoMerge "subpel variance neon: reduce stack usage"
Johann Koenig [Fri, 26 May 2017 17:25:46 +0000 (17:25 +0000)]
Merge "subpel variance neon: reduce stack usage"

7 years agoMerge "Use vdup instead of vmov"
Johann Koenig [Fri, 26 May 2017 17:25:23 +0000 (17:25 +0000)]
Merge "Use vdup instead of vmov"

7 years agoFix vp8 race when build --enable-vp9-highbitdepth.
Jerome Jiang [Sat, 20 May 2017 00:07:09 +0000 (17:07 -0700)]
Fix vp8 race when build --enable-vp9-highbitdepth.

Split vp8/vp9 implementations on yv12_copy_frame_c.
Remove high-bitdepth codes from vp8_yv12_extend_frame_borders_c.
Clean up vp8 codes usage in vp9.

BUG=webm:1435

Change-Id: Ic68e79e9d71e1b20ddfc451fb8dcf2447861236d

7 years agovp9: SVC: Fix to condiiton on using source_sad.
Marco [Fri, 26 May 2017 15:43:32 +0000 (08:43 -0700)]
vp9: SVC: Fix to condiiton on using source_sad.

Fix the condition on usage of source_sad for temporal layers.
The fix allows it to be used for the case of 1 temporal layer.

Change-Id: I02b1b0ade67a7889d1b93cee66d27c0951131fc3

7 years agoMerge "vp9: Use source_sad only on top temporal enhancement layer."
Marco Paniconi [Fri, 26 May 2017 05:24:05 +0000 (05:24 +0000)]
Merge "vp9: Use source_sad only on top temporal enhancement layer."

7 years agoMerge "vp9: SVC: Enable copy partition for SVC speed >= 7."
Marco Paniconi [Fri, 26 May 2017 05:23:47 +0000 (05:23 +0000)]
Merge "vp9: SVC: Enable copy partition for SVC speed >= 7."

7 years agovp9: Use source_sad only on top temporal enhancement layer.
Marco [Thu, 25 May 2017 23:29:48 +0000 (16:29 -0700)]
vp9: Use source_sad only on top temporal enhancement layer.

For 1 pass CBR SVC mode.

Change-Id: Ic026740f9d0ec5eee7c5845be9c5b15884fec48d

7 years agoRefactor: Move vp8 skin detection to new files.
Jerome Jiang [Thu, 25 May 2017 20:58:20 +0000 (13:58 -0700)]
Refactor: Move vp8 skin detection to new files.

Change-Id: If760f28cbbf22beac1cc9bd1546f13831e9dd3f0

7 years agovp9: SVC: Enable copy partition for SVC speed >= 7.
Marco [Thu, 25 May 2017 18:01:25 +0000 (11:01 -0700)]
vp9: SVC: Enable copy partition for SVC speed >= 7.

Adjust the max_copied_frame setting for temporal layers.
Keep the same setting for non-SVC at speed 8.
This change also enables copy_partition for non-SVC at speed 7,
but with smaller value of max_copied_frame (=2).

~2% speedup for SVC speed 7, 3 layers, with little/no quality loss.

Change-Id: Ic65ac9aad764ec65a35770d263424b2393ec6780

7 years agosubpel variance neon: reduce stack usage
Johann [Wed, 24 May 2017 18:52:42 +0000 (11:52 -0700)]
subpel variance neon: reduce stack usage

Unlike x86, arm does not impose additional alignment restrictions on
vector loads. For incoming values to the first pass, it uses vld1_u32()
which typically does impose a 4 byte alignment. However, as the first
pass operates on user-supplied values we must prepare for unaligned
values anyway (and have, see mem_neon.h).

But for the local temporary values there is no stride and the load will
use vld1_u8 which does not require 4 byte alignment.

There are 3 temporary structures. In the C, one is uint16_t. The arm
saturates between passes but still passes tests. If this becomes an
issue new functions will be needed.

Change-Id: I3c9d4701bfeb14b77c783d0164608e621bfecfb1
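Illustration of preparing for unaligned values, as mentioned above: a portable sketch that moves 4 bytes through memcpy() instead of dereferencing a possibly misaligned uint32_t pointer. The helper names are hypothetical and this is plain C, not the mem_neon.h code.

```c
#include <stdint.h>
#include <string.h>

/* Read 4 bytes from an address with no alignment guarantee. */
static uint32_t load_u32_unaligned(const void *src) {
  uint32_t value;
  memcpy(&value, src, sizeof(value));
  return value;
}

/* Write 4 bytes to an address with no alignment guarantee. */
static void store_u32_unaligned(void *dst, uint32_t value) {
  memcpy(dst, &value, sizeof(value));
}
```

Compilers typically turn a fixed-size memcpy() like this into a single load or store where the target allows it.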

7 years agoUse vdup instead of vmov
Johann [Wed, 24 May 2017 18:38:15 +0000 (11:38 -0700)]
Use vdup instead of vmov

Change-Id: Idb6248c1429b55176bb3e9f4e8365ea0ed2be62a

7 years agoMerge changes Iaab2b9a1,Idfb458d3
Johann Koenig [Wed, 24 May 2017 18:33:53 +0000 (18:33 +0000)]
Merge changes Iaab2b9a1,Idfb458d3

* changes:
  sub pel avg variance neon: 4x block sizes
  sub pel variance neon: 4x block sizes

7 years agoMerge changes I31fa6ef8,I228c6f29
Johann Koenig [Wed, 24 May 2017 18:32:01 +0000 (18:32 +0000)]
Merge changes I31fa6ef8,I228c6f29

* changes:
  sub pel avg variance neon: add neon optimizations
  sub pel variance neon: normalize variable names

7 years agoMerge "partial_idct_test,InitInput: fix rollover in mult"
James Zern [Wed, 24 May 2017 16:27:21 +0000 (16:27 +0000)]
Merge "partial_idct_test,InitInput: fix rollover in mult"

7 years agopartial_idct_test,InitInput: fix rollover in mult
James Zern [Wed, 24 May 2017 13:25:44 +0000 (15:25 +0200)]
partial_idct_test,InitInput: fix rollover in mult

promote coeff to signed 64-bit to avoid exceeding integer bounds when
squaring the value

Change-Id: If77bef6bc0a6a4c39ca3013e5e2ddb426a1c6e1f
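Illustration of the promotion described above: a minimal sketch that widens each coefficient to int64_t before squaring, so the product cannot exceed 32-bit integer bounds. The function name is hypothetical and this is not the test's InitInput() code.

```c
#include <stdint.h>

/* Hypothetical helper: sum of squared coefficients, computed in 64 bits. */
static int64_t sum_of_squares(const int32_t *coeff, int count) {
  int64_t sum = 0;
  int i;
  for (i = 0; i < count; ++i) {
    const int64_t c = coeff[i]; /* promote before multiplying */
    sum += c * c;
  }
  return sum;
}
```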

7 years agoppc: Add vpx_sadnxmx4d_vsx for n,m = {8, 16, 32 ,64}
Alexandra Hájková [Wed, 24 May 2017 13:27:09 +0000 (13:27 +0000)]
ppc: Add vpx_sadnxmx4d_vsx for n,m = {8, 16, 32 ,64}

Change-Id: I547d0099e15591655eae954e3ce65fdf3b003123

7 years agoUpdate inv_txfm_sse2.h and inv_txfm_sse2.c
Linfeng Zhang [Mon, 22 May 2017 22:37:15 +0000 (15:37 -0700)]
Update inv_txfm_sse2.h and inv_txfm_sse2.c

Extract shared code into inline functions.

Change-Id: Iee1e5a4bc6396aeed0d301163095c9b21aa66b2f

7 years agoUpdate InitInput() in test/partial_idct_test.cc
Linfeng Zhang [Mon, 22 May 2017 22:46:28 +0000 (15:46 -0700)]
Update InitInput() in test/partial_idct_test.cc

Make it work in high bit depth.

BUG=webm:1412

Change-Id: Ic5cfd410a69709f01e2924774356a108a349d273

7 years agoAdd support for Visual Studio 2017
Gregor Jasny [Tue, 23 May 2017 07:30:44 +0000 (09:30 +0200)]
Add support for Visual Studio 2017

BUG=webm:1428

Change-Id: Iba98aef1159724d106cf39b94d7b69843d76cd48

7 years agosub pel avg variance neon: 4x block sizes
Johann [Thu, 4 May 2017 21:48:43 +0000 (14:48 -0700)]
sub pel avg variance neon: 4x block sizes

BUG=webm:1423

Change-Id: Iaab2b9a183fdb54aae5f717aba95d90dc36a9e3b

7 years agosub pel variance neon: 4x block sizes
Johann [Thu, 4 May 2017 15:39:12 +0000 (08:39 -0700)]
sub pel variance neon: 4x block sizes

Add optimizations for blocks of width 4

BUG=webm:1423

Change-Id: Idfb458d36db3014d48fbfbe7f5462aa6eb249938

7 years agosub pel avg variance neon: add neon optimizations
Johann [Wed, 3 May 2017 19:28:32 +0000 (12:28 -0700)]
sub pel avg variance neon: add neon optimizations

These are missing an optimized version of vpx_comp_avg_pred

BUG=webm:1423

Change-Id: I31fa6ef842e98f7ff3ea079ffed51ae33178e2ed

7 years agosub pel variance neon: normalize variable names
Johann [Wed, 3 May 2017 19:12:44 +0000 (12:12 -0700)]
sub pel variance neon: normalize variable names

match vpx_dsp/variance.c variable names

Change-Id: I228c6f296c183af147b079b7c8bcdf97bd09cf3a

7 years agoMerge "Add vpx_highbd_idct{4x4,8x8,16x16}_1_add_sse2"
Linfeng Zhang [Mon, 22 May 2017 20:58:17 +0000 (20:58 +0000)]
Merge "Add vpx_highbd_idct{4x4,8x8,16x16}_1_add_sse2"

7 years agovariance neon: assert overflow conditions
Johann [Thu, 4 May 2017 16:07:28 +0000 (09:07 -0700)]
variance neon: assert overflow conditions

Change-Id: I12faca82d062eb33dc48dfeb39739b25112316cd

7 years agoAdd vpx_highbd_idct{4x4,8x8,16x16}_1_add_sse2
Linfeng Zhang [Wed, 17 May 2017 19:37:23 +0000 (12:37 -0700)]
Add vpx_highbd_idct{4x4,8x8,16x16}_1_add_sse2

BUG=webm:1412

Change-Id: Ia338a6057d36f9ed7eaa9cbd4dfbf0c3cbdc6468

7 years agoneon variance: special case 4x
Johann [Mon, 15 May 2017 23:30:00 +0000 (16:30 -0700)]
neon variance: special case 4x

The sub pixel variance uses a temp buffer which guarantees width ==
stride. Take advantage of this with the 4x and avoid the very costly
lane loads.

Change-Id: Ia0c97eb8c29dc8dfa6e51a29dff9b75b3c6726f1
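Illustration of why width == stride helps: when the two are equal the rows sit back to back in memory, so a 4-wide block can be read as one contiguous run instead of row by row. A minimal portable sketch with a hypothetical function name; it uses a plain sum and is not the neon variance code.

```c
#include <stdint.h>

/* Hypothetical helper: sum the pixels of a width x height block. */
static unsigned int sum_block(const uint8_t *src, int stride, int width,
                              int height) {
  unsigned int sum = 0;
  int r, c;
  if (stride == width) {
    /* Rows are contiguous: treat the block as one flat array. */
    for (c = 0; c < width * height; ++c) sum += src[c];
    return sum;
  }
  /* General case: step over the padding between rows. */
  for (r = 0; r < height; ++r) {
    for (c = 0; c < width; ++c) sum += src[r * stride + c];
  }
  return sum;
}
```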

7 years agoMerge changes Ib8dd96f7,Ie9854b77
Johann Koenig [Mon, 22 May 2017 17:48:32 +0000 (17:48 +0000)]
Merge changes Ib8dd96f7,Ie9854b77

* changes:
  neon variance: process 4x blocks
  use memcpy for unaligned neon stores

7 years agoMerge "vp9: Adjustments to cyclic refresh for high motion."
Marco Paniconi [Mon, 22 May 2017 06:27:30 +0000 (06:27 +0000)]
Merge "vp9: Adjustments to cyclic refresh for high motion."

7 years agovp9: Adjustments to cyclic refresh for high motion.
Marco [Mon, 22 May 2017 05:15:28 +0000 (22:15 -0700)]
vp9: Adjustments to cyclic refresh for high motion.

For aq-mode=3: refactor the condition for turning off
the refresh. Add some adjustments for high motion content.

No/little change in RTC metrics, only affects high motion case.

Change-Id: I7da8eabfb0e61db014be4562806f72ee5ef4a43b

7 years agovp9: Speed >= 8: Modify condition for low-resoln.
Marco [Mon, 22 May 2017 05:12:38 +0000 (22:12 -0700)]
vp9: Speed >= 8: Modify condition for low-resoln.

No change on RTC metrics.

Change-Id: I5abc573cb56572188d900645d13ba479f55a1ea0

7 years agoMerge "neon 4 byte helper functions"
Johann Koenig [Fri, 19 May 2017 17:11:30 +0000 (17:11 +0000)]
Merge "neon 4 byte helper functions"

7 years agoMerge "neon fdct: 4x4 implementation"
Johann Koenig [Fri, 19 May 2017 17:08:57 +0000 (17:08 +0000)]
Merge "neon fdct: 4x4 implementation"

7 years agoMerge "Changes to modified error."
Paul Wilkins [Fri, 19 May 2017 12:24:32 +0000 (12:24 +0000)]
Merge "Changes to modified error."

7 years agovp9: SVC: Modify condition to allow for copy partition.
Marco [Thu, 18 May 2017 21:12:24 +0000 (14:12 -0700)]
vp9: SVC: Modify condition to allow for copy partition.

When temporal layers are used, only allow for copy partition
on the top temporal enhancement layer frames.

Change-Id: I5472abdc0f9f6c8dafa75a7a84c615e08ae22af8

7 years agoMerge "vp9: Make copy partition work for SVC and dynamic resize."
Jerome Jiang [Thu, 18 May 2017 19:37:29 +0000 (19:37 +0000)]
Merge "vp9: Make copy partition work for SVC and dynamic resize."

7 years agovp9: Make copy partition work for SVC and dynamic resize.
Marco [Tue, 16 May 2017 00:14:11 +0000 (17:14 -0700)]
vp9: Make copy partition work for SVC and dynamic resize.

Only affects speed 8.

Make changes to copy partition to fix a bug in setting microblock
offset. Avg PSNR shows 0.02% gain on rtc_derf and 0.08% loss on rtc.

Change-Id: I61c3e5914dde645331344388e7437e5638acd4f3