granicus.if.org Git - libvpx/commit

author	Kyle Siefring <kylesiefring@gmail.com>
	Sun, 22 Oct 2017 23:34:19 +0000 (19:34 -0400)
committer	Kyle Siefring <kylesiefring@gmail.com>
	Tue, 24 Oct 2017 14:39:48 +0000 (10:39 -0400)
commit	ae35425ae64a3d9573f85a4a92c5638a58044057
tree	d92525876e018c38873e281dd5045f72af1f11be
parent	b3a36f7946f930caa0e96448648db60d7330c98d

Optimize convolve8 SSSE3 and AVX2 intrinsics

Changed the intrinsics to perform summation similar to the way the assembly does.

The new code diverges from the assembly by preferring non-saturating additions.
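
The summation order can be illustrated with a scalar model of a single 16-bit lane. This is a sketch under stated assumptions, not code from this change: the pairing of partial sums (x0 with x2, x1 with x3), the helper names, and the tap values are all hypothetical, and the comments only note which intrinsic each step is meant to model.

```c
#include <stdint.h>

/* Clamp to the int16_t range: models one lane of a saturating add
 * (_mm_adds_epi16) or the pairwise sum inside _mm_maddubs_epi16. */
static int16_t sat16(int32_t v) {
  if (v > INT16_MAX) return INT16_MAX;
  if (v < INT16_MIN) return INT16_MIN;
  return (int16_t)v;
}

/* Wrap modulo 2^16: models one lane of a plain add (_mm_add_epi16). */
static int16_t wrap16(int32_t v) {
  const uint16_t u = (uint16_t)v;
  return (u >= 0x8000u) ? (int16_t)((int32_t)u - 0x10000) : (int16_t)u;
}

/* Hypothetical one-lane model of an 8-tap convolve with rounding shift 7. */
uint8_t convolve8_lane_sketch(const uint8_t s[8], const int8_t f[8]) {
  /* Multiply adjacent source/tap pairs and add them (maddubs analogue). */
  const int16_t x0 = sat16(s[0] * f[0] + s[1] * f[1]);
  const int16_t x1 = sat16(s[2] * f[2] + s[3] * f[3]);
  const int16_t x2 = sat16(s[4] * f[4] + s[5] * f[5]);
  const int16_t x3 = sat16(s[6] * f[6] + s[7] * f[7]);

  /* Assumed pairing: combine partial sums with plain (wrapping) adds. */
  int16_t sum1 = wrap16(x0 + x2);
  const int16_t sum2 = wrap16(x1 + x3);

  /* Add the rounding offset with a plain add, then saturate only once. */
  sum1 = wrap16(sum1 + 64);
  sum1 = sat16(sum1 + sum2);

  /* Shift by 7 and clamp to 8 bits (unsigned pack analogue). */
  if (sum1 < 0) return 0;
  return (uint8_t)((sum1 >> 7) > 255 ? 255 : (sum1 >> 7));
}

/* Reference: the same filter evaluated in 32-bit arithmetic. */
uint8_t convolve8_ref(const uint8_t s[8], const int8_t f[8]) {
  int32_t sum = 64;
  for (int i = 0; i < 8; ++i) sum += s[i] * f[i];
  if (sum < 0) return 0;
  sum >>= 7;
  return (uint8_t)(sum > 255 ? 255 : sum);
}
```

With the illustrative taps {-1, 6, -19, 78, 78, -19, 6, -1} and the input {0, 0, 0, 255, 255, 0, 0, 0}, the two middle products sum to 39780, which does not fit in 16 bits. In this model they land in different partial sums, so only the final saturating add has to clamp, and the result still matches the 32-bit reference.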

Results on Haswell:

SSSE3
Horiz/Vert  Size  Speedup
Horiz       x4    ~32%
Horiz       x8    ~6%
Vert        x8    ~4%

AVX2
Horiz/Vert  Size  Speedup
Horiz       x16   ~16%
Vert        x16   ~14%

BUG=webm:1471

Change-Id: I7ad98ea688c904b1ba324adf8eb977873c8b8668

test/convolve_test.cc
vpx_dsp/x86/convolve_avx2.h
vpx_dsp/x86/convolve_ssse3.h
vpx_dsp/x86/vpx_subpixel_8t_intrin_ssse3.c
