granicus.if.org Git - libvpx/commit

author	levytamar82 <levytamar82@gmail.com>
	Fri, 17 Jan 2014 19:39:16 +0000 (12:39 -0700)
committer	levytamar82 <levytamar82@gmail.com>
	Thu, 13 Feb 2014 03:45:31 +0000 (20:45 -0700)
commit	876c72a093f9d209e98522d1ef17ceef08689a2b
tree	83c03390de75cba4e27016a391f0da6766a33866
parent	20d0f2b92f8328cf1fc7128be26e86ab81092776

AVX2 Convolve Optimization

Two convolve functions were optimized for AVX2:
1. vp9_filter_block1d16_h8
2. vp9_filter_block1d16_v8
vp9_filter_block1d16_h8 was optimized for AVX2 by halving the number of
loop iterations: two rows (strides) are processed in parallel per
iteration.
vp9_filter_block1d16_v8 was optimized in the same way. In addition, some
of the loads were moved outside of the loop, which avoids redundant
loads.
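
The loop restructuring described above can be sketched in portable scalar
C. This is only an illustration under stated assumptions, not the code in
vp9_subpixel_8t_intrin_avx2.c: the names filter_row, filter_h8_ref and
filter_h8_two_rows are hypothetical, src is assumed to point at the first
of the 8 samples that feed output pixel 0, and the taps are assumed to sum
to 128 (a 7-bit filter scale). The real change uses AVX2 intrinsics to
handle the two rows in one 256-bit register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TAPS 8

/* Clamp a filtered value to the 8-bit pixel range. */
static uint8_t clip_pixel(int v) {
  return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Filter a single row; assumes taps sum to 128, so round and shift by 7.
 * Like libvpx's rounding macro, this relies on arithmetic right shift. */
static void filter_row(const uint8_t *s, uint8_t *d, int width,
                       const int16_t *taps) {
  for (int x = 0; x < width; ++x) {
    int sum = 0;
    for (int k = 0; k < TAPS; ++k) sum += s[x + k] * taps[k];
    d[x] = clip_pixel((sum + 64) >> 7);
  }
}

/* Baseline: one row per loop iteration. */
static void filter_h8_ref(const uint8_t *src, ptrdiff_t src_stride,
                          uint8_t *dst, ptrdiff_t dst_stride,
                          int width, int height, const int16_t *filter) {
  assert(width > 0 && height > 0);
  for (int y = 0; y < height; ++y) {
    filter_row(src, dst, width, filter);
    src += src_stride;
    dst += dst_stride;
  }
}

/* Restructured: two rows per iteration, so the row loop runs half as
 * many times. The taps are copied once before the loop (an assumed
 * example of hoisting a load out of the loop). */
static void filter_h8_two_rows(const uint8_t *src, ptrdiff_t src_stride,
                               uint8_t *dst, ptrdiff_t dst_stride,
                               int width, int height,
                               const int16_t *filter) {
  int16_t taps[TAPS];
  int y = 0;
  assert(width > 0 && height > 0);
  for (int k = 0; k < TAPS; ++k) taps[k] = filter[k]; /* hoisted load */

  for (; y + 1 < height; y += 2) {
    const uint8_t *s0 = src;
    const uint8_t *s1 = src + src_stride;
    uint8_t *d0 = dst;
    uint8_t *d1 = dst + dst_stride;
    for (int x = 0; x < width; ++x) {
      int sum0 = 0, sum1 = 0;
      for (int k = 0; k < TAPS; ++k) {
        sum0 += s0[x + k] * taps[k]; /* row y     */
        sum1 += s1[x + k] * taps[k]; /* row y + 1 */
      }
      d0[x] = clip_pixel((sum0 + 64) >> 7);
      d1[x] = clip_pixel((sum1 + 64) >> 7);
    }
    src += 2 * src_stride;
    dst += 2 * dst_stride;
  }
  if (y < height) filter_row(src, dst, width, taps); /* odd last row */
}
```

Both functions produce identical output; the second only changes how the
work is grouped, which is what lets a SIMD version fill a wider register
with two rows at once.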
This optimization gives a 43% function-level gain and a 1.3% user-level
gain.
The code can now be compiled on Windows.

Change-Id: I2714124cfb0c14a77d7a0ce126a20db92ffbf92c

vp9/common/vp9_rtcd_defs.sh
vp9/common/x86/vp9_asm_stubs.c
vp9/common/x86/vp9_subpixel_8t_intrin_avx2.c	[new file with mode: 0644]
vp9/vp9_common.mk