granicus.if.org Git - libvpx/commit
convolve8 sse2 test
author Angie Chiang <angiebird@google.com>
Mon, 22 Feb 2016 22:11:05 +0000 (14:11 -0800)
committer Angie Chiang <angiebird@google.com>
Thu, 25 Feb 2016 01:01:20 +0000 (17:01 -0800)
commit 8878fa4f9a5915a3dcf56d35b0127811d2645bf4
tree f5ff65c26fff8cfe47fbc5350c265a193db7ab23
parent 961668c91c642a69cdefc9765d0c34b377d4e3f8

This experiment shows that when the frame size is 64x64, the speeds of
vpx_highbd_convolve8_sse2 and vpx_convolve8_sse2 are similar.
However, when the frame size becomes 1024x1024,
vpx_highbd_convolve8_sse2 is roughly 40% to 50% slower than
vpx_convolve8_sse2. We think the bottleneck is memory I/O.
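
One way to make the memory I/O hypothesis concrete is to compare the size of a single plane at each bit depth. The sketch below assumes the high-bit-depth path stores each sample as a uint16_t and the regular path as a uint8_t; PlaneBytes is a hypothetical helper, not part of libvpx.

```cpp
#include <cstddef>
#include <cstdint>

// Bytes needed for one width x height plane of samples of type T.
// Hypothetical helper for illustration only; stride padding is ignored.
template <typename T>
constexpr std::size_t PlaneBytes(std::size_t width, std::size_t height) {
  return width * height * sizeof(T);
}

// 64x64 frame: 4 KiB at 8 bits per sample, 8 KiB at 16 bits per sample.
static_assert(PlaneBytes<std::uint8_t>(64, 64) == 4096, "8-bit 64x64");
static_assert(PlaneBytes<std::uint16_t>(64, 64) == 8192, "16-bit 64x64");

// 1024x1024 frame: 1 MiB at 8 bits per sample, 2 MiB at 16 bits per sample.
static_assert(PlaneBytes<std::uint8_t>(1024, 1024) == 1048576, "8-bit 1024x1024");
static_assert(PlaneBytes<std::uint16_t>(1024, 1024) == 2097152, "16-bit 1024x1024");
```

Under that assumption, a 64x64 plane is only a few KiB at either depth, so it plausibly stays cache-resident on most machines, while a 1024x1024 plane doubles from 1 MiB to 2 MiB. Cache sizes vary by CPU, so this is consistent with the hypothesis but does not prove it.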

VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_8_64
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_8_64 (17 ms)
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_16_64
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_16_64 (42 ms)
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_32_64
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_32_64 (139 ms)
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_64_64
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_64_64 (499 ms)

VP10ConvolveTest.vpx_convolve8_sse2_speed_l_8_64
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_8_64 (16 ms)
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_16_64
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_16_64 (40 ms)
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_32_64
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_32_64 (130 ms)
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_64_64
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_64_64 (485 ms)

VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_8_1024
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_8_1024 (32 ms)
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_16_1024
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_16_1024 (61 ms)
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_32_1024
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_32_1024 (196 ms)
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_64_1024
VP10ConvolveTest.vpx_highbd_convolve8_sse2_speed_64_1024 (694 ms)

VP10ConvolveTest.vpx_convolve8_sse2_speed_l_8_1024
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_8_1024 (21 ms)
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_16_1024
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_16_1024 (44 ms)
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_32_1024
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_32_1024 (138 ms)
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_64_1024
VP10ConvolveTest.vpx_convolve8_sse2_speed_l_64_1024 (491 ms)
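
The relative slowdown can be recomputed from the timings above. This is a minimal sketch: the Timing struct, SlowdownPercent, and PrintSlowdowns are hypothetical names, and the first number in each test name is assumed to be the block size.

```cpp
#include <cstddef>
#include <cstdio>

// One row of the results: the first number in the test name (assumed here to
// be the block size) and the two reported times in milliseconds.
struct Timing {
  int size;
  int highbd_ms;
  int lowbd_ms;
};

// Timings copied from the test output above.
constexpr Timing kFrame64[] = {
    {8, 17, 16}, {16, 42, 40}, {32, 139, 130}, {64, 499, 485}};
constexpr Timing kFrame1024[] = {
    {8, 32, 21}, {16, 61, 44}, {32, 196, 138}, {64, 694, 491}};

// Percentage by which slow_ms exceeds fast_ms.
constexpr double SlowdownPercent(double slow_ms, double fast_ms) {
  return (slow_ms / fast_ms - 1.0) * 100.0;
}

// Prints one line per row, labelled with the frame size.
template <std::size_t N>
void PrintSlowdowns(const char* label, const Timing (&rows)[N]) {
  for (const Timing& t : rows) {
    std::printf("%s size %2d: highbd is %.1f%% slower\n", label, t.size,
                SlowdownPercent(t.highbd_ms, t.lowbd_ms));
  }
}
```

Calling PrintSlowdowns("64x64", kFrame64) and PrintSlowdowns("1024x1024", kFrame1024) from main gives a slowdown of about 3% to 7% for the 64x64 frame and about 39% to 52% for the 1024x1024 frame.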

Change-Id: I3131a031e0380e8eae748cfcccc6cbb961d05943
test/vp10_convolve_test.cc