granicus.if.org Git - libvpx/commit
Rewrite HORIZx4 and HORIZx8 in subpixel filter functions
author Yunqing Wang <yunqingwang@google.com>
Thu, 3 Oct 2013 00:26:01 +0000 (17:26 -0700)
committer Yunqing Wang <yunqingwang@google.com>
Thu, 3 Oct 2013 16:04:02 +0000 (09:04 -0700)
commit ed22179a82700f4f0ba58030e2f3af2c17a02e52
tree 6d5076a72ddcb2c68b34f91862e384e898e8aa38
parent 03698aa6d8dc10e526955f4b516799e023663b4d

In subpixel filters, prefetched source data, unrolled loops,
and interleaved instructions.

In HORIZx4, integrated the idea from Scott's CL (commit:
d22a504d11a15dc3eab666859db0046b5a7d75c5), which was suggested by
Erik/Tamar from Intel. Further tweaking combines rows 0 and 2, and
rows 1 and 3, in registers so that more operations handle two rows
at once, up to the last add.
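
For orientation, here is a minimal scalar sketch of the arithmetic that a
4-wide horizontal 8-tap subpixel filter performs. It is not the SSSE3 code
from this commit and does not model the register pairing, prefetching, or
unrolling described above. The function name horiz_x4_model, its parameters,
and the rounding (taps summing to 128, so add 64 and shift right by 7) are
assumptions made for illustration.

```c
#include <stdint.h>

/* Clamp an intermediate sum to the 8-bit pixel range. */
static uint8_t clip_pixel(int v) {
  return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Hypothetical scalar model of a 4-wide horizontal 8-tap filter.
 * src points at the first output position of the first row; each output
 * pixel reads src[x - 3] .. src[x + 4].
 * Assumption: the taps sum to 128, so rounding is "+ 64, then >> 7". */
static void horiz_x4_model(const uint8_t *src, int src_stride,
                           uint8_t *dst, int dst_stride,
                           const int16_t filter[8], int rows) {
  for (int r = 0; r < rows; ++r) {
    for (int x = 0; x < 4; ++x) {
      int sum = 0;
      for (int k = 0; k < 8; ++k) {
        sum += src[x + k - 3] * filter[k];
      }
      dst[x] = clip_pixel((sum + 64) >> 7);
    }
    src += src_stride;
    dst += dst_stride;
  }
}
```

A SIMD version would compute the same sums several pixels at a time; the
commit additionally keeps two rows in one register so that each multiply-add
step covers both rows.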

Test showed a ~2% decoder speedup.

Change-Id: Ib53d04ede8166c38c3dc744da8c6f737ce26a0e3
vp9/common/x86/vp9_subpixel_8t_ssse3.asm