granicus.if.org Git - libvpx/commit
Optimize transpose_neon.h helper functions
author Jonathan Wright <jonathan.wright@arm.com>
Sat, 25 Feb 2023 00:43:46 +0000 (00:43 +0000)
committer Jonathan Wright <jonathan.wright@arm.com>
Mon, 27 Feb 2023 09:49:02 +0000 (09:49 +0000)
commit b25cca8c2edba5fbc18448007da2624a25113f4d
tree 6b4a8cb4adbcdabb409c13954db1c6e18b869e06
parent 45dc0d34d2fa1a848c282d8fc992206fa69f01b8

1) Use vtrn[12]q_[su]64 in the vpx_vtrnq_[su]64* helpers on AArch64
   targets. This emits half as many TRN1/TRN2 instructions as the
   number of MOVs that vcombine generates.

2) Use vpx_vtrnq_[su]64* helpers wherever applicable.

3) Refactor transpose_4x8_s16 to operate on 128-bit vectors.
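
As an illustration of item 1, here is a minimal sketch of a 64-bit lane transpose of two 128-bit vectors. It is not the code from this commit: the names transpose_64bit_lanes and demo_transpose_64bit_lanes are hypothetical, and the scalar fallback is an assumption added only so the example builds on non-AArch64 hosts. On AArch64 it uses the vtrn1q_s64 and vtrn2q_s64 intrinsics, which pair the low lanes and the high lanes of the two inputs respectively.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical helper (not from libvpx): transposes the 64-bit lanes of
 * two 128-bit vectors, each stored here as an array of two int64_t.
 *   out0 = { a[0], b[0] }
 *   out1 = { a[1], b[1] } */
static void transpose_64bit_lanes(const int64_t a[2], const int64_t b[2],
                                  int64_t out0[2], int64_t out1[2]) {
#if defined(__aarch64__)
  const int64x2_t va = vld1q_s64(a);
  const int64x2_t vb = vld1q_s64(b);
  /* TRN1 selects the even-indexed (low) lane of each input. */
  vst1q_s64(out0, vtrn1q_s64(va, vb));
  /* TRN2 selects the odd-indexed (high) lane of each input. */
  vst1q_s64(out1, vtrn2q_s64(va, vb));
#else
  /* Assumed scalar fallback so the sketch compiles on other targets. */
  out0[0] = a[0];
  out0[1] = b[0];
  out1[0] = a[1];
  out1[1] = b[1];
#endif
}

/* Hypothetical self-check: returns 1 if the lanes land where expected. */
static int demo_transpose_64bit_lanes(void) {
  const int64_t a[2] = { 10, 11 };
  const int64_t b[2] = { 20, 21 };
  int64_t out0[2], out1[2];
  transpose_64bit_lanes(a, b, out0, out1);
  assert(out0[0] == 10 && out0[1] == 20);
  assert(out1[0] == 11 && out1[1] == 21);
  return 1;
}
```

Calling demo_transpose_64bit_lanes() exercises the helper on one pair of inputs. The same lane arrangement can be built from vget_low/vget_high plus vcombine; the commit message above states that the TRN-based form results in fewer instructions on AArch64.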

Change-Id: I9a8b1c1fe2a98a429e0c5f39def5eb2f65759127
vpx_dsp/arm/transpose_neon.h