Implement Neon variance functions using UDOT instruction
author Jonathan Wright <jonathan.wright@arm.com>
Tue, 11 May 2021 12:17:44 +0000 (13:17 +0100)
committer James Zern <jzern@google.com>
Wed, 12 May 2021 21:03:05 +0000 (14:03 -0700)
commit c8b0432505d32820af0c42a94b219aa83eed5db9
tree 2b41224e7339d3c083dc101ddfb027af0997662c
parent 2db85c269bc5479e48ea7cd4fde85236ee0bc347

Accelerate Neon variance functions by implementing the sum of squares
calculation using the Armv8.4-A UDOT instruction instead of 4 MLAs.

The previous implementation is retained for use on CPUs that do not
implement the Armv8.4-A dot product instructions.
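
As a rough illustration of the approach described above (not the code from
this commit), the sketch below accumulates a sum of squared differences with
the UDOT intrinsic vdotq_u32 when the compiler targets AArch64 with the dot
product extension, and falls back to a plain scalar loop otherwise. The helper
name sse_u8, the HAVE_UDOT macro, and the requirement that the length be a
multiple of 16 are assumptions made for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Use the UDOT path only when the compiler targets AArch64 with dot product. */
#if defined(__aarch64__) && defined(__ARM_NEON) && \
    defined(__ARM_FEATURE_DOTPROD)
#include <arm_neon.h>
#define HAVE_UDOT 1
#else
#define HAVE_UDOT 0
#endif

/* Hypothetical helper: sum of squared differences between two byte buffers.
 * n must be a multiple of 16 so the vector path needs no tail handling. */
static uint32_t sse_u8(const uint8_t *a, const uint8_t *b, size_t n) {
  assert(n % 16 == 0);
#if HAVE_UDOT
  uint32x4_t acc = vdupq_n_u32(0);
  for (size_t i = 0; i < n; i += 16) {
    /* Absolute difference of 16 byte pairs. */
    const uint8x16_t d = vabdq_u8(vld1q_u8(a + i), vld1q_u8(b + i));
    /* One UDOT squares all 16 differences and adds each group of four
     * products into the matching 32-bit lane of the accumulator. */
    acc = vdotq_u32(acc, d, d);
  }
  /* Horizontal add of the four 32-bit lanes. */
  return vaddvq_u32(acc);
#else
  /* Portable fallback with the same result. */
  uint32_t acc = 0;
  for (size_t i = 0; i < n; i++) {
    const int d = (int)a[i] - (int)b[i];
    acc += (uint32_t)(d * d);
  }
  return acc;
#endif
}
```

Because each 32-bit lane receives at most 4 * 255 * 255 per iteration, the
accumulator cannot overflow for block sizes typical of video coding; very
large buffers would need a wider final accumulation.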

Bug: b/181236880
Change-Id: I9ab3d52634278b9b6f0011f39390a1195210bc75
vpx_dsp/arm/sum_neon.h
vpx_dsp/arm/variance_neon.c