granicus.if.org Git - libvpx/commit
Make sure only NEON FDCT functions are called.
author Konstantinos Margaritis <konma@vectorcamp.gr>
Fri, 11 Mar 2022 18:19:25 +0000 (20:19 +0200)
committer Konstantinos Margaritis <konma@vectorcamp.gr>
Thu, 17 Mar 2022 11:07:12 +0000 (13:07 +0200)
commit f79d256cb28a4228df66a7a6d1cebbd9071e0639
tree af6e7e70ddb165208e3a7ad22132ac48f69e1b21
parent 8a50f70ffc5eea6c2392a5c176bfe43e450ecebc

[NEON]
Added vpx_fdct4x4_pass1_neon(), vpx_fdct8x8_pass1_notranspose_neon() and
vpx_fdct8x8_pass1_neon() to avoid code duplication.
Refactored vpx_fdct4x4_neon() and vpx_fdct8x8_neon() to use the above.
Renamed dct_body to vpx_fdct16x16_body so it can be reused later.
Added transpose_s16_16x16().
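
The pass/transpose split described above can be illustrated with a scalar
sketch. This is not the code from this commit: it uses no NEON intrinsics,
omits the scaling and rounding details of the real libvpx transforms, and all
function names below are hypothetical. It only shows how one 1-D pass plus a
transpose helper can be reused for both passes of a 2-D forward DCT.

```c
#include <stdint.h>

/* Cosine constants scaled by 2^14, e.g. round(16384 * cos(pi / 4)) = 11585. */
enum { COS_PI_8 = 15137, COS_PI_4 = 11585, COS_3PI_8 = 6270, DCT_BITS = 14 };

/* Round to nearest and drop the 2^14 scaling. Assumes an arithmetic right
 * shift for negative values, as provided by common compilers. */
static int32_t round_shift(int32_t x) {
  return (x + (1 << (DCT_BITS - 1))) >> DCT_BITS;
}

/* Hypothetical helper: one 4-point 1-D DCT per row of a row-major 4x4 block,
 * without transposing the result. */
static void fdct4x4_pass_notranspose(const int16_t *in, int16_t *out) {
  for (int r = 0; r < 4; ++r) {
    const int16_t *row = in + 4 * r;
    const int32_t s0 = row[0] + row[3];
    const int32_t s1 = row[1] + row[2];
    const int32_t s2 = row[1] - row[2];
    const int32_t s3 = row[0] - row[3];
    out[4 * r + 0] = (int16_t)round_shift((s0 + s1) * COS_PI_4);
    out[4 * r + 2] = (int16_t)round_shift((s0 - s1) * COS_PI_4);
    out[4 * r + 1] = (int16_t)round_shift(s2 * COS_3PI_8 + s3 * COS_PI_8);
    out[4 * r + 3] = (int16_t)round_shift(s3 * COS_3PI_8 - s2 * COS_PI_8);
  }
}

/* Hypothetical helper: in-place transpose of a row-major n x n block. */
static void transpose_s16(int16_t *blk, int n) {
  for (int r = 0; r < n; ++r) {
    for (int c = r + 1; c < n; ++c) {
      const int16_t t = blk[r * n + c];
      blk[r * n + c] = blk[c * n + r];
      blk[c * n + r] = t;
    }
  }
}

/* One full pass: 1-D transform on rows, then transpose. */
static void fdct4x4_pass(const int16_t *in, int16_t *out) {
  fdct4x4_pass_notranspose(in, out);
  transpose_s16(out, 4);
}

/* 2-D transform: the same pass applied twice covers rows and columns. */
static void fdct4x4_2d(const int16_t *in, int16_t *out) {
  int16_t tmp[16];
  fdct4x4_pass(in, tmp);
  fdct4x4_pass(tmp, out);
}
```

Calling fdct4x4_2d() on a constant block leaves energy only in the DC
coefficient out[0]; every other output coefficient is zero.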

I have run make test and all tests/configurations seem to pass.

Profiled using this command on an Ampere Altra VM:
sudo perf record -g ./vpxenc --codec=vp9 --height=1080 --width=1920 \
   --fps=25/1 --limit=20 -o output.mkv \
   ../original_videos_Sports_1080P_Sports_1080P-0063.mkv --debug --rt

Before this optimization:
1.32%     1.32%  vpxenc   vpxenc              [.] vpx_fdct4x4_neon
0.16%     0.16%  vpxenc   vpxenc              [.] vpx_fdct4x4_c
0.79%     0.79%  vpxenc   vpxenc              [.] vpx_fdct8x8_c
0.52%     0.52%  vpxenc   vpxenc              [.] vpx_fdct8x8_neon
1.23%     1.23%  vpxenc   vpxenc              [.] vpx_fdct16x16_c
0.54%     0.54%  vpxenc   vpxenc              [.] vpx_fdct16x16_neon

So, even though a _neon() version existed, the C version was called
as well. After this patch:

1.42%     1.36%  vpxenc   vpxenc              [.] vpx_fdct4x4_neon
0.87%     0.82%  vpxenc   vpxenc              [.] vpx_fdct8x8_neon
0.74%     0.74%  vpxenc   vpxenc              [.] vpx_fdct16x16_neon
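
As a rough cross-check of the figures above, the self-time percentages can be
summed per run. The sketch below hard-codes the second perf column in
hundredths of a percent; the array and function names are hypothetical. The two
profiles come from separate runs, so the totals are only approximately
comparable.

```c
/* Self-time percentages from the perf output above, in hundredths of a
 * percent (1.32% -> 132). */
static const int before_self[] = { 132, 16, 79, 52, 123, 54 };
static const int after_self[] = { 136, 82, 74 };

/* Sum n entries of v. */
static int sum_hundredths(const int *v, int n) {
  int total = 0;
  for (int i = 0; i < n; ++i) total += v[i];
  return total;
}
```

With these inputs the FDCT functions account for 4.56% of samples before the
patch (sum_hundredths(before_self, 6) is 456) and 2.92% after it
(sum_hundredths(after_self, 3) is 292).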

Change-Id: Id4e1dd315c67b4355fe4e5a1b59e181a349f16d0
vpx_dsp/arm/fdct16x16_neon.c
vpx_dsp/arm/fdct16x16_neon.h [new file with mode: 0644]
vpx_dsp/arm/fdct_neon.c
vpx_dsp/arm/fdct_neon.h [new file with mode: 0644]
vpx_dsp/arm/fwd_txfm_neon.c
vpx_dsp/arm/transpose_neon.h