[X86][BtVer2] Fix latency/throughput of scalar integer MUL instructions.
author Andrea Di Biagio <Andrea_DiBiagio@sn.scee.net>
Thu, 22 Aug 2019 15:20:16 +0000 (15:20 +0000)
committer Andrea Di Biagio <Andrea_DiBiagio@sn.scee.net>
Thu, 22 Aug 2019 15:20:16 +0000 (15:20 +0000)
commit de2dc037c80cc6c1466137b048682feff76703c6
tree 23d5e403392f97ade6fced9115c8fff6c89624c8
parent bb63e152b1662df44d6d9de7f01b5045910db809

Single operand MUL instructions that implicitly set EAX have the following
latency/throughput profile (see below):

imul %cl              # latency: 3cy - uOPs: 1 - 1 JMul
imul %cx              # latency: 3cy - uOPs: 3 - 3 JMul
imul %ecx             # latency: 3cy - uOPs: 2 - 2 JMul
imul %rcx             # latency: 6cy - uOPs: 2 - 4 JMul

mul %cl               # latency: 3cy - uOPs: 1 - 1 JMul
mul %cx               # latency: 3cy - uOPs: 3 - 3 JMul
mul %ecx              # latency: 3cy - uOPs: 2 - 2 JMul
mul %rcx              # latency: 6cy - uOPs: 2 - 4 JMul

Excluding the 64-bit variant, which has a latency of 6cy, every other instruction
has a latency of 3cy. However, the number of decoded macro-opcodes (as well as
the number of resource cycles) depends on the MUL size.

The two operand MULs have a more predictable profile (see below):

imul %dx, %dx         # latency: 3cy - uOPs: 1 - 1 JMul
imul %edx, %edx       # latency: 3cy - uOPs: 1 - 1 JMul
imul %rdx, %rdx       # latency: 6cy - uOPs: 1 - 4 JMul

imul $3, %dx, %dx     # latency: 4cy - uOPs: 2 - 2 JMul
imul $3, %ecx, %ecx   # latency: 3cy - uOPs: 1 - 1 JMul
imul $3, %rdx, %rdx   # latency: 6cy - uOPs: 1 - 4 JMul

This patch updates the values in the Jaguar scheduling model and regenerates
llvm-mca tests.
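
As a rough illustration of how the numbers above translate into throughput,
the sketch below copies the latency / uOPs / JMul resource-cycle values from
the two listings and derives a simple cycle estimate. It is not part of the
patch: the helper names are hypothetical, a single JMul unit is assumed, and
the estimate ignores decode bandwidth and every other pipeline resource.

```python
# Illustrative sketch only; values are copied from the listings in this
# commit message. Each entry is (latency_cycles, uops, jmul_resource_cycles).
PROFILE = {
    "imul %cl": (3, 1, 1),
    "imul %cx": (3, 3, 3),
    "imul %ecx": (3, 2, 2),
    "imul %rcx": (6, 2, 4),
    "imul %dx, %dx": (3, 1, 1),
    "imul %edx, %edx": (3, 1, 1),
    "imul %rdx, %rdx": (6, 1, 4),
    "imul $3, %dx, %dx": (4, 2, 2),
    "imul $3, %ecx, %ecx": (3, 1, 1),
    "imul $3, %rdx, %rdx": (6, 1, 4),
}

JMUL_UNITS = 1  # assumption: one JMul unit is available


def reciprocal_throughput(insn):
    """Cycles of JMul occupancy per instruction, spread over the units."""
    _, _, resource_cycles = PROFILE[insn]
    return resource_cycles / JMUL_UNITS


def estimate_cycles(insn, count, dependent):
    """Rough lower bound for `count` back-to-back copies of `insn`.

    A dependent chain is limited by whichever is larger, the latency or the
    JMul occupancy; independent copies are limited by JMul occupancy alone.
    """
    latency, _, _ = PROFILE[insn]
    per_insn = reciprocal_throughput(insn)
    if dependent:
        per_insn = max(latency, per_insn)
    return count * per_insn


if __name__ == "__main__":
    for insn in PROFILE:
        print(f"{insn:22s} rthroughput={reciprocal_throughput(insn):.1f} "
              f"chain(100)={estimate_cycles(insn, 100, True):.0f}")
```

Under these assumptions the 64-bit forms occupy JMul for 4 cycles each, so
independent 64-bit multiplies are throughput-bound well before the 6cy latency
matters, while the 8-bit and 32-bit two-operand forms are latency-bound only.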

Differential Revision: https://reviews.llvm.org/D66547

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369661 91177308-0d34-0410-b5e6-96231b3b80d8
16 files changed:
lib/Target/X86/X86ScheduleBtVer2.td
test/tools/llvm-mca/X86/BtVer2/clear-super-register-1.s
test/tools/llvm-mca/X86/BtVer2/cmpxchg-read-advance.s
test/tools/llvm-mca/X86/BtVer2/dependency-breaking-sbb-2.s
test/tools/llvm-mca/X86/BtVer2/partial-reg-update-2.s
test/tools/llvm-mca/X86/BtVer2/partial-reg-update-4.s
test/tools/llvm-mca/X86/BtVer2/partial-reg-update-6.s
test/tools/llvm-mca/X86/BtVer2/partial-reg-update-7.s
test/tools/llvm-mca/X86/BtVer2/partial-reg-update.s
test/tools/llvm-mca/X86/BtVer2/read-advance-2.s
test/tools/llvm-mca/X86/BtVer2/resources-x86_64.s
test/tools/llvm-mca/X86/BtVer2/xadd.s
test/tools/llvm-mca/X86/BtVer2/xchg.s
test/tools/llvm-mca/X86/intel-syntax.s
test/tools/llvm-mca/X86/llvm-mca-markers-10.s
test/tools/llvm-mca/X86/llvm-mca-markers-9.s