[X86] Add separate intrinsics for scalar FMA4 instructions.
author Craig Topper <craig.topper@intel.com>
Sat, 25 Nov 2017 18:32:43 +0000 (18:32 +0000)
committer Craig Topper <craig.topper@intel.com>
Sat, 25 Nov 2017 18:32:43 +0000 (18:32 +0000)
commit fd41de87fcd87c49a1de30958bd51a88a3c29b01
tree 3e206df2cb79c442a15627660c76eaf2057eef73
parent 47dab13b2714c35707268ea08b5e59599fd393ab

Summary:
These instructions zero the non-scalar part of the lower 128 bits, which makes them different from the FMA3 instructions, which pass the non-scalar part of the lower 128 bits through unchanged.
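The behavior difference can be sketched with a small C++ model of a 4 x float vector. This is an illustration only, not LLVM code: the names `Vec4`, `fma3_scalar_fmadd`, and `fma4_scalar_fmadd` are hypothetical, and the model assumes the FMA3 form takes its upper lanes from the first operand.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical model of a 128-bit vector holding four floats.
using Vec4 = std::array<float, 4>;

// FMA3-style scalar fmadd (model): lane 0 holds a*b+c, and the upper
// lanes are passed through. Assumption: they come from the first operand.
Vec4 fma3_scalar_fmadd(const Vec4& a, const Vec4& b, const Vec4& c) {
    Vec4 r = a;
    r[0] = std::fma(a[0], b[0], c[0]);
    return r;
}

// FMA4-style scalar fmadd (model): lane 0 holds a*b+c, and the upper
// lanes of the result are zeroed.
Vec4 fma4_scalar_fmadd(const Vec4& a, const Vec4& b, const Vec4& c) {
    Vec4 r{};  // value-initialized, so every lane starts at 0.0f
    r[0] = std::fma(a[0], b[0], c[0]);
    return r;
}

// Small self-check showing where the two models agree and differ.
inline void check_fma_models() {
    const Vec4 a{2.0f, 10.0f, 20.0f, 30.0f};
    const Vec4 b{3.0f, 0.0f, 0.0f, 0.0f};
    const Vec4 c{4.0f, 0.0f, 0.0f, 0.0f};
    assert(fma3_scalar_fmadd(a, b, c)[0] == fma4_scalar_fmadd(a, b, c)[0]);
    assert(fma3_scalar_fmadd(a, b, c)[1] == 10.0f);  // passed through
    assert(fma4_scalar_fmadd(a, b, c)[1] == 0.0f);   // zeroed
}
```

Because the two models disagree in the upper lanes, a single intrinsic cannot describe both instruction families, which is why the scalar FMA4 forms get intrinsics of their own.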

I've only added fmadd because we should be able to derive all the other variants using operand negation in the intrinsic header, as we do for AVX512.
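The derivation by operand negation can be sketched on plain scalars. This is a rough model rather than the actual intrinsic header: the function names are hypothetical, and a real header would negate vector operands and call a compiler builtin instead of `std::fma`.

```cpp
#include <cassert>
#include <cmath>

// The only primitive in this model: fused a*b + c.
double fmadd(double a, double b, double c) { return std::fma(a, b, c); }

// The remaining variants are derived by negating operands of fmadd.
double fmsub(double a, double b, double c)  { return fmadd(a, b, -c); }   //  (a*b) - c
double fnmadd(double a, double b, double c) { return fmadd(-a, b, c); }   // -(a*b) + c
double fnmsub(double a, double b, double c) { return fmadd(-a, b, -c); }  // -(a*b) - c

// Small self-check of the derived variants.
inline void check_derived_variants() {
    assert(fmsub(2.0, 3.0, 4.0) == 2.0);
    assert(fnmadd(2.0, 3.0, 4.0) == -2.0);
    assert(fnmsub(2.0, 3.0, 4.0) == -10.0);
}
```

In this sketch the negation is applied to the first multiplicand; negating the second one instead would give the same product sign.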

I think there are still some missed negate folding opportunities with the FMA4 instructions in light of this behavior difference that I hadn't noticed before.

I've split the tests so that the two can use different intrinsics for scalar testing. I just copied the tests, split the RUN lines, and changed out the scalar intrinsics.

fma4-fneg-combine.ll is a new test to make sure we negate the FMA4 intrinsics correctly, though there are a couple of TODOs in it.

Reviewers: RKSimon, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39851

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318984 91177308-0d34-0410-b5e6-96231b3b80d8
18 files changed:
include/llvm/CodeGen/ISDOpcodes.h
include/llvm/IR/IntrinsicsX86.td
lib/Target/X86/X86ISelLowering.cpp
lib/Target/X86/X86ISelLowering.h
lib/Target/X86/X86InstrFMA.td
lib/Target/X86/X86InstrFormats.td
lib/Target/X86/X86InstrFragmentsSIMD.td
lib/Target/X86/X86InstrInfo.td
lib/Target/X86/X86IntrinsicsInfo.h
lib/Target/X86/X86Subtarget.h
test/CodeGen/X86/fma-commute-x86.ll
test/CodeGen/X86/fma-intrinsics-x86.ll
test/CodeGen/X86/fma-scalar-memfold.ll
test/CodeGen/X86/fma4-commute-x86.ll [new file with mode: 0644]
test/CodeGen/X86/fma4-fneg-combine.ll [new file with mode: 0644]
test/CodeGen/X86/fma4-intrinsics-x86.ll [new file with mode: 0644]
test/CodeGen/X86/fma4-intrinsics-x86_64-folded-load.ll
test/CodeGen/X86/fma4-scalar-memfold.ll [new file with mode: 0644]