Agner's data shows roughly an 18-cycle latency for PDEP/PEXT on Ryzen, and AMD's own xls leaves the entry blank, so there might be something tricky here.
Agner's data for Intel CPUs indicates these are a single uop there.
It is probably safest to remove them. We never generate them without an intrinsic, so this should be OK.
Differential Revision: https://reviews.llvm.org/D49315
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337067 91177308-0d34-0410-b5e6-96231b3b80d8
case X86::MULX64rm:
// Arithmetic instructions that are both constant time and don't set flags.
- case X86::PDEP32rm:
- case X86::PDEP64rm:
- case X86::PEXT32rm:
- case X86::PEXT64rm:
case X86::RORX32mi:
case X86::RORX64mi:
case X86::SARX32rm: