[x86] scalarize extract element 0 of FP math
author Sanjay Patel <spatel@rotateright.com>
Thu, 28 Feb 2019 19:47:04 +0000 (19:47 +0000)
committer Sanjay Patel <spatel@rotateright.com>
Thu, 28 Feb 2019 19:47:04 +0000 (19:47 +0000)
commit 014b2ad7192422fc845d5256e314586ccca2f0a3
tree 5e0dd681406a963776b8c60457c672ffe5f3f296
parent 4c30f56d4332c937021f8c80df5deba64f881b10

This is another step towards ensuring that we produce optimal code for reductions,
but there are other potential benefits, as seen in the test diffs:

  1. Memory loads may get scalarized, resulting in more efficient code.
  2. Memory stores may get scalarized, resulting in more efficient code.
  3. Complex ops like fdiv/sqrt get scalarized, and the scalar instructions may be faster depending on uarch.
  4. Even simple ops like addss/subss/mulss/roundss may run faster or cause less frequency throttling when scalarized, depending on uarch.
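
The equivalence behind the list above can be sketched in portable C++. This is only an illustrative model, not the actual DAG combine in X86ISelLowering.cpp: the V4 type and both function names are hypothetical stand-ins, with V4 playing the role of a 4 x float vector register. It shows that extracting element 0 of a vector FP op gives the same value as applying the scalar op to element 0 of each operand, which is why the other lanes need not be computed.

```cpp
#include <array>
#include <cassert>

// Hypothetical 4-lane float vector standing in for an XMM register.
using V4 = std::array<float, 4>;

// Vector form: divide all four lanes, then extract lane 0
// (roughly what a packed divide followed by an extract would do).
float extract0_of_vector_div(const V4 &x, const V4 &y) {
  V4 r{};
  for (int i = 0; i < 4; ++i)
    r[i] = x[i] / y[i];
  return r[0];
}

// Scalarized form: extract lane 0 of each operand first, then do a
// single scalar divide (roughly what a scalar divide would do).
float scalar_div_of_extract0(const V4 &x, const V4 &y) {
  return x[0] / y[0];
}
```

Both functions return the same lane-0 result for any inputs, but the second one performs one division instead of four and never touches lanes 1 to 3.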

The TODO comment suggests one or more follow-ups for opcodes that can currently result in regressions.

Differential Revision: https://reviews.llvm.org/D58282

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355130 91177308-0d34-0410-b5e6-96231b3b80d8
12 files changed:
lib/Target/X86/X86ISelLowering.cpp
test/CodeGen/X86/avx1-logical-load-folding.ll
test/CodeGen/X86/avx512-hadd-hsub.ll
test/CodeGen/X86/avx512-intrinsics-fast-isel.ll
test/CodeGen/X86/exedeps-movq.ll
test/CodeGen/X86/extractelement-fp.ll
test/CodeGen/X86/ftrunc.ll
test/CodeGen/X86/haddsub.ll
test/CodeGen/X86/scalar-int-to-fp.ll
test/CodeGen/X86/vec_extract.ll
test/CodeGen/X86/vector-reduce-fadd-fast.ll
test/CodeGen/X86/vector-reduce-fmul-fast.ll