granicus.if.org Git - llvm/commit

author	Sanjay Patel <spatel@rotateright.com>
	Thu, 28 Feb 2019 19:47:04 +0000 (19:47 +0000)
committer	Sanjay Patel <spatel@rotateright.com>
	Thu, 28 Feb 2019 19:47:04 +0000 (19:47 +0000)
commit	014b2ad7192422fc845d5256e314586ccca2f0a3
tree	5e0dd681406a963776b8c60457c672ffe5f3f296
parent	4c30f56d4332c937021f8c80df5deba64f881b10

[x86] scalarize extract element 0 of FP math

This is another step towards ensuring that we produce optimal code for reductions,
but there are other potential benefits, as seen in the test diffs:

  1. Memory loads may get scalarized, resulting in more efficient code.
  2. Memory stores may get scalarized, resulting in more efficient code.
  3. Complex ops like fdiv/sqrt get scalarized, and the scalar instructions may be faster depending on uarch.
  4. Even simple ops like addss/subss/mulss/roundss may run faster or cause less frequency throttling when scalarized, depending on uarch.
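
The transform named in the title can be illustrated with a minimal, portable C++ sketch. It is not the code from X86ISelLowering.cpp: Vec4f and the three function names below are hypothetical, and a std::array stands in for a 4 x float vector register. The sketch only shows why extracting element 0 of a packed FP op can be replaced by the scalar op on element 0 of each operand, since the other lanes never reach the result.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a 4 x float SIMD register (e.g. an XMM value).
using Vec4f = std::array<float, 4>;

// Models a packed add (addps-style): every lane is computed.
Vec4f packed_fadd(const Vec4f &x, const Vec4f &y) {
  Vec4f r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = x[i] + y[i];
  return r;
}

// Before the transform: extractelement (fadd X, Y), 0
// Four lanes are added, then three of the results are discarded.
float extract0_of_packed_fadd(const Vec4f &x, const Vec4f &y) {
  return packed_fadd(x, y)[0];
}

// After the transform: fadd (extractelement X, 0), (extractelement Y, 0)
// Only lane 0 is read and added (addss-style).
float scalar_fadd_of_extract0(const Vec4f &x, const Vec4f &y) {
  return x[0] + y[0];
}
```

Both functions return the same value for any inputs, because lane 0 of the packed result depends only on lane 0 of each operand. The same reasoning applies to the other elementwise FP ops mentioned above (sub, mul, div, sqrt).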

The TODO comment suggests one or more follow-ups for opcodes that can currently result in regressions.

Differential Revision: https://reviews.llvm.org/D58282

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355130 91177308-0d34-0410-b5e6-96231b3b80d8

lib/Target/X86/X86ISelLowering.cpp
test/CodeGen/X86/avx1-logical-load-folding.ll
test/CodeGen/X86/avx512-hadd-hsub.ll
test/CodeGen/X86/avx512-intrinsics-fast-isel.ll
test/CodeGen/X86/exedeps-movq.ll
test/CodeGen/X86/extractelement-fp.ll
test/CodeGen/X86/ftrunc.ll
test/CodeGen/X86/haddsub.ll
test/CodeGen/X86/scalar-int-to-fp.ll
test/CodeGen/X86/vec_extract.ll
test/CodeGen/X86/vector-reduce-fadd-fast.ll
test/CodeGen/X86/vector-reduce-fmul-fast.ll