Looks like for 'and' and 'or' we already end up performing at least some of the transformations this check is blocking, just in a roundabout way.
For 'and sext(cmp1), sext(cmp2)' we later turn it into 'select cmp1, sext(cmp2), 0'. Then we optimize that back into 'sext (and cmp1, cmp2)'. This is the same result we would have gotten if shouldOptimizeCast hadn't blocked the fold. We do something analogous for 'or'.
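Roughly, that roundabout path looks like this in IR (a hand-written sketch with illustrative value names, not output taken from the patch):

  %sext1 = sext <4 x i1> %cmp1 to <4 x i32>
  %sext2 = sext <4 x i1> %cmp2 to <4 x i32>
  %and   = and <4 x i32> %sext1, %sext2
; ... is first rewritten to ...
  %and   = select <4 x i1> %cmp1, <4 x i32> %sext2, <4 x i32> zeroinitializer
; ... and then folded back into ...
  %tmp   = and <4 x i1> %cmp1, %cmp2
  %and   = sext <4 x i1> %tmp to <4 x i32>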
With this patch we allow that transformation to happen directly in foldCastedBitwiseLogic, and we now support the same fold for 'xor'. This definitely opens up other cases, but since we already reached the same result in a roundabout way for some cases, hopefully it's OK.
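Schematically, foldCastedBitwiseLogic now hoists the sext through the bitwise op for 'xor' as well (a pattern sketch, not code from the patch):

  xor (sext <N x i1> %a to <N x iM>), (sext <N x i1> %b to <N x iM>)
    --> sext (xor <N x i1> %a, %b) to <N x iM>

The test7 update below exercises exactly this for the vector case.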
Differential Revision: https://reviews.llvm.org/D36213
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311508 91177308-0d34-0410-b5e6-96231b3b80d8
  if (isEliminableCastPair(PrecedingCI, CI))
    return false;

-  // If this is a vector sext from a compare, then we don't want to break the
-  // idiom where each element of the extended vector is either zero or all ones.
-  if (CI->getOpcode() == Instruction::SExt &&
-      isa<CmpInst>(CastSrc) && CI->getDestTy()->isVectorTy())
-    return false;
-
  return true;
}
define <2 x i64> @test7(<4 x float> %a, <4 x float> %b) {
; CHECK-LABEL: @test7(
; CHECK-NEXT: [[CMP:%.*]] = fcmp ult <4 x float> [[A:%.*]], zeroinitializer
-; CHECK-NEXT: [[SEXT:%.*]] = sext <4 x i1> [[CMP]] to <4 x i32>
; CHECK-NEXT: [[CMP4:%.*]] = fcmp ult <4 x float> [[B:%.*]], zeroinitializer
-; CHECK-NEXT: [[SEXT5:%.*]] = sext <4 x i1> [[CMP4]] to <4 x i32>
-; CHECK-NEXT: [[AND:%.*]] = xor <4 x i32> [[SEXT]], [[SEXT5]]
+; CHECK-NEXT: [[AND1:%.*]] = xor <4 x i1> [[CMP]], [[CMP4]]
+; CHECK-NEXT: [[AND:%.*]] = sext <4 x i1> [[AND1]] to <4 x i32>
; CHECK-NEXT: [[CONV:%.*]] = bitcast <4 x i32> [[AND]] to <2 x i64>
; CHECK-NEXT: ret <2 x i64> [[CONV]]
;