granicus.if.org Git

[X86] isHorizontalBinOp - add extract_subvector(shuffle(x)) handling (PR39921)

Lets us match horizontal op patterns on fast-variable-shuffle targets (Haswell etc.)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362327 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add AVX2 'fast-variable-shuffle' PHADD tests (PR39921)

Haswell etc. will combine shuffles to an extract_subvector(permd(x)) before isHorizontalBinOp can match it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362326 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][X86] extract-lowbits.ll: add one more pattern a with truncation

We are also free to interpret this as 'BZHI'/'BEXTR'.
https://rise4fun.com/Alive/dD6

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362325 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombine] Fold insert_subvector(bitcast(x),bitcast(y),c1) -> bitcast(insert_subvector(x,y),c2)

Move this combine from x86 into generic DAGCombine, which currently only manages cases where the bitcast is between types of the same scalarsize.

Differential Revision: https://reviews.llvm.org/D59188

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362324 91177308-0d34-0410-b5e6-96231b3b80d8

[DAG] isBitwiseNot / isConstOrConstSplat - add support for build vector undefs + truncation (PR41020)

Add (opt-in) support for implicit truncation to isConstOrConstSplat, which allows us to match truncated 'all ones' cases in isBitwiseNot.

PR41020 compares against using ISD::isBuildVectorAllOnes() instead, but that predicate silently accepts any UNDEF elements in the build vector, which might not be what we want in isBitwiseNot. So I've added an opt-in 'AllowUndefs' flag that is set to false by default but will allow us to enable it on individual cases where it's safe.

Differential Revision: https://reviews.llvm.org/D62783

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362323 91177308-0d34-0410-b5e6-96231b3b80d8

[TargetLowering] SimplifyDemandedBits - don't use OriginalDemanded variables in analysis.

These might have been replaced in multiple use cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362322 91177308-0d34-0410-b5e6-96231b3b80d8

[TargetLowering] SimplifyDemandedVectorElts - use same arg names as SimplifyDemandedBits. NFCI.

Helps with debugging as we recurse between them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362321 91177308-0d34-0410-b5e6-96231b3b80d8

[IndVarSimplify] Add tests for saturating math on IV; NFC

These saturating math ops can be replaced with simple math.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362320 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][X86] extract-lowbits.ll: add patterns with truncation too

If we look past truncations of X too eagerly (D62786), we may
end up with 64-bit 'BEXTR', even though 32-bit-one would suffice.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362319 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] Replace two unchecked dyn_casts with casts.

The results of the dyn_casts were immediately dereferenced on the next line
so they had better not be null.

I don't think there's any way for these dyn_casts to fail, so use a cast
instead of adding a null check.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362315 91177308-0d34-0410-b5e6-96231b3b80d8

[CMake] Use libtool for runtimes when building for Apple platform

LLVM CMake build already uses libtool instead of ar when building
for Apple platform and we should be using the same when building
runtimes. To do so, this change extracts the logic for finding
libtool into a separate file and then uses it from both the LLVM
build as well as the LLVM runtimes build.

Differential Revision: https://reviews.llvm.org/D62769

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362313 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Fix several places that weren't passing what they thought they were to MachineInstr::print

Over a year ago, MachineInstr::print gained a fourth boolean parameter that occurs
before the TII pointer. When this happened, several places started accidentally
passing TII into this boolean parameter instead of the TII parameter.
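
The failure mode is easy to reproduce outside LLVM, because a pointer converts implicitly to bool. A minimal C++ sketch, in which `describe`, `Info`, `callOldStyle` and `callFixed` are hypothetical stand-ins and not the real MachineInstr::print signature:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the instruction-info object; not an LLVM type.
struct Info { std::string name; };

// Hypothetical function: a bool flag was inserted before the pointer parameter.
std::string describe(bool addNewLine = true, const Info *info = nullptr) {
  std::string out = info ? info->name : "<no info>";
  if (addNewLine) out += "\n";
  return out;
}

// Old-style call site: the pointer silently binds to the bool parameter,
// so `info` keeps its default of nullptr.
std::string callOldStyle(const Info &i) { return describe(&i); }

// Fixed call site: pass the flag explicitly so the pointer reaches `info`.
std::string callFixed(const Info &i) { return describe(true, &i); }
```

The old-style call still compiles (possibly with a warning), which is presumably why the mistake went unnoticed for so long.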

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362312 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Simplify the CHECK lines in vector-reduce-and/or/xor-widen.ll in similar way to r362308.

Forgot to do the widen forms when I was doing the others.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362310 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add the SSE versions of PMULLW and PMULLD to isAssociativeAndCommutative.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362309 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Simplify the CHECK lines in vector-reduce-and/or/xor.

The AVX512BW and AVX512VL checks were never used. And AVX512 is the same
as AVX on all tests that weren't already split for AVX1 and AVX2.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362308 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add avx512 command lines and test cases to machine-combiner.ll

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362307 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyIndVar] Refactor overflow check elimination code; NFC

Extract a willNotOverflow() helper function that is shared between
eliminateOverflowIntrinsic() and strengthenOverflowingOperation().
Use WithOverflowInst for the former.

We'll be able to reuse the same code for saturating intrinsics as
well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362305 91177308-0d34-0410-b5e6-96231b3b80d8

[InlineCost] Don't add the soft float function call cost for the fneg idiom, fsub -0.0, %x

Summary: Fneg can be implemented with an xor rather than a function call so we don't need to add the function call overhead. This was pointed out in D62699
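
As an illustration of why no library call is needed, here is a minimal C++ sketch of negation by flipping the sign bit. It assumes `float` is a 32-bit IEEE-754 value; the helper name `fnegViaXor` is made up for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Negate a float by XOR-ing its sign bit.
// Assumes float is 32-bit IEEE-754, which holds on common targets.
float fnegViaXor(float x) {
  static_assert(sizeof(float) == sizeof(std::uint32_t), "expects 32-bit float");
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof bits);  // reinterpret the bytes safely
  bits ^= 0x80000000u;                  // flip only the sign bit
  std::memcpy(&x, &bits, sizeof bits);
  return x;
}
```

Note that this also maps 0.0 to -0.0, matching the 'fsub -0.0, %x' idiom.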

Reviewers: efriedma, cameron.mcinally

Reviewed By: efriedma

Subscribers: javed.absar, eraman, hiraditya, haicheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62747

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362304 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Regenerate SDIV tests for an upcoming patch

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362303 91177308-0d34-0410-b5e6-96231b3b80d8

[MCA][Scheduler] Change how memory instructions are dispatched to the pending set. NFCI

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362302 91177308-0d34-0410-b5e6-96231b3b80d8

[APInt] Add PR40897 test case

In reality APInt::getBitsNeeded(INT_MIN, base) cases require one less bit than is returned

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362301 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX] Add tests for CONCAT(MOVDDUP(x),MOVDDUP(y))

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362300 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Extend range of register indexes accepted by cfcmsa/ctcmsa

The `cfcmsa` and `ctcmsa` instructions accept the index of an MSA control
register. The MIPS64 SIMD Architecture defines eight MSA control
registers. But the register index for `cfcmsa` and `ctcmsa` instructions
might be any number in the 0..31 range. If the index is greater than 7,
`cfcmsa` writes zero to the destination registers and `ctcmsa` does
nothing [1].

[1] MIPS Architecture for Programmers Volume IV-j:
The MIPS64 SIMD Architecture Module
https://www.mips.com/?do-download=the-mips64-simd-architecture-module

Differential Revision: https://reviews.llvm.org/D62597

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362299 91177308-0d34-0410-b5e6-96231b3b80d8

[AVR] Disable register coalescing to the PTRDISPREGS class

If we allow register coalescing on the PTRDISPREGS class, the register
allocator can lock the Z register to some virtual register. Larger instructions
requiring a memory access then fail during the register allocation phase, since
there is no available register to hold a pointer if the Y register was already
taken for a stack frame. This patch prevents that by keeping the Z register
spillable: the coalescer is not allowed to lock it.

Original discussion on https://github.com/avr-rust/rust/issues/128.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362298 91177308-0d34-0410-b5e6-96231b3b80d8

[SLPVectorizer][X86] Add other tests described in PR28474

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362297 91177308-0d34-0410-b5e6-96231b3b80d8

[SLPVectorizer][X86] This test was from PR28474

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362296 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][Codegen] shift-amount-mod.ll: drop innermost operation

I initially added it so that the test would show
whether the binop w/ constant is sunk or hoisted.
But as can be seen from the 'sub (sub C, %x), %y'
test, it actually conceals the issues it is supposed to test.

At least two more patterns are unhandled:
* 'add (sub C, %x), %y' - D62266
* 'sub (sub C, %x), %y'

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362295 91177308-0d34-0410-b5e6-96231b3b80d8

[IndVarSimplify] Fixup nowrap flags during LFTR (PR31181)

Fix for https://bugs.llvm.org/show_bug.cgi?id=31181 and partial fix
for LFTR poison handling issues in general.

When LFTR moves a condition from pre-inc to post-inc, it may now
depend on value that is poison due to nowrap flags. To avoid this,
we clear any nowrap flag that SCEV cannot prove for the post-inc
addrec.

Additionally, LFTR may switch to a different IV that is dynamically
dead and as such may be arbitrarily poison. This patch will correct
nowrap flags in some but not all cases where this happens. This is
related to the adoption of IR nowrap flags for the pre-inc addrec.
(See some of the switch_to_different_iv tests, where flags are not
dropped or insufficiently dropped.)

Finally, there are likely similar issues with the handling of GEP
inbounds, but we don't have a test case for this yet.

Differential Revision: https://reviews.llvm.org/D60935

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362292 91177308-0d34-0410-b5e6-96231b3b80d8

[IndVarSimplify] Add additional PR33181 tests; NFC

Two more tests with a switch to a dynamically dead IV, with poison
occurring on the first or second iteration.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362291 91177308-0d34-0410-b5e6-96231b3b80d8

Extend the DWARFExpression address handling to support 16-bit addresses

This allows the DWARFExpression class to handle addresses without
crashing on targets with 16-bit pointers like AVR.

This is required in order to generate assembly from clang via the '-S'
flag.

This fixes an error with the following message:

clang: llvm/include/llvm/DebugInfo/DWARF/DWARFExpression.h:132: llvm::DWARFExpression::DWARFExpression(llvm::DataExtractor, uint16_t, uint8_t):
Assertion `AddressSize == 8 || AddressSize == 4' failed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362290 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] test commit

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362289 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add AVX512BF16 and AVX512VP2INTERSECT instructions to the load folding tables.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362288 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Make the X86FoldTablesEmitter functional again. Fix the spacing in the output to make it easier to diff.

Fix a few other formatting issues in the manual table. And remove some
old FIXMEs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362287 91177308-0d34-0410-b5e6-96231b3b80d8

[RuntimeDyld] fix too-small-bitmask error

Summary:
This was flagged in https://www.viva64.com/en/b/0629/ under "Snippet No.
33".

It seems that this statement is doing the standard bitwise trick for
adjusting a value to have a specific alignment.

The issue is that getStubAlignment() returns an unsigned, while DataSize
is declared a uint64_t. The right hand side of the expression is not
extended to 64b before bitwise negation, resulting in the top half of
the mask being 0s, which is not correct for realignment.
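
The effect can be shown with a small C++ sketch. The function names are hypothetical and the expression is the generic align-down idiom, not necessarily the exact statement in RuntimeDyld; it also assumes `unsigned` is 32 bits wide.

```cpp
#include <cassert>
#include <cstdint>

// Buggy shape: the mask is built in `unsigned` (assumed 32-bit here), so
// after zero-extension to 64 bits its upper half is all zeros.
std::uint64_t alignDownNarrow(std::uint64_t size, unsigned align) {
  return size & ~(align - 1);
}

// Fixed shape: widen to 64 bits first, then negate.
std::uint64_t alignDownWide(std::uint64_t size, unsigned align) {
  return size & ~(static_cast<std::uint64_t>(align) - 1);
}
```

For a size of 0x100000013 and an alignment of 8, the narrow version drops bit 32 and yields 0x10, while the widened version yields 0x100000010.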

Reviewers: lhames, MaskRay

Reviewed By: MaskRay

Subscribers: RKSimon, MaskRay, hiraditya, llvm-commits, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62227

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362286 91177308-0d34-0410-b5e6-96231b3b80d8

Inline variable into assert to fix unused variable warning.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362285 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopPred] Eliminate a redundant/confusing cover function [NFC]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362284 91177308-0d34-0410-b5e6-96231b3b80d8

[COFF, ARM64] Fix location of ARM64 CodeView test

The ARM64 CodeView test was incorrectly put under the test/DebugInfo/COFF folder, which
runs for all architectures. This fix moves it to a subfolder AArch64 with a
lit.local.cfg which specifies that it supports AArch64 only.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362283 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopPred] Handle a subset of NE comparison based latches

At the moment, LoopPredication completely bails out if it sees a latch of the form:
    %cmp = icmp ne %iv, %N
    br i1 %cmp, label %loop, label %exit
OR
    %cmp = icmp ne %iv.next, %NPlus1
    br i1 %cmp, label %loop, label %exit

This is unfortunate since this is exactly the form that LFTR likes to produce. So, go ahead and recognize simple cases where we can.

For pre-increment loops, we leverage the fact that LFTR likes canonical counters (i.e. those starting at zero) and a (presumed) range fact on RHS to discharge the check trivially.

For post-increment forms, the key insight is in remembering that LFTR had to insert a (N+1) for the RHS. CVP can hopefully prove that add nsw/nuw (if there's appropriate range on N to start with). This leaves us both with the post-inc IV and the RHS involving an nsw/nuw add, and SCEV can discharge that with no problem.

This does still need to be extended to handle non-one steps, or other harder patterns of variable (but range restricted) starting values. That'll come later.

Differential Revision: https://reviews.llvm.org/D62748

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362282 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeGen] Fix hashing for MO_ExternalSymbol MachineOperands.

We were hashing the string pointer, not the string, so two instructions
could be identical (isIdenticalTo), but have different hash codes.
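
A minimal C++ sketch of the difference, using hypothetical helper names and a hand-written FNV-1a hash rather than LLVM's own hashing utilities:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hash the address: two equal strings stored at different addresses
// get different hash codes.
std::uint64_t hashPointer(const char *s) {
  return static_cast<std::uint64_t>(reinterpret_cast<std::uintptr_t>(s));
}

// Hash the characters (FNV-1a): equal strings always hash equally.
std::uint64_t hashContents(const char *s) {
  std::uint64_t h = 14695981039346656037ull;  // FNV-1a 64-bit offset basis
  for (; *s; ++s) {
    h ^= static_cast<unsigned char>(*s);
    h *= 1099511628211ull;                    // FNV-1a 64-bit prime
  }
  return h;
}
```

Any container that requires equal keys to have equal hashes can misbehave with the pointer-based version, and whether it does depends on where the strings happen to be allocated.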

This showed up as a very rare, non-deterministic assertion failure
rehashing a DenseMap constructed by MachineOutliner. So there's no
"real" testcase, just a unittest which checks that the hash function
behaves correctly.

I'm a little scared fixing this is going to cause a regression in
outlining or MachineCSE, but hopefully we won't run into any issues.

Differential Revision: https://reviews.llvm.org/D61975

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362281 91177308-0d34-0410-b5e6-96231b3b80d8

[COFF, ARM64] Add CodeView register mapping

CodeView has its own register map which is defined in cvconst.h. Missing this
mapping before saving register to CodeView causes debugger to show incorrect
value for all register based variables, like variables in register and local
variables addressed by register (stack pointer + offset).

This change adds a mapping between LLVM registers and CodeView registers so that
the correct register number is stored to CodeView/PDB. It also fixes the
mapping from CodeView register number to register name based on the current
CPUType, but printing PDB to yaml still assumes an X86 CPU and needs to be fixed.

Differential Revision: https://reviews.llvm.org/D62608

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362280 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] check for INLINEASM_BR along w/ INLINEASM

Summary:
It looks like since INLINEASM_BR was created off of INLINEASM (r353563),
a few checks for INLINEASM needed to be updated to check for either
case.

pr/41999

Reviewers: hfinkel

Reviewed By: hfinkel

Subscribers: nemanjai, hiraditya, kbarton, jsji, llvm-commits, craig.topper, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62403

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362278 91177308-0d34-0410-b5e6-96231b3b80d8

[codeview] Revert inline line table change of r362264

Testing with debuggers shows that our previous behavior was correct.
The reason I thought MSVC did things differently is that MSVC prefers to
use the 0xB combined code offset and code length update opcode when
inline sites are discontiguous.

Keep the test changes, and update the llvm-pdbutil inline line table
dumper to account for this new interpretation of the opcodes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362277 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix not adding ImplicitBufferPtr as a live-in

Fixes missing test from r293000.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362275 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyLibCalls] Fold more fortified functions into non-fortified variants

When the object size argument is -1, no checking can be done, so calling the
_chk variant is unnecessary. We already did this for a bunch of these
functions.
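
To see why the check is dead when the size is -1, consider this C++ model of a fortified copy. `checkedCopy` and `copyUnknownSize` are hypothetical names for illustration, not the real _chk entry points.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical model of a fortified copy: abort if the copy would overrun
// the destination object, whose size is passed as `objSize`.
void *checkedCopy(void *dst, const void *src, std::size_t n, std::size_t objSize) {
  if (n > objSize) std::abort();
  return std::memcpy(dst, src, n);
}

// With objSize == (size_t)-1, i.e. the largest size_t value, the test
// `n > objSize` can never be true, so this is equivalent to plain memcpy.
void *copyUnknownSize(void *dst, const void *src, std::size_t n) {
  return checkedCopy(dst, src, n, static_cast<std::size_t>(-1));
}
```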

rdar://50797197

Differential revision: https://reviews.llvm.org/D62358

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362272 91177308-0d34-0410-b5e6-96231b3b80d8

NFC: Pull out a function to reduce some duplication

Part of https://reviews.llvm.org/D62358

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362271 91177308-0d34-0410-b5e6-96231b3b80d8

[Tests] Better represent the postinc form produced by LFTR in LoopPred tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362270 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] Make the code in mutateStrictFPToFP less aware of how many operands each node has. NFCI

Just copy all of the operands except the chain and call MorphNode on that.
This removes the IsUnary and IsTernary flags.

Also always get the result type from the result type of the original
nodes. Previously we got it from the operand except for two nodes
where that didn't work.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362269 91177308-0d34-0410-b5e6-96231b3b80d8

[Bugpoint] fix another use-after-move. NFC

Summary:
This was flagged in https://www.viva64.com/en/b/0629/ under "Snippet No.
7".

These statements are order independent, short of the use-after-move.
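
The general shape of such a fix, as a C++ sketch with made-up types (`Module`, `Holder`) and a made-up function `adopt`, not the Bugpoint code itself:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

struct Module { std::string name; };

struct Holder {
  std::unique_ptr<Module> owned;
  std::string label;
};

// Read everything needed from `m` first, then move it. Swapping the two
// statements would dereference a moved-from (null) unique_ptr.
Holder adopt(std::unique_ptr<Module> m) {
  Holder h;
  h.label = m->name;       // use
  h.owned = std::move(m);  // then move
  return h;
}
```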

Reviewers: echristo, srhines, RKSimon

Reviewed By: RKSimon

Subscribers: dblaikie, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62114

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362267 91177308-0d34-0410-b5e6-96231b3b80d8

[RegisterCoalescer] fix potential use of undef value. NFC

Summary:
Fixes a warning produced by scan-build (llvm.org/reports/scan-build/), and
further warnings found by annotating isMoveInstr with [[nodiscard]].

isMoveInstr potentially does not assign to its parameters, so if they
were uninitialized, they will potentially stay uninitialized. It seems
most call sites pass references to uninitialized values, then use them
without checking the return value.
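
A small C++ sketch of the pattern and a safe call site. `splitPair` and `sumParts` are hypothetical and only mirror the shape described, a function that writes its out-parameters only on success:

```cpp
#include <cassert>

// Hypothetical helper: it only writes to its out-parameters when it
// returns true. [[nodiscard]] makes ignoring the result a warning.
[[nodiscard]] bool splitPair(int packed, int &hi, int &lo) {
  if (packed < 0) return false;  // out-parameters left untouched
  hi = packed / 100;
  lo = packed % 100;
  return true;
}

// Safe call site: initialize the outputs and check the result before use.
int sumParts(int packed) {
  int hi = 0, lo = 0;
  if (!splitPair(packed, hi, lo)) return -1;
  return hi + lo;
}
```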

Reviewers: wmi

Reviewed By: wmi

Subscribers: MatzeB, qcolombet, hiraditya, tpr, llvm-commits, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62109

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362265 91177308-0d34-0410-b5e6-96231b3b80d8

[codeview] Fix inline line table accuracy for discontiguous segments

After improving the inline line table dumper in llvm-pdbutil and looking
at MSVC's inline line tables, it is clear that setting the length of the
inlined code region does not update the code offset. This means that the
delta to the beginning of a new discontiguous inlined code region should
be calculated relative to the last code offset, excluding the length.
Implementing this is a one line fix for MC: simply don't update
LastLabel.

While I'm updating these test cases, switch them to use llvm-objdump -d
and llvm-pdbutil. This allows us to show offsets of each instruction and
correlate the line table offsets to the actual code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362264 91177308-0d34-0410-b5e6-96231b3b80d8

Reapply [CVP] Simplify non-overflowing saturating add/sub

If we can determine that a saturating add/sub will not overflow based
on range analysis, convert it into a simple binary operation. This is
a sibling transform to the existing with.overflow handling.
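
The idea can be sketched in C++ for 8-bit signed values. The helper names are made up, and the range check is a simplified stand-in for what a range analysis would prove:

```cpp
#include <cassert>
#include <cstdint>

// Signed 8-bit saturating add: clamp the exact sum to [-128, 127].
std::int8_t saddSat(std::int8_t a, std::int8_t b) {
  int wide = int(a) + int(b);
  if (wide > 127) return 127;
  if (wide < -128) return -128;
  return static_cast<std::int8_t>(wide);
}

// True if a + b stays in the int8 range for every a in [aLo, aHi] and
// b in [bLo, bHi]; in that case the saturating add is just a plain add.
bool rangesCannotOverflow(int aLo, int aHi, int bLo, int bHi) {
  return aLo + bLo >= -128 && aHi + bHi <= 127;
}
```

When `rangesCannotOverflow` holds, neither clamp in `saddSat` can trigger, so the result is always the exact sum.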

Reapplying this with an additional check that the saturating intrinsic
has integer type, as LVI currently does not support vector types.

Differential Revision: https://reviews.llvm.org/D62703

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362263 91177308-0d34-0410-b5e6-96231b3b80d8

[CVP] Add vector saturating add test; NFC

Extra test for the assertion failure from D62703.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362262 91177308-0d34-0410-b5e6-96231b3b80d8

[CVP] Fix assertion failure on vector with.overflow

Noticed on D62703. LVI only handles plain integers, not vectors of
integers. This was previously not an issue, because vector support
for with.overflow is only a relatively recent addition.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362261 91177308-0d34-0410-b5e6-96231b3b80d8

[Tests] Add ne icmp tests w/preinc forms for LoopPredication

Turns out this is substantially easier to match than the post increment form, so let's start there.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362260 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Resync Host.cpp with compiler-rt's cpu_model.c to enable 0x55 to be identified as cascadelake when avx512vnni is detected.

Some other formatting changes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362256 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] Add unary FNeg tests to AMDGPU/amdgcn-intrinsics.ll

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362255 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[CVP] Simplify non-overflowing saturating add/sub"

This reverts commit 1e692d1777ae34dcb93524b5798651a29defae09.

Causes assertion failure in builtins-wasm.c clang test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362254 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] Add unary FNeg to cos-1.ll cos-2.ll cos-sin-intrinsic.ll

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362253 91177308-0d34-0410-b5e6-96231b3b80d8

[MCA] Remove unused fields from BottleneckAnalysis. NFC

This should appease the buildbots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362251 91177308-0d34-0410-b5e6-96231b3b80d8

[CMake] Feed BUNDLE_PATH through llvm target wrappers

This feeds the new llvm_codesign BUNDLE_PATH option through from the llvm target wrapper functions, so that you can specify the BUNDLE_PATH on the target's codesign.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362248 91177308-0d34-0410-b5e6-96231b3b80d8

[MIR-Canon] Don't do vreg skip for independent instructions if there are none.

We don't want to create vregs if there is nothing to use them for. That causes
verifier errors.

Differential Revision: https://reviews.llvm.org/D62740

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362247 91177308-0d34-0410-b5e6-96231b3b80d8

[MCA] Refactor class BottleneckAnalysis. NFCI

The resource pressure distribution computation is now delegated by class
BottleneckAnalysis to an instance of class PressureTracker.
Class PressureTracker is also responsible for:
- tracking users of processor resource units.
- tracking the number of delay cycles caused by increases in backpressure.

BottleneckAnalysis internally initializes a dependency graph. Each node
represents an instruction in the input code sequence. Edges of the dependency
graph are critical register/memory/resource dependencies. Dependencies are only
added to the graph if they are seen as critical by backend pressure events.

The DependencyGraph is currently unused. It is possible to print the dependency
graph (see method DependencyGraph::dump()) for debugging purposes.
The long term goal is to use the information stored by the dependency graph in
order to do critical path computation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362246 91177308-0d34-0410-b5e6-96231b3b80d8

[Tests] Add tests for loop predication of loops w/ne latch conditions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362244 91177308-0d34-0410-b5e6-96231b3b80d8

[CVP] Simplify non-overflowing saturating add/sub

If we can determine that a saturating add/sub will not overflow
based on range analysis, convert it into a simple binary operation.
This is a sibling transform to the existing with.overflow handling.

Differential Revision: https://reviews.llvm.org/D62703

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362242 91177308-0d34-0410-b5e6-96231b3b80d8

Revert revert of r362112 with minor SystemZ test file corrections.

[FPEnv] Added a special UnrollVectorOp method to deal with the chain on StrictFP opcodes

This change creates UnrollVectorOp_StrictFP. Its purpose is to address a failure that consistently occurs when calling StrictFP functions on vectors whose number of elements is 3 + 2n on most platforms, such as PowerPC or SystemZ. The old UnrollVectorOp method does not expect the vector it unrolls to have a chain, so it has an assert that prevents it from running in that case. The new StrictFP version of the method deals with the chain while unrolling the vector. With this new function in place during vector widening, llc can run vector-constrained-fp-intrinsics.ll for SystemZ successfully.

Submitted by: Drew Wock <drew.wock@sas.com>
Reviewed by: Cameron McInally, Kevin P. Neal
Approved by: Cameron McInally
Differential Revision: https://reviews.llvm.org/D62546

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362241 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Use InliningThresholdMultiplier for inline hint

AMDGPU uses multiplier 9 for the inline cost. It is taken into account
everywhere except for inline hint threshold. As a result we are penalizing
functions with the inline hint making them less probable to be inlined
than those without the hint. Defaults are 225 for a normal function and
325 for a function with an inline hint. Currently we have effective
threshold 225 * 9 = 2025 for normal functions and just 325 for those with
the hint. That is fixed by this patch.
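
The arithmetic, as a short C++ sketch using the numbers quoted above. It assumes the fix simply applies the same multiplier to the hint threshold; the constant and function names are made up.

```cpp
#include <cassert>

// Values quoted in the message above.
constexpr int kMultiplier = 9;
constexpr int kDefaultThreshold = 225;
constexpr int kHintThreshold = 325;

// Effective threshold once the multiplier is applied.
constexpr int scaled(int threshold) { return threshold * kMultiplier; }
```

Under that assumption, hinted functions get 325 * 9 = 2925, which is again above the 2025 used for normal functions.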

Differential Revision: https://reviews.llvm.org/D62707

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362239 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] Add unary FNeg tests to fabs.ll

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362238 91177308-0d34-0410-b5e6-96231b3b80d8

[PPC] Correctly adjust branch probability in PPCReduceCRLogicals

In PPCReduceCRLogicals after splitting the original MBB into 2, the 2 impacted branches still use original branch probability. This is unreasonable. Suppose we have following code, and the probability of each successor is 50%.

    condc = conda || condb
    br condc, label %target, label %fallthrough

It can be transformed to following,

    br conda, label %target, label %newbb
  newbb:
    br condb, label %target, label %fallthrough

Since each branch has a probability of 50% to each successor, the total probability to %fallthrough is 25% now, and the total probability to %target is 75%. This actually changed the original profiling data. A more reasonable probability can be set to 70% to the false side for each branch instruction, so the total probability to %fallthrough is close to 50%.

This patch assumes that the two incoming edges of a branch target have the same edge frequency, computes a new probability for each target, and keeps the total probability to the original targets unchanged.
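
One way to read that scheme for the `||` example above, as a C++ sketch. The struct and function names are hypothetical, and this is an interpretation of the description, not the code from the patch:

```cpp
#include <cassert>
#include <cmath>

struct SplitProbs {
  double firstToTarget;   // br conda -> %target
  double secondToTarget;  // br condb -> %target, given that newbb is reached
};

// Assumes the two edges into %target carry equal frequency and that the
// overall probability of reaching %target stays `pTarget`.
SplitProbs splitOr(double pTarget) {
  double edge = pTarget / 2.0;         // frequency of each edge into %target
  return {edge, edge / (1.0 - edge)};  // newbb is reached with 1 - edge
}

// Overall probability of falling through both branches.
double totalFallthrough(const SplitProbs &p) {
  return (1.0 - p.firstToTarget) * (1.0 - p.secondToTarget);
}
```

With an original 50% chance of reaching %target, the first branch takes it 25% of the time and the second about 33% of the time, so the total probability of %fallthrough stays at 50%.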

Differential Revision: https://reviews.llvm.org/D62430

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362237 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] Add unary FNeg tests to fcmp.ll

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362234 91177308-0d34-0410-b5e6-96231b3b80d8

[MachinePipeliner][NFC] Add some debug log and statistics

This is to add some log and statistics for debugging

Differential Revision: https://reviews.llvm.org/D62165

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362233 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] Add unary FNeg tests to fdiv.ll

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362231 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Regenerate add/sub shrink constant tests for an upcoming patch

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362230 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Regenerate CTLZ tests for an upcoming patch

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362229 91177308-0d34-0410-b5e6-96231b3b80d8

[UpdateTestChecks] Add support for -march=r600 to match existing -march=amdgcn support

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362228 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] Add unary FNeg tests to fma.ll

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362227 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-readobj] - Remove excessive `dynamic.test`

dynamic.test is a test that checks dumping of
dynamic tags. It uses precompiled objects as inputs
and it is completely excessive nowadays:

Now we have elf-dynamic-tags-machine-specific.test
and elf-dynamic-tags.test.
(https://github.com/llvm-mirror/llvm/blob/master/test/tools/llvm-readobj/elf-dynamic-tags-machine-specific.test)
(https://github.com/llvm-mirror/llvm/blob/master/test/tools/llvm-readobj/elf-dynamic-tags.test)

First is used to check target specific tags and second tests the common flags.
These tests use YAML, which is much better than using precompiled binaries.

Note that new reviews tend to update the YAML based
tests to add new tags, e.g. see D62596.

With this patch it became possible to remove
dynamic-table-so.aarch64 binary from the inputs folder.
(other binaries are still used in other tests).

Differential revision: https://reviews.llvm.org/D62728

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362224 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r362160

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362223 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r362196

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362222 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r362190

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362221 91177308-0d34-0410-b5e6-96231b3b80d8

ftime-trace: Trace loop passes

These can take a significant amount of time in some builds.

Suggested by Andrea Di Biagio.

Differential Revision: https://reviews.llvm.org/D62666

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362219 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] 'C-(C2-X) --> X+(C-C2)' constant-fold

It looks like this fold was already partially happening, indirectly
via some other folds, but with one-use limitation.
No other fold here has that restriction.

https://rise4fun.com/Alive/ftR
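
The identity behind the fold can be spot-checked in C++ with wrapping 32-bit unsigned arithmetic. The function names `before` and `after` are made up for this sketch:

```cpp
#include <cassert>
#include <cstdint>

// Original form: C - (C2 - X), in wrapping 32-bit arithmetic.
std::uint32_t before(std::uint32_t c, std::uint32_t c2, std::uint32_t x) {
  return c - (c2 - x);
}

// Folded form: X + (C - C2); in the IR, (C - C2) is a single constant.
std::uint32_t after(std::uint32_t c, std::uint32_t c2, std::uint32_t x) {
  return x + (c - c2);
}
```

Both forms agree even when intermediate values wrap, because addition and subtraction modulo 2^32 can be freely reassociated.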

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362217 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] 'add (sub C1, X), C2 --> sub (add C1, C2), X' constant-fold

https://rise4fun.com/Alive/qJQ

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362216 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE2] Asm: support WHILE instructions

Summary:
Patch adds support for the following instructions:
* WHILEGE, WHILEGT, WHILEHS, WHILEHI, WHILEWR, WHILERW

The specification can be found here:
https://developer.arm.com/docs/ddi0602/latest

Reviewed By: chill

Differential Revision: https://reviews.llvm.org/D62601

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362215 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE2] Asm: support TBL/TBX instructions

Summary:
A three-source variant of the TBL instruction is added to the existing
SVE instruction in SVE2. This is implemented with minor changes to the
existing TableGen class. TBX is a new instruction with its own
definition.

The specification can be found here:
https://developer.arm.com/docs/ddi0602/latest

Reviewed By: chill

Differential Revision: https://reviews.llvm.org/D62600

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362214 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE2] Asm: support SVE2 store instructions

Summary:
Patch adds support for the following instructions:
* STNT1B, STNT1H, STNT1S, STNT1D

The specification can be found here:
https://developer.arm.com/docs/ddi0602/latest

Reviewed By: chill

Differential Revision: https://reviews.llvm.org/D62599

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362213 91177308-0d34-0410-b5e6-96231b3b80d8

[MIPS GlobalISel] Add detailed tests for lower call

Test the different callee operand types and how they behave depending on
whether the relocation model is PIC or not.
Possible operand types are:
Register (function pointer),
External symbol (used for libcalls e.g. __udivdi3 or memcpy),
Global address.

Global address has different handling depending on relocation model
and linkage type. Register and external symbol do not.

Differential Revision: https://reviews.llvm.org/D62590

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362212 91177308-0d34-0410-b5e6-96231b3b80d8

Follow up and fix for rL362064

Fix the misleading indentation introduced in rL362064. This will get rid of
the compiler warning, and it was actually a bug. This change will be used and
tested in D62669.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362211 91177308-0d34-0410-b5e6-96231b3b80d8

[MIPS GlobalISel] Handle position independent code

Handle position independent code for MIPS32.
When the callee is a global address, lower call will emit the callee
as G_GLOBAL_VALUE and add a target flag if needed.
Support $gp in getRegBankFromRegClass().
Select G_GLOBAL_VALUE, with special handling for the case when
there are target flags attached by lowerCall.

Differential Revision: https://reviews.llvm.org/D62589

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362210 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] Copy add/sub constant-folding tests from codegen

The last three patterns are missed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362209 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][Codegen] Add/sub constant-folding: add scalar tests too

Just for completeness.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362208 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Move initGlobalBaseReg to MipsFunctionInfo. NFC

Move initGlobalBaseReg from MipsSEDAGToDAGISel to MipsFunctionInfo.
This way the functions used for handling position independent code during
instruction selection, getGlobalBaseReg and initGlobalBaseReg,
end up in the same class.

Differential Revision: https://reviews.llvm.org/D62586

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362206 91177308-0d34-0410-b5e6-96231b3b80d8

[InstructionSimplify] Add missing implementation of llvm::SimplifyUnOp. NFC

There are no callers currently, but the function is declared so we should at
least implement it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362205 91177308-0d34-0410-b5e6-96231b3b80d8

[MIPS GlobalISel] Lower call for callee that is register

Lower call for a callee that is a register for MIPS32.
The register should contain the callee function address.

Differential Revision: https://reviews.llvm.org/D62585

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362204 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove patterns for X86VSintToFP/X86VUintToFP+loadv4f32 to v2f64.

These patterns can incorrectly narrow a volatile load from 128-bits to 64-bits.
Similar to PR42079.

Switch to using (v4i32 (bitcast (v2i64 (scalar_to_vector (loadi64))))) as the
load pattern used in the instructions.

This probably still has issues in 32-bit mode where loadi64 isn't legal. Maybe
we should use VZMOVL for widened loads even when we don't need the upper bits
as zeroes?

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362203 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add test cases for failure to use 128-bit masked vcvtdq2pd when load starts as v2i32.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362202 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add test cases for a volatile load shrinking bug involving cvtdq2pd. NFC

Similar to PR42079

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362201 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Copy a test case from avx512-cvt.ll to avx512-cvt-widen.ll. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362200 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove avx512 isel patterns for fpextend+load. Prefer to only match fp extloads instead.

DAG combine will usually fold fpextend+load to an fp extload anyway. So the
256 and 512 patterns were probably unnecessary. The 128 bit pattern was special
in that it looked for a v4f32 load, but then used it in an instruction that
only loads 64-bits. This is bad if the load happens to be volatile. We could
probably make the patterns volatile aware, but that's more work for something
that's probably rare. The peephole pass might kick in and save us anyway. We
might also be able to fix this with some additional DAG combines.

This also adds patterns for vselect+extload to enable masked vcvtps2pd to be
used. Previously we looked for the unlikely vselect+fpextend+load.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362199 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add test to show missed opportunity to use masked vcvtps2pd for vselect+extload.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362198 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add test case for PR42079. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362197 91177308-0d34-0410-b5e6-96231b3b80d8