granicus.if.org Git

AMDGPU/GlobalISel: Handle 16-bit SALU min/max

This needs to be extended to s32, and expanded into cmp+select. This
is relying on the fact that widenScalar happens to leave the
instruction in place, but this isn't a guaranteed property of
LegalizerHelper.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364831 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Lower SALU min/max to cmp+select

Use a change observer to apply a register bank to the newly created
intermediate result register.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364830 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Avoid SFB - Fix inconsistent codegen with/without debug info(2)

The function findPotentialBlockers may consider debug info instructions as
potential blockers and may stop searching for a store-load pair prematurely.

This patch corrects this and tests the cases where the store is separated
from the load by more than InspectionLimit debug instructions.

Patch by Chris Dawson.

Differential Revision: https://reviews.llvm.org/D62408

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364829 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Add tests for add legalization

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364828 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Legalize s16 add/sub/mul

If this is scalar, promote to s32. Use a new observer class to assign
the register bank of newly created registers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364827 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Fix allowing non-boolean conditions for G_SELECT

The condition register bank must be scc or vcc so that a copy will be
inserted, which will be lowered to a compare.

Currently greedy unnecessarily forces using a VCC select.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364825 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] Add tests for "shift direction in bittest" (PR42466)

https://rise4fun.com/Alive/8O1
https://bugs.llvm.org/show_bug.cgi?id=42466

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364824 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: Verify G_MERGE_VALUES operand sizes

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364822 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalISel]: Allow backends to custom legalize Intrinsics

https://reviews.llvm.org/D31359

Add a hook "legalizeInstrinsic" to allow backends to override this
and custom lower/legalize intrinsics.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364821 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: RegBankSelect for sendmsg/sendmsghalt

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364819 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Legalize s16 fcmp

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364817 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: Implement lower for min/max

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364816 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GFX10: implement ds_ordered_count changes

Summary:
ds_ordered_count can now simultaneously operate on up to 4 dwords
in a single instruction, which are taken from (and returned to)
lanes 0..3 of a single VGPR.

Change-Id: I19b6e7b0732b617c10a779a7f9c0303eec7dd276

Reviewers: mareko, arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63716

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364815 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Support GDS atomics

Summary:
Original patch by Marek Olšák

Change-Id: Ia97d5d685a63a377d86e82942436d1fe6e429bab

Reviewers: mareko, arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, jfb, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63452

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364814 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: RegBankSelect for DS ordered add/swap

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364811 91177308-0d34-0410-b5e6-96231b3b80d8

AArch64/GlobalISel: Fix trying to select invalid MIR

Physical registers are not allowed to be a phi operand.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364810 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: RegBankSelect for amdgcn.writelane

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364808 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Fail instead of assert when selecting loads

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364807 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Complete implementation of G_GEP

Also works around tablegen defect in selecting add with unused carry,
but if we have to manually select GEP, might as well handle add
manually.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364806 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Select G_PHI

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364805 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Try to select VOP3 form of add

There are several things broken, but at least emit the right thing for
gfx9.

The import of the pattern with the unused carry out seems to not
work. Needs a special class for clamp, because OperandWithDefaultOps
doesn't really work.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364804 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add widenSubVector to size in bits helper. NFCI.

We can already widenSubVector to a specific type (of the same scalar type) - this variant just specifies the target vector size.

This will be useful when CombineShuffleWithExtract relaxes the need to have the same scalar type for all shuffle operand subvector sources.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364803 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: RegBankSelect for readlane/readfirstlane

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364801 91177308-0d34-0410-b5e6-96231b3b80d8

[docs][llvm-readelf] Expand llvm-readelf documentation

Previously, the llvm-readelf documentation was essentially just a list
of differences to llvm-readobj. Since llvm-readelf is the more likely
goto tool for many people migrating to the LLVM toolchain, it seems like
it would be helpful to document all the switches in the llvm-readelf
document too. This change expands the options listed accordingly.
Additionally, they are unlikely to care what the differences are to
llvm-readobj, since they won't be familiar with the latter as there is
no GNU equivalent, so this change moves the "differences" section to
llvm-readobj's documentation.

Reviewed by: peter.smith

Differential Revision: https://reviews.llvm.org/D63826

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364800 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Implement select for 32-bit G_ADD

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: hiraditya, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58804

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364797 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Fix MVE_VQxDMLxDH instruction class

Summary:
According to the ARMARM, the VQDMLADH, VQRDMLADH, VQDMLSDH and
VQRDMLSDH instructions handle their results as follows: "The base
variant writes the results into the lower element of each pair of
elements in the destination register, whereas the exchange variant
writes to the upper element in each pair". I.e., the initial content
of the output register affects the result, as usual, we model this
with an additional input.

Also, for 32-bit variants Qd is not allowed to be the same register as
Qm and Qn, we use @earlyclobber to indicate this.

This patch also changes vpred_r to vpred_n because the instructions
don't have an explicit 'inactive' operand.

Reviewers: dmgreen, ostannard, simon_tatham

Reviewed By: simon_tatham

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64007

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364796 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Select G_BRCOND for vcc

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364795 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] MVE: support QQPRRegClass and QQQQPRRegClass

Summary:
QQPRRegClass and QQQQPRRegClass are used by the
interleaving/deinterleaving loads/stores to represent sequences of
consecutive SIMD registers.

Reviewers: ostannard, simon_tatham, dmgreen

Reviewed By: simon_tatham

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64009

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364794 91177308-0d34-0410-b5e6-96231b3b80d8

Update email address in CODE_OWNERS

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364793 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] (Y + ~X) + 1 --> Y - X fold (PR42459)

Summary:
To be noted, this pattern is not unhandled by instcombine per-se,
it is somehow does end up being folded when one runs opt -O3,
but not if it's just -instcombine. Regardless, that fold is
indirect, depends on some other folds, and is thus blind
when there are extra uses.

This does address the regression being exposed in D63992.

https://godbolt.org/z/7DGltU
https://rise4fun.com/Alive/EPO0

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42459 | PR42459 ]]

Reviewers: spatel, nikic, huihuiz

Reviewed By: spatel

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63993

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364792 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Shift amount reassociation in bittest (PR42399)

Summary:
Given pattern:
`icmp eq/ne (and ((x shift Q), (y oppositeshift K))), 0`
we should move shifts to the same hand of 'and', i.e. rewrite as
`icmp eq/ne (and (x shift (Q+K)), y), 0` iff `(Q+K) u< bitwidth(x)`

It might be tempting to not restrict this to situations where we know
we'd fold two shifts together, but i'm not sure what rules should there be
to avoid endless combine loops.

We pick the same shift that was originally used to shift the variable we picked to shift:
https://rise4fun.com/Alive/6x1v

Should fix [[ https://bugs.llvm.org/show_bug.cgi?id=42399 | PR42399]].

Reviewers: spatel, nikic, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63829

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364791 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Custom-lower UADDO(x, 1) and USUBO(x, 1)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364790 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Select G_FRAME_INDEX

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364789 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GFX10: fix scratch resource descriptor

Summary:
The stride should depend on the wave size, not the hardware generation.

Also, the 32_FLOAT format is 0x16, not 16; though that shouldn't be
relevant.

Change-Id: I088f93bf6708974d085d1c50967f119061da6dc6

Reviewers: arsenm, rampitec, mareko

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63808

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364788 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Make s16 select legal

This is easy to handle and avoids legalization artifacts which are
likely to obscure combines.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364787 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Select G_BRCOND for scc conditions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364786 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Tolerate copies with no type set

isVCC has the same bug, but isn't used in a context where it can cause
a problem.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364784 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix tests using the default alloca address space

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364783 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Select src modifiers

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364782 91177308-0d34-0410-b5e6-96231b3b80d8

Fixup r364512

Fix stack-use-after-scope errors from r364512. One instance was already
fixed in r364611 - this patch simplifies that fix and addresses one more
instance of similar code.

Discussed in: https://reviews.llvm.org/D63905

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364778 91177308-0d34-0410-b5e6-96231b3b80d8

[UpdateTestChecks][PowerPC] Avoid empty string when scrubbing loop comments

Summary:
SCRUB_LOOP_COMMENT_RE was introduced in https://reviews.llvm.org/D31285
This works for some loops.

However, we may generate lines with loop comments only.
And since we don't scrub leading white spaces, this will leave an empty
line there, and FileCheck will complain it.

eg: llvm/test/CodeGen/PowerPC/PR35812-neg-cmpxchg.ll:27:15:
error: found empty check string with prefix 'CHECK:'
; CHECK-NEXT:

This prevented us from using the `update_llc_test_checks.py` for quite some cases.

We should still keep the comment token there, so that we can safely
scrub the loop comment without breaking FileCheck.

Reviewers: timshen, hfinkel, lebedev.ri, RKSimon

Subscribers: nemanjai, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63957

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364775 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] Better commutative tests for "shift amount reassociation in bittest" pattern.

As discussed in https://reviews.llvm.org/D63829
*if* *both* shifts are one-use, we'd most likely want to produce `lshr`,
and not rely on ordering.

Also, there should likely be a *separate* fold to do this reordering.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364772 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Rework VLCR algorithm

Add code to catch pattern for commutative instructions for VLCR.

Patch by Suyog Sarda.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364770 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Convert some places to Register

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364769 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Fix RegBankSelect for G_FCANONICALIZE

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364768 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Fix RegBankSelect for G_BUILD_VECTOR

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364767 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Fail on store to 32-bit address space

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364766 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Improve icmp selection coverage.

Select s64 eq/ne scalar icmp.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364765 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] Improve test coverage for ((~x) + y) + 1 -> y - x fold fold (PR42459)

So we indeed to have this fold, but only if +1 is not the last operation..

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364764 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: RegBankSelect for WWM/WQM

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364763 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Use vcc reg bank for amdgcn.wqm.vote

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364762 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Fix scc->vcc copy handling

This was checking the size of the register with the value of the size,
which happens to be exec. Also fix assuming VCC is 64-bit to fix
wave32.

Also remove some untested handling for physical registers which is
skipped. This doesn't insert the V_CNDMASK_B32 if SCC is the physical
copy source. I'm not sure if this should be trying to handle this
special case instead of dealing with this in copyPhysReg.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364761 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Use and instead of BFE with inline immediate

Zext from s1 is the only case where this should do anything with the
current legal extensions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364760 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: Add GINodeEquiv for min/max

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364759 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: Add DAG compat for G_FCANONICALIZE

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364758 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Add missing schedinfo for MSA and ASE instructions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364757 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Add missing schedinfo for atomic instructions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364756 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Add missing schedinfo for ADJCALLSTACKDOWN, ADJCALLSTACKUP

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364755 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Call isLoopExiting for blocks in the loop.

isLoopExiting should only be called for blocks in the loop. A follow
up patch makes this requirement an assertion.

I've updated the usage here, to only match for actual exit blocks. Previously,
it would also match blocks not in the loop.

Reviewers: arsenm, nhaehnle

Reviewed By: nhaehnle

Differential Revision: https://reviews.llvm.org/D63980

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364750 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] Tests for ((~x) + y) + 1 -> y - x fold fold (PR42459)

To be noted, this pattern is not unhandled by instcombine per-se,
it is somehow does end up being folded when one runs opt -O3,
but not if it's just -instcombine. Regardless, that fold is
indirect, depends on some other folds, and is thus blind
when there are extra uses.

https://bugs.llvm.org/show_bug.cgi?id=42459
https://rise4fun.com/Alive/EPO0

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364749 91177308-0d34-0410-b5e6-96231b3b80d8

[RISCV] Add break; to the last switch case

As suggested by jrtc27 in the post-commit review of D60528.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364746 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] CombineShuffleWithExtract - updated description comments. NFCI.

CombineShuffleWithExtract no longer requires that both shuffle ops are extract_subvectors, from the same type or from the same size.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364745 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] Do minnum->minimum at legalization time instead of building time

The SDAGBuilder behavior stems from the days when we didn't have fast
math flags available in SDAG. We do now and doing the transformation in
the legalizer has the advantage that it also works for vector types.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364743 91177308-0d34-0410-b5e6-96231b3b80d8

[benchmark] Disable CMake get_git_version

Disabled CMake get_git_version as it is meaningless for this in-tree
build, and hardcoded a null version.

Not using get_git_version avoids a refresh of the git index that is
executed by get_git_version. Refreshing the index can take a
considerable amount of time if the index needs to be refreshed
(particularly with the mono repo). This situation can arise when
building shared source on a host in VMs.

Differential Revision: https://reviews.llvm.org/D63925

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364742 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] Tests for x - ~(y) -> x + y + 1 fold (PR42457)

https://bugs.llvm.org/show_bug.cgi?id=42457
https://rise4fun.com/Alive/iFhE

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364739 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Omit 'urem' where possible

This was added in D63390 / rL364286 to backend,
but it makes sense to also handle it in middle-end.
https://rise4fun.com/Alive/Zsln

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364738 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] Copy test for omit urem when possible from TargetLowering

Was added in D63390 / rL364286 to backend, but it makes sense to also handle it here.
https://rise4fun.com/Alive/Zsln

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364737 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] Avoid adding too much indirection to pointer-valued variables

This patch addresses PR41675, where a stack-pointer variable is dereferenced
too many times by its location expression, presenting a value on the stack as
the pointer to the stack.

The difference between a stack *pointer* DBG_VALUE and one that refers to a
value on the stack, is currently the indirect flag. However the DWARF backend
will also try to guess whether something is a memory location or not, based
on whether there is any computation in the location expression. By simply
prepending the stack offset to existing expressions, we can accidentally
convert a register location into a memory location, which introduces a
suprise (and unintended) dereference.

The solution is to add DW_OP_stack_value whenever we add a DIExpression
computation to a stack *pointer*. It's an implicit location computed on the
expression stack, thus needs to be flagged as a stack_value.

For the edge case where the offset is zero and the location could be a register
location, DIExpression::prepend will still generate opcodes, and thus
DW_OP_stack_value must still be added.

Differential Revision: https://reviews.llvm.org/D63429

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364736 91177308-0d34-0410-b5e6-96231b3b80d8

[SimpleLoopUnswitch] Implement handling of prof branch_weights metadata for SwitchInst

Differential Revision: https://reviews.llvm.org/D60606

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364734 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] WLS/LE Code Generation

Backend changes to enable WLS/LE low-overhead loops for armv8.1-m:
1) Use TTI to communicate to the HardwareLoop pass that we should try
   to generate intrinsics that guard the loop entry, as well as setting
   the loop trip count.
2) Lower the BRCOND that uses said intrinsic to an Arm specific node:
   ARMWLS.
3) ISelDAGToDAG the node to a new pseudo instruction:
   t2WhileLoopStart.
4) Add support in ArmLowOverheadLoops to handle the new pseudo
   instruction.

Differential Revision: https://reviews.llvm.org/D63816

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364733 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add more load folding tests for vcvt(t)ps2(u)qq showing missed foldings. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364730 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Improve the type checking fast-isel handling of vector bitcasts.

We had a bunch of vector size legality checks for the source type
based on feature flags, but we didn't check the destination type at
all beyond ensuring that it was a "simple" type. But this allowed
the destination to be i128 which isn't legal.

This commit changes the code to use TLI's isTypeLegal logic in
place of the all the subtarget checks. Then additionally checks
that the source and dest are vectors.

Fixes 42452

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364729 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add a DAG combine to replace vector loads feeding a v4i32->v2f64 CVTSI2FP/CVTUI2FP node with a vzload.

But only when the load isn't volatile.

This improves load folding during isel where we only have vzload
and scalar_to_vector+load patterns. We can't have full vector load
isel patterns for the same volatile load issue.

Also add some missing masked cvtsi2fp/cvtui2fp with vzload patterns.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364728 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add some additional load folding tests to vec_int_to_fp.ll/vec_int_to_fp-widen.ll and disable the peephole pass.

Also copy some missing test cases from vec_int_to_fp.ll to vec_int_to_fp-widen.ll

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364727 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add MOVHPDrm/MOVLPDrm patterns that use VZEXT_LOAD.

We already had patterns that used scalar_to_vector+load. But we can
also have a vzload.

Found while investigating combining scalar_to_vector+load to vzload.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364726 91177308-0d34-0410-b5e6-96231b3b80d8

Clean up MSVC visualization of LLVM pointer types

Create separate natvis ptr and int views for PointerIntPair.
These are convenient in watch Windows and will be used by
Clang visualizers to be checked in shortly

Also, removed deref views as the MSVC na format has
done the same thing natively since MSVC2013.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364723 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] canonicalize fcmp+select to minnum/maxnum intrinsics

This is the opposite direction of D62158 (we have to choose 1 form or the other).
Now that we have FMF on the select, this becomes more palatable. And the benefits
of having a single IR instruction for this operation (less chances of missing folds
based on extra uses, etc) overcome my previous comments about the potential advantage
of larger pattern matching/analysis.

Differential Revision: https://reviews.llvm.org/D62414

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364721 91177308-0d34-0410-b5e6-96231b3b80d8

Cleanup: llvm::bsearch -> llvm::partition_point after r364719

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364720 91177308-0d34-0410-b5e6-96231b3b80d8

[ADT] Implement llvm::bsearch() with std::partition_point()

Summary:
Delete the begin-end form because the standard std::partition_point
can be easily used as a replacement.

The ranges-style llvm::bsearch will be renamed to llvm::partition_point
in the next clean-up patch.

The name "bsearch" doesn't meet people's expectation because in C:

> If two or more members compare equal, which member is returned is unspecified.

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D63718

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364719 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Custom lower AVX masked loads to masked load and vselect instead of selecting a maskmov+vblend during isel.

AVX masked loads only support 0 as the value for masked off elements.
So we need an extra blend to support other values. Previously we
expanded the masked load to two instructions with isel patterns.
With this patch we now insert the vselect during lowering and it
will be separately selected as a blend.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364718 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] Use the memory VT instead of result VT for FoldingSet profiling in getMaskedLoad/getMaskedStore.

This matches what is done by the Profile function. Otherwise CSE
won't work properly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364717 91177308-0d34-0410-b5e6-96231b3b80d8

[LFTR] Rephrase getLoopTest into "based-on" check; NFCI

What we want to know here is whether we're already using this value
for the loop condition, so make the query about that. We can extend
this to a more general "based-on" relationship, rather than a direct
icmp use later.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364715 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] canonicalize fmin/fmax to LLVM intrinsics minnum/maxnum

This transform came up in D62414, but we should deal with it first.
We have LLVM intrinsics that correspond exactly to libm calls (unlike
most libm calls, these libm calls never set errno).
This holds without any fast-math-flags, so we should always canonicalize
to those intrinsics directly for better optimization.
Currently, we convert to fcmp+select only when we have FMF (nnan) because
fcmp+select does not preserve the semantics of the call in the general case.

Differential Revision: https://reviews.llvm.org/D63214

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364714 91177308-0d34-0410-b5e6-96231b3b80d8

[LFTR] Remove unnecessary latch check; NFCI

The whole indvars pass works on loops in simplified form, so there
is always a unique latch. Convert the condition into an assertion
in needsLFTR (though we also assert this in later LFTR functions).

Additionally update the comment on getLoopTest() now that we are
dealing with multiple exits.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364713 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Shift amount reassociation (PR42391)

Summary:
Given pattern:
`(x shiftopcode Q) shiftopcode K`
we should rewrite it as
`x shiftopcode (Q+K)` iff `(Q+K) u< bitwidth(x)`
This is valid for any shift, but they must be identical.

* https://rise4fun.com/Alive/9E2
* exact on both lshr => exact https://rise4fun.com/Alive/plHk
* exact on both ashr => exact https://rise4fun.com/Alive/QDAA
* nuw on both shl => nuw https://rise4fun.com/Alive/5Uk
* nsw on both shl => nsw https://rise4fun.com/Alive/0plg

Should fix [[ https://bugs.llvm.org/show_bug.cgi?id=42391 | PR42391]].

Reviewers: spatel, nikic, RKSimon

Reviewed By: nikic

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63812

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364712 91177308-0d34-0410-b5e6-96231b3b80d8

[IR][Patternmatch] Add m_SpecificInt_ULT() predicate

Summary:
Match an integer or vector with every element unsigned less than the
Threshold. For vectors, this includes constants with undefined elements.

FIXME: is it worth generalizing this to simply take ICmpInst::Predicate?

Reviewers: craig.topper, spatel, nikic

Reviewed By: spatel

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63811

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364711 91177308-0d34-0410-b5e6-96231b3b80d8

[APInt] Fix getBitsNeeded for INT_MIN values

Summary: This patch fixes behaviour of APInt::getBitsNeeded for INT_MIN 10 bits values.

Reviewers: regehr, RKSimon

Reviewed By: RKSimon

Subscribers: grandinj, dexonsmith, kristina, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63691

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364710 91177308-0d34-0410-b5e6-96231b3b80d8

[LFTR] Fix post-inc pointer IV with truncated exit count (PR41998)

Fixes https://bugs.llvm.org/show_bug.cgi?id=41998. Usually when we
have a truncated exit count we'll truncate the IV when comparing
against the limit, in which case exit count overflow in post-inc
form doesn't matter. However, for pointer IVs we don't do that, so
we have to be careful about incrementing the IV in the wide type.

I'm fixing this by removing the IVCount variable (which was
ExitCount or ExitCount+1) and replacing it with a UsePostInc flag,
and then moving the actual limit adjustment to the individual cases
(which are: pointer IV where we add to the wide type, integer IV
where we add to the narrow type, and constant integer IV where we
add to the wide type).

Differential Revision: https://reviews.llvm.org/D63686

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364709 91177308-0d34-0410-b5e6-96231b3b80d8

Partial revert of "[llvm-ar] Document response file support in --help"

This is partial revert of 70a8027c60fe1f95e8a8a1ff6575ebf8778d3544.

The test apparently failed on win32 bots due to the way slashes in
pathnames are handled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364705 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Add some more tests for icmp select

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364703 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: RegBankSelect for update.dpp

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364701 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: RegBankSelect for atomic.inc/atomic.dec

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364699 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: RegBankSelect for some DS intrinsics

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364698 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: RegBankSelect for some easy intrinsics

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364697 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: RegBankSelect for icmp/fcmp intrinsics

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364696 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: RegBankSelect for amdgcn.div.fmas

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364695 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: RegBankSelect for some simple leaf intrinsics

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364694 91177308-0d34-0410-b5e6-96231b3b80d8

[IndVars] Remove a bit of manual constant folding [NFC]

SCEV is more than capable of folding (add x, trunc(0)) to x.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364693 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Add baseline test for packed shufflevector

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364691 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Assembler: support .int16/32/64 directives.

Reviewers: sbc100

Subscribers: dschuff, jgravelle-google, aheejin, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63959

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364689 91177308-0d34-0410-b5e6-96231b3b80d8