granicus.if.org Git

[PowerPC][NFC] Use -mtriple in RUN line, remove target triple in tls.ll

To avoid confusion, especially when -mtriple are also added for PPC32.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370427 91177308-0d34-0410-b5e6-96231b3b80d8

[PPC32] Emit R_PPC_GOT_TPREL16 instead R_PPC_GOT_TPREL16_LO

Unlike ppc64, which has ADDISgotTprelHA+LDgotTprelL pairs,
ppc32 just uses LDgotTprelL32, so it does not make lots of sense to use
_LO without a paired _HA.

Emit R_PPC_GOT_TPREL16 instead R_PPC_GOT_TPREL16_LO to match GCC, and
get better linker relocation check. Note, R_PPC_GOT_TPREL16_{HA,LO}
don't have good linker support:

(a) lld does not support R_PPC_GOT_TPREL16_{HA,LO}.
(b) Top of tree ld.bfd does not support R_PPC_GOT_REL16_HA Initial-Exec -> Local-Exec relaxation:

  // a.o
  addis 3, 3, tsd_tls@got@tprel@ha
  lwz 3, tsd_tls@got@tprel@l(3)
  add 3, 3, tsd_tls@tls
  // b.o
  .section .tdata,"awT"; .globl tsd_tls; tsd_tls:

  // ld/ld-new a.o b.o
  internal error, aborting at ../../bfd/elf32-ppc.c:7952 in ppc_elf_relocate_section

Reviewed By: adalava

Differential Revision: https://reviews.llvm.org/D66925

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370426 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Explicitly list all the always trivially rematerializable instructions.

Add a default with an llvm_unreachable for anything we don't expect.

This seems safer that just blindly returning true for anything
missing from the switch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370424 91177308-0d34-0410-b5e6-96231b3b80d8

DebugInfo: add CodeView register mapping for ARM NT

Add the core registers and NEON registers mapping to the CodeView
register ID. This is sufficient to compile a basic C program with debug
info using CodeView debug info.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370423 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Make __attribute__((used)) not imply export.

Add an WASM_SYMBOL_NO_STRIP flag, so that __attribute__((used)) doesn't
need to imply exporting. When targeting Emscripten, have
WASM_SYMBOL_NO_STRIP imply exporting.

Differential Revision: https://reviews.llvm.org/D62542

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370415 91177308-0d34-0410-b5e6-96231b3b80d8

[Tests] Precommit a few cases where we're missing oppurtunities for block local simplications off assumes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370414 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Support extended mnemonics mffprwz etc.

Summary:
Reported in https://github.com/opencv/opencv/issues/15413.

We have serveral extended mnemonics for Move To/From Vector-Scalar Register Instructions
eg: mffprd,mtfprd etc.

We only support one of them, this patch add the others.

Reviewers: nemanjai, steven.zhang, hfinkel, #powerpc

Reviewed By: hfinkel

Subscribers: wuzish, qcolombet, hiraditya, kbarton, MaskRay, shchenz, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66963

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370411 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][GlobalISel] Select arithmetic extended register patterns

This teaches GISel to select patterns which fold an extend plus optional shift
into the addressing mode. In particular, adds and subs.

Factor out the arith extended register ComplexPatterns in AArch64InstrFormats.td
and create GISel equivalents.

Add some equivalent functions to the ones in AArch64ISelDAGToDAG:

- `selectArithExtendedRegister`
- `narrowExtendRegIfNeeded`
- `getExtendTypeForInst`

`getExtendTypeForInst` includes the checks for loads and stores. This will be
used for WRO addressing modes in loads + stores.

Teach selectCopy to properly handle subregister copies on the same bank in
order to support `narrowExtendRegIfNeeded`. The extended register must be a
GPR32, so we need to support same-bank subregister copies.

Fix a bug in getSubRegForClass which would cause registers on things like
GPR32common to end up getting ssub. Just change the check to look for FPR32
rather than GPR32.

For tests:

- Add select-arith-extended-reg.mir
- Update addsub_ext.ll to include GlobalISel checks

Differential Revision: https://reviews.llvm.org/D66835

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370410 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Don't emit unreachable stack adjustments

Summary:
This is a minor improvement on our past attempts to do this. Fixes
PR43155.

Reviewers: hans

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66905

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370409 91177308-0d34-0410-b5e6-96231b3b80d8

Allow '@' to appear in x86 mingw symbols

Summary:
There is no reason to differ in assembler behavior here between -msvc
and -gnu targets. Without this setting, the text after the '@' is
interpreted as a symbol variable, like foo@IMGREL.

Reviewers: mstorsjo

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66974

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370408 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add possible bswap as widening shuffle test; NFC

Goes with the proposal in D66965.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370407 91177308-0d34-0410-b5e6-96231b3b80d8

Fix the build for MSVC builds using M_PI

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370405 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] combinePMULDQ - pmuldq(x, 0) -> zero vector (PR43159)

ISD::isBuildVectorAllZeros permits undef elements to be present, which means we can't return it as a zero vector. PMULDQ/PMULUDQ is an extending multiply so a multiply by zero of the lower 32-bits should result in a zero 64-bit element.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370404 91177308-0d34-0410-b5e6-96231b3b80d8

[ASan] Version mismatch check follow-up

Follow-up for:
[ASan] Make insertion of version mismatch guard configurable
3ae9b9d5e40d1d9bdea1fd8e6fca322df920754a

This tiny change makes sure that this test passes on our internal bots
as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370403 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Legalize sin/cos

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370402 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] reduce duplicated code; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370399 91177308-0d34-0410-b5e6-96231b3b80d8

Revert [MBP] Disable aggressive loop rotate in plain mode

This reverts r369664 (git commit 51f48295cbe8fa3a44db263b528dd9f7bae7bf9a)

It causes many benchmark regressions, internally and in llvm's benchmark suite.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370398 91177308-0d34-0410-b5e6-96231b3b80d8

Revert enabling MemorySSA.

Breaks sanitizers bots.

Differential Revision: https://reviews.llvm.org/D58311

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370397 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove what little support we had for MPX

-Deprecate -mmpx and -mno-mpx command line options
-Remove CPUID detection of mpx for -march=native
-Remove MPX from all CPUs
-Remove MPX preprocessor define

I've left the "mpx" string in the backend so we don't fail on old IR, but its not connected to anything.

gcc has also deprecated these command line options. https://www.phoronix.com/scan.php?page=news_item&px=GCC-Patch-To-Drop-MPX

Differential Revision: https://reviews.llvm.org/D66669

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370393 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: Don't compute known bits for non-integral GEP

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370392 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopUnrollAndJam] Use Lazy strategy for DTU.

We can also apply the earlier updates to the lazy DTU, instead of
applying them directly.

Reviewers: kuhar, brzycki, asbirlea, SjoerdMeijer

Reviewed By: brzycki, asbirlea, SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D66918

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370391 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: Add maskedValueIsZero and signBitIsZero to known bits

I dropped the DemandedElts since it seems to be missing from some of
the new interfaces, but not others.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370389 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: Add known bits to InstructionSelector

AMDGPU uses this for some addressing mode selection patterns. The
analysis run itself doesn't do anything so it seems easier to just
always require this than adding a way to opt in.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370388 91177308-0d34-0410-b5e6-96231b3b80d8

[MemorySSA & LoopPassManager] Enable MemorySSA as loop dependency. Update tests.

Summary:
I'm not planning to check this in at the moment, but feedback is very welcome, in particular how this affects performance.
The feedback obtains here will guide the next steps towards enabling this.

This patch enables the use of MemorySSA in the loop pass manager.

Passes that currently use MemorySSA:
- EarlyCSE
Passes that use MemorySSA after this patch:
- EarlyCSE
- LICM
- SimpleLoopUnswitch
Loop passes that update MemorySSA (and do not use it yet, but could use it after this patch):
- LoopInstSimplify
- LoopSimplifyCFG
- LoopUnswitch
- LoopRotate
- LoopSimplify
- LCSSA
Loop passes that do *not* update MemorySSA:
- IndVarSimplify
- LoopDelete
- LoopIdiom
- LoopSink
- LoopUnroll
- LoopInterchange
- LoopUnrollAndJam
- LoopVectorize
- LoopReroll
- IRCE

Reviewers: chandlerc, george.burgess.iv, davide, sanjoy, gberry

Subscribers: jlebar, Prazek, dmgreen, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58311

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370384 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalISel][AArch64] Select llvm.aarch64.stxr* intrinsics.

Add a GISelPredicateCode to the stxr_* PatFrags in AArch64InstrAtomics.td.

This allows us to select these intrinsics.

Differential Revision: https://reviews.llvm.org/D65779

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370382 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add tests for bswap disguised as shuffle; NFC

Somewhat motivating case In PR43146:
https://bugs.llvm.org/show_bug.cgi?id=43146

But that's a lot more complicated.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370381 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalISel][AArch64] Use a GISelPredicateCode to select llvm.aarch64.stlxr.*

Remove manual selection code for this intrinsic and use a GISelPredicateCode
instead.

This allows us to fully select this intrinsic without any tricky custom C++
matching.

Differential Revision: https://reviews.llvm.org/D65780

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370380 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][GlobalISel] Select @llvm.aarch64.ldxr.* intrinsics

Same thing as D66897, but for ldxr.* instead. Add a GISelPredicateCode to the
ldxr_* definitions, which allows us to import them.

Add select-ldxr-intrin.mir, and update arm64-ldxr-stxr.ll.

Differential Revision: https://reviews.llvm.org/D66898

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370378 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][GlobalISel] Select @llvm.aarch64.ldaxr.* intrinsics

Add a GISelPredicateCode to ldaxr_*. This allows us to import the patterns for
@llvm.aarch64.ldaxr.*, and thus select them.

Add `isLoadStoreOfNumBytes` for the GISelPredicateCode, since each of these
intrinsics involves the same check.

Add select-ldaxr-intrin.mir, and update arm64-ldxr-stxr.ll.

Differential Revision: https://reviews.llvm.org/D66897

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370377 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] Skip sinking common lifetime markers of `alloca`.

Summary:
- Similar to the workaround in fix of PR30188, skip sinking common
lifetime markers of `alloca`. They are mostly left there after
inlining functions in branches.

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66950

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370376 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC][NFC] Update fp-int-conversions-direct-moves.ll using script

Also add -ppc-asm-full-reg-names,-ppc-vsr-nums-as-vr.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370375 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][SimplifyCFG] 'Safely extract low bits' pattern will also benefit from -phi-node-folding-threshold=3

This is the naive implementation of x86 BZHI/BEXTR instruction:
it takes input and bit count, and extracts low nbits up to bit width.
I.e. unlike shift it does not have any UB when nbits >= bitwidth.
Which means we don't need a while PHI here, simple select will do.
And if it's a select, it should then be trivial to fix codegen
to select it to BEXTR/BZHI.

See https://bugs.llvm.org/show_bug.cgi?id=34704

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370369 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombine] Fix shadow variable warnings. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370365 91177308-0d34-0410-b5e6-96231b3b80d8

DWARFDebugLoc: Make parsing and error reporting more robust

Summary:
While examining this class for possible use in lldb, I noticed two
things:
- it spits out parsing errors directly to stderr
- the loclists parser can incorrectly return valid location lists when
parsing malformed (truncated) data

I improve the stderr situation by making the parseOneLocationList
functions return Expected<T>s. The errors are still dumped to stderr by
their callers, so this is only a partial fix, but it is enough for my
use case, as I intend to parse the locations lists one by one.

I fix the behavior in the truncated scenario by using the newly
introduced DataExtractor Cursor API.

I also add tests for handling the error cases, as they currently have no
coverage.

Reviewers: dblaikie, JDevlieghere, probinson

Subscribers: lldb-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63591

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370363 91177308-0d34-0410-b5e6-96231b3b80d8

[RISCV] Fix callee-saved-gprs.ll test ABIs

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370359 91177308-0d34-0410-b5e6-96231b3b80d8

Allow replaceAndRecursivelySimplify to list unsimplified visitees.

This is part of D65280 and split it to avoid ABI changes on the 9.0
release branch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370355 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Inline emitStoreWithSymOffset and emitLoadWithSymOffset methods. NFC

Both methods `MipsTargetStreamer::emitStoreWithSymOffset` and
`MipsTargetStreamer::emitLoadWithSymOffset` are almost the same and
differ argument names only. These methods are used in the single place
so it's better to inline their code and remove original methods.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370354 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Fix expanding `lw/sw $reg1, symbol($reg2)` instruction

When a "base" in the `lw/sw $reg1, symbol($reg2)` instruction is
a register and generated code is position independent, backend
does not add the "base" value to the symbol address.
```
lw     $reg1, %got(symbol)($gp)
lw/sw  $reg1, 0($reg1)
```

This patch fixes the bug and adds the missed `addu` instruction by
passing `BaseReg` into the `loadAndAddSymbolAddress` routine and handles
the case when the `BaseReg` is the zero register to escape redundant
`move reg, reg` instruction:
```
lw     $reg1, %got(symbol)($gp)
addu   $reg1, $reg1, $reg2
lw/sw  $reg1, 0($reg1)
```

Differential Revision: https://reviews.llvm.org/D66894

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370353 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] Drop leftover "division-by-zero guard" around `@llvm.umul.with.overflow` inverted overflow bit

Summary:
Now that with D65143/D65144 we've produce `@llvm.umul.with.overflow`,
and with D65147 we've flattened the CFG, we now can see that
the guard may have been there to prevent division by zero is redundant.
We can simply drop it:
```
----------------------------------------
Name: no overflow or zero
  %iszero = icmp eq i4 %y, 0
  %umul = smul_overflow i4 %x, %y
  %umul.ov = extractvalue {i4, i1} %umul, 1
  %umul.ov.not = xor %umul.ov, -1
  %retval.0 = or i1 %iszero, %umul.ov.not
  ret i1 %retval.0
=>
  %iszero = icmp eq i4 %y, 0
  %umul = smul_overflow i4 %x, %y
  %umul.ov = extractvalue {i4, i1} %umul, 1
  %umul.ov.not = xor %umul.ov, -1
  %retval.0 = or i1 %iszero, %umul.ov.not
  ret i1 %umul.ov.not

Done: 1
Optimization is correct!
```
Note that this is inverted from what we have in a previous patch,
here we are looking for the inverted overflow bit.
And that inversion is kinda problematic - given this particular
pattern we neither hoist that `not` closer to `ret` (then the pattern
would have been identical to the one without inversion,
and would have been handled by the previous patch), neither
do the opposite transform. But regardless, we should handle this too.
I've filled [[ https://bugs.llvm.org/show_bug.cgi?id=42720 | PR42720 ]].

Reviewers: nikic, spatel, xbolva00, RKSimon

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65151

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370351 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] Drop leftover "division-by-zero guard" around `@llvm.umul.with.overflow` overflow bit

Summary:
Now that with D65143/D65144 we've produce `@llvm.umul.with.overflow`,
and with D65147 we've flattened the CFG, we now can see that
the guard may have been there to prevent division by zero is redundant.
We can simply drop it:
```
----------------------------------------
Name: no overflow and not zero
  %iszero = icmp ne i4 %y, 0
  %umul = umul_overflow i4 %x, %y
  %umul.ov = extractvalue {i4, i1} %umul, 1
  %retval.0 = and i1 %iszero, %umul.ov
  ret i1 %retval.0
=>
  %iszero = icmp ne i4 %y, 0
  %umul = umul_overflow i4 %x, %y
  %umul.ov = extractvalue {i4, i1} %umul, 1
  %retval.0 = and i1 %iszero, %umul.ov
  ret %umul.ov

Done: 1
Optimization is correct!
```

Reviewers: nikic, spatel, xbolva00

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65150

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370350 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] FoldTwoEntryPHINode(): don't bailout on i1 PHI's if we can hoist a 'not' from incoming values

Summary:
As it can be seen in the tests in D65143/D65144, even though we have formed an '@llvm.umul.with.overflow'
and got rid of potential for division-by-zero, the control flow remains, we still have that branch.

We have this condition:
```
  // Don't fold i1 branches on PHIs which contain binary operators
  // These can often be turned into switches and other things.
  if (PN->getType()->isIntegerTy(1) &&
      (isa<BinaryOperator>(PN->getIncomingValue(0)) ||
       isa<BinaryOperator>(PN->getIncomingValue(1)) ||
       isa<BinaryOperator>(IfCond)))
    return false;
```
which was added back in rL121764 to help with `select` formation i think?

That check prevents us to flatten the CFG here, even though we know
we no longer need that guard and will be able to drop everything
but the '@llvm.umul.with.overflow' + `not`.

As it can be seen from tests, we end here because the `not` is being
sinked into the PHI's incoming values by InstCombine,
so we can't workaround this by hoisting it to after PHI.

Thus i suggest that we relax that check to not bailout if we'd get to hoist the `not`.

Reviewers: craig.topper, spatel, fhahn, nikic

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65147

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370349 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Fold '((%x * %y) u/ %x) != %y' to '@llvm.umul.with.overflow' + overflow bit extraction

Summary:
`((%x * %y) u/ %x) != %y` is one of (3?) common ways to check that
some unsigned multiplication (will not) overflow.
Currently, we don't catch it. We could:
```
$ /repositories/alive2/build-Clang-unknown/alive -root-only ~/llvm-patch1.ll
Processing /home/lebedevri/llvm-patch1.ll..

----------------------------------------
Name: no overflow
  %o0 = mul i4 %y, %x
  %o1 = udiv i4 %o0, %x
  %r = icmp ne i4 %o1, %y
  ret i1 %r
=>
  %n0 = umul_overflow i4 %x, %y
  %o0 = extractvalue {i4, i1} %n0, 0
  %o1 = udiv %o0, %x
  %r = extractvalue {i4, i1} %n0, 1
  ret %r

Done: 1
Optimization is correct!

----------------------------------------
Name: no overflow
  %o0 = mul i4 %y, %x
  %o1 = udiv i4 %o0, %x
  %r = icmp eq i4 %o1, %y
  ret i1 %r
=>
  %n0 = umul_overflow i4 %x, %y
  %o0 = extractvalue {i4, i1} %n0, 0
  %o1 = udiv %o0, %x
  %n1 = extractvalue {i4, i1} %n0, 1
  %r = xor %n1, -1
  ret i1 %r

Done: 1
Optimization is correct!

```

Reviewers: nikic, spatel, efriedma, xbolva00, RKSimon

Reviewed By: nikic

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65144

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370348 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Fold '(-1 u/ %x) u< %y' to '@llvm.umul.with.overflow' + overflow bit extraction

Summary:
`(-1 u/ %x) u< %y` is one of (3?) common ways to check that
some unsigned multiplication (will not) overflow.
Currently, we don't catch it. We could:
```
----------------------------------------
Name: no overflow
  %o0 = udiv i4 -1, %x
  %r = icmp ult i4 %o0, %y
=>
  %o0 = udiv i4 -1, %x
  %n0 = umul_overflow i4 %x, %y
  %r = extractvalue {i4, i1} %n0, 1

Done: 1
Optimization is correct!

----------------------------------------
Name: no overflow, swapped
  %o0 = udiv i4 -1, %x
  %r = icmp ugt i4 %y, %o0
=>
  %o0 = udiv i4 -1, %x
  %n0 = umul_overflow i4 %x, %y
  %r = extractvalue {i4, i1} %n0, 1

Done: 1
Optimization is correct!

----------------------------------------
Name: overflow
  %o0 = udiv i4 -1, %x
  %r = icmp uge i4 %o0, %y
=>
  %o0 = udiv i4 -1, %x
  %n0 = umul_overflow i4 %x, %y
  %n1 = extractvalue {i4, i1} %n0, 1
  %r = xor %n1, -1

Done: 1
Optimization is correct!

----------------------------------------
Name: overflow
  %o0 = udiv i4 -1, %x
  %r = icmp ule i4 %y, %o0
=>
  %o0 = udiv i4 -1, %x
  %n0 = umul_overflow i4 %x, %y
  %n1 = extractvalue {i4, i1} %n0, 1
  %r = xor %n1, -1

Done: 1
Optimization is correct!
```

As it can be observed from tests, while simply forming the `@llvm.umul.with.overflow`
is easy, if we were looking for the inverted answer, then more work needs to be done
to cleanup the now-pointless control-flow that was guarding against division-by-zero.
This is being addressed in follow-up patches.

Reviewers: nikic, spatel, efriedma, xbolva00, RKSimon

Reviewed By: nikic, xbolva00

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65143

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370347 91177308-0d34-0410-b5e6-96231b3b80d8

[CostModel] Model all `extractvalue`s as free.

Summary:
As disscussed in https://reviews.llvm.org/D65148#1606412,
`extractvalue` don't actually generate any code,
so we should treat them as free.

Reviewers: craig.topper, RKSimon, jnspaulsson, greened, asb, t.p.northover, jmolloy, dmgreen

Reviewed By: jmolloy

Subscribers: javed.absar, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66098

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370339 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] LiveDebugValues: correctly discriminate kinds of variable locations

The missing line added by this patch ensures that only spilt variable
locations are candidates for being restored from the stack. Otherwise,
register or constant-value information can be interpreted as a spill
location, through a union.

The added regression test replicates a scenario where this occurs: the
stack load from [rsp] causes the register-location DBG_VALUE to be
"restored" to rsi, when it should be left alone. See PR43058 for details.

Un x-fail a test that was suffering from this from a previous patch.

Differential Revision: https://reviews.llvm.org/D66895

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370334 91177308-0d34-0410-b5e6-96231b3b80d8

Fix signed/unsigned comparison warning. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370333 91177308-0d34-0410-b5e6-96231b3b80d8

Fix shadow variable warning. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370332 91177308-0d34-0410-b5e6-96231b3b80d8

[yaml2obj] - Allow placing local symbols after globals.

This allows us to produce broken binaries with local
symbols placed after global in '.dynsym'/'.symtab'

Also, simplifies the code.

Differential revision: https://reviews.llvm.org/D66799

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370331 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-readobj/llvm-readelf] - Report a proper warning when dumping a broken dynamic relocation.

When we have a dynamic relocation with a broken symbol's st_name,
tools report a useless error: "Invalid data was encountered while parsing the file".

After this change we report a warning + "<corrupt>" as a symbol name.

Differential revision: https://reviews.llvm.org/D66734

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370330 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] MVE Masked loads and stores

Masked loads and store fit naturally with MVE, the instructions being easily
predicated. This adds lowering for the simple cases of masked loads and stores.
It does not yet deal with widening/narrowing or pre/post inc.

The llvm masked load intrinsic will accept a "passthru" value, dictating the
values used for the zero masked lanes. In MVE the instructions write 0 to the
zero predicated lanes, so we need to match a passthru that isn't 0 (or undef)
with a select instruction to pull in the correct data after the load.

We also need to do something with unaligned loads/stores. Currently this uses a
similar method used in big endian, using an VLDRB.8 (and potentially a VREV in
BE). This does mean that the predicate mask is converted from, for example, a
v4i1 to a v16i1. The VLDR instructions are defined as using the first bit of
the relevant mask lane, so this could potentially load different results if the
predicate is little odd. As the input is a v4i1 however, I believe this is OK
and all the bits required should be set in the predicate, making the VLDRB.8
load the same data.

Differential Revision: https://reviews.llvm.org/D66534

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370329 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] LiveDebugValues should always revisit backedges if it skips them

The "join" method in LiveDebugValues does not attempt to join unseen
predecessor blocks if their out-locations aren't yet initialized, instead
the block should be re-visited later to see if any locations have changed
validity. However, because the set of blocks were all being "process"'d
once before "join" saw them, that logic in "join" was actually ignoring
legitimate out-locations on the first pass through. This meant that some
invalidated locations were not removed from the head of loops, allowing
illegal locations to persist.

Fix this by removing the run of "process" before the main join/process loop
in ExtendRanges. Now the unseen predecessors that "join" skips truly are
uninitialized, and we come back to the block at a later time to re-run
"join", see the @baz function added.

This also fixes another fault where stack/register transfers in the entry
block (or any other before-any-loop-block) had their tranfers initially
ignored, and were then never revisited. The MIR test added tests for this
behaviour.

XFail a test that exposes another bug; a fix for this is coming in D66895.

Differential Revision: https://reviews.llvm.org/D66663

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370328 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][CodeGen][NFC] Delay `combineIncDecVector()` from DAGCombine to X86DAGToDAGISel

Summary:
We were previously doing it in DAGCombine.
But we also want to do `sub %x, C` -> `add %x, (sub 0, C)` for vectors in DAGCombine.
So if we had `sub %x, -1`, we'll transform it to `add %x, 1`,
which `combineIncDecVector()` will immediately transform back into `sub %x, -1`,
and here we go again...

I've marked this as NFC since not a single test changes,
but since that 'changes' DAGCombine, probably this isn't fully NFC.

Reviewers: RKSimon, craig.topper, spatel

Reviewed By: craig.topper

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62327

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370327 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] (insert_vector_elt (vector_shuffle X, Y), (extract_vector_elt X, N), IdxC) -> (vector_shuffle X, Y)

Summary: This is beneficial when the shuffle is only used once and end up being generated in a few places when some node is combined into a shuffle.

Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66718

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370326 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Masked load and store and predicate tests. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370325 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Shift amount reassociation in bittest: trunc-of-lshr (PR42399)

Summary:
Finally, the fold i was looking forward to :)

The legality check is muddy, i doubt i've groked the full generalization,
but it handles all the cases i care about, and can come up with:
https://rise4fun.com/Alive/26j

I.e. we can perform the fold if **any** of the following is true:
* The shift amount is either zero or one less than widest bitwidth
* Either of the values being shifted has at most lowest bit set
* The value that is being shifted by `shl` (which is not truncated) should have no less leading zeros than the total shift amount;
* The value that is being shifted by `lshr` (which **is** truncated) should have no less leading zeros than the widest bit width minus total shift amount minus one

I strongly suspect there is some better generalization, but i'm not aware of it as of right now.
For now i also avoided using actual `computeKnownBits()`, but restricted it to constants.

Reviewers: spatel, nikic, xbolva00

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66383

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370324 91177308-0d34-0410-b5e6-96231b3b80d8

LegalizeSetCCCondCode - Reduce scope of NeedSwap to fix cppcheck warning. NFCI.

No need for this to be defined outside the only switch case its used in.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370320 91177308-0d34-0410-b5e6-96231b3b80d8

Fix variable set but no used warnings on NDEBUG builds. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370319 91177308-0d34-0410-b5e6-96231b3b80d8

Fix variable set but no used warning on NDEBUG builds. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370317 91177308-0d34-0410-b5e6-96231b3b80d8

[COFF] Add a ResourceSectionRef method for getting the data entry, print it in llvm-readobj

Differential Revision: https://reviews.llvm.org/D66819

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370311 91177308-0d34-0410-b5e6-96231b3b80d8

[COFF] Add a bounds checking helper for iterating a coff_resource_dir_table

Instead of blindly incrementing pointers in llvm-readobj, use this
helper, which does bounds checking against the available section
data.

Differential Revision: https://reviews.llvm.org/D66818

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370310 91177308-0d34-0410-b5e6-96231b3b80d8

[COFF] Fix error handling in ResourceSectionRef

Previously, the expression (Reader.readFoo()) was expanded twice,
triggering asserts as one of the Error types ends up not checked
(and as it was expanded twice, the method would end up called twice
if it failed first).

Differential Revision: https://reviews.llvm.org/D66817

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370309 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-readobj] Print the resource type textually for .res files

This already is done when dumping resources from coff objects.

Differential Revision: https://reviews.llvm.org/D66816

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370308 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-readobj] Remove a leftover string trim operation. NFC.

This became unnecessary in SVN r359153.

Differential Revision: https://reviews.llvm.org/D66815

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370307 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove isel patterns with X86VBroadcast+scalar_to_vector+load.

The DAG should have these as X86VBroadcast+load.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370299 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove some unneeded X86VBroadcast isel patterns that have larger than 128 bit input types.

We should always be shrinking the input to 128 bits or smaller
when the node is created.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370296 91177308-0d34-0410-b5e6-96231b3b80d8

[Attributor] Deduce "noalias" attribute

Summary:
This patch adds very basic deduction for noalias.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Tags: LLVM

Differential Revision: https://reviews.llvm.org/D66207

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370295 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add a DAG combine to combine INSERTPS and VBROADCAST of a scalar load. Remove corresponding isel patterns.

We had an isel pattern to perform this, but its better to
do it in DAG combine as a simplification. This also fixes the lack
of patterns for AVX512 targets.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370294 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Make inline assembly 'x' and 'v' constraints work for f128.

Including a type legalizer fix to make bitcast operand promotion
work correctly when getSoftenedFloat returns f128 instead of i128.

Fixes PR43157

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370293 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopUnroll] Use Lazy strategy for DTU used for MergeBlockIntoPredecessor.

We do not access the DT in the loop, so we do not have to apply updates
eagerly. We can apply them lazyly and flush them after we are done
merging blocks.

As follow-up work, we might be able to use the DTU above as well,
instead of manually updating the DT.

This brings the example from PR43134 from ~100s to ~4s for a relase +
assertions build on my machine.

Reviewers: efriedma, kuhar, asbirlea, brzycki

Reviewed By: kuhar, brzycki

Differential Revision: https://reviews.llvm.org/D66911

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370292 91177308-0d34-0410-b5e6-96231b3b80d8

[ObjectYAML] Fix lifetime issue in dumpDebugLines

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66901

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370289 91177308-0d34-0410-b5e6-96231b3b80d8

[Attributor] Improve messages in iteration verify mode

When we now verify the iteration count we will see the actual count
and the expected count before the assertion is triggered.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370285 91177308-0d34-0410-b5e6-96231b3b80d8

[Attributor][NFC] Add const to map key

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370284 91177308-0d34-0410-b5e6-96231b3b80d8

[Attributor][Fix] Indicate change correctly

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370283 91177308-0d34-0410-b5e6-96231b3b80d8

[Attributor] Fix typo

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370282 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Don't use frame virtual registers

SGPR spills aren't really handled after SILowerSGPRSpills. In order to
directly control what happens if the scavenger needs to spill, the
scavenger needs to be used directly. There is an alternative to
spilling in these contexts anyway since the frame register can be
increment and restored.

This does present another possible issue if spilling is needed for the
unused carry out if an add is needed. I think this can be avoided by
using a scalar add (although that clobbers SCC, which happens anyway).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370281 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel/TableGen: Handle setcc patterns

This is a special case because one node maps to two different G_
instructions, and the operand order is changed.

This mostly enables G_FCMP for AMDPGPU. G_ICMP is still manually
selected for now since it has the SALU and VALU complication to deal
with.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370280 91177308-0d34-0410-b5e6-96231b3b80d8

Add requirement to test.

-debug-only option for llc is only available in debug builds so
"REQUIRES: asserts" is needed in the tes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370279 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Fix a couple isel patterns to not shrink a volatile load.

Also add a FIXME because I'm not sure why these patterns exist. Looks like a missing combine.

And another FIXME because the AVX512 equivalent one of the patterns is missing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370276 91177308-0d34-0410-b5e6-96231b3b80d8

[RISCV] Avoid generating AssertZext for LP64 ABI when lowering floating LibCall

The patch fixed the issue that RV64 didn't clear the upper bits
when return complex floating value with lp64 ABI.

float _Complex
complex_add(float _Complex a, float _Complex b)
{
return a + b;
}

RealResult = zero_extend(RealA + RealB)
ImageResult = ImageA + ImageB
Return (RealResult | (ImageResult << 32))

The patch introduces shouldExtendTypeInLibCall target hook to suppress
the AssertZext generation when lowering floating LibCall.

Thanks to Eli's comments from the Bugzilla
https://bugs.llvm.org/show_bug.cgi?id=42820

Differential Revision: https://reviews.llvm.org/D65497

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370275 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Add atomic.fence instruction

Summary:
This adds `atomic.fence` instruction:
https://github.com/WebAssembly/threads/blob/master/proposals/threads/Overview.md#fence-operator

And we now emit the new `atomic.fence` instruction for multithread
fences, rather than the prevous `atomic.rmw` hack.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, jfb, tlively, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66794

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370272 91177308-0d34-0410-b5e6-96231b3b80d8

[LLVM-C] Fix omission of INSTALL_WITH_TOOLCHAIN to llvm_add_library()

Due to a misstake with r365902 that tried to simplify the install with
toolchain logic LLVM-C.dll was no longer being installed.

Patch By: Jakob Bornecrantz

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370271 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Add an empty line to separate different patterns. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370269 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Fix 64-bit address loading in case of applying 32-bit mask to the result

If result of 64-bit address loading combines with 32-bit mask, LLVM
tries to optimize the code and remove "redundant" loading of upper
32-bits of the address. It leads to incorrect code on MIPS64 targets.

MIPS backend creates the following chain of commands to load 64-bit
address in the `MipsTargetLowering::getAddrNonPICSym64` method:
```
(add (shl (add (shl (add %highest(sym), %higher(sym)),
                    16),
               %hi(sym)),
          16),
     %lo(%sym))
```

If the mask presents, LLVM decides to optimize the chain of commands. It
really does not make sense to load upper 32-bits because the 0x0fffffff
mask anyway clears them. After removing redundant commands we get this
chain:
```
(add (shl (%hi(sym), 16), %lo(%sym))
```

There is no patterns matched `(MipsHi (i64 symbol))`. Due a bug in `SYM_32`
predicate definition, backend incorrectly selects a pattern for a 32-bit
symbols and uses the `lui` instruction for loading `%hi(sym)`.

As a result we get incorrect set of instructions with unnecessary 16-bit
left shifting:
```
lui     at,0x0
    R_MIPS_HI16     foo
dsll    at,at,0x10
daddiu  at,at,0
    R_MIPS_LO16     foo
```

This patch resolves two problems:
- Fix `SYM_32/SYM_64` predicates to prevent selection of patterns dedicated
  to 32-bit symbols in case of using N64 ABI.
- Add missed patterns for 64-bit symbols for `%hi/%lo`.

Fix PR42736.

Differential Revision: https://reviews.llvm.org/D66228

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370268 91177308-0d34-0410-b5e6-96231b3b80d8

Add tie-breaker for register class sorting in getSuperRegForSubReg

llvm::stable_sort is apparently not sufficient.

Use the same tie-breaker/sorting style as TopoOrderRC fix bot failures.

E.g.

http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/19401/steps/test-check-all/logs/stdio

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370267 91177308-0d34-0410-b5e6-96231b3b80d8

Fix for "DICompileUnit not listed in llvm.dbg.cu" verification error after ...

...cloning a function from a different module

Currently when a function with debug info is cloned from a different module, the
cloned function may have hanging DICompileUnits, so that the module with the
cloned function fails debug info verification.

The proposed fix inserts all DICompileUnits reachable from the cloned function
to "llvm.dbg.cu" metadata operands of the cloned function module.

Reviewed By: aprantl, efriedma

Differential Revision: https://reviews.llvm.org/D66510

Patch by Oleg Pliss (Oleg.Pliss@azul.com)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370265 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-readobj][XCOFF][NFC] Add return statement to avoid -Wimplicit-fallthrough warning

This is to fix the commit in r370097.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370260 91177308-0d34-0410-b5e6-96231b3b80d8

[ASan] Make insertion of version mismatch guard configurable

By default ASan calls a versioned function
`__asan_version_mismatch_check_vXXX` from the ASan module constructor to
check that the compiler ABI version and runtime ABI version are
compatible. This ensures that we get a predictable linker error instead
of hard-to-debug runtime errors.

Sometimes, however, we want to skip this safety guard. This new command
line option allows us to do just that.

rdar://47891956

Reviewed By: kubamracek

Differential Revision: https://reviews.llvm.org/D66826

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370258 91177308-0d34-0410-b5e6-96231b3b80d8

Ignore object files that lack coverage information.

Before this change, if multiple binary files were presented, all of them must have been instrumented or the load would fail with coverage_map_error::no_data_found.

Patch by Dean Sturtevant.

Differential Revision: https://reviews.llvm.org/D66763

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370257 91177308-0d34-0410-b5e6-96231b3b80d8

Use the handle --check-prefixes mechanism to de-verbosify a couple atomics tests [NFC]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370256 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalISel] Import patterns containing SUBREG_TO_REG

Reuse the logic for INSERT_SUBREG to also import SUBREG_TO_REG patterns.

- Split `inferSuperRegisterClass` into two functions, one which tries to use
  an existing TreePatternNode (`inferSuperRegisterClassForNode`), and one that
  doesn't. SUBREG_TO_REG doesn't have a node to leverage, which is the cause
  for the split.

- Rename GlobalISelEmitterInsertSubreg.td to GlobalISelEmitterSubreg.td and
  update it.

- Update impacted tests in the AArch64 and X86 backends.

This is kind of a hit/miss for code size improvements/regressions. E.g. in
add-ext.ll, we now get some identity copies. This isn't really anything the
importer can handle, since it's caused by a later pass introducing the copy for
the sake of correctness.

Differential Revision: https://reviews.llvm.org/D66769

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370254 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r370249

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370251 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Fix bug when calculating user_spgr_count for Code Object V3 assembler

Stop counting explicitly disabled user_spgr's in the user_sgpr_count field of the kernel descriptor.

Differential Revision: https://reviews.llvm.org/D66900

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370250 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] clean up wrap propagation for reassociated ops; NFCI

Always true/false checks were flagged by static analysis;
https://bugs.llvm.org/show_bug.cgi?id=43143

I have not confirmed the logic difference in propagating nsw vs. nuw,
but presumably we would have noticed a bug by now if that was wrong.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370248 91177308-0d34-0410-b5e6-96231b3b80d8

[ValueMapper] NFC: Remove dead code to pause metadata mapping

Summary:
This functionality was added when Mapper::mapMetadata was recursive. It
is no longer needed after r265456, which switched it to be iterative.

Reviewers: dexonsmith, srhines

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66860

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370236 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][ReleaseNotes] Add a note about the switch to widening legalization for narrow vectors.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370233 91177308-0d34-0410-b5e6-96231b3b80d8

[Attributor] Regularly clear dependences to remove spurious ones

As dependences between abstract attributes can become stale, e.g., if
one was sufficient to imply another one at some point but it has since
been wakened to the point it is not usable for the formerly implied one.
To weed out spurious dependences, and thereby eliminate unneeded
updates, we introduce an option to determine how often the dependence
cache is cleared and recomputed during the fixpoint iteration.

Note that the initial value was determined such that we see a positive
result on our tests.

Differential Revision: https://reviews.llvm.org/D63315

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370230 91177308-0d34-0410-b5e6-96231b3b80d8

[FPEnv] Add fptosi and fptoui constrained intrinsics.

This implements constrained floating point intrinsics for FP to signed and
unsigned integers.

Quoting from D32319:
The purpose of the constrained intrinsics is to force the optimizer to
respect the restrictions that will be necessary to support things like the
STDC FENV_ACCESS ON pragma without interfering with optimizations when
these restrictions are not needed.

Reviewed by: Andrew Kaylor, Craig Topper, Hal Finkel, Cameron McInally, Roman Lebedev, Kit Barton
Approved by: Craig Topper
Differential Revision: http://reviews.llvm.org/D63782

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370228 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][GlobalISel] Fall back when translating musttail calls

These are currently translated as normal functions calls in AArch64.

Until we have proper tail call lowering, we shouldn't translate these.

Differential Revision: https://reviews.llvm.org/D66842

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370225 91177308-0d34-0410-b5e6-96231b3b80d8

Reduce scope of variable only used in a local pattern match. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370224 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Added more tests for D66651

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370222 91177308-0d34-0410-b5e6-96231b3b80d8