Sjoerd Meijer [Thu, 1 Aug 2019 18:21:44 +0000 (18:21 +0000)]
[LV] Tail-Loop Folding
This allows folding of the scalar epilogue loop (the tail) into the main
vectorised loop body when the loop is annotated with a "vector predicate"
metadata hint. To fold the tail, instructions need to be predicated (masked),
enabling/disabling lanes for the remainder iterations.
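As a hedged illustration (assuming clang's corresponding loop pragma, vectorize_predicate(enable), which lowers to this metadata hint), a loop can opt in like so:

    // A minimal sketch, assuming the clang pragma that emits the
    // "vector predicate" loop metadata hint.
    void scale(int n, float *a, float f) {
    #pragma clang loop vectorize_predicate(enable)
      for (int i = 0; i < n; ++i)
        a[i] *= f; // tail iterations run as masked-off lanes of the vector body
    }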
[WebAssembly] Assembler/InstPrinter: support call_indirect type index.
A TYPE_INDEX operand (as used by call_indirect) used to be represented
by the InstPrinter as a symbol (e.g. .Ltype_index0@TYPE_INDEX), which
was a mismatch with the WasmObjectWriter, which expects an unnamed
symbol from which to take the signature before turning it into a
reloc.
There was really no good way to round-trip this information. An earlier
version of this patch tried to attach the signature information using
a .functype, but that ran into trouble when the symbol was re-emitted
without a name; removing the name was itself a giant hack.
The current version changes the assembly syntax to have an inline
signature spec for TYPE_INDEX operands that is always unnamed, which
is much more elegant both in syntax and in implementation (the
assembler is now able to follow the same path as the regular backend).
[Attributor][FIX] Indicate a missing update change
Users of AAReturnedValues need to know if HasOverdefinedReturnedCalls
changed from false to true, as it will impact the result of the return
value traversal (calls are no longer ignored).
Simon Atanasyan [Thu, 1 Aug 2019 16:04:29 +0000 (16:04 +0000)]
[mips] Fix lowering load/store instruction in PIC case
If an operand of the `lw/sw` instructions is a symbol, these instructions
are incorrectly lowered using a non-position-independent chain of
instructions. For PIC code we should use `lw/addiu` instructions with the
`R_MIPS_GOT16` and `R_MIPS_LO16` relocations respectively. Instead, LLVM
generates position-dependent code with the `R_MIPS_HI16` and `R_MIPS_LO16`
relocations.
This patch provides a fix for the bug by handling the PIC case separately
in `MipsAsmParser::expandMemInst`. The main idea is to generate a chain
of PIC instructions that loads the symbol's address into a register and
then loads the content at that address.
The fix is not optimal and does not fix all PIC-related problems. This
is a task for subsequent patches.
Kuba Mracek [Thu, 1 Aug 2019 15:51:14 +0000 (15:51 +0000)]
[llvm-objdump] Fix jumptable detection when disassembling Mach-O binaries
- Add LC_SEGMENT_64 handling in getSectionsAndSymbols to be able to find the base segment address from 64-bit Mach-O binaries.
- Add "data in code" detection into the !symbolTableWorked case, extract it into a separate function.
- Fix uninitialized variable usage on BaseSegmentAddress (initialize to 0).
- Add test.
This adds X86 support for SimplifyMultipleUseDemandedBitsForTargetNode and uses it to allow us to peek through vector insertions to avoid dependencies on entire insertion chains.
Sam Elliott [Thu, 1 Aug 2019 12:42:31 +0000 (12:42 +0000)]
[RISCV] Add Custom Parser for Atomic Memory Operands
Summary:
GCC accepts both (reg) and 0(reg) for atomic instruction memory
operands. These instructions do not allow for an offset in their
encoding, so in the latter case, the 0 is silently dropped.
Due to how we have structured the RISCVAsmParser, the easiest way to add
support for parsing this offset is to add a custom AsmOperand and
parser. This parser drops all the parens, and just keeps the register.
This commit also adds a custom printer for these operands, which matches
the GCC canonical printer, printing both `(a0)` and `0(a0)` as `(a0)`.
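A hedged sketch of what this accepts, using GCC-style inline assembly (the second statement exercises the 0(reg) spelling, which now parses and prints the same as (reg)):

    // A minimal sketch, assuming a RISC-V target with the A extension:
    // both operand spellings below are accepted, and since the encoding
    // has no offset field, the leading 0 is dropped.
    long amo_add(long *p, long v) {
      long out;
      asm volatile("amoadd.d %0, %2, (%1)"  // canonical form
                   : "=r"(out) : "r"(p), "r"(v) : "memory");
      asm volatile("amoadd.d %0, %2, 0(%1)" // prints back as (reg)
                   : "=r"(out) : "r"(p), "r"(v) : "memory");
      return out;
    }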
David Green [Thu, 1 Aug 2019 11:22:03 +0000 (11:22 +0000)]
[ARM] Fix for MVE VREV64
The VREV64 instruction is apparently unpredictable if Qd == Qm, due to the
cross-beat nature of the instruction. This adds an earlyclobber to Qd, which
seems to be the same way we deal with this on other instructions like the
write-back on loads and stores.
[AArch64] Do not allocate unnecessary emergency slot.
Fix an issue where the compiler still allocates an emergency spill slot even
though it already decided to spill an extra callee-save register to use
as a scratch register.
Zi Xuan Wu [Thu, 1 Aug 2019 05:26:02 +0000 (05:26 +0000)]
recommit:[PowerPC] Eliminate loads/swap feeding swap/store for vector type by using big-endian load/store
On PowerPC, there is an instruction that loads a vector in big-endian element order on a little-endian target.
So we can combine a vector load + reverse into a big-endian load and eliminate the swap instruction.
We can likewise combine a vector reverse + store into a big-endian store.
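A hedged sketch of the kind of source pattern this targets (a load whose elements are immediately reversed):

    // A minimal sketch: after the combine, the load plus the element
    // reverse can become a single big-endian-element-order vector load
    // on a little-endian target, so no separate swap is needed.
    typedef int v4si __attribute__((vector_size(16)));
    v4si load_reversed(const v4si *p) {
      v4si v = *p;                                      // vector load
      return __builtin_shufflevector(v, v, 3, 2, 1, 0); // element reverse
    }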
JF Bastien [Thu, 1 Aug 2019 03:30:45 +0000 (03:30 +0000)]
[NFC] Remove obsolete LLVM_GNUC_PREREQ
The current minimum GCC version is 4.8 (soon to be 5.1), we there don't need to check for older versions. While I'm around Compiler.h, also update some of the doxygen comment.
Matt Arsenault [Thu, 1 Aug 2019 03:25:52 +0000 (03:25 +0000)]
AMDGPU: Start redefining atomic PatFrags
Start migrating to a form that will be compatible with the global isel
emitter. This should also fix some overly lax checks on the memory type,
which allowed mis-selecting some illegal atomics.
Create unique, but identically-named ELF sections for explicitly-sectioned functions and globals when using -function-sections and -data-sections.
This allows functions and globals to be reordered later in the linking phase
(using -symbol-ordering-file), even though reordering will be limited to
the scope of the explicit section.
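A hedged example of the source-level situation: both functions below ask for the same explicit section, and with function sections enabled each now gets its own (identically named) ELF section:

    // A minimal sketch: with -ffunction-sections, 'a' and 'b' end up in
    // two distinct sections that are both named ".hot", so a symbol
    // ordering file can reorder them within that section's scope.
    __attribute__((section(".hot"))) void a() {}
    __attribute__((section(".hot"))) void b() {}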
[ScalarizeMaskedMemIntrin] Bitcast the mask to the scalar domain and use scalar bit tests for the branches.
X86 at least is able to use movmsk or kmov to move the mask to the scalar
domain. Then we can just use test instructions to test individual bits.
This is more efficient than extracting each mask element
individually.
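A hedged scalar sketch of the new expansion for a masked load:

    // A minimal sketch: the mask is moved to a scalar register once
    // (e.g. via movmsk/kmov on X86) and each lane is then guarded by a
    // cheap scalar bit test instead of a per-lane vector extract.
    void masked_load(float *dst, const float *src, unsigned mask_bits, int lanes) {
      for (int i = 0; i < lanes; ++i)
        if (mask_bits & (1u << i)) // scalar bit test per lane
          dst[i] = src[i];
    }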
I special cased v1i1 to use the previous behavior. This avoids
poor type legalization of bitcast of v1i1 to i1.
I've skipped expandload/compressstore as I think we need to
handle constant masks for those better first.
Many tests end up with duplicate test instructions due to tail
duplication in the branch folding pass. But the same thing
happens when constructing similar code in C, so it's not unique
to the scalarization.
Not sure if this lowering code will also be good for other targets,
but we're only testing X86 today.
[X86] Add DAG combine to fold any_extend_vector_inreg+truncstore to an extractelement+store
We have custom code that ignores the normal promoting type legalization on less than 128-bit vector types like v4i8 to emit pavgb, paddusb, psubusb since we don't have the equivalent instruction on a larger element type like v4i32. If this operation appears before a store, we can be left with an any_extend_vector_inreg followed by a truncstore after type legalization. When truncstore isn't legal, this will normally be decomposed into shuffles and a non-truncating store. This will then combine away the any_extend_vector_inreg and shuffle leaving just the store. On avx512, truncstore is legal so we don't decompose it and we had no combines to fix it.
This patch adds a new DAG combine to detect this case and emit either an extract_store for 64-bit stores or an extractelement+store for 32- and 16-bit stores. This makes the avx512 codegen match the avx2 codegen for these situations. I'm restricting this to only when -x86-experimental-vector-widening-legalization is false; when we're widening, we're not likely to create this any_extend_vector_inreg+truncstore combination. This means we should be able to remove this code when we flip the default. I would like to flip the default soon, but I need to investigate some performance regressions it's causing in our branch that I wasn't seeing on trunk.
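A hedged example of C++ source that can produce the pattern described above (a sub-128-bit vector of i8 saturating adds followed by a store; the function and types are illustrative only):

    #include <cstdint>
    // A minimal sketch of a saturating byte add over four elements.
    void addus4(uint8_t *a, const uint8_t *b) {
      for (int i = 0; i < 4; ++i) {
        unsigned s = unsigned(a[i]) + b[i];
        a[i] = uint8_t(s > 255 ? 255 : s); // may be matched as paddusb on v4i8
      }
    }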
Michael Berg [Wed, 31 Jul 2019 21:57:28 +0000 (21:57 +0000)]
Migrate some more fadd and fsub cases away from UnsafeFPMath control to utilize NoSignedZerosFPMath options control
Summary: Honoring no signed zeroes is also available as a user control through clang, separately from any fastmath or UnsafeFPMath context; the DAG guards should reflect this.
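A hedged illustration of why the distinct guard matters:

    // A minimal sketch: folding x + 0.0f to x is only legal when signed
    // zeros may be ignored, because (-0.0f) + (+0.0f) is +0.0f under the
    // default rounding mode. Compiling with -fno-signed-zeros (nsz) alone,
    // without full fast-math, should be enough to enable the fold.
    float fold(float x) { return x + 0.0f; }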
Philip Reames [Wed, 31 Jul 2019 21:15:21 +0000 (21:15 +0000)]
[IndVars, RLEV] Support rewriting exit values in loops without known exits (prep work)
This is a preparatory patch for future work on supporting exit value rewriting in loops with a mixture of computable and non-computable exit counts. The intention is to be "mostly NFC" - i.e. not enable any interesting new transforms - but in practice, there are some small output changes.
The test differences are caused by cases where getSCEVAtScope can simplify a single entry phi without needing any knowledge of the loop.
This feature instructs the backend to allow locally defined global variable
addresses to contain a pointer tag in bits 56-63 that will be ignored by
the hardware (i.e. TBI), but may be used by an instrumentation pass such
as HWASAN. It works by adding a MOVK instruction to the regular ADRP/ADD
sequence that sets bits 48-63 to the corresponding bits of the global, with
the linker bounds check disabled on the ADRP instruction to prevent the tag
from causing a link failure.
This implementation of the feature omits the MOVK when loading from or storing
to a global, which is sufficient for TBI. If the same approach is extended
to MTE, assuming that 0 is not configured as a catch-all tag, we will most
likely also need the MOVK in this case in order to avoid a tag mismatch.
SelectionDAG, MI, AArch64: Widen target flags fields/arguments from unsigned char to unsigned.
This makes the field wider than MachineOperand::SubReg_TargetFlags so that
we don't end up silently truncating any higher bits. We should still catch
any bits truncated from the MachineOperand field as a consequence of the
assertion in MachineOperand::setTargetFlags().
Wei Mi [Wed, 31 Jul 2019 19:59:24 +0000 (19:59 +0000)]
[DAGCombine] Limit the number of times the store merging dependence check
can bail out for the same store and root nodes.
We ran into a case where the dependence check in store merging bails out many
times for the same store and root nodes in a huge basic block, increasing
compile time by almost 100x. The patch adds a map to track how many times the
bailout happens for the same store and root, and if it is over a limit, stops
considering the store with the same root as a merging candidate.
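A hedged sketch of the bookkeeping (all names here are hypothetical, not the patch's actual identifiers):

    #include <map>
    #include <utility>

    // A minimal sketch: count bailouts per (store, root) pair and stop
    // considering the store as a merge candidate past a limit.
    static std::map<std::pair<const void *, const void *>, unsigned> BailoutCount;

    bool shouldAbandonCandidate(const void *Store, const void *Root,
                                unsigned Limit) {
      return ++BailoutCount[{Store, Root}] > Limit;
    }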
Summary:
Verify that the incoming defs into phis are the last defs from the
respective incoming blocks.
When moving blocks, insertDef must RenameUses. Adding this verification
makes some GVNHoist tests fail, which uncovered this issue.
gn build: Fix redundant object files in builtin lib.
compiler-rt's builtin library has generic implementations of many
functions, and then per-arch optimized implementations of some.
In the CMake build, both filter_builtin_sources() and an explicit loop
at the end of the build file (see D37166) filter out the generic
versions if a per-arch file is present.
The GN build wasn't doing this filtering. Just do the filtering manually
and explicitly, instead of being clever.
While here, also remove files from the mingw/arm build that are
redundantly listed after D39938 / r318139 (both from the CMake and the
GN build).
While here, also fix a target_os -> target_cpu typo.
Reverse the canonicalization of fneg relative to fmul/fdiv. That makes it
easier to implement the transforms (and possibly other fneg transforms) in
1 place because we can always start the pattern match from fneg (either the
legacy binop or the new unop).
There's a secondary practical benefit seen in PR21914 and PR42681:
https://bugs.llvm.org/show_bug.cgi?id=21914
https://bugs.llvm.org/show_bug.cgi?id=42681
...hoisting fneg rather than sinking seems to play nicer with LICM in IR
(although this change may expose analysis holes in the other direction).
1. The instcombine test changes show the expected neutral IR diffs from
reversing the order.
2. The reassociation tests show that we were missing an optimization
opportunity to fold away fneg-of-fneg (see the sketch after this
list). My reading of IEEE-754 says
that all of these transforms are allowed (regardless of binop/unop
fneg version) because:
"For all other operations [besides copy/abs/negate/copysign], this
standard does not specify the sign bit of a NaN result."
In all of these transforms, we always have some other binop
(fadd/fsub/fmul/fdiv), so we are free to flip the sign bit of a
potential intermediate NaN operand.
(If that interpretation is wrong, then we must already have a bug in
the existing transforms?)
3. The clang tests shouldn't exist as-is, but that's effectively a
revert of rL367149 (the test broke with an extension of the
pre-existing fneg canonicalization in rL367146).
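A hedged illustration of the new canonical direction and the newly folded double negation:

    // A minimal sketch: with fneg hoisted to the root of the expression,
    // (-x) * y is canonicalized to -(x * y), and a double negation folds
    // away entirely (negation only flips the sign bit, so this is exact
    // even for NaN and -0.0).
    float canon(float x, float y) { return -x * y; } // becomes fneg(fmul x, y)
    float doubled(float x) { return -(-x); }         // folds to plain x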
Philip Reames [Wed, 31 Jul 2019 16:24:20 +0000 (16:24 +0000)]
[docs] Reword documentation in terms of SCCs not cycles
Given the example:
header:
  br i1 %c, label %next, label %header
next:
  br i1 %c2, label %exit, label %header
We end up with a loop containing both header and next. Given that, describing the loop in terms of cycles is confusing, since we have multiple distinct cycles within a single Loop. Standardize on the SCC to clarify.
Roman Lebedev [Wed, 31 Jul 2019 15:20:33 +0000 (15:20 +0000)]
[NFC][InstCombine] Add xor-or-icmp tests with icmp having extra uses
Currently InstCombiner::foldXorOfICmps() bails out if the
ICMP it wants to invert has extra uses. As can be seen
in the tests in the previous commit, this is super unfortunate:
this is the single pattern that is left non-canonicalized.
We could analyze whether we can also invert all the uses of said ICMP
at the same time, thus not bailing out there.
I'm not seeing any nicer alternative.
Peter Smith [Wed, 31 Jul 2019 14:42:57 +0000 (14:42 +0000)]
[AARCH64] Switch relocations R_AARCH64_TLS_DTPREL64 and R_AARCH64_TLS_DTPMOD64
The ELF for the Arm 64-bit Architecture document originally specified
R_AARCH64_TLS_DTPREL64 = 0x404
R_AARCH64_TLS_DTPMOD64 = 0x405
LLVM correctly followed the document. Unfortunately in binutils these
two codes were reversed:
R_AARCH64_TLS_DTPMOD64 = 0x404
R_AARCH64_TLS_DTPREL64 = 0x405
Given that binaries had shipped, this has become the de facto standard
interpretation of these relocation codes for any toolchain that wanted to
remain compatible with GNU.
To recognize this the latest version of the ABI document has renamed
the relocations to R_AARCH64_TLS_IMPDEF1 and R_AARCH64_TLS_IMPDEF2
permitting a toolchain to choose between the two relocation types, and
recommending that toolchains follow the GNU interpretation for maximum
compatibility.
Given that upstream llvm has never implemented the standard TLS model for
AArch64, we have no binary legacy, so synchronize with GCC so that we don't
create incompatible objects in the future. So far the only visible change
is in llvm-readobj as it can decode these relocations. This change will
mean that llvm-readobj decodes the same way as GNU readelf.
[IPSCCP] Move callsite check to the beginning of the loop.
We have some code that marks instructions with struct operands as overdefined,
but if the instruction is a call to a function with tracked arguments,
this breaks the assumption that the lattice values of all call sites
are not overdefined and will be replaced by a constant.
This also re-adds the assertion from D65222, additionally skipping
non-callsite uses. This patch should address the reported cases in which
the assertion fired.
Simon Pilgrim [Wed, 31 Jul 2019 12:55:39 +0000 (12:55 +0000)]
[X86][AVX] Ensure chained subvector insertions are the same size (PR42833)
Before combining insert_subvector(insert_subvector(vec, sub0, c0), sub1, c1) patterns, ensure that the subvectors are all the same type. On AVX512 targets especially we might have a mixture of 128-/256-bit subvector insertions.
Simon Pilgrim [Wed, 31 Jul 2019 12:27:47 +0000 (12:27 +0000)]
[X86] Regenerate alias-static-alloca test checks to make D65354 diff easier
I've manually added the stack offsets back as these are worth keeping - we really need a way for update_llc_test_checks.py to not mask out useful address math.