granicus.if.org Git

[X86][BtVer2] Update the WriteLoad latency.

r327630 introduced new write definitions for float/vector loads.
Before that revision, WriteLoad was used by both integer and float (scalar/vector)
loads, so WriteLoad had to conservatively declare a latency of 5cy. That is
because the load-to-use latency for float/vector loads is 5cy.

Now that we have dedicated writes for float/vector loads, there is no reason
to keep the latency of WriteLoad at 5cy. At the moment, WriteLoad is used by
scalar integer loads only; we can assume an optimistic 3cy latency for
them.
This patch changes that latency from 5cy to 3cy, and regenerates the affected
scheduling/mca tests.

Differential Revision: https://reviews.llvm.org/D56922

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351742 91177308-0d34-0410-b5e6-96231b3b80d8

[CostModel][X86] Add XOP icmp cost tests (PR40376)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351741 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-symbolizer] Add -no-demangle as alias for -demangle=false

Summary: Provides -no-demangle as alias for -demangle=false. Motivation: https://bugs.llvm.org/show_bug.cgi?id=40075

Reviewers: jhenderson, ruiu

Reviewed By: jhenderson

Subscribers: erik.pilkington, rupprecht, llvm-commits

Differential Revision: https://reviews.llvm.org/D56773

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351735 91177308-0d34-0410-b5e6-96231b3b80d8

Fix typos throughout the license files that somehow I and my reviewers
all missed!

Thanks to Alex Bradbury for pointing this out, and the fact that I never
added the intended `legacy` anchor to the developer policy. Add that
anchor too. With hope, this will cause the links to all resolve
successfully.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351731 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove and autoupgrade vpmovqd/vpmovwb intrinsics using trunc+select.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351729 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Make getExpressionSize unsigned short

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351727 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Fix warnings in unit test of r351725

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351726 91177308-0d34-0410-b5e6-96231b3b80d8

[SCEV][NFC] Introduces expression sizes estimation

This patch introduces the field `ExpressionSize` in SCEV. This field is
calculated only once on SCEV creation, and it represents the complexity of
this SCEV from an arithmetical point of view (not from the point of the number
of actual different SCEV nodes that are used in the expression). Roughly
speaking, it is the number of operands and operation symbols when we print this
SCEV.

The formal definition is as follows: if SCEV `X` has operands
`Op1`, `Op2`, ..., `OpN`,
then
Size(X) = 1 + Size(Op1) + Size(Op2) + ... + Size(OpN).
The size of SCEVConstant and SCEVUnknown is one.
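
The recurrence above can be sketched outside of LLVM as follows. This is only
an illustration under stated assumptions: `Expr`, `exprSize` and `exampleSize`
are hypothetical names, not the SCEV classes or the `ExpressionSize` field
added by the patch, and the size is recomputed on every call here instead of
being cached once at creation.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for a SCEV node; this is not LLVM's class.
struct Expr {
  std::vector<std::shared_ptr<Expr>> Ops; // empty for constants/unknowns
};

// Size(X) = 1 + Size(Op1) + ... + Size(OpN); a leaf has size 1.
unsigned exprSize(const Expr &X) {
  unsigned Size = 1;
  for (const auto &Op : X.Ops)
    Size += exprSize(*Op);
  return Size;
}

// Builds (a + b) * a, where both uses of `a` share one node.
unsigned exampleSize() {
  auto A = std::make_shared<Expr>();
  auto B = std::make_shared<Expr>();
  auto Add = std::make_shared<Expr>();
  Add->Ops = {A, B};
  Expr Mul;
  Mul.Ops = {Add, A};
  return exprSize(Mul);
}
```

In this sketch `(a + b) * a` uses four distinct nodes but has size five,
because the shared operand `a` is counted once per use. That matches the
"arithmetical" view described above rather than a count of distinct nodes.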

Expression size may be used as a universal way to limit SCEV transformations
for huge SCEVs. Currently, we have a bunch of options that represent various
limits (such as recursion depth limit) that may not make any sense from the
point of view of an LLVM user who is not familiar with SCEV internals, and all
these different options pursue one goal. A more general rule that may
potentially allow us to get rid of this redundancy in options is "do not make
transformations with SCEVs of huge size". It can apply to all SCEV traversals
and transformations that may need to visit a SCEV node more than once, hence
they are prone to combinatorial explosions.

This patch only introduces the SCEV size calculation as NFC; its utilization will
be introduced in follow-up patches.

Differential Revision: https://reviews.llvm.org/D35989
Reviewed By: reames

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351725 91177308-0d34-0410-b5e6-96231b3b80d8

[RISCV] Add R_RISCV_RELAX relocation to all possible relax candidates.

Summary:
Add R_RISCV_RELAX relocation to all possible relax candidates and
update corresponding testcase.

Reviewers: asb, apazos

Differential Revision: https://reviews.llvm.org/D46677

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351723 91177308-0d34-0410-b5e6-96231b3b80d8

[AVR] Insert unconditional branch when inserting MBBs between blocks with fallthrough

This updates the AVR Select8/Select16 expansion code so that, when
inserting the two basic blocks for true and false conditions, any
existing fallthrough on the previous block is preserved.

Prior to this patch, if the block before the Select pseudo fell through
to the subsequent block, two new basic blocks would be inserted at the
prior fallthrough point, changing the fallthrough destination.

The predecessor or successor lists were not updated, causing the
BranchFolding pass at -O1 and above to rearrange basic blocks, which led to
an infinite loop. Not to mention that the unconditional fallthrough to the
true block is incorrect in and of itself.

This patch modifies the Select8/16 expansion so that, if inserting true
and false basic blocks at a fallthrough point, the implicit branch is
preserved by means of an explicit, unconditional branch to the previous
fallthrough destination.

Thanks to Carl Peto for reporting this bug.

This fixes avr-rust bug https://github.com/avr-rust/rust/issues/123.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351721 91177308-0d34-0410-b5e6-96231b3b80d8

[AVR] Enable emission of debug information

Prior to this, the code was missing AVR-specific relocation logic in
RelocVisitor.h.

This patch teaches RelocVisitor about R_AVR_16 and R_AVR_32.

Debug information is emitted in the final object file, and understood by
'avr-readelf --debug-dump' from AVR-GCC.

llvm-dwarfdump is yet to understand how to dump AVR DWARF symbols.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351720 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[AVR] Insert unconditional branch when inserting MBBs between blocks with fallthrough"

This reverts commit r351718.

Carl pointed out that the unit test could be improved.

This patch will be recommitted once the test is made more resilient.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351719 91177308-0d34-0410-b5e6-96231b3b80d8

[AVR] Insert unconditional branch when inserting MBBs between blocks with fallthrough

This updates the AVR Select8/Select16 expansion code so that, when
inserting the two basic blocks for true and false conditions, any
existing fallthrough on the previous block is preserved.

Prior to this patch, if the block before the Select pseudo fell through
to the subsequent block, two new basic blocks would be inserted at the
prior fallthrough point, changing the fallthrough destination.

The predecessor or successor lists were not updated, causing the
BranchFolding pass at -O1 and above to rearrange basic blocks, which led to
an infinite loop. Not to mention that the unconditional fallthrough to the
true block is incorrect in and of itself.

This patch modifies the Select8/16 expansion so that, if inserting true
and false basic blocks at a fallthrough point, the implicit branch is
preserved by means of an explicit, unconditional branch to the previous
fallthrough destination.

Thanks to Carl Peto for reporting this bug.

This fixes avr-rust bug https://github.com/avr-rust/rust/issues/123.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351718 91177308-0d34-0410-b5e6-96231b3b80d8

Tentative fix for r351701 and gcc 6.2 build on ubuntu

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351705 91177308-0d34-0410-b5e6-96231b3b80d8

Add missing test file

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351702 91177308-0d34-0410-b5e6-96231b3b80d8

Replace llvm::isPodLike<...>  by llvm::is_trivially_copyable<...>

As noted in https://bugs.llvm.org/show_bug.cgi?id=36651, the specialization for
isPodLike<std::pair<...>> did not match the expectation of
std::is_trivially_copyable which makes the memcpy optimization invalid.

This patch renames the llvm::isPodLike trait into llvm::is_trivially_copyable.
Unfortunately std::is_trivially_copyable is not portable across compiler / STL
versions. So a portable version is provided too.
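
As a rough illustration of why the trait has to be accurate, here is a minimal
sketch of a copy helper that takes the memcpy fast path only when the standard
trait allows it. `copyImpl` and `copyElements` are hypothetical names, not LLVM
APIs, and the sketch uses std::is_trivially_copyable directly rather than the
portable llvm::is_trivially_copyable that this patch provides.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>

// Fast path: only valid when T really is trivially copyable.
template <typename T>
void copyImpl(T *Dst, const T *Src, std::size_t N, std::true_type) {
  if (N != 0)
    std::memcpy(Dst, Src, N * sizeof(T));
}

// Safe path: element-wise copy assignment.
template <typename T>
void copyImpl(T *Dst, const T *Src, std::size_t N, std::false_type) {
  for (std::size_t I = 0; I != N; ++I)
    Dst[I] = Src[I];
}

// Hypothetical helper: dispatches on the standard trait via tag dispatch.
template <typename T>
void copyElements(T *Dst, const T *Src, std::size_t N) {
  copyImpl(Dst, Src, N, std::is_trivially_copyable<T>());
}
```

If a trait specialization claims "trivially copyable" for a type that is not,
a helper like this would silently pick the memcpy path, which is the kind of
invalid optimization described above.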

Note that the following specializations were invalid:

    std::pair<T0, T1>
    llvm::Optional<T>

Tests have been added to assert that the former specializations are respected by
the standard usage of llvm::is_trivially_copyable, and that when a decent version
of std::is_trivially_copyable is available, llvm::is_trivially_copyable is
compared to std::is_trivially_copyable.

As of this patch, llvm::Optional is no longer considered trivially copyable,
even if T is. This is to be fixed in a later patch, as it has an impact on a
long-running bug (see r347004).

Note that GCC warns about this UB, but this got silenced by https://reviews.llvm.org/D50296.

Differential Revision: https://reviews.llvm.org/D54472

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351701 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Legalize more bitcasts

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351700 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: Add isPointer legality predicates

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351699 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Really legalize exts from i1

A combine was hiding the fact that these tests were
not actually testing what they should, although
they were producing the expected end result.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351698 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Auto upgrade VPCOM/VPCOMU intrinsics to generic integer comparisons

This causes a couple of changes in the upgrade tests, as signed/unsigned eq/ne are equivalent and we constant fold true/false codes. These changes are the same as what we already do for avx512 cmp/ucmp.

Noticed while cleaning up vector integer comparison costs for PR40376.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351697 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: Implement widenScalar for basic FP ops

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351696 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Legalize f32->f16 fptrunc

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351695 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Fix some crashes in g_unmerge_values/g_merge_values

This was crashing in the predicate function assuming the value
is a vector.

Copy more of what AArch64 uses. This probably needs more refinement
later, but I don't exactly understand what it means in some cases,
particularly since any legalization for these seems to be missing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351693 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Regbank select for fpext

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351692 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Cleanup legality for extensions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351691 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Auto upgrade old style VPCOM/VPCOMU intrinsics to generic integer comparisons

We were upgrading these to the new style VPCOM/VPCOMU intrinsics (which includes the condition code immediate), but we'll be getting rid of those shortly, so convert these to generics first.

This causes a couple of changes in the upgrade tests, as signed/unsigned eq/ne are equivalent and we constant fold true/false codes. These changes are the same as what we already do for avx512 cmp/ucmp.

Noticed while cleaning up vector integer comparison costs for PR40376.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351690 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Replace VPCOM/VPCOMU with generic integer comparisons (llvm)

These intrinsics can always be replaced with generic integer comparisons without any regression in codegen, even for -O0/-fast-isel cases.

Noticed while cleaning up vector integer comparison costs for PR40376.

A future commit will remove/autoupgrade the existing VPCOM/VPCOMU llvm intrinsics.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351688 91177308-0d34-0410-b5e6-96231b3b80d8

[CostModel][X86] Add explicit vector select costs

Prior to SSE41 (and sometimes on AVX1), vector select has to be performed as a ((X & C)|(Y & ~C)) bit select.
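
The bit select mentioned above can be modeled in scalar code as follows. This
is a sketch only: `V4` and `bitSelect` are hypothetical names, and the loop
stands in for what would be and / and-not / or vector instructions on a
pre-SSE41 target. It assumes every lane of the condition C is either all-ones
or all-zeros.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Four 32-bit lanes, standing in for a 128-bit vector register.
using V4 = std::array<std::uint32_t, 4>;

// Per lane: take X where C is all-ones and Y where C is all-zeros,
// computed as (X & C) | (Y & ~C).
V4 bitSelect(const V4 &C, const V4 &X, const V4 &Y) {
  V4 R{};
  for (std::size_t I = 0; I != R.size(); ++I)
    R[I] = (X[I] & C[I]) | (Y[I] & ~C[I]);
  return R;
}
```

The three logic operations per select are what make this more expensive than
a single blend instruction, which is the motivation for modelling the cost
explicitly.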

Exposes a couple of issues with the min/max reduction costs (which only go down to SSE42 for some reason).

The increased pre-SSE41 selection costs also prevent a couple of tests from firing any longer, so I've either tweaked the target or added AVX tests alongside the existing SSE2 tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351685 91177308-0d34-0410-b5e6-96231b3b80d8

[CostModel][X86] Add explicit fcmp costs for pre-SSE42 targets

Typical throughputs: cmpss/cmpps = 1cy and cmpsd/cmppd = 2cy before the Core2 era

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351684 91177308-0d34-0410-b5e6-96231b3b80d8

[TTI][X86] Reordered getCmpSelInstrCost cost tables in descending ISA order. NFCI.

Minor tidy-up to make it clearer what's going on before adding additional costs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351683 91177308-0d34-0410-b5e6-96231b3b80d8

[CostModel][X86] Split icmp/fcmp costs tests and test all comparison codes

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351682 91177308-0d34-0410-b5e6-96231b3b80d8

[CostModel][X86] Add masked load/store/gather/scatter tests for SSE2/SSE42/AVX1 targets

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351681 91177308-0d34-0410-b5e6-96231b3b80d8

[CostModel][X86] Add non-constant vselect cost tests

Also add AVX512 costs at the same time

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351680 91177308-0d34-0410-b5e6-96231b3b80d8

[AVR] Remove unneeded XFAILs from the Generic CodeGen tests

These have been in place for quite a while now.

Several bugs have since been fixed, and these tests now pass.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351679 91177308-0d34-0410-b5e6-96231b3b80d8

[AVR] Allow AVR to be explicitly set as the default target triple

This extends the CMake cross compilation logic so that AVR can be set as
the default target triple, and thus the generic codegen tests can be
run.

This used to be possible on AVR; the CMake configuration files have
since been changed.

With this patch, 'cmake -DLLVM_DEFAULT_TARGET_TRIPLE=avr-unknown-unknown' can
be passed on the command line, making the `-mcpu` argument redundant to
'llc' and friends.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351678 91177308-0d34-0410-b5e6-96231b3b80d8

[AVR] Replace two references to ARM's 't2_so_imm' type comments

These were originally introduced in a copy-paste committed in r351526.

The references to 't2_so_imm' have been updated to 'imm_com8' so the
comments are now accurate.

Thanks to Eli Friedman for noticing this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351674 91177308-0d34-0410-b5e6-96231b3b80d8

[AVR] Fix codegen bug in 16-bit loads

Prior to this patch, the AVR::LDWRdPtr instruction was always lowered to
instructions of this pattern:

ld $GPR8, [PTR:XYZ]+
ld $GPR8, [PTR]+1

This has a problem; the [PTR] is incremented in-place once, but never
decremented.

Future uses of the same pointer will use the now clobbered value,
leading to the pointer being incorrect by an offset of one.

This patch modifies the expansion code of the LDWRdPtr pseudo
instruction so that the pointer variable is not silently clobbered in
future uses in the same live range.

Bug first reported by Keshav Kini.

Patch by Kaushik Phatak.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351673 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[AVR] Fix codegen bug in 16-bit loads"

This reverts commit r351544.

In that commit, I mistakenly credited the issue submitter as the patch
author; the actual author is Kaushik Phatak.

The patch will be recommitted immediately with the correct attribution.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351672 91177308-0d34-0410-b5e6-96231b3b80d8

[ConstantMerge] Factor out check for un-mergeable globals, NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351671 91177308-0d34-0410-b5e6-96231b3b80d8

make XFAIL, REQUIRES, and UNSUPPORTED support multi-line expressions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351668 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add masked MCVTSI2P/MCVTUI2P ISD opcodes to model the cvtqq2ps cvtuqq2ps nodes that produce less than 128-bits of results.

These nodes zero the upper half of the result and can't be represented with vselect.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351666 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] [COFF] Implement --only-section

Differential Revision: https://reviews.llvm.org/D56873

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351663 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] [COFF] Implement --only-keep-debug

Differential Revision: https://reviews.llvm.org/D56840

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351662 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] [COFF] Implement --strip-debug

Also remove sections similarly for --strip-all, --discard-all,
--strip-unneeded.

Differential Revision: https://reviews.llvm.org/D56839

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351661 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] [COFF] Add support for removing sections

Differential Revision: https://reviews.llvm.org/D56683

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351660 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] [COFF] Add a testcase for patching the debug directory. NFC.

The debug directory contains the raw file address of itself,
which is updated on write. Add a testcase for this existing
functionality.

Differential Revision: https://reviews.llvm.org/D56876

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351659 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] [COFF] Remove a superfluous namespace qualification. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351658 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] [COFF] Rename a test from .yaml to .test. NFC.

Tests named .yaml aren't executed by default in this directory
(while they are within e.g. LLD).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351657 91177308-0d34-0410-b5e6-96231b3b80d8

Update the coding standards with the new file header.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351652 91177308-0d34-0410-b5e6-96231b3b80d8

Update structured references to the license to the new license.

Since these are intended to be short and succinct, I've used the SPDX
full name. It's human readable, but formally agreed upon and will be
part of the SPDX spec for licenses.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351649 91177308-0d34-0410-b5e6-96231b3b80d8

Update more file headers across all of the LLVM projects in the monorepo
to reflect the new license. These used slightly different spellings that
defeated my regular expressions.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351648 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Simplify cttz/ctlz + icmp ugt/ult

Followup to D55745, this time handling comparisons with ugt and ult
predicates (which are the canonical forms for non-equality predicates).

For ctlz we can convert into a simple icmp, for cttz we can convert
into a mask check.
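
The commit message does not spell out the exact rewrites, so the following is
only a sketch of the underlying identities, checked exhaustively for 8-bit
values. It assumes ctlz/cttz return the bit width (8) for a zero input and
only considers in-range constants C; `ctlz8`, `cttz8` and `identitiesHold` are
hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// Reference 8-bit counts, defined as 8 for a zero input.
unsigned ctlz8(std::uint8_t X) {
  unsigned N = 0;
  for (unsigned Bit = 0x80; Bit != 0 && (X & Bit) == 0; Bit >>= 1)
    ++N;
  return N;
}

unsigned cttz8(std::uint8_t X) {
  unsigned N = 0;
  for (unsigned Bit = 1; Bit != 0x100 && (X & Bit) == 0; Bit <<= 1)
    ++N;
  return N;
}

// Returns true if all four identities hold for every 8-bit X and every
// in-range constant C.
bool identitiesHold() {
  for (unsigned V = 0; V != 256; ++V) {
    std::uint8_t X = static_cast<std::uint8_t>(V);
    for (unsigned C = 0; C != 8; ++C) {
      // ctlz(X) ugt C  <=>  X ult (0x80 >> C): a plain icmp.
      if ((ctlz8(X) > C) != (V < (0x80u >> C)))
        return false;
      // cttz(X) ugt C  <=>  the low C+1 bits of X are zero: a mask check.
      if ((cttz8(X) > C) != ((V & ((2u << C) - 1)) == 0))
        return false;
    }
    for (unsigned C = 1; C != 9; ++C) {
      // ctlz(X) ult C  <=>  X ugt (0xFF >> C): a plain icmp.
      if ((ctlz8(X) < C) != (V > (0xFFu >> C)))
        return false;
      // cttz(X) ult C  <=>  some bit in the low C bits is set: a mask check.
      if ((cttz8(X) < C) != ((V & ((1u << C) - 1)) != 0))
        return false;
    }
  }
  return true;
}
```

The ctlz cases reduce to a single unsigned comparison against a constant,
while the cttz cases reduce to an `and` with a low-bit mask followed by a
comparison against zero.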

Differential Revision: https://reviews.llvm.org/D56355

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351645 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Fix language reference title declaration

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351644 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Fix unused variable warnings in Release builds

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351641 91177308-0d34-0410-b5e6-96231b3b80d8

Remove a period from CREDITS.TXT (testing email change). NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351639 91177308-0d34-0410-b5e6-96231b3b80d8

Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351636 91177308-0d34-0410-b5e6-96231b3b80d8

Convert two more files that were using Windows line endings and remove
a stray single '\r' from one file. These are the last line ending issues
I can find in the files containing parts of LLVM's file headers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351634 91177308-0d34-0410-b5e6-96231b3b80d8

Install new LLVM license structure and new developer policy.

This installs the new developer policy and moves all of the license
files across all LLVM projects in the monorepo to the new license
structure. The remaining projects will be moved independently.

Note that I've left odd formatting and other idiosyncrasies of the
legacy license structure text alone to make the diff easier to read.
Critically, note that we do not in any case *remove* the old license
notice or terms, as that remains necessary until we finish the
relicensing process.

I've updated a few license files that refer to the LLVM license to
instead simply refer generically to whatever license the LLVM project is
under, basically trying to minimize confusion.

This is really the culmination of the work of so many people. Chris led the
community discussions, drafted the policy update and organized the
multi-year string of meetings between lawyers across the community to
figure out the strategy. Numerous lawyers at companies in the community
spent their time figuring out initial answers, and then the Foundation's
lawyer Heather Meeker has done *so* much to help refine and get us ready
here. I could keep going on, but I just want to make sure everyone
realizes what a huge community effort this has been from the beginning.

Differential Revision: https://reviews.llvm.org/D56897

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351631 91177308-0d34-0410-b5e6-96231b3b80d8

Cleanup non-UTF8 characters and some typos I found in these files.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351630 91177308-0d34-0410-b5e6-96231b3b80d8

Enable IPConstantPropagation to work with abstract call sites

This modification of the currently unused inter-procedural constant
propagation pass (IPConstantPropagation) shows how abstract call sites
enable optimization of callback calls alongside direct and indirect
calls. Through minimal changes, mostly dealing with the partial mapping
of callbacks, inter-procedural constant propagation was enabled for
callbacks, e.g., OpenMP runtime calls or pthread_create.

Differential Revision: https://reviews.llvm.org/D56447

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351628 91177308-0d34-0410-b5e6-96231b3b80d8

AbstractCallSite -- A unified interface for (in)direct and callback calls

  An abstract call site is a wrapper that allows direct, indirect, and
  callback calls to be treated the same. If an abstract call site
  represents a direct or indirect call site it behaves like a stripped
  down version of a normal call site object. The abstract call site can
  also represent a callback call, thus the fact that the initially
  called function (=broker) may invoke a third one (=callback callee).
  In this case, the abstract call site hides the middle man, hence the
  broker function. The result is a representation of the callback call,
  inside the broker, but in the context of the original instruction that
  invoked the broker.

  Again, there are up to three functions involved when we talk about
  callback call sites. The caller (1), which invokes the broker
  function. The broker function (2), that may or may not invoke the
  callback callee. And finally the callback callee (3), which is the
  target of the callback call.

  The abstract call site will handle the mapping from parameters to
  arguments depending on the semantic of the broker function. However,
  it is important to note that the mapping is often partial. Thus, some
  arguments of the call/invoke instruction are mapped to parameters of
  the callee while others are not. At the same time, arguments of the
  callback callee might be unknown, thus "null" if queried.

  This patch also introduces !callback metadata, which describes how a
  callback broker maps from parameters to arguments. This metadata is
  directly created by clang for known broker functions, provided through
  source code attributes by the user, or later deduced by analyses.
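
The three roles described above can be illustrated with plain C++. This is a
sketch with made-up names (`caller`, `broker`, `callee`, `Flags`); it shows
only the call structure an abstract call site is meant to model, not the
AbstractCallSite API or the !callback metadata itself.

```cpp
#include <cassert>

// (3) Callback callee: the eventual target of the callback call.
int callee(int *Payload) { return *Payload + 1; }

// (2) Broker: receives a function pointer and a payload, and may or may
// not invoke the callback. Flags is consumed by the broker itself and is
// never forwarded, so the argument-to-parameter mapping is only partial.
int broker(int (*Callback)(int *), int *Payload, int Flags) {
  if (Flags == 0)
    return 0;
  return Callback(Payload);
}

// (1) Caller: contains the call instruction that invokes the broker.
int caller() {
  int Value = 41;
  return broker(callee, &Value, 1);
}
```

Viewed from `caller`, an abstract call site for the callback would relate the
second argument of the `broker` call to the only parameter of `callee`, while
the `Flags` argument has no counterpart in the callee.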

For motivation and additional information please see the corresponding
talk (slides/video)
  https://llvm.org/devmtg/2018-10/talk-abstracts.html#talk20
as well as the LCPC paper
  http://compilers.cs.uni-saarland.de/people/doerfert/par_opt_lcpc18.pdf

Differential Revision: https://reviews.llvm.org/D54498

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351627 91177308-0d34-0410-b5e6-96231b3b80d8

Reapply "[CGP] Check for existing inttoptr before creating new one"

Original commit: r351582

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351626 91177308-0d34-0410-b5e6-96231b3b80d8

[MergeFunc] Allow merging identical vararg functions using aliases

Thanks to Nikita Popov for pointing out this missed case.

This is a follow-up to r351411, which disabled function merging for
vararg functions outright due to a miscompile (see llvm.org/PR40345).

Differential Revision: https://reviews.llvm.org/D56865

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351624 91177308-0d34-0410-b5e6-96231b3b80d8

[HotColdSplit] Mark inherently cold functions as such

If an inherently cold function is found, mark it as cold. For now this
means applying the `cold` and `minsize` attributes.

As a drive-by, revisit and clean up the criteria for considering a
function for splitting. Add tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351623 91177308-0d34-0410-b5e6-96231b3b80d8

[HotColdSplit] Remove a set which tracked split functions (NFC)

Use the begin/end iterator idiom to avoid visiting split functions,
instead of doing a set lookup.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351622 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeExtractor] Emit lifetime markers around reloads of outputs

CodeExtractor permits extracting a region of blocks from a function even
when values defined within the region are used outside of it.

This is typically done by creating an alloca in the original function
and reloading the alloca after a call to the extracted function.

Wrap the reload in lifetime start/end markers to promote stack coloring.

Suggested by Sergei Kachkov!

Differential Revision: https://reviews.llvm.org/D56045

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351621 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "Reapply "[CGP] Check for existing inttoptr before creating new one""

This reverts commit r351618.

Compiler RT + ASAN tests are failing for PowerPC. Not sure
how I would reproduce these on macOS, so reverting (again)
until I can.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351619 91177308-0d34-0410-b5e6-96231b3b80d8

Reapply "[CGP] Check for existing inttoptr before creating new one"

Original commit: r351582

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351618 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r351584: "GlobalISel: Verify g_zextload and g_sextload"

This new assertion triggered on the AArch64 GlobalISel bots. Reverting while it's being investigated.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351617 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Deduplicate static calling convention helpers for code size, NFC

Summary:
Right now we include ${TGT}GenCallingConv.inc once for each instruction
selection method implemented by ${TGT}:
- ${TGT}ISelLowering.cpp
- ${TGT}CallLowering.cpp
- ${TGT}FastISel.cpp

Instead, add a mechanism to tablegen for marking a particular convention
as "External", which causes tablegen to emit into the ::llvm namespace,
instead of as a static helper. This allows us to provide a header to
forward declare it, so we can simply call the function from all the
places it is referenced. Typically the calling convention analyzer is
called indirectly, so it doesn't benefit from inlining.

This saves a bit of final binary size, but mostly just saves object file
size:

before  after   diff   artifact
12852K  12492K  -360K  X86ISelLowering.cpp.obj
4640K   4280K   -360K  X86FastISel.cpp.obj
1704K   2092K   +388K  X86CallingConv.cpp.obj
52448K  52336K  -112K  llc.exe

I didn't collect before numbers for X86CallLowering.cpp.obj, which is
for GlobalISel, but we should save 360K there as well.

This patch applies the strategy to the X86 backend, but there is no
reason it couldn't be applied to the other backends that implement
multiple ISel strategies, like AArch64.

Reviewers: craig.topper, hfinkel, efriedma

Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D56883

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351616 91177308-0d34-0410-b5e6-96231b3b80d8

Use llvm_canonicalize_cmake_booleans for LLVM_LIBXML2_ENABLED [llvm]

r291284 added a nice mechanism to consistently pass CMake on/off toggles to
lit. This change uses it for LLVM_LIBXML2_ENABLED too (which was added around
the same time and doesn't use the new system yet).

Also alphabetically sort the list passed to llvm_canonicalize_cmake_booleans()
in llvm/test/CMakeLists.txt.

No intended behavior change.

Differential Revision: https://reviews.llvm.org/D56912

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351615 91177308-0d34-0410-b5e6-96231b3b80d8

Remove F_modify flag from FileOutputBuffer.

This code is dead. There is no use of the feature in the entire LLVM codebase.

Differential Revision: https://reviews.llvm.org/D56939

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351613 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Legalize more types for select

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351599 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[CGP] Check for existing inttoptr before creating new one"

This reverts commit r351582.

Bots are failing. Reverting this to fix and re-commit later.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351598 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Legalize illegal g_constant

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351596 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: Verify G_BITCAST

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351594 91177308-0d34-0410-b5e6-96231b3b80d8

[elfabi] Add support for reading DT_NEEDED from binaries

This patch gives elfabi the ability to read DT_NEEDED entries from ELF binaries
to populate NeededLibs in TextAPI's ELFStub.

Differential Revision: https://reviews.llvm.org/D55852

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351592 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: Verify G_ICMP/G_FCMP vector types

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351591 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] add more movmsk tests; NFC

The existing tests already show a sub-optimal transform,
but this should make it clear that we can't just match
an 'and' op when creating movmsk instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351590 91177308-0d34-0410-b5e6-96231b3b80d8

Make ThinLTO test run single threaded to try to avoid flakiness

This is to see whether it helps with the flaky bot failures in PR40351.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351589 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Remove llvm.SI.load.const

It's taken 3 years, but now all of the old AMDGPU and SI intrinsics
are finally gone

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351586 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: Verify g_zextload and g_sextload

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351584 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Lower avx512f scatter intrinsics to X86MaskedScatterSDNode instead of going directly to MachineSDNode.

This sends these intrinsics through isel in a much more normal way. This should allow addressing mode matching in isel to make better use of the displacement field.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351583 91177308-0d34-0410-b5e6-96231b3b80d8

[CGP] Check for existing inttoptr before creating new one

Make sure CodeGenPrepare doesn't emit multiple inttoptr instructions of
the same integer value while sinking address computations, but rather
CSEs them on the fly: excessive inttoptr's confuse SCEV into thinking
that related pointers have nothing to do with each other.
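
The "CSE on the fly" idea can be sketched as a get-or-create lookup. This is
only an illustration under stated assumptions: IntValue, PtrCast and CastCache
are hypothetical stand-ins rather than LLVM classes, and the map is an assumed
mechanism; how the actual patch locates an existing inttoptr is not shown in
this message.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <vector>

// Hypothetical stand-ins for IR values; these are not the LLVM classes.
struct IntValue { int Id; };
struct PtrCast { const IntValue *Source; };

class CastCache {
  std::map<const IntValue *, PtrCast *> Existing;
  std::vector<std::unique_ptr<PtrCast>> Storage;

public:
  // Returns the cast already created for V, or creates one on first request,
  // so that the same integer value never gets two separate casts.
  PtrCast *getOrCreate(const IntValue *V) {
    auto It = Existing.find(V);
    if (It != Existing.end())
      return It->second;
    Storage.push_back(std::unique_ptr<PtrCast>(new PtrCast{V}));
    return Existing[V] = Storage.back().get();
  }

  std::size_t numCasts() const { return Storage.size(); }
};
```

Asking twice for the same IntValue hands back the same PtrCast, so a later
analysis sees one pointer instead of two unrelated ones.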

This problem blocks LoadStoreVectorizer from vectorizing some of the
loads / stores in a downstream target.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D56838

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351582 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] Updates for -dag-dump-verbose

Summary:
This patch makes some changes related to -dag-dump-verbose.
Main use case has been when debugging how SelectionDAG is
dealing with debug info (SDDbgValue nodes).

1) We now print the number of DbgValues that are mapped to each
   SDNode.
2) Removed duplicated printing of DebugLoc (nowadays DebugLoc is
   printed also when not using -dag-dump-verbose).
3) Renamed SDDbgValue::dump to SDDbgValue::print, and added a
   new SDDbgValue::dump that will start a new line after calling
   print.
4) SDDbgValue::print now prints "Order", and it also prints
   some additional information when kind is CONST/FRAMEIX/VREG.
5) SelectionDAG::dump() now dumps all SDDbgValue nodes after
   the list of SDNodes (both "regular" and "ByVal" SDDbgValue:s).
   Invalidated nodes are not printed.
6) Prohibit inline printing of SDNode operands that have SDDbgValue
   nodes associated with them.

Reviewers: jmorse, aprantl

Reviewed By: aprantl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D56793

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351581 91177308-0d34-0410-b5e6-96231b3b80d8

Fix the buildbot issue introduced by r351421

The EXPENSIVE_CHECK x86_64 Windows buildbot is failing due to this change. Fix
the map access.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351577 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalISel] Change to range-based invocation of llvm::sort
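
For readers unfamiliar with the range-based form, here is a minimal sketch of
the difference. sortRange and sortedDescending are hypothetical names built on
std::sort for illustration; they are not the llvm::sort implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical range-based wrapper: the caller passes the container itself
// instead of an iterator pair.
template <typename Range> void sortRange(Range &&R) {
  std::sort(std::begin(R), std::end(R));
}

template <typename Range, typename Compare>
void sortRange(Range &&R, Compare Comp) {
  std::sort(std::begin(R), std::end(R), Comp);
}

// Returns the input sorted in descending order.
inline std::vector<int> sortedDescending(std::vector<int> V) {
  // Iterator-pair form would be: std::sort(V.begin(), V.end(), Comp);
  sortRange(V, [](int A, int B) { return A > B; });
  return V;
}
```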

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351574 91177308-0d34-0410-b5e6-96231b3b80d8

[adt] Twine(nullptr) derefs the nullptr. Add a deleted Twine(std::nullptr_t)

Summary:
nullptr can implicitly convert to Twine as Twine(nullptr) in which case it
resolves to Twine(const char *). This constructor derefs the pointer and
therefore doesn't work. Add a Twine(std::nullptr_t) = delete to make it a
compile time error.

It turns out that in-tree usage of Twine(nullptr) is confined to a single
private method in IRBuilder, foldConstant(... const Twine &Name = nullptr).
That method is only ever called with an explicit Name argument, so making the
argument mandatory doesn't cause compile-time or run-time errors.
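
The deleted-overload technique can be sketched in isolation as follows. Label
is a hypothetical string-like class used only for illustration; it is not
llvm::Twine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for a string-like class; not llvm::Twine itself.
class Label {
  std::size_t Length = 0;

public:
  // Dereferences Str, so a null pointer here would be undefined behavior.
  Label(const char *Str) : Length(std::strlen(Str)) {}

  // A nullptr literal has type std::nullptr_t, so overload resolution picks
  // this exact match and the call is rejected at compile time.
  Label(std::nullptr_t) = delete;

  std::size_t size() const { return Length; }
};

static_assert(std::is_constructible<Label, const char *>::value,
              "char pointers and string literals still work");
static_assert(!std::is_constructible<Label, std::nullptr_t>::value,
              "Label(nullptr) no longer compiles");
```

With this in place, Label("abc") still builds and has size 3, while
Label(nullptr) is diagnosed by the compiler instead of crashing at run time.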

Reviewers: jyknight

Reviewed By: jyknight

Subscribers: dexonsmith, kristina, llvm-commits

Differential Revision: https://reviews.llvm.org/D56870

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351572 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] Split very large token factors for chained stores to 64k chunks.

Similar to D55073. Without this change, the DAG combiner crashes on code
with more than 64k of stores in a single basic block that form parallelizable
chains.
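
The splitting step can be sketched as plain chunking of a list. The helper
name splitIntoChunks and the limit passed by the caller are assumptions made
for illustration; the exact limit used by the patch and the way the resulting
chunks are joined back together are not spelled out in this message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Splits Items into consecutive chunks of at most Limit elements each.
// Limit must be non-zero.
template <typename T>
std::vector<std::vector<T>> splitIntoChunks(const std::vector<T> &Items,
                                            std::size_t Limit) {
  assert(Limit > 0 && "chunk limit must be positive");
  std::vector<std::vector<T>> Chunks;
  for (std::size_t I = 0; I < Items.size(); I += Limit) {
    // The last chunk may be shorter than Limit.
    std::size_t End = std::min(I + Limit, Items.size());
    Chunks.emplace_back(Items.begin() + I, Items.begin() + End);
  }
  return Chunks;
}
```

In the setting of this commit, each chunk would presumably feed its own token
factor, with one more node tying the chunk results together.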

No test case, as it would be a very large IR file.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D56740

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351571 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Lower avx2/avx512f gather intrinsics to X86MaskedGatherSDNode instead of going directly to MachineSDNode.

This sends these intrinsics through isel in a much more normal way. This should allow addressing mode matching in isel to make better use of the displacement field.

Differential Revision: https://reviews.llvm.org/D56827

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351570 91177308-0d34-0410-b5e6-96231b3b80d8

[LCSSA] Skip blocks in sub-loops when scanning for uses.

Summary:
Scanning blocks in sub-loops for uses is unnecessary, as they were
already handled while dealing with the containing sub-loop.

This speeds up LCSSA for highly nested loops. For the test case in PR37202, it
halves the time spent in LCSSA. In cases where we won't be able to skip
any blocks, the additional lookup should be negligible.
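
The skip can be sketched as a filter on each block's innermost loop. Loop,
Block and blocksToScan are hypothetical simplifications, not the LLVM types,
and the sketch assumes that every block records the innermost loop containing
it.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-ins for loop-nest data.
struct Loop { const Loop *Parent; };
struct Block {
  const Loop *Innermost; // innermost loop that contains this block
  int Id;
};

// Given all blocks contained in L (sub-loop blocks included), returns the ids
// of the blocks that still need scanning: those whose innermost loop is L.
// Blocks owned by a sub-loop are assumed to have been handled already.
inline std::vector<int> blocksToScan(const Loop &L,
                                     const std::vector<Block> &LoopBlocks) {
  std::vector<int> Ids;
  for (const Block &B : LoopBlocks) {
    if (B.Innermost != &L)
      continue; // belongs to a sub-loop, skip it
    Ids.push_back(B.Id);
  }
  return Ids;
}
```

The extra comparison per block is the "additional lookup" mentioned above; it
pays off when deep nests put most blocks inside sub-loops.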

Time-passes without this patch for test case from PR37202:

  Total Execution Time: 48.5505 seconds (48.5511 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
  10.0822 ( 21.0%)   0.1406 ( 27.0%)  10.2228 ( 21.1%)  10.2228 ( 21.1%)  Loop-Closed SSA Form Pass
  10.0417 ( 20.9%)   0.1467 ( 28.2%)  10.1884 ( 21.0%)  10.1890 ( 21.0%)  Loop-Closed SSA Form Pass #2
   4.2703 (  8.9%)   0.0040 (  0.8%)   4.2742 (  8.8%)   4.2742 (  8.8%)  Unswitch loops
   2.7376 (  5.7%)   0.0229 (  4.4%)   2.7605 (  5.7%)   2.7611 (  5.7%)  Loop-Closed SSA Form Pass #5
   2.7332 (  5.7%)   0.0214 (  4.1%)   2.7546 (  5.7%)   2.7546 (  5.7%)  Loop-Closed SSA Form Pass #3
   2.7088 (  5.6%)   0.0230 (  4.4%)   2.7319 (  5.6%)   2.7324 (  5.6%)  Loop-Closed SSA Form Pass #4
   2.6855 (  5.6%)   0.0236 (  4.5%)   2.7091 (  5.6%)   2.7090 (  5.6%)  Loop-Closed SSA Form Pass #6
   2.1648 (  4.5%)   0.0018 (  0.4%)   2.1666 (  4.5%)   2.1664 (  4.5%)  Unroll loops
   1.8371 (  3.8%)   0.0009 (  0.2%)   1.8379 (  3.8%)   1.8380 (  3.8%)  Value Propagation
   1.8149 (  3.8%)   0.0021 (  0.4%)   1.8170 (  3.7%)   1.8169 (  3.7%)  Loop Invariant Code Motion
   1.6755 (  3.5%)   0.0226 (  4.3%)   1.6981 (  3.5%)   1.6980 (  3.5%)  Loop-Closed SSA Form Pass #7

Time-passes with this patch

  Total Execution Time: 29.9285 seconds (29.9276 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   5.2786 ( 17.7%)   0.0021 (  1.2%)   5.2806 ( 17.6%)   5.2808 ( 17.6%)  Unswitch loops
   4.3739 ( 14.7%)   0.0303 ( 18.1%)   4.4042 ( 14.7%)   4.4042 ( 14.7%)  Loop-Closed SSA Form Pass
   4.2658 ( 14.3%)   0.0192 ( 11.5%)   4.2850 ( 14.3%)   4.2851 ( 14.3%)  Loop-Closed SSA Form Pass #2
   2.2307 (  7.5%)   0.0013 (  0.8%)   2.2320 (  7.5%)   2.2318 (  7.5%)  Loop Invariant Code Motion
   2.0888 (  7.0%)   0.0012 (  0.7%)   2.0900 (  7.0%)   2.0897 (  7.0%)  Unroll loops
   1.6761 (  5.6%)   0.0013 (  0.8%)   1.6774 (  5.6%)   1.6774 (  5.6%)  Value Propagation
   1.3686 (  4.6%)   0.0029 (  1.8%)   1.3716 (  4.6%)   1.3714 (  4.6%)  Induction Variable Simplification
   1.1457 (  3.8%)   0.0010 (  0.6%)   1.1468 (  3.8%)   1.1468 (  3.8%)  Loop-Closed SSA Form Pass #4
   1.1384 (  3.8%)   0.0005 (  0.3%)   1.1389 (  3.8%)   1.1389 (  3.8%)  Loop-Closed SSA Form Pass #6
   1.1360 (  3.8%)   0.0027 (  1.6%)   1.1387 (  3.8%)   1.1387 (  3.8%)  Loop-Closed SSA Form Pass #5
   1.1331 (  3.8%)   0.0010 (  0.6%)   1.1341 (  3.8%)   1.1340 (  3.8%)  Loop-Closed SSA Form Pass #3

Reviewers: davide, efriedma, mzolotukhin

Reviewed By: davide, efriedma

Subscribers: hiraditya, dmgreen, llvm-commits

Differential Revision: https://reviews.llvm.org/D56848

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351567 91177308-0d34-0410-b5e6-96231b3b80d8

[Support] Implement llvm::Registry::iterator via llvm_iterator_facade

Summary:
Among other things, this allows using STL algorithms like 'find_if' over
llvm::Registry.
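
As an illustration of why a conforming iterator matters, here is a minimal
sketch of a hand-written forward iterator over a linked list being passed to
std::find_if. Entry, EntryIterator and findEntry are hypothetical names; the
sketch spells out the members by hand and does not use the facade helper or
llvm::Registry.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical registry entry stored in a singly linked list.
struct Entry {
  std::string Name;
  const Entry *Next;
};

// Forward iterator that declares the standard member types, which is what
// lets <algorithm> functions accept it.
class EntryIterator {
  const Entry *Cur;

public:
  using iterator_category = std::forward_iterator_tag;
  using value_type = Entry;
  using difference_type = std::ptrdiff_t;
  using pointer = const Entry *;
  using reference = const Entry &;

  explicit EntryIterator(const Entry *E = nullptr) : Cur(E) {}

  reference operator*() const { return *Cur; }
  pointer operator->() const { return Cur; }
  EntryIterator &operator++() {
    Cur = Cur->Next;
    return *this;
  }
  EntryIterator operator++(int) {
    EntryIterator Tmp = *this;
    ++*this;
    return Tmp;
  }
  bool operator==(const EntryIterator &O) const { return Cur == O.Cur; }
  bool operator!=(const EntryIterator &O) const { return Cur != O.Cur; }
};

// Looks up an entry by name; returns nullptr when there is no match.
inline const Entry *findEntry(const Entry *Head, const std::string &Name) {
  EntryIterator Begin(Head), End;
  auto It = std::find_if(Begin, End,
                         [&](const Entry &E) { return E.Name == Name; });
  return It == End ? nullptr : &*It;
}
```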

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: kristina, llvm-commits

Differential Revision: https://reviews.llvm.org/D56854

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351566 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Add some missing always-uniform values.

This commit adds some missing intrinsics into the isAlwaysUniform list
for the AMDGPU backend.

Differential Revision: https://reviews.llvm.org/D56845

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351562 91177308-0d34-0410-b5e6-96231b3b80d8

[LTO] Change test/tools/lto/no-bitcode.s requirement from arm to aarch64

Set the test to properly require aarch64 instead of arm. Otherwise, this test fails with LLVM_TARGETS_TO_BUILD='ARM;X86':

bin/llvm-mc: : error: unable to get target for 'arm64-apple-ios7.0.0'

Committed on behalf of @easyaspi314 (Devin)

Differential Revision: https://reviews.llvm.org/D56472

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351560 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAGBuilder] Cleanup InlineAsm Output generation. NFCI.

Defer inline asm's output fixup work until after we've generated the
inline asm node itself. Remove StoresToEmit, IndirectStoresToEmit, and
RetValRegs in favor of using ConstraintOperands.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351558 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] simplify code for SDValue.getOperand(); NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351557 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r351529 "[llvm-objdump][NFC] Improve readability."

msan errors in ELF/strip-all.s.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351556 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU][MC][GFX8+][DISASSEMBLER] Corrected 1/2pi value for 64-bit operands

See bug 39332: https://bugs.llvm.org/show_bug.cgi?id=39332

Reviewers: artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D56794

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351555 91177308-0d34-0410-b5e6-96231b3b80d8

[TTI] Use ConcreteTTI cast in getIntrinsicInstrCost Type variant. NFCI.

Same as we do in the Value variant.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351554 91177308-0d34-0410-b5e6-96231b3b80d8

Reland r351529 "[llvm-objdump][NFC] Improve readability."

`SectionSymbol*` is cast from `void*` to
`std::tuple<uint64_t, StringRef, uint8_t>` in AMDGPUSymbolizer, so it has to
*be* one, not *act like* one.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351553 91177308-0d34-0410-b5e6-96231b3b80d8