
Fix build bot after r338521

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338522 91177308-0d34-0410-b5e6-96231b3b80d8

[SystemZ, TableGen] Fix shift count handling

The DAG combiner logic to simplify AND masks in shift counts is invalid.
While it is true that the SystemZ shift instructions ignore all but the
low 6 bits of the shift count, it is still invalid to simplify the AND
masks while the DAG still uses the standard shift operators (which are
*not* defined to match the SystemZ instruction behavior).
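
To illustrate the distinction, here is a minimal C++ model (a sketch, not code from the patch). `hwShiftLeft32` is a hypothetical stand-in for a shift instruction that reads only the low 6 bits of its count, and `genericShiftLeft32` models a generic 32-bit shift as having no defined result once the count reaches the bit width.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical model of a 32-bit hardware shift that only looks at the low
// 6 bits of the count (so counts 32..63 shift every bit out).
uint32_t hwShiftLeft32(uint32_t X, uint64_t Count) {
  uint64_t Amt = Count & 63;
  return Amt >= 32 ? 0u : X << Amt;
}

// Model of a generic 32-bit shift: no defined result once the count
// reaches the bit width.
std::optional<uint32_t> genericShiftLeft32(uint32_t X, uint64_t Count) {
  if (Count >= 32)
    return std::nullopt;
  return X << Count;
}
```

In this model the hardware shift gives the same answer with or without the `& 63`, but the generic shift is only defined while the mask is present. That is the reason the mask can only be dropped once the node is already the target instruction.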

Instead, this patch performs equivalent operations during instruction
selection. For completely removing the AND, this now happens via
additional DAG match patterns implemented by a multi-alternative
PatFrags. For simplifying a 32-bit AND to a 16-bit AND, the existing DAG
patterns were already mostly OK, they just needed an output XForm to
actually truncate the immediate value.

Unfortunately, the latter change also exposed a bug in TableGen: it
seems XForms are currently only handled correctly for direct operands of
the outermost operation node. This patch also fixes that bug by simply
recursing through the whole pattern. This should be NFC for all other
targets.

Differential Revision: https://reviews.llvm.org/D50096

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338521 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Use isNullConstant helper. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338516 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca][x86] Add STC + STD instruction resource tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338514 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] Improve code comments. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338513 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] Remove ambiguity to fix Windows bots

Should fix the MSVC bots by explicitly invoking
llvm::make_reverse_iterator to remove ambiguity with
std::make_reverse_iterator.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338511 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] Improve consistency in DWARFDie.h (NFC)

Follow-up for r338506 with some unrelated changes in formatting and
consistency.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338509 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Improved sched models for X86 BT*rr instructions.
Differential Revision: https://reviews.llvm.org/D49243

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338507 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] Have custom std::reverse_iterator<DWARFDie>

The DWARFDie is a lightweight utility wrapper that stores a pointer to a
compile unit and a debug info entry. Currently, its iterator (used for
walking over its children) stores a DWARFDie and returns a const
reference when dereferencing it.

When the iterator is modified (by incrementing or decrementing it), this
reference becomes invalid. This was happening when calling reverse on
it, because the std::reverse_iterator is keeping a temporary copy of the
iterator (see
https://en.cppreference.com/w/cpp/iterator/reverse_iterator for a good
illustration).

The relevant code in libcxx:

reference operator*() const {_Iter __tmp = current; return *--__tmp;}

When dereferencing the reverse iterator, we decrement and return a
reference to a DWARFDie stored in the stack frame of this function,
resulting in UB at runtime.

This patch specializes std::reverse_iterator for DWARFDie to do the
right thing.
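
The hazard can be sketched outside of LLVM as follows. `StashingIter` is a hypothetical iterator that, like the one described above, owns the value it yields and hands out a reference to it. `derefReverseByValue` mirrors the shape of the quoted libcxx code but returns by value, which is one possible way to avoid the dangling reference; it is not claimed to be what the patch does.

```cpp
#include <cassert>

// Hypothetical stand-in for an iterator that stores the value it yields.
struct StashingIter {
  int Value;
  // The returned reference points into *this, so it dies with the iterator.
  const int &operator*() const { return Value; }
  StashingIter &operator--() {
    --Value;
    return *this;
  }
};

// Same shape as the reverse_iterator dereference quoted above, except that
// the result is copied out before the temporary iterator is destroyed.
int derefReverseByValue(StashingIter Current) {
  StashingIter Tmp = Current;
  return *--Tmp;
}
```

Had `derefReverseByValue` returned `const int &` instead, the reference would refer to `Tmp`, which no longer exists once the function returns.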

Differential revision: https://reviews.llvm.org/D49679

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338506 91177308-0d34-0410-b5e6-96231b3b80d8

[MIPS GlobalISel] Select global address

Select G_GLOBAL_VALUE for position dependent code.

Patch by Petar Avramovic.

Differential Revision: https://reviews.llvm.org/D49803

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338499 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "Enrich inline messages", tests fail

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338496 91177308-0d34-0410-b5e6-96231b3b80d8

Add llvm-rc to LLVM_TOOLCHAIN_TOOLS (PR38386)

This means it will also be installed in builds configured with
LLVM_INSTALL_TOOLCHAIN_ONLY, such as the Windows packages.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338495 91177308-0d34-0410-b5e6-96231b3b80d8

Enrich inline messages

Summary:
This patch improves Inliner to provide causes/reasons for negative inline decisions.
1. It adds one new message field to InlineCost to report causes for Always and Never instances. All Never and Always instantiations must provide a simple message.
2. Several functions that used to return the inlining results as boolean are changed to return InlineResult which carries the cause for negative decision.
3. Changed remark printing and debug output messages to provide the additional messages and related inline cost.
4. Adjusted tests for changed printing.
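
As a rough sketch of the idea in point 2, a result type can behave like the old boolean while also carrying the reason for a negative decision. `DecisionSketch` and `canInlineSketch` are hypothetical names used only for illustration; they are not the types or functions from the patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical result type: usable in conditions like the old bool return
// value, but it also remembers why a negative decision was made.
class DecisionSketch {
  const char *Reason = nullptr; // null means success

public:
  DecisionSketch() = default;
  DecisionSketch(const char *Why) : Reason(Why) {}
  explicit operator bool() const { return Reason == nullptr; }
  const char *reason() const { return Reason; }
};

// Hypothetical check that would previously have returned a plain bool.
DecisionSketch canInlineSketch(bool CalleeIsRecursive) {
  if (CalleeIsRecursive)
    return "recursive call";
  return {};
}
```

Callers that only test the result in a condition keep working, while remark and debug output can additionally print `reason()`.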

Patch by: yrouban (Yevgeny Rouban)

Reviewers: craig.topper, sammccall, sgraenitz, NutshellySima, shchenz, chandlerc, apilipenko, javed.absar, tejohnson, dblaikie, sanjoy, eraman, xbolva00

Reviewed By: tejohnson, xbolva00

Subscribers: xbolva00, llvm-commits, arsenm, mehdi_amini, eraman, haicheng, steven_wu, dexonsmith

Differential Revision: https://reviews.llvm.org/D49412

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338494 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Disallow the MachO specific .loh directive for windows

Also add a test for it being unsupported for linux.

Differential Revision: https://reviews.llvm.org/D49929

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338493 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] When looking for (CMOV C-1, (ADD (CTTZ X), C), (X != 0)) -> (ADD (CMOV (CTTZ X), -1, (X != 0)), C), make sure we really have a compare with 0.

It's not strictly required by the transform of the cmov and the add, but it makes sure we restrict it to the cases we know we want to match.

While there canonicalize the operand order of the cmov to simplify the matching and emitting code.
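
The arithmetic behind the rewrite can be checked with a small model. This is only a sketch: `cttz32`, `selectThenNothing` and `selectThenAdd` are hypothetical helpers, and unsigned wraparound stands in for the two's complement behavior of the machine add.

```cpp
#include <cassert>
#include <cstdint>

// Portable count-trailing-zeros; only called with X != 0.
uint32_t cttz32(uint32_t X) {
  uint32_t N = 0;
  while ((X & 1u) == 0) {
    X >>= 1;
    ++N;
  }
  return N;
}

// Original form: select between C - 1 and cttz(X) + C on X != 0.
uint32_t selectThenNothing(uint32_t X, uint32_t C) {
  return X != 0 ? cttz32(X) + C : C - 1;
}

// Rewritten form: select between -1 and cttz(X) on X != 0, then add C.
uint32_t selectThenAdd(uint32_t X, uint32_t C) {
  uint32_t Sel = X != 0 ? cttz32(X) : uint32_t(-1);
  return Sel + C;
}
```

Both forms agree because `-1 + C` equals `C - 1` in modular arithmetic.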

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338492 91177308-0d34-0410-b5e6-96231b3b80d8

[DWARF] Basic support for producing DWARFv5 .debug_addr section

This revision implements support for generating DWARFv5 .debug_addr section.
The implementation is pretty straightforward: we just check the DWARF version
and emit the section header if needed.

Reviewers: aprantl, dblaikie, probinson

Reviewed by: dblaikie

Differential Revision: https://reviews.llvm.org/D50005

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338487 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] fold extracting from std::pair (1/2)

This patch intends to enable jump threading when a method whose return type is std::pair<int, bool> or std::pair<bool, int> is inlined.
For example, jump threading does not happen for the if statement in func.

std::pair<int, bool> callee(int v) {
  int a = dummy(v);
  if (a) return std::make_pair(dummy(v), true);
  else return std::make_pair(v, v < 0);
}

int func(int v) {
  std::pair<int, bool> rc = callee(v);
  if (rc.second) {
    // do something
  }
}

SROA executed before the method inlining replaces std::pair by i64 without splitting in both callee and func, since at this point no access to the individual fields is visible to SROA.
After inlining, jump threading fails to identify that the incoming value is a constant due to additional instructions (like or, and, trunc).

This series of patches adds patterns in InstructionSimplify to fold extraction of members of std::pair. To help jump threading, we actually need to optimize the code sequence spanning multiple BBs.
These patches do not handle phi by themselves, but the additional patterns help the NewGVN pass, which calls instsimplify to check opportunities for simplifying instructions over phi, to apply the phi-of-ops optimization and so enable successful jump threading.
SimplifyDemandedBits in InstCombine can do more general optimization, but this patch aims to provide opportunities for other optimizers by supporting a simple but common case in InstSimplify.

This first patch in the series handles code sequences that merge two values using shl and or and then extract one value using lshr.
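
The shape of such a sequence can be sketched in plain C++. The helper names are hypothetical, and which member lands in which half of the 64-bit value is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Merge two 32-bit values into one 64-bit integer using shl and or.
uint64_t packPair(uint32_t Hi, uint32_t Lo) {
  return (uint64_t(Hi) << 32) | uint64_t(Lo);
}

// Extract the upper value again with a logical shift right.
uint32_t extractHi(uint64_t Packed) {
  return uint32_t(Packed >> 32);
}
```

Since `Lo` occupies only the low 32 bits, `extractHi(packPair(Hi, Lo))` is always `Hi`, which is the kind of fact the fold lets later passes see directly.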

Differential Revision: https://reviews.llvm.org/D48828

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338485 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] Fix build failure in clang-x86_64-linux-selfhost-modules.

Only generate symbol difference expression if needed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338484 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Adding more test patterns for lea-opt (PR37939)

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50128

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338483 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] Fix a really subtle miscompile due to a somewhat glaring bug in
EFLAGS copy lowering.

If you have a branch of LLVM, you may want to cherrypick this. It is
extremely unlikely to hit this case empirically, but it will likely
manifest as an "impossible" branch being taken somewhere, and will be
... very hard to debug.

Hitting this requires complex conditions living across complex control
flow combined with some interesting memory (non-stack) initialized with
the results of a comparison. Also, because you have to arrange for an
EFLAGS copy to be in *just* the right place, almost anything you do to
the code will hide the bug. I was unable to reduce anything remotely
resembling a "good" test case from the place where I hit it, and so
instead I have constructed synthetic MIR testing that directly exercises
the bug in question (as well as the good behavior for completeness).

The issue is that we would mistakenly assume any SETcc with a valid
condition and an initial operand that was a register and a virtual
register at that to be a register *defining* SETcc...

It isn't though....

This would in turn cause us to test some other bizarre register,
typically the base pointer of some memory. Now, testing this register
and using that to branch on doesn't make any sense. It even fails the
machine verifier (if you are running it) due to the wrong register
class. But it will make it through LLVM, assemble, and it *looks*
fine... But wow do you get a very unusual and surprising branch taken in
your actual code.

The fix is to actually check what kind of SETcc instruction we're
dealing with. Because there are a bunch of them, I just test the
may-store bit in the instruction. I've also added an assert for sanity
that ensures we are, in fact, *defining* the register operand. =D

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338481 91177308-0d34-0410-b5e6-96231b3b80d8

[x86/slh] Add unwind info to several tests to make it more obvious that
we aren't incorrectly generating any of it when doing SLH.

There was a bug that only occurred with SLH that very much looked like it
could be caused by bad unwind info, and so this was a prime suspect.
Turns out that everything is fine, but this way we'll *see* if we end
up, for example, putting things we shouldn't inside the prolog.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338480 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] Generate fixups when emitting DWARF .debug_line.

Fixups must be generated in .debug_line when relaxation is enabled,
because the address delta may change after relaxation.

DWARF records the mappings between lines and addresses in the
.debug_line section, encoding the information with special opcodes,
standard opcodes and extended opcodes in the Line Number Program.
I use DW_LNS_fixed_advance_pc to encode a fixed-length address
delta and DW_LNE_set_address to encode an absolute address, which
makes it possible to generate fixups in the .debug_line section.

Differential Revision: https://reviews.llvm.org/D46850

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338477 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalISel][IRTranslator] Use RPO traversal when visiting blocks to translate.

Previously we were just visiting the blocks in the function in IR order, which
is rather arbitrary. Therefore we wouldn't always visit defs before uses, but
the translation code relies on this assumption in some places.
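
For readers unfamiliar with the term, a reverse post-order (RPO) can be computed as sketched below. The `Graph` type and function names are hypothetical and unrelated to the IRTranslator code; the sketch only shows why RPO can differ from layout order.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical CFG: Succs[B] lists the successor block indices of block B.
using Graph = std::vector<std::vector<int>>;

static void postOrder(const Graph &Succs, int B, std::vector<bool> &Seen,
                      std::vector<int> &Out) {
  Seen[B] = true;
  for (int S : Succs[B])
    if (!Seen[S])
      postOrder(Succs, S, Seen, Out);
  Out.push_back(B); // emitted once its not-yet-visited successors are done
}

// Reverse post-order starting from the entry block.
std::vector<int> reversePostOrder(const Graph &Succs, int Entry) {
  std::vector<bool> Seen(Succs.size(), false);
  std::vector<int> Order;
  postOrder(Succs, Entry, Seen, Order);
  std::reverse(Order.begin(), Order.end());
  return Order;
}
```

For a function laid out as blocks 0, 1, 2 with edges 0 -> 2 and 2 -> 1, layout order visits block 1 before its only predecessor, whereas the RPO is 0, 2, 1.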

Only codegen change seen in tests is an elision of a redundant copy.

Fixes PR38396

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338476 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Add clamp bit to dot intrinsics

Differential Revision: https://reviews.llvm.org/D49874

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338470 91177308-0d34-0410-b5e6-96231b3b80d8

Simplify selectELFSectionForGlobal by pulling out the entry size
determination for mergeable sections into a small static function.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338469 91177308-0d34-0410-b5e6-96231b3b80d8

Tidy up logic around unique section name creation and remove a
mostly unused variable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338468 91177308-0d34-0410-b5e6-96231b3b80d8

[MachineOutliner] Clean up subtarget handling.

Call shouldOutlineFromFunctionByDefault, isFunctionSafeToOutlineFrom,
getOutliningType, and getMachineOutlinerMBBFlags using the correct
TargetInstrInfo. And don't create a MachineFunction for a function
declaration.

The call to getOutliningCandidateInfo is still a little weird, but at
least the weirdness is explicitly called out.

Differential Revision: https://reviews.llvm.org/D49880

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338465 91177308-0d34-0410-b5e6-96231b3b80d8

[PATCH] [SLC] Test simplification of pow() for vector types (NFC)

Add test case for the simplification of `pow()` for vector types that D50035
enables.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338463 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r338354 "[ARM] Revert r337821"

Disable ARMCodeGenPrepare by default again. It is causing verifier
failures in V8 that look like:

  Duplicate integer as switch case
  switch i32 %trunc, label %if.end13 [
    i32 0, label %cleanup36
    i32 0, label %if.then8
  ], !dbg !4981
  i32 0
  fatal error: error in backend: Broken function found, compilation aborted!

I will continue reducing the test case and send it along.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338452 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Fix debug info tests after r338437.

After r338437, debug_ranges are no longer emitted. Previously, this was only
done for DWARF version 5 and above.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338448 91177308-0d34-0410-b5e6-96231b3b80d8

[DWARF] Support for .debug_addr (consumer)

This patch implements basic support for parsing
and dumping DWARFv5 .debug_addr section.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338447 91177308-0d34-0410-b5e6-96231b3b80d8

[SLC] Refactor the simplification of pow() (NFC)

Reword comments and minor code reformatting.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338446 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] Make --strip-debug strip .gdb_index

Summary:
See binutils-gdb/bfd/elf.c, GNU objcopy also strips .stab* (STABS)
.line* (DWARF 1) .gnu.linkonce.wi.* (linkonce section for .debug_info) but
I'm not sure we need to be compatible with it.

Reviewers: dblaikie, alexshap, jakehehrlich, jhenderson

Reviewed By: alexshap, jakehehrlich

Subscribers: aprantl, JDevlieghere, jakehehrlich, llvm-commits

Differential Revision: https://reviews.llvm.org/D50100

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338443 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r338431: "Add DebugCounters to DivRemPairs"

This reverts r338431; the test it added is making buildbots unhappy.
Locally, I can repro the failure on reverse-iteration builds.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338442 91177308-0d34-0410-b5e6-96231b3b80d8

[DWARF] Do not create a .debug_ranges section when no ranges are needed.

Reviewers: aprantl

Differential Revision: https://reviews.llvm.org/D50089

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338437 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Split amdgcn/r600 fminnum/fmaxnum tests

R600 breaks on too many things to usefully test changes
with ieee_mode on vs. off.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338435 91177308-0d34-0410-b5e6-96231b3b80d8

Add DebugCounters to DivRemPairs

For people who don't use DebugCounters, NFCI.

Patch by Zhizhou Yang!

Differential Revision: https://reviews.llvm.org/D50033

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338431 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] Update the help text to reflect "physical" registers. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338430 91177308-0d34-0410-b5e6-96231b3b80d8

[SystemZ] Fix bad assert composition.

Use '&&' before the string instead of '||'
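
The difference is easy to see in isolation. In this sketch the message text is a placeholder and the two helpers are hypothetical; they just return the expression an assert would evaluate.

```cpp
#include <cassert>

// A string literal converts to a non-null pointer, i.e. to true, so
// Cond && "message" has the same truth value as Cond ...
bool withAnd(bool Cond) { return Cond && "explanatory message"; }

// ... while Cond || "message" is always true, so such an assert never fires.
bool withOr(bool Cond) { return Cond || "explanatory message"; }
```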

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338429 91177308-0d34-0410-b5e6-96231b3b80d8

DAG: Correct pointer type used for stack slot

Correct the address space for the inserted argument
stack slot.

AMDGPU seems to not do anything with this information,
so I don't think this was breaking anything.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338428 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeView] Add coverage test for r338308 (Fixed crash in type merging)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338423 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Break 64-bit arguments into 32-bit pieces

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338421 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Split wide vectors of i16/f16 into 32-bit regs on calls

This improves code for the same reasons as scalarizing 32-bit
element vectors.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338418 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeView] Minimal support for S_UNAMESPACE records

Differential Revision: https://reviews.llvm.org/D50007

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338417 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Scalarize vector argument types to calls

When lowering calling conventions, prefer to decompose vectors
into the constituent register types. This avoids artificial constraints
to satisfy a wide super-register.

This improves code quality because now optimizations don't need to
deal with the super-register constraint. For example the immediate
folding code doesn't deal with 4 component reg_sequences, so by
breaking the register down earlier the existing immediate folding
code is able to work.

This also avoids the need for the shader input processing code
to manually split vector types.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338416 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca][docs] Replace "temporary" with "physical registers". NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338415 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] WriteBSWAP sched classes are reg-reg only.

Don't declare them as X86SchedWritePair when the folded class will never be used.

Note: MOVBE (load/store endian conversion) instructions tend to have a very different behaviour to BSWAP.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338412 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca][docs] Improve the "How LLVM-MCA works" section.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338410 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[DebugInfo] Generate DWARF debug information for labels."

This reverts commits r338390 and r338398, they were causing LSan
failures on the ASan bot.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338408 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Use ISD::MULHU for constant/non-zero ISD::SRL lowering (PR38151)

As was done for vector rotations, we can efficiently use ISD::MULHU for vXi8/vXi16 ISD::SRL lowering.

Shift-by-zero cases are still problematic (mainly on v32i8 due to extra AND/ANDN/OR or VPBLENDVB blend masks but v8i16/v16i16 aren't great either if PBLENDW fails) so I've limited this first patch to known non-zero cases if we can't easily use PBLENDW.
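
The underlying identity can be sketched for a single 16-bit lane as follows. `mulhu16` and `srlViaMulhu16` are hypothetical scalar helpers, not the lowering code itself.

```cpp
#include <cassert>
#include <cstdint>

// High half of an unsigned 16x16-bit multiply (what MULHU yields per lane).
uint16_t mulhu16(uint16_t A, uint16_t B) {
  return uint16_t((uint32_t(A) * uint32_t(B)) >> 16);
}

// Logical shift right by a constant Amt in [1, 15], expressed as a multiply.
// Amt == 0 would need the multiplier 1 << 16, which does not fit in 16 bits.
uint16_t srlViaMulhu16(uint16_t X, unsigned Amt) {
  uint16_t Scale = uint16_t(1u << (16 - Amt));
  return mulhu16(X, Scale);
}
```

The missing multiplier for a zero shift amount is why shift-by-zero lanes need separate handling such as a blend.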

Differential Revision: https://reviews.llvm.org/D49562

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338407 91177308-0d34-0410-b5e6-96231b3b80d8

Make ICF log output order deterministic.

This patch does the same thing as r338153 for COFF.
Note that this patch affects only the order of log messages.
The output file is already deterministic.

Differential Revision: https://reviews.llvm.org/D50023

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338406 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca][x86] Add 32-bit instruction resource tests

These aren't exhaustive, but cover some instructions that are only available in 32-bit mode (where would we be without good BCD math performance?).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338404 91177308-0d34-0410-b5e6-96231b3b80d8

Resubmit r338340 "[MS Demangler] Better demangling of template arguments."

This broke the build with GCC, but has since been fixed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338403 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add pattern matching for PMADDUBSW

Summary:
Similar to D49636, but for PMADDUBSW. This instruction has the additional complexity that the addition of the two products saturates to 16-bits rather than wrapping around. And one operand is treated as signed and the other as unsigned.

A C example that triggers this pattern

```
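#include <stdint.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))
#define MAX(a, b) ((a) > (b) ? (a) : (b))

static const int N = 128;

int8_t A[2*N];
uint8_t B[2*N];
int16_t C[N];

void foo() {
  /* Sum two adjacent signed*unsigned products and saturate to 16 bits. */
  for (int i = 0; i != N; ++i)
    C[i] = MIN(MAX((int16_t)A[2*i]*(int16_t)B[2*i] + (int16_t)A[2*i+1]*(int16_t)B[2*i+1], -32768), 32767);
}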
```

Reviewers: RKSimon, spatel, zvi

Reviewed By: RKSimon, zvi

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49829

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338402 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add test cases that could use PMADDUBSW.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338401 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Preserve more liveness information in emitStackProbeInline

This commit fixes two issues with the liveness information after the
call:

1) The code always spills RCX and RDX if InProlog == true, which results
in a use of an undefined phys reg.
2) FinalReg, JoinReg, RoundedReg, SizeReg are not added as live-ins to
the basic blocks that use them, so they are seen as undefined.

https://llvm.org/PR38376

Differential Revision: https://reviews.llvm.org/D50020

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338400 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] Fix build failure in 'clang-cmake-armv8-full'.

Builder clang-cmake-armv8-full failed because the assembly comment
marker is not '#' on that target. So, I use CHECK-SAME to avoid
checking the comment marker on the same line in the test case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338398 91177308-0d34-0410-b5e6-96231b3b80d8

[Dominators] Make slow walks shorter

Summary:
When DFS numbers are not yet calculated for a dominator tree, we have to walk up the tree to determine whether one node dominates another.

This patch makes the slow walks shorter by only walking until the level of the node we check against is reached. This is because a node cannot possibly dominate something higher in its tree.

When running opt with -O3, the patch results in:
* 25% fewer loop iterations for `opt` (fullLTO)
* 30% fewer loop iterations for sqlite
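
The bounded walk can be sketched on a toy tree. `NodeSketch` and `dominatesSlow` are hypothetical names, and the representation (indices plus a stored level) is an assumption for illustration rather than the actual DominatorTree data structure.

```cpp
#include <cassert>
#include <vector>

// Hypothetical dominator tree node: immediate dominator and depth in the tree.
struct NodeSketch {
  int IDom;  // index of the immediate dominator, -1 for the root
  int Level; // the root has level 0
};

// Does A dominate B? Walk up from B, but stop as soon as A's level is
// reached: a node cannot dominate anything at or above its own level
// other than itself.
bool dominatesSlow(const std::vector<NodeSketch> &Tree, int A, int B) {
  int Cur = B;
  while (Cur != -1 && Tree[Cur].Level > Tree[A].Level)
    Cur = Tree[Cur].IDom;
  return Cur == A;
}
```

Without the level check the loop would continue all the way to the root whenever A does not dominate B.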

Reviewers: brzycki, asbirlea, chandlerc, NutshellySima, grosser

Reviewed By: NutshellySima

Subscribers: mehdi_amini, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D49955

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338396 91177308-0d34-0410-b5e6-96231b3b80d8

Fix InstCombine address space assert

Work around a bug where the InstCombine pass was asserting on the IR added in the
lit test, where we have a bitcast instruction after a GEP from an addrspace cast.

The second bitcast in the test was getting combined into
`bitcast <16 x i32>* %0 to <16 x i32> addrspace(3)*`, which looks like it should
be an addrspace cast instruction instead. Otherwise, if control flow is allowed
to continue as it does now, we create a GEP instruction
`<badref> = getelementptr inbounds <16 x i32>, <16 x i32>* %0, i32 0`. However
because the type of this instruction doesn't match the address space we hit an
assert when replacing the bitcast with that GEP.

```
void llvm::Value::doRAUW(llvm::Value*, bool): Assertion `New->getType() == getType() && "replaceAllUses of value with new value of different type!"' failed.
```

Differential Revision: https://reviews.llvm.org/D50058

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338395 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca][docs] Always use `llvm-mca` in place of `MCA`.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338394 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] regenerate checks and add tests for D50035; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338392 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo][LCSSA] Preserve debug location in lcssa phis

Summary:
When inserting lcssa Phi Nodes in the exit block
make sure to preserve the original instruction's DL.

Reviewers: vsk

Subscribers: JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D50009

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338391 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] Generate DWARF debug information for labels.

There are two forms for label debug information in DWARF format.

1. Labels in a non-inlined function:

DW_TAG_label
  DW_AT_name
  DW_AT_decl_file
  DW_AT_decl_line
  DW_AT_low_pc

2. Labels in an inlined function:

DW_TAG_label
  DW_AT_abstract_origin
  DW_AT_low_pc

We will collect label information from DBG_LABEL. Before every DBG_LABEL,
we will generate a temporary symbol to denote the location of the label.
The symbol could be used to get DW_AT_low_pc afterwards. So, we create a
mapping between 'inlined label' and DBG_LABEL MachineInstr in DebugHandlerBase.
The DBG_LABEL in the mapping is used to query the symbol before it.

The AbstractLabels in DwarfCompileUnit is used to process labels in inlined
functions.

We also keep a mapping between scope and labels in DwarfFile to help to
generate correct tree structure of DIEs.

It also generates label debug information under global isel.

Differential Revision: https://reviews.llvm.org/D45556

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338390 91177308-0d34-0410-b5e6-96231b3b80d8

Revert Enrich inline messages

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338389 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] auto-generate checks; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338388 91177308-0d34-0410-b5e6-96231b3b80d8

Enrich inline messages

Summary:
This patch improves Inliner to provide causes/reasons for negative inline decisions.
1. It adds one new message field to InlineCost to report causes for Always and Never instances. All Never and Always instantiations must provide a simple message.
2. Several functions that used to return the inlining results as boolean are changed to return InlineResult which carries the cause for negative decision.
3. Changed remark printing and debug output messages to provide the additional messages and related inline cost.
4. Adjusted tests for changed printing.

Patch by: yrouban (Yevgeny Rouban)

Reviewers: craig.topper, sammccall, sgraenitz, NutshellySima, shchenz, chandlerc, apilipenko, javed.absar, tejohnson, dblaikie, sanjoy, eraman, xbolva00

Reviewed By: tejohnson, xbolva00

Subscribers: xbolva00, llvm-commits, arsenm, mehdi_amini, eraman, haicheng, steven_wu, dexonsmith

Differential Revision: https://reviews.llvm.org/D49412

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338387 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] Remove README.txt

A detailed description of the tool has been recently added by Matt to
CommandGuide/llvm-mca.rst. File README.txt is now redundant and can be removed;
all the relevant user-guide information has been improved and then moved to
llvm-mca.rst.

In future, we should add another .rst for the "llvm-mca developer manual" to
provide information about:
- llvm-mca internals.
- How to add custom stages to the simulated pipeline.
- How to provide extra processor info in the scheduling model to improve the
analysis performed by llvm-mca.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338386 91177308-0d34-0410-b5e6-96231b3b80d8

[MemDep] Use PhiValuesAnalysis to improve alias analysis results

This is being done in order to make GVN able to better optimize certain inputs.
MemDep doesn't use PhiValues directly, but does need to notify it when things
get invalidated.

Differential Revision: https://reviews.llvm.org/D48489

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338384 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] Fold another Select with And/Or pattern

Summary: Proof: https://rise4fun.com/Alive/L5J

Reviewers: lebedev.ri, spatel

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49975

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338383 91177308-0d34-0410-b5e6-96231b3b80d8

DAG: Fix PromoteFloatResult for fcanonicalize

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338382 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Don't handle FP16_TO_FP in isCanonicalized

This needs more special handling to do correctly.
Fixes test in subsequent commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338381 91177308-0d34-0410-b5e6-96231b3b80d8

[SLP] Fix PR38339: Instruction does not dominate all uses!

Summary:
If the ExtractElement instructions can be optimized out during the
vectorization and we need to reshuffle the parent vector, this
ShuffleInstruction may be inserted in the wrong place, causing the compiler
to produce incorrect code.

Reviewers: spatel, RKSimon, mkuper, hfinkel, javed.absar

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D49928

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338380 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fold undef fcanonicalize to qNaN

We could choose a free 0 for this, but this
matches the behavior for fmul undef, 1.0. Also,
the NaN is more useful for folding the operations that use it,
although if it's not eliminated it is more expensive
in terms of code size.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338376 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix test check line bugs

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338374 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Complete enumeration values for Tag_ABI_VFP_args

The LLD implementation of Tag_ABI_VFP_args needs to check the rarely seen
values of 3 (toolchain specific) and 4 (compatible with both Base and VFP).
Add the missing enumeration values so that LLD can refer to them without
having to use the raw numbers.

Differential Revision: https://reviews.llvm.org/D50049

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338373 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca][BtVer2] Teach how to identify dependency-breaking idioms.

This patch teaches llvm-mca how to identify dependency breaking instructions on
btver2.

An example of dependency breaking instructions is the zero-idiom XOR (example:
`XOR %eax, %eax`), which always generates zero regardless of the actual value of
the input register operands.
Dependency breaking instructions don't have to wait on their input register
operands before executing. This is because the computation is not dependent on
the inputs.

Not all dependency breaking idioms are also zero-latency instructions. For
example, `CMPEQ %xmm1, %xmm1` is independent of the value of XMM1, and it
generates a vector of all-ones.
That instruction is not eliminated at register renaming stage, and its opcode is
issued to a pipeline for execution. So, the latency is not zero.

This patch adds a new method named isDependencyBreaking() to the MCInstrAnalysis
interface. That method takes as input an instruction (i.e. MCInst) and a
MCSubtargetInfo.
The default implementation of isDependencyBreaking() conservatively returns
false for all instructions. Targets may override the default behavior for
specific CPUs, and return a value which better matches the subtarget behavior.
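
The hook described above can be illustrated with a minimal, self-contained
sketch. Everything here is hypothetical: `Inst`, `Subtarget`,
`InstrAnalysis` and `BtVer2LikeAnalysis` are simplified stand-ins, not the
real MCInst, MCSubtargetInfo or MCInstrAnalysis classes, and matching
opcodes by name is only for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for an instruction and a subtarget description.
struct Inst {
  std::string Opcode; // e.g. "XOR" or "CMPEQ" (illustrative names only)
  unsigned Src0;      // first source register id
  unsigned Src1;      // second source register id
};

struct Subtarget {
  std::string CPU;
};

// Base interface: conservatively reports that nothing breaks dependencies.
struct InstrAnalysis {
  virtual ~InstrAnalysis() = default;
  virtual bool isDependencyBreaking(const Inst &, const Subtarget &) const {
    return false;
  }
};

// A target override that recognizes two idioms on one specific CPU.
struct BtVer2LikeAnalysis : InstrAnalysis {
  bool isDependencyBreaking(const Inst &I,
                            const Subtarget &STI) const override {
    if (STI.CPU != "btver2")
      return false;
    // The result does not depend on the inputs only when both sources
    // name the same register.
    if (I.Src0 != I.Src1)
      return false;
    return I.Opcode == "XOR" || I.Opcode == "CMPEQ";
  }
};
```

Note that the sketch only answers whether the inputs can be ignored. It
says nothing about latency, which matches the distinction drawn above
between the zero-idiom XOR and the all-ones CMPEQ.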

In the future, we should teach TableGen how to automatically generate the body of
isDependencyBreaking from scheduling predicate definitions. This would allow us
to expose the knowledge about dependency breaking instructions to the machine
schedulers (and, potentially, other codegen passes).

Differential Revision: https://reviews.llvm.org/D49310

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338372 91177308-0d34-0410-b5e6-96231b3b80d8

[ELF][ARM] Add Arm ABI names for float ABI ELF Header flags

The ELF for the Arm architecture document defines, for EF_ARM_EABI_VER5 and
above, the flags EF_ARM_ABI_FLOAT_HARD and EF_ARM_ABI_FLOAT_SOFT. These
have been defined to be compatible with the existing EF_ARM_VFP_FLOAT and
EF_ARM_SOFT_FLOAT used by gcc for EF_ARM_EABI_UNKNOWN.

This patch adds the flags in addition to the existing ones so that any code
depending on the old names will still work.
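
As a rough sketch of why the old and new names can coexist: the ABI names
are aliases for the same bits as the older names. The numeric values below
are my understanding of the ELF for the Arm Architecture document and
should be treated as assumptions; the `sketch` namespace and the
`isHardFloat` helper are hypothetical and not part of the patch.

```cpp
#include <cstdint>

namespace sketch {

// Assumed flag values; old and new names share the same bit.
enum : uint32_t {
  EF_ARM_SOFT_FLOAT = 0x00000200U,     // older name
  EF_ARM_ABI_FLOAT_SOFT = 0x00000200U, // name for EF_ARM_EABI_VER5 and above
  EF_ARM_VFP_FLOAT = 0x00000400U,      // older name
  EF_ARM_ABI_FLOAT_HARD = 0x00000400U  // name for EF_ARM_EABI_VER5 and above
};

// Hypothetical helper: true if the hard-float bit is set in e_flags.
inline bool isHardFloat(uint32_t EFlags) {
  return (EFlags & EF_ARM_ABI_FLOAT_HARD) != 0;
}

} // namespace sketch
```

Because each pair of names denotes the same bit, code that tests
EF_ARM_VFP_FLOAT and code that tests EF_ARM_ABI_FLOAT_HARD agree on every
input under these assumed values.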

Differential Revision: https://reviews.llvm.org/D49992

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338370 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r338365: [X86] Improved sched models for X86 BT*rr instructions.
https://reviews.llvm.org/D49243

Contains WIP code that should not have been included.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338369 91177308-0d34-0410-b5e6-96231b3b80d8

[SystemZ] Improve decoding in case of instructions with four register operands.

Since z13, the max group size will be 2 if any μop has more than 3 register
sources.

This has been ignored so far in the SystemZHazardRecognizer, but is now
handled by recognizing those instructions and adjusting the tracking of
decoding and the cost heuristic for grouping.

Review: Ulrich Weigand
https://reviews.llvm.org/D49847

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338368 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] simplify code for A & (A ^ B) --> A & ~B

This fold was written in an odd way and tried to avoid
an endless loop by bailing out on all constants instead
of the supposedly problematic case of -1. But (X & -1)
should always be simplified before we reach here, so I'm
not sure how that is a problem.
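
The identity behind this fold can be checked exhaustively on a small bit
width. The sketch below is not InstCombine code; the helper names are
hypothetical and it only verifies that A & (A ^ B) equals A & ~B for every
pair of 8-bit values, including B == -1 (all ones).

```cpp
#include <cassert>
#include <cstdint>

// Left-hand side of the fold: A & (A ^ B).
inline uint8_t foldLhs(uint8_t A, uint8_t B) {
  return static_cast<uint8_t>(A & (A ^ B));
}

// Right-hand side of the fold: A & ~B.
inline uint8_t foldRhs(uint8_t A, uint8_t B) {
  return static_cast<uint8_t>(A & ~B);
}

// Returns true if both sides agree for all 8-bit A and B.
inline bool foldHoldsForAllI8() {
  for (unsigned A = 0; A < 256; ++A)
    for (unsigned B = 0; B < 256; ++B)
      if (foldLhs(static_cast<uint8_t>(A), static_cast<uint8_t>(B)) !=
          foldRhs(static_cast<uint8_t>(A), static_cast<uint8_t>(B)))
        return false;
  return true;
}
```

The algebra is short: A & (A ^ B) = (A & A) ^ (A & B) = A ^ (A & B), which
clears exactly the bits of A that are also set in B.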

There were no tests for the commuted patterns, so I added
those at rL338364.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338367 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Improved sched models for X86 BT*rr instructions.
https://reviews.llvm.org/D49243

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338365 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] move/add tests for xor+add fold; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338364 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Improved sched models for X86 SHLD/SHRD* instructions.
Differential Revision: https://reviews.llvm.org/D9611

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338359 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] isFNEG - Use getTargetConstantBitsFromNode to handle all constant cases

isFNEG was duplicating much of what was done by getTargetConstantBitsFromNode in its own calls to getTargetConstantFromNode.

Noticed while reviewing D48467.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338358 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Allow automatically deducing the thumb instruction size for .inst

This matches GAS, which allows unsuffixed .inst for Thumb.

Differential Revision: https://reviews.llvm.org/D49937

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338357 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Support the .inst directive for MachO and COFF targets

Contrary to ELF, we don't add any markers that distinguish data generated
with .short/.long from normal instructions, so the .inst directive only
adds compatibility with assembly that uses it.

Differential Revision: https://reviews.llvm.org/D49936

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338356 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Support the .inst directive for MachO and COFF targets

Contrary to ELF, we don't add any markers that distinguish data generated
with .long from normal instructions, so the .inst directive only adds
compatibility with assembly that uses it.

Differential Revision: https://reviews.llvm.org/D49935

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338355 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Revert r337821

Re-enabling ARMCodeGenPrepare by default after failing to reproduce
the bootstrap issues that I was concerned it was causing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338354 91177308-0d34-0410-b5e6-96231b3b80d8

Test commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338352 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] tests for D48828, D49981: fold extraction from std::pair

Minor touch up in the previous comment.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338351 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] tests for D48828, D49981: fold extraction from std::pair

Updated unit tests for D48828 and D49981.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338350 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Collect statistics in GuardWidening

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338348 91177308-0d34-0410-b5e6-96231b3b80d8

[VPlan] Introduce VPLoopInfo analysis.

The patch introduces loop analysis (VPLoopInfo/VPLoop) for VPBlockBases.
This analysis will be necessary to perform some H-CFG transformations and
detect and introduce regions representing a loop in the H-CFG.

Reviewers: fhahn, rengolin, mkuper, hfinkel, mssimpso

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D48816

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338346 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r338340 "[MS Demangler] Better demangling of template arguments."

Breaks the build with GCC, apparently.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338344 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Stop accidentally running the Bonnell LEA fixup path on Goldmont.

In one place we checked X86Subtarget.slowLEA() to decide if the pass should run. But to decide what the pass should do, we only checked isSLM. This resulted in Goldmont going down the Bonnell path.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338342 91177308-0d34-0410-b5e6-96231b3b80d8

[RISCV] Fixed test case failure due to r338047

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338341 91177308-0d34-0410-b5e6-96231b3b80d8

[MS Demangler] Better demangling of template arguments.

This patch fixes demangling of template aliases as template-template
arguments, and also fixes function pointers and references as
non-type template parameters. All of these can be properly
demangled now, so I've ported over the test
clang/test/CodeGenCXX/ms-template-callbacks.cpp. All of these
tests pass.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338340 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][GlobalISel] Add isel support for G_BLOCK_ADDR.

Also refactors some existing code to materialize addresses for the large code
model so it can be shared between G_GLOBAL_VALUE and G_BLOCK_ADDR.

This implements PR36390.

Differential Revision: https://reviews.llvm.org/D49903

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338337 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][GlobalISel] Make G_BLOCK_ADDR legal.

Differential Revision: https://reviews.llvm.org/D49902

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338336 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalISel] Add a G_BLOCK_ADDR opcode to handle IR blockaddress constants.

Differential Revision: https://reviews.llvm.org/D49900

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338335 91177308-0d34-0410-b5e6-96231b3b80d8