granicus.if.org Git

[ConstantRange] fix doxygen comment formatting; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300554 91177308-0d34-0410-b5e6-96231b3b80d8

Make globalaa-retained.ll test catching more cases.

Summary:
* Add checks for store. That is needed because GlobalsAA is called
  twice in the current pipeline with different sets of Function passes
  following it. However, the loads are eliminated using instcombine
  which happens everywhere. On the other hand, DeadStoreElimination is
  performed only once so by checking for store we'll be able to catch
  more cases when GlobalsAA is invalidated unintentionally.
* Add empty function above/below the test so that we don't depend on
  the relative order of instcombine/dead-store-elimination and the
  pass that invalidates the analysis (inside the same
  FunctionPassManager).

Reviewers: kristof.beyls

Reviewed By: kristof.beyls

Subscribers: llvm-commits, n.bozhenov

Differential Revision: https://reviews.llvm.org/D32015
Patch by Andrei Elovikov <andrei.elovikov@intel.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300553 91177308-0d34-0410-b5e6-96231b3b80d8

[GVNHoist] Mark GlobalsAA as preserved by GVNHoist.

Reviewers: sebpop, hiraditya

Reviewed By: sebpop

Subscribers: n.bozhenov, llvm-commits

Differential Revision: https://reviews.llvm.org/D32158
Patch by Andrei Elovikov <andrei.elovikov@intel.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300552 91177308-0d34-0410-b5e6-96231b3b80d8

Add store Merge test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300551 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Add hardware build attributes in assembler

In the assembler, we should emit build attributes based on the target
selected with command-line options. This matches the GNU assembler's
behaviour. We only do this for build attributes which describe the
hardware that is expected to be available, not the ones that describe
ABI compatibility.

This is done by moving some of the attribute emission code to
ARMTargetStreamer, so that it can be shared between the assembly and
code-generation code paths. Since the assembler only creates a
MCSubtargetInfo, not an ARMSubtarget, the code had to be changed to
check raw features, and not use the convenience functions in
ARMSubtarget.

If different attributes are later specified using the .eabi_attribute
directive, then they will take precedence, as happens when the same
.eabi_attribute is specified twice.

This must be enabled by an option, because we don't want to do this when
parsing inline assembly. The attributes would match the ones emitted at
the start of the file, so wouldn't actually change the emitted object
file, but the extra directives would be added to every inline assembly
block when emitting assembly, which we'd like to avoid.

The majority of the changes in the build-attributes.ll test are just
re-ordering the directives, because the hardware attributes are now
emitted before the ABI ones. However, I did fix one bug which I spotted:
Tag_CPU_arch_profile was not being emitted for v6M.

Differential revision: https://reviews.llvm.org/D31812

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300547 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] GlobalISel: Add support for G_SUB

Support G_SUB throughout the GlobalISel pipeline. It is exactly the same
as G_ADD, nothing fancy.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300546 91177308-0d34-0410-b5e6-96231b3b80d8

[SampleProfile] Don't assert when printing the DebugLoc of a branch. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300544 91177308-0d34-0410-b5e6-96231b3b80d8

[SampleProfile] Skip intrinsic calls when visiting callsites in InlineHotFunctions.

Before this patch, we always called method 'findCalleeFunctionSamples()' on
intrinsic calls. However, intrinsic calls like llvm.dbg.value() are not viable
candidates for obvious reasons.

No functional change intended.

Differential Revision: https://reviews.llvm.org/D32008

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300541 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[GlobalISel] Support vector-of-pointers in LLT"

This reverts r300535 and r300537.
The newly added tests in test/CodeGen/AArch64/GlobalISel/arm64-fallback.ll
produces slightly different code between LLVM versions being built with different compilers.
E.g., dependent on the compiler LLVM is built with, either one of the following
can be produced:

remark: <unknown>:0:0: unable to legalize instruction: %vreg0<def>(p0) = G_EXTRACT_VECTOR_ELT %vreg1, %vreg2; (in function: vector_of_pointers_extractelement)
remark: <unknown>:0:0: unable to legalize instruction: %vreg2<def>(p0) = G_EXTRACT_VECTOR_ELT %vreg1, %vreg0; (in function: vector_of_pointers_extractelement)

Non-determinism like this is clearly a bad thing, so reverting this until
I can find and fix the root cause of the non-determinism.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300538 91177308-0d34-0410-b5e6-96231b3b80d8

Fix gcc build after r300535.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300537 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Check for correct HW div when lowering divmod

For subtargets that use the custom lowering for divmod, e.g. gnueabi,
we used to check if the subtarget has hardware divide and then lower to
a div-mul-sub sequence if true, or to a libcall if false.

However, judging by the usage of hasDivide vs hasDivideInARMMode, it
seems that hasDivide only refers to Thumb. For instance, in the
ARMTargetLowering constructor, the code that specifies whether to use
libcalls for (S|U)DIV looks like this:

bool hasDivide = Subtarget->isThumb() ? Subtarget->hasDivide()
: Subtarget->hasDivideInARMMode();

In the case of divmod for arm-gnueabi, using only hasDivide() to
determine what to do means that instead of lowering to __aeabi_idivmod
to get the remainder, we lower to div-mul-sub and then further lower the
div to __aeabi_idiv. Even worse, if we have hardware divide in ARM but
not in Thumb, we generate a libcall instead of using it (this is not an
issue in practice since AFAICT none of the cores that we support have
hardware divide in ARM but not Thumb).

This patch fixes the code dealing with custom lowering to take into
account the mode (Thumb or ARM) when deciding whether or not hardware
division is available.

Differential Revision: https://reviews.llvm.org/D32005

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300536 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalISel] Support vector-of-pointers in LLT

This fixes PR32471.

As comment 10 on that bug report highlights
(https://bugs.llvm.org//show_bug.cgi?id=32471#c10), there are quite a
few different defendable design tradeoffs that could be made, including
not representing pointers at all in LLT.

I decided to go for representing vector-of-pointer as a concept in LLT,
while keeping the size of the LLT type 64 bits (this is an increase from
48 bits before). My rationale for keeping pointers explicit is that on
some targets probably it's very handy to have the distinction between
pointer and non-pointer (e.g. 68K has a different register bank for
pointers IIRC). If we keep a scalar pointer, it probably is easiest to
also have a vector-of-pointers to keep LLT relatively conceptually clean
and orthogonal, while we don't have a very strong reason to break that
orthogonality. Once we gain more experience on the use of LLT, we can
of course reconsider this direction.

Rejecting vector-of-pointer types in the IRTranslator is also an option
to avoid the crash reported in PR32471, but that is only a very
short-term solution; also needs quite a bit of code tweaks in places,
and is probably fragile. Therefore I didn't consider this the best
option.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300535 91177308-0d34-0410-b5e6-96231b3b80d8

test commit

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300532 91177308-0d34-0410-b5e6-96231b3b80d8

[APInt] Cleanup the reverseBits slow case a little.

Use lshrInPlace. Use single bit extract and operator|=(uint64_t) to avoid a few temporary APInts.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300527 91177308-0d34-0410-b5e6-96231b3b80d8

[APInt] Make operator<<= shift in place. Improve the implementation of tcShiftLeft and use it to implement operator<<=.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300526 91177308-0d34-0410-b5e6-96231b3b80d8

PR32382: Fix emitting complex DWARF expressions.

The DWARF specification knows 3 kinds of non-empty simple location
descriptions:
1. Register location descriptions
  - describe a variable in a register
  - consist of only a DW_OP_reg
2. Memory location descriptions
  - describe the address of a variable
3. Implicit location descriptions
  - describe the value of a variable
  - end with DW_OP_stack_value & friends

The existing DwarfExpression code is pretty much ignorant of these
restrictions. This used to not matter because we only emitted very
short expressions that we happened to get right by accident.  This
patch makes DwarfExpression aware of the rules defined by the DWARF
standard and now chooses the right kind of location description for
each expression being emitted.

This would have been an NFC commit (for the existing testsuite) if not
for the way that clang describes captured block variables. Based on
how the previous code in LLVM emitted locations, DW_OP_deref
operations that should have come at the end of the expression are put
at its beginning. Fixing this means changing the semantics of
DIExpression, so this patch bumps the version number of DIExpression
and implements a bitcode upgrade.

There are two major changes in this patch:

I had to fix the semantics of dbg.declare for describing function
arguments. After this patch a dbg.declare always takes the *address*
of a variable as the first argument, even if the argument is not an
alloca.

When lowering a DBG_VALUE, the decision of whether to emit a register
location description or a memory location description depends on the
MachineLocation — register machine locations may get promoted to
memory locations based on their DIExpression. (Future) optimization
passes that want to salvage implicit debug location for variables may
do so by appending a DW_OP_stack_value. For example:
  DBG_VALUE, [RBP-8]                        --> DW_OP_fbreg -8
  DBG_VALUE, RAX                            --> DW_OP_reg0 +0
  DBG_VALUE, RAX, DIExpression(DW_OP_deref) --> DW_OP_reg0 +0

All testcases that were modified were regenerated from clang. I also
added source-based testcases for each of these to the debuginfo-tests
repository over the last week to make sure that no synchronized bugs
slip in. The debuginfo-tests compile from source and run the debugger.

https://bugs.llvm.org/show_bug.cgi?id=32382
<rdar://problem/31205000>

Differential Revision: https://reviews.llvm.org/D31439

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300522 91177308-0d34-0410-b5e6-96231b3b80d8

Add const to a const method. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300520 91177308-0d34-0410-b5e6-96231b3b80d8

[Target] Use hasOneUse() instead of getNumUses().

The latter does a liner scan over a linked list, therefore is
much more expensive.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300518 91177308-0d34-0410-b5e6-96231b3b80d8

Object: Shrink the size of irsymtab::Symbol by a word. NFCI.

Instead of storing an UncommonIndex on the Symbol, use a flag bit to store
whether the Symbol has an Uncommon. This shrinks Chromium's .bc files (after
D32061) by about 1%.

Differential Revision: https://reviews.llvm.org/D32070

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300514 91177308-0d34-0410-b5e6-96231b3b80d8

Build SymbolMap in SampleProfileLoader to help matchin function names with suffix.

Summary: If there is suffix added in the function name (e.g. module hash added by thinLTO), we will not be able to find a match in profile as the suffix does not exist in profile. This patch build a map from function name to Function *. The map includes the entry for the stripped function name so that inlineHotFunctions can find the corresponding function to promote/inline.

Reviewers: davidxl, dnovillo, tejohnson

Reviewed By: davidxl

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D31952

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300507 91177308-0d34-0410-b5e6-96231b3b80d8

Change the testcase tail-merge-after-mbp.ll to tail-merge-after-mbp.mir

Differential Revision: https://reviews.llvm.org/D32037

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300506 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] Use hasNUses instead of comparing getNumUses to a constant."

The use list is a linked list so getNumUses requires a linear scan through the whole list. hasNUses will stop scanning at N and see if that is the end.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300505 91177308-0d34-0410-b5e6-96231b3b80d8

[APInt] Merge the multiword code from lshrInPlace and tcShiftRight into a single implementation

This merges the two different multiword shift right implementations into a single version located in tcShiftRight. lshrInPlace now calls tcShiftRight for the multiword case.

I retained the memmove fast path from lshrInPlace and used a memset for the zeroing. The for loop is basically tcShiftRight's implementation with the zeroing and the intra-shift of 0 removed.

Differential Revision: https://reviews.llvm.org/D32114

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300503 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Fix WebAssemblyOptimizeReturned after r300367

Summary:
Refactoring changed paramHasAttr(1 + i) to paramHasAttr(0), fix that to
paramHasAttr(i).
Add more tests to WebAssemblyOptimizeReturned that catch that
regression.

Reviewers: dschuff

Subscribers: jfb, sbc100, llvm-commits

Differential Revision: https://reviews.llvm.org/D32136

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300502 91177308-0d34-0410-b5e6-96231b3b80d8

[SCEV] Fix another unused variable warning in release builds.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300500 91177308-0d34-0410-b5e6-96231b3b80d8

Fix an unused variable error in rL300494.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300499 91177308-0d34-0410-b5e6-96231b3b80d8

[libFuzzer] experimental option -cleanse_crash: tries to replace all bytes in a crash reproducer with garbage, while still preserving the crash

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300498 91177308-0d34-0410-b5e6-96231b3b80d8

Add a linker script to version LLVM symbols

Summary:
This patch adds a very simple linker script to version the lib's symbols
and thus trying to avoid crashes if an application loads two different
LLVM versions (as long as they do not share data between them).

Note that we deliberately *don't* make LLVM_5.0 depend on LLVM_4.0:
they're incompatible and the whole point of this patch is
to tell the linker that.

Avoid unexpected crashes when two LLVM versions are used in the same process.

Author: Rebecca N. Palmer <rebecca_palmer@zoho.com>
Author: Lisandro Damían Nicanor Pérez Meyer <lisandro@debian.org>
Author: Sylvestre Ledru <sylvestre@debian.org>
Bug-Debian: https://bugs.debian.org/848368

Reviewers: beanz, rnk

Reviewed By: rnk

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D31524

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300496 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Matchers work with both ConstExpr and Instructions.

So, `cast<Instruction>` is not guaranteed to succeed. Change the
code so that we create a new constant and use it in the newly
created instruction, as it's done in other places in InstCombine.

OK'ed by Sanjay/Craig. Fixes PR32686.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300495 91177308-0d34-0410-b5e6-96231b3b80d8

[SCEV] Add a local cache for getZeroExtendExpr and getSignExtendExpr to prevent
the exponential behavior.

The patch is to fix PR32043. Functions getZeroExtendExpr and getSignExtendExpr
may call themselves recursively more than once. This is potentially a 2^N
complexity behavior. The exponential behavior was not commonly exposed before
because of existing global cache mechnism like UniqueSCEVs or some early return
mechanism when flags FlagNSW or FlagNUW are seen. However, we still have case
which can expose the exponential behavior, like the case in PR32043, so we add
a local cache in getZeroExtendExpr and getSignExtendExpr. If the input of the
functions -- SCEV and type pair have been seen before, we can find the extended
expression directly in the local cache.

Differential Revision: https://reviews.llvm.org/D30350

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300494 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] add/move tests for (icmp X, C1 & icmp X, C2); NFC

We simplify based on range intersection, but we're missing folds.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300493 91177308-0d34-0410-b5e6-96231b3b80d8

Update the test to fix the buildbot failure introduced by r300486 (NFC)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300492 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Encode block signatures as SLEB instead of ULEB

Use SLEB (varint) for block_type immediates in accordance with the spec.

Patch by Yury Delendik

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300490 91177308-0d34-0410-b5e6-96231b3b80d8

Add GNU_discriminator support for inline callsites in llvm-symbolizer.

Summary: LLVM symbolize cannot recognize GNU_discriminator for inline callsites. This patch adds support for it.

Reviewers: dblaikie

Reviewed By: dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32134

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300486 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Use MachineRegisterInfo to find max used register

Avoid looping through program to determine register counts.
This avoids needing to look at regmask operands.

Also fixes some counting errors with flat_scr when there
are no stack objects.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300482 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Change stack alignment

While the incoming stack for a kernel is 256-byte aligned,
this refers to the base address of the entire wave. This isn't
useful information for most of codegen. Fixes unnecessarily
aligning stack objects in callees.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300481 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeGenPrepare] Fix crash due to an invalid CFG

The splitIndirectCriticalEdges function generates and invalid CFG when the
'Target' basic block is a loop to itself. When this occurs, the code that
updates the predecessor terminator needs to update the terminator in the split
basic block.

This occurs when there is an edge from block D back to D. Since D is split in
to D0 and D1, the code needs to update the terminator in D1. But D1 is not in
the OtherPreds vector, so it was not getting updated.

Differential Revision: https://reviews.llvm.org/D32126

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300480 91177308-0d34-0410-b5e6-96231b3b80d8

Unbreak build of the wasm backend after r300463.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300479 91177308-0d34-0410-b5e6-96231b3b80d8

Bitcode: Add missing build dep to fix shlib build.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300478 91177308-0d34-0410-b5e6-96231b3b80d8

[APInt] Remove self move check from move assignment operator

This was added to work around a bug in MSVC 2013's implementation of stable_sort. That bug has been fixed as of MSVC 2015 so we shouldn't need this anymore.

Technically the current implementation has undefined behavior because we only protect the deleting of the pVal array with the self move check. There is still a memcpy of that.VAL to VAL that isn't protected. In the case of self move those are the same local and memcpy is undefined for src and dst overlapping.

This reduces the size of the opt binary on my local x86-64 build by about 4k.

Differential Revision: https://reviews.llvm.org/D32116

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300477 91177308-0d34-0410-b5e6-96231b3b80d8

[IR] Implement DataLayout::getPointerTypeSizeInBits using getPointerSizeInBits directly

Currently we use getTypeSizeInBits which contains a switch statement to dispatch based on what the Type is. We know we always have a pointer type here, but the compiler isn't able to figure out that out to remove the switch.

This patch changes it to just call handle the pointer type directly by calling getPointerSizeInBits without going through a switch.

getPointerTypeSizeInBits is called pretty often, particularly by getOrEnforceKnownAlignment which is used by InstCombine. This should speed that up a little bit.

Differential Revision: https://reviews.llvm.org/D31841

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300475 91177308-0d34-0410-b5e6-96231b3b80d8

AArch64: put nonlazybind special handling behind a flag for now.

It's basically a terrible idea anyway but objc_msgSend gets emitted like that.
We can decide on a better way to deal with it in the unlikely event that anyone
actually uses it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300474 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Test handling of R_AMDGPU_ABS64 in RelocVisitor

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300472 91177308-0d34-0410-b5e6-96231b3b80d8

[IR] Put the Use list waymarking bits in the bit positions documentation says they are using

The documentation for the waymarking algorithm says that we use the lower 2 bits of Use::Prev to store the way marking bits. But because we use a PointerIntPair with the default PointerLikeTypeTraits, we're using bits 2:1 on 64-bit targets.

There's also a trick employed for distinguishing Users that have Uses stored with them and Users that have Uses stored in a separate array. The documentation says we use the LSB of the first byte of the real User object or the User* that occurs at the end of the Use array. But again due to the PointerLikeTypeTraits we're really using bit 2(64-bit) or bit 1(32-bit) and not the LSB. This is a little worrying because the first byte of the User object is the vtable ptr so we're assuming the vtable has 8 byte or 4 byte alignment where what is documented would only require 2 byte alignment.

This patch provides a custom traits override for these two cases to put the bits where the documentation says they are. It also has the side effect of removing some shifts from the waymarking traversal implementation.

Differential Revision: https://reviews.llvm.org/D31733

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300471 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Set CodePointerSize to 8 for amdgcn

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300470 91177308-0d34-0410-b5e6-96231b3b80d8

Object: Use offset+size as the irsymtab string representation.

This is consistent with the bitcode string table.

Differential Revision: https://reviews.llvm.org/D31922

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300465 91177308-0d34-0410-b5e6-96231b3b80d8

Bitcode: Add a string table to the bitcode format.

Add a top-level STRTAB block containing a string table blob, and start storing
strings for module codes FUNCTION, GLOBALVAR, ALIAS, IFUNC and COMDAT in
the string table.

This change allows us to share names between globals and comdats as well
as between modules, and improves the efficiency of loading bitcode files by
no longer using a bit encoding for symbol names. Once we start writing the
irsymtab to the bitcode file we will also be able to share strings between
it and the module.

On my machine, link time for Chromium for Linux with ThinLTO decreases by
about 7% for no-op incremental builds or about 1% for full builds. Total
bitcode file size decreases by about 3%.

As discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2017-April/111732.html

Differential Revision: https://reviews.llvm.org/D31838

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300464 91177308-0d34-0410-b5e6-96231b3b80d8

Distinguish between code pointer size and DataLayout::getPointerSize() in DWARF info generation

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300463 91177308-0d34-0410-b5e6-96231b3b80d8

AArch64: support nonlazybind

It's almost certainly not a good idea to actually use it in most cases (there's
a pretty large code size overhead on AArch64), but we can't do those
experiments until it's supported.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300462 91177308-0d34-0410-b5e6-96231b3b80d8

Introduce APInt::isSignBitSet/isSignBitClear. Use in place isSignBitSet in place of isNegative in known bits tracking.

This makes statements like KnownZero.isNegative() (which means the value we're tracking is positive) less confusing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300457 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: SimplifyDemandedElts for image intrinsics

Causes some VGPR usage improvements in shaderdb, but
introduces some SGPR spilling regressions due to random
scheduling changes later.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300453 91177308-0d34-0410-b5e6-96231b3b80d8

[LCSSA] Don't insert tokens into the worklist at all.

We're gonna skip them anyway, so there's no point in inserting them
in the first place.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300452 91177308-0d34-0410-b5e6-96231b3b80d8

Introducing LLVMMetadataRef

Summary:
This seems like an uncontroversial first step toward providing access to the metadata hierarchy that now exists in LLVM. This should allow for good debug info support from C.

Future plans are to deprecate API that take mixed bags of values and metadata (mainly the LLVMMDNode family of functions) and migrate the rest toward the use of LLVMMetadataRef.

Once this is in place, mapping of DIBuilder will be able to start.

Reviewers: mehdi_amini, echristo, whitequark, jketema, Wallbraker

Reviewed By: Wallbraker

Subscribers: Eugene.Zelenko, axw, mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D19448

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300447 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopPeeling] Get rid of Phis that become invariant after N steps

This patch is a generalization of the improvement introduced in rL296898.
Previously, we were able to peel one iteration of a loop to get rid of a Phi that becomes
an invariant on the 2nd iteration. In more general case, if a Phi becomes invariant after
N iterations, we can peel N times and turn it into invariant.
In order to do this, we for every Phi in loop's header we define the Invariant Depth value
which is calculated as follows:

Given %x = phi <Inputs from above the loop>, ..., [%y, %back.edge].

If %y is a loop invariant, then Depth(%x) = 1.
If %y is a Phi from the loop header, Depth(%x) = Depth(%y) + 1.
Otherwise, Depth(%x) is infinite.
Notice that if we peel a loop, all Phis with Depth = 1 become invariants,
and all other Phis with finite depth decrease the depth by 1.
Thus, peeling N first iterations allows us to turn all Phis with Depth <= N
into invariants.

Reviewers: reames, apilipenko, mkuper, skatkov, anna, sanjoy

Reviewed By: sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31613

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300446 91177308-0d34-0410-b5e6-96231b3b80d8

[BPI] NFC: reorder ifs to bail out earlier

This is non-functional change to re-order if statements to bail out earlier
from unreachable and ColdCall heuristics.

Reviewers: sanjoy, reames, junbuml, vsk, chandlerc

Reviewed By: chandlerc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31704

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300442 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopPeeling] Fix condition for phi-eliminating peeling

When peeling loops basing on phis becoming invariants, we make a wrong loop size check.
UP.Threshold should be compared against the total numbers of instructions after the transformation,
which is equal to 2 * LoopSize in case of peeling one iteration.
We should also check that the maximum allowed number of peeled iterations is not zero.

Reviewers: sanjoy, anna, reames, mkuper

Reviewed By: mkuper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31753

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300441 91177308-0d34-0410-b5e6-96231b3b80d8

[BPI] Use metadata info before any other heuristics

Metadata potentially is more precise than any heuristics we use, so
it makes sense to use first metadata info if it is available. However it makes
sense to examine it against other strong heuristics like unreachable one.
If edge coming to unreachable block has higher probability then it is expected
by unreachable heuristic then we use heuristic and remaining probability is
distributed among other reachable blocks equally.

An example where metadata might be more strong then unreachable heuristic is
as follows: it is possible that there are two branches and for the branch A
metadata says that its probability is (0, 2^25). For the branch B
the probability is (1, 2^25).
So the expectation is that first edge of B is hotter than first edge of A
because first edge of A did not executed at least once.
If first edge of A points to the unreachable block then using the unreachable
heuristics we'll set the probability for A to (1, 2^20) and now edge of A
becomes hotter than edge of B.
This is unexpected behavior.

This fixed the biggest part of https://bugs.llvm.org/show_bug.cgi?id=32214

Reviewers: sanjoy, junbuml, vsk, chandlerc

Reviewed By: chandlerc

Subscribers: llvm-commits, reames, davidxl

Differential Revision: https://reviews.llvm.org/D30631

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300440 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Simplify 1/X for vectors.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300439 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Add test cases for missing support for simplifying 1/X for vectors. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300438 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Add support for vector srem->urem.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300437 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Add missing testcases for srem->urem conversion. The vector version isn't currently supported. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300436 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Add support for turning vector sdiv into udiv.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300435 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Add test cases for missing support for turning vector sdiv into udiv. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300434 91177308-0d34-0410-b5e6-96231b3b80d8

[LCSSA] Simplify a loop. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300433 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine][ValueTracking] When computing known bits for Srem make sure we don't compute known bits for the LHS twice.

If we already called computeKnownBits for the RHS being a constant power of 2, we've already computed everything we can and should just stop. I think previously we would still recurse if we had determined the result was negative or had not determined the sign bit at all.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300432 91177308-0d34-0410-b5e6-96231b3b80d8

[LCSSA] Fix non-determinism due to iterating over a SmallPtrSet.

Use a SmallSetVector instead.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300431 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] In SimplifyDemandedUseBits, don't bother to mask known bits of constants with DemandedMask.

Just because we didn't demand them doesn't mean they aren't known.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300430 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove special handling for 16 bit for A asm constraints.

Our 16 bit support is assembler-only + the terrible hack that is
.code16gcc. Simply using 32 bit registers does the right thing for the
latter.

Fixes PR32681.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300429 91177308-0d34-0410-b5e6-96231b3b80d8

MemorySSA: Stop tracking def-or-use blocks.

The tracking is unused, since MemoryPhis are not pruned as of r282419.

Differential Revision: https://reviews.llvm.org/D32121

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300428 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] improve getTrue/getFalse; NFCI

The ConstantInt version has the same assert, and using null/allOnes is likely less efficient.
The only advantage of these local variants (and there's probably a better way to achieve this?)
is to save typing "ConstantInt::" over and over.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300426 91177308-0d34-0410-b5e6-96231b3b80d8

Garbage collect HAVE_EXECINFO_H from config.h.cmake after r300062. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300425 91177308-0d34-0410-b5e6-96231b3b80d8

[Constants] simplify get true/false code; NFCI

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300424 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][X86 intrinsics]Folding cmp(sub(a,b),0) into cmp(a,b) optimization

This patch adds new optimization (Folding cmp(sub(a,b),0) into cmp(a,b))
to instCombineCall pass and was written specific for X86 CMP intrinsics.

Differential Revision: https://reviews.llvm.org/D31398

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300422 91177308-0d34-0410-b5e6-96231b3b80d8

[APInt] Fix a bug in lshr by a value more than 64 bits above the bit width.

This was throwing an assert because we determined the intra-word shift amount by subtracting the size of the full word shift from the total shift amount. But we failed to account for the fact that we clipped the full word shifts by total words first. To fix this just calculate the intra-word shift as the remainder of dividing by bits per word.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300405 91177308-0d34-0410-b5e6-96231b3b80d8

Use correct registers for "A" inline asm constraint

Summary:
In PR32594, inline assembly using the 'A' constraint on x86_64 causes
llvm to crash with a "Cannot select" stack trace. This is because
`X86TargetLowering::getRegForInlineAsmConstraint` hardcodes that 'A'
means the EAX and EDX registers.

However, on x86_64 it means the RAX and RDX registers, and on 16-bit x86
(ia16?) it means the old AX and DX registers.

Add new register classes in `X86RegisterInfo.td` to support these cases,
and amend the logic in `getRegForInlineAsmConstraint` to cope with
different subtargets. Also add a test case, derived from PR32594.

Reviewers: craig.topper, qcolombet, RKSimon, ab

Reviewed By: ab

Subscribers: ab, emaste, royger, llvm-commits

Differential Revision: https://reviews.llvm.org/D31902

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300404 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] allow (X != C1 && X != C2) and similar patterns to match splat vector constants

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300402 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add tests to show missing transforms for vectors; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300401 91177308-0d34-0410-b5e6-96231b3b80d8

Tidy checking for the soft float attribute.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300394 91177308-0d34-0410-b5e6-96231b3b80d8

Cache the DataLayout rather than looking it up frequently.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300393 91177308-0d34-0410-b5e6-96231b3b80d8

[ProfileData] Unify getInstrProf*SectionName helpers

This is a version of D32090 that unifies all of the
`getInstrProf*SectionName` helper functions. (Note: the build failures
which D32090 would have addressed were fixed with r300352.)

We should unify these helper functions because they are hard to use in
their current form. E.g we recently introduced more helpers to fix
section naming for COFF files. This scheme doesn't totally succeed at
hiding low-level details about section naming, so we should switch to an
API that is easier to maintain.

This is not an NFC commit because it fixes llvm-cov's testing support
for COFF files (this falls out of the API change naturally). This is an
area where we lack tests -- I will see about adding one as a follow up.

Testing: check-clang, check-profile, check-llvm.

Differential Revision: https://reviews.llvm.org/D32097

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300381 91177308-0d34-0410-b5e6-96231b3b80d8

Generalize SCEV's unit testing helper a bit

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300379 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] MakeAnd/Or/Xor handling to reuse previous APInt computations

When checking if we should return a constant, we create some temporary APInts to see if we know all bits. But the exact computations we do are needed in several other locations in the same code.

This patch moves them to named temporaries so we can reuse them.

Ideally we'd write directly to KnownZero/One, but we currently seem to only write those variables after all the simplifications checks and I didn't want to change that with this patch.

Differential Revision: https://reviews.llvm.org/D32094

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300376 91177308-0d34-0410-b5e6-96231b3b80d8

[RDF] No longer ignore implicit defs or uses on any instructions

This used to be a Hexagon-specific treatment, but is no longer needed
since it's switched to subregister liveness tracking.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300369 91177308-0d34-0410-b5e6-96231b3b80d8

[RDF] Correctly enumerate reg units for reg masks

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300368 91177308-0d34-0410-b5e6-96231b3b80d8

[IR] Make paramHasAttr to use arg indices instead of attr indices

This avoids the confusing 'CS.paramHasAttr(ArgNo + 1, Foo)' pattern.

Previously we were testing return value attributes with index 0, so I
introduced hasReturnAttr() for that use case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300367 91177308-0d34-0410-b5e6-96231b3b80d8

[libFuzzer] more trophies

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300366 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Improve readobj and nm support for wasm

Now that the libObect support for wasm is better we can
have readobj and nm produce more useful output too.

Differential Revision: https://reviews.llvm.org/D31514

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300365 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] (X != C1 && X != C2) --> (X | (C1 ^ C2)) != C2
...when C1 differs from C2 by one bit and C1 <u C2:
http://rise4fun.com/Alive/Vuo

And move related folds to a helper function. This reduces code duplication and
will make it easier to remove the scalar-only restriction as a follow-up step.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300364 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Support folding a subtract with a constant LHS into a phi node

We currently only support folding a subtract into a select but not a PHI. This fixes that.

I had to fix an assumption in FoldOpIntoPhi that assumed the PHI node was always in operand 0. Now we pass it in like we do for FoldOpIntoSelect. But we still require some dancing to find the Constant when we create the BinOp or ConstantExpr. This is based code is similar to what we do for selects.

Since I touched all call sites, this also renames FoldOpIntoPhi to foldOpIntoPhi to match coding standards.

Differential Revision: https://reviews.llvm.org/D31686

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300363 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] set read_only access qualifier for pointers

If a kernel's pointer argument is known to be readonly
set access qualifier accordingly. This allows RT not to
flush caches before dispatches.

Differential Revision: https://reviews.llvm.org/D32091

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300362 91177308-0d34-0410-b5e6-96231b3b80d8

[Test commit] Cleanup some whitespace in a test file

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300361 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Regenerate test checks using script. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300360 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add/move tests for and/or-of-icmps equality folds; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300357 91177308-0d34-0410-b5e6-96231b3b80d8

[ValueTracking] Avoid undefined behavior in unittest by not making a named ArrayRef from a std::initializer_list

One of the ValueTracking unittests creates a named ArrayRef initialized by a std::initializer_list. The underlying array for an std::initializer_list is only guaranteed to have a lifetime as long as the initializer_list object itself. So this can leave the ArrayRef pointing at an array that no long exists.

This fixes this to just create an explicit array instead of an ArrayRef.

Differential Revision: https://reviews.llvm.org/D32089

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300354 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Refactor SimplifyUsingDistributiveLaws to more explicitly skip code when LHS/RHS aren't BinaryOperators

Currently this code always makes 2 or 3 calls to tryFactorization regardless of whether the LHS/RHS are BinaryOperators. We make 3 calls when both operands are BinaryOperators with the same opcode. Or surprisingly, when neither are BinaryOperators. This is because getBinOpsForFactorization returns Instruction::BinaryOpsEnd when the operand is not a BinaryOperator. If both LHS and RHS are not BinaryOperators then they both have an Opcode of Instruction::BinaryOpsEnd. When this happens we rely on tryFactorization to early out due to A/B/C/D being null. Similar behavior occurs for the other calls, we rely on getBinOpsForFactorization having made A/B or C/D null to get tryFactorization to early out.

We also rely on these null checks to check the result of getIdentityValue and early out for it.

This patches refactors this to pull these checks up to SimplifyUsingDistributiveLaws so we don't rely on BinaryOpsEnd as a sentinel or this A/B/C/D null behavior. I think this makes this code easier to reason about. Should also give a tiny performance improvement for cases where the LHS or RHS isn't a BinaryOperator.

Differential Revision: https://reviews.llvm.org/D31913

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300353 91177308-0d34-0410-b5e6-96231b3b80d8

[Profile] Make host tool aware of object format when quering prof section names

Differential Revision: https://reviews.llvm.org/D32073

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300352 91177308-0d34-0410-b5e6-96231b3b80d8

Update tests for the patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300351 91177308-0d34-0410-b5e6-96231b3b80d8

Use range-for in a few places

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300350 91177308-0d34-0410-b5e6-96231b3b80d8

Rewrite SCEV Normalization using SCEVRewriteVisitor; NFC

Removes all of the boilerplate, cache management etc. from
ScalarEvolutionNormalization, and keeps only the interesting bits.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300349 91177308-0d34-0410-b5e6-96231b3b80d8

Make SCEVRewriteVisitor smarter about when it trys to create SCEVs

This change really saves just one foldingset lookup, but makes
SCEVRewriteVisitor "feature compatible" with the handwritten logic in
ScalarEvolutionNormalization, so that I can change
ScalarEvolutionNormalization to use SCEVRewriteVisitor in a next step.

This is a non-functional change, but _may_ improve performance in some
pathological cases, but that's unlikely.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300348 91177308-0d34-0410-b5e6-96231b3b80d8