As comment 10 on that bug report highlights
(https://bugs.llvm.org//show_bug.cgi?id=32471#c10), there are quite a
few different defensible design tradeoffs that could be made, including
not representing pointers at all in LLT.
I decided to go for representing vector-of-pointer as a concept in LLT,
while keeping the size of the LLT type 64 bits (this is an increase from
48 bits before). My rationale for keeping pointers explicit is that on
some targets it's probably very handy to have the distinction between
pointer and non-pointer (e.g. 68K has a different register bank for
pointers IIRC). If we keep a scalar pointer, it is probably easiest to
also have a vector-of-pointers, to keep LLT relatively conceptually clean
and orthogonal, as long as we don't have a very strong reason to break that
orthogonality. Once we gain more experience with the use of LLT, we can
of course reconsider this direction.
Rejecting vector-of-pointer types in the IRTranslator is also an option
to avoid the crash reported in PR32471, but that is only a very
short-term solution; it also needs quite a few code tweaks in various
places, and is probably fragile. Therefore I didn't consider it the best
option.
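As a rough illustration of the packing involved (a minimal sketch with assumed field names and widths, not the actual LowLevelType layout), pointer-ness, vector-ness, element count, size, and address space can all fit in one 64-bit word:

  #include <cstdint>

  // Hypothetical encoding, for illustration only.
  struct PackedLLT {
    uint64_t IsPointer : 1;     // pointer vs. non-pointer
    uint64_t IsVector : 1;      // vector-of-X vs. plain X
    uint64_t NumElements : 16;  // meaningful only when IsVector is set
    uint64_t SizeInBits : 32;   // scalar (element) size
    uint64_t AddrSpace : 14;    // meaningful only when IsPointer is set
  };
  static_assert(sizeof(PackedLLT) == 8, "still a single 64-bit word");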
[GlobalISel] Remove non-determinism from IRTranslator.
This showed up in r300535/r300537, which were reverted in r300538 because
some of the tests introduced there failed on some bots, owing to the
non-determinism fixed in this commit.
Re-committing r300535/r300537 will add 2 tests for the change in this
commit.
Revert r300657 due to crashes in stage2 of bootstraps:
http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/2476/steps/build-stage2-LLVMgold.so/logs/stdio
http://bb.pgr.jp/builders/clang-3stage-x86_64-linux/builds/15036/steps/build_llvmclang/logs/stdio
I've updated the commit thread, reverting to get the bots back to green.
Original commit summary:
[JumpThread] We want to fold (not thread) when all predecessors go to a single BB's successor.
Tim Shen [Wed, 19 Apr 2017 03:22:50 +0000 (03:22 +0000)]
Cleanup some GraphTraits iteration code
Use children<> and nodes<> in appropriate places to clean up the code.
Also, as part of the cleanup,
change the signature of DominatorTreeBase's Split.
It is a protected non-virtual member function called only twice,
both from within the class,
and the removed passed argument in both cases is '*this'.
The reason for the existence of that argument seems to be that
back before r43115 Split was a free function,
so an argument to get '*this' was needed - but now that is no longer the
case.
ARM: Use methods to access data stored with frame instructions
In r300196 several methods were added to TargetInstrInfo to access
data stored with call frame setup/destroy instructions. This change
replaces calls to getOperand with calls to those special methods in
the ARM target.
ARMFrameLowering: Reserve emergency spill slot for large arguments
We need to reserve an emergency spill slot in cases with large argument
types that could overflow immediate offsets for FP relative address
calculations.
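A minimal sketch of the kind of check involved (the immediate range below is a stand-in value, not ARM's exact addressing-mode limit): when an FP-relative offset can exceed what the instruction encoding reaches, a scratch register, and hence an emergency spill slot, is needed:

  #include <cstdint>

  // Illustrative only: assume a +/-4095 reachable immediate range.
  bool needsEmergencySpillSlot(int64_t MaxFPRelOffset) {
    const int64_t ImmLimit = 4095;
    return MaxFPRelOffset > ImmLimit || MaxFPRelOffset < -ImmLimit;
  }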
[DataLayout] Removed default value from a variable that isn't used without being overwritten. Make variable an enum instead of an int to avoid a cast later. NFC
Allow suppressing host and target info in VersionPrinter
Summary:
VersionPrinter by default outputs information about the Host CPU
and Default target. Printing this information requires linking in
a large amount of data, such as supported target triples as C
strings, which in turn bloats the binary size.
Add a new CMake option, LLVM_VERSION_PRINTER_SHOW_HOST_TARGET_INFO,
which controls printing of the host and target info. This allows
the target triple names to be dead-code stripped. This is a nice
win for LLVM clients that wish to minimize their binary size, such
as graphics drivers.
By default this is ON, so there is no change in the default behavior.
Clients who wish to suppress this printing can do so by setting this
option to off via CMake.
A test app on Linux that uses ParseCommandLineOptions() shows a binary
size reduction of 23KB (from 149K to 126K) for a Release build, and 24KB
(from 135K to 111K) in a MinSizeRel build.
[ConstantRange] Optimize APInt creation in getSignedMax/getSignedMin.
We were creating an APInt at the top of these methods that isn't always returned. For ranges wider than 64 bits this results in an allocation and deallocation when it's not used.
In getSignedMax we were creating Upper-1 to use in a compare and then creating it again for a return value. The compiler is unable to determine that these can be shared. So help it out and create the Upper-1 in a temporary that can be reused.
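A sketch of the resulting pattern (a hypothetical free function, not the actual ConstantRange method):

  #include "llvm/ADT/APInt.h"
  using llvm::APInt;

  APInt signedMaxOfRange(const APInt &Lower, const APInt &Upper) {
    APInt UpperMinusOne = Upper - 1;  // created once
    if (Lower.sgt(UpperMinusOne))     // reused in the compare
      return APInt::getSignedMaxValue(Upper.getBitWidth());
    return UpperMinusOne;             // reused as the return value
  }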
[MemoryBuiltins] Add isMallocOrCallocLikeFn so BasicAA can check for both at the same time
BasicAA wants to know if a function is either a malloc or calloc like function. Currently we have to check both separately. This means both calls check if it's an intrinsic, query TLI, check the nobuiltin attribute, scan the AllocationFnData, etc.
This patch adds an isMallocOrCallocLikeFn so we can go through all of the checks once per call.
This also changes the one other location I saw that called both together.
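For illustration, a caller now looks something like this (the wrapper function is hypothetical; the query comes from llvm/Analysis/MemoryBuiltins.h):

  #include "llvm/Analysis/MemoryBuiltins.h"
  #include "llvm/Analysis/TargetLibraryInfo.h"
  using namespace llvm;

  bool isKnownMallocOrCalloc(const Value *V, const TargetLibraryInfo *TLI) {
    // Before: isMallocLikeFn(V, TLI) || isCallocLikeFn(V, TLI), which
    // runs the intrinsic/TLI/attribute checks twice.
    return isMallocOrCallocLikeFn(V, TLI);
  }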
[X86] Keep EXTRACT_VECTOR_ELT result type as f128 for Android x86_64.
Android x86_64 target uses f128 type and stores f128 values in %xmm* registers.
SoftenFloatRes_EXTRACT_VECTOR_ELT should not convert result value
from f128 to i128.
[SLP vectorizer] Allow phi node reordering in tryToVectorizeList.
In tryToVectorizeList, under a very limited circumstance (when entered
from tryToVectorizePair), the values may be reordered (swapped) and the
SLP tree is built with the new order. This extends that to the case when
starting from phis in vectorizeChainsInBlock when there are exactly two
phis. The textual order of phi nodes shouldn't really matter. Without
this change, the loop body in the accompanying test case is fully vectorized
when we swap the order of the phis but not with this order. While this
doesn't solve the phi-ordering problem in a general way (for more than 2
phis), this is a simple fix that piggybacks on an existing mechanism and
is useful in cases like multiplying two complex numbers.
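For illustration, one source-level shape of that use case (hypothetical code, not the actual test case): a complex dot product whose two accumulators become the pair of header phis whose textual order shouldn't matter:

  void complexDot(float *Re, float *Im, const float *ARe, const float *AIm,
                  const float *BRe, const float *BIm, int N) {
    float SumRe = 0.0f, SumIm = 0.0f;  // become two phis in the loop header
    for (int I = 0; I < N; ++I) {
      SumRe += ARe[I] * BRe[I] - AIm[I] * BIm[I];
      SumIm += ARe[I] * BIm[I] + AIm[I] * BRe[I];
    }
    *Re = SumRe;
    *Im = SumIm;
  }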
Nirav Dave [Tue, 18 Apr 2017 15:36:34 +0000 (15:36 +0000)]
[DAG] Improve store merge candidate pruning.
Remove non-consecutive stores from store merge candidate search as
they cannot be merged and will prevent us from finding subsequent
mergeable store cases.
Gil Rapaport [Tue, 18 Apr 2017 14:43:43 +0000 (14:43 +0000)]
[LV] Cache block mask values
This patch is part of D28975's breakdown.
Add caching for block masks similar to the cache already used for edge masks,
replacing generation per user with reusing the first generated value which
dominates all uses.
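A minimal sketch of the caching pattern (all names are stand-ins, not the vectorizer's actual ones):

  #include <map>

  struct Block;                         // stand-in for llvm::BasicBlock
  struct BlockMask { /* lanes elided */ };

  BlockMask computeBlockMask(const Block *BB) { return BlockMask(); }

  std::map<const Block *, BlockMask> BlockMaskCache;

  // Generate each block's mask once, at a point dominating all uses;
  // every later query reuses that first value.
  const BlockMask &getBlockMask(const Block *BB) {
    auto It = BlockMaskCache.find(BB);
    if (It == BlockMaskCache.end())
      It = BlockMaskCache.emplace(BB, computeBlockMask(BB)).first;
    return It->second;
  }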
Nikolai Bozhenov [Tue, 18 Apr 2017 13:29:26 +0000 (13:29 +0000)]
Make the globalaa-retained.ll test catch more cases.
Summary:
* Add checks for stores. That is needed because GlobalsAA is called
twice in the current pipeline, with different sets of Function passes
following it. However, the loads are eliminated using instcombine,
which happens everywhere. On the other hand, DeadStoreElimination is
performed only once, so by checking for stores we'll be able to catch
more cases where GlobalsAA is invalidated unintentionally.
* Add empty functions above/below the test so that we don't depend on
the relative order of instcombine/dead-store-elimination and the
pass that invalidates the analysis (inside the same
FunctionPassManager).
Reviewers: kristof.beyls
Reviewed By: kristof.beyls
Subscribers: llvm-commits, n.bozhenov
Differential Revision: https://reviews.llvm.org/D32015
Patch by Andrei Elovikov <andrei.elovikov@intel.com>
Oliver Stannard [Tue, 18 Apr 2017 12:52:35 +0000 (12:52 +0000)]
[ARM] Add hardware build attributes in assembler
In the assembler, we should emit build attributes based on the target
selected with command-line options. This matches the GNU assembler's
behaviour. We only do this for build attributes which describe the
hardware that is expected to be available, not the ones that describe
ABI compatibility.
This is done by moving some of the attribute emission code to
ARMTargetStreamer, so that it can be shared between the assembly and
code-generation code paths. Since the assembler only creates a
MCSubtargetInfo, not an ARMSubtarget, the code had to be changed to
check raw features, and not use the convenience functions in
ARMSubtarget.
If different attributes are later specified using the .eabi_attribute
directive, then they will take precedence, as happens when the same
.eabi_attribute is specified twice.
This must be enabled by an option, because we don't want to do this when
parsing inline assembly. The attributes would match the ones emitted at
the start of the file, so wouldn't actually change the emitted object
file, but the extra directives would be added to every inline assembly
block when emitting assembly, which we'd like to avoid.
The majority of the changes in the build-attributes.ll test are just
re-ordering the directives, because the hardware attributes are now
emitted before the ABI ones. However, I did fix one bug which I spotted:
Tag_CPU_arch_profile was not being emitted for v6M.
Andrea Di Biagio [Tue, 18 Apr 2017 10:08:53 +0000 (10:08 +0000)]
[SampleProfile] Skip intrinsic calls when visiting callsites in InlineHotFunctions.
Before this patch, we always called the method 'findCalleeFunctionSamples()' on
intrinsic calls. However, intrinsic calls like llvm.dbg.value() are not viable
candidates for obvious reasons.
Revert "[GlobalISel] Support vector-of-pointers in LLT"
This reverts r300535 and r300537.
The newly added tests in test/CodeGen/AArch64/GlobalISel/arm64-fallback.ll
produce slightly different code when LLVM is built with different compilers.
E.g., depending on the compiler LLVM is built with, either one of the
following can be produced:
remark: <unknown>:0:0: unable to legalize instruction: %vreg0<def>(p0) = G_EXTRACT_VECTOR_ELT %vreg1, %vreg2; (in function: vector_of_pointers_extractelement)
remark: <unknown>:0:0: unable to legalize instruction: %vreg2<def>(p0) = G_EXTRACT_VECTOR_ELT %vreg1, %vreg0; (in function: vector_of_pointers_extractelement)
Non-determinism like this is clearly a bad thing, so reverting this until
I can find and fix the root cause of the non-determinism.
[ARM] Check for correct HW div when lowering divmod
For subtargets that use the custom lowering for divmod, e.g. gnueabi,
we used to check if the subtarget has hardware divide and then lower to
a div-mul-sub sequence if true, or to a libcall if false.
However, judging by the usage of hasDivide vs hasDivideInARMMode, it
seems that hasDivide only refers to Thumb. For instance, in the
ARMTargetLowering constructor, the code that specifies whether to use
libcalls for (S|U)DIV looks like this:
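(A hedged reconstruction of that check, which may not match the original snippet verbatim:)

  bool hasDivide = Subtarget->isThumb() ? Subtarget->hasDivide()
                                        : Subtarget->hasDivideInARMMode();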
In the case of divmod for arm-gnueabi, using only hasDivide() to
determine what to do means that instead of lowering to __aeabi_idivmod
to get the remainder, we lower to div-mul-sub and then further lower the
div to __aeabi_idiv. Even worse, if we have hardware divide in ARM but
not in Thumb, we generate a libcall instead of using it (this is not an
issue in practice since AFAICT none of the cores that we support have
hardware divide in ARM but not Thumb).
This patch fixes the code dealing with custom lowering to take into
account the mode (Thumb or ARM) when deciding whether or not hardware
division is available.
Adrian Prantl [Tue, 18 Apr 2017 01:21:53 +0000 (01:21 +0000)]
PR32382: Fix emitting complex DWARF expressions.
The DWARF specification knows 3 kinds of non-empty simple location
descriptions:
1. Register location descriptions
- describe a variable in a register
- consist of only a DW_OP_reg
2. Memory location descriptions
- describe the address of a variable
3. Implicit location descriptions
- describe the value of a variable
- end with DW_OP_stack_value & friends
The existing DwarfExpression code is pretty much ignorant of these
restrictions. This used to not matter because we only emitted very
short expressions that we happened to get right by accident. This
patch makes DwarfExpression aware of the rules defined by the DWARF
standard and now chooses the right kind of location description for
each expression being emitted.
This would have been an NFC commit (for the existing testsuite) if not
for the way that clang describes captured block variables. Based on
how the previous code in LLVM emitted locations, DW_OP_deref
operations that should have come at the end of the expression are put
at its beginning. Fixing this means changing the semantics of
DIExpression, so this patch bumps the version number of DIExpression
and implements a bitcode upgrade.
There are two major changes in this patch:
1. I had to fix the semantics of dbg.declare for describing function
arguments. After this patch a dbg.declare always takes the *address*
of a variable as the first argument, even if the argument is not an
alloca.
2. When lowering a DBG_VALUE, the decision of whether to emit a register
location description or a memory location description depends on the
MachineLocation — register machine locations may get promoted to
memory locations based on their DIExpression. (Future) optimization
passes that want to salvage implicit debug location for variables may
do so by appending a DW_OP_stack_value. For example:
DBG_VALUE, [RBP-8] --> DW_OP_fbreg -8
DBG_VALUE, RAX --> DW_OP_reg0 +0
DBG_VALUE, RAX, DIExpression(DW_OP_deref) --> DW_OP_reg0 +0
All testcases that were modified were regenerated from clang. I also
added source-based testcases for each of these to the debuginfo-tests
repository over the last week to make sure that no synchronized bugs
slip in. The debuginfo-tests compile from source and run the debugger.
Object: Shrink the size of irsymtab::Symbol by a word. NFCI.
Instead of storing an UncommonIndex on the Symbol, use a flag bit to store
whether the Symbol has an Uncommon. This shrinks Chromium's .bc files (after
D32061) by about 1%.
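A hedged sketch of the scheme (names are hypothetical, not the real irsymtab fields): a flag bit marks which symbols own an entry in a side table of Uncommons, so the common case drops the index word:

  #include <cstdint>

  struct Symbol {
    enum : uint32_t { HasUncommon = 1u << 0 };  // hypothetical flag bit
    uint32_t Flags;
    bool hasUncommon() const { return Flags & HasUncommon; }
    // Symbols with HasUncommon set consume the next entry of a parallel
    // Uncommon array, instead of each Symbol carrying its own index.
  };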
Build SymbolMap in SampleProfileLoader to help match function names with suffixes.
Summary: If there is a suffix added to the function name (e.g. a module hash added by ThinLTO), we will not be able to find a match in the profile, as the suffix does not exist in the profile. This patch builds a map from function name to Function *. The map includes an entry for the stripped function name so that inlineHotFunctions can find the corresponding function to promote/inline.
[SimplifyCFG] Use hasNUses instead of comparing getNumUses to a constant.
The use list is a linked list so getNumUses requires a linear scan through the whole list. hasNUses will stop scanning at N and see if that is the end.
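For illustration (the wrapper is hypothetical; both Value methods are real):

  #include "llvm/IR/Value.h"

  bool hasExactlyOneUse(const llvm::Value *V) {
    // return V->getNumUses() == 1;  // walks the entire use list
    return V->hasNUses(1);           // stops after seeing at most 2 uses
  }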
[APInt] Merge the multiword code from lshrInPlace and tcShiftRight into a single implementation
This merges the two different multiword shift right implementations into a single version located in tcShiftRight. lshrInPlace now calls tcShiftRight for the multiword case.
I retained the memmove fast path from lshrInPlace and used a memset for the zeroing. The for loop is basically tcShiftRight's implementation with the zeroing and the intra-shift of 0 removed.
Jacob Gravelle [Mon, 17 Apr 2017 21:40:28 +0000 (21:40 +0000)]
[WebAssembly] Fix WebAssemblyOptimizeReturned after r300367
Summary:
Refactoring changed paramHasAttr(1 + i) to paramHasAttr(0), fix that to
paramHasAttr(i).
Add more tests to WebAssemblyOptimizeReturned that catch that
regression.
Avoid unexpected crashes when two LLVM versions are used in the same process.
Summary:
This patch adds a very simple linker script to version the lib's symbols,
thus trying to avoid crashes if an application loads two different
LLVM versions (as long as they do not share data between them).
Note that we deliberately *don't* make LLVM_5.0 depend on LLVM_4.0:
they're incompatible, and the whole point of this patch is
to tell the linker that.
Davide Italiano [Mon, 17 Apr 2017 20:49:50 +0000 (20:49 +0000)]
[InstCombine] Matchers work with both ConstExpr and Instructions.
So, `cast<Instruction>` is not guaranteed to succeed. Change the
code so that we create a new constant and use it in the newly
created instruction, as it's done in other places in InstCombine.
Wei Mi [Mon, 17 Apr 2017 20:40:05 +0000 (20:40 +0000)]
[SCEV] Add a local cache for getZeroExtendExpr and getSignExtendExpr to prevent
the exponential behavior.
The patch is to fix PR32043. Functions getZeroExtendExpr and getSignExtendExpr
may call themselves recursively more than once. This is potentially a 2^N
complexity behavior. The exponential behavior was not commonly exposed before
because of existing global cache mechanisms like UniqueSCEVs, or early returns
taken when the flags FlagNSW or FlagNUW are seen. However, we still have cases
that can expose the exponential behavior, like the case in PR32043, so we add
a local cache to getZeroExtendExpr and getSignExtendExpr. If the input of the
functions (a SCEV and type pair) has been seen before, we can find the extended
expression directly in the local cache.
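The effect of such a cache can be seen on any doubly recursive computation (illustrative code only; SCEV's cache is keyed on the (SCEV, Type) pair rather than an integer):

  #include <cstdint>
  #include <map>

  // Two self-calls per step is O(2^N) without memoization; with the
  // local cache each distinct input is computed once.
  uint64_t visit(unsigned N, std::map<unsigned, uint64_t> &Cache) {
    if (N < 2)
      return 1;
    auto It = Cache.find(N);
    if (It != Cache.end())
      return It->second;  // input seen before
    uint64_t R = visit(N - 1, Cache) + visit(N - 2, Cache);
    Cache.emplace(N, R);
    return R;
  }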
Matt Arsenault [Mon, 17 Apr 2017 19:48:24 +0000 (19:48 +0000)]
AMDGPU: Change stack alignment
While the incoming stack for a kernel is 256-byte aligned,
this refers to the base address of the entire wave. This isn't
useful information for most of codegen. Fixes unnecessarily
aligning stack objects in callees.
The splitIndirectCriticalEdges function generates an invalid CFG when the
'Target' basic block is a loop to itself. When this occurs, the code that
updates the predecessor terminator needs to update the terminator in the split
basic block.
This occurs when there is an edge from block D back to D. Since D is split
into D0 and D1, the code needs to update the terminator in D1. But D1 is not in
the OtherPreds vector, so it was not getting updated.
[APInt] Remove self move check from move assignment operator
This was added to work around a bug in MSVC 2013's implementation of stable_sort. That bug has been fixed as of MSVC 2015 so we shouldn't need this anymore.
Technically the current implementation has undefined behavior because we only protect the deleting of the pVal array with the self move check. There is still a memcpy of that.VAL to VAL that isn't protected. In the case of self-move those are the same location, and memcpy is undefined when src and dst overlap.
This reduces the size of the opt binary on my local x86-64 build by about 4k.
[IR] Implement DataLayout::getPointerTypeSizeInBits using getPointerSizeInBits directly
Currently we use getTypeSizeInBits, which contains a switch statement to dispatch based on what the Type is. We know we always have a pointer type here, but the compiler isn't able to figure that out and remove the switch.
This patch changes it to handle the pointer type directly by calling getPointerSizeInBits, without going through a switch.
getPointerTypeSizeInBits is called pretty often, particularly by getOrEnforceKnownAlignment which is used by InstCombine. This should speed that up a little bit.
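A minimal sketch of the shortcut (a hypothetical free function; the DataLayout and Type calls are real):

  #include "llvm/IR/DataLayout.h"
  #include "llvm/IR/Type.h"

  // Ty is known to be a pointer type, so ask for the pointer width of
  // its address space directly instead of dispatching on the type kind.
  unsigned pointerSizeInBits(const llvm::DataLayout &DL, llvm::Type *Ty) {
    return DL.getPointerSizeInBits(Ty->getPointerAddressSpace());
  }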
Tim Northover [Mon, 17 Apr 2017 18:18:47 +0000 (18:18 +0000)]
AArch64: put nonlazybind special handling behind a flag for now.
It's basically a terrible idea anyway but objc_msgSend gets emitted like that.
We can decide on a better way to deal with it in the unlikely event that anyone
actually uses it.
[IR] Put the Use list waymarking bits in the bit positions the documentation says they use
The documentation for the waymarking algorithm says that we use the lower 2 bits of Use::Prev to store the way marking bits. But because we use a PointerIntPair with the default PointerLikeTypeTraits, we're using bits 2:1 on 64-bit targets.
There's also a trick employed for distinguishing Users that have Uses stored with them and Users that have Uses stored in a separate array. The documentation says we use the LSB of the first byte of the real User object or the User* that occurs at the end of the Use array. But again, due to the PointerLikeTypeTraits, we're really using bit 2 (64-bit) or bit 1 (32-bit) and not the LSB. This is a little worrying because the first byte of the User object is the vtable ptr, so we're assuming the vtable has 8-byte or 4-byte alignment where what is documented would only require 2-byte alignment.
This patch provides a custom traits override for these two cases to put the bits where the documentation says they are. It also has the side effect of removing some shifts from the waymarking traversal implementation.
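A hedged sketch of what such a traits override can look like (illustrative, not the actual Use.h code):

  // Declare the two lowest bits of T* free, so a PointerIntPair tag
  // lands in bits 1:0 as the documentation describes.
  template <typename T> struct LowBitTraits {
    static void *getAsVoidPointer(T *P) { return P; }
    static T *getFromVoidPointer(void *P) { return static_cast<T *>(P); }
    enum { NumLowBitsAvailable = 2 };
  };

A PointerIntPair<User *, 2, unsigned, LowBitTraits<User>> would then store its tag in the LSBs instead of bits 2:1.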
Bitcode: Add a string table to the bitcode format.
Add a top-level STRTAB block containing a string table blob, and start storing
strings for module codes FUNCTION, GLOBALVAR, ALIAS, IFUNC and COMDAT in
the string table.
This change allows us to share names between globals and comdats as well
as between modules, and improves the efficiency of loading bitcode files by
no longer using a bit encoding for symbol names. Once we start writing the
irsymtab to the bitcode file we will also be able to share strings between
it and the module.
On my machine, link time for Chromium for Linux with ThinLTO decreases by
about 7% for no-op incremental builds or about 1% for full builds. Total
bitcode file size decreases by about 3%.
As discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2017-April/111732.html
Tim Northover [Mon, 17 Apr 2017 17:27:56 +0000 (17:27 +0000)]
AArch64: support nonlazybind
It's almost certainly not a good idea to actually use it in most cases (there's
a pretty large code size overhead on AArch64), but we can't do those
experiments until it's supported.
Summary:
This seems like an uncontroversial first step toward providing access to the metadata hierarchy that now exists in LLVM. This should allow for good debug info support from C.
Future plans are to deprecate API that take mixed bags of values and metadata (mainly the LLVMMDNode family of functions) and migrate the rest toward the use of LLVMMetadataRef.
Once this is in place, mapping of DIBuilder will be able to start.
Max Kazantsev [Mon, 17 Apr 2017 09:52:02 +0000 (09:52 +0000)]
[LoopPeeling] Get rid of Phis that become invariant after N steps
This patch is a generalization of the improvement introduced in rL296898.
Previously, we were able to peel one iteration of a loop to get rid of a Phi that becomes
an invariant on the 2nd iteration. In the more general case, if a Phi becomes invariant after
N iterations, we can peel N times and turn it into an invariant.
In order to do this, for every Phi in the loop's header we define the Invariant Depth value,
which is calculated as follows:
Given %x = phi <Inputs from above the loop>, ..., [%y, %back.edge].
If %y is a loop invariant, then Depth(%x) = 1.
If %y is a Phi from the loop header, Depth(%x) = Depth(%y) + 1.
Otherwise, Depth(%x) is infinite.
Notice that if we peel a loop, all Phis with Depth = 1 become invariants,
and all other Phis with finite depth decrease the depth by 1.
Thus, peeling the first N iterations allows us to turn all Phis with Depth <= N
into invariants.
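A hedged sketch of this computation (the function is hypothetical; the LLVM APIs it calls are real):

  #include "llvm/Analysis/LoopInfo.h"
  #include "llvm/IR/Instructions.h"
  #include <climits>
  using namespace llvm;

  // Walk the back-edge inputs of a header Phi until hitting a loop
  // invariant; the number of steps is the Phi's invariant depth.
  unsigned invariantDepth(PHINode *Phi, Loop *L) {
    BasicBlock *Latch = L->getLoopLatch();
    Value *V = Phi->getIncomingValueForBlock(Latch);
    // Bound the walk so cycles of phis feeding each other terminate.
    unsigned Limit = L->getHeader()->size();
    for (unsigned Depth = 1; Depth <= Limit; ++Depth) {
      if (L->isLoopInvariant(V))
        return Depth;                          // finite Depth(%x)
      auto *P = dyn_cast<PHINode>(V);
      if (!P || P->getParent() != L->getHeader())
        break;
      V = P->getIncomingValueForBlock(Latch);
    }
    return UINT_MAX;                           // "infinite"
  }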
Max Kazantsev [Mon, 17 Apr 2017 05:38:28 +0000 (05:38 +0000)]
[LoopPeeling] Fix condition for phi-eliminating peeling
When peeling loops based on Phis becoming invariants, we made a wrong loop size check.
UP.Threshold should be compared against the total number of instructions after the
transformation, which is equal to 2 * LoopSize in the case of peeling one iteration.
We should also check that the maximum allowed number of peeled iterations is not zero.
[BPI] Use metadata info before any other heuristics
Metadata is potentially more precise than any heuristics we use, so
it makes sense to use the metadata info first, when it is available. However, it
still makes sense to examine it against other strong heuristics, like the
unreachable one. If an edge coming into an unreachable block has a higher
probability than the unreachable heuristic expects, we use the heuristic, and
the remaining probability is distributed equally among the other reachable
blocks.
An example where metadata might be stronger than the unreachable heuristic is
as follows: it is possible that there are two branches, and for branch A the
metadata says that its probability is (0, 2^25), while for branch B the
probability is (1, 2^25).
So the expectation is that the first edge of B is hotter than the first edge
of A, because the first edge of A was not executed even once.
But if the first edge of A points to an unreachable block, then using the
unreachable heuristic we'd set the probability for A to (1, 2^20), and now the
edge of A becomes hotter than the edge of B.
This is unexpected behavior.
This fixes the biggest part of https://bugs.llvm.org/show_bug.cgi?id=32214