Max Kazantsev [Wed, 17 May 2017 04:09:14 +0000 (04:09 +0000)]
[SCEV] Always sort AddRecExprs from different loops by dominance
Sorting AddRecExprs by loop nesting does not make sense, since we only invoke
CompareSCEVComplexity for AddRecExprs that are used by one SCEV. This
guarantees that there is always a dominance relationship between them. This
patch removes the sorting by nesting, which is dead code given the current
usage of this function.
Gor Nishanov [Wed, 17 May 2017 03:09:22 +0000 (03:09 +0000)]
[coroutines] Handle spills before catchswitch
If we need to spill the result of a PHI instruction, we insert the spill after
all of the PHIs and EHPads; however, in a catchswitch block there is no
room to insert the spill. Make room by splitting the catchswitch away into a
separate block.
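As a minimal sketch (hypothetical names, not the patch's test case), the "no room" constraint looks like this: the catchswitch must be the only non-PHI instruction in its block and is also that block's terminator, so a spill of %val has no legal insertion point until the catchswitch is split into its own block.
```
declare i32 @__CxxFrameHandler3(...)
declare void @may_throw()

define void @f(i1 %c) personality i32 (...)* @__CxxFrameHandler3 {
entry:
  br i1 %c, label %bb1, label %bb2
bb1:
  invoke void @may_throw() to label %cont unwind label %catch.dispatch
bb2:
  invoke void @may_throw() to label %cont unwind label %catch.dispatch
catch.dispatch:
  ; %val may need a spill to the coroutine frame, but the catchswitch
  ; must be the only non-PHI instruction here, leaving no room for it
  %val = phi i32 [ 1, %bb1 ], [ 2, %bb2 ]
  %cs = catchswitch within none [label %handler] unwind to caller
handler:
  %cp = catchpad within %cs [i8* null, i32 64, i8* null]
  catchret from %cp to label %cont
cont:
  ret void
}
```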
Zachary Turner [Tue, 16 May 2017 22:50:32 +0000 (22:50 +0000)]
[Support] Ignore OutputDebugString exceptions in our crash recovery.
Since we use AddVectoredExceptionHandler, we get notified of
every exception that gets raised by a program. These are not
necessarily errors, though, especially when linking against a
library that we have no control over and that may raise an
exception internally which it intends to catch.
In particular, the Windows API OutputDebugString does exactly
this. It raises an exception inside a __try / __except, giving
the debugger a chance to handle the exception and print the
message to the debug console.
But this doesn't interoperate nicely with our vectored exception
handler, which just sees another exception and decides that we
need to terminate the program.
Add a special case for this so that we ignore ODS exceptions
and continue normally.
Note that a better fix is to simply not use vectored exception
handlers and use SEH instead, but given that MinGW doesn't support
SEH, this is the only solution for MinGW.
Adrian McCarthy [Tue, 16 May 2017 22:11:25 +0000 (22:11 +0000)]
Add test for FixedStreamArrayIterator::operator->
The operator-> implementation comes from iterator_facade_base, so it should
just work given that the iterator has a tested operator*. But r302257 showed
that it required careful handling of the const qualifier. This patch ensures
the fix in r302257 doesn't regress.
Sanjay Patel [Tue, 16 May 2017 21:51:04 +0000 (21:51 +0000)]
[InstSimplify] add folds for constant mask of value shifted by constant
We would eventually catch these via demanded bits and computing known bits in InstCombine,
but I think it's better to handle the simple cases as soon as possible as a matter of efficiency.
This fold allows further simplifications based on distributed-ops transforms, e.g.:
%a = lshr i8 %x, 7
%b = or i8 %a, 2
%c = and i8 %b, 1
Since %a can only be 0 or 1, or'ing in the 2 bit and then masking with 1 returns %a unchanged, so InstSimplify can now directly fold this to:
%a = lshr i8 %x, 7
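For reference, a self-contained version of the example (hypothetical function name) that opt -instsimplify now reduces to returning %a:
```
define i8 @fold(i8 %x) {
  %a = lshr i8 %x, 7   ; %a is 0 or 1
  %b = or i8 %a, 2
  %c = and i8 %b, 1    ; masking with 1 gives back %a
  ret i8 %c
}
```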
Easwaran Raman [Tue, 16 May 2017 21:18:09 +0000 (21:18 +0000)]
[Inliner] Do not mix callsite and callee hotness based updates.
Update the threshold based on the callee's hotness only when BFI is not
available. Otherwise use only the callsite's hotness. This makes it easier to
reason about hotness-related threshold updates.
The lowering pipeline is:
llvm.ppc.cfence in IR -> PPC::CFENCE8 in isel -> Actual instructions in
expandPostRAPseudo.
The reason expandPostRAPseudo is chosen is that earlier passes are likely to
eliminate instructions like cmpw 3, 3 (early CSE) and bne- 7, .+4 (some
branch pass(es)).
Easwaran Raman [Tue, 16 May 2017 20:14:39 +0000 (20:14 +0000)]
Add hasProfileSummary and has{Sample|Instrumentation}Profile methods
ProfileSummaryInfo already checks whether the module has a sample profile
when determining profile counts. This will also be useful in the inliner to
clean up threshold updates.
Nirav Dave [Tue, 16 May 2017 19:43:56 +0000 (19:43 +0000)]
Elide stores which are overwritten without being observed.
Summary:
In SelectionDAG, when a store is immediately chained to another store
to the same address, elide the first store as it has no observable
effects. This causes small improvements when dealing with intrinsics
lowered to stores.
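As an illustration (hypothetical function, a sketch rather than one of the changed tests), at the IR level this targets patterns like:
```
define void @overwrite(i32* %p) {
  ; the first store is never observable: in the DAG it is chained
  ; directly to the second store to the same address and is elided
  store i32 1, i32* %p
  store i32 2, i32* %p
  ret void
}
```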
Test notes:
* Many testcases overwrite store addresses multiple times and needed
minor changes, mainly making stores volatile to prevent the
optimization from optimizing the test away.
* Many X86 test cases optimized out instructions associated with va_start.
* Note that test_splat in CodeGen/AArch64/misched-stp.ll no longer has
dependencies to check and can probably be removed and potentially
replaced with another test.
Sanjay Patel [Tue, 16 May 2017 16:30:46 +0000 (16:30 +0000)]
[InstCombine] add motivational comment for tests; NFC
The referenced tests are derived from:
https://bugs.llvm.org/show_bug.cgi?id=32791
and:
https://reviews.llvm.org/D33172
The motivation for including negative tests may not be clear, so I'm adding an explanatory comment here.
In the post-commit thread for r303133:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170515/453793.html
...it was mentioned that we don't want to add redundant tests. This is a valid point. But in this case,
we have a patch under review (D33172) that demonstrates that no existing regression tests are affected by
a proposed code change, but these are. Therefore, I think these tests have value not visible in any
existing regression tests regardless of whether they show a transform.
[AMDGPU] Cache live-ins and register pressure in scheduler
Using LIS can be quite expensive, so caching of calculated region
live-ins and pressure is implemented. It does two things:
1. Caches the info for the second stage when we schedule with
decreased target occupancy.
2. Tracks the basic block from top to bottom, thus eliminating the
need to scan the liveness of the whole register file at every region
split in the middle of the block.
The scheduling is now done in three stages instead of two, with the first
one really being a no-op, used only to collect scheduling regions as sent
by the scheduler driver.
There is no functional change to the current behavior; only compilation
speed is affected. In general, computeBlockPressure() could be simplified
if we switched to a backward RP tracker, because the scheduler sends regions
within a block starting from the last one and moving upward. We could use the
natural order of the upward tracker to seamlessly change between regions of
the same block, since the live reg set of a previously tracked region would
become the live-out of the next region. That, however, requires fixing the
upward tracker to properly account for defs and uses of the same instruction,
as both contribute to the current pressure. Once we converge on the produced
pressure, we should be able to switch between the two back and forth. In
addition, the backward tracker is less expensive, as it uses LIS in recede()
less often than the forward tracker does in advance().
At the moment the worst known case compilation time has improved from 26
minutes to 8.5.
Lama Saba [Tue, 16 May 2017 16:01:36 +0000 (16:01 +0000)]
[X86] Replace slow LEA instructions in X86
According to Intel's Optimization Reference Manual for SNB+:
" For LEA instructions with three source operands and some specific situations, instruction latency has increased to 3 cycles, and must
dispatch via port 1:
- LEA that has all three source operands: base, index, and offset
- LEA that uses base and index registers where the base is EBP, RBP, or R13
- LEA that uses RIP relative addressing mode
- LEA that uses 16-bit addressing mode "
This patch currently handles only the first two cases.
[AMDGPU] Turn register pressure estimation into forward tracker
This factors the register pressure estimation mechanism out of
GCNSchedStrategy into the forward tracker, to unify the interface
with other strategies and to expose it to other interested phases.
Gor Nishanov [Tue, 16 May 2017 14:11:39 +0000 (14:11 +0000)]
[coroutines] Handle unwind edge splitting
Summary:
The RewritePHIs algorithm used when building the CoroFrame inserts a placeholder
```
%placeholder = phi [%val]
```
on every edge leading to a block that starts with a PHI node with multiple
incoming edges, so that if one of the incoming values was spilled and needs to
be reloaded, we have a place to insert the reload. We use the SplitEdge helper
function to split the incoming edge.
SplitEdge does not deal with unwind edges coming into a block with an EHPad.
This patch adds an ehAwareSplitEdge function that can correctly split the unwind edge.
For landing pads, we clone the landing pad into every edge block and replace the original
landing pad with a PHI collecting the values from all incoming landing pads.
For WinEH pads, we keep the original EHPad in place and insert cleanuppad/cleanupret in the
edge blocks.
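Here is a minimal sketch of the landing-pad case after splitting (hypothetical names, not the patch's test): each unwind edge gets its own block with a cloned landing pad, and the original pad becomes a PHI, so reload placeholders can now be placed in the edge blocks.
```
declare i32 @__gxx_personality_v0(...)
declare void @may_throw()

define void @g(i1 %c) personality i32 (...)* @__gxx_personality_v0 {
entry:
  br i1 %c, label %pred1, label %pred2
pred1:
  invoke void @may_throw() to label %exit unwind label %edge1
pred2:
  invoke void @may_throw() to label %exit unwind label %edge2
edge1:                      ; split edge block with cloned landing pad
  %lp1 = landingpad { i8*, i32 } cleanup
  br label %ehblock
edge2:                      ; split edge block with cloned landing pad
  %lp2 = landingpad { i8*, i32 } cleanup
  br label %ehblock
ehblock:
  ; the original landing pad is replaced by a PHI collecting the
  ; values from the cloned pads in the edge blocks
  %val = phi i32 [ 1, %edge1 ], [ 2, %edge2 ]
  %lp = phi { i8*, i32 } [ %lp1, %edge1 ], [ %lp2, %edge2 ]
  resume { i8*, i32 } %lp
exit:
  ret void
}
```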
George Rimar [Tue, 16 May 2017 12:30:59 +0000 (12:30 +0000)]
[DWARF] - Use DWARFAddressRange struct instead of uint64_t pair for DWARFAddressRangesVector.
Recommit of r303159 "[DWARF] - Use DWARFAddressRange struct instead of uint64_t pair for DWARFAddressRangesVector"
All places have been switched to use DWARFAddressRange now.
George Rimar [Tue, 16 May 2017 12:05:03 +0000 (12:05 +0000)]
Revert r303159 "[DWARF] - Use DWARFAddressRange struct instead of uint64_t pair for DWARFAddressRangesVector."
Something went wrong: it broke a buildbot.
http://green.lab.llvm.org/green//job/clang-stage1-cmake-RA-incremental_build/38477/consoleFull#-200034420049ba4694-19c4-4d7e-bec5-911270d8a58c
James Henderson [Tue, 16 May 2017 09:43:21 +0000 (09:43 +0000)]
[LTO] Print time-passes information at conclusion of LTO codegen
At the moment, the information collected when -time-passes is requested is only
printed when llvm_shutdown is called. This means that when linking against the LTO
library dynamically and using the C interface, it is not possible to see the timing
information, because llvm_shutdown cannot be called. This change modifies the LTO
code generation functions for both regular LTO and ThinLTO to explicitly print and
reset the timing information.
I have tested that this works with our proprietary linker. However, as this relies
on a specific method of building and linking against the LTO library, I'm not sure
how or if this can be tested in the LLVM testsuite.
Max Kazantsev [Tue, 16 May 2017 07:27:06 +0000 (07:27 +0000)]
[SCEV] Fix sorting order for AddRecExprs
The existing sorting order defined in CompareSCEVComplexity sorts AddRecExprs
by loop depth, but does not pay attention to dominance between loops. This can
lead us to the following buggy situation:
for (...) { // loop1
op1 = {A,+,B}
}
for (...) { // loop2
op2 = {A,+,B}
S = add op1, op2
}
In this case there is no guarantee that in the operand list of S op2 comes
before op1 (the loop depth is the same, so they will just be sorted
lexicographically), so we can incorrectly treat S as a recurrence of loop1,
which is wrong.
This patch changes the sorting logic so that it places the dominated recs
before the dominating recs. This ensures that when we pick the first recurrence
in the operand order, it will be the bottom-most in terms of the dominator tree.
The attached test set includes some tests that produce incorrect SCEV
estimations or crashes with the old logic.
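For reference, a concrete IR rendering of the pseudocode above (hypothetical names; a minimal sketch, not taken from the attached tests): op1 is {A,+,B}<loop1>, op2 is {A,+,B}<loop2>, loop1 dominates loop2, and S must be treated as a recurrence of the dominated loop2.
```
define i32 @h(i32 %A, i32 %B, i32 %n) {
entry:
  br label %loop1
loop1:                               ; op1 = {A,+,B}<loop1>
  %op1 = phi i32 [ %A, %entry ], [ %op1.next, %loop1 ]
  %i = phi i32 [ 0, %entry ], [ %i.next, %loop1 ]
  %op1.next = add i32 %op1, %B
  %i.next = add i32 %i, 1
  %c1 = icmp slt i32 %i.next, %n
  br i1 %c1, label %loop1, label %ph
ph:
  br label %loop2
loop2:                               ; op2 = {A,+,B}<loop2>
  %op2 = phi i32 [ %A, %ph ], [ %op2.next, %loop2 ]
  %j = phi i32 [ 0, %ph ], [ %j.next, %loop2 ]
  %op2.next = add i32 %op2, %B
  %S = add i32 %op1, %op2            ; must not become an addrec of loop1
  %j.next = add i32 %j, 1
  %c2 = icmp slt i32 %j.next, %n
  br i1 %c2, label %loop2, label %exit
exit:
  ret i32 %S
}
```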
IR: Give function GlobalValue::getRealLinkageName() a less misleading name: dropLLVMManglingEscape().
This function gives the wrong answer on some non-ELF platforms in some
cases. The function that does the right thing lives in Mangler.h. To try to
discourage people from using this function, give it a different name.
[ShrinkWrapping] Handle restores on no-return paths
Shrink-wrapping uses post-dominators to find a restore point that
post-dominates all the uses of CSR / stack.
The way dominator trees are modeled in LLVM today, unreachable blocks are not
present in a generic dominator tree, so an unreachable node is dominated by
anything (see include/llvm/Support/GenericDomTree.h:467).
Since for post-dominators a no-return block is considered
"unreachable", calling findNearestCommonDominator on an unreachable node
A and a non-unreachable node B will return B, which can be false. If we
find such a node, we bail out, since there is no good restore point
available.
Tim Northover [Mon, 15 May 2017 21:51:38 +0000 (21:51 +0000)]
AArch64: use linker-private symbols for globals in MachO.
We don't use section-relative relocations on AArch64, so all symbols must be at
least visible to the linker (i.e. properly global or l_whatever, but not
L_whatever).
Adam Nemet [Mon, 15 May 2017 21:15:01 +0000 (21:15 +0000)]
[SLP] Enable 64-bit wide vectorization on AArch64
ARM NEON has native support for half-sized vector registers (64 bits). This
is beneficial, for example, for 2D and 3D graphics. This patch adds the option
to lower MinVecRegSize from 128 via TTI in the SLP Vectorizer.
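As an illustration (hypothetical function; a sketch under the assumption that MinVecRegSize is lowered to 64), a 2D-style pattern the SLP vectorizer may now pack into a single <2 x float> operation:
```
define void @add2d(float* %a, float* %b) {
  ; two adjacent scalar float adds; with 64-bit vectors enabled the
  ; SLP vectorizer may form one <2 x float> load/fadd/store sequence
  %a0 = load float, float* %a
  %b0 = load float, float* %b
  %s0 = fadd float %a0, %b0
  store float %s0, float* %a
  %a1p = getelementptr inbounds float, float* %a, i64 1
  %b1p = getelementptr inbounds float, float* %b, i64 1
  %a1 = load float, float* %a1p
  %b1 = load float, float* %b1p
  %s1 = fadd float %a1, %b1
  store float %s1, float* %a1p
  ret void
}
```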
*** Performance Analysis
This change was motivated by some internal benchmarks but it is also
beneficial on SPEC and the LLVM testsuite.
The results are with -O3 and PGO. A negative percentage is an improvement.
The testsuite was run with a sample size of 4.
** SPEC
* CFP2006/482.sphinx3 -3.34%
A pretty hot loop is SLP vectorized resulting in nice instruction reduction.
This used to be a +22% regression before rL299482.
My current plan is to extend the fix in rL299482 to i16 which brings the
regression down to +2.5%. There are also other problems with the codegen in
this loop so there is further room for improvement.
Hans Wennborg [Mon, 15 May 2017 20:59:32 +0000 (20:59 +0000)]
Revert r302678 "[AArch64] Enable use of reduction intrinsics."
This caused PR33053.
Original commit message:
> The new experimental reduction intrinsics can now be used, so I'm enabling this
> for AArch64. We will need this for SVE anyway, so it makes sense to do this for
> NEON reductions as well.
>
> The existing code to match shufflevector patterns are replaced with a direct
> lowering of the reductions to AArch64-specific nodes. Tests updated with the
> new, simpler, representation.
>
> Differential Revision: https://reviews.llvm.org/D32247
Sanjay Patel [Mon, 15 May 2017 19:16:49 +0000 (19:16 +0000)]
[InstSimplify] restrict icmp fold with 2 sdiv exact operands (PR32949)
These folds were introduced with https://reviews.llvm.org/rL127064 as part of solving:
https://bugs.llvm.org/show_bug.cgi?id=9343
As shown here:
http://rise4fun.com/Alive/C8
...however, the sdiv exact case needs a stronger predicate.
I opted for duplicated code instead of adding another fallthrough because I think that's
easier to read (and edit in case we need/want to restrict/loosen the predicates any more).
This should fix:
https://bugs.llvm.org/show_bug.cgi?id=32949
https://bugs.llvm.org/show_bug.cgi?id=32948
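To illustrate why the stronger predicate is needed, a hedged sketch (hypothetical functions): the equality form still folds for any divisor, but a signed relational compare inverts when the shared exact divisor is negative.
```
define i1 @eq(i32 %x, i32 %y) {
  ; x = dx * -3 and y = dy * -3 exactly, so dx == dy iff x == y:
  ; the equality fold is safe for any non-zero divisor
  %dx = sdiv exact i32 %x, -3
  %dy = sdiv exact i32 %y, -3
  %c = icmp eq i32 %dx, %dy
  ret i1 %c
}

define i1 @slt(i32 %x, i32 %y) {
  ; NOT foldable to icmp slt %x, %y: with x = 3, y = 6 we get
  ; dx = -1, dy = -2, so dx < dy is false while x < y is true
  %dx = sdiv exact i32 %x, -3
  %dy = sdiv exact i32 %y, -3
  %c = icmp slt i32 %dx, %dy
  ret i1 %c
}
```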
Davide Italiano [Mon, 15 May 2017 18:50:53 +0000 (18:50 +0000)]
[NewGVN] Fix verification of MemoryPhis in verifyMemoryCongruency().
verifyMemoryCongruency() filters out trivially dead MemoryDef(s),
as we find them immediately dead, before moving from TOP to a new
congruence class.
This fixes the same problem for phis, skipping MemoryPhis if all
their operands are dead.
[libFuzzer] Fix a warning from -Wunreachable-code-loop-increment reported by Christian Holler. This also fixes a logical bug which, however, does not affect libFuzzer's ability much (I wasn't able to create a differentiating test).
Kyle Butt [Mon, 15 May 2017 17:30:47 +0000 (17:30 +0000)]
CodeGen: BlockPlacement: Increase tail duplication size for O3.
At O3 we are more willing to increase size if we believe it will improve
performance. The current threshold for tail-duplication of 2 instructions is
conservative, and can be relaxed at O3.
Benchmark results:
llvm test-suite:
6% improvement in aha, due to duplication of loop latch
3% improvement in hexxagon
2% slowdown in lpbench. Seems related, but couldn't completely diagnose.
Internal google benchmark:
Produces 4% improvement on internal google protocol buffer serialization
benchmarks.
Florian Hahn [Mon, 15 May 2017 15:15:22 +0000 (15:15 +0000)]
[AArch64] Enable FeatureFuseAES on Cortex-A72.
This patch enables fusing dependent AESE/AESMC and AESD/AESIMC
instruction pairs on Cortex-A72, as recommended in the Software
Optimization Guide, section 4.10.
John Brawn [Mon, 15 May 2017 11:57:54 +0000 (11:57 +0000)]
[ARM] Mark LEApcrel instructions as isAsCheapAsAMove
Doing this means that if an LEApcrel is used in two places we will rematerialize
instead of generating two MOVs. This is particularly useful for printfs using
the same format string, where we want to generate an address into a register
that's going to get corrupted by the call.
John Brawn [Mon, 15 May 2017 11:50:21 +0000 (11:50 +0000)]
[ARM] Mark LEApcrel as not having side effects
Doing this lets us hoist it out of loops, and I've also marked it as
rematerializable, matching the Thumb1 and Thumb2 counterparts.
It looks like marking it as having side effects was just a mistake: the commit
that made that change only mentions LEApcrelJT, and in Thumb1 and Thumb2 only
the LEApcrelJT instructions were marked as having side effects, so the intent
was evidently to mark only LEApcrelJT, and LEApcrel was accidentally marked
as well.
George Rimar [Mon, 15 May 2017 11:45:28 +0000 (11:45 +0000)]
[DWARF] - Speedup handling of relocations in DWARFContextInMemory.
I am working on a speedup of building .gdb_index in LLD and
noticed that relocations processed in DWARFContextInMemory often use
the same symbol several times in a row. This patch introduces caching to reduce
the relocation processing time.
For the benchmark,
I took debug LLC binary objects configured with -ggnu-pubnames and linked them using LLD.
Link time without --gdb-index is about 4.45s.
Link time with --gdb-index: a) Without patch: 19.16s b) With patch: 15.52s
That means the time spent on --gdb-index in this configuration is
19.16s - 4.45s = 14.71s (without patch) vs 15.52s - 4.45s = 11.07s (with patch).
Ayman Musa [Mon, 15 May 2017 11:30:54 +0000 (11:30 +0000)]
[X86] Relocate the code that replaces masked memory intrinsics unsupported by the subtarget, so that it also runs at -O0.
Currently, when masked load, store, gather or scatter intrinsics are used, we check in the CodeGenPrepare pass whether the subtarget supports these intrinsics, and if not we replace them with scalar code. This is a functional transformation, not an optimization (it is not optional).
CodeGenPrepare does not run when the optimization level is set to CodeGenOpt::None (-O0).
A functional transformation should run at all optimization levels, so here I create a new pass which runs at all optimization levels and does no more than this transformation.
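For illustration, a hedged sketch (hypothetical function) of the kind of intrinsic call that must be scalarized when the subtarget lacks support (e.g. no AVX), regardless of optimization level:
```
declare <4 x i32> @llvm.masked.load.v4i32.p0v4i32(<4 x i32>*, i32, <4 x i1>, <4 x i32>)

define <4 x i32> @load4(<4 x i32>* %p, <4 x i1> %m, <4 x i32> %pt) {
  ; on subtargets without a native masked load, this call must be
  ; expanded into a branchy scalar sequence before instruction selection
  %v = call <4 x i32> @llvm.masked.load.v4i32.p0v4i32(<4 x i32>* %p, i32 4, <4 x i1> %m, <4 x i32> %pt)
  ret <4 x i32> %v
}
```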