granicus.if.org Git

[PGO] Use template file to define runtime structures

With this change, instrumentation code and reader/write
code related to profile data structs are kept strictly
in-sync. THis will be extended to cfe and compile-rt
references as well.

Differential Revision: http://reviews.llvm.org/D13843

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252113 91177308-0d34-0410-b5e6-96231b3b80d8

Fix Abbrev emission in WriteIdentificationBlock

This Abbrev was not emitted and basically unused, just leacking there.

From: Mehdi Amini <mehdi.amini@apple.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252110 91177308-0d34-0410-b5e6-96231b3b80d8

Fix pr24832.

It is pretty simple now that the yak is shaved.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252105 91177308-0d34-0410-b5e6-96231b3b80d8

Simplify now that emitValueToOffset always returns false.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252102 91177308-0d34-0410-b5e6-96231b3b80d8

Simplify .org processing and make it a bit more powerful.

We now always create the fragment, which lets us handle things like .org after
a .align.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252101 91177308-0d34-0410-b5e6-96231b3b80d8

Define portable macros for packed struct definitions:

1. A macro with argument: LLVM_PACKED(StructDefinition)
2. A pair of macros defining scope of region with packing:
    LLVM_PACKED_START
     struct A { ... };
     struct B { ... };
    LLVM_PACKED_END

Differential Revision: http://reviews.llvm.org/D14337

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252099 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyLibCalls] New transformation: tan(atan(x)) -> x

This is enabled only under -ffast-math.
So, instead of emitting:
  4007b0:       50                      push   %rax
  4007b1:       e8 8a fd ff ff          callq  400540 <atanf@plt>
  4007b6:       58                      pop    %rax
  4007b7:       e9 94 fd ff ff          jmpq   400550 <tanf@plt>
  4007bc:       0f 1f 40 00             nopl   0x0(%rax)

for:
float mytan(float x) {
  return tanf(atanf(x));
}
we emit a single retq.

Differential Revision: http://reviews.llvm.org/D14302

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252098 91177308-0d34-0410-b5e6-96231b3b80d8

[libFuzzer] when choosing the next unit to mutate, give some preference to the most recent units (they are more likely to be interesting)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252097 91177308-0d34-0410-b5e6-96231b3b80d8

fix typo; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252096 91177308-0d34-0410-b5e6-96231b3b80d8

[CaptureTracking] Support operand bundles conservatively

Summary:
Earlier CaptureTracking would assume all "interesting" operands to a
call or invoke were its arguments. With operand bundles this is no
longer true.

Note: an earlier change got `doesNotCapture` working correctly with
operand bundles.

This change uses DSE to test the changes to CaptureTracking. DSE is a
vehicle for testing only, and is not directly involved in this change.

Reviewers: reames, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14306

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252095 91177308-0d34-0410-b5e6-96231b3b80d8

[CMake] Bug 25059 - CMake libllvm.so.$MAJOR.$MINOR shared object name not compatible with ldconfig

Summary:
This change makes the CMake build system generate libraries for Linux and Darwin matching the makefile build system.

Linux libraries follow the pattern lib${name}.${MAJOR}.${MINOR}.so so that ldconfig won't pick it up incorrectly.

Darwin libraries are not versioned.

Note: On linux the non-versioned symlink is generated at install-time not build time. I plan to fix that eventually, but I expect that is good enough for the purposes of fixing this bug.

Reviewers: loladiro, tstellarAMD

Subscribers: axw, llvm-commits

Differential Revision: http://reviews.llvm.org/D13841

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252093 91177308-0d34-0410-b5e6-96231b3b80d8

Slightly saner handling of thumb branches.

The generic infrastructure already did a lot of work to decide if the
fixup value is know or not. It doesn't make sense to reimplement a very
basic case: same fragment.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252090 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] Teach the shrink-wrapping hooks to do the proper thing with Win64.

Win64 has some strict requirements for the epilogue. As a result, we disable
shrink-wrapping for Win64 unless the block that gets the epilogue is already an
exit block.

Fixes PR24193.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252088 91177308-0d34-0410-b5e6-96231b3b80d8

Fix some Clang-tidy modernize warnings, other minor fixes.

Fixed warnings are: modernize-use-override, modernize-use-nullptr and modernize-redundant-void-arg.

Differential revision: http://reviews.llvm.org/D14312

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252087 91177308-0d34-0410-b5e6-96231b3b80d8

PM: Rephrase PrintLoopPass as a wrapper around a new-style pass. NFC

Splits PrintLoopPass into a new-style pass and a PrintLoopPassWrapper,
much like we already do for PrintFunctionPass and PrintModulePass.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252085 91177308-0d34-0410-b5e6-96231b3b80d8

Add new interfaces to MBB for manipulating successors with probabilities instead of weights. NFC.

This is part-1 of the patch that replaces all edge weights in MBB by
probabilities, which only adds new interfaces. No functional changes.

Differential revision: http://reviews.llvm.org/D13908

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252083 91177308-0d34-0410-b5e6-96231b3b80d8

Warning fix.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252078 91177308-0d34-0410-b5e6-96231b3b80d8

[IR] Add a `data_operand` abstraction

Summary:
Data operands of a call or invoke consist of the call arguments, and
the bundle operands associated with the `call` (or `invoke`)
instruction. The motivation for this change is that we'd like to be
able to query "argument attributes" like `readonly` and `nocapture`
for bundle operands naturally.

This change also provides a conservative "implementation" for these
attributes for any bundle operand, and an extension point for future
work.

Reviewers: chandlerc, majnemer, reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14305

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252077 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-config: Add --has-rtti option

Summary:
This prints NO if LLVM was built with -fno-rtti or an equivalent flag
and YES otherwise. The reasons to add -has-rtti rather than adding -fno-rtti
to --cxxflags are:

1. Building LLVM with -fno-rtti does not always mean that client
applications need this flag.

2. Some compilers have a different flag for disabling rtti, and the
compiler being used to build LLVM may not be the compiler being used to
build the application.

Reviewers: echristo, chandlerc, beanz

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11849

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252075 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Add general memory folding for (V)INSERTPS instruction

This patch improves the memory folding of the inserted float element for the (V)INSERTPS instruction.

The existing implementation occurs in the DAGCombiner and relies on the narrowing of a whole vector load into a scalar load (and then converted into a vector) to (hopefully) allow folding to occur later on. Not only has this proven problematic for debug builds, it also prevents other memory folds (notably stack reloads) from happening.

This patch removes the old implementation and moves the folding code to the X86 foldMemoryOperand handler. A new private 'special case' function - foldMemoryOperandCustom - has been added to deal with memory folding of instructions that can't just use the lookup tables - (V)INSERTPS is the first of several that could be done.

It also tweaks the memory operand folding code with an additional pointer offset that allows existing memory addresses to be modified, in this case to convert the vector address to the explicit address of the scalar element that will be inserted.

Unlike the previous implementation we now set the insertion source index to zero, although this is ignored for the (V)INSERTPSrm version, anything that relied on shuffle decodes (such as unfolding of insertps loads) was incorrectly calculating the source address - I've added a test for this at insertps-unfold-load-bug.ll

Differential Revision: http://reviews.llvm.org/D13988

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252074 91177308-0d34-0410-b5e6-96231b3b80d8

[IR] Add bounds checking to paramHasAttr

Summary:
This is intended to make a later change simpler.

Note: adding this bounds checking required fixing `X86FastISel`. As
far I can tell I've preserved original behavior but a careful review
will be appreciated.

Reviewers: reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14304

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252073 91177308-0d34-0410-b5e6-96231b3b80d8

Orc: Streamline some lambda usage in a unit test

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252070 91177308-0d34-0410-b5e6-96231b3b80d8

Relax the check for ninja.

On fedora the ninja executable is called ninja-build :-(

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252062 91177308-0d34-0410-b5e6-96231b3b80d8

Created new X86 FMA3 opcodes (FMA*_Int) that are used now for lowering of scalar FMA intrinsics.

Patch by Slava Klochkov

The key difference between FMA* and FMA*_Int opcodes is that FMA*_Int opcodes are handled more conservatively. It is illegal to commute the 1st operand of FMA*_Int instructions as the upper bits of scalar FMA intrinsic result must be taken from the 1st operand, but such commute transformation would change those upper bits and invalidate the intrinsic's result.

Reviewers: Quentin Colombet, Elena Demikhovsky

Differential Revision: http://reviews.llvm.org/D13710

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252060 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Combine CMOV into BFI where possible

If we have a CMOV, OR and AND combination such as:
  if (x & CN)
    y |= CM;

And:
  * CN is a single bit;
  * All bits covered by CM are known zero in y;

Then we can convert this to a sequence of BFI instructions. This will always be a win if CM is a single bit, will always be no worse than the TST & OR sequence if CM is two bits, and for thumb will be no worse if CM is three bits (due to the extra IT instruction).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252057 91177308-0d34-0410-b5e6-96231b3b80d8

[ThinLTO] Always set linkage type to external when converting alias

When converting an alias to a non-alias when the aliasee is not
imported, ensure that the linkage type is set to external so that it is
a valid linkage type. Added a test case that exposed this issue.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252054 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] Merge conditional stores

We can often end up with conditional stores that cannot be speculated. They can come from fairly simple, idiomatic code:

  if (c & flag1)
    *a = x;
  if (c & flag2)
    *a = y;
  ...

There is no dominating or post-dominating store to a, so it is not legal to move the store unconditionally to the end of the sequence and cache the intermediate result in a register, as we would like to.

It is, however, legal to merge the stores together and do the store once:

  tmp = undef;
  if (c & flag1)
    tmp = x;
  if (c & flag2)
    tmp = y;
  if (c & flag1 || c & flag2)
    *a = tmp;

The real power in this optimization is that it allows arbitrary length ladders such as these to be completely and trivially if-converted. The typical code I'd expect this to trigger on often uses binary-AND with constants as the condition (as in the above example), which means the ending condition can simply be truncated into a single binary-AND too: 'if (c & (flag1|flag2))'. As in the general case there are bitwise operators here, the ladder can often be optimized further too.

This optimization involves potentially increasing register pressure. Even in the simplest case, the lifetime of the first predicate is extended. This can be elided in some cases such as using binary-AND on constants, but not in the general case. Threading 'tmp' through all branches can also increase register pressure.

The optimization as in this patch is enabled by default but kept in a very conservative mode. It will only optimize if it thinks the resultant code should be if-convertable, and additionally if it can thread 'tmp' through at least one existing PHI, so it will only ever in the worst case create one more PHI and extend the lifetime of a predicate.

This doesn't trigger much in LNT, unfortunately, but it does trigger in a big way in a third party test suite.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252051 91177308-0d34-0410-b5e6-96231b3b80d8

Error out when faced with value names containing '\0'

Bug found with afl-fuzz.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252048 91177308-0d34-0410-b5e6-96231b3b80d8

Silence an extra semicolon warning; NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252046 91177308-0d34-0410-b5e6-96231b3b80d8

[ELF] elfiamcu triple should imply e_machine == EM_IAMCU

Differential Revision: http://reviews.llvm.org/D14109

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252043 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] DAGCombine should not introduce FILD in soft-float mode

The x86 "sitofp i64 to double" dag combine, in 32-bit mode, lowers sitofp
directly to X86ISD::FILD (or FILD_FLAG). This should not be done in soft-float mode.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252042 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[PatternMatch] Switch to use ValueTracking::matchSelectPattern"

This was breaking the modules build and is being reverted while we reach consensus on the right way to solve this layering problem. This reverts commit r251785.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252040 91177308-0d34-0410-b5e6-96231b3b80d8

Fix unit tests on Windows: handle env vars with non-ASCII chars.

Summary: On Windows we have to take UTF16 encoded env vars and convert them to UTF8. This patch fixes CopyEnvironment helper function used by process unit tests.

Reviewers: yaron.keren

Subscribers: yaron.keren, llvm-commits

Differential Revision: http://reviews.llvm.org/D14278

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252039 91177308-0d34-0410-b5e6-96231b3b80d8

[OperandBundles] Refactor; NFCI.

Extract out a helper function `operandBundleFromBundleOpInfo`.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252038 91177308-0d34-0410-b5e6-96231b3b80d8

[OperandBundles] Refactor; NFCI

Intended to make later changes simpler. Exposes
`getBundleOperandsStartIndex` and `getBundleOperandsEndIndex`, and uses
them for the computation in `getNumTotalBundleOperands`.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252037 91177308-0d34-0410-b5e6-96231b3b80d8

[LVI] Update a comment to clarify what's actually happening and why

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252033 91177308-0d34-0410-b5e6-96231b3b80d8

[CVP] Fold return values if possible

In my previous change to CVP (251606), I made CVP much more aggressive about trying to constant fold comparisons. This patch is a reversal in direction. Rather than being agressive about every compare, we restore the non-block local restriction for most, and then try hard for compares feeding returns.

The motivation for this is two fold:
* The more I thought about it, the less comfortable I got with the possible compile time impact of the other approach. There have been no reported issues, but after talking to a couple of folks, I've come to the conclusion the time probably isn't justified.
* It turns out we need to know the context to leverage the full power of LVI. In particular, asking about something at the end of it's block (the use of a compare in a return) will frequently get more precise results than something in the middle of a block. This is an implementation detail, but it's also hard to get around since mid-block queries have to reason about possible throwing instructions and don't get to use most of LVI's block focused infrastructure. This will become particular important when combined with http://reviews.llvm.org/D14263.

Differential Revision: http://reviews.llvm.org/D14271

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252032 91177308-0d34-0410-b5e6-96231b3b80d8

[StatepointLowering] Remove distinction between call and invoke safepoints

There is no point in having invoke safepoints handled differently than the
call safepoints. All relevant decisions could be made by looking at whether
or not gc.result and gc.relocate lay in a same basic block. This change will
allow to lower call safepoints with relocates and results in a different
basic blocks. See test case for example.

Differential Revision: http://reviews.llvm.org/D14158

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252028 91177308-0d34-0410-b5e6-96231b3b80d8

Fix the test case for Windows.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252027 91177308-0d34-0410-b5e6-96231b3b80d8

[LLVMSymbolize] Reduce indentation by using helper function. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252022 91177308-0d34-0410-b5e6-96231b3b80d8

[LLVMSymbolize] Properly propagate object parsing errors from the library.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252021 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-symbolizer] Improve the test for missing input file.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252020 91177308-0d34-0410-b5e6-96231b3b80d8

Fix unused variable warning from r252017

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252019 91177308-0d34-0410-b5e6-96231b3b80d8

LLE 6/6: Add LoopLoadElimination pass

Summary:
The goal of this pass is to perform store-to-load forwarding across the
backedge of a loop.  E.g.:

  for (i)
     A[i + 1] = A[i] + B[i]

  =>

  T = A[0]
  for (i)
     T = T + B[i]
     A[i + 1] = T

The pass relies on loop dependence analysis via LoopAccessAnalisys to
find opportunities of loop-carried dependences with a distance of one
between a store and a load.  Since it's using LoopAccessAnalysis, it was
easy to also add support for versioning away may-aliasing intervening
stores that would otherwise prevent this transformation.

This optimization is also performed by Load-PRE in GVN without the
option of multi-versioning.  As was discussed with Daniel Berlin in
http://reviews.llvm.org/D9548, this is inferior to a more loop-aware
solution applied here.  Hopefully, we will be able to remove some
complexity from GVN/MemorySSA as a consequence.

In the long run, we may want to extend this pass (or create a new one if
there is little overlap) to also eliminate loop-indepedent redundant
loads and store that *require* versioning due to may-aliasing
intervening stores/loads.  I have some motivating cases for store
elimination. My plan right now is to wait for MemorySSA to come online
first rather than using memdep for this.

The main motiviation for this pass is the 456.hmmer loop in SPECint2006
where after distributing the original loop and vectorizing the top part,
we are left with the critical path exposed in the bottom loop.  Being
able to promote the memory dependence into a register depedence (even
though the HW does perform store-to-load fowarding as well) results in a
major gain (~20%).  This gain also transfers over to x86: it's
around 8-10%.

Right now the pass is off by default and can be enabled
with -enable-loop-load-elim.  On the LNT testsuite, there are two
performance changes (negative number -> improvement):

  1. -28% in Polybench/linear-algebra/solvers/dynprog: the length of the
     critical paths is reduced
  2. +2% in Polybench/stencils/adi: Unfortunately, I couldn't reproduce this
     outside of LNT

The pass is scheduled after the loop vectorizer (which is after loop
distribution).  The rational is to try to reuse LAA state, rather than
recomputing it.  The order between LV and LLE is not critical because
normally LV does not touch scalar st->ld forwarding cases where
vectorizing would inhibit the CPU's st->ld forwarding to kick in.

LoopLoadElimination requires LAA to provide the full set of dependences
(including forward dependences).  LAA is known to omit loop-independent
dependences in certain situations.  The big comment before
removeDependencesFromMultipleStores explains why this should not occur
for the cases that we're interested in.

Reviewers: dberlin, hfinkel

Subscribers: junbuml, dberlin, mssimpso, rengolin, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D13259

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252017 91177308-0d34-0410-b5e6-96231b3b80d8

[LAA] LLE 5/6: Add predicate functions Dependence::isForward/isBackward, NFC

Summary: Will be used by the LoopLoadElimination pass.

Reviewers: hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13258

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252016 91177308-0d34-0410-b5e6-96231b3b80d8

[LAA] LLE 4/6: APIs to access the dependent instructions for a dependence, NFC

Summary:
The functions use LAI and MemoryDepChecker classes so they need to be
defined after those definitions outside of the Dependence class.

Will be used by the LoopLoadElimination pass.

Reviewers: hfinkel

Subscribers: rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D13257

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252015 91177308-0d34-0410-b5e6-96231b3b80d8

CodeGen, Target: Move Mach-O-specific symbol name logic to Mach-O lowering.

A profile of an LTO link of Chrome revealed that we were spending some
~30-50% of execution time in the function Constant::getRelocationInfo(),
which is called from TargetLoweringObjectFile::getKindForGlobal() and in turn
from TargetMachine::getNameWithPrefix().

It turns out that we only need the result of getKindForGlobal() when
targeting Mach-O, so this change moves the relevant part of the logic to
TargetLoweringObjectFileMachO.

NFCI.

Differential Revision: http://reviews.llvm.org/D14168

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252014 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Make flat_scratch name consistent

The printed name and the parsed assembler names weren't the same.
I'm not sure which name SC prints these as, but I think it's this one.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252010 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix asserts on invalid register ranges

If the requested SGPR was not actually aligned, it was
accepted and rounded down instead of rejected.

Also fix an assert if the range is an invalid size.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252009 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix off by one error in register parsing

If trying to use one past the end, this would assert.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252008 91177308-0d34-0410-b5e6-96231b3b80d8

Address nit

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252004 91177308-0d34-0410-b5e6-96231b3b80d8

Align whitespace

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252003 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Support wasm select operator

Summary:
Add support for wasm's select operator, and lower LLVM's select DAG node
to it.

Reviewers: sunfish

Subscribers: dschuff, llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D14295

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252002 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: s[102:103] is unavailable on VI

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252000 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Define correct number of SGPRs

There are actually 104 so 2 were missing.

More assembler tests with high register number tuples
will be included in later patches.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251999 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Make findUsedSGPR more readable

Add more comments etc.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251996 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Initialize SIFixSGPRCopies so -print-after works

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251995 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Alphabetize includes

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251994 91177308-0d34-0410-b5e6-96231b3b80d8

InstCombine: fix sinking of convergent calls

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251991 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] Use existing constant nodes instead of recreating them. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251990 91177308-0d34-0410-b5e6-96231b3b80d8

[LLVMSymbolize] Factor out the logic for printing structs from DIContext. NFC.

Introduce DIPrinter which takes care of rendering DILineInfo and
friends. This allows LLVMSymbolizer class to return a structured data
instead of plain std::strings.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251989 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX] Tweaked shuffle stack folding tests

To avoid alternative lowerings.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251986 91177308-0d34-0410-b5e6-96231b3b80d8

[LAA] LLE 3/6: Rename InterestingDependence to Dependences, NFC

Summary:
We now collect all types of dependences including lexically forward
deps not just "interesting" ones.

Reviewers: hfinkel

Subscribers: rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D13256

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251985 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX512] Fixed shuffle test name to match shuffle

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251984 91177308-0d34-0410-b5e6-96231b3b80d8

[LLVMSymbolize] Move demangling away from printing routines. NFC.

Make printDILineInfo and friends responsible for just rendering the
contents of the structures, demangling should actually be performed
earlier, when we have the information about the originating
SymbolizableModule at hand.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251981 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyLibCalls] Add a new transformation: pow(exp(x), y) -> exp(x*y)

This one is enabled only under -ffast-math (due to rounding/overflows)
but allows us to emit shorter code.

Before (on FreeBSD x86-64):
4007f0:       50                      push   %rax
4007f1:       f2 0f 11 0c 24          movsd  %xmm1,(%rsp)
4007f6:       e8 75 fd ff ff          callq  400570 <exp2@plt>
4007fb:       f2 0f 10 0c 24          movsd  (%rsp),%xmm1
400800:       58                      pop    %rax
400801:       e9 7a fd ff ff          jmpq   400580 <pow@plt>
400806:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
40080d:       00 00 00

After:
4007b0:       f2 0f 59 c1             mulsd  %xmm1,%xmm0
4007b4:       e9 87 fd ff ff          jmpq   400540 <exp2@plt>
4007b9:       0f 1f 80 00 00 00 00    nopl   0x0(%rax)

Differential Revision: http://reviews.llvm.org/D14045

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251976 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][XOP] Add support for the matching of the VPCMOV bit select instruction

XOP has the VPCMOV instruction that performs the common vector bit select operation OR( AND( SRC1, SRC3 ), AND( SRC2, ~SRC3 ) )

This patch adds tablegen pattern matching for this instruction.

Differential Revision: http://reviews.llvm.org/D8841

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251975 91177308-0d34-0410-b5e6-96231b3b80d8

llmv-pdbdump: Make BuiltinDumper shorter. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251974 91177308-0d34-0410-b5e6-96231b3b80d8

[LAA] LLE 2/6: Fix a NoDep case that should be a Forward dependence

Summary:
When the dependence distance in zero then we have a loop-independent
dependence from the earlier to the later access.

No current client of LAA uses forward dependences so other than
potentially hitting the MaxDependences threshold earlier, this change
shouldn't affect anything right now.

This and the previous patch were tested together for compile-time
regression. None found in LNT/SPEC.

Reviewers: hfinkel

Subscribers: rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D13255

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251973 91177308-0d34-0410-b5e6-96231b3b80d8

[LAA] LLE 1/6: Expose Forward dependences

Summary:
Before this change, we didn't use to collect forward dependences since
none of the current clients (LV, LDist) required them.

The motivation to also collect forward dependences is a new pass
LoopLoadElimination (LLE) which discovers store-to-load forwarding
opportunities across the loop's backedge.  The pass uses both lexically
forward or backward loop-carried dependences to detect these
opportunities.

The new pass also analyzes loop-independent (forward) dependences since
they can conflict with the loop-carried dependences in terms of how the
data flows through memory.

The newly added test only covers loop-carried forward dependences
because loop-independent ones are currently categorized as NoDep.  The
next patch will fix this.

The two patches were tested together for compile-time regression.  None
found in LNT/SPEC.

Note that with this change LAA provides all dependences rather than just
"interesting" ones.  A subsequent NFC patch will remove the now trivial
isInterestingDependence and rename the APIs.

Reviewers: hfinkel

Subscribers: jmolloy, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D13254

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251972 91177308-0d34-0410-b5e6-96231b3b80d8

Don't create empty sections just to look like gas.

We are long past the time when this much bug for bug compatibility was
useful.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251970 91177308-0d34-0410-b5e6-96231b3b80d8

Relax a few more overspecified tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251967 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "Move metadata linking after lazy global materialization/linking."

This reverts commit r251926. I believe this is causing an LTO
bootstrapping bot failure
(http://lab.llvm.org:8080/green/job/llvm-stage2-cmake-RgLTO_build/3669/).

Haven't been able to repro it yet, but after looking at the metadata I
am pretty sure I know what is going on.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251965 91177308-0d34-0410-b5e6-96231b3b80d8

Remove unnecessary dependency on section and string positions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251964 91177308-0d34-0410-b5e6-96231b3b80d8

[libFuzzer] make -test_single_input more reliable: make sure the input's size is equal to it's capacity

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251961 91177308-0d34-0410-b5e6-96231b3b80d8

Delete dead code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251960 91177308-0d34-0410-b5e6-96231b3b80d8

Simplify local common output.

We now create them as they are found and use higher level APIs.

This is a step in avoiding creating unnecessary sections.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251958 91177308-0d34-0410-b5e6-96231b3b80d8

[CodegenPrepare] Do not rematerialize gc.relocates across different basic blocks

Differential Revision: http://reviews.llvm.org/D14258

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251957 91177308-0d34-0410-b5e6-96231b3b80d8

Move code out of a loop and use a range loop.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251952 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "Revert "[Orc] Directly emit machine code for the x86 resolver block and trampolines.""

This reverts commit r251937.

The test was updated to the new API, bring the API back.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251944 91177308-0d34-0410-b5e6-96231b3b80d8

[Kaleidoscope][Orc] Fix the fully_lazy Orc Kaleidoscope example.

r251933 changed the Orc compile callbacks API, which broke this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251942 91177308-0d34-0410-b5e6-96231b3b80d8

Fix PR25372 - teach replaceCongruentPHIs to handle cases where SE evaluates a PHI to a SCEVConstant

Summary:
Since now Scalar Evolution can create non-add rec expressions for PHI
nodes, it can also create SCEVConstant expressions. This will confuse
replaceCongruentPHIs, which previously relied on the fact that SCEV
could not produce constants in this case.

We will now replace the node with a constant in these cases - or avoid
processing the Phi in case of a type mismatch.

Reviewers: sanjoy

Subscribers: llvm-commits, majnemer

Differential Revision: http://reviews.llvm.org/D14230

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251938 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[Orc] Directly emit machine code for the x86 resolver block and trampolines."

This reverts commit r251933.

It broke the build of examples/Kaleidoscope/Orc/fully_lazy/toy.cpp.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251937 91177308-0d34-0410-b5e6-96231b3b80d8

Kaleidoscope-ch2: Remove the dependence on LLVM by cloning make_unique into this project

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251936 91177308-0d34-0410-b5e6-96231b3b80d8

[Orc] Directly emit machine code for the x86 resolver block and trampolines.

Bypassing LLVM for this has a number of benefits:

1) Laziness support becomes asm-syntax agnostic (previously lazy jitting didn't
work on Windows as the resolver block was in Darwin asm).

2) For cross-process JITs, it allows resolver blocks and trampolines to be
emitted directly in the target process, reducing cross process traffic.

3) It should be marginally faster.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251933 91177308-0d34-0410-b5e6-96231b3b80d8

Move metadata linking after lazy global materialization/linking.

Summary:
Currently, named metadata is linked before the LazilyLinkGlobalValues
list is walked and materialized/linked. As a result, references
from DISubprogram and DIGlobalVariable metadata to yet unmaterialized
functions and variables cause them to be added to the lazy linking
list and their definitions are materialized and linked.

This makes the llvm-link -only-needed option not have the intended
effect when debug information is present, as the otherwise unneeded
functions/variables are still linked in.

Additionally, for ThinLTO I have implemented a mechanism to only link
in debug metadata needed by imported functions. Moving named metadata
linking after lazy GV linking will facilitate applying this mechanism
to the LTO and "llvm-link -only-needed" cases as well.

Reviewers: dexonsmith, tra, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14195

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251926 91177308-0d34-0410-b5e6-96231b3b80d8

Pass enum instead of bool to new linkInModule call in llvm-link

A new call I added to linkInModule from llvm-link in r251866
was still passing in a boolean for an argument that was changed to an
enum in r246561. I didn't catch this in my merge since the bool false
matched the flag value it mapped to.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251925 91177308-0d34-0410-b5e6-96231b3b80d8

Don't assert if materializing before seeing any function bodies

This assert was reachable from user input. A minimized test case (no
FUNCTION_BLOCK_ID record) is attached.

Bug found with afl-fuzz

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251910 91177308-0d34-0410-b5e6-96231b3b80d8

Don't use Twine objects after their lifetimes end.

No test, since it would depend on what the compiler can optimize/reuse.
My next commit made this bug visible on Linux Release compiles with some
versions of gcc.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251909 91177308-0d34-0410-b5e6-96231b3b80d8

LoopVectorizer - skip 'bitcast' between GEP and load.

Skipping 'bitcast' in this case allows to vectorize load:

  %arrayidx = getelementptr inbounds double*, double** %in, i64 %indvars.iv
  %tmp53 = bitcast double** %arrayidx to i64*
  %tmp54 = load i64, i64* %tmp53, align 8

Differential Revision http://reviews.llvm.org/D14112

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251907 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Generate .cfi_adjust_cfa_offset correctly when pushing arguments

When push instructions are being used to pass function arguments on
the stack, and either EH or debugging are enabled, we need to generate
.cfi_adjust_cfa_offset directives appropriately. For (synch) EH, it is
enough for the CFA offset to be correct at every call site, while
for debugging we want to be correct after every push.

Darwin does not support this well, so don't use pushes whenever it
would be required.

Differential Revision: http://reviews.llvm.org/D13767

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251904 91177308-0d34-0410-b5e6-96231b3b80d8

AVX512: add encoding tests for vmovq/d instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251903 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[IndVarSimplify] Rewrite loop exit values with their initial values from loop preheader"

Commit 251839 triggers miscompiles on some bots:

http://lab.llvm.org:8011/builders/perf-x86_64-penryn-O3-polly-fast/builds/13723

(The commit is listed in 13722, but due to an existing failure introduced in
13721 and reverted in 13723 the failure is only visible in 13723)

To verify r251839 is indeed the only change that triggered the buildbot failures
and to ensure the buildbots remain green while investigating I temporarily
revert this commit. At the current state it is unclear if this commit introduced
some miscompile or if it only exposed code to Polly that is subsequently
miscompiled by Polly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251901 91177308-0d34-0410-b5e6-96231b3b80d8

Fix build problme introduced in r251883

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251888 91177308-0d34-0410-b5e6-96231b3b80d8

RegisterPressure: Improve assert message

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251885 91177308-0d34-0410-b5e6-96231b3b80d8

RegisterPressure: Slightly nicer pressure diff dumping

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251884 91177308-0d34-0410-b5e6-96231b3b80d8

ScheduleDAGInstrs: Remove IsPostRA flag; NFC

ScheduleDAGInstrs doesn't behave differently before or after register
allocation. It was only used in a method of MachineSchedulerBase which
behaved differently in MachineScheduler/PostMachineScheduler. Change
this to let MachineScheduler/PostMachineScheduler just pass in a
parameter to that function.

The order of the LiveIntervals* and bool RemoveKillFlags paramters have
been switched to make out-of-tree code fail instead of unintentionally
passing a value intended for the IsPostRA flag to the (previously
following and default initialized) RemoveKillFlags.

Differential Revision: http://reviews.llvm.org/D14245

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251883 91177308-0d34-0410-b5e6-96231b3b80d8

Don't implicitly construct a Archive::child_iterator.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251878 91177308-0d34-0410-b5e6-96231b3b80d8

This never returns end(), simplify to use Child instead of iterator. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251876 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-pdbdump: Simplify. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251873 91177308-0d34-0410-b5e6-96231b3b80d8