granicus.if.org Git

AMDGPU: Fold fneg into rcp

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291779 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fold fneg into fp_round

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291778 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fold fneg into fp_extend

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291777 91177308-0d34-0410-b5e6-96231b3b80d8

Fix some -Wsign-compare warnings by making some integer literals explicitly unsigned

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291776 91177308-0d34-0410-b5e6-96231b3b80d8

TTI: Add comment clarifying the meaning of MemIntrinsicInfo::PtrVal.

Patch by Tom Stellard.
Differential Revision: https://reviews.llvm.org/D27563

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291772 91177308-0d34-0410-b5e6-96231b3b80d8

[globalisel] Move as much RegisterBank initialization to the constructor as possible

Summary:
The register bank is now entirely initialized in the constructor. However,
we still have the hardcoded number of register classes which will be
dealt with in the TableGen patch (D27338) since we do not have access
to this information to resolve this at this stage. The number of register
classes is known to the TRI and to TableGen but the RegisterBank
constructor is too early for the former and too late for the latter.
This will be fixed when the data is tablegen-erated.

Reviewers: t.p.northover, ab, rovka, qcolombet

Subscribers: aditya_nandakumar, kristof.beyls, vkalintiris, llvm-commits, dberris

Differential Revision: https://reviews.llvm.org/D27809

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291770 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] Added DI macro creation API to DIBuilder.

Differential Revision: https://reviews.llvm.org/D16077

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291769 91177308-0d34-0410-b5e6-96231b3b80d8

[globalisel] Initialize RegisterBanks with static data.

Summary:
Refactor the RegisterBank initialization to use static data. This requires
GlobalISel implementations to rewrite calls to createRegisterBank() and
addRegBankCoverage() into a call to setRegBankData().

Out of tree targets can use diff 4 of D27807
(https://reviews.llvm.org/D27807?id=84117) to have addRegBankCoverage() dump
the register classes and other data that needs to be provided to
setRegBankData(). This is the method that was used to generate the static data
in this patch.

Tablegen-eration of this static data will follow after some refactoring.

Reviewers: t.p.northover, ab, rovka, qcolombet

Subscribers: aditya_nandakumar, kristof.beyls, vkalintiris, llvm-commits, dberris

Differential Revision: https://reviews.llvm.org/D27807
Differential Revision: https://reviews.llvm.org/D27808

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291768 91177308-0d34-0410-b5e6-96231b3b80d8

[Devirtualization] MemDep returns non-local !invariant.group dependencies

Summary:
Memory Dependence Analysis was limited to return only local dependencies
for invariant.group handling. Now it returns NonLocal when it finds it
and then by asking getNonLocalPointerDependency we get found dep.

Thanks to this we are able to devirtualize loops!

    void indirect(A &a, int n) {
      for (int i = 0 ; i < n; i++)
        a.foo();

    }
    void test(int n) {
      A a;
      indirect(a);
    }

After inlining a.foo() will be changed to direct call, even if foo and A::A()
is external (but only if vtable definition is be available).

Reviewers: nlewycky, dberlin, chandlerc, rsmith

Subscribers: mehdi_amini, davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D28137

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291762 91177308-0d34-0410-b5e6-96231b3b80d8

Wdocumentation fix

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291761 91177308-0d34-0410-b5e6-96231b3b80d8

Fix windows buildbots building llvm-xray

2 issues:
1 - replaced unix-style pid_t with cross-platform llvm::sys::ProcessInfo::ProcessId
2 - fixed shadow variable warning in lambda expression

Reviewed by @filcab

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291760 91177308-0d34-0410-b5e6-96231b3b80d8

[XRay] Include <numeric> for std::accumulate.

Fix-up following D24377.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291750 91177308-0d34-0410-b5e6-96231b3b80d8

[XRay] Implement the `llvm-xray account` subcommand

Summary:
This is the third of a multi-part change to implement subcommands for
the `llvm-xray` tool.

Here we define the `account` subcommand which does simple function call
accounting, generating basic statistics on function calls we find in an
XRay log/trace. We support text output and csv output for this
subcommand.

This change also supports sorting, summing, and filtering the top N
results.

Part of this tool will later be turned into a library that could be used
for basic function call accounting.

Depends on D24376.

Reviewers: dblaikie, echristo

Subscribers: mehdi_amini, dberris, beanz, llvm-commits

Differential Revision: https://reviews.llvm.org/D24377

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291749 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix sub_oneuse being marked commutative

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291748 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Improve lowering of zero_extend of v4i1 to v4i32 and v2i1 to v2i64 with VLX, but no DQ or BW support.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291747 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Improve lowering of sign_extend of v4i1 to v4i32 and v2i1 to v2i64 when avx512vl is available, but not avx512dq.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291746 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX512] Fix PR31515 - Do not flip vselect condition if it's not a vXi1 mask

r289653 added a case where `vselect <cond> <vector1> <all-zeros>`
is transformed to:
`vselect xor(cond, DAG.getConstant(1, DL, CondVT) <all-zeros> <vector1>`
This was not aimed to catch cases where Cond is not a vXi1
mask but it does. Moreover, when Cond type is VxiN (N > 1)
then xor(cond, DAG.getConstant(1, DL, CondVT) != NOT(cond).
This patch changes the above to xor with allones, and avoids
entering the case for non-mask Conds.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291745 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Add more varied avx512 feature command lines to the avx512-cvt.ll test to show some poor codegen examples.

We're definitely doing bad things when avx512vl is enabled without avx512dq. It looks like avx512vl/dq without avx512bw may also have some issues.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291744 91177308-0d34-0410-b5e6-96231b3b80d8

Make a test actually test what it set out to test.

This test seems to have largely been relying on asserts being tripped.
It had a very specific and somewhat uninteresting grep of the output,
but it never really did anything to cause SCEV to be preserved across
loop simplify, certainly not explicitly. And a later addition to it
actually added CHECK lines despite the test never running FileCheck.

Now we actually print SCEV before and after loop simplify to make sure
it is *changing* and being *updated*. Which seems to be much more likely
the point of the test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291740 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fold fneg into fma or fmad

Patch mostly by Fiona Glaser

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291733 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fold fneg into fmul

Patch mostly by Fiona Glaser

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291732 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fold fneg into fadd

Patch mostly by Fiona Glaser

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291731 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Pull fneg/fabs out of a select

Allows better source modifier usage.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291729 91177308-0d34-0410-b5e6-96231b3b80d8

[NewGVN] Fixup store count for the `initial` congruency class.

It was always zero. When we move a store from `initial` to its
own congruency class, we end up with a negative store count, which
is obviously wrong.
Also, while here, change StoreCount to be signed so that the assertions
actually fire.

Ack'ed by Daniel Berlin.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291725 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeView] Finish decoupling TypeDatabase from TypeDumper.

Previously the type dumper itself was passed around to a lot of different
places and manipulated in ways that were more appropriate on the type
database. For example, the entire TypeDumper was passed into the symbol
dumper, when all the symbol dumper wanted to do was lookup the name of a
TypeIndex so it could print it. That's what the TypeDatabase is for --
mapping type indices to names.

Another example is how if the user runs llvm-pdbdump with the option to
dump symbols but not types, we still have to visit all types so that we
can print minimal information about the type of a symbol, but just without
dumping full symbol records. The way we did this before is by hacking it
up so that we run everything through the type dumper with a null printer,
so that the output goes to /dev/null. But really, we don't need to dump
anything, all we want to do is build the type database. Since
TypeDatabaseVisitor now exists independently of TypeDumper, we can do
this. We just build a custom visitor callback pipeline that includes a
database visitor but not a dumper.

All the hackery around printers etc goes away. After this patch, we could
probably even delete the entire CVTypeDumper class since really all it is
at this point is a thin wrapper that hides the details of how to build a
useful visitation pipeline. It's not a priority though, so CVTypeDumper
remains for now.

After this patch we will be able to easily plug in a different style of
type dumper by only implementing the proper visitation methods to dump
one-line output and then sticking it on the pipeline.

Differential Revision: https://reviews.llvm.org/D28524

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291724 91177308-0d34-0410-b5e6-96231b3b80d8

X86: Remove dead code. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291721 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix shrinking of addc/subb.

To shrink to VOP2 the input carry must also be VCC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291720 91177308-0d34-0410-b5e6-96231b3b80d8

Add -Wl,-color-diagnostics if a linker supports the option.

Differential Revision: https://reviews.llvm.org/D28046

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291719 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix sext_inreg for i1 in i16

This produces worse code when i16 is legal, mostly
due to combines getting confused by conversions inserted
for uniform 16-bit operations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291717 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix breaking VOP3 v_add_i32s

This was shrinking the instruction even though the carry output
register was a virtual register, not known VCC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291716 91177308-0d34-0410-b5e6-96231b3b80d8

[asan] Set alignment of __asan_global_* globals to sizeof(GlobalStruct)

When using profiling and ASan together (-fprofile-instr-generate -fcoverage-mapping -fsanitize=address), at least on Darwin, the section of globals that ASan emits (__asan_globals) is misaligned and starts at an odd offset. This really doesn't have anything to do with profiling, but it triggers the issue because profiling emits a string section, which can have arbitrary size. This patch changes the alignment to sizeof(GlobalStruct).

Differential Revision: https://reviews.llvm.org/D28573

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291715 91177308-0d34-0410-b5e6-96231b3b80d8

Use EXPECT_EQ instead of ASSERT_EQ in a unit test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291713 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[NewGVN] Strengthen a couple of assertions."

It's breaking some bots. Will investigate and recommit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291712 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix folding immediates into mac src2

Whether it is legal or not needs to check for the instruction
it will be replaced with.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291711 91177308-0d34-0410-b5e6-96231b3b80d8

[NewGVN] Parenthesise assertion condition (-Wparenthesis).

Format an assertion message while I'm here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291710 91177308-0d34-0410-b5e6-96231b3b80d8

[NewGVN] Strengthen a couple of assertions.

StoreCount >= 0 on `unsigned` is always true, otherwise.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291709 91177308-0d34-0410-b5e6-96231b3b80d8

Add test that verifies we don't peel loops in optsize functions. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291708 91177308-0d34-0410-b5e6-96231b3b80d8

LowerTypeTests: Represent the memory region size with the constant size-1.

This means that we can use a shorter instruction sequence in the case where
the size is a power of two and on the boundary between two representations.

Differential Revision: https://reviews.llvm.org/D28421

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291706 91177308-0d34-0410-b5e6-96231b3b80d8

[SCEV] Make howFarToZero max backedge-taken count check for precondition.

Refines max backedge-taken count if a loop like
"for (int i = 0; i != n; ++i) { /* body */ }" is rotated.

Differential Revision: https://reviews.llvm.org/D28536

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291704 91177308-0d34-0410-b5e6-96231b3b80d8

[SCEV] Make howFarToZero use a simpler formula for max backedge-taken count.

This is both easier to understand, and produces a tighter bound in certain
cases.

Differential Revision: https://reviews.llvm.org/D28393

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291701 91177308-0d34-0410-b5e6-96231b3b80d8

Re-apply r291205, "LowerTypeTests: Split the pass in two: a resolution phase and a lowering phase.", with a fix for an off-by-one error.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291699 91177308-0d34-0410-b5e6-96231b3b80d8

NewGVN: Fix PR31594, by tracking the store count of congruence
classes, and updating checking to allow for equivalence through
reachability.

(Sadly, the checking here is not perfect, and can't be made perfect,
so we'll have to disable it after we are satisfied with correctness.
Right now it is just "very unlikely" to happen.)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291698 91177308-0d34-0410-b5e6-96231b3b80d8

NewGVN: Refactor performCongruenceFinding and split out congruence class moving

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291697 91177308-0d34-0410-b5e6-96231b3b80d8

Resubmit "[PGO] Turn off comdat renaming in IR PGO by default"

This patch resubmits the changes in r291588.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291696 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "CodeGen: Allow small copyable blocks to "break" the CFG."

This reverts commit ada6595a526d71df04988eb0a4b4fe84df398ded.

This needs a simple probability check because there are some cases where it is
not profitable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291695 91177308-0d34-0410-b5e6-96231b3b80d8

Make some operator bools explicit for sanity/safety.

There are a couple left in bool-like containers (BitVector, etc) where
the implicit conversions seem more suitable - though it might be worth
considering explicitifying those too.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291694 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] More aggressive matching for vpadd and vpaddl.

The new matchers work after legalization to make them simpler, and to avoid
blocking other optimizations.

Differential Revision: https://reviews.llvm.org/D27779

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291693 91177308-0d34-0410-b5e6-96231b3b80d8

[SLP] Remove bogus assert.

The removed assert seems bogus - it's perfectly legal for the roots of the
vectorized subtrees to be equal even if the original scalar values aren't,
if the original scalars happen to be equivalent.

This fixes PR31599.

Differential Revision: https://reviews.llvm.org/D28539

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291692 91177308-0d34-0410-b5e6-96231b3b80d8

[lib/Object] Unbreak build with -Werror (unused variable). NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291691 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][XOP] Add vpermil2ps target shuffle -> insertps combine test

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291690 91177308-0d34-0410-b5e6-96231b3b80d8

Remove all variants of DWARFDie::getAttributeValueAs...() that had parameters that specified default values.

Now we only support returning Optional<> values and have changed all clients over to use Optional::getValueOr().

Differential Revision: https://reviews.llvm.org/D28569

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291686 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: only print debug info with -debug. NFC.

Turns out DEBUG(...) has uses even inside NDEBUG checks.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291685 91177308-0d34-0410-b5e6-96231b3b80d8

Revert rL291205 because it breaks Chrome tests under CFI.

Summary:
Revert LowerTypeTests: Split the pass in two: a resolution phase and a lowering phase.

This change separates how type identifiers are resolved from how intrinsic
calls are lowered. All information required to lower an intrinsic call
is stored in a new TypeIdLowering data structure. The idea is that this
data structure can either be initialized using the module itself during
regular LTO, or using the module summary in ThinLTO backends.

Original URL: https://reviews.llvm.org/D28341

Reviewers: pcc

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D28532

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291684 91177308-0d34-0410-b5e6-96231b3b80d8

build_llvm_package.bat: Add note about what SWIG version to use

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291682 91177308-0d34-0410-b5e6-96231b3b80d8

Remove trailing whitespace. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291680 91177308-0d34-0410-b5e6-96231b3b80d8

[MemDep] NFC variable name change

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291679 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Fix test CodeGen/ARM/fpcmp_ueq.ll broken by rL290616

Commit rL290616 (https://reviews.llvm.org/rL290616) changed a checking command
for the triple arm-apple-darwin in LLVM::CodeGen/ARM/fpcmp_ueq.ll. As a result
of the changes the test could fail for the valid generated code.

These changes fixes the test to check only instructions we would expect.

Differential Revision: https://reviews.llvm.org/D28159

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291678 91177308-0d34-0410-b5e6-96231b3b80d8

[lib/Object] - Introduce Decompressor class.

Decompressor intention is to reduce duplication of code.
Currently LLD has own implementation of decompressor
for compressed debug sections.

This class helps to avoid it and share the code.
LLD patch for reusing it is D28106

Differential revision: https://reviews.llvm.org/D28105

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291675 91177308-0d34-0410-b5e6-96231b3b80d8

[SystemZ] Improve isFoldableMemAccessOffset().

A store of an extracted element or a load which gets inserted into a vector,
will be combined into a vector load/store element instruction.

Therefore, isFoldableMemAccessOffset(), which is called by LSR, should
return false in these cases.

Reviewer: Ulrich Weigand

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291673 91177308-0d34-0410-b5e6-96231b3b80d8

Make processing @llvm.assume more efficient - Add affected values to the assumption cache

Here's my second try at making @llvm.assume processing more efficient. My
previous attempt, which leveraged operand bundles, r289755, didn't end up
working: it did make assume processing more efficient but eliminating the
assumption cache made ephemeral value computation too expensive. This is a
more-targeted change. We'll keep the assumption cache, but extend it to keep a
map of affected values (i.e. values about which an assumption might provide
some information) to the corresponding assumption intrinsics. This allows
ValueTracking and LVI to find assumptions relevant to the value being queried
without scanning all assumptions in the function. The fact that ValueTracking
started doing O(number of assumptions in the function) work, for every
known-bits query, has become prohibitively expensive in some cases.

As discussed during the review, this is a pragmatic fix that, longer term, will
likely be replaced by a more-principled solution (perhaps based on an extended
SSA form).

Differential Revision: https://reviews.llvm.org/D28459

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291671 91177308-0d34-0410-b5e6-96231b3b80d8

X86 CodeGen: Optimized pattern for truncate with unsigned saturation.

DAG patterns optimization: truncate + unsigned saturation supported by VPMOVUS* instructions in AVX-512.
And VPACKUS* instructions on SEE* targets.

Differential Revision: https://reviews.llvm.org/D28216

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291670 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Assembler: SDWA/DPP should not accept scalar registers and immediate operands

Reviewers: artem.tamazov, nhaustov, vpykhtin, tstellarAMD

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D28157

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291668 91177308-0d34-0410-b5e6-96231b3b80d8

Fix unused variable warning

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291666 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX512BW] Vectorize v64i8 vector shifts

Differential Revision: https://reviews.llvm.org/D28447

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291665 91177308-0d34-0410-b5e6-96231b3b80d8

Fix line endings

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291663 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Separate the LoopAnalysisManager from the LoopPassManager and move
the latter to the Transforms library.

While the loop PM uses an analysis to form the IR units, the current
plan is to have the PM itself establish and enforce both loop simplified
form and LCSSA. This would be a layering violation in the analysis
library.

Fundamentally, the idea behind the loop PM is to *transform* loops in
addition to running passes over them, so it really seemed like the most
natural place to sink this was into the transforms library.

We can't just move *everything* because we also have loop analyses that
rely on a subset of the invariants. So this patch splits the the loop
infrastructure into the analysis management that has to be part of the
analysis library, and the transform-aware pass manager.

This also required splitting the loop analyses' printer passes out to
the transforms library, which makes sense to me as running these will
transform the code into LCSSA in theory.

I haven't split the unittest though because testing one component
without the other seems nearly intractable.

Differential Revision: https://reviews.llvm.org/D28452

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291662 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Take more drastic measures to work around MSVC's failure on this
code. If this doesn't work and I can't find someone to help who has MSVC
installed, I'll back everything out I guess. =[

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291661 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Fix PR30926 - Add patterns for (v)cvtsi2s{s,d} and (v)cvtsd2s{s,d}

The code emiited by Clang's intrinsics for (v)cvtsi2ss, (v)cvtsi2sd,
(v)cvtsd2ss and (v)cvtss2sd is lowered to a code sequence that includes
redundant (v)movss/(v)movsd instructions. This patch adds patterns for
optimizing these sequences.

Differential revision: https://reviews.llvm.org/D28455

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291660 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] fixing failed test in commit: r291657

Missing Requires asserts.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291659 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] updating TTI costs for arithmetic instructions on X86\SLM arch.

updated instructions:
pmulld, pmullw, pmulhw, mulsd, mulps, mulpd, divss, divps, divsd, divpd, addpd and subpd.

special optimization case which replaces pmulld with pmullw\pmulhw\pshuf seq.
In case if the real operands bitwidth <= 16.

Differential Revision: https://reviews.llvm.org/D28104

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291657 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Pull a lambda out of an argument into a named variable to try and
get a little more clarity about the nature of the issue MSVC is having
with this code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291656 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Another attempt to satisfy MSVC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291655 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Try to appease MSVC by explicitly disambiguating a member name as
a template.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291654 91177308-0d34-0410-b5e6-96231b3b80d8

[XRay] Define the library for XRay trace logs

Summary:
In this change we move the definition of the log reading routines from
the tools directory in LLVM to {include/llvm,lib}/XRay. We improve the
documentation a little bit for the publicly accessible headers, and
adjust the top-matter. This also leads to some refactoring and cleanup
in the tooling code.

In particular, we do the following:

  - Rename the class from LogReader to Trace, as it better represents
    the logical set of records as opposed to a log.
  - Use file type detection instead of asking the user to say what
    format the input file is. This allows us to keep the interface
    simple and encapsulate the logic of loading the data appropriately.

In future changes we increase the API surface and write dedicated unit
tests for the XRay library.

Depends on D24376.

Reviewers: dblaikie, echristo

Subscribers: mehdi_amini, mgorny, llvm-commits, varno

Differential Revision: https://reviews.llvm.org/D28345

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291652 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Rewrite the loop pass manager to use a worklist and augmented run
arguments much like the CGSCC pass manager.

This is a major redesign following the pattern establish for the CGSCC layer to
support updates to the set of loops during the traversal of the loop nest and
to support invalidation of analyses.

An additional significant burden in the loop PM is that so many passes require
access to a large number of function analyses. Manually ensuring these are
cached, available, and preserved has been a long-standing burden in LLVM even
with the help of the automatic scheduling in the old pass manager. And it made
the new pass manager extremely unweildy. With this design, we can package the
common analyses up while in a function pass and make them immediately available
to all the loop passes. While in some cases this is unnecessary, I think the
simplicity afforded is worth it.

This does not (yet) address loop simplified form or LCSSA form, but those are
the next things on my radar and I have a clear plan for them.

While the patch is very large, most of it is either mechanically updating loop
passes to the new API or the new testing for the loop PM. The code for it is
reasonably compact.

I have not yet updated all of the loop passes to correctly leverage the update
mechanisms demonstrated in the unittests. I'll do that in follow-up patches
along with improved FileCheck tests for those passes that ensure things work in
more realistic scenarios. In many cases, there isn't much we can do with these
until the loop simplified form and LCSSA form are in place.

Differential Revision: https://reviews.llvm.org/D28292

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291651 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r291645 "[DAGCombiner] Teach DAG combiner to fold (vselect (N0 xor AllOnes), N1, N2) -> (vselect N0, N2, N1). Only do this if the target indicates its vector boolean type is ZeroOrNegativeOneBooleanContent."

Some test appears to be hanging on the build bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291650 91177308-0d34-0410-b5e6-96231b3b80d8

[LICM] Report failing to hoist conditionally-executed loads

These are interesting again because the user may not be aware that this
is a common reason preventing LICM.

A const is removed from an instruction pointer declaration in order to
pass it to ORE.

Differential Revision: https://reviews.llvm.org/D27940

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291649 91177308-0d34-0410-b5e6-96231b3b80d8

[LICM] Report failing to hoist a load with an invariant address

These are interesting because lack of precision in alias information
could be standing in the way of this optimization.

An example is the case in the test suite that I showed in the DevMeeting
talk:

http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/MultiSource/Benchmarks/FreeBench/distray/CMakeFiles/distray.dir/html/_org_test-suite_MultiSource_Benchmarks_FreeBench_distray_distray.c.html#L236

canSinkOrHoistInst is also used from LoopSink, which does not use
opt-remarks so we need to take ORE as an optional argument.

Differential Revision: https://reviews.llvm.org/D27939

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291648 91177308-0d34-0410-b5e6-96231b3b80d8

Fix typo in comment

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291647 91177308-0d34-0410-b5e6-96231b3b80d8

[LICM] Report successful hoist/sink/promotion

Differential Revision: https://reviews.llvm.org/D27938

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291646 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] Teach DAG combiner to fold (vselect (N0 xor AllOnes), N1, N2) -> (vselect N0, N2, N1). Only do this if the target indicates its vector boolean type is ZeroOrNegativeOneBooleanContent.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291645 91177308-0d34-0410-b5e6-96231b3b80d8

DAGCombiner: Add hasOneUse checks to fadd/fma combine

Even with aggressive fusion enabled, this requires duplicating
the fmul, or increases an fadd to another fma which is not an
improvement.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291642 91177308-0d34-0410-b5e6-96231b3b80d8

[Target] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291641 91177308-0d34-0410-b5e6-96231b3b80d8

Re-commit r289955: [X86] Fold (setcc (cmp (atomic_load_add x, -C) C), COND) to (setcc (LADD x, -C), COND) (PR31367)

This was reverted because it would miscompile code where the cmp had
multiple uses. That was due to a deficiency in the existing code, which
was fixed in r291630 (see the PR for details).

This re-commit includes an extra test for the kind of code that got
miscompiled: @test_sub_1_setcc_jcc.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291640 91177308-0d34-0410-b5e6-96231b3b80d8

tools/llvm-xray: Avoid std::errc::protocol_* to appease mingw, like r285261.

They are oriented from winsock and mingw doesn't import them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291636 91177308-0d34-0410-b5e6-96231b3b80d8

InstSimplify: Refactor function to use more switches

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291634 91177308-0d34-0410-b5e6-96231b3b80d8

Remove unused field.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291633 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Dont run combineSetCCAtomicArith() when the cmp has multiple uses

We would miscompile the following:

  void g(int);
  int f(volatile long long *p) {
    bool b = __atomic_fetch_add(p, 1, __ATOMIC_SEQ_CST) < 0;
    g(b ? 12 : 34);
    return b ? 56 : 78;
  }

into

  pushq   %rax
  lock            incq    (%rdi)
  movl    $12, %eax
  movl    $34, %edi
  cmovlel %eax, %edi
  callq   g(int)
  testq   %rax, %rax   <---- Bad.
  movl    $56, %ecx
  movl    $78, %eax
  cmovsl  %ecx, %eax
  popq    %rcx
  retq

because the code failed to take into account that the cmp has multiple
uses, replaced one of them, and left the other one comparing garbage.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291630 91177308-0d34-0410-b5e6-96231b3b80d8

[RegBankSelect] Improve the output of the debug messages.

Add more information about mapping cost and chosen solution.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291629 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeView/PDB] Rename a bunch of files.

We were starting to get some name clashes between llvm-pdbdump
and the common CodeView framework, so I took this opportunity
to rename a bunch of files to more accurately describe their
usage. This also helps in llvm-pdbdump to distinguish
between different files and whether they are used for pretty
dump mode or raw dump mode.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291627 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeView] Add TypeDatabase class.

This creates a centralized class in which to store type records.
It stores types as an array of entries, which matches the
notion of a type stream being a topologically sorted DAG.
Logic to build up such a database was already being used in
CVTypeDumper, so CVTypeDumper is now updated to to read from
a TypeDatabase which is filled out by an earlier visitor in
the pipeline.

Differential Revision: https://reviews.llvm.org/D28486

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291626 91177308-0d34-0410-b5e6-96231b3b80d8

Add better documentation for iterator facade subclasses.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291625 91177308-0d34-0410-b5e6-96231b3b80d8

InstSimplify: Eliminate fabs on known positive

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291624 91177308-0d34-0410-b5e6-96231b3b80d8

[gmock] Teach gmock ElementsAre and BeginEndDistanceIs matchers to
handle generic ranges by using std::begin and std::end rather than
requiring things to look exactly like an STL container.

Much of the credit for this goes to Dave Blaikie who helped me figure
out the right incantations.

This will probably be re-designed when I send this to the maintainers of
gmock, so I've instead structured it to change is little as possible
while it is a local patch. That makes it somewhat ugly, but I think a focused
change is better for getting this to work for LLVM today and letting the
upstream maintainers figure out the correct long-term pattern.

Differential Revision: https://reviews.llvm.org/D28288

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291623 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/EG,CM: Add fp16 conversion instructions

Differential Revision: https://reviews.llvm.org/D28164

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291622 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[PGO] Turn off comdat renaming in IR PGO by default"

This patch reverts r291588: [PGO] Turn off comdat renaming in IR PGO by default,
as we are seeing some hash mismatches in our internal tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291621 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add a wrapper for a common pair of transforms; NFCI

Some of the callers are artificially limiting this transform to integer types;
this should make it easier to incrementally remove that restriction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291620 91177308-0d34-0410-b5e6-96231b3b80d8

[loop-unroll] Properly populate LoopInfo for loops cloned in LoopUnrollRuntime.

Summary:
This fixes Transforms/LoopUnroll/runtime-loop3.ll which failed with
EXTENSIVE_DEBUG, because the cloned basic blocks were not added to the
correct sub-loops in LoopUnrollRuntime.cpp.

Reviewers: dexonsmith, mzolotukhin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28482

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291619 91177308-0d34-0410-b5e6-96231b3b80d8

[TM] Restore default TargetOptions in TargetMachine::resetTargetOptions.

Summary:
Previously if you had

* a function with the fast-math-enabled attr, followed by
* a function without the fast-math attr,

the second function would inherit the first function's fast-math-ness.

This means that mixing fast-math and non-fast-math functions in a module
was completely broken unless you explicitly annotated every
non-fast-math function with "unsafe-fp-math"="false".  This appears to
have been broken since r176986 (March 2013), when the resetTargetOptions
function was introduced.

This patch tests the correct behavior as best we can.  I don't think I
can test FPDenormalMode and NoTrappingFPMath, because they aren't used
in any backends during function lowering.  Surprisingly, I also can't
find any uses at all of LessPreciseFPMAD affecting generated code.

The NVPTX/fast-math.ll test changes are an expected result of fixing
this bug.  When FMA is disabled, we emit add as "add.rn.f32", which
prevents fma combining.  Before this patch, fast-math was enabled in all
functions following the one which explicitly enabled it on itself, so we
were emitting plain "add.f32" where we should have generated
"add.rn.f32".

Reviewers: mkuper

Subscribers: hfinkel, majnemer, jholewinski, nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D28507

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291618 91177308-0d34-0410-b5e6-96231b3b80d8

[NVPTX] Add CHECK-LABEL where appropriate to fast-math.ll test.

Also fix up whitespace.

Test-only change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291617 91177308-0d34-0410-b5e6-96231b3b80d8