
Merging r291966:
------------------------------------------------------------------------
r291966 | majnemer | 2017-01-13 14:24:27 -0800 (Fri, 13 Jan 2017) | 6 lines

[LoopStrengthReduce] Don't bother rewriting PHIs in catchswitch blocks

The catchswitch instruction cannot be split, so don't bother trying to
rewrite it.

This fixes PR31627.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@292340 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r292133:
------------------------------------------------------------------------
r292133 | hfinkel | 2017-01-16 07:22:01 -0800 (Mon, 16 Jan 2017) | 10 lines

Fix use-after-free bug in AffectedValueCallbackVH::allUsesReplacedWith

When transferring affected values in the cache from an old value, identified by
the value of the current callback, to the specified new value we might need to
insert a new entry into the DenseMap which constitutes the cache. Doing so
might delete the current callback object. Move the copying logic into a new
function, a member of the assumption cache itself, so that we don't run into UB
should the callback handle itself be removed mid-copy.
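
The hazard described above can be sketched in isolation. This is a minimal
illustration, not the AssumptionCache code: the type AffectedValueCache, its
flat vector storage, and the integer keys are all hypothetical stand-ins for
a table whose entries may be relocated or destroyed by an insertion.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical flat cache: entries live in a vector, so inserting a new key
// may reallocate and invalidate references to existing entries.
struct AffectedValueCache {
  std::vector<std::pair<int, std::vector<int>>> Entries;

  // Returns the list for Key, appending an empty entry if it is missing.
  std::vector<int> &lookupOrInsert(int Key) {
    for (auto &E : Entries)
      if (E.first == Key)
        return E.second;
    Entries.emplace_back(Key, std::vector<int>());
    return Entries.back().second;
  }

  // The copy is taken before the container is modified, so the insertion
  // for NewKey cannot leave this function reading relocated storage.
  void transfer(int OldKey, int NewKey) {
    std::vector<int> Copy = lookupOrInsert(OldKey);
    std::vector<int> &Dst = lookupOrInsert(NewKey);
    Dst.insert(Dst.end(), Copy.begin(), Copy.end());
  }
};
```

Keeping transfer() as a member of the cache, rather than of an object stored
inside it, mirrors the idea of the fix: the code doing the copy does not
depend on anything the insertion might remove.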

Differential Revision: https://reviews.llvm.org/D28749
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@292312 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r291968 and r291979:
------------------------------------------------------------------------
r291968 | dannyb | 2017-01-13 14:40:01 -0800 (Fri, 13 Jan 2017) | 23 lines

NewGVN: Move leaders around properly to ensure we have a canonical dominating leader. Fixes PR 31613.

Summary:
This is a testcase where phi node cycling happens, and because we do
not order the leaders by domination or anything similar, the leader
keeps changing.

Using std::set for the members is too expensive, and we actually don't
need them sorted all the time, only at leader changes.

We could keep both a set and a vector, and keep them mostly sorted and
resort as necessary, or use a set and a fibheap, but all of this seems
premature.

After running some statistics, we are able to avoid the vast majority
of sorting by keeping a "next leader" field. Most congruence classes only have
leader changes once or twice during GVN.

Reviewers: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28594
------------------------------------------------------------------------

------------------------------------------------------------------------
r291979 | dannyb | 2017-01-13 15:54:10 -0800 (Fri, 13 Jan 2017) | 1 line

NewGVN: Fix PR31613 test regex naming
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@292307 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r292255:
------------------------------------------------------------------------
r292255 | mgorny | 2017-01-17 13:04:19 -0800 (Tue, 17 Jan 2017) | 12 lines

[cmake] Update SOVERSION for the new versioning scheme

Update SOVERSION to use just the major version number rather than
major+minor, to match the new versioning scheme where only major is used
to indicate API/ABI version.

Since two-digit SOVERSIONs were introduced post 3.9 branching, this
change does not risk any SOVERSION collisions. In the past,
two-component X.Y SOVERSIONs were briefly used but those will not
interfere with the new ones since the new versions start at 4.

Differential Revision: https://reviews.llvm.org/D28730
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@292270 91177308-0d34-0410-b5e6-96231b3b80d8

Drop 'if you're using released version' warning

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@292263 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r292242:
------------------------------------------------------------------------
r292242 | bwilson | 2017-01-17 11:18:57 -0800 (Tue, 17 Jan 2017) | 5 lines

Revert r291640 change to fold X86 comparison with atomic_load_add.

Even with the fix from r291630, this still causes problems. I get
widespread assertion failures in the Swift runtime's WeakRefCount::increment()
function. I sent a reduced testcase in reply to the commit.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@292243 91177308-0d34-0410-b5e6-96231b3b80d8

Mention ThinLTO in ReleaseNotes

https://reviews.llvm.org/D28746

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@292079 91177308-0d34-0410-b5e6-96231b3b80d8

Mention invariant.group in ReleaseNotes

https://reviews.llvm.org/D28605

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@292009 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r291875:
------------------------------------------------------------------------
r291875 | chapuni | 2017-01-12 17:13:10 -0800 (Thu, 12 Jan 2017) | 8 lines

Revert r291503, "Lift the 10-type limit for AlignedCharArrayUnion", and followings.

  r291503, "Lift the 10-type limit for AlignedCharArrayUnion"
  r291514, "Fix MSVC build of AlignedCharArrayUnion"
  r291515, "Revert the attempt to optimize the constexpr functions. MSVC does not handle this yet"
  r291519, "Try once again to fix the MSVC build of AlignedCharArrayUnion"

They have been failing on i686-linux.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@291945 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r291863:
------------------------------------------------------------------------
r291863 | chapuni | 2017-01-12 16:17:15 -0800 (Thu, 12 Jan 2017) | 1 line

xray-account: Avoid std::errc::bad_message to appease mingw.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@291914 91177308-0d34-0410-b5e6-96231b3b80d8

ReleaseNotes: remove 'if you're reading on trunk' warning

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@291854 91177308-0d34-0410-b5e6-96231b3b80d8

Drop 'svn' suffix from version.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@291843 91177308-0d34-0410-b5e6-96231b3b80d8

Creating release_40 branch off revision 291814

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@291816 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] Remove redundant check in SimplifyCFG; NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291813 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] Handle same locations in DILocation::getMergedLocation

Revision 289661 introduced the function DILocation::getMergedLocation for
merging of debug locations. At the time it was simply a stub which always
returned no location. This patch modifies getMergedLocation to handle the
case where the two locations are the same or can't be discriminated.

Differential Revision: https://reviews.llvm.org/D28521

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291809 91177308-0d34-0410-b5e6-96231b3b80d8

[SCEV] Simplify SolveLinEquationWithOverflow a bit.

Cleanup in preparation for generalizing it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291808 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Replace AND+IMM64 with SRL/SHL

Emit SHRQ/SHLQ instead of ANDQ with a 64 bit constant mask if the result
is unused and the mask has only higher/lower bits set. For example, with
this patch LLVM emits

  shrq $41, %rdi
  je

instead of

  movabsq $0xFFFFFE0000000000, %rcx
  testq   %rcx, %rdi
  je

This reduces the number of instructions, code size and register pressure.
The transformation is applied only for cases where the mask cannot be
encoded as an immediate value within TESTQ instruction.
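
The equivalence behind the example above can be checked at the source level.
A minimal sketch, assuming only the mask from the example; the function names
are hypothetical. The mask 0xFFFFFE0000000000 has exactly bits 41..63 set, so
testing it against zero is the same as testing whether the value shifted
right by 41 is zero.

```cpp
#include <cstdint>

// "No high bits set", written as a test against the 64-bit mask.
constexpr bool highBitsClearViaMask(std::uint64_t X) {
  return (X & 0xFFFFFE0000000000ULL) == 0;
}

// The same predicate written as a logical right shift by 41.
constexpr bool highBitsClearViaShift(std::uint64_t X) {
  return (X >> 41) == 0;
}
```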

Differential Revision: https://reviews.llvm.org/D28198

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291806 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Modify BypassSlowDivision tests to match their new names (NFC)

- bypass-slow-division-32.ll:
  tests verifying correctness of divl-to-divb bypassing

- bypass-slow-division-64.ll:
  tests verifying correctness of divq-to-divl bypassing

- bypass-slow-division-tune.ll:
  tests verifying that bypassing is enabled only when appropriate

Differential Revision: https://reviews.llvm.org/D28551

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291804 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-config] Fix obviously wrong code in parsing DyLib components.

The code parsing the string misused the offset returned by
StringRef::find(), assuming it was relative to the starting
offset that is passed to the function, but the returned offset
is always relative to the beginning of the line.

This causes odd behaviour while parsing the component string.
Spotted thanks to the newly added test:

tools/llvm-config/booleans.test
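
The offset convention at issue can be illustrated with std::string::find,
which follows the same rule: the result is an index from the beginning of the
string, whatever start position is passed. This is a sketch, not the
llvm-config code; splitComponents and the ';' separator are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Split Line on Sep. find(Sep, Start) returns an index measured from the
// beginning of Line, not from Start.
std::vector<std::string> splitComponents(const std::string &Line, char Sep) {
  std::vector<std::string> Parts;
  std::string::size_type Start = 0;
  while (true) {
    std::string::size_type End = Line.find(Sep, Start);
    if (End == std::string::npos) {
      Parts.push_back(Line.substr(Start));
      return Parts;
    }
    // The piece length is End - Start because End is an absolute index.
    Parts.push_back(Line.substr(Start, End - Start));
    Start = End + 1;
  }
}
```

Treating End as relative to Start would instead take End characters from
Start, which over-reads every component after the first.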

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291803 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Rename tests for bypassing slow division (NFC)

For tests on bypassing slow division there's no need to be
Atom-specific. The patch renames all tests on division bypassing
and makes their names more consistent:

  atom-bypass-slow-division.ll -> bypass-slow-division-32.ll
  (tests verifying correctness of divl-to-divb bypassing)

  atom-bypass-slow-division-64.ll -> bypass-slow-division-64.ll
  (tests verifying correctness of divq-to-divl bypassing)

  slow-div.ll -> bypass-slow-division-tune.ll
  (tests verifying that bypassing is enabled only when appropriate)

Differential Revision: https://reviews.llvm.org/D28197

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291802 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Tune bypassing of slow division for Intel CPUs

64-bit integer division in Intel CPUs is extremely slow, much slower
than 32-bit division. On the other hand, 8-bit and 16-bit divisions
aren't any faster. The only important exception is Atom where DIV8
is fastest. Because of that, the patch
1) Enables bypassing of 64-bit division for Atom, Silvermont and
   all big cores.
2) Modifies 64-bit bypassing to use 32-bit division instead of
   16-bit one. This doesn't make the shorter division slower but
   increases chances of taking it. Moreover, it's much more likely
   to prove at compile-time that a value fits 32 bits and doesn't
   require a run-time check (e.g. zext i32 to i64).
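
At the source level, the bypass in point 2 has roughly the following shape.
This is only a sketch of the idea for unsigned division, with a hypothetical
function name; the actual transformation is done on IR, and the divisor is
assumed to be nonzero.

```cpp
#include <cstdint>

// If neither operand has a bit set above bit 31, the quotient can be
// computed with a 32-bit division; otherwise fall back to 64-bit division.
constexpr std::uint64_t bypassedUDiv64(std::uint64_t A, std::uint64_t B) {
  return ((A | B) >> 32) == 0
             ? static_cast<std::uint64_t>(static_cast<std::uint32_t>(A) /
                                          static_cast<std::uint32_t>(B))
             : A / B;
}
```

When an operand is known to come from a 32-bit value (e.g. zext i32 to i64),
the run-time check on that operand is statically true and can be dropped.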

Differential Revision: https://reviews.llvm.org/D28196

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291800 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Update LLC tests for slow division bypassing (NFC)

Run update_llc_test_checks.py on

    CodeGen/X86/atom-bypass-slow-division.ll
    CodeGen/X86/atom-bypass-slow-division-64.ll
    CodeGen/X86/slow-div.ll

Differential Revision: https://reviews.llvm.org/D28469

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291799 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Skip fneg/select combine if it can fold into other

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291792 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fold free fneg into sin

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291790 91177308-0d34-0410-b5e6-96231b3b80d8

ARM: slightly more table driven libcall setup

Switch some additional library call setup to be table driven. This
makes it more immediately obvious what the library call looks like.
This is important for ARM since the calling conventions for the builtins
change based on the target/libcall name. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291789 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] DILocation variable declaration should be const; NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291787 91177308-0d34-0410-b5e6-96231b3b80d8

Avoid std::errc::protocol_* to appease mingw

Like r291636 and r285261.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291786 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] Add const to DILocation variable declaration; NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291785 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fold fneg into fmul_legacy

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291784 91177308-0d34-0410-b5e6-96231b3b80d8

Bump year to 2017 in LICENSE.txt

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291782 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fold fneg into rcp

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291779 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fold fneg into fp_round

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291778 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fold fneg into fp_extend

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291777 91177308-0d34-0410-b5e6-96231b3b80d8

Fix some -Wsign-compare warnings by making some integer literals explicitly unsigned

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291776 91177308-0d34-0410-b5e6-96231b3b80d8

TTI: Add comment clarifying the meaning of MemIntrinsicInfo::PtrVal.

Patch by Tom Stellard.
Differential Revision: https://reviews.llvm.org/D27563

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291772 91177308-0d34-0410-b5e6-96231b3b80d8

[globalisel] Move as much RegisterBank initialization to the constructor as possible

Summary:
The register bank is now entirely initialized in the constructor. However,
we still have the hardcoded number of register classes which will be
dealt with in the TableGen patch (D27338) since we do not have access
to this information to resolve this at this stage. The number of register
classes is known to the TRI and to TableGen but the RegisterBank
constructor is too early for the former and too late for the latter.
This will be fixed when the data is tablegen-erated.

Reviewers: t.p.northover, ab, rovka, qcolombet

Subscribers: aditya_nandakumar, kristof.beyls, vkalintiris, llvm-commits, dberris

Differential Revision: https://reviews.llvm.org/D27809

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291770 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] Added DI macro creation API to DIBuilder.

Differential Revision: https://reviews.llvm.org/D16077

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291769 91177308-0d34-0410-b5e6-96231b3b80d8

[globalisel] Initialize RegisterBanks with static data.

Summary:
Refactor the RegisterBank initialization to use static data. This requires
GlobalISel implementations to rewrite calls to createRegisterBank() and
addRegBankCoverage() into a call to setRegBankData().

Out of tree targets can use diff 4 of D27807
(https://reviews.llvm.org/D27807?id=84117) to have addRegBankCoverage() dump
the register classes and other data that needs to be provided to
setRegBankData(). This is the method that was used to generate the static data
in this patch.

Tablegen-eration of this static data will follow after some refactoring.

Reviewers: t.p.northover, ab, rovka, qcolombet

Subscribers: aditya_nandakumar, kristof.beyls, vkalintiris, llvm-commits, dberris

Differential Revision: https://reviews.llvm.org/D27807
Differential Revision: https://reviews.llvm.org/D27808

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291768 91177308-0d34-0410-b5e6-96231b3b80d8

[Devirtualization] MemDep returns non-local !invariant.group dependencies

Summary:
Memory Dependence Analysis was limited to returning only local dependencies
for invariant.group handling. Now it returns NonLocal when it finds one,
and a follow-up getNonLocalPointerDependency query yields the found dep.

Thanks to this we are able to devirtualize loops!

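    // A has a virtual foo(); foo and A::A() may be defined elsewhere.
    struct A {
      A();
      virtual void foo();
    };
    void indirect(A &a, int n) {
      for (int i = 0; i < n; i++)
        a.foo();
    }
    void test(int n) {
      A a;
      indirect(a, n);
    }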

After inlining, a.foo() will be changed to a direct call, even if foo and A::A()
are external (but only if the vtable definition is available).

Reviewers: nlewycky, dberlin, chandlerc, rsmith

Subscribers: mehdi_amini, davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D28137

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291762 91177308-0d34-0410-b5e6-96231b3b80d8

Wdocumentation fix

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291761 91177308-0d34-0410-b5e6-96231b3b80d8

Fix windows buildbots building llvm-xray

2 issues:
1 - replaced unix-style pid_t with cross-platform llvm::sys::ProcessInfo::ProcessId
2 - fixed shadow variable warning in lambda expression

Reviewed by @filcab

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291760 91177308-0d34-0410-b5e6-96231b3b80d8

[XRay] Include <numeric> for std::accumulate.

Fix-up following D24377.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291750 91177308-0d34-0410-b5e6-96231b3b80d8

[XRay] Implement the `llvm-xray account` subcommand

Summary:
This is the third of a multi-part change to implement subcommands for
the `llvm-xray` tool.

Here we define the `account` subcommand which does simple function call
accounting, generating basic statistics on function calls we find in an
XRay log/trace. We support text output and csv output for this
subcommand.

This change also supports sorting, summing, and filtering the top N
results.

Part of this tool will later be turned into a library that could be used
for basic function call accounting.

Depends on D24376.

Reviewers: dblaikie, echristo

Subscribers: mehdi_amini, dberris, beanz, llvm-commits

Differential Revision: https://reviews.llvm.org/D24377

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291749 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix sub_oneuse being marked commutative

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291748 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Improve lowering of zero_extend of v4i1 to v4i32 and v2i1 to v2i64 with VLX, but no DQ or BW support.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291747 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Improve lowering of sign_extend of v4i1 to v4i32 and v2i1 to v2i64 when avx512vl is available, but not avx512dq.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291746 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX512] Fix PR31515 - Do not flip vselect condition if it's not a vXi1 mask

r289653 added a case where `vselect <cond> <vector1> <all-zeros>`
is transformed to:
`vselect xor(cond, DAG.getConstant(1, DL, CondVT)) <all-zeros> <vector1>`
This was not meant to catch cases where Cond is not a vXi1
mask, but it does. Moreover, when the Cond type is vXiN (N > 1),
xor(cond, DAG.getConstant(1, DL, CondVT)) != NOT(cond).
This patch changes the above to xor with allones, and avoids
entering the case for non-mask Conds.
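
The difference between the two xor constants is easy to see on a single
lane wider than one bit. A minimal sketch for an 8-bit lane, with
hypothetical function names: xor with 1 flips only the lowest bit, while xor
with all-ones is the bitwise NOT.

```cpp
#include <cstdint>

// One 8-bit condition lane xor'ed with the constant 1: only bit 0 flips.
constexpr std::uint8_t xorWithOne(std::uint8_t Lane) {
  return static_cast<std::uint8_t>(Lane ^ 1u);
}

// The same lane xor'ed with all-ones: every bit flips, i.e. NOT(Lane).
constexpr std::uint8_t xorWithAllOnes(std::uint8_t Lane) {
  return static_cast<std::uint8_t>(Lane ^ 0xFFu);
}
```

For a "true" lane of 0xFF, xorWithOne yields 0xFE, which is still nonzero,
whereas xorWithAllOnes yields 0x00.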

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291745 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Add more varied avx512 feature command lines to the avx512-cvt.ll test to show some poor codegen examples.

We're definitely doing bad things when avx512vl is enabled without avx512dq. It looks like avx512vl/dq without avx512bw may also have some issues.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291744 91177308-0d34-0410-b5e6-96231b3b80d8

Make a test actually test what it set out to test.

This test seems to have largely been relying on asserts being tripped.
It had a very specific and somewhat uninteresting grep of the output,
but it never really did anything to cause SCEV to be preserved across
loop simplify, certainly not explicitly. And a later addition to it
actually added CHECK lines despite the test never running FileCheck.

Now we actually print SCEV before and after loop simplify to make sure
it is *changing* and being *updated*. Which seems to be much more likely
the point of the test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291740 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fold fneg into fma or fmad

Patch mostly by Fiona Glaser

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291733 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fold fneg into fmul

Patch mostly by Fiona Glaser

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291732 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fold fneg into fadd

Patch mostly by Fiona Glaser

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291731 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Pull fneg/fabs out of a select

Allows better source modifier usage.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291729 91177308-0d34-0410-b5e6-96231b3b80d8

[NewGVN] Fixup store count for the `initial` congruency class.

It was always zero. When we move a store from `initial` to its
own congruency class, we end up with a negative store count, which
is obviously wrong.
Also, while here, change StoreCount to be signed so that the assertions
actually fire.
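
Why the counter's signedness matters for the assertions can be sketched as
follows. The function names are hypothetical and this is not the NewGVN code:
decrementing an unsigned zero wraps to the largest unsigned value, so a
"count >= 0" check can never fail, while the signed version goes to -1 and
the check can catch it.

```cpp
// Decrementing an unsigned counter at zero wraps around modulo 2^N.
constexpr unsigned decrementUnsigned(unsigned Count) { return Count - 1u; }

// Decrementing a signed counter at zero yields -1.
constexpr int decrementSigned(int Count) { return Count - 1; }

// A non-negativity check is only meaningful on the signed counter.
constexpr bool wentBelowZero(int Count) { return Count < 0; }
```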

Ack'ed by Daniel Berlin.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291725 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeView] Finish decoupling TypeDatabase from TypeDumper.

Previously the type dumper itself was passed around to a lot of different
places and manipulated in ways that were more appropriate on the type
database. For example, the entire TypeDumper was passed into the symbol
dumper, when all the symbol dumper wanted to do was lookup the name of a
TypeIndex so it could print it. That's what the TypeDatabase is for --
mapping type indices to names.

Another example is how if the user runs llvm-pdbdump with the option to
dump symbols but not types, we still have to visit all types so that we
can print minimal information about the type of a symbol, but just without
dumping full symbol records. The way we did this before is by hacking it
up so that we run everything through the type dumper with a null printer,
so that the output goes to /dev/null. But really, we don't need to dump
anything, all we want to do is build the type database. Since
TypeDatabaseVisitor now exists independently of TypeDumper, we can do
this. We just build a custom visitor callback pipeline that includes a
database visitor but not a dumper.

All the hackery around printers etc goes away. After this patch, we could
probably even delete the entire CVTypeDumper class since really all it is
at this point is a thin wrapper that hides the details of how to build a
useful visitation pipeline. It's not a priority though, so CVTypeDumper
remains for now.

After this patch we will be able to easily plug in a different style of
type dumper by only implementing the proper visitation methods to dump
one-line output and then sticking it on the pipeline.

Differential Revision: https://reviews.llvm.org/D28524

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291724 91177308-0d34-0410-b5e6-96231b3b80d8

X86: Remove dead code. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291721 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix shrinking of addc/subb.

To shrink to VOP2 the input carry must also be VCC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291720 91177308-0d34-0410-b5e6-96231b3b80d8

Add -Wl,-color-diagnostics if a linker supports the option.

Differential Revision: https://reviews.llvm.org/D28046

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291719 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix sext_inreg for i1 in i16

This produces worse code when i16 is legal, mostly
due to combines getting confused by conversions inserted
for uniform 16-bit operations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291717 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix breaking VOP3 v_add_i32s

This was shrinking the instruction even though the carry output
register was a virtual register, not known VCC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291716 91177308-0d34-0410-b5e6-96231b3b80d8

[asan] Set alignment of __asan_global_* globals to sizeof(GlobalStruct)

When using profiling and ASan together (-fprofile-instr-generate -fcoverage-mapping -fsanitize=address), at least on Darwin, the section of globals that ASan emits (__asan_globals) is misaligned and starts at an odd offset. This really doesn't have anything to do with profiling, but it triggers the issue because profiling emits a string section, which can have arbitrary size. This patch changes the alignment to sizeof(GlobalStruct).

Differential Revision: https://reviews.llvm.org/D28573

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291715 91177308-0d34-0410-b5e6-96231b3b80d8

Use EXPECT_EQ instead of ASSERT_EQ in a unit test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291713 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[NewGVN] Strengthen a couple of assertions."

It's breaking some bots. Will investigate and recommit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291712 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix folding immediates into mac src2

Whether the fold is legal must be checked against the instruction
it will be replaced with.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291711 91177308-0d34-0410-b5e6-96231b3b80d8

[NewGVN] Parenthesise assertion condition (-Wparentheses).

Format an assertion message while I'm here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291710 91177308-0d34-0410-b5e6-96231b3b80d8

[NewGVN] Strengthen a couple of assertions.

StoreCount >= 0 on `unsigned` is always true, otherwise.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291709 91177308-0d34-0410-b5e6-96231b3b80d8

Add test that verifies we don't peel loops in optsize functions. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291708 91177308-0d34-0410-b5e6-96231b3b80d8

LowerTypeTests: Represent the memory region size with the constant size-1.

This means that we can use a shorter instruction sequence in the case where
the size is a power of two and on the boundary between two representations.
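
The boundary effect can be illustrated with a small sketch. The 8-bit width
and all names here are assumptions chosen for illustration; the message does
not say which representation widths the pass uses. A size of 256 does not fit
in 8 bits but 255 does, and for any size of at least 1 the check
"offset < size" is the same as "offset <= size - 1".

```cpp
#include <cstdint>

// Hypothetical helper: does V fit in an unsigned field of Bits bits
// (Bits must be less than 64)?
constexpr bool fitsInBits(std::uint64_t V, unsigned Bits) {
  return (V >> Bits) == 0;
}

// Range check against the size (exclusive bound).
constexpr bool inRangeBySize(std::uint64_t Offset, std::uint64_t Size) {
  return Offset < Size;
}

// The same check against SizeM1 = Size - 1 (inclusive bound), for Size >= 1.
constexpr bool inRangeBySizeM1(std::uint64_t Offset, std::uint64_t SizeM1) {
  return Offset <= SizeM1;
}
```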

Differential Revision: https://reviews.llvm.org/D28421

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291706 91177308-0d34-0410-b5e6-96231b3b80d8

[SCEV] Make howFarToZero max backedge-taken count check for precondition.

Refines max backedge-taken count if a loop like
"for (int i = 0; i != n; ++i) { /* body */ }" is rotated.

Differential Revision: https://reviews.llvm.org/D28536

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291704 91177308-0d34-0410-b5e6-96231b3b80d8

[SCEV] Make howFarToZero use a simpler formula for max backedge-taken count.

This is both easier to understand, and produces a tighter bound in certain
cases.

Differential Revision: https://reviews.llvm.org/D28393

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291701 91177308-0d34-0410-b5e6-96231b3b80d8

Re-apply r291205, "LowerTypeTests: Split the pass in two: a resolution phase and a lowering phase.", with a fix for an off-by-one error.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291699 91177308-0d34-0410-b5e6-96231b3b80d8

NewGVN: Fix PR31594, by tracking the store count of congruence
classes, and updating checking to allow for equivalence through
reachability.

(Sadly, the checking here is not perfect, and can't be made perfect,
so we'll have to disable it after we are satisfied with correctness.
Right now it is just "very unlikely" to happen.)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291698 91177308-0d34-0410-b5e6-96231b3b80d8

NewGVN: Refactor performCongruenceFinding and split out congruence class moving

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291697 91177308-0d34-0410-b5e6-96231b3b80d8

Resubmit "[PGO] Turn off comdat renaming in IR PGO by default"

This patch resubmits the changes in r291588.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291696 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "CodeGen: Allow small copyable blocks to "break" the CFG."

This reverts commit ada6595a526d71df04988eb0a4b4fe84df398ded.

This needs a simple probability check because there are some cases where it is
not profitable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291695 91177308-0d34-0410-b5e6-96231b3b80d8

Make some operator bools explicit for sanity/safety.

There are a couple left in bool-like containers (BitVector, etc) where
the implicit conversions seem more suitable - though it might be worth
considering explicitifying those too.
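
The effect of the change on a single class can be sketched like this. Handle
is a hypothetical type, not one of the classes touched by the commit: with an
explicit operator bool, conditions such as "if (H)" and "A && B" still
compile, but the object no longer converts implicitly to bool or int.

```cpp
#include <type_traits>

// Hypothetical handle type with an explicit boolean conversion.
struct Handle {
  const void *Ptr = nullptr;
  constexpr explicit operator bool() const { return Ptr != nullptr; }
};

// Contextual conversion (the operands of &&) still works.
constexpr bool bothValid(const Handle &A, const Handle &B) { return A && B; }

// Implicit conversions are rejected; direct initialization is still allowed.
static_assert(!std::is_convertible<Handle, bool>::value, "no implicit bool");
static_assert(!std::is_convertible<Handle, int>::value, "no implicit int");
static_assert(std::is_constructible<bool, Handle>::value, "explicit is fine");
```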

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291694 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] More aggressive matching for vpadd and vpaddl.

The new matchers work after legalization to make them simpler, and to avoid
blocking other optimizations.

Differential Revision: https://reviews.llvm.org/D27779

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291693 91177308-0d34-0410-b5e6-96231b3b80d8

[SLP] Remove bogus assert.

The removed assert seems bogus - it's perfectly legal for the roots of the
vectorized subtrees to be equal even if the original scalar values aren't,
if the original scalars happen to be equivalent.

This fixes PR31599.

Differential Revision: https://reviews.llvm.org/D28539

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291692 91177308-0d34-0410-b5e6-96231b3b80d8

[lib/Object] Unbreak build with -Werror (unused variable). NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291691 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][XOP] Add vpermil2ps target shuffle -> insertps combine test

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291690 91177308-0d34-0410-b5e6-96231b3b80d8

Remove all variants of DWARFDie::getAttributeValueAs...() that had parameters that specified default values.

Now we only support returning Optional<> values and have changed all clients over to use Optional::getValueOr().
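
The client-side pattern looks roughly like the sketch below. It uses
std::optional and value_or from C++17 as stand-ins for llvm::Optional and
getValueOr(), and both getAttr and lowPcOrZero are hypothetical names, not
DWARFDie API.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical accessor: returns an empty optional when the attribute is
// absent, instead of taking a default value as a parameter.
constexpr std::optional<std::uint64_t> getAttr(bool Present,
                                               std::uint64_t Value) {
  return Present ? std::optional<std::uint64_t>(Value) : std::nullopt;
}

// The caller chooses the default at the use site.
constexpr std::uint64_t lowPcOrZero(bool Present, std::uint64_t Value) {
  return getAttr(Present, Value).value_or(0);
}
```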

Differential Revision: https://reviews.llvm.org/D28569

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291686 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: only print debug info with -debug. NFC.

Turns out DEBUG(...) has uses even inside NDEBUG checks.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291685 91177308-0d34-0410-b5e6-96231b3b80d8

Revert rL291205 because it breaks Chrome tests under CFI.

Summary:
Revert LowerTypeTests: Split the pass in two: a resolution phase and a lowering phase.

This change separates how type identifiers are resolved from how intrinsic
calls are lowered. All information required to lower an intrinsic call
is stored in a new TypeIdLowering data structure. The idea is that this
data structure can either be initialized using the module itself during
regular LTO, or using the module summary in ThinLTO backends.

Original URL: https://reviews.llvm.org/D28341

Reviewers: pcc

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D28532

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291684 91177308-0d34-0410-b5e6-96231b3b80d8

build_llvm_package.bat: Add note about what SWIG version to use

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291682 91177308-0d34-0410-b5e6-96231b3b80d8

Remove trailing whitespace. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291680 91177308-0d34-0410-b5e6-96231b3b80d8

[MemDep] NFC variable name change

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291679 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Fix test CodeGen/ARM/fpcmp_ueq.ll broken by rL290616

Commit rL290616 (https://reviews.llvm.org/rL290616) changed a checking command
for the triple arm-apple-darwin in LLVM::CodeGen/ARM/fpcmp_ueq.ll. As a result
of the changes the test could fail for the valid generated code.

These changes fix the test to check only instructions we would expect.

Differential Revision: https://reviews.llvm.org/D28159

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291678 91177308-0d34-0410-b5e6-96231b3b80d8

[lib/Object] - Introduce Decompressor class.

The Decompressor class is intended to reduce code duplication.
Currently LLD has its own implementation of a decompressor
for compressed debug sections.

This class helps to avoid that and share the code.
The LLD patch that reuses it is D28106.

Differential revision: https://reviews.llvm.org/D28105

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291675 91177308-0d34-0410-b5e6-96231b3b80d8

[SystemZ] Improve isFoldableMemAccessOffset().

A store of an extracted element, or a load which gets inserted into a vector,
will be combined into a vector load/store element instruction.

Therefore, isFoldableMemAccessOffset(), which is called by LSR, should
return false in these cases.

Reviewer: Ulrich Weigand

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291673 91177308-0d34-0410-b5e6-96231b3b80d8

Make processing @llvm.assume more efficient - Add affected values to the assumption cache

Here's my second try at making @llvm.assume processing more efficient. My
previous attempt, which leveraged operand bundles, r289755, didn't end up
working: it did make assume processing more efficient but eliminating the
assumption cache made ephemeral value computation too expensive. This is a
more-targeted change. We'll keep the assumption cache, but extend it to keep a
map of affected values (i.e. values about which an assumption might provide
some information) to the corresponding assumption intrinsics. This allows
ValueTracking and LVI to find assumptions relevant to the value being queried
without scanning all assumptions in the function. The fact that ValueTracking
started doing O(number of assumptions in the function) work, for every
known-bits query, has become prohibitively expensive in some cases.
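
The affected-values map described above can be illustrated with a minimal,
standalone sketch. This is not the AssumptionCache code from the patch: the
class name AssumptionIndex, its methods, and the use of plain integer IDs for
values and assumptions are all assumptions made for illustration. It only
shows the idea of looking up the assumptions for one value instead of
scanning every assumption in the function.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins: values and assume calls are plain integer IDs.
using ValueId = int;
using AssumeId = int;

// Illustrative index only; not the real llvm::AssumptionCache interface.
class AssumptionIndex {
public:
  // Record an assumption along with the values it might say something about.
  void registerAssumption(AssumeId Assume,
                          const std::vector<ValueId> &Affected) {
    AllAssumptions.push_back(Assume);
    for (ValueId V : Affected)
      AffectedValues[V].push_back(Assume);
  }

  // Assumptions relevant to V, found with one map lookup rather than a scan
  // over every assumption in the function.
  const std::vector<AssumeId> &assumptionsFor(ValueId V) const {
    static const std::vector<AssumeId> None;
    auto It = AffectedValues.find(V);
    return It == AffectedValues.end() ? None : It->second;
  }

  std::size_t totalAssumptions() const { return AllAssumptions.size(); }

private:
  std::vector<AssumeId> AllAssumptions;
  std::unordered_map<ValueId, std::vector<AssumeId>> AffectedValues;
};
```

With this shape, a query about one value costs a hash lookup plus a walk over
only the assumptions recorded for that value, which is the cost profile the
paragraph above is aiming for.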

As discussed during the review, this is a pragmatic fix that, longer term, will
likely be replaced by a more-principled solution (perhaps based on an extended
SSA form).

Differential Revision: https://reviews.llvm.org/D28459

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291671 91177308-0d34-0410-b5e6-96231b3b80d8

X86 CodeGen: Optimized pattern for truncate with unsigned saturation.

DAG pattern optimization: truncate + unsigned saturation is supported by VPMOVUS* instructions in AVX-512
and by VPACKUS* instructions on SSE* targets.
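
As a scalar illustration of what "truncate with unsigned saturation" means,
here is a minimal sketch for one 16-bit lane narrowed to 8 bits. The function
name truncateUSat is hypothetical and the clamp-then-truncate shape is an
assumption about the matched pattern; the patch itself operates on vector DAG
nodes, not on scalar C++.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Clamp to the largest value the narrow type can hold, then truncate.
// A plain truncate would wrap (300 -> 44); the saturating form gives 255.
std::uint8_t truncateUSat(std::uint16_t X) {
  const std::uint16_t Clamped = std::min<std::uint16_t>(X, 0xFF);
  return static_cast<std::uint8_t>(Clamped);
}
```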

Differential Revision: https://reviews.llvm.org/D28216

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291670 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Assembler: SDWA/DPP should not accept scalar registers and immediate operands

Reviewers: artem.tamazov, nhaustov, vpykhtin, tstellarAMD

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D28157

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291668 91177308-0d34-0410-b5e6-96231b3b80d8

Fix unused variable warning

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291666 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX512BW] Vectorize v64i8 vector shifts

Differential Revision: https://reviews.llvm.org/D28447

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291665 91177308-0d34-0410-b5e6-96231b3b80d8

Fix line endings

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291663 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Separate the LoopAnalysisManager from the LoopPassManager and move
the latter to the Transforms library.

While the loop PM uses an analysis to form the IR units, the current
plan is to have the PM itself establish and enforce both loop simplified
form and LCSSA. This would be a layering violation in the analysis
library.

Fundamentally, the idea behind the loop PM is to *transform* loops in
addition to running passes over them, so it really seemed like the most
natural place to sink this was into the transforms library.

We can't just move *everything* because we also have loop analyses that
rely on a subset of the invariants. So this patch splits the loop
infrastructure into the analysis management that has to be part of the
analysis library, and the transform-aware pass manager.

This also required splitting the loop analyses' printer passes out to
the transforms library, which makes sense to me as running these will
transform the code into LCSSA in theory.

I haven't split the unittest though because testing one component
without the other seems nearly intractable.

Differential Revision: https://reviews.llvm.org/D28452

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291662 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Take more drastic measures to work around MSVC's failure on this
code. If this doesn't work and I can't find someone to help who has MSVC
installed, I'll back everything out I guess. =[

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291661 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Fix PR30926 - Add patterns for (v)cvtsi2s{s,d} and (v)cvtsd2s{s,d}

The code emitted by Clang's intrinsics for (v)cvtsi2ss, (v)cvtsi2sd,
(v)cvtsd2ss and (v)cvtss2sd is lowered to a code sequence that includes
redundant (v)movss/(v)movsd instructions. This patch adds patterns for
optimizing these sequences.

Differential revision: https://reviews.llvm.org/D28455

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291660 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] fixing failed test in commit: r291657

Missing Requires asserts.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291659 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] updating TTI costs for arithmetic instructions on X86\SLM arch.

updated instructions:
pmulld, pmullw, pmulhw, mulsd, mulps, mulpd, divss, divps, divsd, divpd, addpd and subpd.

Special optimization case: pmulld is replaced with a pmullw\pmulhw\pshuf sequence
when the real operand bitwidth is <= 16.
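
The arithmetic behind that special case can be checked with a scalar sketch
for a single lane. The helper names below are hypothetical, and the final
shift/or only models the step that interleaves the low and high halves; this
is an illustration of why the replacement is valid when both operands fit in
16 bits, not the cost-model code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Low 16 bits of the signed 16x16 product (what pmullw yields per lane).
std::uint16_t mulLo16(std::int16_t A, std::int16_t B) {
  const std::int32_t Prod = std::int32_t(A) * std::int32_t(B);
  return static_cast<std::uint16_t>(static_cast<std::uint32_t>(Prod));
}

// High 16 bits of the signed 16x16 product (what pmulhw yields per lane).
std::uint16_t mulHi16(std::int16_t A, std::int16_t B) {
  const std::int32_t Prod = std::int32_t(A) * std::int32_t(B);
  return static_cast<std::uint16_t>(static_cast<std::uint32_t>(Prod) >> 16);
}

// Recombine the halves into the 32-bit result a pmulld lane would hold,
// returned as a raw bit pattern.
std::uint32_t mul32Via16(std::int16_t A, std::int16_t B) {
  return (std::uint32_t(mulHi16(A, B)) << 16) | mulLo16(A, B);
}
```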

Differential Revision: https://reviews.llvm.org/D28104

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291657 91177308-0d34-0410-b5e6-96231b3b80d8