granicus.if.org Git

AMDGPU: Improve accuracy of instruction rates for VOPC

These were all using the default 32-bit VALU write class,
but the i64/f64 compares are half rate.

I'm not sure this is really correct, because they are still using
the VALU write class, even though they really write
to the SALU.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248582 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalsAA] Teach GlobalsAA about nocapture

Arguments to function calls marked "nocapture" can be marked as
non-escaping. However, nocapture is defined in terms of the lifetime
of the callee, and if the callee can directly or indirectly recurse to
the caller, the semantics of nocapture are invalid.

Therefore, we eagerly discover which SCC each function belongs to,
and later can check if callee and caller of a callsite belong to
the same SCC, in which case there could be recursion.
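
The same-SCC check described above can be sketched on a toy call graph. This is only an illustration under stated assumptions: functions are plain integer ids, the graph is an adjacency list, and the names `CallGraph`, `computeSCCIds` and `mayRecurse` are hypothetical. Tarjan's algorithm is used here for convenience; the sketch does not claim to mirror how the patch itself computes SCCs.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Toy call graph (hypothetical): Callees[F] lists the functions F may call.
using CallGraph = std::vector<std::vector<int>>;

// Assigns every function an SCC id using Tarjan's algorithm.
std::vector<int> computeSCCIds(const CallGraph &Callees) {
  const int N = static_cast<int>(Callees.size());
  std::vector<int> Index(N, -1), Low(N, 0), SCCId(N, -1);
  std::vector<bool> OnStack(N, false);
  std::vector<int> Stack;
  int NextIndex = 0, NextSCC = 0;

  std::function<void(int)> Visit = [&](int F) {
    Index[F] = Low[F] = NextIndex++;
    Stack.push_back(F);
    OnStack[F] = true;
    for (int Callee : Callees[F]) {
      if (Index[Callee] == -1) {
        Visit(Callee);
        Low[F] = std::min(Low[F], Low[Callee]);
      } else if (OnStack[Callee]) {
        Low[F] = std::min(Low[F], Index[Callee]);
      }
    }
    if (Low[F] == Index[F]) { // F is the root of an SCC; pop its members.
      int Member;
      do {
        Member = Stack.back();
        Stack.pop_back();
        OnStack[Member] = false;
        SCCId[Member] = NextSCC;
      } while (Member != F);
      ++NextSCC;
    }
  };

  for (int F = 0; F < N; ++F)
    if (Index[F] == -1)
      Visit(F);
  return SCCId;
}

// Conservative check: if caller and callee share an SCC, the callee could
// recurse back into the caller, so nocapture cannot be relied upon.
bool mayRecurse(const std::vector<int> &SCCId, int Caller, int Callee) {
  assert(Caller >= 0 && Callee >= 0 && "function ids are non-negative");
  return SCCId[Caller] == SCCId[Callee];
}
```

Computing the ids once up front keeps the per-callsite query to a single comparison, which matches the "eagerly discover, check later" shape described in the message.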

This means that we can't be so optimistic in
getModRefInfo(ImmutableCallsite) - previously we assumed all call
arguments never aliased with an escaping global. Now we need to check,
because a global could now be passed as an argument but still not
escape.

This also solves a related conformance problem: MemCpyOptimizer can
turn non-escaping stores of globals into calls to intrinsics like
llvm.memcpy/llvm.memset. This confuses GlobalsAA, which knows the
global can't escape and so returns NoModRef when queried, when
obviously a memcpy/memset call does indeed reference and modify its
arguments.

This fixes PR24800, PR24801, and PR24802.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248576 91177308-0d34-0410-b5e6-96231b3b80d8

ARM: make -Asserts,-Werror=unused-variable build happy

The value was only used in an assertion. Sink the variable usage into the
assertion.
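
As a rough illustration of the pattern (not the actual ARM code), the sketch below shows a value that is only needed by an assertion being computed inside the assertion itself. The helpers `countOperands` and `addPair` are hypothetical. Note that this only works when the sunk expression has no side effects, since it is not evaluated at all when NDEBUG is defined.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper standing in for the value that was only asserted on.
unsigned countOperands(const std::vector<int> &Ops) {
  return static_cast<unsigned>(Ops.size());
}

int addPair(const std::vector<int> &Ops) {
  // Before (warns under -Asserts with -Werror=unused-variable):
  //   unsigned NumOps = countOperands(Ops);
  //   assert(NumOps == 2 && "expected exactly two operands");
  // After: the call lives inside the assertion, so no variable is left over.
  assert(countOperands(Ops) == 2 && "expected exactly two operands");
  return Ops[0] + Ops[1];
}
```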

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248562 91177308-0d34-0410-b5e6-96231b3b80d8

ARM: address WoA division limitation

We now emit the compiler generated divide by zero check that was needed for the
MSVC routines.  We construct a pseudo-instruction for the DBZ check as the
operation requires splitting up the BB.  For the 64-bit operations, we need to
custom expand the node as we need to insert the DBZ check and then emit the
libcall to the appropriate name.  Because this is target specific, it seemed
better to reproduce the expansion operation from the target-agnostic type
legalization rather than sink this there to avoid the duplication.  The division
library calls now match MSVC semantically.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248561 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Remove unused includes

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248553 91177308-0d34-0410-b5e6-96231b3b80d8

[LangRef] Unbreak the docs Sphinx build.

r248551 introduced some breakage due to incorrectly terminated
``literals`` s.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248552 91177308-0d34-0410-b5e6-96231b3b80d8

[Bitcode][Asm] Teach LLVM to read and write operand bundles.

Summary:
This also adds the first set of tests for operand bundles.

The optimizer has not been audited to ensure that it does the right
thing with operand bundles.

Depends on D12456.

Reviewers: reames, chandlerc, majnemer, dexonsmith, kmod, JosephTremoulet, rnk, bogner

Subscribers: maksfb, llvm-commits

Differential Revision: http://reviews.llvm.org/D12457

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248551 91177308-0d34-0410-b5e6-96231b3b80d8

Restore test coverage for other than ELFOSABI_NONE

Add a FreeBSD test to restore testing of ELF OSABI other than
ELFOSABI_NONE after r248534.

Differential Revision: http://reviews.llvm.org/D13146

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248550 91177308-0d34-0410-b5e6-96231b3b80d8

Fix typo

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248549 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Improve the readability of the ld/st optimization pass. NFC.

In this context, MI is an add/sub instruction, not a load/store.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248540 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE2] Fix zero/any extension shuffles that don't start from the first element

Fix for D12561 - we weren't correctly ensuring that the base element for extension was moved to start on a boundary suitable for UNPCKL/H

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248536 91177308-0d34-0410-b5e6-96231b3b80d8

Use ELFOSABI_NONE instead of ELFOSABI_LINUX.

There doesn't seem to be a difference and ELFOSABI_NONE seems to be far more
common:

* Linux doesn't care when loading and puts ELFOSABI_NONE on core dumps.
* Gold and bfd ld produce files with ELFOSABI_NONE.
* Gold and bfd ld seem to ignore EI_OSABI other than for freebsd.
* Gas puts ELFOSABI_NONE in most .o files.
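
For reference, the field under discussion is a single byte in the ELF identification array. The sketch below reads it from a 16-byte `e_ident`; the numeric values come from the ELF specification, while the names `ElfIdent`, `getOSABI` and the `OSABI_*` constants are made up for this example and are not LLVM's.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Values from the ELF specification; the constant names are illustrative.
constexpr std::size_t EI_OSABI_Index = 7;  // EI_OSABI
constexpr std::uint8_t OSABI_None = 0;     // ELFOSABI_NONE
constexpr std::uint8_t OSABI_Linux = 3;    // ELFOSABI_LINUX
constexpr std::uint8_t OSABI_FreeBSD = 9;  // ELFOSABI_FREEBSD

using ElfIdent = std::array<std::uint8_t, 16>; // e_ident, EI_NIDENT bytes

// Returns the OS/ABI byte after checking the ELF magic number.
std::uint8_t getOSABI(const ElfIdent &Ident) {
  assert(Ident[0] == 0x7f && Ident[1] == 'E' && Ident[2] == 'L' &&
         Ident[3] == 'F' && "not an ELF identification array");
  return Ident[EI_OSABI_Index];
}
```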

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248534 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Add s_dcache_* instructions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248533 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Add cache invalidation instructions.

These are necessary for implementing mem_fence for
OpenCL 2.0.

The VI assembler tests are disabled since it seems to be
using the wrong encoding or opcode.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248532 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Run mubuf assembler test for CI

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248531 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] The paired post-increment store instruction has an output register.

The pre- and post-increment versions both update the base register, but the
post-increment version was defined incorrectly. There is no test case as we don't currently
generate these instructions, but I plan on changing that in the near future.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248528 91177308-0d34-0410-b5e6-96231b3b80d8

[IR] Add operand bundles to CallInst and InvokeInst.

Summary:
This change teaches `CallInst`s and `InvokeInst`s to maintain a set of
operand bundles as part of their operands. `CallInst`s and `InvokeInst`s
with operand bundles co-allocate some space before their `Use` array to
hold meta information about which of their operands are part of an operand
bundle.

The strings corresponding to the bundle tags are interned into
`LLVMContextImpl::BundleTagCache`

This change does not include any parsing / bitcode support. That's the
next change.

Depends on D12455.

Reviewers: reames, chandlerc, majnemer, dexonsmith, kmod, JosephTremoulet, rnk, bogner

Subscribers: MatzeB, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D12456

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248527 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Handle +t2dsp feature as an ArchExtKind in ARMTargetParser.def

Currently, the availability of DSP instructions (ACLE 6.4.7) is handled in a
hand-rolled tricky condition block in tools/clang/lib/Basic/Targets.cpp, with
a FIXME: attached.

This patch changes the handling of +t2dsp to be in line with other
architecture extensions.

Following a revert of r248152 and new review comments, this patch also includes
renaming FeatureDSPThumb2 -> FeatureDSP, hasThumb2DSP() -> hasDSP(), etc.
The spelling of "t2dsp" is preserved, pending a further investigation of its
possible external usage.

Differential Revision: http://reviews.llvm.org/D12937

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248519 91177308-0d34-0410-b5e6-96231b3b80d8

dsymutil: Fix the condition to distinguish module imports from definitions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248512 91177308-0d34-0410-b5e6-96231b3b80d8

[ValueTracking] Teach isKnownNonZero a new trick

If the shifter operand is a constant, and all of the bits shifted out
are known to be zero, then if X is known non-zero at least one
non-zero bit must remain.
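
The reasoning can be checked on a small model. The sketch below handles only a 32-bit left shift and represents "known zero" information as a bit mask; `shlKnownNonZero` and its parameters are hypothetical and are not the ValueTracking API.

```cpp
#include <cassert>
#include <cstdint>

// KnownZero has a 1 in every bit position proven to be 0 in X.
// Returns true if (X << ShAmt) is provably non-zero in this toy model.
bool shlKnownNonZero(std::uint32_t KnownZero, bool XKnownNonZero,
                     unsigned ShAmt) {
  assert(ShAmt < 32 && "shift amount must be smaller than the bit width");
  if (!XKnownNonZero)
    return false;
  if (ShAmt == 0)
    return true; // nothing is shifted out
  // A left shift by ShAmt discards the top ShAmt bits.
  const std::uint32_t ShiftedOut = ~std::uint32_t{0} << (32 - ShAmt);
  // If every discarded bit is known zero, the set bit(s) of X survive.
  return (KnownZero & ShiftedOut) == ShiftedOut;
}
```

A right shift works the same way with the low bits taking the place of the high bits.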

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248508 91177308-0d34-0410-b5e6-96231b3b80d8

[objdump] Make iterator operator* return a reference.

This is closer to the expected behavior of an iterator and avoids awkward
warnings from clang's -Wrange-loop-analysis below.
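
A minimal sketch of the idea, using made-up types rather than the objdump ones: when `operator*` returns a reference, a range-based for loop that declares `const T &` binds directly to the element instead of to a temporary copy. `NameIterator`, `NameRange` and `totalLength` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical iterator over a vector of section names.
class NameIterator {
  const std::vector<std::string> *Names;
  std::size_t Pos;

public:
  NameIterator(const std::vector<std::string> &N, std::size_t P)
      : Names(&N), Pos(P) {}
  // Returns a reference to the element rather than a std::string by value.
  const std::string &operator*() const {
    assert(Pos < Names->size() && "dereferencing past the end");
    return (*Names)[Pos];
  }
  NameIterator &operator++() {
    ++Pos;
    return *this;
  }
  bool operator!=(const NameIterator &O) const { return Pos != O.Pos; }
};

struct NameRange {
  const std::vector<std::string> &Names;
  NameIterator begin() const { return NameIterator(Names, 0); }
  NameIterator end() const { return NameIterator(Names, Names.size()); }
};

// The loop variable binds to each element through the returned reference.
std::size_t totalLength(const NameRange &Range) {
  std::size_t Total = 0;
  for (const std::string &S : Range)
    Total += S.size();
  return Total;
}
```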

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248497 91177308-0d34-0410-b5e6-96231b3b80d8

Regression Test: Deletes redundant/invalid test.

Removes absdiff_expand.ll regression test file which is invalid.

Differential Revision: http://reviews.llvm.org/D11678

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248493 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Use PredicateControl for the MSA ASE instructions. NFC.

Reviewers: vkalintiris

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13092

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248486 91177308-0d34-0410-b5e6-96231b3b80d8

Codegen: Fix llvm.*absdiff semantic.

Fixes the overflow case of the llvm.*absdiff intrinsic; also updates the tests and LangRef.rst accordingly.

Differential Revision: http://reviews.llvm.org/D11678

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248483 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Recognize another bswap idiom.

Summary:
The byte-swap recognizer can now notice that this

```
uint32_t bswap(uint32_t x)
{
  x = (x & 0x0000FFFF) << 16 | (x & 0xFFFF0000) >> 16;
  x = (x & 0x00FF00FF) << 8 | (x & 0xFF00FF00) >> 8;
  return x;
}
```

is a bswap. Fixes PR23863.
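
To see why the two-step form is a byte swap, it can be compared against a byte-at-a-time version. The following is a small self-contained check, with `bswapIdiom` and `bswapReference` as illustrative names only:

```cpp
#include <cassert>
#include <cstdint>

// The idiom from the message: swap the 16-bit halves, then swap the bytes
// within each half.
std::uint32_t bswapIdiom(std::uint32_t X) {
  X = ((X & 0x0000FFFFu) << 16) | ((X & 0xFFFF0000u) >> 16);
  X = ((X & 0x00FF00FFu) << 8) | ((X & 0xFF00FF00u) >> 8);
  return X;
}

// Reference byte swap that moves each byte to its mirrored position.
std::uint32_t bswapReference(std::uint32_t X) {
  return ((X & 0x000000FFu) << 24) | ((X & 0x0000FF00u) << 8) |
         ((X & 0x00FF0000u) >> 8) | ((X & 0xFF000000u) >> 24);
}
```

For example, 0x11223344 becomes 0x33441122 after the first step and 0x44332211 after the second, which is what the reference version produces directly.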

Reviewers: nlewycky, hfinkel, hans, jmolloy, rengolin

Subscribers: majnemer, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D12637

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248482 91177308-0d34-0410-b5e6-96231b3b80d8

Introduce target hook for optimizing register copies

Allow a target to do something other than search for copies
that will avoid cross register bank copies.

Implement for SI by only rewriting the most basic copies,
so it should look through anything like a subregister extract.

I'm not entirely satisfied with this because it seems like
eliminating a reg_sequence that isn't fully used should work
generically for all targets without them having to override
something. However, it seems to be tricky to have a simple
implementation of this without rewriting to invalid kinds
of subregister copies on some targets.

I'm not sure if there is currently a generic way to easily check
if a subregister index would be valid for the current use.
The current set of TargetRegisterInfo::get*Class functions don't
quite behave like I would expect (e.g. getSubClassWithSubReg
returns the maximal register class rather than the minimal), so
I'm not sure how to make the generic test keep searching if
SrcRC:SrcSubReg is a valid replacement for DefRC:DefSubReg. Making
the default implementation check for simple copies breaks
a variety of ARM and x86 tests by producing illegal subregister uses.

The ARM tests are not actually changed since it should still be using
the same sharesSameRegisterFile implementation, this just relaxes
them to not check for specific registers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248478 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Return after instruction is processed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248476 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Remove another unnecessary check from commuteInstruction

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248475 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Add readonly to InstrMapping functions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248474 91177308-0d34-0410-b5e6-96231b3b80d8

TableGen: Add LLVM_READONLY to generated InstrMapping functions

These just read from a generated table.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248473 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix printing trailing whitespace for mubuf atomics

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248472 91177308-0d34-0410-b5e6-96231b3b80d8

Remove dead declaration

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248471 91177308-0d34-0410-b5e6-96231b3b80d8

Use new TokenFactor chain when merging stores

If the stores are storing values from loads which partially
alias the stores, we could end up placing the merged loads
and stores on the same chain which has the potential to break.
Each store may have a different chain dependency on only some
of the original loads. Create a new TokenFactor to capture all
of the required dependencies of the stores rather than assuming
all stores can use the same chain.

The testcase is a situation where this happens, although
it does not have an observable change from this. The DAG nodes
just happened to not be reordered before despite this missing
chain dependency.

This is based on an off-list report for an out of tree target
which regressed due to r246307 and I haven't managed to find a case
where the nodes do end up reordered with an in tree target.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248468 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Reduce number of copies emitted

Instead of always inserting a copy in case
the super register is itself a subregister,
only extract to the super reg class if this is
actually the case.

This shouldn't really change codegen, but
makes the output of SIFixSGPRCopies
easier to read.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248467 91177308-0d34-0410-b5e6-96231b3b80d8

Fix a think-o in which functions these should surround

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248465 91177308-0d34-0410-b5e6-96231b3b80d8

Add some NDEBUG checks I accidentally dropped in r248462

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248464 91177308-0d34-0410-b5e6-96231b3b80d8

BasicAA: Move BasicAAResult::alias out-of-line. NFC

This makes the header more readable and cleans up some unnecessary
header differences between NDEBUG and !NDEBUG.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248462 91177308-0d34-0410-b5e6-96231b3b80d8

Add CFG Simplification pass after Loop Unswitching.

Loop unswitching produces conditional branches with constant condition,
and it's beneficial for later passes to clean this up with simplify-cfg.
We do this after the second invocation of loop-unswitch, but not after
the first one. Not doing so might cause problems for passes like
LoopUnroll, whose estimate of loop body size would be less accurate.

Reviewers: hfinkel

Differential Revision: http://reviews.llvm.org/D13064

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248460 91177308-0d34-0410-b5e6-96231b3b80d8

[safestack] Fix compiler crash in the presence of stack restores.

A use can be emitted before def in a function with stack restore
points but no static allocas.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248455 91177308-0d34-0410-b5e6-96231b3b80d8

[IR] Teach `llvm::User` to co-allocate a descriptor.

Summary:
With this change, subclasses of `llvm::User` will be able to co-allocate
a variable number of bytes (called a "descriptor") with the `llvm::User`
instance. The co-allocated descriptor can later be accessed using
`llvm::User::getDescriptor`. This will be used in later changes to
implement operand bundles.

This change steals one bit from `NumUserOperands`, but given that it is
still 28 bits wide I don't think this will be a practical issue.

This change does not allow allocating hung off uses with descriptors.
This is only for simplicity, not for any fundamental reason; and we can
easily add this functionality later if needed.

Reviewers: reames, chandlerc, dexonsmith, kmod, majnemer, pete, JosephTremoulet

Subscribers: pete, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D12455

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248453 91177308-0d34-0410-b5e6-96231b3b80d8

Add REQUIRES: default_triple to these testcases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248452 91177308-0d34-0410-b5e6-96231b3b80d8

Remove iterator_range::end.

Because the current proposal does not include that member function,
and we are trying to keep in line with that.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248451 91177308-0d34-0410-b5e6-96231b3b80d8

Add iterator_range::end() predicate.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248447 91177308-0d34-0410-b5e6-96231b3b80d8

[Unroll] When completely unrolling the loop, replace conditional branches with unconditional.

Nothing is expected to change, except we do less redundant work in
clean-up.

Reviewers: hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12951

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248444 91177308-0d34-0410-b5e6-96231b3b80d8

Put profile variables of COMDAT functions into their own COMDAT group.

In -fprofile-instr-generate compilation, to remove the redundant profile
variables for the COMDAT functions, these variables are placed in the same
COMDAT group as their associated function. This way, when the COMDAT function
is not picked by the linker, those profile variables will also not be
output in the final binary. This may cause warnings when linking a mix of
objects built with and without -fprofile-instr-generate.

This patch puts the profile variables for COMDAT functions into their own COMDAT
group to avoid the problem.

Patch by xur.
Differential Revision: http://reviews.llvm.org/D12248

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248440 91177308-0d34-0410-b5e6-96231b3b80d8

set div/rem default values to 'expensive' in TargetTransformInfo's cost model

...because that's what the cost model was intended to do.

As discussed in D12882, this fix has a temporary unintended consequence for
SimplifyCFG: it causes us to not speculate an fdiv. However, two wrongs make
PR24818 right, and two wrongs make PR24343 act right even though it's really
still wrong.

I intend to correct SimplifyCFG and add to CodeGenPrepare to account for this
cost model change and preserve the righteousness for the bug report cases.

https://llvm.org/bugs/show_bug.cgi?id=24818
https://llvm.org/bugs/show_bug.cgi?id=24343

Differential Revision: http://reviews.llvm.org/D12882

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248439 91177308-0d34-0410-b5e6-96231b3b80d8

ARM: fix folding stack adjustment (again again again...)

This time, the issue is that we weren't accounting for the possibility that
aligned DPRs could have been stored after the final "push" in a prologue. When
that happened we effectively moved a "sub sp, #N" from below the aligned stores
to above them, and everything went to pot.

To make it worse, I'd actually committed something testing that we produced
wrong code, so the test update is tiny.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248437 91177308-0d34-0410-b5e6-96231b3b80d8

dsymutil: Don't prune forward declarations inside a module definition.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248428 91177308-0d34-0410-b5e6-96231b3b80d8

Fix this dsymutil testcase by not passing in a path to the modulemap file,
so the lookup works as expected after prepending the oso-prepend-path.

This manifested only on Windows, because "/" is not a relative path there.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248423 91177308-0d34-0410-b5e6-96231b3b80d8

Remove handling of AddrSpaceCast in stripAndAccumulateInBoundsConstantOffsets

Patch by: simoncook

Unlike BitCasts, AddrSpaceCasts do not always produce an output the same size as their input, which was previously assumed. This fixes cases where two address spaces do not have the same size pointer, as an assertion failure would occur when trying to prove dereferenceability. LoopUnswitch is used in the particular test, but LICM also exhibits the same problem.

Differential Revision: http://reviews.llvm.org/D13008

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248422 91177308-0d34-0410-b5e6-96231b3b80d8

Swap loop invariant GEP with loop variant GEP to allow more LICM.

    This patch changes the order of GEPs generated by the Splitting GEPs
    pass. Specifically, when one of the GEPs has a constant offset and the
    base is loop invariant, we generate the GEP with the constant first
    when beneficial, to expose more cases for LICM.

    Originally, Splitting GEPs generated the following:
      do.body.i:
        %idxprom.i = sext i32 %shr.i to i64
        %2 = bitcast %typeD* %s to i8*
        %3 = shl i64 %idxprom.i, 2
        %uglygep = getelementptr i8, i8* %2, i64 %3
        %uglygep7 = getelementptr i8, i8* %uglygep, i64 1032
      ...
    Now it generates:
      do.body.i:
        %idxprom.i = sext i32 %shr.i to i64
        %2 = bitcast %typeD* %s to i8*
        %3 = shl i64 %idxprom.i, 2
        %uglygep = getelementptr i8, i8* %2, i64 1032
        %uglygep7 = getelementptr i8, i8* %uglygep, i64 %3
      ...

    For no-loop cases, the original way of generating GEPs seems to
    expose more CSE cases, so we don't change the logic for no-loop
    cases, and only limit our change to the specific case we are
    interested in.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248420 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Preserve metadata when merging loads that are phi
arguments.

Make sure InstCombiner::FoldPHIArgLoadIntoPHI doesn't drop the following
metadata:

MD_tbaa
MD_alias_scope
MD_noalias
MD_invariant_load
MD_nonnull
MD_range

rdar://problem/17617709

Differential Revision: http://reviews.llvm.org/D12710

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248419 91177308-0d34-0410-b5e6-96231b3b80d8

[docs] Update DominatorTree docs to clarify expectations around unreachable blocks

Note: I am not trying to describe what "should be"; I'm only describing what is true today.

This came out of my recent question to llvm-dev titled: When can the dominator tree not contain a node for a basic block?

Differential Revision: http://reviews.llvm.org/D13078

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248417 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] replace integer 'xor' ops with packed SSE FP 'xor' ops when operating on FP scalars

Turn this:

movd %xmm0, %eax
movd %xmm1, %ecx
xorl %eax, %ecx
movd %ecx, %xmm0

into this:

xorps %xmm1, %xmm0
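
Source along the following lines is one plausible origin of integer logic applied to scalar FP values. This is only a sketch: it assumes a 32-bit IEEE-754 `float`, the name `xorBits` is made up, and it makes no claim about which instruction sequence a particular compiler version emits for it.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// XORs the bit patterns of two floats by round-tripping through integers.
float xorBits(float A, float B) {
  static_assert(sizeof(float) == sizeof(std::uint32_t),
                "sketch assumes a 32-bit float");
  std::uint32_t IA, IB;
  std::memcpy(&IA, &A, sizeof(IA)); // bit-preserving float -> int
  std::memcpy(&IB, &B, sizeof(IB));
  const std::uint32_t IR = IA ^ IB;
  float R;
  std::memcpy(&R, &IR, sizeof(R));  // bit-preserving int -> float
  return R;
}
```

XOR with the bit pattern of -0.0f flips only the sign bit, so `xorBits(X, -0.0f)` negates X under the IEEE-754 assumption.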

This is related to, but does not solve:
https://llvm.org/bugs/show_bug.cgi?id=22428

This is an extension of:
http://reviews.llvm.org/rL248395

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248415 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] replace integer 'or' ops with packed SSE FP 'or' ops when operating on FP scalars

Turn this:

movd %xmm0, %eax
movd %xmm1, %ecx
orl %eax, %ecx
movd %ecx, %xmm0

into this:

orps %xmm1, %xmm0

This is related to, but does not solve:
https://llvm.org/bugs/show_bug.cgi?id=22428

This is an extension of:
http://reviews.llvm.org/rL248395

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248409 91177308-0d34-0410-b5e6-96231b3b80d8

Fix the order of operations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248406 91177308-0d34-0410-b5e6-96231b3b80d8

Android support for SafeStack.

Add two new ways of accessing the unsafe stack pointer:

* At a fixed offset from the thread TLS base. This is very similar to
  StackProtector cookies, but we plan to extend it to other backends
  (ARM in particular) soon. Bionic-side implementation here:
  https://android-review.googlesource.com/170988.
* Via a function call, as a fallback for platforms that provide
  neither a fixed TLS slot, nor a reasonable TLS implementation (i.e.
  not emutls).

This is a re-commit of a change in r248357 that was reverted in
r248358.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248405 91177308-0d34-0410-b5e6-96231b3b80d8

move call to convertIntLogicToFPLogic up; NFCI

The BEXTR comments didn't make sense before, we may want to extend the
FP logic transform to work on vectors, and this way is more beautiful.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248404 91177308-0d34-0410-b5e6-96231b3b80d8

Temporarily make testcase more verbose to debug a msvc buildbot failure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248403 91177308-0d34-0410-b5e6-96231b3b80d8

[Bug 24848] Use range metadata to constant fold comparisons with constant values

Summary:
This is the first part of fixing bug 24848 https://llvm.org/bugs/show_bug.cgi?id=24848.

When range metadata is provided, it should be used to constant fold comparisons with constant values.
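
A toy version of the folding rule, assuming a single half-open, non-wrapping unsigned range like the `[lo, hi)` pairs used by range metadata. The types `Range` and `Fold` and the function `foldULT` are hypothetical and only cover an unsigned less-than comparison.

```cpp
#include <cassert>
#include <cstdint>

// Half-open unsigned range [Lo, Hi); this sketch ignores wrapping ranges.
struct Range {
  std::uint64_t Lo, Hi;
};

enum class Fold { False, True, Unknown };

// Tries to fold `X < C` (unsigned) for every X in R.
Fold foldULT(Range R, std::uint64_t C) {
  assert(R.Lo < R.Hi && "sketch handles only non-empty, non-wrapping ranges");
  if (R.Hi <= C) // the largest value, Hi - 1, is still below C
    return Fold::True;
  if (R.Lo >= C) // the smallest value, Lo, is already at least C
    return Fold::False;
  return Fold::Unknown; // C falls inside the range; cannot decide
}
```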

Reviewers: sanjoy, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12988

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248402 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] move code for converting int logic to FP logic to a helper function; NFCI

This is a follow-on to:
http://reviews.llvm.org/rL248395

so we can add the call to the or/xor combines too.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248399 91177308-0d34-0410-b5e6-96231b3b80d8

dsymutil: Resolve forward decls for types defined in clang modules.

This patch extends llvm-dsymutil's ODR type uniquing machinery to also
resolve forward decls for types defined in clang modules.

http://reviews.llvm.org/D13038

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248398 91177308-0d34-0410-b5e6-96231b3b80d8

dsymutil: print a warning when there is a module hash mismatch.

This also updates the module binaries in the test directory because
their module hash mismatched.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248396 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] replace integer 'and' ops with packed SSE FP 'and' ops when operating on FP scalars

Turn this:
   movd %xmm0, %eax
   movd %xmm1, %ecx
   andl %eax, %ecx
   movd %ecx, %xmm0

into this:
   andps %xmm1, %xmm0

This is related to, but does not solve:
https://llvm.org/bugs/show_bug.cgi?id=22428

Differential Revision: http://reviews.llvm.org/D13065

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248395 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Fix hasAddr64 being used before being initialized.

This reverts r248388 and fixes the underlying bug: hasAddr64 was initialized
in runOnMachineFunction, but runOnMachineFunction isn't ever called in
CodeGen/WebAssembly/global.ll since that testcase has no functions. The fix
here is to use AsmPrinter's getPointerSize() as needed to determine the
pointer size instead.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248394 91177308-0d34-0410-b5e6-96231b3b80d8

[Inline] Use AssumptionCache from the right Function

This changes the behavior of AddAlignmentAssumptions to match its
comment. I.e., prove the asserted alignment in the context of the caller,
not the callee.

Thanks to Mehdi Amini for seeing the issue here! Also to Artur Pilipenko
who also saw a fix for the issue.

rdar://22521387

Differential Revision: http://reviews.llvm.org/D12997

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248390 91177308-0d34-0410-b5e6-96231b3b80d8

Fix CodeGen/WebAssembly/global.ll test under ASAN.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248388 91177308-0d34-0410-b5e6-96231b3b80d8

[DeadArgElim] Split the invoke successor edge

Invoking a function which returns an aggregate can sometimes be
transformed to return a scalar value.  However, this means that we need
to create an insertvalue instruction(s) to recreate the correct
aggregate type.  We achieved this by inserting an insertvalue
instruction at the invoke's normal successor.  However, this is not
feasible if the normal successor uses the invoke's return value inside a
PHI node.

Instead, split the edge between the invoke and the normal successor and
create the insertvalue instruction in the new basic block.  The new
basic block's successor will be the old invoke successor, which leaves
us with well-behaved IR.

This fixes PR24906.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248387 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Refactor pre- and post-index merge functions into a single function. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248377 91177308-0d34-0410-b5e6-96231b3b80d8

[DeadStoreElimination] Remove dead zero store to calloc initialized memory

This change allows dead store elimination to remove zero and null stores into memory freshly allocated with a calloc-like function.

Differential Revision: http://reviews.llvm.org/D13021

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248374 91177308-0d34-0410-b5e6-96231b3b80d8

[dsymutil] Plug a memory leak.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248372 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Add option to force fast-isel

The ARM backend has some logic that only allows the fast-isel to be enabled for
subtargets where it is known to be stable. This adds a backend option to
override this and force the fast-isel to be used for any target, to allow it to
be tested.

This is an ARM-specific option, because no other backend disables the fast-isel
on a per-subtarget basis.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248369 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Replace 128-bit SSE41 PMOVSX intrinsics with native IR

This patch removes the x86.sse41.pmovsx* intrinsics, provides a suitable upgrade path and updates relevant tests to sign extend a subvector instead.

LLVM counterpart to D12835

Differential Revision: http://reviews.llvm.org/D13002

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248368 91177308-0d34-0410-b5e6-96231b3b80d8

[SCEV] Introduce ScalarEvolution::getOne and getZero.

Summary:
It is fairly common to call SE->getConstant(Ty, 0) or
SE->getConstant(Ty, 1); this change makes such uses a little bit
briefer.

I've refactored the call sites I could find easily to use getZero /
getOne.

Reviewers: hfinkel, majnemer, reames

Subscribers: sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D12947

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248362 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "Android support for SafeStack."

test/Transforms/SafeStack/abi.ll breaks when target is not supported;
needs refactoring.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248358 91177308-0d34-0410-b5e6-96231b3b80d8

Android support for SafeStack.

Add two new ways of accessing the unsafe stack pointer:

* At a fixed offset from the thread TLS base. This is very similar to
  StackProtector cookies, but we plan to extend it to other backends
  (ARM in particular) soon. Bionic-side implementation here:
  https://android-review.googlesource.com/170988.
* Via a function call, as a fallback for platforms that provide
  neither a fixed TLS slot, nor a reasonable TLS implementation (i.e.
  not emutls).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248357 91177308-0d34-0410-b5e6-96231b3b80d8

Add a test case for the fix of profile update issue when lowering switch statement.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248356 91177308-0d34-0410-b5e6-96231b3b80d8

Fixed an issue on updating profile data when lowering switch statement.

When there is an edge from the jump table to the default statement, we should check for it directly instead of checking whether the sibling of the jump table header is a successor of the jump table header; that sibling may not be the default statement but a successor of it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248354 91177308-0d34-0410-b5e6-96231b3b80d8

dsymutil: Fix a comment. [-Wdocumentation]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248353 91177308-0d34-0410-b5e6-96231b3b80d8

Add a unit test for r248341.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248348 91177308-0d34-0410-b5e6-96231b3b80d8

IR: Add a setDWOId() method to DICompileUnit.

Tested via clang.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248342 91177308-0d34-0410-b5e6-96231b3b80d8

IR: Fix the return value of DICompileUnit::getDWOId.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248341 91177308-0d34-0410-b5e6-96231b3b80d8

Debug Info: Emit the dwo_name only in skeleton CUs, not in DWOs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248340 91177308-0d34-0410-b5e6-96231b3b80d8

LiveIntervalAnalysis: Avoid multiple connected liveness components

We may have subregister defs which are unused but not discovered and
cleaned up prior to liveness analysis. This creates multiple connected
components in the resulting live range, which are forbidden in the
MachineVerifier because they would unnecessarily constrain the register
allocator. Rewrite those dead definitions to define a newly created
virtual register.

Differential Revision: http://reviews.llvm.org/D13035

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248335 91177308-0d34-0410-b5e6-96231b3b80d8

LiveInterval: Distribute subregister liveranges to new intervals in ConnectedVNInfoEqClasses::Distribute()

This improves ConnectedVNInfoEqClasses::Distribute() to distribute the
segments and value numbers in the subranges instead of conservatively
clearing all subregister info.

No separate test here; however, merely clearing the subranges instead of
properly distributing them would break my upcoming fix regarding dead super
register definitions.

Differential Revision: http://reviews.llvm.org/D13075

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248334 91177308-0d34-0410-b5e6-96231b3b80d8

[Unroll] Do not crash trying to propagate a value to vector load.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248333 91177308-0d34-0410-b5e6-96231b3b80d8

dsymutil: Follow references to clang modules and recursively clone the
debug info.

This does not yet resolve external type references.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248331 91177308-0d34-0410-b5e6-96231b3b80d8

[Unroll] Follow-up for r247769: fix a bug in UnrolledInstAnalyzer::visitLoad.

Apart from checking that GlobalVariable is a constant, we should check
that it's not a weak constant, in which case we can't propagate its
value.
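
The check described above can be sketched with a toy model. This is not the
LLVM API: GlobalModel, its fields, and tryFoldLoad are hypothetical names used
only to illustrate why a weak constant must not be folded (the linker may
replace its definition with a different one).

```cpp
// Hypothetical, simplified model of a global variable; not the LLVM API.
struct GlobalModel {
  bool IsConstant;  // contents never change at run time
  bool IsWeak;      // the linker may substitute another definition
  int Initializer;  // initializer visible in this module
};

// Writes the foldable value to Out and returns true only when propagating
// the initializer is safe; otherwise leaves Out untouched and returns false.
bool tryFoldLoad(const GlobalModel &G, int &Out) {
  if (!G.IsConstant)
    return false; // a mutable global may hold any value when loaded
  if (G.IsWeak)
    return false; // the final definition may differ from this initializer
  Out = G.Initializer;
  return true;
}
```

Under this model, a constant, non-weak global folds to its initializer, while
a weak constant is left alone even though it is marked constant.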

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248327 91177308-0d34-0410-b5e6-96231b3b80d8

Instead of defining the operator delete() function, it is better to delete the function so that any uses (even from within Node or its subclasses) do not accidentally call it. NFC intended.
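
A minimal sketch of the idea, assuming a node type that lives in
caller-provided storage. The Node class below and the buildAndRead helper are
hypothetical stand-ins written for illustration, not the class touched by this
commit.

```cpp
#include <cstddef>

// Hypothetical stand-in for a node that is constructed in storage owned by
// someone else (for example an arena), so it must never be freed with delete.
class Node {
public:
  explicit Node(int V) : Value(V) {}
  int value() const { return Value; }

  // Placement form: construct into memory supplied by the caller.
  void *operator new(std::size_t, void *Mem) noexcept { return Mem; }
  // Matching placement delete; only reached if the constructor throws.
  void operator delete(void *, void *) noexcept {}

  // Deleted rather than defined: any `delete Ptr;` on a Node, including one
  // written inside Node or a subclass, is rejected at compile time.
  void operator delete(void *) = delete;

private:
  int Value;
};

// Builds a Node in local storage, reads it, and destroys it explicitly.
int buildAndRead() {
  alignas(Node) unsigned char Storage[sizeof(Node)];
  Node *N = new (Storage) Node(42);
  int V = N->value();
  N->~Node(); // run the destructor; the storage itself is not freed here
  // delete N; // would not compile: operator delete(void *) is deleted
  return V;
}
```

With an empty but defined operator delete(), a stray `delete` would compile
and silently do nothing; deleting the function turns that mistake into a
compile error.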

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248320 91177308-0d34-0410-b5e6-96231b3b80d8

dsymutil: Make -oso-prepend-path available to DwarfLinker.
NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248312 91177308-0d34-0410-b5e6-96231b3b80d8

dsymutil: Make resolveDIEReference and getUnitForOffset static functions.
NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248311 91177308-0d34-0410-b5e6-96231b3b80d8

dsymutil: Make DwarfLinker::reportWarning() public. (NFC)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248310 91177308-0d34-0410-b5e6-96231b3b80d8

Remove macho-dump. Its functionality is now covered by llvm-readobj.

Approved by: Rafael Espindola, Eric Christopher, Jim Grosbach,
Alex Rosenberg

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248302 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Emit clrex in the expanded cmpxchg fail block.

ARM counterpart to r248291:

In the comparison failure block of a cmpxchg expansion, the initial
ldrex/ldxr will not be followed by a matching strex/stxr.
On ARM/AArch64, this unnecessarily ties up the execution monitor,
which might have a negative performance impact on some uarchs.

Instead, release the monitor in the failure block.
The clrex instruction was designed for this: use it.
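
The control flow can be illustrated with a toy, single-threaded model. The
Cell type and the loadExclusive, storeExclusive, and clearExclusive helpers
are hypothetical stand-ins for ldrex/ldxr, strex/stxr, and clrex; this is not
the code the backend emits, and updating Expected on failure is an assumption
borrowed from std::atomic-style interfaces.

```cpp
// Toy model of a memory cell guarded by an exclusive monitor.
struct Cell {
  int Value;
  bool MonitorHeld;
};

// Models ldrex/ldxr: read the value and claim the monitor.
int loadExclusive(Cell &C) {
  C.MonitorHeld = true;
  return C.Value;
}

// Models strex/stxr: the store happens only while the monitor is held, and
// the monitor is released either way.
bool storeExclusive(Cell &C, int NewValue) {
  bool Held = C.MonitorHeld;
  C.MonitorHeld = false;
  if (Held)
    C.Value = NewValue;
  return Held;
}

// Models clrex: release the monitor without storing.
void clearExclusive(Cell &C) { C.MonitorHeld = false; }

// Compare-and-exchange built from the primitives above.
bool compareExchange(Cell &C, int &Expected, int Desired) {
  for (;;) {
    int Loaded = loadExclusive(C);
    if (Loaded != Expected) {
      // Failure block: no store-exclusive will follow the load-exclusive,
      // so release the monitor explicitly instead of leaving it claimed.
      clearExclusive(C);
      Expected = Loaded;
      return false;
    }
    if (storeExclusive(C, Desired))
      return true;
    // The store-exclusive failed; retry from the load-exclusive.
  }
}
```

In this model, omitting the clearExclusive call would leave MonitorHeld set
after a failed comparison, which is the situation the commit avoids.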

Also see ARMARM v8-A B2.10.2:
"Exclusive access instructions and Shareable memory locations".

Differential Revision: http://reviews.llvm.org/D13033

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248294 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Emit clrex in the expanded cmpxchg fail block.

In the comparison failure block of a cmpxchg expansion, the initial
ldrex/ldxr will not be followed by a matching strex/stxr.
On ARM/AArch64, this unnecessarily ties up the execution monitor,
which might have a negative performance impact on some uarchs.

Instead, release the monitor in the failure block.
The clrex instruction was designed for this: use it.

Also see ARMARM v8-A B2.10.2:
"Exclusive access instructions and Shareable memory locations".

Differential Revision: http://reviews.llvm.org/D13033

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248291 91177308-0d34-0410-b5e6-96231b3b80d8

Fix a typo.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248283 91177308-0d34-0410-b5e6-96231b3b80d8

Make helper function static. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248278 91177308-0d34-0410-b5e6-96231b3b80d8

[mips][sched] Split IIBranch into specific instruction classes.

Summary:
Almost no functional change since the InstrItinData's have been duplicated.
The one functional change is to remove IIBranch from the MSA branches. The
classes will be assigned to the MSA instructions as part of implementing
the P5600 scheduler.

II_IndirectBranchPseudo and II_ReturnPseudo can probably be removed. I've
preserved the itinerary information for the corresponding pseudo
instructions to avoid making a functional change to these pseudos in
this patch.

Reviewers: vkalintiris

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12189

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248273 91177308-0d34-0410-b5e6-96231b3b80d8

[mips][sched] Temporarily rename IIAlu to IIM16Alu. NFC.

Summary:
The only instructions left in IIAlu are MIPS16 specific. We're not
implementing a MIPS16 scheduler at this time so rename the class to make it
obvious that they are MIPS16 instructions.

Reviewers: vkalintiris

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12188

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248267 91177308-0d34-0410-b5e6-96231b3b80d8

Don't raise inexact when lowering ceil, floor, round, trunc.

The C standard has historically not specified whether or not these functions should raise the inexact flag. Traditionally on Darwin, these functions *did* raise inexact, and the llvm lowerings followed that convention. n1778 (C bindings for IEEE-754 (2008)) clarifies that these functions should not set inexact. This patch brings the lowerings for arm64 and x86 in line with the newly specified behavior. This also lets us fold some logic into TD patterns, which is nice.
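
One way to observe the flag from user code is sketched below, using the
standard <cfenv> interface. FloorProbe and probeFloor are hypothetical names.
Whether Inexact comes back true depends on the target, the libm, and how
strictly the compiler honors floating-point environment access, so treat the
flag as an observation rather than a guarantee.

```cpp
#include <cfenv>
#include <cmath>

// Result of one probe: the rounded value and whether FE_INEXACT was set.
struct FloorProbe {
  double Result;
  bool Inexact;
};

// Clears the floating-point exception flags, evaluates std::floor, and
// reports whether the inexact flag was raised along the way.
FloorProbe probeFloor(double X) {
  volatile double In = X; // volatile discourages compile-time folding
  std::feclearexcept(FE_ALL_EXCEPT);
  volatile double Out = std::floor(In);
  bool Inexact = std::fetestexcept(FE_INEXACT) != 0;
  double Result = Out;
  return FloorProbe{Result, Inexact};
}
```

Calling probeFloor(2.5) yields a Result of 2.0 in either case; only the
Inexact field distinguishes the old lowering behavior from the new one.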

Differential Revision: http://reviews.llvm.org/D12969

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248266 91177308-0d34-0410-b5e6-96231b3b80d8