granicus.if.org Git

Revert "Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size."

Getting a weird buildbot failure that I need to investigate.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215870 91177308-0d34-0410-b5e6-96231b3b80d8

Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215868 91177308-0d34-0410-b5e6-96231b3b80d8

Use copy initialization to initialize std::unique_ptr.

Thanks to David Blaikie for the suggestion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215867 91177308-0d34-0410-b5e6-96231b3b80d8

ARM: mark missing functions from RTABI

Simply indicate the functions that are part of the runtime library that we do
not setup libcalls for. This is merely for ease of identification. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215863 91177308-0d34-0410-b5e6-96231b3b80d8

ARM: improve RTABI 4.2 conformance on Linux

The set of functions defined in the RTABI was separated for no real reason.
This brings us closer to proper utilisation of the functions defined by the
RTABI. It also sets the ground for correctly emitting function calls to AEABI
functions on all AEABI conforming platforms.

The previously existing lie on the behaviour of __ldivmod and __uldivmod is
propagated as it is beyond the scope of the change.

The changes to the test are due to the fact that we now use the divmod functions
which return both the quotient and remainder and thus we no longer need to
invoke two functions on Linux (making it closer to EABI's behaviour).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215862 91177308-0d34-0410-b5e6-96231b3b80d8

ARM: whitespace

Whitespace fix, NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215861 91177308-0d34-0410-b5e6-96231b3b80d8

Remove unused member variable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215860 91177308-0d34-0410-b5e6-96231b3b80d8

Return a std::uinque_ptr. Every caller was already using one.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215858 91177308-0d34-0410-b5e6-96231b3b80d8

Convert an ownership comment with std::uinque_ptr.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215855 91177308-0d34-0410-b5e6-96231b3b80d8

Pass a std::uinque_ptr to ParseAssembly to make the ownership explicit. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215852 91177308-0d34-0410-b5e6-96231b3b80d8

getLazyIRModule always takes ownership. Make that explicit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215851 91177308-0d34-0410-b5e6-96231b3b80d8

Return a std::unique_ptr to make the ownership explicit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215850 91177308-0d34-0410-b5e6-96231b3b80d8

Don't repeat the function name in comments. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215849 91177308-0d34-0410-b5e6-96231b3b80d8

Don't repeat names in comments. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215848 91177308-0d34-0410-b5e6-96231b3b80d8

Revert: r215698 - Current implementation of c.cond.fmt instructions only accept default cc0 register...

It causes a number of regressions when -fintegrated-as is enabled. This happens
because there are codegen-only instructions that incorrectly uses the first
operand as the encoding for the $fcc register. The regressions do not occur when
-via-file-asm is also given.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215847 91177308-0d34-0410-b5e6-96231b3b80d8

ARM: correct toggling behaviour

This was a thinko. The intent was to flip the explicit bits that need toggling
rather than all bits. This would result in incorrect behaviour (which now is
tested).

Thanks to Nico Weber for pointing this out!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215846 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-objdump: don't print relocations in non-relocatable files.

This matches the behavior of GNU objdump.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215844 91177308-0d34-0410-b5e6-96231b3b80d8

Remove a redundant "public:". NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215842 91177308-0d34-0410-b5e6-96231b3b80d8

BumpPtrAllocator: remove 'no slabs allocated yet' check

We already handle the no-slabs case when checking whether the current slab
is large enough: if no slabs have been allocated, CurPtr and End are both 0.
alignPtr(0), will still be 0, and so "if (Ptr + Size <= End)" fails.

Differential Revision: http://reviews.llvm.org/D4943

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215841 91177308-0d34-0410-b5e6-96231b3b80d8

Add a non-templated ELFObjectFileBase class.

Use it to implement some ELF only virtual interfaces instead of using error
prone series of dyn_casts.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215838 91177308-0d34-0410-b5e6-96231b3b80d8

Fix an off-by-one bug in the target independent llvm-objdump.

It would prevent the display of a single byte instruction before a label.

Patch by Steve King!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215837 91177308-0d34-0410-b5e6-96231b3b80d8

Reverted last commit

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215828 91177308-0d34-0410-b5e6-96231b3b80d8

Reverted last commit

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215827 91177308-0d34-0410-b5e6-96231b3b80d8

Added a table for intrinsics on X86.
It should remove dosens of lines in handling instrinsics (in a huge switch) and give an easy way to add new intrinsics.
I did not completed to move al intrnsics to the table, I'll do this in the upcomming commits.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215826 91177308-0d34-0410-b5e6-96231b3b80d8

Remove an InstCombine that transformed patterns like (x * uitofp i1 y) to (select y, x, 0.0) when the multiply has fast math flags set.
While this might seem like an obvious canonicalization, there is one subtle problem with it.  The result of the original expression
is undef when x is NaN (remember, fast math flags), but the result of the select is always defined when x is NaN.  This means that the
new expression is strictly more defined than the original one.  One unfortunate consequence of this is that the transform is not reversible!
It's always legal to make increase the defined-ness of an expression, but it's not legal to reduce it.  Thus, targets that prefer the original
form of the expression cannot reverse the transform to recover it.  Another way to think of it is that the transform has lost source-level
information (the fast math flags), which is undesirable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215825 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] Fix an indentation goof in a prior commit. Should have re-run
clang-format.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215824 91177308-0d34-0410-b5e6-96231b3b80d8

[shuffle] Teach the shufflevector fuzzer to support fixed element types.

I'm using this to try to find more minimal test cases by re-fuzzing
within a specific domain once errors are found.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215823 91177308-0d34-0410-b5e6-96231b3b80d8

llvm/test/CodeGen/X86/fmul-combines.ll: Appease Windows x64. <4 x float> is passed by stack.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215821 91177308-0d34-0410-b5e6-96231b3b80d8

Fix fmul combines with constant splat vectors

Fixes things like fmul x, 2 -> fadd x, x

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215820 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] Teach lots of the new vector shuffle lowering to use UNPCK
instructions for blend operations at 128 bits. This was a serious hole
in our prior blend lowering.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215819 91177308-0d34-0410-b5e6-96231b3b80d8

InstCombine: Fix a potential bug in 0 - (X sdiv C) -> (X sdiv -C)

While *most* (X sdiv 1) operations will get caught by InstSimplify, it
is still possible for a sdiv to appear in the worklist which hasn't been
simplified yet.

This means that it is possible for 0 - (X sdiv 1) to get transformed
into (X sdiv -1); dividing by -1 can make the transform produce undef
values instead of the proper result.

Sorry for the lack of testcase, it's a bit problematic because it relies
on the exact order of operations in the worklist.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215818 91177308-0d34-0410-b5e6-96231b3b80d8

InstCombine: Combine mul with div.

We can combne a mul with a div if one of the operands is a multiple of
the other:

%mul = mul nsw nuw %a, C1
%ret = udiv %mul, C2
  =>
%ret = mul nsw %a, (C1 / C2)

This can expose further optimization opportunities if we end up
multiplying or dividing by a power of 2.

Consider this small example:

define i32 @f(i32 %a) {
  %mul = mul nuw i32 %a, 14
  %div = udiv exact i32 %mul, 7
  ret i32 %div
}

which gets CodeGen'd to:

    imull       $14, %edi, %eax
    imulq       $613566757, %rax, %rcx
    shrq        $32, %rcx
    subl        %ecx, %eax
    shrl        %eax
    addl        %ecx, %eax
    shrl        $2, %eax
    retq

We can now transform this into:
define i32 @f(i32 %a) {
  %shl = shl nuw i32 %a, 1
  ret i32 %shl
}

which gets CodeGen'd to:

    leal        (%rdi,%rdi), %eax
    retq

This fixes PR20681.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215815 91177308-0d34-0410-b5e6-96231b3b80d8

arm asm: Let .fpu enable instructions, PR20447.

I'm not very happy with duplicating the fpu->feature mapping in ARMAsmParser.cpp
and in clang's driver. See the bug for a patch that doesn't do that, and the
review thread [1] for why this duplication exists.

1: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140811/231052.html

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215811 91177308-0d34-0410-b5e6-96231b3b80d8

[LIT] Move display of unsupported and xfail tests to summary.

Summary:
This patch changes the way xfail and unsupported tests are displayed.
This output is only displayed when the --show-unsupported/--show-xfail flags are passed to lit.

Currently xfail/unsupported tests are printed during the run of the test-suite. I think its better to display this information during the summary instead.
This patch removes the printing of these tests from when they are run to the summary.

Reviewers: ddunbar, EricWF

Reviewed By: EricWF

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D4842

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215809 91177308-0d34-0410-b5e6-96231b3b80d8

BitcodeReader: Only create one basic block for each blockaddress

Block address forward-references are implemented by creating a
`BasicBlock` ahead of time that gets inserted in the `Function` when
it's eventually encountered.

However, if the same blockaddress was used in two separate functions
that were parsed *before* the referenced function (and the blockaddress
was never used at global scope), two separate basic blocks would get
created, one of which would be forgotten creating invalid IR.

This commit changes the forward-reference logic to create only one basic
block (and always return the same blockaddress).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215805 91177308-0d34-0410-b5e6-96231b3b80d8

UseListOrder: Correctly count the number of uses

This is an off-by-one bug I found by inspection, which would only
trigger if the bitcode writer sees more uses of a `Value` than the
reader. Since this is only relevant when an instruction gets upgraded
somehow, there unfortunately isn't a reasonable way to add test
coverage.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215804 91177308-0d34-0410-b5e6-96231b3b80d8

IR: Don't add inbounds to GEPs of extern_weak variables

Global variables that have `extern_weak` linkage may be null, so it's
incorrect to add `inbounds` when constant folding.

This also fixes a bug when parsing global aliases, whose forward
reference placeholders are global variables with `extern_weak` linkage.
If GEPs to these aliases are encountered before the alias itself, the
GEPs would incorrectly gain the `inbounds` keyword as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215803 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] Improve the folding of target independet shuffles to Undef.

When combining a pair of shuffle nodes, check if the combined shuffle mask is
trivially Undef. In case, immediately fold that pair of shuffles to Undef.

The lack of checks for undef masks was the root-cause of a poor-codegen bug
in the dag combiner.

Example:
  %1 = shufflevector <4 x i32> %A, <4 x i32> %B, <4 x i32> <i32 4, i32 1, i32 1, i32 6>
  %2 = shufflevector <4 x i32> %1, <4 x i32> undef, <4 x i32> <i32 0, i32 4, i32 1, i32 6>
  %3 = shufflevector <4 x i32> %2, <4 x i32> undef, <4 x i32> <i32 1, i32 5, i32 3, i32 3>

Before this patch, on x86 (with -mcpu=corei7) we failed to fold the entire
sequence to Undef value and therefore we generated:
  shufps $-123, %xmm1, $xmm0
  pshufd $-46, %xmm0, %xmm0

With this patch, the entire shuffle sequence is folded to Undef and no
shuffles are generated in the output assembly.

Added new test cases to test 'combine-vec-shuffle-5.ll'.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215797 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Mark fixed-offset byvals as pointed-to by IR values

A byval object, even if allocated at a fixed offset (prescribed by the ABI) is
pointed to by IR values. Most fixed-offset stack objects are not pointed-to by
IR values, so the default is to assume this is not possible. However, we need
to override the default in this case (instruction scheduling can cause
miscompiles otherwise).

Fixes PR20280.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215795 91177308-0d34-0410-b5e6-96231b3b80d8

Make isAliased property for fixed-offset stack objects adjustable

We used to assume that any fixed-offset stack object was not aliased. This
meant that no IR value could point to the memory contained in such an object.
This is a reasonable default, but is not a universally-correct
target-independent fact. For example, on PowerPC (both Darwin and non-Darwin),
some byval arguments are allocated at fixed offsets by the ABI. These, however,
certainly can be pointed to by IR values. This change moves the 'isAliased'
logic out of FixedStackPseudoSourceValue and into MFI, and allows the isAliased
property to be overridden for fixed-offset objects.

This will be used by an upcoming commit to the PowerPC backend to fix PR20280.

No functionality change intended (the behavior of
FixedStackPseudoSourceValue::isAliased has been made more conservative for
callers that don't pass an MFI object, but I don't see any in-tree callers that
do that).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215794 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Darwin byval arguments are not immutable

On PPC/Darwin, byval arguments occur at fixed stack offsets in the callee's
frame, but are not immutable -- the pointer value is directly available to the
higher-level code as the address of the argument, and the value of the byval
argument can be modified at the IR level.

This is necessary, but not sufficient, to fix PR20280. When PR20280 is fixed in
a follow-up commit, its test case will cover this change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215793 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[Support] Promote cl::StringSaver to a separate utility"

This reverts commit r215784 / 3f8a26f6fe16cc76c98ab21db2c600bd7defbbaa.

LLD has 3 StringSaver's, one of which takes a lock when saving the
string... Need to investigate more closely.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215790 91177308-0d34-0410-b5e6-96231b3b80d8

Get rid of dead code: SelectAtomic64 in X86ISelDAGtoDAG.cpp

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215789 91177308-0d34-0410-b5e6-96231b3b80d8

[Support] Promote cl::StringSaver to a separate utility

This class is generally useful.

In breaking it out, the primary change is that it has been made
non-virtual. It seems like being abstract led to there being 3 different
(2 in llvm + 1 in clang) concrete implementations which disagreed about
the ownership of the saved strings (see the manual call to free() in the
unittest StrDupSaver; yes this is different from the CommandLine.cpp
StrDupSaver which owns the stored strings; which is different from
Clang's StringSetSaver which just holds a reference to a
std::set<std::string> which owns the strings).

I've identified 2 other places in the
codebase that are open-coding this pattern:

memcpy(Alloc.Allocate<char>(strlen(S)+1), S, strlen(S)+1)

I'll be switching them over. They are
* llvm::sys::Process::GetArgumentVector
* The StringAllocator member of YAMLIO's Input class
This also will allow simplifying Clang's driver.cpp quite a bit.

Let me know if there are any other places that could benefit from
StringSaver. I'm also thinking of adding a saveStringRef member for
getting a stable StringRef.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215784 91177308-0d34-0410-b5e6-96231b3b80d8

Add two helper functions: isAtLeastAcquire, isAtLeastRelease

These methods are available on AtomicOrdering values, and will be used
in a later separate patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215779 91177308-0d34-0410-b5e6-96231b3b80d8

Fix typos in comments

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215777 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch32] Add support for FP rounding operations for ARMv8/AArch32.
Phabricator Revision: http://reviews.llvm.org/D4935

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215772 91177308-0d34-0410-b5e6-96231b3b80d8

[Option] Support MultiArg in --help

Currently, if you use a MultiArg<> option, then printing out the help/usage
message will cause an assert. This fixes getOptionHelpName() to work with
MultiArg Options.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215770 91177308-0d34-0410-b5e6-96231b3b80d8

Set comdats when lazily linking functions.

We were setting the comdat when functions were copied in the initial pass, but
not when they were linked only when we found out that they are needed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215765 91177308-0d34-0410-b5e6-96231b3b80d8

[FastISel][AArch64] Fix a latent bug in floating-point materialization.

The floating-point value positive zero (+0.0) is a valid immedate value
according to isFPImmLegal. As a result AArch64 FastISel went ahead and
used the immediate version of fmov to materialize the constant.

The problem is that the immediate version of fmov cannot encode an imediate for
postive zero. Instead a fmov from the zero register was supposed to be used in
this case.

This fix adds handling for this special case and uses fmov from the zero
register to materialize a positive zero (negative zeroes go to the constant
pool).

There is no test case for this, because this code is currently dead. It will be
enabled in a future commit and I will add a test case in a separate commit
after that.

This fixes <rdar://problem/18027157>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215753 91177308-0d34-0410-b5e6-96231b3b80d8

Reapplying [FastISel][AArch64] Cleanup constant materialization code. NFCI.

Note: This reapplies r215582 without any modifications. The refactoring wasn't
responsible for the buildbot failures.

Original commit message:
Cleanup and prepare constant materialization code for future commits.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215752 91177308-0d34-0410-b5e6-96231b3b80d8

R600/SI: Move all fabs / fneg handling to patterns

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215749 91177308-0d34-0410-b5e6-96231b3b80d8

R600/SI: Use source modifiers for f64 fneg

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215748 91177308-0d34-0410-b5e6-96231b3b80d8

R600/SI: Use source modifier for f64 fabs

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215747 91177308-0d34-0410-b5e6-96231b3b80d8

R600/SI: Refactor fneg / fabs patterns

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215746 91177308-0d34-0410-b5e6-96231b3b80d8

Fix the build with MSVC 2013 after new shuffle code

MSVC gives this awesome diagnostic:

..\lib\Target\X86\X86ISelLowering.cpp(7085) : error C2971: 'llvm::VariadicFunction1' : template parameter 'Func' : 'isShuffleEquivalentImpl' : a local variable cannot be used as a non-type argument
..\include\llvm/ADT/VariadicFunction.h(153) : see declaration of 'llvm::VariadicFunction1'
..\lib\Target\X86\X86ISelLowering.cpp(7061) : see declaration of 'isShuffleEquivalentImpl'

Using an anonymous namespace makes the problem go away.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215744 91177308-0d34-0410-b5e6-96231b3b80d8

R600/SI: Fix offset folding in some cases with shifted pointers.

Ordinarily (shl (add x, c1), c2) -> (add (shl x, c2), c1 << c2)
is only done if the add has one use. If the resulting constant
add can be folded into an addressing mode, force this to happen
for the pointer operand.

This ends up happening a lot because of how LDS objects are allocated.
Since the globals are allocated next to each other, acessing the first
element of the second object is directly indexed by a shifted pointer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215739 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] Teach the new AVX v4f64 shuffle lowering to use UNPCK instructions
where applicable for blending.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215737 91177308-0d34-0410-b5e6-96231b3b80d8

[FastISel] Remove an performance debugging assert.

As Jim pointed out this assert isn't really needed to test for correctness,
because the code right afterwards does the same check and falls-back to
SelectionDAG - as intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215735 91177308-0d34-0410-b5e6-96231b3b80d8

R600/SI: Add intrinsic for ldexp

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215734 91177308-0d34-0410-b5e6-96231b3b80d8

[FastISel][ARM] Fix unit test from r215682.

Thanks Jim for finding this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215733 91177308-0d34-0410-b5e6-96231b3b80d8

R600/SI: Implement isLegalAddressingMode

The default assumes that a 16-bit signed offset is used.
LDS instruction use a 16-bit unsigned offset, so it wasn't
being used in some cases where it was assumed a negative offset
could be used.

More should be done here, but first isLegalAddressingMode needs
to gain an addressing mode argument. For now, copy most of the rest
of the default implementation with the immediate offset change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215732 91177308-0d34-0410-b5e6-96231b3b80d8

ARM: Fix and re-enable load/store optimizer for Thumb1.

In a previous iteration of the pass, we would try to compensate for
writeback by updating later instructions and/or inserting a SUBS to
reset the base register if necessary.
Since such a SUBS sets the condition flags it's not generally safe to do
this. For now, only merge LDR/STRs if there is no writeback to the base
register (LDM that loads into the base register) or the base register is
killed by one of the merged instructions. These cases are clear wins
both in terms of instruction count and performance.

Also add three new test cases, and update the existing ones accordingly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215729 91177308-0d34-0410-b5e6-96231b3b80d8

ARM load/store optimizer: Compute BaseKill correctly.

This adds some code back that was deleted in r92053. The location of the
last merged memory operation needs to be kept up-to-date since MemOps
may be in a different order to the original instruction stream to
allow merging (since registers need to be in ascending order). Also
simplify the logic to determine BaseKill using findRegisterUseOperandIdx
to use an equivalent function call instead.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215728 91177308-0d34-0410-b5e6-96231b3b80d8

[FastISel][ARM] Fix a think-o in my previous commit (r215682).

We actually need to return the register into which we materialized the constant
and not just "true" for success. This code is currently partially dead, that is
why it didn't trigger any failures yet. Once I change the order of the constant
materialization this code will be fully exercised.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215727 91177308-0d34-0410-b5e6-96231b3b80d8

Introduce a helper to combine instruction metadata.

Replace the old code in GVN and BBVectorize with it. Update SimplifyCFG to use
it.

Patch by Björn Steinbrink!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215723 91177308-0d34-0410-b5e6-96231b3b80d8

Make EmitAbsValue an static helper.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215721 91177308-0d34-0410-b5e6-96231b3b80d8

Delete dead code. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215720 91177308-0d34-0410-b5e6-96231b3b80d8

Make EmitDwarfSetLineAddr an static helper. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215718 91177308-0d34-0410-b5e6-96231b3b80d8

Make BuildSymbolDiff an static helper.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215717 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Narrow arguments passed in wrong position on the stack in
big-endian mode.

Patch by Asiri Rathnayake.

Differential Revision: http://reviews.llvm.org/D4922

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215716 91177308-0d34-0410-b5e6-96231b3b80d8

Make ForceExpAbs an static helper.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215715 91177308-0d34-0410-b5e6-96231b3b80d8

Add a helper to MCExpr for when an expression is know to be absolute.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215713 91177308-0d34-0410-b5e6-96231b3b80d8

Remove HasLEB128.

We already require CFI, so it should be safe to require .leb128 and .uleb128.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215712 91177308-0d34-0410-b5e6-96231b3b80d8

[PPC64] Add test case for r215685.

I had deferred adding this test case until I could get it down to a
reasonable size. That's done now.

Thanks,
Bill

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215711 91177308-0d34-0410-b5e6-96231b3b80d8

PPC: Clean up pointer casting, no functionality change.

Silences GCC's -Wcast-qual.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215703 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] Add the initial skeleton of type-based dispatch for AVX vectors in
the new shuffle lowering and an implementation for v4 shuffles.

This allows us to handle non-half-crossing shuffles directly for v4
shuffles, both integer and floating point. This currently misses places
where we could perform the blend via UNPCK instructions, but otherwise
generates equally good or better code for the test cases included to the
existing vector shuffle lowering. There are a few cases that are
entertainingly better. ;]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215702 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] Teach the instruction printer to decode immediate operands to
BLENDPS, BLENDPD, and PBLENDW instructions into pretty shuffle comments.

These will be used in my next commit as part of test cases for AVX
shuffles which can directly use blend in more places.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215701 91177308-0d34-0410-b5e6-96231b3b80d8

ARM: implement MRS/MSR (banked reg) system instructions.

These are system-only instructions for CPUs with virtualization
extensions, allowing a hypervisor easy access to all of the various
different AArch32 registers.

rdar://problem/17861345

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215700 91177308-0d34-0410-b5e6-96231b3b80d8

Remove testcase from README which we didn't get. We do get it now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215699 91177308-0d34-0410-b5e6-96231b3b80d8

Current implementation of c.cond.fmt instructions only accept default cc0 register. This patch enables the instruction to accept other fcc registers. The aliases with default fcc0 registers are also defined.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215698 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] Remove the duplicated code for testing whether we can widen the
elements of a shuffle mask and simplify how it works. No functionality
changed now that the bug that was here has been fixed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215696 91177308-0d34-0410-b5e6-96231b3b80d8

[LIT]Correct name of global lit configuration object to be lit_config (not lit).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215695 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] Fix the very broken formation of vpunpck instructions in the
target-specific shuffl DAG combines.

We were recognizing the paired shuffles backwards. This code needs to be
replaced anyways as we have the same functionality elsewhere, but I'll
do the refactoring in a follow-up, this is the minimal fix to the
behavior.

In addition to fixing miscompiles with the new vector shuffle lowering,
it also causes the canonicalization to kick in much better, selecting
the smaller encoding variants in lots of places in the new AVX path.
This still isn't quite ideal as we don't need both the shufpd and the
punpck instructions, but that'll get fixed in a follow-up patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215690 91177308-0d34-0410-b5e6-96231b3b80d8

Don't print comments to an object streamer :-)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215689 91177308-0d34-0410-b5e6-96231b3b80d8

EmitAbsValue is the same as EmitValue on non-darwin. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215688 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] Fix PR20540 where the x86 shuffle DAG combiner had completely
broken logic for merging shuffle masks in the face of SM_SentinelZero
mask operands.

While these are '-1' they don't mean 'undef' the way '-1' means in the
pre-legalized shuffle masks. Instead, they mean that the shuffle
operation is forcibly zeroing that lane. Reflect this and explicitly
handle it in a bunch of places. In one place the effect is equivalent
but much more clear. In the rest it was really weirdly broken.

Also, rewrite the entire merging thing to be a more directy operation
with a single loop and just doing math to map the indices through the
various masks.

Also add a bunch of asserts to try to make in extremely clear what the
different masks can possibly look like.

Finally, add some comments to clarify that we're merging shuffle masks
*up* here rather than *down* as we do everywhere else, and thus the
logic is quite confusing.

Thanks to several different people for sending test cases, and for
Robert Khasanov for an initial attempt at fixing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215687 91177308-0d34-0410-b5e6-96231b3b80d8

[PPC64] Add missing dependency on X2 to LDinto_toc.

The LDinto_toc pattern has been part of 64-bit PowerPC for a long
time, and represents loading from a memory location into the TOC
register (X2).  However, this pattern doesn't explicitly record that
it modifies that register.  This patch adds the missing dependency.

It was very surprising to me that this has never shown up as a problem
in the past, and that we only saw this problem recently in a single
scenario when building a self-hosted clang.  It turns out that in most
cases we have another dependency present that keeps the LDinto_toc
instruction tied in place.  LDinto_toc is used for TOC restore
following a call site, so this is a typical sequence:

   BCTRL8 <regmask>, %CTR8<imp-use>, %RM<imp-use>, %X3<imp-use>, %X12<imp-use>, %X1<imp-def>, ...
   LDinto_toc 24, %X1
   ADJCALLSTACKUP 96, 0, %R1<imp-def>, %R1<imp-use>

Because the LDinto_toc is inserted prior to the ADJCALLSTACKUP, there
is a natural anti-dependency between the two that keeps it in place.

Therefore we don't usually see a problem.  However, in one particular
case, one call is followed immediately by another call, and the second
call requires a parameter that is a TOC-relative address.  This is the
code sequence:

  BCTRL8 <regmask>, %CTR8<imp-use>, %RM<imp-use>, %X3<imp-use>, %X4<imp-use>, %X5<imp-use>, %X12<imp-use>, %X1<imp-def>, ...
  LDinto_toc 24, %X1
  ADJCALLSTACKUP 96, 0, %R1<imp-def>, %R1<imp-use>
  ADJCALLSTACKDOWN 96, %R1<imp-def>, %R1<imp-use>
  %vreg39<def> = ADDIStocHA %X2, <ga:@.str>; G8RC_and_G8RC_NOX0:%vreg39
  %vreg40<def> = ADDItocL %vreg39<kill>, <ga:@.str>; G8RC:%vreg40 G8RC_and_G8RC_NOX0:%vreg39

Note that the back-to-back stack adjustments are the same size!  The
back end is smart enough to recognize this and optimize them away:

  BCTRL8 <regmask>, %CTR8<imp-use>, %RM<imp-use>, %X3<imp-use>, %X4<imp-use>, %X5<imp-use>, %X12<imp-use>, %X1<imp-def>, ...
  LDinto_toc 24, %X1
  %vreg39<def> = ADDIStocHA %X2, <ga:@.str>; G8RC_and_G8RC_NOX0:%vreg39
  %vreg40<def> = ADDItocL %vreg39<kill>, <ga:@.str>; G8RC:%vreg40 G8RC_and_G8RC_NOX0:%vreg39

Now there is nothing to prevent the ADDIStocHA instruction from moving
ahead of the LDinto_toc instruction, and because of the longest-path
heuristic, this is what happens.

With the accompanying patch, %X2 is represented as an implicit def:

  BCTRL8 <regmask>, %CTR8<imp-use>, %RM<imp-use>, %X3<imp-use>, %X4<imp-use>, %X5<imp-use>, %X12<imp-use>, %X1<imp-def>, ...
  LDinto_toc 24, %X1, %X2<imp-def,dead>
  ADJCALLSTACKUP 96, 0, %R1<imp-def,dead>, %R1<imp-use>
  ADJCALLSTACKDOWN 96, %R1<imp-def,dead>, %R1<imp-use>
  %vreg39<def> = ADDIStocHA %X2, <ga:@.str>; G8RC_and_G8RC_NOX0:%vreg39
  %vreg40<def> = ADDItocL %vreg39<kill>, <ga:@.str>; G8RC:%vreg40 G8RC_and_G8RC_NOX0:%vreg39

So now when the two stack adjustments are removed, ADDIStocHA is
prevented from being moved above LDinto_toc.

I have not yet created a test case for this, because the original
failure occurs on a relatively large function that needs reduction.
However, this is a fairly serious bug, despite its infrequency, and I
wanted to get this patch onto the list as soon as possible so that it
can be considered for a 3.5 backport.  I'll work on whittling down a
test case.

Have we missed the boat for 3.5 at this point?

Thanks,
Bill

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215685 91177308-0d34-0410-b5e6-96231b3b80d8

[FastISel][ARM] Fall-back to constant pool loads when materializing an i32 constant.

FastEmit_i won't always succeed to materialize an i32 constant and just fail.
This would trigger a fall-back to SelectionDAG, which is really not necessary.

This fix will first fall-back to a constant pool load to materialize the constant
before giving up for good.

This fixes <rdar://problem/18022633>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215682 91177308-0d34-0410-b5e6-96231b3b80d8

Copy noalias metadata from call sites to inlined instructions

When a call site with noalias metadata is inlined, that metadata can be
propagated directly to the inlined instructions (only those that might access
memory because it is not useful on the others). Prior to inlining, the noalias
metadata could express that a call would not alias with some other memory
access, which implies that no instruction within that called function would
alias. By propagating the metadata to the inlined instructions, we preserve
that knowledge.

This should complete the enhancements requested in PR20500.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215676 91177308-0d34-0410-b5e6-96231b3b80d8

Revert several FastISel commits to track down a buildbot error.

This reverts:
r215595 "[FastISel][X86] Add large code model support for materializing floating-point constants."
r215594 "[FastISel][X86] Use XOR to materialize the "0" value."
r215593 "[FastISel][X86] Emit more efficient instructions for integer constant materialization."
r215591 "[FastISel][AArch64] Make use of the zero register when possible."
r215588 "[FastISel] Let the target decide first if it wants to materialize a constant."
r215582 "[FastISel][AArch64] Cleanup constant materialization code. NFCI."

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215673 91177308-0d34-0410-b5e6-96231b3b80d8

Fix whitespace error from r215279, NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215667 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX512] Add test for FMA masking instrinsics

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215665 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX512] Switch FMA intrinsics to the masking version

This does the renaming and updates the lowering logic.

Part of <rdar://problem/17688758>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215664 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Break out logic to map FMA Intrinsic number to Opcode

No functional change. Will be used to lower AVX512 masking FMA intrinsics.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215663 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX512] Add enum for the static rounding types

No functional change. This will be used by the new FMA intrinsic lowering
code.

We can probably add NO_EXC here as well, I am just not too familiar with this
part of AVX512 yet. We can add that later.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215662 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX512] Break out the logic to lower masking intrinsics

No functional change. This will be used by the FMA intrinsic lowering as well
and hopefully many more.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215661 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX512] Add masking variant for the FMA instructions

This change further evolves the base class AVX512_masking in order to make it
suitable for the masking variants of the FMA instructions.

Besides AVX512_masking there is now a new base class that instructions
including FMAs can use: AVX512_masking_3src.  With three-source (destructive)
instructions one of the sources is already tied to the destination.  This
difference from AVX512_masking is captured by this new class.  The common bits
between _masking and _masking_3src are broken out into a new super class
called AVX512_masking_common.

As with valign, there is some corresponding restructuring of the underlying
format classes.  The idea is the same we want to derive from two classes
essentially: one providing the format bits and another format-independent
multiclass supplying the various masking and non-masking instruction variants.

Existing fma tests in avx512-fma*.ll provide coverage here for the non-masking
variants.  For masking, the next patches in the series will add intrinsics and
intrinsic tests.

For AVX512_masking_3src to work, the (ins ...) dag has to be passed *without*
the leading source operand that is tied to dst ($src1).  This is necessary to
properly construct the (ins ...) for the different variants.  For the record,
I did check that if $src is mistakenly included, you do get a fairly intuitive
error message from the tablegen backend.

Part of <rdar://problem/17688758>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215660 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[FastISel][AArch64] Add support for more addressing modes."

This reverts commits r215597, because it might have broken the build bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215659 91177308-0d34-0410-b5e6-96231b3b80d8

Add noalias metadata for general calls (not just memory intrinsics) during inlining

When preserving noalias function parameter attributes by adding noalias
metadata in the inliner, we should do this for general function calls (not just
memory intrinsics). The logic is very similar to what already existed (except
that we want to add this metadata even for functions taking no relevant
parameters). This metadata can be used by ModRef queries in the caller after
inlining.

This addresses the first part of PR20500. Adding noalias metadata during
inlining is still turned off by default.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215657 91177308-0d34-0410-b5e6-96231b3b80d8