This removes the duplicate definition of GetXDataSection. This function is
available as a static method and is identical to the previous implementation.
This just cleans up the unnecessary duplication.
Use the full form of dccci and iccci from the early PPC 405 documents,
since the operands are actually used on those cores. Provide aliases for
the only documented case in the newer Power ISA speec.
Tom Stellard [Sat, 9 Aug 2014 01:06:53 +0000 (01:06 +0000)]
R600/SI: Update concat_vectors.ll to check for scratch usage
These tests were using SI-NOT: MOVREL to make sure concat vectors
weren't being lowered to stack loads and stores, but we are using
scratch buffers for the stack now instead of registers, so we need
to add an additional SI-NOT check for scratch buffers.
With this change I was able to uncover one broken test which will
be fixed in a future commit.
Lang Hames [Fri, 8 Aug 2014 23:12:22 +0000 (23:12 +0000)]
[MCJIT] Simplify immediate decoding code in the RuntimeDyldMachO hierarchy.
Cleanup only: no functional change.
This patch makes RuntimeDyldMachO targets directly responsible for decoding
immediates, rather than letting them implement catch a callback from generic
code. Since this is a very target specific operation, it makes sense to let the
target-specific code drive it.
I accidentally also used INC/DEC for unsigned arithmetic which doesn't work,
because INC/DEC don't set the required flag which is used for the overflow
check.
Tim Northover [Fri, 8 Aug 2014 17:31:52 +0000 (17:31 +0000)]
AArch64: avoid deleting the current iterator in a loop.
std::map invalidates the iterator to any element that gets deleted, which means
we can't increment it correctly afterwards. This was causing Darwin test
failures.
David Blaikie [Fri, 8 Aug 2014 17:12:35 +0000 (17:12 +0000)]
DebugInfo: Recommit (reverted in r215217, originally committed in r215157) the assertion that no argument variable is overwritten by subsequent argument variables.
This turned up a bug in clang where arguments were emitted with
duplicate argument numbers (see r215227).
Pedro Artigas [Fri, 8 Aug 2014 16:46:53 +0000 (16:46 +0000)]
Added a TLI hook to signal that the target does not have or does not care about
floating point exceptions, added use of flag to fold potentially exception
raising floating point math in selection DAG. No functionality change, as
targets have to explicitly ask for this behavior and none does today.
Daniel Sanders [Fri, 8 Aug 2014 15:47:17 +0000 (15:47 +0000)]
[mips] Invert the abicalls feature bit to be noabicalls so that it's possible for -mno-abicalls to take effect.
Also added the testcase that should have been in r215194.
This behaviour has surprised me a few times now. The problem is that the
generated MipsSubtarget::ParseSubtargetFeatures() contains code like this:
if ((Bits & Mips::FeatureABICalls) != 0) IsABICalls = true;
so '-abicalls' means 'leave it at the default' and '+abicalls' means 'set it to
true'. In this case, (and the similar -modd-spreg case) I'd like the code to be
James Molloy [Fri, 8 Aug 2014 12:33:21 +0000 (12:33 +0000)]
[AArch64] Add an FP load balancing pass for Cortex-A57
For best-case performance on Cortex-A57, we should try to use a balanced mix of odd and even D-registers when performing a critical sequence of independent, non-quadword FP/ASIMD floating-point multiply or multiply-accumulate operations.
This pass attempts to detect situations where the register allocation may adversely affect this load balancing and to change the registers used so as to better utilize the CPU.
Ideally we'd just take each multiply or multiply-accumulate in turn and allocate it alternating even or odd registers. However, multiply-accumulates are most efficiently performed in the same functional unit as their accumulation operand. Therefore this pass tries to find maximal sequences ("Chains") of multiply-accumulates linked via their accumulation operand, and assign them all the same "color" (oddness/evenness).
This optimization affects S-register and D-register floating point multiplies and FMADD/FMAs, as well as vector (floating point only) muls and FMADD/FMA. Q register instructions (and 128-bit vector instructions) are not affected.
Tim Northover [Fri, 8 Aug 2014 12:00:09 +0000 (12:00 +0000)]
llvm-objdump: use portable format specifiers for info.
ARM bots (& others, I think, now that I look) were failing because we
were using incorrect printf-style format specifiers. They were wrong
on almost any platform, actually, just mostly harmlessly so.
David Blaikie [Thu, 7 Aug 2014 22:22:49 +0000 (22:22 +0000)]
DebugInfo: Fix overwriting/loss of inlined arguments to recursively inlined functions.
Due to an unnecessary special case, inlined arguments that happened to
be from the same function as they were inlined into were misclassified
as non-inline arguments and would overwrite the non-inlined arguments.
Assert that we never overwrite a function's arguments, and stop
misclassifying inlined arguments as non-inline arguments to fix this
issue.
Excuse the rather crappy test case - handcrafted IR might do better, or
someone who understands better how to tickle the inliner to create a
recursive inlining situation like this (though it may also be necessary
to tickle the variable in a particular way to cause it to be recorded in
the MMI side table and go down this particular path for location
information).
Reed Kotler [Thu, 7 Aug 2014 22:09:01 +0000 (22:09 +0000)]
fix materialization of one bit constants and global values which are accessed through
a base GOT entry.
Summary:
get tip of tree mips fast-isel to pass test-suite
Two bugs were fixed:
1) one bit booleans were treated as 1 bit signed integers and so the literal '1' could become sign extended.
2) mips uses got for pic but in certain cases, as with string constants for example, many items can be referenced from the same got entry and this case was not handled properly.
Temporarily Revert "Nuke the old JIT." as it's not quite ready to
be deleted. This will be reapplied as soon as possible and before
the 3.6 branch date at any rate.
Approved by Jim Grosbach, Lang Hames, Rafael Espindola.
This reverts commits r215111, 215115, 215116, 215117, 215136.
Owen Anderson [Thu, 7 Aug 2014 21:07:35 +0000 (21:07 +0000)]
Fix a case in SROA where lifetime intrinsics could inhibit alloca promotion. In
this case, the code path dealing with vector promotion was missing the explicit
checks for lifetime intrinsics that were present on the corresponding integer
promotion path.
Lang Hames [Thu, 7 Aug 2014 20:41:57 +0000 (20:41 +0000)]
[MCJIT] Replace a c-style cast with reinterpret_cast + static_cast.
C-style casts (and reinterpret_casts) result in implementation defined
values when a pointer is cast to a larger integer type. On some platforms
this was leading to bogus address computations in RuntimeDyldMachOAArch64.
Richard Smith [Thu, 7 Aug 2014 20:41:17 +0000 (20:41 +0000)]
Remove Support/IncludeFile.h and its only user. This is actively harmful, since
it breaks the modules builds (where CallGraph.h can be quite reasonably
transitively included by an unimported portion of a module, and CallGraph.cpp
not linked in), and appears to have been entirely redundant since PR780 was
fixed back in 2008.
If this breaks anything, please revert; I have only tested this with a single
configuration, and it's possible that this is still somehow fixing something
(though I doubt it, since no other similar file uses this mechanism any more).
Akira Hatanaka [Thu, 7 Aug 2014 19:30:13 +0000 (19:30 +0000)]
[Branch probability] Recompute branch weights of tail-merged basic blocks.
BranchFolderPass was not correctly setting the basic block branch weights when
tail-merging created or merged blocks. This patch recomutes the weights of
tail-merged blocks using the following formula:
branch_weight(merged block to successor j) =
sum(block_frequency(bb) * branch_probability(bb -> j))
bb is a block that is in the set of merged blocks.
Justin Bogner [Thu, 7 Aug 2014 18:40:37 +0000 (18:40 +0000)]
FileCheck: Add a flag to allow checking empty input
Currently FileCheck errors out on empty input. This is usually the
right thing to do, but makes testing things like "this command does
not emit some error message" hard to test. This usually leads to
people using "command 2>&1 | count 0" instead, and then the bots that
use guard malloc fail a few hours later.
By adding a flag to FileCheck that allows empty inputs, we can make
tests that consist entirely of "CHECK-NOT" lines feasible.
Adam Nemet [Thu, 7 Aug 2014 17:53:55 +0000 (17:53 +0000)]
[AVX512] Generate masking instruction variants with tablegen
After adding the masking variants to several instructions, I have decided to
experiment with generating these from the non-masking/unconditional
variant. This will hopefully reduce the amount repetition that we currently
have in order to define an instruction with all its variants (for a reg/mem
instruction this would be 6 instruction defs and 2 Pat<> for the intrinsic).
The patch is the first cut that is currently only applied to valignd/q to make
the patch small.
A few notes on the approach:
* In order to stitch together the dag for both the conditional and the
unconditional patterns I pass the RHS of the set rather than the full
pattern (set dest, RHS).
* Rather than subclassing each instruction base class (e.g. AVX512AIi8),
with a masking variant which wouldn't scale, I derived the masking
instructions from a new base class AVX512 (this is just I<> with
Requires<HasAVX512>). The instructions derive from this now, plus a new set
of classes that add the format bits and everything else that instruction
base class provided (i.e. AVX512AIi8 vs. AVX512AIi8Base).
I hope we can go incrementally from here. I expect that:
* We will need different variants of the masking class. One example is
instructions requiring three vector sources. In this case we tie one of the
source operands to dest rather than a new implicit source operand ($src0)
* Add the zero-masking variant
* Add more AVX512*Base classes as new uses are added
I've looked at X86.td.expanded before and after to make sure that nothing got
lost for valignd/q.
Add first bunch of SPE instructions. As they overlap with Altivec, mark
them as parser-only until the disassembler is extended to handle
predicates properly.
Aaron Ballman [Thu, 7 Aug 2014 12:07:33 +0000 (12:07 +0000)]
Silencing an MSVC C4334 warning ('<<' : result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)). No functional changes intended.