David Majnemer [Thu, 31 Jul 2014 04:49:18 +0000 (04:49 +0000)]
InstSimplify: Simplify (X - (0 - Y)) if the second sub is NUW
If the NUW bit is set for 0 - Y, we know that all values for Y other
than 0 would produce a poison value. This allows us to replace (0 - Y)
with 0 in the expression (X - (0 - Y)) which will ultimately leave us
with X.
[FastISel][AArch64] Update and enable patchpoint and stackmap intrinsic tests for FastISel.
This commit updates the existing SelectionDAG tests for the stackmap and patchpoint
intrinsics and enables FastISel testing. It also splits up the tests into separate
files, due to different codegen between SelectionDAG and FastISel.
[FastISel][AArch64] Add MachO large code model support for function calls.
Currently the large code model for MachO uses the GOT to make function calls.
Emit the required adrp and ldr instructions to load the address from the GOT.
UseListOrder: Don't give constant IDs to GlobalValues
Since initializers of GlobalValues are being assigned IDs before
GlobalValues themselves, explicitly exclude GlobalValues from the
constant pool. Added targeted test in `test/Bitcode/use-list-order.ll`
and added two more RUN lines in `test/Assembly`.
[FastISel][AArch64 and X86] Don't emit stores for UNDEF arguments during function call lowering.
UNDEF arguments are not ment to be touched - especially for the webkit_js
calling convention. This fix reproduces the already existing behavior of
SelectionDAG in FastISel.
llvm-profdata: Use consistent file suffixes in tests
In some places we've been using different suffixes for the different
file formats involved in instrprof, but in others we've just
ambiguously used .profdata. Update the test files to indicate the
types of file more obviously.
Rafael Espindola [Wed, 30 Jul 2014 22:51:54 +0000 (22:51 +0000)]
Use "weak alias" instead of "alias weak"
Before this patch we had
@a = weak global ...
but
@b = alias weak ...
The patch changes aliases to look more like global variables.
Looking at some really old code suggests that the reason was that the old
bison based parser had a reduction for alias linkages and another one for
global variable linkages. Putting the alias first avoided the reduce/reduce
conflict.
The days of the old .ll parser are long gone. The new one parses just "linkage"
and a later check is responsible for deciding if a linkage is valid in a
given context.
This commit adds support for the {s|u}{add|sub|mul}.with.overflow intrinsics.
The unit tests for FastISel will be enabled in a later commit, once there is
also branch and select folding support.
[FastISel][AArch64] Add support for shift-immediate.
Currently the shift-immediate versions are not supported by tblgen and
hopefully this can be later removed, once the required support has been
added to tblgen.
David Majnemer [Wed, 30 Jul 2014 21:26:37 +0000 (21:26 +0000)]
InstCombine: Simplify (A ^ B) or/and (A ^ B ^ C)
While we can already transform A | (A ^ B) into A | B, things get bad
once we have (A ^ B) | (A ^ B ^ Cst) because reassociation will morph
this into (A ^ B) | ((A ^ Cst) ^ B). Our existing patterns fail once
this happens.
To fix this, we add a new pattern which looks through the tree of xor
binary operators to see that, in fact, there exists a redundant xor
operation.
What follows bellow is a correctness proof of the transform using CVC3.
$ cat t.cvc
A, B, C : BITVECTOR(64);
QUERY BVXOR(A, B) | BVXOR(BVXOR(B, C), A) = BVXOR(A, B) | C;
QUERY BVXOR(BVXOR(A, C), B) | BVXOR(A, B) = BVXOR(A, B) | C;
QUERY BVXOR(A, B) & BVXOR(BVXOR(B, C), A) = BVXOR(A, B) & ~C;
QUERY BVXOR(BVXOR(A, C), B) & BVXOR(A, B) = BVXOR(A, B) & ~C;
Louis Gerbarg [Wed, 30 Jul 2014 20:26:09 +0000 (20:26 +0000)]
Fix test case introduced in r214322
This patch adds an explicit triple to the test case introduced by r214322. This
should fix build failueres that are occuring on bots that are cross building.
Louis Gerbarg [Wed, 30 Jul 2014 18:24:41 +0000 (18:24 +0000)]
Retain alignment requirements for load->selects modified by DAGCombine
DAGCombine may choose to rewrite graphs where two loads feed a select into
graphs where a select of two addresses feed a load. While it sanity checks the
loads to make sure they are broadly equivalent it currently just uses the
alignment restriction of the left node. In cases where the right node has
stronger alignment requiresment this may lead to bad codegen, such as generating
an aligned load where an unaligned load is required. This patch makes the
combine generate a load with an alignment that is the same as whichever is more
restrictive of the two alignments.
When predicting use-list order, we visit functions in reverse order
followed by `GlobalValue`s and write out use-lists at the first
opportunity. In the reader, this will translate to *after* the last use
has been added.
For this to work, we actually need to descend into `GlobalValue`s.
Added a targeted test in `use-list-order.ll` and `RUN` lines to the
newly passing tests in `test/Bitcode`.
There are two remaining failures in `test/Bitcode`:
- blockaddress.ll: I haven't thought through how to model the way
block addresses change the order of use-lists (or how to work around
it).
- metadata-2.ll: There's an old-style `@llvm.used` global array here
that I suspect the .ll parser isn't upgrading properly. When it
round-trips through bitcode, the .bc reader *does* upgrade it, so
the extra variable (`i8* null`) has an extra use, and the shuffle
vector doesn't match.
I think the fix is to upgrade old-style global arrays (or reject
them?) in the .ll parser.
Turns out `parseBitcodeFile()` does *not* take ownership of the buffer.
This was already clear in the header docs, but I obviously didn't read
them (having noticed that it gets stored in a `unique_ptr<>`).
Don't manually (and forcibly) run the verifier on the entire module from
the jump instruction table pass. First, the verifier is already built
into all the tools. The test case is adapted to just run llvm-as
demonstrating that we still catch the broken module. Second, the
verifier is *extremely* slow. This was responsible for very significant
compile time regressions.
If you have deployed a Clang binary anywhere from r210280 to this
commit, you really want to re-deploy.
Lang Hames [Wed, 30 Jul 2014 03:35:05 +0000 (03:35 +0000)]
[MCJIT] Fix the ARM BR24 relocation in RuntimeDyldMachO.
We now (1) correctly decode the branch immediate, (2) modify the immediate to
corretly treat it as PC-rel, and (3) properly populate the stub entry.
Previously we had been doing each of these wrong.
Matt Arsenault [Wed, 30 Jul 2014 03:18:57 +0000 (03:18 +0000)]
R600/SI: Remove redundant setting of bits on instructions.
neverHasSideEffects is deprecated, and hasSideEffects = 0 is already
set on the base classes of the basic ALU instruction classes. The
base classes also already set mayLoad = 0 and mayStore = 0
Until r214242, this difference between compilers didn't matter. In
r214242, `OrderMap::LastGlobalValueID` was introduced and compared
against IDs, which in GCC were off-by-one my expectations.
Lang Hames [Tue, 29 Jul 2014 23:43:13 +0000 (23:43 +0000)]
[MCJIT] Add options to llvm-rtdyld to describe a phony target address space for
use in -verify mode.
This patch adds three hidden command line options to llvm-rtdyld:
-target-addr-start <start-addr> : Specify the start of the virtual address
space on the phony target.
-target-addr-end <end-addr> : Specify the end of the virtual address space
on the phony target.
-target-section-sep <sep> : Specify the separation (in bytes) between the
end of one section and the start of the next.
These options automatically default to sane values for the target platform. In
particular, they allow narrow (e.g. 32-bit, 16-bit) targets to be tested from
wider (e.g. 64-bit, 32-bit) hosts without overflowing pointers.
The section separation option defaults to zero, but can be set to a large number
(e.g. 1 << 32) to force large separations between sections in order to
stress-test large-code-model code.
Revert "UseListOrder: Order GlobalValue uses after initializers"
This reverts commits r214242 and r214243 while I investigate buildbot
failures [1][2][3]. I can't reproduce these failures locally, so if
anyone can see what I've done wrong, I'd appreciate a note.
UseListOrder: Additional test coverage for r214242
r214242 was subtle enough it really deserves a targeted test with
comments. This adds some global variables that trigger the relevant
code path. Sorry this wasn't committed with the fix.
UseListOrder: Order GlobalValue uses after initializers
To avoid unnecessary forward references, the reader doesn't process
initializers of `GlobalValue`s until after the constant pool has been
processed, and then in reverse order. Model this when predicting
use-list order. This gets two more Bitcode tests passing with
`llvm-uselistorder`.
This moves some tests around to make it clearer what's being tested,
and adds very rudimentary comment syntax to the text input format to
make specifying this kind of test a little bit simpler.
Lang Hames [Tue, 29 Jul 2014 21:48:22 +0000 (21:48 +0000)]
[MCJIT] XFAIL some RuntimeDyld tests on MIPS - RuntimeDyldChecker isn't properly
endian-aware yet, and this is causing failures when cross-linking on MIPS.
C:/lld-x86_64_win7/lld-x86_64-win7/llvm.src/include\llvm/IR/UseListOrder.h(92): error C2248: 'llvm::UseListShuffleVector::operator =' : cannot access private member declared in class 'llvm::UseListShuffleVector' (C:\lld-x86_64_win7\lld-x86_64-win7\llvm.src\lib\Bitcode\Writer\ValueEnumerator.cpp) [C:\lld-x86_64_win7\lld-x86_64-win7\llvm.obj\lib\Bitcode\Writer\LLVMBitWriter.vcxproj]
C:/lld-x86_64_win7/lld-x86_64-win7/llvm.src/include\llvm/IR/UseListOrder.h(56) : see declaration of 'llvm::UseListShuffleVector::operator ='
C:/lld-x86_64_win7/lld-x86_64-win7/llvm.src/include\llvm/IR/UseListOrder.h(32) : see declaration of 'llvm::UseListShuffleVector'
This diagnostic occurred in the compiler generated function 'llvm::UseListOrder &llvm::UseListOrder::operator =(const llvm::UseListOrder &)'
Remove the copy constructor added in r214178 to appease MSVC17 since it
shouldn't be called at all. My guess is that explicitly deleting it
will make the compiler happy. To round out the operations I've also
deleted copy assignment and added move assignment. Otherwise no
functionality change.
Lang Hames [Tue, 29 Jul 2014 20:40:37 +0000 (20:40 +0000)]
[MCJIT] Make the RuntimeDyldChecker stub_addr builtin use file names rather than
full paths for its first argument.
This allows us to remove the annoying sed lines in the test cases, and write
direct references to file names in stub_addr calls (rather than <filename>
placeholders).