Craig Topper [Wed, 30 Dec 2015 06:00:22 +0000 (06:00 +0000)]
[TableGen] Remove raw_string_ostream by just emitting the header for the switch the first time we emit a case. If the header was never emitted just print the default at the end. NFC
Craig Topper [Wed, 30 Dec 2015 06:00:15 +0000 (06:00 +0000)]
[TableGen] Use 'size_t' instead of 'unsigned' to better match the argument types of addAsmOperand. Simplify some code by using StringRef::find instead of std::find. These were previously done in r247527 and r247528, but another commit seems to have erased them. NFC
Chandler Carruth [Wed, 30 Dec 2015 04:00:24 +0000 (04:00 +0000)]
[ptr-traits] Implement the base pointer traits using the actual
alignment of the pointee type!
This is the culmination of the ptr-traits work. Now the compiler will
catch me if I try to use a pointer to an empty struct as a key in
a dense map or inside a PointerIntPair or PointerUnion! This is much,
much better than sometimes corrupting data (and other times working
fine) due to insufficient alignment.
It also means that we will be much more diligent about rejecting other
uses of these constructs that aren't safe.
It also means that we can now be more aggressive with the constructs
when we actually have guaranteed higher alignment without specializing
stuff. I'll be going through and cleaning up all the current overrides
of these traits which are no longer necessary.
Many thanks to Richard, David, and others who helped me get all of this
together.
Chandler Carruth [Wed, 30 Dec 2015 03:56:19 +0000 (03:56 +0000)]
[ptr-traits] Refactor how PointerIntPair does its pointer manipulation
to isolate it in a dependent helper class.
Without doing this, we end up requiring all of the pointer traits the
moment you even define a PointerIntPair. That makes them *incredibly*
hard to use, for example you can't use them at all inside a class for
pointers to that class!
This change sinks all the logic into a helper template class that only
needs to be fully instantiated when *using* the PointerIntPair. We still
get compile-time checking, but it is deferred long enough to make
tradition out-of-line method definitions (or just the normal deferred
method body parsing) sufficient to handle cycling references.
Manuel Jacob [Tue, 29 Dec 2015 21:57:55 +0000 (21:57 +0000)]
[PlaceSafepoints] Assert that the gc.safepoint_poll function is present in the module.
If running the PlaceSafepoints pass on a module which doesn't have the
gc.safepoint_poll function without disabling entry and backedge safepoints,
previously the pass crashed with an obscure error because of a null pointer.
Now it fails the assert instead.
Geoff Berry [Tue, 29 Dec 2015 18:10:16 +0000 (18:10 +0000)]
[JumpThreading] Fix opcode bonus in getJumpThreadDuplicationCost()
The code that was meant to adjust the duplication cost based on the
terminator opcode was not being executed in cases where the initial
threshold was hit inside the loop.
Chandler Carruth [Tue, 29 Dec 2015 09:52:41 +0000 (09:52 +0000)]
[ADT] Teach alignment helpers to work correctly for abstract classes.
This is necessary to use them as part of pointer traits and is generally
useful. I've added unit test coverage to isolate and ensure this works
correctly.
I'll watch the build bots to try to see if any compilers can't tolerate
this bit of magic (and much credit goes to Richard Smith for coming up
with this magical production!) but give a shout if you see issues.
Chandler Carruth [Tue, 29 Dec 2015 09:32:18 +0000 (09:32 +0000)]
[ptr-traits] Provide a real MCFragment address for the sentinel instead
of casting the integer '4' to such a pointer. There is no reason to
expect '4' to be a portable or reliable pointer of this form. The only
reason this ever worked is because the PointerIntPair that this actually
gets used with has an artificially *low* presumed alignment that allowed
it to work. When the alignment of PointerIntPair is derived from the
actual type's alignment, the asserts start firing on this pointer. I'm
amazed we never managed to do anything that triggered the alignment
sanitizer with it, as this is just flat out UB.
If folks dislike this approach to providing a sentinel fragment address,
there are a myriad of other alternatives, suggestions welcome. But this
one has the distinct advantage of not requiring the friend dance of
ilist's sentinel (which I'll point out is *also* in play for
MCFragment!) and seems to be using a nicely provided facility in
MCFragment to establish just such dummy nodes.
This is part of a series of patches to allow LLVM to check for complete
pointee types when computing its pointer traits. This is absolutely
necessary to get correct (or reproducible) results for things like how
many low bits are guaranteed to be zero.
Chandler Carruth [Tue, 29 Dec 2015 09:24:42 +0000 (09:24 +0000)]
[ptr-traits] Sink several in-body method definitions to be out-of-line
inline definitions after the mutually recursive pair of types have been
defined. The two types mutually recurse specifically through
abstractions that require pointer traits which makes this kind of mutual
recursion especially tricky to get right in terms of ordering.
This is part of a series of patches to allow LLVM to check for complete
pointee types when computing its pointer traits. This is absolutely
necessary to get correct (or reproducible) results for things like how
many low bits are guaranteed to be zero.
Chandler Carruth [Tue, 29 Dec 2015 09:24:39 +0000 (09:24 +0000)]
[ptr-traits] Sink a constructor definition to the .cpp file and add
missing includes so that the pointee types for DenseMap pointer keys and
such are complete prior to us querying the pointer traits for them.
This is part of a series of patches to allow LLVM to check for complete
pointee types when computing its pointer traits. This is absolutely
necessary to get correct (or reproducible) results for things like how
many low bits are guaranteed to be zero.
Chandler Carruth [Tue, 29 Dec 2015 09:06:21 +0000 (09:06 +0000)]
[ptr-traits] Add a bunch of includes to provide complete types that are
used in pointer dense map key types or in other ways that require
pointer traits.
This is part of a series of patches to allow LLVM to check for complete
pointee types when computing its pointer traits. This is absolutely
necessary to get correct (or reproducible) results for things like how
many low bits are guaranteed to be zero.
Chandler Carruth [Tue, 29 Dec 2015 09:06:16 +0000 (09:06 +0000)]
[ptr-traits] Split the MCFragment type hierarchy out of the MCAssembler
header to its own header, allowing users of fragments to have a narrower
header file, and avoid circular header dependencies when getting the
definition of MCSection prior to inspecting traits on MCSection
pointers.
This is part of a series of patches to allow LLVM to check for complete
pointee types when computing its pointer traits. This is absolutely
necessary to get correct (or reproducible) results for things like how
many low bits are guaranteed to be zero.
Note that this doesn't in any way change the design of MC, it is just
moving code around to allow the *header files* to be more fine grained.
Without this, it is impossible to get a complete type for MCSection
where it is needed.
If anyone would prefer a different slicing of the header files, I'm
happy to oblige of course. =]
Previously, the code enforced non-decreasing alignment of each trailing
type. However, it's easy enough to allow for realignment as needed, and
thus avoid the developer having to think about the possiblilities for
alignment requirements on all architectures.
(E.g. on Linux/x86, a struct with an int64 member is 4-byte aligned,
while on other 32-bit archs -- and even with other OSes on x86 -- it has
8-byte alignment. This sort of thing is irritating to have to manually
deal with.)
Chandler Carruth [Tue, 29 Dec 2015 02:14:50 +0000 (02:14 +0000)]
[ptr-traits] Merge the MetadataTracking helpers into the Metadata
header.
This is part of a series of patches to allow LLVM to check for complete
pointee types when computing its pointer traits. This is absolutely
necessary to get correct (or reproducible) results for things like how
many low bits are guaranteed to be zero.
The MetadataTracking helpers aren't actually independent. They rely on
constructing a PointerUnion between Metadata and MetadataAsValue
pointers, which requires know the alignment of pointers to those types
which requires them to be complete.
The .cpp file even defined a method declared in Metadata.h! These really
don't seem like something that is separable, and there is no real
layering problem with just placing them together.
Chandler Carruth [Tue, 29 Dec 2015 00:03:24 +0000 (00:03 +0000)]
[ADT] Use a nonce type with at least 4 byte alignment.
We didn't actually statically check this, and so it worked 25% of the
time for me. =/ Really sorry it took so long to fix, I shouldn't leave
the commit log editor window open without saving and landing the commit.
=[
Artyom Skrobov [Mon, 28 Dec 2015 21:40:45 +0000 (21:40 +0000)]
[Thumb] Fix assembler error 'cannot honor width suffix pop {lr}'
Summary:
* avoid generating POP {LR} in Thumb1 epilogues
* combine MOV LR, Rx + BX LR -> BX Rx in a peephole optimization pass
* combine POP {LR} + B + BX LR -> POP {PC} on v5T+
Easwaran Raman [Mon, 28 Dec 2015 20:28:19 +0000 (20:28 +0000)]
Refactor inline costs analysis by removing the InlineCostAnalysis class
InlineCostAnalysis is an analysis pass without any need for it to be one.
Once it stops being an analysis pass, it doesn't maintain any useful state
and the member functions inside can be made free functions. NFC.
Manuel Jacob [Mon, 28 Dec 2015 20:14:05 +0000 (20:14 +0000)]
[RS4GC] Fix rematerialization of bitcast of bitcast.
Summary:
Previously, only the outer (last) bitcast was rematerialized, resulting in a
use of the unrelocated inner (first) bitcast after the statepoint. See the
test case for an example.
Implemented cost model for masked gather and scatter operations
The cost is calculated for all X86 targets. When gather/scatter instruction
is not supported we calculate the cost of scalar sequence.
Sanjay Patel [Mon, 28 Dec 2015 18:28:44 +0000 (18:28 +0000)]
Specify triple so 'make check' passes on darwin x86-64
The check lines were added with:
http://reviews.llvm.org/rL256458
http://reviews.llvm.org/rL256460
but on a darwin target, the output looks like:
## InlineAsm Start
rorq %rdi
## InlineAsm End
## InlineAsm Start
rorq %rsi
## InlineAsm End
leaq (%rsi,%rdi), %rax
retq
[X86] Better support for the MCU psABI (LLVM part)
This adds support for the MCU psABI in a way different from r251223 and r251224,
basically reverting most of these two patches. The problem with the approach
taken in r251223/4 is that it only handled libcalls that originated from the backend.
However, the mid-end also inserts quite a few libcalls and assumes these use the
platform's default calling convention.
The previous patch tried to insert inregs when necessary both in the FE and,
somewhat hackily, in the CG. Instead, we now define a new default calling convention
for the MCU, which doesn't use inreg marking at all, similarly to what x86-64 does.
Asaf Badouh [Mon, 28 Dec 2015 08:26:26 +0000 (08:26 +0000)]
[X86][AVX512] Lower broadcast sub vector to vector inrtrinsics
lower broadcast<type>x<vector> to shuffles.
there are two cases:
1.src is 128 bits and dest is 512 bits: in this case we will lower it to shuffle with imm = 0.
2.src is 256 bit and dest is 512 bits: in this case we will lower it to shuffle with imm = 01000100b (0x44) that way we will broadcast the 256bit source: ymm[0,1,2,3] => zmm[0,1,2,3,0,1,2,3] then it will mask it with the passthru value (in case it's mask op).
Chandler Carruth [Mon, 28 Dec 2015 01:54:18 +0000 (01:54 +0000)]
[lcg] Fix formatting errors found with clang-format, remove the now
optional '\brief' tag and reflow some comments based on the added
horizontal space. NFC.
Craig Topper [Sun, 27 Dec 2015 21:33:50 +0000 (21:33 +0000)]
[AVX512] Remove separate instruction and patterns for lowering ctlz_zero_undef. Change the operation for CTLZ_ZERO_UNDEF to Expand so SelectionDAG will convert them to CTLZ before lowering.
Craig Topper [Sun, 27 Dec 2015 21:33:47 +0000 (21:33 +0000)]
[SelectionDAG] Teach LegalizeVectorOps to not unroll CTLZ_ZERO_UNDEF and CTTZ_ZERO_UNDEF if the non-ZERO_UNDEF form is legal or custom. Will be used to simplify X86 code in a follow on commit.
Craig Topper [Sun, 27 Dec 2015 19:45:21 +0000 (19:45 +0000)]
[AVX512] Remove alternate data type versions of VALIGND, VALIGNQ, VMOVSHDUP and VMOVSLDUP. They don't have any tests and I don't think they can be selected. If they are truly needed they should be implemented with patterns against the normal instructions and not separate instructions.
Dan Liew [Sun, 27 Dec 2015 14:03:49 +0000 (14:03 +0000)]
[lit] Implement support of per test timeout in lit.
This should work with ShTest (executed externally or internally) and GTest
test formats.
To set the timeout a new option ``--timeout=`` has
been added which specifies the maximum run time of an individual test
in seconds. By default this 0 which causes no timeout to be enforced.
The timeout can also be set from a lit configuration file by modifying
the ``lit_config.maxIndividualTestTime`` property.
To implement a timeout we now require the psutil Python module if a
timeout is requested. This dependency is confined to the newly added
``lit.util.killProcessAndChildren()``. A note has been added into the
TODO document describing how we can remove the dependency on the
``pustil`` module in the future. It would be nice to remove this
immediately but that is a lot more work and Daniel Dunbar believes it is
better that we get a working implementation first and then improve it.
To avoid breaking the existing behaviour the psutil module will not be
imported if no timeout is requested.
The included testcases are derived from test cases provided by
Jonathan Roelofs which were in an previous attempt to add a per test
timeout to lit (http://reviews.llvm.org/D6584). Thanks Jonathan!
Igor Breger [Sun, 27 Dec 2015 13:56:16 +0000 (13:56 +0000)]
AVX512: Change VPMOVB2M DAG lowering , use CVT2MASK node instead TRUNCATE.
Fix TRUNCATE lowering vector to vector i1, use LSB and not MSB.
Implement VPMOVB/W/D/Q2M intrinsic.
Chandler Carruth [Sun, 27 Dec 2015 08:41:34 +0000 (08:41 +0000)]
[attrs] Extract the pure inference of function attributes into
a standalone pass.
There is no call graph or even interesting analysis for this part of
function attributes -- it is literally inferring attributes based on the
target library identification. As such, we can do it using a much
simpler module pass that just walks the declarations. This can also
happen much earlier in the pass pipeline which has benefits for any
number of other passes.
In the process, I've cleaned up one particular aspect of the logic which
was necessary in order to separate the two passes cleanly. It now counts
inferred attributes independently rather than just counting all the
inferred attributes as one, and the counts are more clearly explained.
The two test cases we had for this code path are both ... woefully
inadequate and copies of each other. I've kept the superset test and
updated it. We need more testing here, but I had to pick somewhere to
stop fixing everything broken I saw here.
Chandler Carruth [Sun, 27 Dec 2015 08:13:45 +0000 (08:13 +0000)]
[attrs] Split off the forced attributes utility into its own pass that
is (by default) run much earlier than FuncitonAttrs proper.
This allows forcing optnone or other widely impactful attributes. It is
also a bit simpler as the force attribute behavior needs no specific
iteration order.
I've added the pass into the default module pass pipeline and LTO pass
pipeline which mirrors where function attrs itself was being run.
Craig Topper [Sun, 27 Dec 2015 06:55:08 +0000 (06:55 +0000)]
[AVX-512] Remove alernate integer forms for VPERMILPS and VPERMILPD. There no tests for them and I don't see any way to select them anyway. If they are really needed they should be implemented as patterns and not full fledged instructions.
David Majnemer [Sun, 27 Dec 2015 06:07:26 +0000 (06:07 +0000)]
[X86, Win64] Use a frame pointer if pushf is emitted
A frame pointer must be used if stack pointer is modified after the
prologue. LLVM will emit pushf/popf if we need to save/restore the
FLAGS register, requiring us to have a frame pointer for the function.
There is a small twist: this sequence might exist in user code via
inline-assembly. For now, conservatively assume that such functions
require a frame pointer. For real world justification, please see
clang's implementation of __readeflags.
Chen Li [Sat, 26 Dec 2015 07:54:32 +0000 (07:54 +0000)]
[gc.statepoint] Change gc.statepoint intrinsic's return type to token type instead of i32 type
Summary: This patch changes gc.statepoint intrinsic's return type to token type instead of i32 type. Using token types could prevent LLVM to merge different gc.statepoint nodes into PHI nodes and cause further problems with gc relocations. The patch also changes the way on how gc.relocate and gc.result look for their corresponding gc.statepoint on unwind path. The current implementation uses the selector value extracted from a { i8*, i32 } landingpad as a hook to find the gc.statepoint, while the patch directly uses a token type landingpad (http://reviews.llvm.org/D15405) to find the gc.statepoint.
Craig Topper [Sat, 26 Dec 2015 04:58:05 +0000 (04:58 +0000)]
Add test case for r256433. "[X86] Fix shuffle decoding for variable VPERMIL to be tolerant of the Constant type not matching due to folding in the constant pool and to get VPERMILPD correct."
Craig Topper [Sat, 26 Dec 2015 04:50:07 +0000 (04:50 +0000)]
[X86] Fix shuffle decoding for variable VPERMIL to be tolerant of the Constant type not matching due to folding in the constant pool and to get VPERMILPD correct.