Oliver Stannard [Thu, 23 Oct 2014 08:52:58 +0000 (08:52 +0000)]
[Thumb2] Improve disassembly of memory hints
Currently, the ARM disassembler will disassemble the Thumb2 memory hint
instructions (PLD, PLDW and PLI), even for targets which do not have
these instructions. This patch adds the required checks to the
disassmebler.
Frederic Riss [Thu, 23 Oct 2014 04:08:42 +0000 (04:08 +0000)]
Assert that ValueHandleBase::ValueIsRAUWd doesn't change the tracked Value type.
This invariant is enforced in Value::replaceAllUsesWith, thus it seems
logical to apply it also to ValueHandles. This commit fixes InstCombine
to not trigger the assertion during the removal of constant bitcasts in
call instructions.
Frederic Riss [Thu, 23 Oct 2014 04:08:38 +0000 (04:08 +0000)]
Modernize doxygen comments in Support/Dwarf.h
In post-commit review of r219442, Rafael pointed out that the comment style
of the newly introduced helper didn't follow LLVM's coding standard.
Modernize the whole file to the new standards.
David Blaikie [Thu, 23 Oct 2014 00:16:05 +0000 (00:16 +0000)]
[DebugInfo] Sink DwarfDebug::addCurrentFnArgument down into DwarfFile.
Variable handling will be sunk into DwarfFile so that abstract variables
and the like can be shared across multiple CUs (to handle cross-CU
inlining, for example).
David Blaikie [Thu, 23 Oct 2014 00:06:27 +0000 (00:06 +0000)]
[DebugInfo] Remove LexicalScopes::isCurrentFunctionScope and CSE a use of LexicalScopes::getCurrentFunctionScope
Now that we're sure the only root (non-abstract) scope is the current
function scope, there's no need for isCurrentFunctionScope, the property
can be tested directly instead.
Lang Hames [Wed, 22 Oct 2014 23:18:42 +0000 (23:18 +0000)]
[MCJIT] Make repeat calls to MCJIT::getPointerToFunction for declarations safe.
MCJIT::getPointerForFunction adds the resulting address to the global mapping.
This should be done via updateGlobalMapping rather than addGlobalMapping, since
the latter asserts if a mapping already exists.
MCJIT::getPointerToFunction is actually deprecated - hopefully we can remove it
(or more likely re-task it) entirely soon. In the mean time it should at least
work as advertised.
Derek Schuff [Wed, 22 Oct 2014 22:38:06 +0000 (22:38 +0000)]
[MC] Attach labels to existing fragments instead of using a separate fragment
Summary:
Currently when emitting a label, a new data fragment is created for it if the
current fragment isn't a data fragment.
This change instead enqueues the label and attaches it to the next fragment
(e.g. created for the next instruction) if possible.
When bundle alignment is not enabled, this has no functionality change (it
just results in fewer extra fragments being created). For bundle alignment,
previously labels would point to the beginning of the bundle padding instead
of the beginning of the emitted instruction. This was not only less efficient
(e.g. jumping to the nops instead of past them) but also led to miscalculation
of the address of the GOT (since MC uses a label difference rather than
emitting a "." symbol).
This would cause the flag to appear in the output of "llvm-config --cppflags",
which should contain only preprocessor flags. The -gsplit-dwarf flag in
particular can cause problems with certain downstream users such as cgo.
Justin Bogner [Wed, 22 Oct 2014 18:18:54 +0000 (18:18 +0000)]
test: Make this test runnable in directories with @ in their names
Jenkins likes to use directories with names involving the '@'
character, which breaks the sed expression in this test. Switch to use
'|' on the assumption that it's less likely to show up in a path.
Bill Schmidt [Wed, 22 Oct 2014 16:58:20 +0000 (16:58 +0000)]
[PATCH] Support select-cc for VSFRC when VSX is enabled
A previous patch enabled SELECT_VSRC and SELECT_CC_VSRC for VSX to
handle <2 x double> cases. This patch adds SELECT_VSFRC and
SELECT_CC_VSFRC to allow use of all 64 vector-scalar registers for the
f64 type when VSX is enabled. The changes are analogous to those in
the previous patch. I've added a new variant to vsx.ll to test the
code generation.
(I also cleaned up a little formatting in PPCInstrVSX.td from the
previous patch.)
Colin LeMahieu [Wed, 22 Oct 2014 16:49:14 +0000 (16:49 +0000)]
[Hexagon] Adding basic disassembler.
Marking all instructions as CodeGenOnly since encoding bits are not set yet.
http://reviews.llvm.org/D5829?vs=on&id=15023&whitespace=ignore-all#toc
Philip Reames [Wed, 22 Oct 2014 16:37:13 +0000 (16:37 +0000)]
Preserving 'nonnull' metadata in SimplifyCFG
When we hoist two loads above an if, we can preserve the nonnull metadata. We could also do the same for sinking them, but we appear to not handle metadata at all in that case.
Sanjay Patel [Wed, 22 Oct 2014 15:29:23 +0000 (15:29 +0000)]
Shrinkify libcalls: use float versions of double libm functions with fast-math (bug 17850)
When a call to a double-precision libm function has fast-math semantics
(via function attribute for now because there is no IR-level FMF on calls),
we can avoid fpext/fptrunc operations and use the float version of the call
if the input and output are both float.
We already do this optimization using a command-line option; this patch just
adds the ability for fast-math to use the existing functionality.
I moved the cl::opt from InstructionCombining into SimplifyLibCalls because
it's only ever used internally to that class.
Modified the existing test cases to use the unsafe-fp-math attribute rather
than repeating all tests.
This patch should solve: http://llvm.org/bugs/show_bug.cgi?id=17850
Diego Novillo [Wed, 22 Oct 2014 13:36:35 +0000 (13:36 +0000)]
Change error to warning when a profile cannot be found.
When the profile for a function cannot be applied, we use to emit an
error. This seems extreme. The compiler can continue, it's just that the
optimization opportunities won't include profile information.
Bill Schmidt [Wed, 22 Oct 2014 13:13:40 +0000 (13:13 +0000)]
[PowerPC] Support select-cc for VSX
The tests test/CodeGen/Generic/select-cc.ll and
test/CodeGen/PowerPC/select-cc.ll both fail with VSX enabled. The
problem is that the lowering logic for the SELECT and SELECT_CC
operations doesn't currently support the VSX registers. This patch
fixes that.
In lib/Target/PowerPC/PPCInstrInfo.td, we have pseudos to handle this
for other register classes. Similar pseudos are added in
PPCInstrVSX.td (they must be there, because the "vsrc" register class
definition appears there) for the VSRC register class. The
SELECT_VSRC pseudo is then used in pattern matching for SELECT_CC.
The rest of the patch just adds logic for SELECT_VSRC wherever similar
logic appears for SELECT_VRRC.
There are no new test cases because the existing tests above test
this, along with a variant in test/CodeGen/PowerPC/vsx.ll.
After discussion with Hal, a future patch will add similar _VSFRC
variants to override f64 type handling (currently using F8RC).
Aaron Ballman [Wed, 22 Oct 2014 13:09:43 +0000 (13:09 +0000)]
Fixing a -Wsign-compare warning; NFC.
I think it might make sense to make COFF::MaxNumberOfSections16 be a uint32_t, however, that may have wider-reaching implications in other projects, which is why I did not change that declaration.
Diego Novillo [Wed, 22 Oct 2014 12:59:00 +0000 (12:59 +0000)]
Support using sample profiles with partial debug info.
Summary:
When using a profile, we used to require the use -gmlt so that we could
get access to the line locations. This is used to match line numbers in
the input profile to the line numbers in the function's IR.
But this is actually not necessary. The driver can provide source
location tracking without the emission of debug information. In these
cases, the annotation 'llvm.dbg.cu' is missing from the IR, but the
actual line location annotations are still present.
This patch adds a new way of looking for the start of the current
function. Instead of looking through the compile units in llvm.dbg.cu,
we can walk up the scope for the first instruction in the function with
a debug loc. If that describes the function, we use it. Otherwise, we
keep looking until we find one.
If no such instruction is found, we then give up and produce an error.
gcc's (4.7, I think) -Wcomment warning is not "as smart" as clang's and
warns even if the line right after the backslash-newline sequence only has
a line comment that starts at the beginning of the line.
Evgeniy Stepanov [Wed, 22 Oct 2014 00:12:40 +0000 (00:12 +0000)]
[msan] Handle param-tls overflow.
ParamTLS (shadow for function arguments) is of limited size. This change
makes all arguments that do not fit unpoisoned, and avoids writing
past the end of a TLS buffer.
Lang Hames [Tue, 21 Oct 2014 23:41:15 +0000 (23:41 +0000)]
[MCJIT] Defer application of AArch64 MachO GOT relocations until resolve time.
On AArch64, GOT references are page relative (ADRP + LDR), so they can't be
applied until we know exactly where, within a page, the GOT entry will be in
the target address space.
JF Bastien [Tue, 21 Oct 2014 23:18:21 +0000 (23:18 +0000)]
LTO: respect command-line options that disable vectorization.
Summary: Patches 202051 and 208013 added calls to LTO's PassManager which unconditionally add LoopVectorizePass and SLPVectorizerPass instead of following the logic in PassManagerBuilder::populateModulePassManager and honoring the -vectorize-loops -run-slp-after-loop-vectorization flags.
Matt Arsenault [Tue, 21 Oct 2014 23:00:20 +0000 (23:00 +0000)]
Add minnum / maxnum intrinsics
These are named following the IEEE-754 names for these
functions, rather than the libm fmin / fmax to avoid
possible ambiguities. Some languages may implement something
resembling fmin / fmax which return NaN if either operand is
to propagate errors. These implement the IEEE-754 semantics
of returning the other operand if either is a NaN representing
missing data.
Enumerate `MDNode`'s operands *before* the node itself, so that the
reader requires less RAUW. Although this will cause different code
paths to be hit in the reader, this should effectively be no
functionality change.
Reid Kleckner [Tue, 21 Oct 2014 21:15:45 +0000 (21:15 +0000)]
GCC has supported C++11 ref-qualifiers since 4.8.1
This requires incorporating __GNUC_PATCHLEVEL__ into our prerequisite
check, and renaming our __GNUC_PREREQ to LLVM_GNUC_PREREQ, since it is
now functionally different.
Philip Reames [Tue, 21 Oct 2014 21:02:19 +0000 (21:02 +0000)]
Teach combineMetadata how to merge 'nonnull' metadata.
combineMetadata is used when merging two instructions into one. This change teaches it how to merge 'nonnull' - i.e. only preserve it on the new instruction if it's set on both sources. This isn't actually used yet since I haven't adjusted any of the call sites to pass in nonnull as a 'known metadata'.
Philip Reames [Tue, 21 Oct 2014 21:00:03 +0000 (21:00 +0000)]
Preserve 'nonnull' when changing type of the load.
When changing the type of a load in Chandler's recent InstCombine changes, we can preserve the new 'nonnull' metadata.
I considered adding an assert since 'nonnull' is only valid on pointer types, but casting a pointer to a non-pointer would involve more than a bitcast anyways. If someone extends this transform to handle more than bitcasts, the verifier will report the malformed IR, so a separate assertion isn't needed. Also, the fpmath flags would have the same problem.
[PBQP] Teach PassConfig to tell if the default register allocator is used.
This enables targets to adapt their pass pipeline to the register
allocator in use. For example, with the AArch64 backend, using PBQP
with the cortex-a57, the FPLoadBalancing pass is no longer necessary.
David Majnemer [Tue, 21 Oct 2014 19:51:55 +0000 (19:51 +0000)]
InstCombine: Simplify FoldICmpCstShrCst
This function was complicated by the fact that it tried to perform
canonicalizations that were already preformed by InstSimplify. Remove
this extra code and move the tests over to InstSimplify. Add asserts to
make sure our preconditions hold before we make any assumptions.
[PBQP] Check for out of bound access in DEBUG builds
It is just too easy to use a virtual register intead of a NodeId without a
compiler warning. This does not fix the fundamental problem, i.e. both
have the same underlying types, but increases the likelyhood to detect it.
With VSX enabled, test/CodeGen/PowerPC/recipest.ll exposes a bug in
the FMA mutation pass. If we have a situation where a killed product
register is the same register as the FMA target, such as:
then the substitution makes no sense. We end up getting a crash when
we try to extend the interval associated with the killed product
register, as there is already a live range for %vreg5 there. This
patch just disables the mutation under those circumstances.
Since recipest.ll generates different code with VMX enabled, I've
modified that test to use -mattr=-vsx. I've borrowed the code from
that test that exposed the bug and placed it in fma-mutate.ll, where
it tests several mutation opportunities including the "bad" one.
Chandler Carruth [Tue, 21 Oct 2014 09:00:40 +0000 (09:00 +0000)]
Teach the load analysis to allow finding available values which require
inttoptr or ptrtoint cast provided there is datalayout available.
Eventually, the datalayout can just be required but in practice it will
always be there today.
To go with the ability to expose available values requiring a ptrtoint
or inttoptr cast, helpers are added to perform one of these three casts.
These smarts are necessary to finish canonicalizing loads and stores to
the operational type requirements without regressing fundamental
combines.
I've added some test cases. These should actually improve as the load
combining and store combining improves, but they may fundamentally be
highlighting some missing combines for select in addition to exercising
the specific added logic to load analysis.
Paul Robinson [Tue, 21 Oct 2014 01:00:55 +0000 (01:00 +0000)]
Do not attribute static allocas to the call site's DebugLoc.
When functions are inlined, instructions without debug information are
attributed to the call site's DebugLoc. After inlining, inlined static
allocas are moved to the caller's entry block, adjacent to the caller's
original static alloca instructions. By retaining the call site's
DebugLoc, these instructions could cause instructions that were
subsequently inserted at the entry block to pick up the same DebugLoc.