Nick Lewycky [Wed, 4 Jun 2014 07:45:54 +0000 (07:45 +0000)]
Fix a use of uninitialized value. OldCC is set when IsCmpZero || IsSwapped and read when ShouldUpdateCC || IsSwapped, and ShouldUpdateCC is independent. Fixes PR19932, but no test since I wasn't able to get any symptoms to appear, not even with valgrind and the testcase from the PR. It's clear what happened from inspection of the code.
Andrew Trick [Wed, 4 Jun 2014 07:06:27 +0000 (07:06 +0000)]
Add a subtarget hook: enablePostMachineScheduler.
As requested by AArch64 subtargets.
Note that this will have no effect until the
AArch64 target actually enables the pass like this:
substitutePass(&PostRASchedulerID, &PostMachineSchedulerID);
As soon as armv7 switches over, PostMachineScheduler will become the
default postRA scheduler, so this won't be necessary any more.
Targets using the old postRA schedule would then do:
substitutePass(&PostMachineSchedulerID, &PostRASchedulerID);
Andrew Trick [Wed, 4 Jun 2014 07:06:18 +0000 (07:06 +0000)]
Move GenericScheduler and PostGenericScheduler into a header.
These were not exposed previously because I didn't want out-of-tree
targets to be too dependent on their internals. They can be reused for
a very wide variety of processors with casual scheduling needs without
exposing the classes by instead using hooks defined in
MachineSchedPolicy (we can add more if needed). When targets are more
aggressively tuned or want to provide custom heuristics, they can
define their own MachineSchedStrategy. I tend to think this is better
once you start customizing heuristics because you can copy over only
what you need. I don't think that layering heuristics generally works
well.
However, Arch64 targets now want to reuse the Generic scheduling logic
but also provide extensions. I don't see much harm in exposing the
Generic scheduling classes with a major caveat: these scheduling
strategies may change in the future without validating performance on
less mainstream processors. If you want to be immune from changes,
just define your own MachineSchedStrategy.
Justin Bogner [Wed, 4 Jun 2014 06:29:38 +0000 (06:29 +0000)]
docs: Remove documentation for legacy PGO options
Late last year r191835 removed a largely unmaintained legacy PGO
infrastructure, but some of the docs were missed. Since these docs are
for things that don't actually exist anymore, they should be removed.
Alp Toker [Wed, 4 Jun 2014 04:11:12 +0000 (04:11 +0000)]
GraphWriter: try gv before xdg-open
Avoid changing behaviour for everyone who's used to the traditional ghostview
UI, especially since it knows how to stay in the foreground unlike xdg-open.
Alp Toker [Wed, 4 Jun 2014 03:21:38 +0000 (03:21 +0000)]
config.h: fix layering and don't duplicate definitions
Also correct the llvm-config.h header guard so it doesn't depend on 'CONFIG_H'
which is commonly defined in external projects and caused trouble for
embedders.
In future llvm/Config/llvm-config.h will be installed, but not
the private llvm/Config/config.h header.
David Blaikie [Wed, 4 Jun 2014 01:30:59 +0000 (01:30 +0000)]
DebugInfo: Partial revert r209984 due to more cases where abstract DbgVariables do not have associated DIEs.
Along with a test case to demonstrate that due to inlining order there
are cases where abstract variable DIEs are not constructed since the
abstract subprogram was built due to a previous inlining that optimized
away those variables. This produces incorrect debug info (the 'missing'
abstract variable causes the inlined instance of that variable to be
emitted with a full description (name, line, file) rather than
referencing the abstract origin), but this commit at least ensures that
it doesn't crash...
Tim Northover [Tue, 3 Jun 2014 13:54:53 +0000 (13:54 +0000)]
AArch64: mark small types (i1, i8, i16) as promoted
This means the output of LowerFormalArguments returns a lowered
SDValue with the correct type (expected in SelectionDAGBuilder).
Without this, an assertion under a DEBUG macro triggers when those
types are passed on the stack.
Nick Lewycky [Tue, 3 Jun 2014 04:25:36 +0000 (04:25 +0000)]
Ignore line numbers on debug intrinsics. Add an assert to ensure that we aren't emitting line number zero, the .gcno format uses this to indicate that the next field is a filename.
Allow alias to point to an arbitrary ConstantExpr.
This patch changes GlobalAlias to point to an arbitrary ConstantExpr and it is
up to MC (or the system assembler) to decide if that expression is valid or not.
This reduces our ability to diagnose invalid uses and how early we can spot
them, but it also lets us do things like
@test5 = alias inttoptr(i32 sub (i32 ptrtoint (i32* @test2 to i32),
i32 ptrtoint (i32* @bar to i32)) to i32*)
An important implication of this patch is that the notion of aliased global
doesn't exist any more. The alias has to encode the information needed to
access it in its metadata (linkage, visibility, type, etc).
Another consequence to notice is that getSection has to return a "const char *".
It could return a NullTerminatedStringRef if there was such a thing, but when
that was proposed the decision was to just uses "const char*" for that.
The code was actually correct. Sorry for the confusion. I have expanded the
comment saying why the analysis is valid to avoid me misunderstaning it
again in the future.
Alexey Samsonov [Mon, 2 Jun 2014 18:08:27 +0000 (18:08 +0000)]
Remove sanitizer blacklist from ASan/TSan/MSan function passes.
Instrumentation passes now use attributes
address_safety/thread_safety/memory_safety which are added by Clang frontend.
Clang parses the blacklist file and adds the attributes accordingly.
Currently blacklist is still used in ASan module pass to disable instrumentation
for certain global variables. We should fix this as well by collecting the
set of globals we're going to instrument in Clang and passing it to ASan
in metadata (as we already do for dynamically-initialized globals and init-order
checking).
This change also removes -tsan-blacklist and -msan-blacklist LLVM commandline
flags in favor of -fsanitize-blacklist= Clang flag.
Dinesh Dwivedi [Mon, 2 Jun 2014 07:24:36 +0000 (07:24 +0000)]
Added inst combine transforms for single bit tests from Chris's note
if ((x & C) == 0) x |= C becomes x |= C
if ((x & C) != 0) x ^= C becomes x &= ~C
if ((x & C) == 0) x ^= C becomes x |= C
if ((x & C) != 0) x &= ~C becomes x &= ~C
if ((x & C) == 0) x &= ~C becomes nothing
Alp Toker [Mon, 2 Jun 2014 01:40:04 +0000 (01:40 +0000)]
GraphWriter: detect graph viewer programs at runtime
Replace the crufty build-time configure checks for program paths with
equivalent runtime logic.
This lets users install graphing tools as needed without having to reconfigure
and rebuild LLVM, while eliminating a long chain of inappropriate compile
dependencies that included GUI programs and the windowing system.
Additional features:
* Support the OS X 'open' command to view graphs generated by any of the
Graphviz utilities. This is an alternative to the Graphviz OS X UI which is
no longer available on Mountain Lion.
* Produce informative log output upon failure to indicate which programs can
be installed to view graphs.
Ping me if this doesn't work for your particular environment.
Since we cannot yet use variadic templates, add a specialisation for
6-parameters to format. This is motivated by a need for the additional
parameter for formatting information for an unwind decoder for Windows on ARM.
Introduce the support structures necessary to deal with the Windows ARM EH data.
These definitions are extremely aggressive about assertions to aid future use
for generation of the entries and subsequent decoding.
The names for the various fields are meant to reflect the names used by the
Visual Studio toolchain to aid communication.
Due to the complexity in reading a few of the values, there are a couple of
additional utility functions to decode the information.
In general, there are two ways to encode the unwinding information:
- packed, which places the data inline into the
_IMAGE_ARM_RUNTIME_FUNCTION_ENTRY structure.
- unpacked, which places the data into auxiliary structures placed into the
.xdata section.
The set of structures allow reading of data in either encoding, with the minor
caveat that epilogue scopes need to be decoded manually by constructing the
structure from the data returned by the RuntimeFunction structure.
These definitions are meant for read-only access at the current point as the
first use of them will be to decode the exception information.
David Blaikie [Sun, 1 Jun 2014 03:38:13 +0000 (03:38 +0000)]
DebugInfo: Assert that DbgVariables have associated DIEs
This was previously committed in r209680 and reverted in r209683 after
it caused sanitizer builds to crash.
The issue seems to be that the DebugLoc associated with dbg.value IR
intrinsics isn't necessarily accurate. Instead, we duplicate the
DIVariables and add an InlinedAt field to them to record their
location.
We were using this InlinedAt field to compute the LexicalScope for the
variable, but not using it in the abstract DbgVariable construction and
mapping. This resulted in a formal parameter to the current concrete
function, correctly having no InlinedAt information, but incorrectly
having a DebugLoc that described an inlined location within the
function... thus an abstract DbgVariable was created for the variable,
but its DIE was never constructed (since the LexicalScope had no such
variable). This DbgVariable was silently ignored (by testing for a
non-null DIE on the abstract DbgVariable).
So, fix this by using the right scoping information when constructing
abstract DbgVariables.
In the long run, I suspect we want to undo the work that added this
second kind of location tracking and fix the places where the DebugLoc
propagation on the dbg.value intrinsic fails. This will shrink debug
info (by not duplicating DIVariables), make it more efficient (by not
having to construct new DIVariable metadata nodes to try to map back to
a single variable), and benefit all instructions.
But perhaps there are insurmountable issues with DebugLoc quality that
I'm unaware of... I just don't know how we can't /just keep the DebugLoc
from the dbg.declare to the dbg.values and never get this wrong/.
Adam Nemet [Sat, 31 May 2014 16:23:20 +0000 (16:23 +0000)]
[SelectionDAG] Force cycle detection in AssignTopologicalOrder before aborting
DAG cycle detection is only enabled with ENABLE_EXPENSIVE_CHECKS. However we
can run it just before we would crash in order to provide more informative
diagnostics.
Now in addition to the "Overran sorted position" message we also get the Node
printed if a cycle was detected.
Tested by building several configs: Debug+Assert, Debug+Assert+Check (this is
ENABLE_EXPENSIVE_CHECKS), Release+Assert and Release. Also tried that the
AssignTopologicalOrder assert produces the expected results.
Benjamin Kramer [Sat, 31 May 2014 15:01:54 +0000 (15:01 +0000)]
[Reassociate] Similar to "X + -X" -> "0", added code to handle "X + ~X" -> "-1".
Handle "X + ~X" -> "-1" in the function Value *Reassociate::OptimizeAdd(Instruction *I, SmallVectorImpl<ValueEntry> &Ops);
This patch implements:
TODO: We could handle "X + ~X" -> "-1" if we wanted, since "-X = ~X+1".
Simon Atanasyan [Sat, 31 May 2014 04:51:07 +0000 (04:51 +0000)]
[yaml2obj] Add new command line option `-docnum`.
Input YAML file might contain multiple object file definitions.
New option `-docnum` allows to specify an ordinal number (starting from 1)
of definition used for an object file generation.
Andrea Di Biagio [Fri, 30 May 2014 23:17:53 +0000 (23:17 +0000)]
[X86] Add two combine rules to simplify dag nodes introduced during type legalization when promoting nodes with illegal vector type.
This patch teaches the backend how to simplify/canonicalize dag node
sequences normally introduced by the backend when promoting certain dag nodes
with illegal vector type.
This patch adds two new combine rules:
1) fold (shuffle (bitcast (BINOP A, B)), Undef, <Mask>) ->
(shuffle (BINOP (bitcast A), (bitcast B)), Undef, <Mask>)
Both rules are only triggered on the type-legalized DAG.
In particular, rule 1. is a target specific combine rule that attempts
to sink a bitconvert into the operands of a binary operation.
Rule 2. is a target independet rule that attempts to move a shuffle
immediately after a binary operation.
Convert a vselect into a concat_vector if possible
Summary:
If both vector args to vselect are concat_vectors and the condition is
constant and picks half a vector from each argument, convert the vselect
into a concat_vectors.
Added a test.
The ConvertSelectToConcatVector is assuming it doesn't get vselects with
arguments of, for example, <undef, undef, true, true>. Those get taken
care of in the checks above its call.
Summary:
Separate the check for blend shuffle_vector masks into isBlendMask.
This function will also be used to check if a vector shuffle is legal. No
change in functionality was intended, but we ended up improving codegen on
two tests, which were being (more) optimized only if the resulting shuffle
was legal.
Logan Chien [Fri, 30 May 2014 16:48:56 +0000 (16:48 +0000)]
Fix MIPS exception personality encoding.
For MIPS, we have to encode the personality routine with
an indirect pointer to absptr; otherwise, some link warning
warning will be raised, and the program might crash in some
early MIPS Android device.
Tim Northover [Fri, 30 May 2014 13:23:06 +0000 (13:23 +0000)]
ARM: use AAPCS-style prologues for embedded MachO.
Darwin prologues save their GPRs in two stages: a narrow push of r0-r7 & lr,
followed by a wide push of the remaining registers if there are any. AAPCS uses
a single push.w instruction.
It turns out that, on average, enough registers get pushed that code is smaller
in the AAPCS prologue, which is a nice property for M-class programmers. They
also have other options available for back-traces, so can hopefully deal with
the fact that FP & LR aren't adjacent in memory.