We extended the .section syntax to allow multiple sections with the
same name but different comdats, but currently we don't make sure that
the output section has that comdat symbol.
That happens to work with the code llc produces currently because it looks like
Bill Schmidt [Thu, 5 Jun 2014 22:57:38 +0000 (22:57 +0000)]
[PPC64LE] Add test case for r210282 commit
Chandler correctly pointed out that I need an LLVM IR test for
r210282, which modified the vperm -> shuffle transform for little
endian PowerPC. This patch provides that test.
Remove X86Subtarget from the X86FrameLowering constructor since
we can just pass in the values we already know and we're not
caching the subtarget anymore.
Jingyue Wu [Thu, 5 Jun 2014 22:07:33 +0000 (22:07 +0000)]
Fixed several correctness issues in SeparateConstOffsetFromGEP
Most issues are on mishandling s/zext.
Fixes:
1. When rebuilding new indices, s/zext should be distributed to
sub-expressions. e.g., sext(a +nsw (b +nsw 5)) = sext(a) + sext(b) + 5 but not
sext(a + b) + 5. This also affects the logic of recursively looking for a
constant offset, we need to include s/zext into the context of the searching.
2. Function find should return the bitwidth of the constant offset instead of
always sign-extending it to i64.
3. Stop shortcutting zext'ed GEP indices. LLVM conceptually sign-extends GEP
indices to pointer-size before computing the address. Therefore, gep base,
zext(a + b) != gep base, a + b
Improvements:
1. Add an optimization for splitting sext(a + b): if a + b is proven
non-negative (e.g., used as an index of an inbound GEP) and one of a, b is
non-negative, sext(a + b) = sext(a) + sext(b)
2. Function Distributable checks whether both sext and zext can be distributed
to operands of a binary operator. This helps us split zext(sext(a + b)) to
zext(sext(a) + zext(sext(b)) when a + b does not signed or unsigned overflow.
Refactoring:
Merge some common logic of handling add/sub/or in find.
Testing:
Add many tests in split-gep.ll and split-gep-and-gvn.ll to verify the changes
we made.
Kevin Enderby [Thu, 5 Jun 2014 21:21:57 +0000 (21:21 +0000)]
Add "-format darwin" to llvm-nm to be like darwin's nm(1) -m output.
This is a first step in seeing if it is possible to make llvm-nm produce
the same output as darwin's nm(1). Darwin's default format is bsd but its
-m output prints the longer Mach-O specific details. For now I added the
"-format darwin" to do this (whos name may need to change in the future).
As there are other Mach-O specific flags to nm(1) which I'm hoping to add some
how in the future. But I wanted to see if I could get the correct output for
-m flag using llvm-nm and the libObject interfaces.
I got this working but would love to hear what others think about this approach
to getting object/format specific details printed with llvm-nm.
Bill Schmidt [Thu, 5 Jun 2014 19:46:04 +0000 (19:46 +0000)]
[PPC64LE] Correct vperm -> shuffle transform for little endian
As discussed in cfe commit r210279, the correct little-endian
semantics for the vec_perm Altivec interfaces are implemented by
reversing the order of the input vectors and complementing the permute
control vector. This converts the desired permute from little endian
element order into the big endian element order that the underlying
PowerPC vperm instruction uses. This is represented with a
ppc_altivec_vperm intrinsic function.
The instruction combining pass contains code to convert a
ppc_altivec_vperm intrinsic into a vector shuffle operation when the
intrinsic has a permute control vector (mask) that is a constant.
However, the vector shuffle operation assumes that vector elements are
in natural order for their endianness, so for little endian code we
will get the wrong result with the existing transformation.
This patch reverses the semantic change to vec_perm that was performed
in altivec.h by once again swapping the input operands and
complementing the permute control vector, returning the element
ordering to little endian.
The correctness of this code is tested by the new perm.c test added in
a previous patch, and by other tests in the test suite that fail
without this patch.
Tom Roeder [Thu, 5 Jun 2014 19:29:43 +0000 (19:29 +0000)]
Add a new attribute called 'jumptable' that creates jump-instruction tables for functions marked with this attribute.
It includes a pass that rewrites all indirect calls to jumptable functions to pass through these tables.
This also adds backend support for generating the jump-instruction tables on ARM and X86.
Note that since the jumptable attribute creates a second function pointer for a
function, any function marked with jumptable must also be marked with unnamed_addr.
Bill Schmidt [Thu, 5 Jun 2014 16:21:13 +0000 (16:21 +0000)]
[PPC64LE] Temporarily disable VSX support in little-endian mode
This is a preliminary patch for the PowerPC64LE support. In stage 1
of the vector support, we will support the VMX (Altivec) instruction
set, but will not yet support the VSX instructions. This is merely a
staging issue to provide functional vector support as soon as
possible.
Ulrich Weigand [Thu, 5 Jun 2014 14:20:10 +0000 (14:20 +0000)]
[SystemZ] Do not install IfConverter pass at -O0
When not optimizing, do not run the IfConverter pass, this makes
debugging more difficult (and causes a testsuite failure in
DebugInfo/unconditional-branch.ll).
David Blaikie [Thu, 5 Jun 2014 00:51:35 +0000 (00:51 +0000)]
PR19388: DebugInfo: Emit dead arguments in their originally declared order.
Unused arguments were not being added to the argument list, but instead
treated as arbitrary scope variables. This meant they weren't carefully
added in the original argument order.
In this particular example, though, it turns out the argument is only
/mostly/ unused (well, actually it's entirely used, but in a specific
way). It's a struct that, due to ABI reasons, is decomposed into chunks
(exactly one chunk, since it has one member) and then passed. Since only
one of those chunks is used (SROA, etc, kill the original reconstitution
code) we don't have a location to describe the whole variable.
In this particular case, since the struct consists of just the one int,
once we have partial location information, this should have a location
that describes the entire variable (since the piece is the entirety of
the object).
And at some point we'll need to describe the location of even /entirely/
unused arguments so that they can at least be printed on function entry.
David Blaikie [Wed, 4 Jun 2014 23:50:52 +0000 (23:50 +0000)]
DebugInfo: Reapply r209984 (reverted in r210143), asserting that abstract DbgVariables have DIEs.
Abstract variables within abstract scopes that are entirely optimized
away in their first inlining are omitted because their scope is not
present so the variable is never created. Instead, we should ensure the
scope is created so the variable can be added, even if it's been
optimized away in its first inlining.
This fixes the incorrect debug info in missing-abstract-variable.ll
(added in r210143) and passes an asserts self-hosting build, so
hopefully there's not more of these issues left behind... *fingers
crossed*.
Yaron Keren [Wed, 4 Jun 2014 17:35:28 +0000 (17:35 +0000)]
Two small enhancements for the JIT.
When JITting a large project such as Boost it's quite hard to figure out the problematic inline asm without debug location. This patch provides debug location printout before the JIT aborts due to inline asm. printDebugLoc() was exposed from MachineInstr.cpp and reused here.
If the JIT run with debug info, don't bomb on DBG_VALUE but ignore them.
Add support to llvm-readobj to decode Windows ARM Exception Handling data. This
uses the previously added datastructures to decode the information into a format
that can be used by tests. This is a necessary step to add support for emitting
Windows on ARM exception handling information.
A fair amount of formatting inspiration is drawn from the Win64 EH printer as
well as the ARM EHABI printer. This allows for a reasonably thorough look into
the encoded data.
This is purely a documentation/whitespace cleanup for the format support
functions.
The current style does not duplicate the function/class names in the
documentation; conform to this style.
Additionally, there was a large amount of duplication of comments that added no
real value. Use block comments for the related sets of functions which are used
for type deduction and parameter container classes.
Support: add additional comment for ARM EH structure
Replicate the fact that ARM::WinEH::RuntimeFunction purposefully does not merge
functions to accommodate raw data access use cases in tools such as readobj.
Pointed out by Renato during post-commit review.
InstCombine: Improvement to check if signed addition overflows.
This patch implements two things:
1. If we know one number is positive and another is negative, we return true as
signed addition of two opposite signed numbers will never overflow.
2. Implemented TODO : If one of the operands only has one non-zero bit, and if
the other operand has a known-zero bit in a more significant place than it
(not including the sign bit) the ripple may go up to and fill the zero, but
won't change the sign. e.x - (x & ~4) + 1
Nick Lewycky [Wed, 4 Jun 2014 07:45:54 +0000 (07:45 +0000)]
Fix a use of uninitialized value. OldCC is set when IsCmpZero || IsSwapped and read when ShouldUpdateCC || IsSwapped, and ShouldUpdateCC is independent. Fixes PR19932, but no test since I wasn't able to get any symptoms to appear, not even with valgrind and the testcase from the PR. It's clear what happened from inspection of the code.
Andrew Trick [Wed, 4 Jun 2014 07:06:27 +0000 (07:06 +0000)]
Add a subtarget hook: enablePostMachineScheduler.
As requested by AArch64 subtargets.
Note that this will have no effect until the
AArch64 target actually enables the pass like this:
substitutePass(&PostRASchedulerID, &PostMachineSchedulerID);
As soon as armv7 switches over, PostMachineScheduler will become the
default postRA scheduler, so this won't be necessary any more.
Targets using the old postRA schedule would then do:
substitutePass(&PostMachineSchedulerID, &PostRASchedulerID);
Andrew Trick [Wed, 4 Jun 2014 07:06:18 +0000 (07:06 +0000)]
Move GenericScheduler and PostGenericScheduler into a header.
These were not exposed previously because I didn't want out-of-tree
targets to be too dependent on their internals. They can be reused for
a very wide variety of processors with casual scheduling needs without
exposing the classes by instead using hooks defined in
MachineSchedPolicy (we can add more if needed). When targets are more
aggressively tuned or want to provide custom heuristics, they can
define their own MachineSchedStrategy. I tend to think this is better
once you start customizing heuristics because you can copy over only
what you need. I don't think that layering heuristics generally works
well.
However, Arch64 targets now want to reuse the Generic scheduling logic
but also provide extensions. I don't see much harm in exposing the
Generic scheduling classes with a major caveat: these scheduling
strategies may change in the future without validating performance on
less mainstream processors. If you want to be immune from changes,
just define your own MachineSchedStrategy.
Justin Bogner [Wed, 4 Jun 2014 06:29:38 +0000 (06:29 +0000)]
docs: Remove documentation for legacy PGO options
Late last year r191835 removed a largely unmaintained legacy PGO
infrastructure, but some of the docs were missed. Since these docs are
for things that don't actually exist anymore, they should be removed.
Alp Toker [Wed, 4 Jun 2014 04:11:12 +0000 (04:11 +0000)]
GraphWriter: try gv before xdg-open
Avoid changing behaviour for everyone who's used to the traditional ghostview
UI, especially since it knows how to stay in the foreground unlike xdg-open.
Alp Toker [Wed, 4 Jun 2014 03:21:38 +0000 (03:21 +0000)]
config.h: fix layering and don't duplicate definitions
Also correct the llvm-config.h header guard so it doesn't depend on 'CONFIG_H'
which is commonly defined in external projects and caused trouble for
embedders.
In future llvm/Config/llvm-config.h will be installed, but not
the private llvm/Config/config.h header.
David Blaikie [Wed, 4 Jun 2014 01:30:59 +0000 (01:30 +0000)]
DebugInfo: Partial revert r209984 due to more cases where abstract DbgVariables do not have associated DIEs.
Along with a test case to demonstrate that due to inlining order there
are cases where abstract variable DIEs are not constructed since the
abstract subprogram was built due to a previous inlining that optimized
away those variables. This produces incorrect debug info (the 'missing'
abstract variable causes the inlined instance of that variable to be
emitted with a full description (name, line, file) rather than
referencing the abstract origin), but this commit at least ensures that
it doesn't crash...
Tim Northover [Tue, 3 Jun 2014 13:54:53 +0000 (13:54 +0000)]
AArch64: mark small types (i1, i8, i16) as promoted
This means the output of LowerFormalArguments returns a lowered
SDValue with the correct type (expected in SelectionDAGBuilder).
Without this, an assertion under a DEBUG macro triggers when those
types are passed on the stack.
Nick Lewycky [Tue, 3 Jun 2014 04:25:36 +0000 (04:25 +0000)]
Ignore line numbers on debug intrinsics. Add an assert to ensure that we aren't emitting line number zero, the .gcno format uses this to indicate that the next field is a filename.
Allow alias to point to an arbitrary ConstantExpr.
This patch changes GlobalAlias to point to an arbitrary ConstantExpr and it is
up to MC (or the system assembler) to decide if that expression is valid or not.
This reduces our ability to diagnose invalid uses and how early we can spot
them, but it also lets us do things like
@test5 = alias inttoptr(i32 sub (i32 ptrtoint (i32* @test2 to i32),
i32 ptrtoint (i32* @bar to i32)) to i32*)
An important implication of this patch is that the notion of aliased global
doesn't exist any more. The alias has to encode the information needed to
access it in its metadata (linkage, visibility, type, etc).
Another consequence to notice is that getSection has to return a "const char *".
It could return a NullTerminatedStringRef if there was such a thing, but when
that was proposed the decision was to just uses "const char*" for that.
The code was actually correct. Sorry for the confusion. I have expanded the
comment saying why the analysis is valid to avoid me misunderstaning it
again in the future.
Alexey Samsonov [Mon, 2 Jun 2014 18:08:27 +0000 (18:08 +0000)]
Remove sanitizer blacklist from ASan/TSan/MSan function passes.
Instrumentation passes now use attributes
address_safety/thread_safety/memory_safety which are added by Clang frontend.
Clang parses the blacklist file and adds the attributes accordingly.
Currently blacklist is still used in ASan module pass to disable instrumentation
for certain global variables. We should fix this as well by collecting the
set of globals we're going to instrument in Clang and passing it to ASan
in metadata (as we already do for dynamically-initialized globals and init-order
checking).
This change also removes -tsan-blacklist and -msan-blacklist LLVM commandline
flags in favor of -fsanitize-blacklist= Clang flag.