llvm-cov: Rework the API for getting the coverage of a file (NFC)
This encapsulates how we handle the coverage regions of a file or
function. In the old model, the user had to deal with nested regions,
so they needed to maintain their own auxiliary data structures to get
any useful information out of this. The new API provides a sequence of
non-overlapping coverage segments, which makes it possible to render
coverage information in a single pass and avoids a fair amount of
extra work.
Robin Morisset [Wed, 17 Sep 2014 18:09:13 +0000 (18:09 +0000)]
Revert "[ARM, Fix] Fix emitLeading/TrailingFence on old ARM processors"
It is breaking the build on the buildbots but works fine on my machine, I revert
while trying to understand what happens (it appears to depend on the compiler used
to build, I probably used a C++11 feature that is not perfectly supported by some
of the buildbots).
[FastISel][AArch64] Improve branch selection to support all FP conditions.
This adds the last two missing floating-point condition codes (FCMP_UEQ and
FCMP_ONE) also to the branch selection. In these two cases an additonal branch
instruction is required.
This also adds unit tests to checks all the different condition codes.
Robin Morisset [Wed, 17 Sep 2014 17:41:16 +0000 (17:41 +0000)]
[ARM, Fix] Fix emitLeading/TrailingFence on old ARM processors
Summary:
I had only tested this code for ARMv7 and ARMv8. This patch adds several
fallback paths if the processor does not support dmb ish:
- dmb sy if a cortex-M with support for dmb
- mcr p15, #0, r0, c7, c10, #5 for ARMv6 (special instruction equivalent to a DMB)
These fallback paths were chosen based on the code for fence seq_cst.
Thanks to luqmana for having noticed this bug.
Test Plan: Added more cases to atomic-load-store.ll + make check-all
LineIterator: Provide a variant that keeps blank lines
It isn't always useful to skip blank lines, as evidenced by the
somewhat awkward use of line_iterator in llvm-cov. This adds a knob to
control whether or not to skip blanks.
Matt Arsenault [Wed, 17 Sep 2014 15:35:43 +0000 (15:35 +0000)]
R600/SI: Remove promotion of instructions to e64 forms.
Instructions are now generally selected to the e64 forms originally,
and shrunk down later. Rename foldOperands to legalizeOperands,
since that's really most of what it tries to do.
Before this fix, the instruction combiner wrongly thought that %shr
could have never been equal to -5. Therefore, %cmp was always folded to 'true'.
However, when %a is equal to 1, then %cmp evaluates to 'false'. Therefore,
in this example, it is not valid to fold %cmp to 'true'.
The problem was only affecting the case where the comparison was between
negative quantities where one of the quantities was obtained from arithmetic
shift of a negative constant.
This patch fixes the problem with the wrong folding (fixes PR20945).
With this patch, the 'icmp' from the example is now simplified to a
comparison between %a and 1. This still allows us to get rid of the arithmetic
shift (%shr).
llvm-cov: Distinguish expansion/instantiation from SourceCoverageView
SourceCoverageView currently has "Kind" and a list of child views, all
of which must have either an expansion or an instantiation Kind. In
addition to being an error-prone design, this makes it awkward to
differentiate between the two child types and adds a number of
optionally used members to the type.
Split the subview types into their own separate objects, and maintain
lists of each rather than one combined "Children" list.
Robin Morisset [Wed, 17 Sep 2014 00:06:58 +0000 (00:06 +0000)]
[X86] Use the generic AtomicExpandPass instead of X86AtomicExpandPass
This required a new hook called hasLoadLinkedStoreConditional to know whether
to expand atomics to LL/SC (ARM, AArch64, in a future patch Power) or to
CmpXchg (X86).
Apart from that, the new code in AtomicExpandPass is mostly moved from
X86AtomicExpandPass. The main result of this patch is to get rid of that
pass, which had lots of code duplicated with AtomicExpandPass.
Kevin Enderby [Tue, 16 Sep 2014 18:00:57 +0000 (18:00 +0000)]
Hookup the MCSymbolizer to llvm-objdump’s disassembly for Mach-O files.
First step done in this commit is to get flush out enough of the
SymbolizerGetOpInfo() routine to symbolic an X86_64 hello world .o and
its loading of the literal string and call to printf. Also the code to
symbolicate the X86_64_RELOC_SUBTRACTOR relocation and a test is also
added to show a slightly more complicated case.
Next will be to flush out enough of SymbolizerSymbolLookUp() to get the
literal string “Hello world” printed as a comment on the instruction that load
the pointer to it.
Fix move-only type issues in Interpreter with MSVC
MSVC 2012 cannot infer any move special members, but it will call them
if available. MSVC 2013 cannot infer move assignment. Therefore,
explicitly implement the special members for the ExecutionContext class
and its contained types.
Adam Nemet [Tue, 16 Sep 2014 17:14:13 +0000 (17:14 +0000)]
[TableGen] Fully resolve class-instance values before defs in multiclasses
By class-instance values I mean 'Class<Arg>' in 'Class<Arg>.Field' or in
'Other<Class<Arg>>' (syntactically s SimpleValue). This is to differentiate
from unnamed/anonymous record definitions (syntactically an ObjectBody) which
are not affected by this change.
Consider the testcase:
class Struct<int i> {
int I = !shl(i, 1);
int J = !shl(I, 1);
}
Then class Struct<i> is added to anonymous_0. Also Class<NAME#anonymous_0> is
added to Def:
multiclass Multiclass<int i> {
def anonymous_0 {
int I = !shl(i, 1);
int J = !shl(I, 1);
}
def Def {
int Class_J = NAME#anonymous_0.J;
}
}
So far so good but then we move on to instantiating this in the defm
by substituting the template arg 'i'.
This is how the anonymous prototype looks after fully instantiating.
defm Defm = {
def Defmanonymous_0 {
int I = 4;
int J = !shl(I, 1);
}
Note that we only resolved the reference to the template arg. The
non-template-arg reference in 'J' has not been resolved yet.
Then we go on to instantiating the Def prototype:
def DefmDef {
int Class_J = NAME#anonymous_0.J;
}
Which is resolved to Defmanonymous_0.J and then to !shl(I, 1).
When we fully resolve each record in a defm, Defmanonymous_0.J does get set
to 8 but that's too late for its use.
The patch adds a new attribute to the Record class that indicates that this
def is actually a class-instance value that may be *used* by other defs in a
multiclass. (This is unlike regular defs which don't reference each other and
thus can be resolved indepedently.) They are then fully resolved before the
other defs while the multiclass is instantiated.
I added vg_leak to the new test. I am not sure if this is necessary but I
don't think I have a way to test it. I can also check in without the XFAIL
and let the bots test this part.
Also tested that X86.td.expanded and AAarch64.td.expanded were unchange before
and after this change. (This issue triggering this problem is a WIP patch.)
Moritz Roth [Tue, 16 Sep 2014 16:25:07 +0000 (16:25 +0000)]
ARM load/store optimizer: Don't materialize a new base register with
ADDS/SUBS unless it's safe to clobber the condition flags.
If the merged instructions are in a range where the CPSR is live,
e.g. between a CMP -> Bcc, we can't safely materialize a new base
register.
This problem is quite rare, I couldn't come up with a test case and I've
never actually seen this happen in the tests I'm running - there is a
potential trigger for this in LNT/oggenc (spills being inserted between
a CMP/Bcc), but at the moment this isn't being merged. I'll try to
reduce that into a small test case once I've committed my upcoming patch
to make merging less conservative.
[mips] Improve the error messages given by MipsAsmParser.
Summary: Changed error messages to be more informative and to resemble other clang/llvm error messages (first letter is lower case, no ending punctuation) and updated corresponding tests.
Joe Abbey [Tue, 16 Sep 2014 09:18:23 +0000 (09:18 +0000)]
ARMAsmBackend uses a factory method to generate binary file format specific
objects. There were a few FIXMEs in ARMAsmBackend.cpp suggesting the class
definitions should be in a separate file. Starting with ARMAsmBackend, the
class definition has been put in a header file, and #includes reduced. Each
sub-type of ARMAsmBackend is now in its own header file.
Derived types have been painted with a different color of bike-shed:
Hal Finkel [Tue, 16 Sep 2014 04:35:50 +0000 (04:35 +0000)]
Fix BasicTTI::getCmpSelInstrCost to deal with illegal vector types
The default implementation of getCmpSelInstrCost, which provides the cost of
icmp/fcmp/select instructions, did not deal sensibly with illegal vector types
that were scalarized. We'd ask for the legalization cost of the vector type,
which would return something like (4, f64) given an input of <4 x double>, and
we'd then check the TLI status of the ISD opcode on that scalar type. This would
result in querying (ISD::VSELECT, f64), for example. Amusingly enough,
ISD::VSELECT on scalar types is marked as Legal by default (as with most other
operations), and most backends never change this because VSELECT is never
generated on scalars. However, seeing the resulting operation as Legal, we'd
neglect to add the scalarization cost before returning. The result is that we'd
grossly under-estimate the cost of cmps/selects on illegal vector types.
Now, if type legalization clearly results in scalarization, we skip the early
return and add the scalarization cost.
David Majnemer [Tue, 16 Sep 2014 03:52:46 +0000 (03:52 +0000)]
yaml2obj: Support bigobj
Teach yaml2obj how to make a bigobj COFF file. Like the rest of LLVM,
we automatically decide whether or not to use regular COFF or bigobj
COFF on the fly depending on how many sections the resulting object
would have.
This ends the task of adding bigobj support to LLVM.
N.B. This was tested by forcing yaml2obj to be used in bigobj mode
regardless of the number of sections. While a dedicated test was
written, the smallest I could make it was 36 MB (!) of yaml and it still
took a significant amount of time to execute on a powerful machine.
[x86] Remove a FIXME that doesn't make any sense. Only the lanes feeding
the blend that is matched by this are "used" in any sense, and so any
build_vector or other nodes feeding these will already drop other lanes.
[x86] Remove the last vestiges of the BLENDI-based ADDSUB pattern
matching. This design just fundamentally didn't work because ADDSUB is
available prior to any legal lowerings of BLENDI nodes. Instead, we have
a dedicated ADDSUB synthetic ISD node which is pattern matched trivially
into the instructions. These nodes are then recognized by both the
existing and a trivial new lowering combine in the backend. Removing
these patterns required adding 2 missing shuffle masks to the DAG
combine, without which tests would have failed. Added the masks and
a helpful assert as well to catch if anything ever goes wrong here.
[x86] As a follow-up to r217819, don't check for VSELECT legality now
that we don't use VSELECT and directly emit an addsub synthetic node.
Also remove a stale comment referencing VSELECT.
The test case is updated to use 'core2' which only has SSE3, not SSE4.1,
and it still passes. Previously it would not because we lacked
sufficient blend support to legalize the VSELECT.
[x86] Add the beginnings of a proper DAG combine to match ADDSUBPS and
ADDSUBPD nodes out of blends of adds and subs.
This allows us to actually form these instructions with SSE3 rather than
only forming them when we had both SSE3 for the ADDSUB instructions and
SSE4.1 for the blend instructions. ;] Kind-of important.
I've adjusted the CPU requirements on one of the tests to demonstrate
this kicking in nicely for an SSE3 cpu configuration.
[FastISel][AArch64] Allow handling of vectors during return lowering for little endian machines.
Allow handling of vectors during return lowering at least for little endian machines.
This was restricted in r208200 to fix it for big endian machines (according to
the comment), but it also disabled it for little endian too.
llvm-cov: Fix an issue with showing regions but not counts
In r217746, though it was supposed to be NFC, I broke llvm-cov's
handling of showing regions without showing counts. This should've
shown up in the existing tests, except they were checking debug output
that was displayed regardless of what was actually output. I've moved
the relevant debug output to a more appropriate place so that the
tests catch this kind of thing.
Nick Kledzik [Mon, 15 Sep 2014 21:51:49 +0000 (21:51 +0000)]
[Support] add decodeSLEB128()
We already have routines to encode SLEB128 as well as encode/decode ULEB128.
This last function fills out the matrix. I'll need this for some llvm-objdump
work I am doing.
Add mips32 r1 to the list of supported targets for Mips fast-isel
Summary:
Expand list of supported targets for Mips to include mips32 r1.
Previously it only include r2. More patches are coming where there is
a difference but in the current patches as pushed upstream, r1 and r2
are equivalent.
Test Plan:
simplestorefp1.ll
add new build bots at mips to test this flavor at both -O0 and -O2
[x86] Start fixing our emission of ADDSUBPS and ADDSUBPD instructions by
introducing a synthetic X86 ISD node representing this generic
operation.
The relevant patterns for mapping these nodes into the concrete
instructions are also added, and a gnarly bit of C++ code in the
target-specific DAG combiner is replaced with simple code emitting this
primitive.
The next step is to generically combine blends of adds and subs into
this node so that we can drop the reliance on an SSE4.1 ISD node
(BLENDI) when matching an SSE3 feature (ADDSUB).
David Majnemer [Mon, 15 Sep 2014 19:42:42 +0000 (19:42 +0000)]
MC: Add support for BigObj
Teach WinCOFFObjectWriter how to write -mbig-obj style object files;
these object files allow for more sections inside an object file.
Our support for BigObj is notably different from binutils and cl: we
implicitly upgrade object files to BigObj instead of asking the user to
compile the same file *again* but with another flag. This matches up
with how LLVM treats ELF variants.
This was tested by forcing LLVM to always emit BigObj files and running
the entire test suite. A specific test has also been added.
I've lowered the maximum number of sections in a normal COFF file,
VS "14" CTP 3 supports no more than 65279 sections. This is important
otherwise we might not switch to BigObj quickly enough, leaving us with
a COFF file that we couldn't link.
yaml2obj support is all that remains to implement.
Rafael Espindola [Mon, 15 Sep 2014 18:32:58 +0000 (18:32 +0000)]
Fix a lot of confusion around inserting nops on empty functions.
On MachO, and MachO only, we cannot have a truly empty function since that
breaks the linker logic for atomizing the section.
When we are emitting a frame pointer, the presence of an unreachable will
create a cfi instruction pointing past the last instruction. This is perfectly
fine. The FDE information encodes the pc range it applies to. If some tool
cannot handle this, we should explicitly say which bug we are working around
and only work around it when it is actually relevant (not for ELF for example).
Given the unreachable we could omit the .cfi_def_cfa_register, but then
again, we could also omit the entire function prologue if we wanted to.
[CodeGenPrepare][AddressingModeMatcher] Fix a think-o for the sext(zext) -> zext promotion
introduced in r217629.
We were returning the old sext instead of the new zext as the promoted instruction!
Peephole optimization was folding MOVSDrm, which is a zero-extending double
precision floating point load, into ADDPDrr, which is a SIMD add of two packed
double precision floating point values.
X86InstrInfo::foldMemoryOperandImpl already had the logic that prevented this
from happening. However the check wasn't being conducted for loads from stack
objects. This commit factors out the logic into a new function and uses it for
checking loads from stack slots are not zero-extending loads.