Patrik Hagglund [Thu, 16 Jun 2016 10:48:54 +0000 (10:48 +0000)]
PR27938: Don't remove valid DebugLoc in Scalarizer
Added checks to make sure the Scalarizer::transferMetadata() don't
remove valid debug locations from instructions. This is important as
the verifier pass require that e.g. inlinable callsites have a valid
debug location.
Daniel Sanders [Thu, 16 Jun 2016 10:20:59 +0000 (10:20 +0000)]
[mips][mips16] Fix machine verifier errors about incorrect register classes on load/stores.
Summary:
[ls][bh] and [ls][bh]u cannot use sp-relative addresses and must therefore
lower frameindex nodes such that there is a copy to a CPU16Regs register. This
is now done consistently using a separate addressing mode that does not
permit frameindex nodes.
As part of this I've had to remove an optimization that reduced the number of
instructions needed to work around the lack of sp-relative addresses on [ls][bh]
and [ls][bh]u. This optimization used one of the eight CPU16Regs registers as
a copy of the stack pointer and it's implementation was the root cause of many
of the register vs register class mismatches.
lw/sw can use sp-relative addresses but we ought to ensure that we use the
correct version of lw/sw internally for things like IAS. This is not currently
the case and this change does not fix this. However, this change does clean it
up sufficiently well to fix the machine verifier failures.
Daniel Sanders [Thu, 16 Jun 2016 09:17:03 +0000 (09:17 +0000)]
[llvm-objdump] Support detection of feature bits from the object and implement this for Mips.
Summary:
The Mips implementation only covers the feature bits described by the ELF
e_flags so far. Mips stores additional feature bits such as MSA in the
.MIPS.abiflags section.
Also fixed a small bug this revealed where microMIPS wouldn't add the
EF_MIPS_MICROMIPS flag when using -filetype=obj.
Chuang-Yu Cheng [Thu, 16 Jun 2016 04:44:25 +0000 (04:44 +0000)]
SimplifyCFG is able to detect the pattern:
(i == 5334 || i == 5335)
to:
((i & -2) == 5334)
This transformation has some incorrect side conditions. Specifically, the
transformation is only applied when the right-hand side constant (5334 in
the example) is a power of two not equal and not equal to the negated mask.
These side conditions were added in r258904 to fix PR26323. The correct side
condition is that: ((Constant & Mask) == Constant)[(5334 & -2) == 5334].
It's a little bit hard to see why these transformations are correct and what
the side conditions ought to be. Here is a CVC3 program to verify them for
64-bit values:
ONE : BITVECTOR(64) = BVZEROEXTEND(0bin1, 63);
x : BITVECTOR(64);
y : BITVECTOR(64);
z : BITVECTOR(64);
mask : BITVECTOR(64) = BVSHL(ONE, z);
QUERY( (y & ~mask = y) =>
((x & ~mask = y) <=> (x = y OR x = (y | mask)))
);
Please note that each pattern must be a dual implication (<--> or iff). One
directional implication can create spurious matches. If the implication is
only one-way, an unsatisfiable condition on the left side can imply a
satisfiable condition on the right side. Dual implication ensures that
satisfiable conditions are transformed to other satisfiable conditions and
unsatisfiable conditions are transformed to other unsatisfiable conditions.
Here is a concrete example of a unsatisfiable condition on the left
implying a satisfiable condition on the right:
mask = (1 << z)
(x & ~mask) == y --> (x == y || x == (y | mask))
Substituting y = 3, z = 0 yields:
(x & -2) == 3 --> (x == 3 || x == 2)
The version of this code before r258904 had no side-conditions and
incorrectly justified itself in comments through one-directional
implication.
Thanks to Chandler for the suggestion!
Author: Thomas Jablin (tjablin)
Reviewers: chandlerc majnemer hfinkel cycheng
Craig Topper [Thu, 16 Jun 2016 03:58:45 +0000 (03:58 +0000)]
[X86] Pre-size some SmallVectors using the constructor in the shuffle lowering code instead of using push_back. Some of these already did this but used resize or assign instead of the constructor. NFC
Tim Northover [Thu, 16 Jun 2016 01:42:25 +0000 (01:42 +0000)]
AArch64: allow MOV (imm) alias to be printed
The backend has been around for years, it's pretty ridiculous that we can't
even use the preferred form for printing "MOV" aliases. Unfortunately, TableGen
can't handle the complex predicates when printing so it's a bunch of nasty C++.
Oh well.
Vitaly Buka [Thu, 16 Jun 2016 00:14:42 +0000 (00:14 +0000)]
Enable libFuzzer's afl_driver to append stderr to a file.
Summary:
[libFuzzer] Enable afl_driver to append stderr to a user specified file.
Append stderr of afl_driver to the file specified by the environmental variable
AFL_DRIVER_STDERR_DUPLICATE_FILENAME if it is set. This lets users see outputs
on crashes without rerunning crashing test cases (which won't work for crashes
that are difficult to reproduce). Before this patch, stderr would only be sent to afl-fuzz
and users would have no way of seeing it.
Justin Lebar [Wed, 15 Jun 2016 23:20:15 +0000 (23:20 +0000)]
[IR] [DAE] Copy comdats during DAE, and don't copy comdats in GlobalObject::copyAttributesFrom.
Summary: This reverts the changes to Globals.cpp and IRMover.cpp in
"[IR] Copy comdats in GlobalObject::copyAttributesFrom" (D20631,
rL270743).
The DeadArgElim test is left unchanged, and we change DAE to explicitly
copy comdats.
The reverted change breaks copyAttributesFrom when the destination lives
in a different module from the source. The decision in D21255 was to
revert this patch and handle comdat copying separately from
copyAttributesFrom.
Justin Lebar [Wed, 15 Jun 2016 23:20:12 +0000 (23:20 +0000)]
[Bugpoint] Erase comdat annotations after removing a global's initializer.
Summary:
This is necessary to keep the verifier happy after bugpoint removes an
initializer from a global variable with a comdat annotation, because
globals without initializers may not have comdats.
Kevin Enderby [Wed, 15 Jun 2016 21:14:01 +0000 (21:14 +0000)]
Fix llvm-objdump when disassembling a stripped Mach-O binary with the -macho option.
It was printing out nothing in this case.
llvm-objdump tries to disassemble sections a symbol at a time. In the case of a
fully stripped Mach-O executable the only symbol remaining in the (__TEXT,__text)
section is the special linker defined symbol __mh_execute_header . This
symbol is special in that while it is N_SECT symbol in the (__TEXT,__text)
its address is before the start of the (__TEXT,__text). It’s address is the
start of the __TEXT segment which is where the mach header is statically
linked. So the code in DisassembleMachO() needs to deal with this case specially.
[CFLAA] Ignore non-pointers, move Attrs to graph nodes.
This patch makes CFLAA ignore non-pointer values, since we can now
sanely do that with the escaping/unknown attributes. Additionally,
StratifiedAttrs make more sense to sit on nodes than edges (since
they're properties of values, and ultimately end up on the nodes of
StratifiedSets). So, this patch puts said attributes on nodes.
Tim Northover [Wed, 15 Jun 2016 20:33:36 +0000 (20:33 +0000)]
AArch64: stop trying to use 32-bit MOVZs when expanding patchpoints.
Of course the assembly was right but because the opcode was MOVZWi it was
encoded as "movz w16, #65535, lsl #32" which is an unallocated encoding and
would go horribly wrong on a CPU.
No idea how this bug survived this long. It seems nobody is using that aspect
of patchpoints.
Sanjay Patel [Wed, 15 Jun 2016 20:26:58 +0000 (20:26 +0000)]
[x86] add folds for x86 vector compare nodes (PR27924)
Ideally, we can get rid of most x86 LLVM intrinsics by transforming them to IR (and some of that happened
with http://reviews.llvm.org/rL272807), but it doesn't cost much to have some simple folds in the backend
too while we're working on that and as a backstop.
This fixes:
https://llvm.org/bugs/show_bug.cgi?id=27924
Matthias Braun [Wed, 15 Jun 2016 20:19:16 +0000 (20:19 +0000)]
Statistic: Add machine parseable json output
- We lacked a short unique identifier for a statistics, so I renamed the
current "Name" field that just contained the DEBUG_TYPE name of the
current file to DebugType and added a new "Name" field that contains
the C++ identifier of the statistic variable.
- Add the -stats-json option which outputs statistics in json format.
Sanjay Patel [Wed, 15 Jun 2016 17:17:27 +0000 (17:17 +0000)]
[x86, SSE] remove the GCCBuiltins from the integer min/max intrinsics
This allows us to emit native IR in Clang (next commit).
Also, update the intrinsic tests to show that codegen already knows how to handle
the IR that Clang will soon produce.
Sean Silva [Wed, 15 Jun 2016 10:51:40 +0000 (10:51 +0000)]
Work around MSVC "friend" semantics.
The error on clang-x86-win2008-selfhost is:
C:\buildbot\slave-config\clang-x86-win2008-selfhost\llvm\lib\Transforms\Vectorize\SLPVectorizer.cpp(955) : error C2248: 'llvm::slpvectorizer::BoUpSLP::ScheduleData' : cannot access private struct declared in class 'llvm::slpvectorizer::BoUpSLP'
C:\buildbot\slave-config\clang-x86-win2008-selfhost\llvm\lib\Transforms\Vectorize\SLPVectorizer.cpp(608) : see declaration of 'llvm::slpvectorizer::BoUpSLP::ScheduleData'
C:\buildbot\slave-config\clang-x86-win2008-selfhost\llvm\lib\Transforms\Vectorize\SLPVectorizer.cpp(337) : see declaration of 'llvm::slpvectorizer::BoUpSLP'
I reproduced this locally with both MSVC 2013 and MSVC 2015.
Patrik Hagglund [Wed, 15 Jun 2016 10:32:00 +0000 (10:32 +0000)]
Use FPasses in opt exactly when it is initialized.
Previously, there was a discrepancy between the population of function
passes in FPasses, and their invocation. Function passes specified on
the command line, after an optimizaton level was simply discared. This
fix PR27509.
Sean Silva [Wed, 15 Jun 2016 09:00:33 +0000 (09:00 +0000)]
Speculative buildbot fix.
This wasn't failing for me with clang as the compiler. I think GCC may
disagree with clang about whether a friend declaration introduces a
declaration in the enclosing namespace (or something).
Example error:
/home/uweigand/sandbox/buildbot/clang-s390x-linux/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp:950:77: error: ‘llvm::raw_ostream& llvm::slpvectorizer::operator<<(llvm::raw_ostream&, const llvm::slpvectorizer::BoUpSLP::ScheduleData&)’ should have been declared inside ‘llvm::slpvectorizer’
const BoUpSLP::ScheduleData &SD) {
^
Sean Silva [Wed, 15 Jun 2016 08:43:40 +0000 (08:43 +0000)]
[PM] Port SLPVectorizer to the new PM
This uses the "runImpl" approach to share code with the old PM.
Porting to the new PM meant abandoning the anonymous namespace enclosing
most of SLPVectorizer.cpp which is a bit of a bummer (but not a big deal
compared to having to pull the pass class into a header which the new PM
requires since it calls the constructor directly).
Summary:
This fixes two related bugs. First, the generic optimization passes
unfortunately generate negative constant offsets but the hardware treats
SOffset as an unsigned value.
Second, there is a hardware bug on SI and CI, where address clamping in MUBUF
instructions does not work correctly when SOffset is larger than the buffer
size. This patch works around this bug by never using SOffset.
An alternative workaround would be to do the clamping manually when SOffset
is too large, but generating the required code sequence during instruction
selection would be rather involved, and in any case the resulting code would
probably be worse.
Sanjoy Das [Wed, 15 Jun 2016 05:35:14 +0000 (05:35 +0000)]
Don't force SP-relative addressing for statepoints
Summary:
... when the offset is not statically known.
Prioritize addresses relative to the stack pointer in the stackmap, but
fallback gracefully to other modes of addressing if the offset to the
stack pointer is not a known constant.
Dan Liew [Wed, 15 Jun 2016 01:40:02 +0000 (01:40 +0000)]
[LibFuzzer] Fix ``FuzzerMutate.ShuffleBytes2`` unit test on OSX.
The ``FuzzerMutate.ShuffleBytes2`` unit test was failing on
OSX due to the implementation of ``std::random_shuffle()``
being different between libcxx and libstdc++.
@kcc has decided (see http://reviews.llvm.org/D21218) it is acceptable
for there to be different mutation behavior on different platforms so
this commit just adjusts the test to perform the minimum number of
iterations (that is a power of 2) to see all the mutations the unit test
is looking for.
Recommit [LV] Enable vectorization of loops where the IV has an external use
r272715 broke libcxx because it did not correctly handle cases where the
last iteration of one IV is the second-to-last iteration of another.
Original commit message:
Vectorizing loops with "escaping" IVs has been disabled since r190790, due to
PR17179. This re-enables it, with support for external use of both
"post-increment" (last iteration) and "pre-increment" (second-to-last iteration)
IVs.
David Majnemer [Wed, 15 Jun 2016 00:19:56 +0000 (00:19 +0000)]
[LoopUnroll] Don't crash trying to unroll loop with EH pad exit
We do not support splitting cleanuppad or catchswitches. This is
problematic for passes which assume that a loop is in loop simplify
form (the loop would have a dedicated exit block instead of sharing it).
While it isn't great that we don't support this for cleanups, we still
cannot make loop-simplify form an assertable precondition because
indirectbr will also disable these sorts of CFG cleanups.
David Majnemer [Wed, 15 Jun 2016 00:19:09 +0000 (00:19 +0000)]
Remove the ScalarReplAggregates pass
Nearly all the changes to this pass have been done while maintaining and
updating other parts of LLVM. LLVM has had another pass, SROA, which
has superseded ScalarReplAggregates for quite some time.