Sanjoy Das [Thu, 16 Jun 2016 18:54:06 +0000 (18:54 +0000)]
NFC; refactor getFrameIndexReferenceFromSP
Summary:
... into getFrameIndexReferencePreferSP. This change folds the
fail-then-retry logic into getFrameIndexReferencePreferSP.
There is a non-functional but behaviorial change in WinException --
earlier if `getFrameIndexReferenceFromSP` failed we'd trip an assert,
but now we'll silently use the (wrong) offset from the base pointer. I
could not write the assert I'd like to write ("FrameReg ==
StackRegister", like I've done in X86FrameLowering) since there is no
easy way to get to the stack register from WinException (happy to be
proven wrong here). One solution to this is to add a `bool
OnlyStackPointer` parameter to `getFrameIndexReferenceFromSP` that
asserts if it could not satisfy its promise of returning an offset from
a stack pointer, but that seems overkill.
Zachary Turner [Thu, 16 Jun 2016 18:00:28 +0000 (18:00 +0000)]
[pdb] Change type visitor pattern to be dynamic.
This allows better catching of compiler errors since we can use
the override keyword to verify that methods are actually
overridden.
Also in this patch I've changed from storing a boolean Error
code everywhere to returning an llvm::Error, to propagate richer
error information up the call stack.
Davide Italiano [Thu, 16 Jun 2016 17:40:53 +0000 (17:40 +0000)]
[PM] Revert the port of MergeLoadStoreMotion to the new pass manager.
Daniel Berlin expressed some real concerns about the port and proposed
and alternative approach. I'll revert this for now while working on a
new patch, which I hope to put up for review shortly. Sorry for the churn.
Sanjay Patel [Thu, 16 Jun 2016 16:58:54 +0000 (16:58 +0000)]
[DAG] Remove redundant FMUL in Newton-Raphson SQRT code
When calculating a square root using Newton-Raphson with two constants,
a naive implementation is to use five multiplications (four muls to calculate
reciprocal square root and another one to calculate the square root itself).
However, after some reassociation and CSE the same result can be obtained
with only four multiplications. Unfortunately, there's no reliable way to do
such a reassociation in the back-end. So, the patch modifies NR code itself
so that it directly builds optimal code for SQRT and doesn't rely on any
further reassociation.
Daniel Sanders [Thu, 16 Jun 2016 15:47:19 +0000 (15:47 +0000)]
Remove redundant -mattr options from llvm-objdump commands.
The -mattr options in these four tests have no effect on the output of
llvm-objdump. In the case of the two Mips tests, removing the -mattr option
left duplicate RUN lines so the duplicates have been removed.
Igor Laevsky [Thu, 16 Jun 2016 13:28:25 +0000 (13:28 +0000)]
[JumpThreading] Prevent dangling pointer problems in BranchProbabilityInfo
We should update results of the BranchProbabilityInfo after removing block in JumpThreading. Otherwise
we will get dangling pointer inside BranchProbabilityInfo cache.
Patrik Hagglund [Thu, 16 Jun 2016 10:48:54 +0000 (10:48 +0000)]
PR27938: Don't remove valid DebugLoc in Scalarizer
Added checks to make sure the Scalarizer::transferMetadata() don't
remove valid debug locations from instructions. This is important as
the verifier pass require that e.g. inlinable callsites have a valid
debug location.
Daniel Sanders [Thu, 16 Jun 2016 10:20:59 +0000 (10:20 +0000)]
[mips][mips16] Fix machine verifier errors about incorrect register classes on load/stores.
Summary:
[ls][bh] and [ls][bh]u cannot use sp-relative addresses and must therefore
lower frameindex nodes such that there is a copy to a CPU16Regs register. This
is now done consistently using a separate addressing mode that does not
permit frameindex nodes.
As part of this I've had to remove an optimization that reduced the number of
instructions needed to work around the lack of sp-relative addresses on [ls][bh]
and [ls][bh]u. This optimization used one of the eight CPU16Regs registers as
a copy of the stack pointer and it's implementation was the root cause of many
of the register vs register class mismatches.
lw/sw can use sp-relative addresses but we ought to ensure that we use the
correct version of lw/sw internally for things like IAS. This is not currently
the case and this change does not fix this. However, this change does clean it
up sufficiently well to fix the machine verifier failures.
Daniel Sanders [Thu, 16 Jun 2016 09:17:03 +0000 (09:17 +0000)]
[llvm-objdump] Support detection of feature bits from the object and implement this for Mips.
Summary:
The Mips implementation only covers the feature bits described by the ELF
e_flags so far. Mips stores additional feature bits such as MSA in the
.MIPS.abiflags section.
Also fixed a small bug this revealed where microMIPS wouldn't add the
EF_MIPS_MICROMIPS flag when using -filetype=obj.
Chuang-Yu Cheng [Thu, 16 Jun 2016 04:44:25 +0000 (04:44 +0000)]
SimplifyCFG is able to detect the pattern:
(i == 5334 || i == 5335)
to:
((i & -2) == 5334)
This transformation has some incorrect side conditions. Specifically, the
transformation is only applied when the right-hand side constant (5334 in
the example) is a power of two not equal and not equal to the negated mask.
These side conditions were added in r258904 to fix PR26323. The correct side
condition is that: ((Constant & Mask) == Constant)[(5334 & -2) == 5334].
It's a little bit hard to see why these transformations are correct and what
the side conditions ought to be. Here is a CVC3 program to verify them for
64-bit values:
ONE : BITVECTOR(64) = BVZEROEXTEND(0bin1, 63);
x : BITVECTOR(64);
y : BITVECTOR(64);
z : BITVECTOR(64);
mask : BITVECTOR(64) = BVSHL(ONE, z);
QUERY( (y & ~mask = y) =>
((x & ~mask = y) <=> (x = y OR x = (y | mask)))
);
Please note that each pattern must be a dual implication (<--> or iff). One
directional implication can create spurious matches. If the implication is
only one-way, an unsatisfiable condition on the left side can imply a
satisfiable condition on the right side. Dual implication ensures that
satisfiable conditions are transformed to other satisfiable conditions and
unsatisfiable conditions are transformed to other unsatisfiable conditions.
Here is a concrete example of a unsatisfiable condition on the left
implying a satisfiable condition on the right:
mask = (1 << z)
(x & ~mask) == y --> (x == y || x == (y | mask))
Substituting y = 3, z = 0 yields:
(x & -2) == 3 --> (x == 3 || x == 2)
The version of this code before r258904 had no side-conditions and
incorrectly justified itself in comments through one-directional
implication.
Thanks to Chandler for the suggestion!
Author: Thomas Jablin (tjablin)
Reviewers: chandlerc majnemer hfinkel cycheng
Craig Topper [Thu, 16 Jun 2016 03:58:45 +0000 (03:58 +0000)]
[X86] Pre-size some SmallVectors using the constructor in the shuffle lowering code instead of using push_back. Some of these already did this but used resize or assign instead of the constructor. NFC
Tim Northover [Thu, 16 Jun 2016 01:42:25 +0000 (01:42 +0000)]
AArch64: allow MOV (imm) alias to be printed
The backend has been around for years, it's pretty ridiculous that we can't
even use the preferred form for printing "MOV" aliases. Unfortunately, TableGen
can't handle the complex predicates when printing so it's a bunch of nasty C++.
Oh well.
Vitaly Buka [Thu, 16 Jun 2016 00:14:42 +0000 (00:14 +0000)]
Enable libFuzzer's afl_driver to append stderr to a file.
Summary:
[libFuzzer] Enable afl_driver to append stderr to a user specified file.
Append stderr of afl_driver to the file specified by the environmental variable
AFL_DRIVER_STDERR_DUPLICATE_FILENAME if it is set. This lets users see outputs
on crashes without rerunning crashing test cases (which won't work for crashes
that are difficult to reproduce). Before this patch, stderr would only be sent to afl-fuzz
and users would have no way of seeing it.
Justin Lebar [Wed, 15 Jun 2016 23:20:15 +0000 (23:20 +0000)]
[IR] [DAE] Copy comdats during DAE, and don't copy comdats in GlobalObject::copyAttributesFrom.
Summary: This reverts the changes to Globals.cpp and IRMover.cpp in
"[IR] Copy comdats in GlobalObject::copyAttributesFrom" (D20631,
rL270743).
The DeadArgElim test is left unchanged, and we change DAE to explicitly
copy comdats.
The reverted change breaks copyAttributesFrom when the destination lives
in a different module from the source. The decision in D21255 was to
revert this patch and handle comdat copying separately from
copyAttributesFrom.
Justin Lebar [Wed, 15 Jun 2016 23:20:12 +0000 (23:20 +0000)]
[Bugpoint] Erase comdat annotations after removing a global's initializer.
Summary:
This is necessary to keep the verifier happy after bugpoint removes an
initializer from a global variable with a comdat annotation, because
globals without initializers may not have comdats.
Kevin Enderby [Wed, 15 Jun 2016 21:14:01 +0000 (21:14 +0000)]
Fix llvm-objdump when disassembling a stripped Mach-O binary with the -macho option.
It was printing out nothing in this case.
llvm-objdump tries to disassemble sections a symbol at a time. In the case of a
fully stripped Mach-O executable the only symbol remaining in the (__TEXT,__text)
section is the special linker defined symbol __mh_execute_header . This
symbol is special in that while it is N_SECT symbol in the (__TEXT,__text)
its address is before the start of the (__TEXT,__text). It’s address is the
start of the __TEXT segment which is where the mach header is statically
linked. So the code in DisassembleMachO() needs to deal with this case specially.
[CFLAA] Ignore non-pointers, move Attrs to graph nodes.
This patch makes CFLAA ignore non-pointer values, since we can now
sanely do that with the escaping/unknown attributes. Additionally,
StratifiedAttrs make more sense to sit on nodes than edges (since
they're properties of values, and ultimately end up on the nodes of
StratifiedSets). So, this patch puts said attributes on nodes.
Tim Northover [Wed, 15 Jun 2016 20:33:36 +0000 (20:33 +0000)]
AArch64: stop trying to use 32-bit MOVZs when expanding patchpoints.
Of course the assembly was right but because the opcode was MOVZWi it was
encoded as "movz w16, #65535, lsl #32" which is an unallocated encoding and
would go horribly wrong on a CPU.
No idea how this bug survived this long. It seems nobody is using that aspect
of patchpoints.
Sanjay Patel [Wed, 15 Jun 2016 20:26:58 +0000 (20:26 +0000)]
[x86] add folds for x86 vector compare nodes (PR27924)
Ideally, we can get rid of most x86 LLVM intrinsics by transforming them to IR (and some of that happened
with http://reviews.llvm.org/rL272807), but it doesn't cost much to have some simple folds in the backend
too while we're working on that and as a backstop.
This fixes:
https://llvm.org/bugs/show_bug.cgi?id=27924
Matthias Braun [Wed, 15 Jun 2016 20:19:16 +0000 (20:19 +0000)]
Statistic: Add machine parseable json output
- We lacked a short unique identifier for a statistics, so I renamed the
current "Name" field that just contained the DEBUG_TYPE name of the
current file to DebugType and added a new "Name" field that contains
the C++ identifier of the statistic variable.
- Add the -stats-json option which outputs statistics in json format.
Sanjay Patel [Wed, 15 Jun 2016 17:17:27 +0000 (17:17 +0000)]
[x86, SSE] remove the GCCBuiltins from the integer min/max intrinsics
This allows us to emit native IR in Clang (next commit).
Also, update the intrinsic tests to show that codegen already knows how to handle
the IR that Clang will soon produce.
Sean Silva [Wed, 15 Jun 2016 10:51:40 +0000 (10:51 +0000)]
Work around MSVC "friend" semantics.
The error on clang-x86-win2008-selfhost is:
C:\buildbot\slave-config\clang-x86-win2008-selfhost\llvm\lib\Transforms\Vectorize\SLPVectorizer.cpp(955) : error C2248: 'llvm::slpvectorizer::BoUpSLP::ScheduleData' : cannot access private struct declared in class 'llvm::slpvectorizer::BoUpSLP'
C:\buildbot\slave-config\clang-x86-win2008-selfhost\llvm\lib\Transforms\Vectorize\SLPVectorizer.cpp(608) : see declaration of 'llvm::slpvectorizer::BoUpSLP::ScheduleData'
C:\buildbot\slave-config\clang-x86-win2008-selfhost\llvm\lib\Transforms\Vectorize\SLPVectorizer.cpp(337) : see declaration of 'llvm::slpvectorizer::BoUpSLP'
I reproduced this locally with both MSVC 2013 and MSVC 2015.