Sanjoy Das [Sun, 20 Sep 2015 01:52:18 +0000 (01:52 +0000)]
[IndVars] Fix a bug in r248045.
Because -indvars widens induction variables through arithmetic,
`NeverNegative` cannot be a property of the `WidenIV` (a `WidenIV`
manages information for all transitive uses of an IV being widened,
including uses of `-1 * IV`). Instead it must live on `NarrowIVDefUse`
which manages information for a specific def-use edge in the transitive
use list of an induction variable.
This change also adds a test case that demonstrates the problem with
r248045.
Update edge weights properly when merging blocks in if-conversion.
In if-conversion, there is a utility function MergeBlocks() that is used to merge blocks. However, when new edges are built in this function the edge weight is either not provided or not updated properly, leading to a modified CFG with incorrect edge weights. This patch corrects this issue.
Scaling up values in ARMBaseInstrInfo::isProfitableToIfCvt() before they are scaled by a probability to avoid precision issue.
In ARMBaseInstrInfo::isProfitableToIfCvt(), there is a simple cost model in which the number of cycles is scaled by a probability to estimate the cost. However, when the number of cycles is small (which is usually the case), there is a precision issue after the computation. To avoid this issue, this patch scales those cycles by 1024 (chosen to make the multiplication a litter faster) before they are scaled by the probability. Other variables are also scaled up for the final comparison.
Chris Bieneman [Fri, 18 Sep 2015 17:39:58 +0000 (17:39 +0000)]
[CMake] More cleanup of installing symlinks.
In order to support building clang out-of-tree the install_symlink script needs to be installed, and it needs to be found by searching the CMAKE_MODULE_PATH.
This change renames install_symlink -> LLVMInstallSymlink so it doesn't conflict with naming from other projects, and adds searching behavior in AddLLVM.cmake
Simplify SmallBitVector::applyMask by consolidating common code for 32- and 64-bit builds
and assert when mask is too large to apply in the small case,
previously the extra words were silently ignored.
clang-format the entire function to match current code standards.
This is a rewrite of r247972 which was reverted in r247983 due to
warning and possible UB on 32-bits hosts.
Daniel Sanders [Fri, 18 Sep 2015 14:20:54 +0000 (14:20 +0000)]
[mips][microMIPS] Fix an invalid read for lwm32 and reserved reglist values.
Summary:
Some values of 'reglist' are reserved and cause the disassembler to read past
the end of the Regs array. Treat lwm32's containing reserved values as invalid
instructions.
Igor Laevsky [Fri, 18 Sep 2015 13:01:48 +0000 (13:01 +0000)]
[LazyValueInfo] Report nonnull range for nonnull pointers
Currently LazyValueInfo will report only alloca's as having nonnull range.
For loads with !nonnull metadata it will bailout with no additional information.
Same is true for calls returning nonnull pointers.
This change extends LazyValueInfo to handle additional nonnull instructions.
Reverting r247972 (and subordinate commit r247972) as the 32-bit left-shift is undefined behavior on implementations where uinptr_t is 32-bits. One such platform is Windows, MSVC, x86.
Michael Kruse [Fri, 18 Sep 2015 10:56:30 +0000 (10:56 +0000)]
[Support] Reapply r245289 "Always wait for GraphViz before opening the viewer"
The change was accidentally undone by r245290.
Original log message:
When calling DisplayGraph and a PS viewer is chosen, two programs are executed: The GraphViz generator and the PostScript viewer. Always wait for the generator to finish to ensure that the .ps file is written before opening the viewer for that file. DisplayGraph's wait parameter refers to whether to wait until the user closes the viewer.
This happened on Windows and if none of the options to open the .dot file directly applies, also on Linux.
David Majnemer [Fri, 18 Sep 2015 08:18:07 +0000 (08:18 +0000)]
[WinEH] Moved funclet pads should be in relative order
We shifted the MachineBasicBlocks to the end of the MachineFunction in
DFS order. This will not ensure that MachineBasicBlocks which fell
through to one another will remain contiguous. Instead, implement
a stable sort algorithm for iplist.
Fix BitVectorTest on 32-bit hosts after r247972.
We can't apply two words of 32-bit mask in the small case
where the internal storage is just one 32-bit word.
Simplify SmallBitVector::applyMask by consolidating common code for 32-bit and 64-bit builds.
Extend mask value to 64 bits before taking its complement and assert when mask is
too large to apply in the small case (previously the extra words were silently ignored).
[ShrinkWrap] Refactor the handling of infinite loop in the analysis.
- Strenghten the logic to be sure we hoist the restore point out of the current
loop. (The fixes a bug with infinite loop, added as part of the patch.)
- Walk over the exit blocks of the current loop to conver to the desired restore
point in one iteration of the update loop.
David Blaikie [Thu, 17 Sep 2015 22:18:59 +0000 (22:18 +0000)]
[opaque pointer types] Add an explicit pointee type to alias records in the IR
Since aliases actually use and verify their explicit type already, no
further invalid testing is required here. The
invalid.test:ALIAS-TYPE-MISMATCH case catches errors due to emitting a
non-pointee type in the new format or a non-pointer type in the old
format.
This makes catchret look more like a branch, and less like a weird use
of BlockAddress. It also lets us get away from
llvm.x86.seh.restoreframe, which relies on the old parentfpoffset label
arithmetic.
Simon Pilgrim [Thu, 17 Sep 2015 20:32:45 +0000 (20:32 +0000)]
[InstCombine] Added vector demanded bits support for SSE4A EXTRQ/INSERTQ instructions
The SSE4A instructions EXTRQ/INSERTQ only use the lower 64-bits (or less) for many of their input vector operands and all of them have undefined upper 64-bits results.
Teresa Johnson [Thu, 17 Sep 2015 20:12:00 +0000 (20:12 +0000)]
Restore "Function bitcode index in Value Symbol Table and lazy reading support"
This reverts commit r247898 (which reverted r247894).
Patch fixed to address two issues exposed by buildbots:
- unused variable warning in NDEBUG mode
- std::initializer_list lifetime issue causing test failures
Original Summary:
Support for including the function bitcode indices in the Value Symbol
Table. This requires writing the VST after the function blocks, which in
turn requires a new VST forward declaration record encoding the offset of
the full VST (which is backpatched to contain the offset after the VST
is written).
This patch also enables the lazy function reader to use the new function
indices out of the VST. This support will be used by ThinLTO as well, which
will be in a follow on patch. Backwards compatibility with older bitcode
files is maintained.
A new test is also included.
The bitcode format (used for the lazy reader as well as the upcoming
ThinLTO patches) came out of discussions with Duncan and others and is
described here:
https://drive.google.com/file/d/0B036uwnWM6RWdnBLakxmeDdOeXc/view
Diego Novillo [Thu, 17 Sep 2015 19:05:48 +0000 (19:05 +0000)]
Temporarily fix gcov failures in big-endian hosts.
This test uses a gcov file generated in a little-endian host. The gcov
reader does not allow different endianness, so the test fails on big
endian hosts.
[WinEH] Add and use hasEHPadSuccessor instead of getLandingPadSuccessor
getLandingPadSuccessor assumes that each invoke can have at most one EH
pad successor, but WinEH invokes can have more than one. Two out of
three callers of getLandingPadSuccessor don't use the returned
landingpad, so we can make them use this simple predicate instead.
Eventually we'll have to circle back and fix SplitKit.cpp so that
register allocation works. Baby steps.
Teresa Johnson [Thu, 17 Sep 2015 16:19:10 +0000 (16:19 +0000)]
Revert "Function bitcode index in Value Symbol Table and lazy reading support"
Temporarily revert to fix some buildbot issues. One is a minor issue
with a variable unused in NDEBUG mode. More concerning are some test
failures on win7 that I need to dig into.
Daniel Sanders [Thu, 17 Sep 2015 16:08:39 +0000 (16:08 +0000)]
[mips] Add assembler support for the .cprestore directive.
Summary:
This assembler directive is used in O32 PIC to restore the current function's $gp after executing JAL's. The $gp is first stored on the stack at a user-specified offset.
It has the following format: ".cprestore 8" (where 8 is the offset).
Teresa Johnson [Thu, 17 Sep 2015 15:52:30 +0000 (15:52 +0000)]
Function bitcode index in Value Symbol Table and lazy reading support
Summary:
Support for including the function bitcode indices in the Value Symbol
Table. This requires writing the VST after the function blocks, which in
turn requires a new VST forward declaration record encoding the offset of
the full VST (which is backpatched to contain the offset after the VST
is written).
This patch also enables the lazy function reader to use the new function
indices out of the VST. This support will be used by ThinLTO as well, which
will be in a follow on patch. Backwards compatibility with older bitcode
files is maintained.
A new test is also included.
The bitcode format (used for the lazy reader as well as the upcoming
ThinLTO patches) came out of discussions with Duncan and others and is
described here:
https://drive.google.com/file/d/0B036uwnWM6RWdnBLakxmeDdOeXc/view
AVX-512: shufflevector for i1 vectors <2 x i1> .. <64 x i1>
AVX-512 does not provide an instruction that shuffles mask register. So I do the following way:
Diego Novillo [Thu, 17 Sep 2015 00:17:24 +0000 (00:17 +0000)]
GCC AutoFDO profile reader - Initial support.
This adds enough machinery to support reading simple GCC AutoFDO
profiles. It now supports reading flat profiles (no function calls).
Subsequent patches will add support for:
- Inlined calls (in particular, the inline call stack is not traversed
to accumulate samples).
- Working sets and modules. These are used mostly for GCC's LIPO
optimizations, so they're not needed in LLVM atm. I'm not sure that
we will ever need them. For now, I've if0'd around the calls.
The patch also adds support in GCOV.h for gcov version V704 (generated
by GCC's profile conversion tool).
ScalarEvolution: added tmp to avoid use-after-dtor in for loop.
Summary:
For loop destroyed current instance before invoking next.
Temporary variable added to prevent use-after-dtor when invoke
destructor on current instance.
David Majnemer [Wed, 16 Sep 2015 20:42:16 +0000 (20:42 +0000)]
[WinEHPrepare] Turn terminatepad into a cleanuppad + call + cleanupret
The MSVC doesn't really support exception specifications so let's just
turn these into cleanuppads. Later, we might use terminatepad to more
efficiently encode the "noexcept"-ness of a function body.
Summary:
`signum(x)` is sometimes implemented as `(x >> 63) | (-x >>> 63)` (for
an `i64` `x`). This change adds a matcher for that pattern, and an
instcombine rule to optimize `signum(x) s< 1`.
[WinEH] Pull Adjectives and CatchObj out of the catchpad arg list
Clang now passes the adjectives as an argument to catchpad.
Getting the CatchObj working is simply a matter of threading another
static alloca through codegen, first as an alloca, then as a frame
index, and finally as a frame offset.
David Majnemer [Wed, 16 Sep 2015 18:40:24 +0000 (18:40 +0000)]
[WinEHPrepare] Refactor explicit EH preparation
Split the preparation machinery into several functions, we will want to
selectively enable/disable different parts of it for an alternative
mechanism for dealing with cross-funclet uses.
Teresa Johnson [Wed, 16 Sep 2015 18:06:45 +0000 (18:06 +0000)]
Disable the second verification run when performing LTO through
gold in NDEBUG mode.
Follow on patch for r247729 - LTO: Disable extra verify runs in release
builds.
[WinEH] Skip state numbering when no EH pads are present
Otherwise we'd try to emit the thunk that passes the LSDA to
__CxxFrameHandler3. We don't emit the LSDA if there were no landingpads,
so we'd end up with an assembler error when trying to write the COFF
object.
Dan Gohman [Wed, 16 Sep 2015 16:51:30 +0000 (16:51 +0000)]
[WebAssembly] Check in an initial CFG Stackifier pass
This pass implements a simple algorithm for conversion from CFG to
wasm's structured control flow. It doesn't yet handle multiple-entry
loops; that will be added in a future patch.
It also adds initial support for switch statements.
After D10403, we had FMF in the DAG but disabled by default. Nick reported no crashing errors after some stress testing,
so I enabled them at r243687. However, Escha soon notified us of a bug not covered by any in-tree regression tests:
if we don't propagate the flags, we may fail to CSE DAG nodes because differing FMF causes them to not match. There is
one test case in this patch to prove that point.
This patch hopes to fix or leave a 'TODO' for all of the in-tree places where we create nodes that are FMF-capable. I
did this by putting an assert in SelectionDAG.getNode() to find any FMF-capable node that was being created without FMF
( D11807 ). I then ran all regression tests and test-suite and confirmed that everything passes.
This patch exposes remaining work to get DAG FMF to be fully functional: (1) add the flags to non-binary nodes such as
FCMP, FMA and FNEG; (2) add the flags to intrinsics; (3) use the flags as conditions for transforms rather than the
current global settings.