Davide Italiano [Sat, 24 Oct 2015 22:09:54 +0000 (22:09 +0000)]
[CodeGen] Get rid of NDEBUG to ensure structure stability.
I think it's fine to keep this fields around in terms of overhead,
I wasn't able to measure any substantial regression while running the
test suite, but, in case this causes some regression I'm ready to revert
and work on an alternative solution.
This was tested building with clang/gcc both in Debug and Release mode
and passes the test-suite.
Yaron Keren [Sat, 24 Oct 2015 19:27:28 +0000 (19:27 +0000)]
Add libuuid to required system libraries list for mingw.
This list is produced by llvm-config --system-libs to be used
by external programs using the llvm libraries, such as creduce.
In r250501 llvm/Support/Windows/Path.inc started to use the constant
FOLDERID_Profile from libuuid.
Simon Pilgrim [Sat, 24 Oct 2015 13:17:26 +0000 (13:17 +0000)]
[X86][XOP] Add support for lowering vector rotations
This patch adds support for lowering to the XOP VPROT / VPROTI vector bit rotation instructions.
This has required changes to the DAGCombiner rotation pattern matching to support vector types - so far I've only changed it to support splat vectors, but generalising this further is feasible in the future.
Sanjoy Das [Sat, 24 Oct 2015 05:37:35 +0000 (05:37 +0000)]
Extract out getConstantRangeFromMetadata; NFC
The loop idiom creating a ConstantRange is repeated twice in the
codebase, time to give it a name and a home.
The loop is also repeated in `rangeMetadataExcludesValue`, but using
`getConstantRangeFromMetadata` there would not be an NFC -- the range
returned by `getConstantRangeFromMetadata` may contain a value that none
of the subranges did.
Rafael Espindola [Fri, 23 Oct 2015 21:48:05 +0000 (21:48 +0000)]
Add a RAW mode to StringTableBuilder.
In this mode it just tries to tail merge the strings without imposing any other
format constrains. It will not, for example, add a null byte between them.
Also add support for keeping a tentative size and offset if we decide to
not optimize after all.
This will be used shortly in lld for merging SHF_STRINGS sections.
Hal Finkel [Fri, 23 Oct 2015 20:37:08 +0000 (20:37 +0000)]
Handle non-constant shifts in computeKnownBits, and use computeKnownBits for constant folding in InstCombine/Simplify
First, the motivation: LLVM currently does not realize that:
((2072 >> (L == 0)) >> 7) & 1 == 0
where L is some arbitrary value. Whether you right-shift 2072 by 7 or by 8, the
lowest-order bit is always zero. There are obviously several ways to go about
fixing this, but the generic solution pursued in this patch is to teach
computeKnownBits something about shifts by a non-constant amount. Previously,
we would give up completely on these. Instead, in cases where we know something
about the low-order bits of the shift-amount operand, we can combine (and
together) the associated restrictions for all shift amounts consistent with
that knowledge. As a further generalization, I refactored all of the logic for
all three kinds of shifts to have this capability. This works well in the above
case, for example, because the dynamic shift amount can only be 0 or 1, and
thus we can say a lot about the known bits of the result.
This brings us to the second part of this change: Even when we know all of the
bits of a value via computeKnownBits, nothing used to constant-fold the result.
This introduces the necessary code into InstCombine and InstSimplify. I've
added it into both because:
1. InstCombine won't automatically pick up the associated logic in
InstSimplify (InstCombine uses InstSimplify, but not via the API that
passes in the original instruction).
2. Putting the logic in InstCombine allows the resulting simplifications to become
part of the iterative worklist
3. Putting the logic in InstSimplify allows the resulting simplifications to be
used by everywhere else that calls SimplifyInstruction (inlining, unrolling,
and many others).
And this requires a small change to our definition of an ephemeral value so
that we don't break the rest case from r246696 (where the icmp feeding the
@llvm.assume, is also feeding a br). Under the old definition, the icmp would
not be considered ephemeral (because it is used by the br), but this causes the
assume to remove itself (in addition to simplifying the branch structure), and
it seems more-useful to prevent that from happening.
Tim Northover [Fri, 23 Oct 2015 20:30:02 +0000 (20:30 +0000)]
GVN: don't try to replace instruction with itself.
After some look-ahead PRE was added for GEPs, an instruction could end
up in the table of candidates before it was actually inspected. When
this happened the pass might decide it was the best candidate to
replace itself. This didn't go well.
Sanjoy Das [Fri, 23 Oct 2015 20:09:57 +0000 (20:09 +0000)]
[SCEV] Fix stylistic issue in MatchBinaryAddToConst; NFCI
Instead of checking `(FlagsPresent & ExpectedFlags) != 0`, check
`(FlagsPresent & ExpectedFlags) == ExpectedFlags`. Right now they're
equivalent since `ExpectedFlags` can only be either `FlagNUW` or
`FlagNSW`, but if we ever pass in `ExpectedFlags` as `FlagNUW | FlagNSW`
then checking `(FlagsPresent & ExpectedFlags) != 0` would be wrong.
Sanjoy Das [Fri, 23 Oct 2015 20:09:55 +0000 (20:09 +0000)]
[Inliner] Don't inline through callsites with operand bundles
Summary:
This change teaches the LLVM inliner to not inline through callsites
with unknown operand bundles. Currently all operand bundles are
"unknown" operand bundles but in the near future we will add support for
inlining through some select kinds of operand bundles.
Reid Kleckner [Fri, 23 Oct 2015 19:35:38 +0000 (19:35 +0000)]
[X86] Clean up the tail call eligibility logic
Summary:
The logic here isn't straightforward because our support for
TargetOptions::GuaranteedTailCallOpt.
Also fix a bug where we were allowing tail calls to cdecl functions from
fastcall and vectorcall functions. We were special casing thiscall and
stdcall callers rather than checking for any convention that requires
clearing stack arguments before returning.
Oleg Ranevskyy [Fri, 23 Oct 2015 17:17:59 +0000 (17:17 +0000)]
[ARM CodeGen] @llvm.debugtrap call may be removed when restoring callee saved registers
Summary:
When ARMFrameLowering::emitPopInst generates a "pop" instruction to restore the callee saved registers, it checks if the LR register is among them. If so, the function may decide to remove the basic block's terminator and replace it with a "pop" to the PC register instead of LR.
This leads to a problem when the block's terminator is preceded by a "llvm.debugtrap" call. The MI iterator points to the trap in such a case, which is also a terminator. If the function decides to restore LR to PC, it erroneously removes the trap.
Joseph Tremoulet [Fri, 23 Oct 2015 15:06:05 +0000 (15:06 +0000)]
[CodeGen] Mark setjmp/catchret MBBs address-taken
Summary:
This ensures that BranchFolding (and similar) won't remove these blocks.
Also allow AsmPrinter::EmitBasicBlockStart to process MBBs which are
address-taken but do not have BBs that are address-taken, since otherwise
its call to getAddrLabelSymbolTableToEmit would fail an assertion on such
blocks. I audited the other callers of getAddrLabelSymbolTableToEmit
(and getAddrLabelSymbol); they all have BBs known to be address-taken
except for the call through getAddrLabelSymbol from
WinException::create32bitRef; that call is actually now unreachable, so
I've removed it and updated the signature of create32bitRef.
James Molloy [Fri, 23 Oct 2015 14:17:03 +0000 (14:17 +0000)]
[BasicAA] Bugfix for r251016
If the loaded type sizes don't match the element type of the sequential type, all bets are off and the addresses may, indeed, overlap.
Surprisingly, this just got caught in one test, on one builder, out of the 30+ builders testing this change. Congratulations go to http://lab.llvm.org:8011/builders/clang-aarch64-lnt/builds/5205.
Sanjoy Das [Fri, 23 Oct 2015 06:33:47 +0000 (06:33 +0000)]
[SCEV] Fix a latent bug in `getPreStartForExtend`
I could not come up a way to test this -- I think this bug is latent
today, and will not actually result in a miscompile.
In `getPreStartForExtend`, SCEV constructs `PreStart` as a sum of all of
`SA`'s operands except `Op`. It also uses `SA`'s no-wrap flags, and
this is problematic because removing an element from an add expression
can make it signed-wrap. E.g. if `SA` was `(127 + 1 + -1)`, then it
could safely be `<nsw>` (since `sext(127) + sext(1) + sext(-1)` ==
`sext(127 + 1 + -1)`), but `(127 + 1)` (== `PreStart` if `Op` is `-1`)
is not `<nsw>`.
Transferring `<nuw>` from `SA` to `PreStart` is safe, as far as I can
tell.
Add more intrumentation/runtime helper interfaces (NFC)
This patch converts the remaining references to literal
strings for names of profile runtime entites (such as
profile runtime hook, runtime hook use function, profile
init method, register function etc).
Also added documentation for all the new interfaces.
Chen Li [Thu, 22 Oct 2015 20:48:38 +0000 (20:48 +0000)]
[SimplifyCFG] Extend SimplifyResume to handle phi of trivial landing pad.
Summary: Currently SimplifyResume can convert an invoke instruction to a call instruction if its landing pad is trivial. In practice we could have several invoke instructions with trivial landing pads and share a common rethrow block, and in the common rethrow block, all the landing pads join to a phi node. The patch extends SimplifyResume to check the phi of landing pad and their incoming blocks. If any of them is trivial, remove it from the phi node and convert the invoke instruction to a call instruction.
Add helper functions and remove hard coded references to instProf related name/name-prefixes
This is a clean up patch that defines instr prof section and variable
name prefixes in a common header with access helper functions.
clang FE change will be done as a follow up once this patch is in.
David Majnemer [Thu, 22 Oct 2015 20:29:08 +0000 (20:29 +0000)]
[Sink] Don't check BB.empty()
As an invariant, BasicBlocks cannot be empty when passed to a transform.
This is not the case for MachineBasicBlocks and the Sink pass was ported
from the MachineSink pass which would explain the check's existence.
Sanjoy Das [Thu, 22 Oct 2015 19:57:41 +0000 (19:57 +0000)]
[SCEV] Remove a test case added in r249168
The test case wasn't testing what it was commented to be testing; and
when I tried to fix the test I noticed that SCEV does not support the
simplification that the test was supposed to test.
This change removes the test case to avoid confusion.
Sanjoy Das [Thu, 22 Oct 2015 19:57:34 +0000 (19:57 +0000)]
[SCEV] Opportunistically interpret unsigned constraints as signed
Summary:
An unsigned comparision is equivalent to is corresponding signed version
if both the operands being compared are positive. Teach SCEV to use
this fact when profitable.
Alexey Samsonov [Thu, 22 Oct 2015 19:51:59 +0000 (19:51 +0000)]
[ASan] Minor fixes to dynamic allocas handling:
* Don't instrument promotable dynamic allocas:
We already have a test that checks that promotable dynamic allocas are
ignored, as well as static promotable allocas. Make sure this test will
still pass if/when we enable dynamic alloca instrumentation by default.
* Handle lifetime intrinsics before handling dynamic allocas:
lifetime intrinsics may refer to dynamic allocas, so we need to emit
instrumentation before these dynamic allocas would be replaced.
Tim Northover [Thu, 22 Oct 2015 17:20:48 +0000 (17:20 +0000)]
CodeGen: increase bits allocated for LegalizeActions
The array handling CondCodes only allocated 2 bits to describe the
desired action for each type. The new addition of a "LibCall" option
overflowed this and caused corruption for Custom actions.
No in-tree targets have a Custom CondCodeAction, so unfortunately it
can't be tested.
Zia Ansari [Thu, 22 Oct 2015 16:14:45 +0000 (16:14 +0000)]
[X86] - Catch extra combine opportunities for redundant imuls.
When we fold "mul ((add x, c1), c1)" -> "add ((mul x, c2), c1*c2)", we bail if (add x, c1) has multiple
users which would result in an extra add instruction.
In such cases, this patch adds a check to see if we can eliminate a multiply instruction in exchange for the extra add.
I also added the capability of doing the existing optimization with non-splatted vectors (splatted also works).
Bill Schmidt [Thu, 22 Oct 2015 15:53:44 +0000 (15:53 +0000)]
[PPC] Fix PR24686 by failing assembly for an invalid relocation
PR24686 identifies a problem where a relocation expression is invalid
when not all of the symbols in the expression can be locally
resolved. This causes the compiler to request a PC-relative half16ds
relocation, which is nonsensical for PowerPC. This patch recognizes
this situation and ensures we fail the assembly cleanly.
Summary:
Hyphens were missing from the triple, causing it to be parsed
incorrectly. This patch updates the triple and makes necessary
changes to the expected output.
James Molloy [Thu, 22 Oct 2015 13:44:26 +0000 (13:44 +0000)]
[GlobalsAA] Loosen an overly conservative bailout
Instead of bailing out when we see loads, analyze them. If we can prove that the loaded-from address must escape, then we can conclude that a load from that address must escape too and therefore cannot alias a non-addr-taken global.
When checking if a Value can alias a non-addr-taken global, if the Value is a LoadInst of a non-global, recurse instead of bailing.
If we can follow a trail of loads up to some base that is captured, we know by inference that all the loads we followed are also captured.
James Molloy [Thu, 22 Oct 2015 13:18:42 +0000 (13:18 +0000)]
[ValueTracking] Add a new predicate: isKnownNonEqual()
isKnownNonEqual(A, B) returns true if it can be determined that A != B.
At the moment it only knows two facts, that a non-wrapping add of nonzero to a value cannot be that value:
A + B != A [where B != 0, addition is nsw or nuw]
and that contradictory known bits imply two values are not equal.
This patch also hooks this up to InstSimplify; InstSimplify had a peephole for the first fact but not the second so this teaches InstSimplify a new trick too (alas no measured performance impact!)
Manuel Klimek [Thu, 22 Oct 2015 08:31:46 +0000 (08:31 +0000)]
Fix add_llvm_external_project.
r250835 unintentionally discarded the optional parameter to the
add_llvm_external_project() macro that may point to a path when the said
path is different from ${name}. This should fix it by passing ${ARGN} on
to add_llvm_subdirectory(). The problem manifests itself with e.g.
add_llvm_external_project(clang-tools-extra extra) from
clang/tools/CMakeLists.txt