Alex Lorenz [Thu, 20 Aug 2015 00:12:57 +0000 (00:12 +0000)]
MIR Serialization: Change syntax for the call entry pseudo source values.
The global IR values in machine memory operands should use the global value
'@<name>' syntax instead of the current '%ir.<name>' syntax.
However, the global value call entry pseudo source values use the global value
syntax already. Therefore, the syntax for the call entry pseudo source values
has to be changed so that the global values and call entry global value PSVs
can be parsed without ambiguities.
David Blaikie [Wed, 19 Aug 2015 23:07:27 +0000 (23:07 +0000)]
Allow Optionals to be compared to None
This is something like nullopt in std::experimental::optional. Optional
could already be constructed from None, so this seems like an obvious
extension from there.
I have a use in a future patch for Clang, though it may not go that
way/end up used - so this seemed worth committing now regardless.
Juergen Ributzka [Wed, 19 Aug 2015 20:52:55 +0000 (20:52 +0000)]
[AArch64][FastISel] Don't fold shifts with UB.
We are already falling back to SelectionDAG when encountering an shift with UB.
This adds the same checks for shifts with UB that get folded into arithmetic or
logical operations.
Dan Gohman [Wed, 19 Aug 2015 20:30:20 +0000 (20:30 +0000)]
[WebAssembly] Use the default alignment for SIMD types.
Previously WebAssembly's datalayout string had -v128:8:128. This had been an
attempt to declare a certain level of support for unaligned SIMD accesses.
However, clang makes its own determinations for SIMD alignment that are
independent of the datalayout string, so this wasn't actually meaningful.
Simon Pilgrim [Wed, 19 Aug 2015 20:09:50 +0000 (20:09 +0000)]
[DAGCombiner] Fold CONCAT_VECTORS of EXTRACT_SUBVECTOR (or undef) to VECTOR_SHUFFLE.
Check to see if this is a CONCAT_VECTORS of a bunch of EXTRACT_SUBVECTOR operations. If so, and if the EXTRACT_SUBVECTOR vector inputs come from at most two distinct vectors the same size as the result, attempt to turn this into a legal shuffle.
Alex Lorenz [Wed, 19 Aug 2015 19:05:34 +0000 (19:05 +0000)]
MIR Serialization: Serialize instruction's register ties.
This commit serializes the machine instruction's register operand ties.
The ties are printed out only when the instructon has register ties that are
different from the ties that are specified in the instruction's description.
Nemanja Ivanovic [Wed, 19 Aug 2015 19:04:47 +0000 (19:04 +0000)]
Temporary fix for the self-host failures introduced by rL244921.
This revision has introduced an issue that only affects bootstrapped compiler
when it is printing the ASM. I am working on resolving the issue, but in the
meantime, I'm disabling the legalization of scalar_to_vector operation for v2i64
and the associated testing until I can get this fixed.
Alex Lorenz [Wed, 19 Aug 2015 18:55:47 +0000 (18:55 +0000)]
MIR Serialization: Serialize defined registers that require 'def' register flag.
The defined registers are already serialized - they are represented by placing
them before the '=' in a machine instruction. However, certain instructions like
INLINEASM can have defined register operands after the '=', so this commit
introduces the 'def' register flag for such operands.
[PeepholeOptimizer] Look through PHIs to find additional register sources
Reintroduce r245442. Remove an overly conservative assertion introduced
in r245442. We could replace the assertion to use `shareSameRegisterFile`
instead, but in that point in `insertPHI` we already lost the original
Def subreg to check against. So drop the assertion completely.
Original commit message:
- Teaches the ValueTracker in the PeepholeOptimizer to look through PHI
instructions.
- Add findNextSourceAndRewritePHI method to lookup into multiple sources
returnted by the ValueTracker and rewrite PHIs with new sources.
With these changes we can find more register sources and rewrite more
copies to allow coaslescing of bitcast instructions. Hence, we eliminate
unnecessary VR64 <-> GR64 copies in x86, but it could be extended to
other archs by marking "isBitcast" on target specific instructions. The
x86 example follows:
Ahmed Bougacha [Wed, 19 Aug 2015 17:40:19 +0000 (17:40 +0000)]
[AArch64] Improve short-form diags on long-form Match_InvalidOperand.
Since r244955, we try to use the short-form ErrorInfo when both
tries failed, and the long-form match failed on a suffix operand.
However, this means we sometimes mix ErrorInfo and MatchResult
(one manifestation of this being PR24498). Instead, restore both.
Hal Finkel [Wed, 19 Aug 2015 17:26:07 +0000 (17:26 +0000)]
[SCEV] Fix GCC 4.8.0 ICE in lambda function
Rewrite some code to not use a lambda function. The non-lambda code is just
about as clean as the original, and not any longer. The lambda function causes
an internal compiler error in GCC 4.8.0, and it is not worth breaking support
for that compiler over this. NFC.
[PeepholeOptimizer] Look through PHIs to find additional register sources
Reapply r243486.
- Teaches the ValueTracker in the PeepholeOptimizer to look through PHI
instructions.
- Add findNextSourceAndRewritePHI method to lookup into multiple sources
returnted by the ValueTracker and rewrite PHIs with new sources.
With these changes we can find more register sources and rewrite more
copies to allow coaslescing of bitcast instructions. Hence, we eliminate
unnecessary VR64 <-> GR64 copies in x86, but it could be extended to
other archs by marking "isBitcast" on target specific instructions. The
x86 example follows:
Silviu Baranga [Wed, 19 Aug 2015 14:11:27 +0000 (14:11 +0000)]
[ARM] Add instruction selection patterns for vmin/vmax
Summary:
The mid-end was generating vector smin/smax/umin/umax nodes, but
we were using vbsl to generatate the code. This adds the vmin/vmax
patterns and a test to check that we are now generating vmin/vmax
instructions.
[X86] Do not lower scalar sdiv/udiv to a shifts + mul sequence when optimizing for minsize
There are some cases where the mul sequence is smaller, but for the most part,
using a div is preferable. This does not apply to vectors, since x86 doesn't
have vector idiv, and a vector mul/shifts sequence ought to be smaller than a
scalarized division.
This removes the isPow2SDivCheap() query, as it is not currently used in
any meaningful way. isIntDivCheap() no longer relies on a state variable
(as all in-tree target set it to false), but the interface allows querying
based on the type optimization level.
Chandler Carruth [Wed, 19 Aug 2015 03:02:12 +0000 (03:02 +0000)]
[LPM] Teach the legacy pass manager to support *using* an analysis
without *requiring* it.
This allows a pass indicate that it will use an analysis if available
(through getAnalysisIfAvailable). When the pass manager knows this, it
will refrain from deleting that analysis if it can. Naturally, it will
still get invalidated at the correct time. These passes are not
considered when scheduling the pass pipeline, so typically they will
require manual scheduling, but this may also allow passes with
getAnalysisIfAvailable to find the analysis more often if nothing after
them requires that analysis and it wasn't invalidated.
I don't have a particular use case with the current passes, but with my
new structure for alias analyses, this will be very useful. We want to
allow people to customize the set of AAs available by scheduling
additional passes. These's aren't ever *required* for obvious reasons.
So we need some way to mark in the legacy pass manager that they will
still be used if available.
This is essentially how analysis groups already work. But this makes the
feature generally available and more explicit. It should allow the AA
change to not impact how people trigger a custom alias analysis being
available at a certain point in compilation.
Hal Finkel [Wed, 19 Aug 2015 02:56:36 +0000 (02:56 +0000)]
Fix how DependenceAnalysis calls delinearization
Fix how DependenceAnalysis calls delinearization, mirroring what is done in
Delinearization.cpp (mostly by making sure to call getSCEVAtScope before
delinearizing, and by removing the unnecessary 'Pairs == 1' check).
Hal Finkel [Wed, 19 Aug 2015 01:51:51 +0000 (01:51 +0000)]
Make ScalarEvolution::isKnownPredicate a little smarter
Here we make ScalarEvolution::isKnownPredicate, indirectly, a little smarter.
Given some relational comparison operator OP, and two AddRec SCEVs, {I,+,S} OP
{J,+,T}, we can reduce this to the comparison I OP J when S == T, both AddRecs
are for the same loop, and both are known not to wrap.
As it turns out, because of the way that backedge-guard expressions can be
leveraged when computing known predicates, this allows indvars to simplify the
if-statement comparison in this loop:
void foo (int *a, int *b, int n) {
for (int i = 0; i < n; ++i) {
if (i > n)
a[i] = b[i] + 1;
}
}
which, somewhat surprisingly, we were not previously optimizing away.
Alex Lorenz [Tue, 18 Aug 2015 22:52:15 +0000 (22:52 +0000)]
MIR Serialization: Serialize the operand's bit mask target flags.
This commit adds support for bit mask target flag serialization to the MIR
printer and the MIR parser. It also adds support for the machine operand's
target flag serialization to the AArch64 target.
Sanjay Patel [Tue, 18 Aug 2015 22:48:12 +0000 (22:48 +0000)]
use TLI.allowsMemoryAccess() to check if memory accesses are fast; NFCI
This consolidates use of isUnalignedMem32Slow() in one place.
There is a slight change in logic although I'm not sure that it would ever
come up in the real world: we were assuming that an alignment of the type
size is always fast; now, we actually check the data layout to confirm that.
Remove support for Valgrind-based TSan, which hasn't been maintained for a
few years. We now use the TSan annotations only if LLVM is compiled with
-fsanitize=thread. We no longer need the weak function definitions as we
are guaranteed that our program is linked directly with the TSan runtime.
Alex Lorenz [Tue, 18 Aug 2015 22:18:52 +0000 (22:18 +0000)]
MIR Parser: Extract the code that parses stack object references into a new
method.
This commit extracts the code that parses the stack object references into a
new method named 'parseStackFrameIndex', so that it can be reused when
parsing standalone stack object references.
Load/store instructions for floating points with address space require SparcV9.
To properly handle this, define the *a instructions as separate
instruction classes by refactoring the LoadA and StoreA multiclasses.
Move the instruction tests into the sparcv9 file to test the difference.
The current code normalizes select(C0, x, select(C1, x, y)) towards
select(C0|C1, x, y) if the targets prefers that form. This patch adds an
additional rule that if the select(C1, x, y) part already exists in the
function then we want to normalize into the other direction because the
effects of reusing the existing value are bigger than transforming into
the target preferred form.
This addresses regressions following r238793, see also:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150727/290272.html
Chandler Carruth [Tue, 18 Aug 2015 20:28:40 +0000 (20:28 +0000)]
[PM/AA] Add using declarations to avoid hiding virtual overloads.
Note that this actually has no functional change -- we never call these
methods using the derived type. But it is still cleaner and fixes a GCC
warning.
Spotted by Dave in code review and the warning spotted by Joerg on IRC.
David Majnemer [Tue, 18 Aug 2015 19:07:12 +0000 (19:07 +0000)]
[WinEH] Calculate state numbers for the new EH representation
State numbers are calculated by performing a walk from the innermost
funclet to the outermost funclet. Rudimentary support for the new EH
constructs has been added to the assembly printer, just enough to test
the new machinery.
Chandler Carruth [Tue, 18 Aug 2015 17:51:53 +0000 (17:51 +0000)]
[PM/AA] Remove the last relics of the separate IPA library from LLVM,
folding the code into the main Analysis library.
There already wasn't much of a distinction between Analysis and IPA.
A number of the passes in Analysis are actually IPA passes, and there
doesn't seem to be any advantage to separating them.
Moreover, it makes it hard to have interactions between analyses that
are both local and interprocedural. In trying to make the Alias Analysis
infrastructure work with the new pass manager, it becomes particularly
awkward to navigate this split.
I've tried to find all the places where we referenced this, but I may
have missed some. I have also adjusted the C API to continue to be
equivalently functional after this change.
[LVI] Use a SmallDenseMap instead of std::map for ValueCacheEntryTy
Historically there seems to be some resistance regarding the change to DenseMap
(r147980). However, I couldn't find cases of iterator invalidation for
ValueCacheEntryTy, but only for ValueCache, which I left untouched.
This reduces 20s on an internal testcase. Follow up from r245309.
[LVI] Improve LazyValueInfo compile time performance
Changes in LoopUnroll in the past six months exposed scalability
issues in LazyValueInfo when used from JumpThreading. One internal test
that used to take 20s under -O2 now takes 6min.
This commit change the OverDefinedCache from
DenseSet<std::pair<AssertingVH<BasicBlock>, Value*>> to
DenseMap<AssertingVH<BasicBlock>, SmallPtrSet<Value *, 4>>
and reduces compile time down to 1m40s.
Davide Italiano [Tue, 18 Aug 2015 16:05:13 +0000 (16:05 +0000)]
[MC] Convert another bunch of tests from macho-dump to llvm-readobj.
This is (almost) everything under MC/MachO/ARM. There are still some
cases missing, because llvm-readobj doesn't (yet) support some features,
that macho-dump provides. I plan to reduce the gap between them shortly.
Daniel Sanders [Tue, 18 Aug 2015 12:33:54 +0000 (12:33 +0000)]
[mips] Make the MipsAsmParser capable of knowing whether PIC mode is enabled or not.
Summary:
This information is needed to decide whether we do the PIC-only JAL expansions or not. It's also needed for an upcoming patch which implements the .cprestore assembler directive (which can only be used effectively in PIC mode).
By making this information available to the MipsAsmParser, we will know when to insert the instructions mandated by the .cprestore assembler directive and we will be able to give some useful warnings when we encounter a potential misuse of this directive.
Michael Kruse [Tue, 18 Aug 2015 12:17:37 +0000 (12:17 +0000)]
[Support] On Windows, generate PDF files for graphs and open with associated viewer
Summary: Windows system rarely have good PostScript viewers installed, but PDF viewers are common. So for viewing graphs, generate PDF files and open with the associated PDF viewer using cmd.exe's start command.
Michael Kruse [Tue, 18 Aug 2015 12:13:57 +0000 (12:13 +0000)]
[Support] Always wait for GraphViz before opening the viewer
Summary:
When calling DisplayGraph and a PS viewer is chosen, two programs are executed: The GraphViz generator and the PostScript viewer. Always for the generator to finish to ensure that the .ps file is written before opening the viewer for that file. DisplayGraph's wait parameter refers to whether to wait until the user closes the viewer.
This happened on Windows and if none of the options to open the .dot file directly applies, also on Linux.
Piotr Padlewski [Tue, 18 Aug 2015 03:55:30 +0000 (03:55 +0000)]
Constant propagation after hiting llvm.assume
After hitting @llvm.assume(X) we can:
- propagate equality that X == true
- if X is icmp/fcmp (with eq operation), and one of operand
is constant we can change all variables with constants in the same BasicBlock