Diana Picus [Wed, 8 Feb 2017 14:23:30 +0000 (14:23 +0000)]
Fix test to work on swift/cyclone too
I forgot to remove the neonfp target feature from the test, which means we'd
have trouble selecting VADDS on targets that have neonfp enabled by default.
NAKAMURA Takumi [Wed, 8 Feb 2017 13:49:28 +0000 (13:49 +0000)]
Revert r294356, "DebugInfo: Track spilled variables in LiveDebugValues"
It caused undefined behavior in VarLoc. As far as I could investigate:
- VarLoc::VarLoc() treats a negative offset value as InvalidKind.
Consider the case where (int64_t)MI.getOperand(1).getImm() is negative, and whether it satisfies ((uint64_t)Offset < (1ULL << 32)).
- Comparison operators in VarLoc have undefined behavior, since VarLoc::Loc.Hash is uninitialized in the InvalidKind case.
I guess Offset (in VarLoc) could be made sign-aware, but I am not sure.
So I have reverted it for now.
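The range check quoted in this message can be tried in isolation. The following is a minimal sketch, not the LLVM code: `passesUnsignedCheck` and `fitsInInt32` are hypothetical names, and the signed variant is only one possible way to make the check sign-aware, not the eventual fix.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the quoted check (not the LLVM code).
// A negative int64_t converts to a uint64_t of at least 2^63, so it can
// never be below 2^32 and always fails this test.
bool passesUnsignedCheck(int64_t Offset) {
  return static_cast<uint64_t>(Offset) < (1ULL << 32);
}

// One possible sign-aware alternative (an assumption for illustration).
bool fitsInInt32(int64_t Offset) {
  return Offset >= INT32_MIN && Offset <= INT32_MAX;
}

// Usage: a small negative stack offset fails the unsigned check but would
// pass a signed one.
void offsetCheckExamples() {
  assert(passesUnsignedCheck(8));
  assert(!passesUnsignedCheck(-8));
  assert(fitsInInt32(-8));
}
```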
A virtual destructor is needed, since the derived classes are stored in
`iplist<PredicateBase> AllInfos;` and, apparently, ilist_node doesn't have a
virtual destructor.
Diana Picus [Wed, 8 Feb 2017 13:23:04 +0000 (13:23 +0000)]
[ARM] GlobalISel: Add FPR reg bank
Add a register bank for floating point values and select simple instructions
using them (add, copies from GPR).
This assumes that the hardware can cope with a single precision add (VADDS)
instruction, so the legalizer will treat G_FADD as legal and the instruction
selector will refuse to select if the hardware doesn't support it. In the future
we'll want to be more careful about this, and legalize to libcalls if we have to
use soft float.
Amara Emerson [Wed, 8 Feb 2017 11:28:08 +0000 (11:28 +0000)]
[AArch64][TableGen] Skip tied result operands for InstAlias
This patch checks the number of operands in the resulting
instruction instead of just the alias, then skips over
tied operands when generating the printing method.
This allows us to generate the preferred assembly syntax
for the AArch64 'ins' instruction, which should always be
displayed as 'mov' according to the ARMARM.
Several unit tests have changed as a result, but only to
reflect the preferred disassembly.
Some other InstAlias patterns (movk/bic/orr) needed a
slight adjustment to stop them becoming the default
and breaking other unit tests.
Sanne Wouda [Wed, 8 Feb 2017 10:20:07 +0000 (10:20 +0000)]
[Assembler] Enable nicer diagnostics for inline assembly.
Summary:
Enables source location in diagnostic messages from the backend. This
is after parsing, during finalization. This requires the SourceMgr, the
inline assembly string buffer, and DiagInfo to still be alive after
EmitInlineAsm returns.
This patch creates a single SourceMgr for inline assembly inside the
AsmPrinter. MCContext gets a pointer to this SourceMgr. Using one
SourceMgr per call to EmitInlineAsm would make it difficult for
MCContext to figure out in which SourceMgr the SMLoc is located, while a
single SourceMgr can figure it out if it has multiple buffers.
The Str argument to EmitInlineAsm is copied into a buffer and owned by
the inline asm SourceMgr. This ensures that DiagHandlers won't print
garbage. (Clang emits a "note: instantiated into assembly here", which
refers to this string.)
The AsmParser gets destroyed before finalization, which means that the
DiagHandlers the AsmParser installs into the SourceMgr will be stale.
Restore the saved DiagHandlers.
Since now we're using just one SourceMgr for multiple inline asm
strings, we need to tell the AsmParser which buffer it needs to parse
currently. Hand a buffer id -- returned from
SourceMgr::AddNewSourceBuffer -- to the AsmParser.
Sam Parker [Wed, 8 Feb 2017 09:44:18 +0000 (09:44 +0000)]
Use dynamic symbols for ELF disassembly
Disassembly currently begins from addresses obtained from the object's
symbol table. For ELF, add the dynamic symbols to the list if no
static symbols are available so that we can more successfully
disassemble stripped binaries.
[ArgPromote] Delete a test that makes no sense (any more).
This test is under 'ArgumentPromotion' but there are no arguments that
get promoted in the test case, so there seems to be no point. Also,
there are no assertions about the output at all, so this seems like
something we should just delete given the low value.
[ArgPromote] Clean up a crash test case by rinsing it through opt,
renaming things to at least have somewhat spelled out names, and even
have meaningful names where I could guess at what they should be.
Also add FileCheck assertions that we're actually doing what we set out
to do for some of the tests, for example not promoting a type that would
result in infinite promotion.
[ArgPromote] Actually add FileCheck to a test that I actually updated to
have nice CHECK patterns instead of relying on a coarse 'not grep'
check. Sorry that I missed this the first time through.
Craig Topper [Wed, 8 Feb 2017 02:54:12 +0000 (02:54 +0000)]
Move mnemonicIsValid to Mips target.
Summary:
The Mips target is the only user of mnemonicIsValid. This patch
moves this method from AsmMatcherEmitter.cpp to MipsAsmParser.cpp,
getting rid of the method in all other targets where it generated
warnings about an unused function.
Daniel Berlin [Wed, 8 Feb 2017 02:35:07 +0000 (02:35 +0000)]
CVP: Make CVP iterate in an order that maximizes reuse of LVI cache
Summary:
After the DFS order change for LVI, I have a few testcases that now
take forever.
The TL;DR - This is mainly due to the overdefined cache, but that
requires predicateinfo to fix[1]
In order to maximize reuse of the LVI cache for now, change the order
we iterate in.
This reduces my testcase from 5 minutes to 4 seconds.
I have verified cases like gmic do not get slower.
I am playing with whether the order should be postorder or idf.
[1] In practice, overdefined anywhere should be overdefined
everywhere, so this cache should be global. That also fixes this bug.
The problem, however, is that LVI relies on this cache being filled in
per-block because it wants different values in different blocks due to
precisely the naming issue that predicateinfo fixes. With
predicateinfo, making the cache global works fine on individual
passes, and also resolves this issue.
Marcos Pividori [Wed, 8 Feb 2017 00:03:31 +0000 (00:03 +0000)]
[libFuzzer] Use long long to ensure 64 bits.
We should always use unsigned long long to ensure 64 bits. On Windows, unsigned
long is 4 bytes. This was the reason why value-profile-cmp4.test was failing on
Windows.
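The width problem can be illustrated without Windows. This is a sketch under assumptions: `storeInFourBytes` is a hypothetical helper that models a 4-byte `unsigned long` with `uint32_t`; it is not taken from libFuzzer.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// The C++ standard guarantees that unsigned long long is at least 64 bits
// wide; unsigned long is only guaranteed to be at least 32 bits.
static_assert(sizeof(unsigned long long) * CHAR_BIT >= 64,
              "unsigned long long must hold 64 bits");

// Hypothetical helper: models storing a 64-bit value in a 4-byte integer,
// which is what a 4-byte unsigned long does. The upper 32 bits are dropped.
uint32_t storeInFourBytes(unsigned long long Value) {
  return static_cast<uint32_t>(Value);
}

// Usage: only the low half of the value survives the narrow store.
void widthExamples() {
  assert(storeInFourBytes(0x1234567890ABCDEFULL) == 0x90ABCDEFu);
}
```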
Marcos Pividori [Wed, 8 Feb 2017 00:03:26 +0000 (00:03 +0000)]
[libFuzzer] Use custom target instead of list of binaries for tests.
Update cmake to use a custom target TestBinaries instead of a list of targets.
This simplifies cmake, and fixes some errors. This way, we don't have to propagate
the values into parent directories. We only need to use add_dependencies.
Marcos Pividori [Wed, 8 Feb 2017 00:03:18 +0000 (00:03 +0000)]
[libFuzzer] Properly use Handle instead of FD on Windows.
For Windows, sanitizers work with Handles, not with POSIX file descriptors,
because they use the Windows-specific API. So we need to convert the fds to
handles before passing them to the sanitizer library.
After this change, close_fd_mask is fixed for Windows (this fixes some tests too).
Marcos Pividori [Wed, 8 Feb 2017 00:02:41 +0000 (00:02 +0000)]
[libFuzzer] Properly configure tests for Windows.
This configuration is necessary, and is included in all test suites.
We need to execute: `config.test_format = lit.formats.ShTest(False)`
Otherwise, lit will try to use bash, which causes many problems.
Marcos Pividori [Wed, 8 Feb 2017 00:02:36 +0000 (00:02 +0000)]
[libFuzzer] Simplify dump_coverage test.
Environment variables are handled differently on Windows. In this case it is not
necessary to use environment variables, so I simplified the test to work on
Windows.
Marcos Pividori [Wed, 8 Feb 2017 00:02:32 +0000 (00:02 +0000)]
[libFuzzer] Update Load test to work on 32 bits.
We should ensure the size of the variable `a` is 8 bytes. Otherwise, this
generates a stack buffer overflow inside the memcpy call on 32-bit machines.
(We write more bytes than the size of `a` when it is 4 bytes.)
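A sketch of the safe pattern, assuming a fixed-width destination. `loadEightBytes` is a hypothetical stand-in for the test's load, and the type `a` originally had is not stated in this message.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the test's load: copies 8 bytes into a local.
uint64_t loadEightBytes(const unsigned char *Data) {
  // uint64_t is 8 bytes on every target, unlike long or size_t, which may
  // be 4 bytes on 32-bit machines.
  uint64_t A = 0;
  static_assert(sizeof(A) == 8, "destination must hold all 8 copied bytes");
  // Copy exactly sizeof(A) bytes so the write can never exceed the object.
  std::memcpy(&A, Data, sizeof(A));
  return A;
}

// Usage: eight 0xFF bytes fill the whole 64-bit value.
void loadExamples() {
  unsigned char Bytes[8];
  std::memset(Bytes, 0xFF, sizeof(Bytes));
  assert(loadEightBytes(Bytes) == UINT64_MAX);
}
```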
Sanjoy Das [Tue, 7 Feb 2017 23:59:07 +0000 (23:59 +0000)]
[IRCE] Add a missing invariant check
Currently IRCE relies on the loops it transforms to be (semantically) of
the form:
  for (i = START; i < END; i++)
    ...
or
  for (i = START; i > END; i--)
    ...
However, we were not verifying the presence of the START < END entry
check (i.e. check before the first iteration). We were only verifying
that the backedge was guarded by (i + 1) < END.
Usually this would work "fine" since (especially in Java) most loops do
actually have the START < END check, but of course that is not
guaranteed.
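To see why the entry check matters, compare a loop that has it with one that only guards the backedge. This is an illustrative sketch in plain C++, not IRCE code; both function names are hypothetical.

```cpp
#include <cassert>

// Canonical form: START < END is tested before the first iteration.
int countWithEntryCheck(int Start, int End) {
  int Iterations = 0;
  for (int I = Start; I < End; I++)
    ++Iterations;
  return Iterations;
}

// Only the backedge is guarded by (I + 1) < End; there is no entry check,
// so the body always runs at least once.
int countBackedgeOnly(int Start, int End) {
  int Iterations = 0;
  int I = Start;
  do {
    ++Iterations;
  } while (++I < End);
  return Iterations;
}

// Usage: the two forms agree when Start < End and differ otherwise.
void loopFormExamples() {
  assert(countWithEntryCheck(0, 3) == countBackedgeOnly(0, 3));
  assert(countWithEntryCheck(5, 3) == 0);
  assert(countBackedgeOnly(5, 3) == 1);
}
```

A transformation that assumes the first form is not valid for the second unless it verifies the entry check.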
Eric Fiselier [Tue, 7 Feb 2017 22:48:20 +0000 (22:48 +0000)]
[CMake] Fix USE_LLVM_SANITIZER configuration for out-of-tree builds.
Summary:
r291918 changed `HandleLLVMOptions.cmake` to add `-fsanitize-blacklist=<llvm-file>` when `LLVM_USE_SANITIZER=Undefined` is specified. This breaks out-of-tree users of `LLVM_USE_SANITIZER` since that file is not present.
This patch fixes the issue by checking if the file exists first.
Wolfgang Pieb [Tue, 7 Feb 2017 21:23:15 +0000 (21:23 +0000)]
DebugInfo: Track spilled variables in LiveDebugValues
When variables are spilled to the stack by the register allocator, keep track of their
debug locations in LiveDebugValues and insert DBG_VALUE instructions at the appropriate
place. Ensure that the locations are propagated down the dominator tree via the existing
mechanisms.
Kevin Enderby [Tue, 7 Feb 2017 21:20:44 +0000 (21:20 +0000)]
Fix a typo in an error message for a check of invalid Mach-O files where
it was printing the field name fileoff instead of filesize. The original check
was added in r278557.
This was found in tracking down the problem that led to the fix in
r293842 - [dsymutil] Fix __LINKEDIT vmsize in dsymutil upgrade path
Daniel Berlin [Tue, 7 Feb 2017 21:10:46 +0000 (21:10 +0000)]
Add PredicateInfo utility and printing pass
Summary:
This patch adds a utility to build extended SSA (see "ABCD: eliminating
array bounds checks on demand"), and an intrinsic to support it. This
is then used to get functionality equivalent to propagateEquality in
GVN, in NewGVN (without having to replace instructions as we go). It
would work similarly in SCCP or other passes. This has been talked
about a few times, so I built a real implementation and tried to
productionize it.
Copies are inserted for operands used in assumes and conditional
branches that are based on comparisons (see below for more)
Every use affected by the predicate is renamed to the appropriate
intrinsic result.
E.g.
  %cmp = icmp eq i32 %x, 50
  br i1 %cmp, label %true, label %false
true:
  ret i32 %x
false:
  ret i32 1
will become
  %cmp = icmp eq i32 %x, 50
  br i1 %cmp, label %true, label %false
true:
  ; Has predicate info
  ; branch predicate info { TrueEdge: 1 Comparison: %cmp = icmp eq i32 %x, 50 }
  %x.0 = call i32 @llvm.ssa_copy.i32(i32 %x)
  ret i32 %x.0
false:
  ret i32 1
(you can use -print-predicateinfo to get an annotated-with-predicateinfo dump)
This enables us to easily determine what operations are affected by a
given predicate, and how operations are affected by a chain of
predicates.
ADT: Add explicit conversions for reverse ilist iterators
Add explicit conversions between forward and reverse ilist iterators.
These follow the conversion conventions of std::reverse_iterator, which
are off-by-one: the newly-constructed "reverse" iterator dereferences to
the previous node of the one sent in. This has the benefit of
converting reverse ranges in place:
- If [I, E) is a valid range,
- then [reverse(E), reverse(I)) gives the same range in reverse order.
ilist_iterator::getReverse() is unchanged: it returns a reverse iterator
to the *same* node.
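The std::reverse_iterator convention that these conversions follow can be checked with std::list. This is a sketch using only the standard library, not the ilist API; the helper names are hypothetical.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

using ListIt = std::list<int>::iterator;

// Converting a forward iterator yields a reverse iterator that dereferences
// to the element before it. Precondition: I is not the list's begin().
int derefAfterConversion(ListIt I) {
  return *std::reverse_iterator<ListIt>(I);
}

// Walks the forward range [I, E) backwards via [reverse(E), reverse(I)).
std::vector<int> reverseRange(ListIt I, ListIt E) {
  return std::vector<int>(std::reverse_iterator<ListIt>(E),
                          std::reverse_iterator<ListIt>(I));
}

// Usage: with I pointing at 2, the converted iterator points at 1, and the
// range [I, end) comes back as 4, 3, 2.
void reverseExamples() {
  std::list<int> Values = {1, 2, 3, 4};
  ListIt I = std::next(Values.begin());
  const std::vector<int> Expected = {4, 3, 2};
  assert(derefAfterConversion(I) == 1);
  assert(reverseRange(I, Values.end()) == Expected);
}
```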
Daniel Berlin [Tue, 7 Feb 2017 19:29:25 +0000 (19:29 +0000)]
This patch adds a ssa_copy intrinsic, as part of splitting up D29316.
Summary:
The intrinsic, marked as returning its first argument, has no code
generation effect (though currently not every optimization pass knows
that intrinsics with the returned attribute can be looked through).
It is about to be used by the PredicateInfo pass to attach
predicate information to existing operands, and be able to tell what
the predicate information affects.
We deliberately do not attach any info through a second operand so
that the intrinsics do not need to dominate the comparisons/etc (since
in the case of assume, we may want to push them up the post-dominator
tree).
Daniel Berlin [Tue, 7 Feb 2017 19:24:26 +0000 (19:24 +0000)]
Replace custom written DFS walk with depth first iterator
Summary:
GenericDomTreeConstruction had its own written DFS walk.
It is basically identical to the DFS walk df_* is doing in the iterators.
The one difference is that df_iterator uses an internal visited set.
The GenericDomTreeConstruction one reused a field in an existing densemap lookup.
Time-wise, this way is actually more cache-friendly (the previous way has a random store
into a successor's info, the new way does that store at the same time and in the same place
as other stores to the same info)
It costs some very small amount of memory to do this, and one we pay in some other part of
dom tree construction *anyway*, so we aren't really increasing dom tree construction's
peak memory usage.
It could still be changed to use the old field with a little work on df_ext_* if we care
(and if someone finds performance regressions)
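For readers unfamiliar with the trade-off, here is a generic depth-first walk that keeps its own visited set instead of reusing a field in per-node info. It is a sketch over a plain adjacency list, not the df_iterator or GenericDomTreeConstruction code; `dfsPreorder` is a hypothetical name.

```cpp
#include <cassert>
#include <vector>

// Returns the nodes reachable from Root in DFS preorder. Succs[N] lists the
// successors of node N. The visited set is separate storage owned by the
// walk, which costs a little memory but touches no per-node info.
std::vector<unsigned> dfsPreorder(
    const std::vector<std::vector<unsigned>> &Succs, unsigned Root) {
  std::vector<bool> Visited(Succs.size(), false);
  std::vector<unsigned> Order;
  std::vector<unsigned> Stack = {Root};
  while (!Stack.empty()) {
    unsigned Node = Stack.back();
    Stack.pop_back();
    if (Visited[Node])
      continue;
    Visited[Node] = true;
    Order.push_back(Node);
    // Push successors in reverse so the first successor is visited first.
    for (auto It = Succs[Node].rbegin(); It != Succs[Node].rend(); ++It)
      if (!Visited[*It])
        Stack.push_back(*It);
  }
  return Order;
}

// Usage: a diamond 0 -> {1, 2}, 1 -> 3, 2 -> 3 is walked as 0, 1, 3, 2.
void dfsExamples() {
  const std::vector<std::vector<unsigned>> Diamond = {{1, 2}, {3}, {3}, {}};
  const std::vector<unsigned> Expected = {0, 1, 3, 2};
  assert(dfsPreorder(Diamond, 0) == Expected);
}
```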
Sanjoy Das [Tue, 7 Feb 2017 19:19:49 +0000 (19:19 +0000)]
[ImplicitNullCheck] Extend Implicit Null Check scope by using stores
Summary:
This change allows store instructions to be used for implicit null checks.
Memory alias analysis is not used, and the change conservatively assumes
that any store and load may access the same memory. As a result,
re-ordering of store-store, store-load and load-store is prohibited.
Matthew Simpson [Tue, 7 Feb 2017 19:17:44 +0000 (19:17 +0000)]
[LV] Simplify ARM/AArch64 interleaved access cost model tests (NFC)
This patch removes unneeded instructions from the existing ARM/AArch64
interleaved access cost model tests. I'll be adding a similar set of tests in a
follow-on patch to increase coverage.
Chris Bieneman [Tue, 7 Feb 2017 19:06:22 +0000 (19:06 +0000)]
[CMake] Move ninja job pool options to HandleLLVMOptions
Moving the Ninja job pool configuration settings into the HandleLLVMOptions module will allow standalone builds of LLVM sub-projects to use the LLVM options without needing to re-implement them.
Reid Kleckner [Tue, 7 Feb 2017 18:42:53 +0000 (18:42 +0000)]
[SDAGISel] Simplify some SDAGISel code, NFC
Hoist entry block code for arguments and swift error values out of the
basic block instruction selection loop. Lowering arguments once up front
seems much more readable than doing it conditionally inside the loop. It
also makes it clear that argument lowering can update StaticAllocaMap
because no instructions have been selected yet.
Pavel Labath [Tue, 7 Feb 2017 18:11:33 +0000 (18:11 +0000)]
[Support] Add FormatVariadic support for chrono types
Summary:
The formatter has three knobs:
- the user can choose which time unit to use for formatting (default: whatever is the unit of the input)
- the user can choose whether the unit gets displayed (default: yes)
- the user can affect the way the number itself is formatted via standard number formatting options (default: default)
Adrian Prantl [Tue, 7 Feb 2017 17:35:41 +0000 (17:35 +0000)]
Fix the bitcode upgrade for DIGlobalVariable in a DIImportedEntity context.
The bitcode upgrade for DIGlobalVariable unconditionally wrapped
DIGlobalVariables in a DIGlobalVariableExpression. When a
DIGlobalVariable is referenced by a DIImportedEntity, however, this is
wrong. This patch fixes the bitcode upgrade by deferring the creation
of DIGlobalVariableExpressions until we know the context of the
DIGlobalVariable.
Artur Pilipenko [Tue, 7 Feb 2017 14:09:37 +0000 (14:09 +0000)]
Add DAGCombiner load combine tests for {a|s}ext, {a|z|s}ext load nodes
Currently we don't support these nodes, so the tests check the current codegen without load combine. This makes the upcoming change that adds support for these nodes easier to review.
Separated from https://reviews.llvm.org/D29591 review.
Christof Douma [Tue, 7 Feb 2017 13:07:12 +0000 (13:07 +0000)]
[ARM] Make RWPI use movw/movt when available
When constructing global address literals while targeting the RWPI
relocation model, LLVM currently only uses literal pools. If MOVW/MOVT
instructions are available we can use these instead. Besides being more
efficient, this allows -arm-execute-only to work with
-relocation-model=RWPI as well.
When we generate MOVW/MOVT for global addresses when targeting the RWPI
relocation model, we need to use base relative relocations. This patch
does the needed plumbing in MC to generate these for MOVW/MOVT.
Daniel Jasper [Tue, 7 Feb 2017 08:57:50 +0000 (08:57 +0000)]
Revert "[DAGCombiner] (add X, (adde Y, 0, Carry)) -> (adde X, Y, Carry)"
This reverts commit r294186.
On an internal test, this triggers an out-of-memory error on PPC,
presumably because there is another dagcombine that does the exact
opposite, triggering an endless loop consuming more and more memory.
Chandler has started creating a reduced test case and we'll attach it
as soon as possible.