Aaron Ballman [Mon, 14 Nov 2016 13:33:51 +0000 (13:33 +0000)]
Reverting r285406, which was a temporary workaround to get one of the documentation bots upgraded to something newer than GCC 4.7. This restores the check for GCC 4.8.
James Molloy [Mon, 14 Nov 2016 11:14:41 +0000 (11:14 +0000)]
[InlineCost] Remove skew when calculating call costs
When calculating the cost of a call instruction we were applying a heuristic penalty as well as the cost of the instruction itself.
However, when calculating the benefit from inlining we weren't discounting the equivalent penalty for the call instruction that would be removed! This caused skew in the calculation and meant we wouldn't inline in the following, trivial case:
Pablo Barrio [Mon, 14 Nov 2016 10:24:26 +0000 (10:24 +0000)]
[JumpThreading] Prevent non-deterministic use lists
Summary:
Unfolding selects was previously done with the help of a vector
of pointers that was then sorted to be able to remove duplicates.
As this sorting depends on the memory addresses, it was
non-deterministic. A SetVector is used now so that duplicates are
removed without the need of sorting first.
Eric Fiselier [Mon, 14 Nov 2016 07:26:17 +0000 (07:26 +0000)]
Add explicit (void) cast to unused unique_ptr::release() results
Summary:
This patch adds explicit `(void)` casts to discarded `release()` calls to suppress -Wunused-result.
This patch fixes *all* warnings are generated as a result of [applying `[[nodiscard]]` within libc++](https://reviews.llvm.org/D26596).
Similar fixes were applied to Clang in r286796.
Only attempt to demangle symbols which have the itanium C++ prefix of `_Z`.
This ensures that we do not treat any symbol name as a managled named. We would
previously treat a C function `f` as a mangled name and decode that to `float`
incorrectly.
While it is easy to add tests for this, Mehdi recommended against introducing
tests for the demangler as libc++abi should cover the testing.
Craig Topper [Mon, 14 Nov 2016 01:53:29 +0000 (01:53 +0000)]
[X86] Cleanup 'x' and 'y' mnemonic suffixes for vcvtpd2dq/vcvttpd2dq/vcvtpd2ps and similar instructions.
-Don't print the 'x' suffix for the 128-bit reg/mem VEX encoded instructions in Intel syntax. This is consistent with the EVEX versions.
-Don't print the 'y' suffix for the 256-bit reg/reg VEX encoded instructions in Intel or AT&T syntax. This is consistent with the EVEX versions.
-Allow the 'x' and 'y' suffixes to be used for the reg/mem forms when we're assembling using Intel syntax.
-Allow the 'x' and 'y' suffixes on the reg/reg EVEX encoded instructions in Intel or AT&T syntax. This is consistent with what VEX was already allowing.
Sanjoy Das [Sun, 13 Nov 2016 23:40:40 +0000 (23:40 +0000)]
[LangRef] Drop misleading anecdote
`shl nsw i8 1, i8 8` is poison, but `mul i8 1, i8 128` is not.
This was discussed previously here:
http://lists.llvm.org/pipermail/llvm-dev/2015-April/084195.html. From
the discussion, it was not clear which semantics we want for `shl`, but
for now at least make the language reference more accurate.
`c++filt` when given no arguments runs as a REPL, decoding each line as a
decorated name. Unify the test structure to be more uniform, with the tests for
llvm-cxxfilt living under test/tools/llvm-cxxfilt.
Sanjay Patel [Sun, 13 Nov 2016 20:04:52 +0000 (20:04 +0000)]
[ValueTracking] recognize even more variants of smin/smax
Similar to:
https://reviews.llvm.org/rL285499
https://reviews.llvm.org/rL286318
We can't minimally expose this in IR tests because we don't have min/max intrinsics,
but the difference is visible in codegen because SelectionDAGBuilder::visitSelect()
uses matchSelectPattern().
We're not canonicalizing these patterns in IR (yet), so I don't expect there to be any
regressions as noted here:
http://lists.llvm.org/pipermail/llvm-dev/2016-November/106868.html
Matt Arsenault [Sun, 13 Nov 2016 18:20:54 +0000 (18:20 +0000)]
AMDGPU: Implement SGPR spilling with scalar stores
nThis avoids the nasty problems caused by using
memory instructions that read the exec mask while
spilling / restoring registers used for control flow
masking, but only for VI when these were added.
This always uses the scalar stores when enabled currently,
but it may be better to still try to spill to a VGPR
and use this on the fallback memory path.
The cache also needs to be flushed before wave termination
if a scalar store is used.
Analysis: Simplify the ScalarEvolution::getGEPExpr() interface. NFCI.
All existing callers were manually extracting information out of an existing
GEP instruction and passing it to getGEPExpr(). Simplify the interface by
changing it to take a GEPOperator instead.
Lang Hames [Sat, 12 Nov 2016 23:12:41 +0000 (23:12 +0000)]
[ORC] Remove the 'const' qualifier from the member function wrapper, make the
lambda in wrapHandler mutable to allow it to pass the handler through as a
non-const value.
llvm-strings: trivialise logic until we support more options
Until we have handling for ignoring unloaded sections, simplify the logic to
the point of triviality. This fixes the scanning of archives, particularly when
embedded in archives.
Craig Topper [Sat, 12 Nov 2016 18:04:46 +0000 (18:04 +0000)]
[AVX-512] Remove the remaining masked shift by immediate or by single value. Autoupgrade them to recently introduced unmasked versions and a select.
After this I'll add the unmasked intrinsics to InstCombineCalls to finish making our handling of these types of shuffles consistent between AVX-512 and the legacy intrinsics.
Michal Gorny [Sat, 12 Nov 2016 14:58:30 +0000 (14:58 +0000)]
[OCaml] Clear cross-target test deps when building out-of-tree
Clear cross-target test dependencies when using LLVM_OCAML_OUT_OF_TREE,
in order to make it possible to run check-llvm-bindings-ocaml without
rebuilding the whole LLVM.
Craig Topper [Sat, 12 Nov 2016 05:28:24 +0000 (05:28 +0000)]
[AVX-512] Add unmasked version of shift by immediate and shift by single element in XMM.
Summary:
This is the first step towards being able to add the avx512 shift by immediate intrinsics to InstCombineCalls where we aleady support the sse2 and avx2 intrinsics. We need to the unmasked versions so we can avoid having to teach InstCombineCalls that it would need to insert selects sometimes. Instead we'll just add the selects around the new instrinsics in the frontend.
This change should also enable the shift by i32 intrinsics to take a non-constant shift value just like the avx2 and sse intrinsics. This will enable us to fix PR30691 once we update clang.
Next I'll switch clang to use the new builtins. Then we'll come back to the backend and remove/autoupgrade the old intrinsics. Then I'll work on the same series for variable shifts.
Craig Topper [Sat, 12 Nov 2016 05:05:27 +0000 (05:05 +0000)]
[AVX-512] Add support for lowering shuffles to VALIGND/VALIGNQ
Summary: VALIGND and VALIGNQ are similar to PALIGNR but instead of working on a 128-bit lane they work on the entire vector register. This change leverages the shuffle rotate detection code used for PALIGNR to detect these cases.
Lang Hames [Sat, 12 Nov 2016 02:19:31 +0000 (02:19 +0000)]
[ORC] Add a WrappedHandlerReturn type to map handler return types onto error
return types.
This class allows user provided handlers to return either error-wrapped types
or plain types. In the latter case, the plain type is wrapped with a success
value of Error or Expected<T> type to fit it into the rest of the serialization
machinery.
This patch allows us to remove the RPC unit-test workaround added in r286646.
Zachary Turner [Fri, 11 Nov 2016 23:57:40 +0000 (23:57 +0000)]
[Support] Introduce llvm::formatv() function.
This introduces a new type-safe general purpose formatting
library. It provides compile-time type safety, does not require
a format specifier (since the type is deduced), and provides
mechanisms for extending the format capability to user defined
types, and overriding the formatting behavior for existing types.
This patch additionally adds documentation for the API to the
LLVM programmer's manual.
Mailing List Thread:
http://lists.llvm.org/pipermail/llvm-dev/2016-October/105836.html
Rui Ueyama [Fri, 11 Nov 2016 23:41:13 +0000 (23:41 +0000)]
Define DbiStreamBuilder::addSectionContribs.
This patch defines a new function to add a SectionContribs stream
to a PDB file. Unlike SectionMap, SectionContribs contains a list
of input sections as opposed to output sections.
Note that this patch needs improving because currently we do not
set Module field in SectionContribs entries. In a follow-up patch,
I'll add Modules and then fix it after that.
Tom Stellard [Fri, 11 Nov 2016 23:35:42 +0000 (23:35 +0000)]
AMDGPU/SI: Fix visit order assumption in SIFixSGPRCopies
Summary:
This pass was assuming that when a PHI instruction defined a register
used by another PHI instruction that the defining insstruction would
be legalized before the using instruction.
This assumption was causing the pass to not legalize some PHI nodes
within divergent flow-control.
Anna Zaks [Fri, 11 Nov 2016 23:01:02 +0000 (23:01 +0000)]
[tsan][llvm] Implement the function attribute to disable TSan checking at run time
This implements a function annotation that disables TSan checking for the
function at run time. The benefit over attribute((no_sanitize("thread")))
is that the accesses within the callees will also be suppressed.
The motivation for this attribute is a guarantee given by the objective C
language that the calls to the reference count decrement and object
deallocation will be synchronized. To model this properly, we would need
to intercept all ref count decrement calls (which are very common in ObjC
due to use of ARC) and also every single message send. Instead, we propose
to just ignore all accesses made from within dealloc at run time. The main
downside is that this still does not introduce any synchronization, which
means we might still report false positives if the code that relies on this
synchronization is not executed from within dealloc. However, we have not seen
this in practice so far and think these cases will be very rare.
Unfortunately given the current structure of optimization diagnostics we
lack the capability to tell whether the user has
passed -Rpass-analysis=loop-vectorize since this is local to the
front-end (BackendConsumer::OptimizationRemarkHandler).
So rather than printing this even if the user has already
passed -Rpass-analysis, this patch just punts and stops recommending
this option. I don't think that getting this right is worth the
complexity.
Matthias Braun [Fri, 11 Nov 2016 22:37:34 +0000 (22:37 +0000)]
MachineScheduler/ScheduleDAG: Add support to skipping a node.
The DAG mutators in the scheduler cannot really remove DAG nodes as
additional anlysis information such as ScheduleDAGToplogicalSort are
already computed at this point and rely on a fixed number of DAG nodes.
Alleviate the missing removal with a new flag: Setting the new skip
flag on a node ignores it during scheduling.
Erik Eckstein [Fri, 11 Nov 2016 22:21:39 +0000 (22:21 +0000)]
FunctionComparator: don't rely on argument evaluation order.
This is a follow-up on the recent refactoring of the FunctionMerge pass.
It should fix a fail of the new FunctionComparator unittest whe compiling with MSVC.
Evgeniy Stepanov [Fri, 11 Nov 2016 21:39:26 +0000 (21:39 +0000)]
[cfi] Fix weak functions handling.
When a function pointer is replaced with a jumptable pointer, special
case is needed to preserve the semantics of extern_weak functions.
Since a jumptable entry can not be extern_weak, we emulate that
behaviour by replacing all references to F (the extern_weak function)
with the following expression: F != nullptr ? JumpTablePtr : nullptr.
Extra special care is needed for global initializers, since most (or
probably all) backends can not lower an initializer that includes
this kind of constant expression. Initializers like that are replaced
with a global constructor (i.e. a runtime initializer).
Erik Eckstein [Fri, 11 Nov 2016 21:15:13 +0000 (21:15 +0000)]
Make the FunctionComparator of the MergeFunctions pass a stand-alone utility.
This is pure refactoring. NFC.
This change moves the FunctionComparator (together with the GlobalNumberState
utility) in to a separate file so that it can be used by other passes.
For example, the SwiftMergeFunctions pass in the Swift compiler:
https://github.com/apple/swift/blob/master/lib/LLVMPasses/LLVMMergeFunctions.cpp
Details of the change:
*) The big part is just moving code out of MergeFunctions.cpp into FunctionComparator.h/cpp
*) Make FunctionComparator member functions protected (instead of private)
so that a derived comparator class can use them.
Following refactoring helps to share code between the base FunctionComparator
class and a derived class:
*) Add a beginCompare() function
*) Move some basic function property comparisons into a separate function compareSignature()
*) Do the GEP comparison inside cmpOperations() which now has a new
needToCmpOperands reference parameter
Fixed the lost FastMathFlags for FCmp operations in SLPVectorizer.
Reviewer: Michael Zolotukhin.
Differential Revision: https://reviews.llvm.org/D26543