Michal Gorny [Tue, 10 Jan 2017 19:55:51 +0000 (19:55 +0000)]
[llvm-config] Canonicalize CMake booleans to 0/1
Following the similar change to lit configuration, ensure that all CMake
booleans are canonicalized to 0/1 when being passed to llvm-config. This
fixes the incorrect interpretation of values when user passes another
value than the ON/OFF, and simplifies the code by removing unnecessary
string matching.
Furthermore, the code for --has-rtti and --has-global-isel has been
modified to print consistent values indepdently of the boolean used by
passed by the user to CMake. Sadly, the code already implicitly used
different values for the two (YES/NO for --has-rtti, ON/OFF for
--has-global-isel).
Include tests for all booleans and multi-value options in llvm-config.
Petr Hosek [Tue, 10 Jan 2017 19:47:05 +0000 (19:47 +0000)]
[CMake] Handle common options for runtimes build
All the existing runtimes relies on flags which are set by AddLLVM
and HandleLLVMOptions. In the standalone case, they would include
these themselves, but when being built using LLVM runtimes we should
include these in the top-level runtimes CMake files.
[LV] Don't panic when encountering the IV of an outer loop.
Bail out instead of asserting when we encounter this situation,
which can actually happen.
The reason the test uses the new PM is that the "bad" phi, incidentally, gets
cleaned up by LoopSimplify. But LICM can create this kind of phi and preserve
loop simplify form, so the cleanup has no chance to run.
This fixes PR31190.
We may want to solve this in a less conservative manner, since this phi is
actually uniform within the inner loop (or we may want LICM to output a cleaner
promotion to begin with).
Rong Xu [Tue, 10 Jan 2017 19:30:20 +0000 (19:30 +0000)]
[PGO] Turn off comdat renaming in IR PGO by default
Summary:
In IR PGO we append the function hash to comdat functions to avoid the
potential hash mismatch. This turns out not legal in some cases: if the comdat
function is address-taken and used in comparison. Renaming changes the semantic.
This patch turns off comdat renaming by default.
To alleviate the hash mismatch issue, we now rename the profile variable
for comdat functions. Profile allows co-existing multiple versions of profiles
with different hash value. The inlined copy will always has the correct profile
counter. The out-of-line copy might not have the correct count. But we will
not have the bogus mismatch warning.
[X86][AVX512]Improving shuffle lowering by using AVX-512 EXPAND* instructions
This patch fix PR31351: https://llvm.org/bugs/show_bug.cgi?id=31351
1. This patch adds new type of shuffle lowering
2. We can use the expand instruction, When the shuffle pattern is as following:
{ 0*a[0]0*a[1]...0*a[n] , n >=0 where a[] elements in a ascending order}.
Simon Dardis [Tue, 10 Jan 2017 16:40:57 +0000 (16:40 +0000)]
[mips] Fix Mips MSA instrinsics
The usage of some MIPS MSA instrinsics that took immediates could crash LLVM
during lowering. This patch addresses that behaviour. Crucially this patch
also makes the use of intrinsics with out of range immediates as producing an
internal error.
The ld,st instrinsics would trigger an assertion failure for MIPS64 as their
lowering would attempt to add an i32 offset to a i64 pointer.
Simon Dardis [Tue, 10 Jan 2017 15:53:10 +0000 (15:53 +0000)]
[mips] Honour -mno-odd-spreg for vector splat (again)
Previous the lowering of FILL_FW would use the MSA128W register class when
performing a vector splat. Instead it should be honouring -mno-odd-spreg and
only use the even registers when performing a splat from word to vector
register.
Logical follow-on from r230235.
This fixes PR/31369.
A previous commit was missing the test case and had another differential
in it.
Simon Dardis [Tue, 10 Jan 2017 10:28:37 +0000 (10:28 +0000)]
[mips] Honour -mno-odd-spreg for vector splat
Previous the lowering of FILL_FW would use the MSA128W register class when
performing a vector splat. Instead it should be honouring -mno-odd-spreg and
only use the even registers when performing a splat from word to vector
register.
Craig Topper [Tue, 10 Jan 2017 07:42:57 +0000 (07:42 +0000)]
[DAGCombiner] Merge together duplicate checks for folding fold (select C, 1, X) -> (or C, X) and folding (select C, X, 0) -> (and C, X). Also be consistent about checking that both the condition and the result type are i1. NFC
I guess previously we just assumed if the result type was i1, then the condition type must also be i1?
Chris Bieneman [Tue, 10 Jan 2017 06:22:49 +0000 (06:22 +0000)]
[ObjectYAML] Support for DWARF line tables
One more try... relanding r291541 with a fix to properly gate MaxOpsPerInst on DWARF version.
Description from r291541:
This patch re-lands r291470, which failed on Linux bots. The issue (I believe) was undefined behavior because the size of llvm::dwarf::LineNumberOps was not explcitly specified or consistently respected. The updated patch adds an explcit underlying type to the enum and preserves the size more correctly.
Original description:
This patch adds support for the DWARF debug_lines section. The line table state machine opcodes are preserved, so this can be used to test the state machine evaluation directly.
Craig Topper [Tue, 10 Jan 2017 06:01:16 +0000 (06:01 +0000)]
AMD family 17h (znver1) enablement
Summary:
This patch enables the following
1. AMD family 17h architecture using "znver1" tune flag (-march, -mcpu).
2. ISAs that are enabled for "znver1" architecture.
3. Checks ADX isa from cpuid to identify "znver1" flag when -march=native is used.
4. ISAs FMA4, XOP are disabled as they are dropped from amdfam17.
5. For the time being, it uses the btver2 scheduler model.
6. Test file is updated to check this flag.
This item is linked to clang review item https://reviews.llvm.org/D28018
Chris Bieneman [Tue, 10 Jan 2017 05:25:24 +0000 (05:25 +0000)]
[ObjectYAML] Support for DWARF line tables
This patch re-lands r291470, which failed on Linux bots. The issue (I believe) was undefined behavior because the size of llvm::dwarf::LineNumberOps was not explcitly specified or consistently respected. The updated patch adds an explcit underlying type to the enum and preserves the size more correctly.
Original description:
This patch adds support for the DWARF debug_lines section. The line table state machine opcodes are preserved, so this can be used to test the state machine evaluation directly.
Craig Topper [Tue, 10 Jan 2017 04:12:24 +0000 (04:12 +0000)]
[X86] When lowering uniform shifts, use X86ISD::VZEXT instead of using a ZERO_EXTEND_VECTOR_INREG. If we emit the ZERO_EXTEND_VECTOR_INREG too late it doesn't get lowered properly and makes it through to isel and fails.
Serge Pavlov [Tue, 10 Jan 2017 02:50:47 +0000 (02:50 +0000)]
[StructurizeCfg] Update dominator info.
In some cases StructurizeCfg updates root node, but dominator info
remains unchanges, it causes crash when expensive checks are enabled.
To cope with this problem a new method was added to DominatorTreeBase
that allows adding new root nodes, it is called in StructurizeCfg to
put dominator tree in sync.
This is the second part of a multi-part change to define additional
subcommands to the `llvm-xray` tool.
This change defines a conversion subcommand to take XRay log files, and
turns them from one format to another (binary or YAML). This currently
only supports the first version of the log file format, defined in the
compiler-rt runtime.
Evandro Menezes [Tue, 10 Jan 2017 01:08:01 +0000 (01:08 +0000)]
[CodeGen] Implement the SUnit::print() method
This method seems to have had a troubled life. This patch proposes that it
replaces the recently added helper function dumpSUIdentifier. This way, the
method can be used in other files using the SUnit class.
Reid Kleckner [Tue, 10 Jan 2017 01:05:33 +0000 (01:05 +0000)]
Try once again to fix the MSVC build of AlignedCharArrayUnion
It was complaining about ambiguity between llvm::detail and
llvm::support::detail:
error C2872: 'detail': ambiguous symbol
note: could be 'llvm::detail'
note: or 'llvm::support::detail'
Standardize on llvm::support::detail to hide these symbols further.
Reid Kleckner [Tue, 10 Jan 2017 00:26:56 +0000 (00:26 +0000)]
Fix MSVC build of AlignedCharArrayUnion
Use constexpr recursion for alignof like we do for sizeof. Seems to work
with Clang and MSVC. Also, don't recurse twice to avoid slowdowns in
compilers that don't memoize constexpr results (Clang).
James Y Knight [Mon, 9 Jan 2017 23:11:25 +0000 (23:11 +0000)]
Commit a test for match-full-lines.
I unfortunately neglected to add it in r260540, but it has been
sitting in my working dir ever since. D'oh.
Modified to work with r290069, which made the CHECK patterns
themselves whitespace-sensitive as well, and remove the test added
then, as this tests both strict and non-strict modes.
Rui Ueyama [Mon, 9 Jan 2017 22:55:00 +0000 (22:55 +0000)]
TarWriter: Fix a bug in Ustar header.
If we split a filename into `Name` and `Prefix`, `Prefix` is at most
145 bytes. We had a bug that didn't split a path correctly. This bug
was pointed out by Rafael in the post commit review.
This patch adds a unit test for TarWriter to verify the fix.
Easwaran Raman [Mon, 9 Jan 2017 21:56:26 +0000 (21:56 +0000)]
Refactor inline threshold update code.
Functional change: Previously, if a callee is cold, we used ColdThreshold if it minimizes the existing threshold. This was irrespective of whether we were optimizing for minsize (-Oz) or not. But -Oz uses very low threshold to begin with and the inlining with -Oz is expected to be tuned for lowering code size, so there is no good reason to set an even lower threshold for cold callees. We now lower the threshold for cold callees only when -Oz is not used. For default values of -inlinethreshold and -inlinecold-threshold, this change has no effect and this simplifies the code.
NFC changes: Group all threshold updates that are guarded by !Caller->optForMinSize() and within that group threshold updates that require profile summary info.
When writing to a non regular file we cannot rename to it. Since we
have to write, we may as well create a temporary file to avoid trying
to create an unique file in /dev when trying to write to /dev/null.
Matthias Braun [Mon, 9 Jan 2017 21:38:17 +0000 (21:38 +0000)]
PeepholeOptimizer: Do not replace SubregToReg(bitcast like)
While we can usually replace bitcast like instructions
(MachineInstr::isBitcast()) with a COPY this is not legal if any of the
users uses SUBREG_TO_REG to assert the upper bits of the result are
zero.
Rui Ueyama [Mon, 9 Jan 2017 21:20:42 +0000 (21:20 +0000)]
TarWriter: Set "00" to Ustar version field.
Most (maybe all?) tar commands can handle tar archives with blank
version fields, but POSIX requires "00" to be set to the field, so
doing it is good for compliance.
Chris Bieneman [Mon, 9 Jan 2017 20:01:37 +0000 (20:01 +0000)]
[ObjectYAML] Support for DWARF line tables
This patch adds support for the DWARF debug_lines section. The line table state machine opcodes are preserved, so this can be used to test the state machine evaluation directly.
Matthew Simpson [Mon, 9 Jan 2017 19:05:29 +0000 (19:05 +0000)]
[LV] Fix-up external IV users after updating dominator tree
This patch delays the fix-up step for external induction variable users until
after the dominator tree has been properly updated. This should fix PR30742.
The SCEVExpander in InductionDescriptor::transform can generate code in the
wrong location if the dominator tree is not up-to-date. We should work towards
keeping the dominator tree up-to-date throughout the transformation.
Matt Arsenault [Mon, 9 Jan 2017 18:52:39 +0000 (18:52 +0000)]
AMDGPU: Add Assert[SZ]Ext during argument load creation
For i16 zeroext arguments when i16 was a legal type, the
known bits information from the truncate was lost. Insert
a zeroext so the known bits optimizations work with the 32-bit
loads.
Fixes code quality regressions vs. SI in min.ll test.
Simon Pilgrim [Mon, 9 Jan 2017 17:20:03 +0000 (17:20 +0000)]
[X86][AVX512] Enable v16i8/v32i8 vector shifts to use an extend+shift+truncate pattern.
Use the existing AVX2 v8i16 vector shift lowering for v16i8 (extending to v16i32) on AVX512 targets and v32i8 (extending to v32i16) on AVX512BW targets.
Eugene Leviant [Mon, 9 Jan 2017 09:56:31 +0000 (09:56 +0000)]
RuntimeDyldELF: don't create thunk if not needed
This patch doesn't create thunk for branch operation when following conditions are met:
- Architecture is AArch64
- Relocation target is in the same object file
- Relocation target is close enough to be encoded in immediate offset
In such case we branch directly to the target instead of branching to thunk
[PM] Teach SCEV to invalidate itself when its dependencies become
invalid.
This fixes use-after-free bugs that will arise with any interesting use
of SCEV.
I've added a dedicated test that works diligently to trigger these kinds
of bugs in the new pass manager and also checks for them explicitly as
well as triggering ASan failures when things go squirly.
Craig Topper [Mon, 9 Jan 2017 04:19:34 +0000 (04:19 +0000)]
[AVX-512] Change another pattern that was using BLENDM to use masked moves. A future patch will conver it back to BLENDM if its beneficial to register allocation.
Lang Hames [Sun, 8 Jan 2017 20:09:35 +0000 (20:09 +0000)]
[Orc][RPC] Lock the pending results data structure when installing new result
handlers, make abandonPendingResults public API.
This should make installing asynchronous result handlers thread safe.
The abandonPendingResults method is made public so that clients can disconnect
from a remote even if they have asynchronous handlers awaing results from that
remote. The asynchronous handlers will all receive "abandoned result" errors as
their argument.
Lang Hames [Sun, 8 Jan 2017 01:13:47 +0000 (01:13 +0000)]
[Orc][RPC] Add an APICalls utility for grouping RPC funtions for registration.
APICalls allows groups of functions to be composed into an API that can be
registered as a unit with an RPC endpoint. Doing registration on a-whole API
basis (rather than per-function) allows missing API functions to be detected
early.
APICalls also allows Function membership to be tested at compile-time. This
allows clients to write static assertions that functions to be called are
members of registered APIs.
Mehdi Amini [Sun, 8 Jan 2017 00:44:45 +0000 (00:44 +0000)]
[ThinLTO] Fix lazy-loading of Metadata attachment, which left some Fwd ref behind
The change in r291362 was too agressive. We still need to flush at the
end of the block because function local metadata can introduce fwd
ref as well.
(Bootstrap with ThinLTO was broken)