Sam Parker [Tue, 21 May 2019 07:56:47 +0000 (07:56 +0000)]
[ARM][CGP] Skip nuw in PrepareConstants
PrepareConstants step converts add/sub with 'negative' immediates to
sub/add with a 'positive' imm to make promotion more simple. nuw
already states that the add shouldn't cause an unsigned wrap, so
it shouldn't need any tweaking. Plus, we also don't allow a sub with
a 'negative' immediate to be safe wrap, so this functionality has
been removed. The PrepareConstants step now just handles the add
instructions that we've determined would be safe if they wrap around
zero.
Petr Hosek [Tue, 21 May 2019 07:13:58 +0000 (07:13 +0000)]
[CMake] Specify component for all target types
This addresses an issue introduced in r360230 which broke existing
use cases of LLVM_DISTRIBUTION_COMPONENTS since ARCHIVE and LIBRARY
target types are no longer handled as components.
Dylan McKay [Tue, 21 May 2019 06:38:02 +0000 (06:38 +0000)]
Add TargetLoweringInfo hook for explicitly setting the ABI calling convention endianess
Summary:
The endianess used in the calling convention does not always match the
endianess of the target on all architectures, namely AVR.
When an argument is too large to be legalised by the architecture and is
split for the ABI, a new hook TargetLoweringInfo::shouldSplitFunctionArgumentsAsLittleEndian
is queried to find the endianess that function arguments must be laid
out in.
This approach was recommended by Eli Friedman.
Originally reported in https://github.com/avr-rust/rust/issues/129.
Nico Weber [Tue, 21 May 2019 03:01:01 +0000 (03:01 +0000)]
Tweaks for setting CMAKE_LINKER to lld-link
- Just look for "lld-link", not "lld-link.exe".
llvm/cmake/platforms/WinMsvc.cmake for example sets CMAKE_LINKER to
lld-link without .exe
- Stop passing -gwarf to the compiler in sanitizer options when lld is
enabled -- there's no reason to use different debug information keyed
off the linker. (If this was for MinGW, we should check for that
instead.)
Nick Desaulniers [Mon, 20 May 2019 22:17:43 +0000 (22:17 +0000)]
[ORC] fix use-after-move. NFC
Summary:
scan-build flagged a potential use-after-move in debug builds. It's not
safe that a moved from value contains anything but garbage. Manually
DRY up these repeated expressions.
Matt Arsenault [Mon, 20 May 2019 22:04:42 +0000 (22:04 +0000)]
AMDGPU: Force skip branches over calls
Unfortunately the way SIInsertSkips works is backwards, and is
required for correctness. r338235 added handling of some special cases
where skipping is mandatory to avoid side effects if no lanes are
active. It conservatively handled asm correctly, but the same logic
needs to apply to calls.
Usually the call sequence code is larger than the skip threshold,
although the way the count is computed is really broken, so I'm not
sure if anything was likely to really hit this.
Lang Hames [Mon, 20 May 2019 20:53:05 +0000 (20:53 +0000)]
[Support] Renamed member 'Size' to 'AllocatedSize' in MemoryBlock and OwningMemoryBlock.
Rename member 'Size' to 'AllocatedSize' in order to provide a hint that the
allocated size may be different than the requested size. Comments are added to
clarify this point. Updated the InMemoryBuffer in FileOutputBuffer.cpp to track
the requested buffer size.
Nikita Popov [Mon, 20 May 2019 19:13:04 +0000 (19:13 +0000)]
[LFTR] Add additional PR31181 test cases
One case where overflow happens in the first loop iteration, and
two cases where we switch to a dynamically dead IV with post/pre
increment, respectively.
Craig Topper [Mon, 20 May 2019 17:37:52 +0000 (17:37 +0000)]
[X86] Add test case for r361177.
That commit makes sure we flush PendingExports in SelectDAGBuilder
before we create INLINEASM_BR. Unfortunatley, I haven't yet found
a CodeGen failure without that change.
This commit uses the debug output from SelectionDAG to at least
ensure we build the DAG correctly.
Craig Topper [Mon, 20 May 2019 17:08:02 +0000 (17:08 +0000)]
[SelectionDAGBuilder] Flush PendingExports before creating INLINEASM_BR node for asm goto.
Since INLINEASM_BR is a terminator we need to flush the pending exports before
emitting it. If we don't do this, a TokenFactor can be inserted between it and
the BR instruction emitted to finish the callbr lowering.
It looks like nodes are glued to the INLINEASM_BR so I had to make sure we emit
the TokenFactor before that.
Nick Desaulniers [Mon, 20 May 2019 16:58:59 +0000 (16:58 +0000)]
[DWARF] hoist nullptr checks. NFC
Summary:
This was flagged in https://www.viva64.com/en/b/0629/ under "Snippet No.
15" (see under #13). It looks like PVS studio flags nullptr checks where
the ptr is used inbetween creation and checking against nullptr.
Nick Desaulniers [Mon, 20 May 2019 16:48:09 +0000 (16:48 +0000)]
[INLINER] allow inlining of blockaddresses if sole uses are callbrs
Summary:
It was supposed that Ref LazyCallGraph::Edge's were being inserted by
inlining, but that doesn't seem to be the case. Instead, it seems that
there was no test for a blockaddress Constant in an instruction that
referenced the function that contained the instruction. Ex:
When iterating blockaddresses, do not add the function they refer to
back to the worklist if the blockaddress is referring to the contained
function (as opposed to an external function).
Because blockaddress has sligtly different semantics than GNU C's
address of labels, there are 3 cases that can occur with blockaddress,
where only 1 can happen in GNU C due to C's scoping rules:
* blockaddress is within the function it refers to (possible in GNU C).
* blockaddress is within a different function than the one it refers to
(not possible in GNU C).
* blockaddress is used in to declare a global (not possible in GNU C).
This patch adjusts the iteration of blockaddresses in
LazyCallGraph::visitReferences to not revisit the blockaddresses
function in the first case.
The Linux kernel contains code that's not semantically valid at -O0;
specifically code passed to asm goto. It requires that asm goto be
inline-able. This patch conservatively does not attempt to handle the
more general case of inlining blockaddresses that have non-callbr users
(pr/39560).
Bjorn Pettersson [Mon, 20 May 2019 16:41:08 +0000 (16:41 +0000)]
[AMDGPU] Fix std::array initializers to avoid warnings with older tool chains. NFC
A std::array is implemented as a template with an array
inside a struct. Older versions of clang, like 3.6,
require an extra set of curly braces around std::array
initializations to avoid warnings.
The C++ language was changed regarding this by CWG 1270.
So more modern tool chains does not complaing even if
leaving out one level of braces.
Craig Topper [Mon, 20 May 2019 16:27:09 +0000 (16:27 +0000)]
[Intrinsics] Merge lround.i32 and lround.i64 into a single intrinsic with overloaded result type. Make result type for llvm.llround overloaded instead of fixing to i64
We shouldn't really make assumptions about possible sizes for long and long long. And longer term we should probably support vectorizing these intrinsics. By making the result types not fixed we can support vectors as well.
Nikita Popov [Mon, 20 May 2019 16:09:22 +0000 (16:09 +0000)]
[SDAG] Vector op legalization for overflow ops
Fixes issue reported by aemerson on D57348. Vector op legalization
support is added for uaddo, usubo, saddo and ssubo (umulo and smulo
were already supported). As usual, by extracting TargetLowering methods
and calling them from vector op legalization.
Vector op legalization doesn't really deal with multiple result nodes,
so I'm explicitly performing a recursive legalization call on the
result value that is not being legalized.
There are some existing test changes because expansion happens
earlier, so we don't get a DAG combiner run in between anymore.
George Rimar [Mon, 20 May 2019 15:41:48 +0000 (15:41 +0000)]
[llvm-readelf] - Rework how we parse the .dynamic section.
This is a result of what I found during my work on https://bugs.llvm.org/show_bug.cgi?id=41679.
Previously LLVM readelf took the information about .dynamic section
from its PT_DYNAMIC segment only. GNU tools have a bit different logic.
They also use the information from the .dynamic section header if it is available.
This patch changes the code to improve the compatibility with the GNU Binutils.
Matt Arsenault [Mon, 20 May 2019 14:09:36 +0000 (14:09 +0000)]
RegAlloc: Fix verifier error with undef identity copies
The code did not match the example in the comment, and was checking
the undef flag on the copy dest instead of source. The existing tests
were only hitting the > 2 operands case.
[DebugInfo] Update loop metadata for inlined loops
Currently, when a loop is cloned while inlining function (A) into function (B)
the loop metadata is copied and then not modified at all. The loop metadata can
encode the loop's start and end DILocations. Therefore, the new inlined loop in
function (B) may have loop metadata which shows start and end locations residing
in function (A).
This patch ensures loop metadata is updated while inlining so that the start and
end DILocations are given the "inlinedAt" operand. I've also added a regression
test for this.
This fix is required for D60831 because that patch uses loop metadata to
determine the DILocation for the branches of new loop preheaders.
Sander de Smalen [Mon, 20 May 2019 09:54:06 +0000 (09:54 +0000)]
Match types of accumulator and result for llvm.experimental.vector.reduce.fadd/fmul
The scalar start/accumulator value of the fadd- and fmul reduction
should match the result type of the reduction, as well as the vector
element-type of the input vector. Although this was not explicitly
specified in the LangRef, it was taken for granted in code implementing
the reductions. The patch also fixes the LangRef by adding this
constraint.
[DebugInfo] Update loop metadata for inlined loops
Summary:
Currently, when a loop is cloned while inlining function (A) into function (B) the loop metadata is copied and then not modified at all. The loop metadata can encode the loop's start and end DILocations. Therefore, the new inlined loop in function (B) may have loop metadata which shows start and end locations residing in function (A).
This patch ensures loop metadata is updated while inlining so that the start and end DILocations are given the "inlinedAt" operand. I've also added a regression test for this.
This fix is required for D60831 because that patch uses loop metadata to determine the DILocation for the branches of new loop preheaders.
Carl Ritson [Mon, 20 May 2019 07:20:12 +0000 (07:20 +0000)]
[AMDGPU] gfx1010 Avoid SMEM WAR hazard for some s_waitcnt values
Summary:
Avoid introducing hazard mitigation when lgkmcnt is reduced to 0.
Clarify code comments to explain assumptions made for this hazard
mitigation. Expand and correct test cases to cover variants of
s_waitcnt.
[SLP] Refactoring of EdgeInfo and UserTreeIdx in buildTree_rec().
This is a follow-up refactoring patch after the introduction of usable TreeEntry pointers in D61706.
The EdgeInfo struct can now use a TreeEntry pointer instead of an index in VectorizableTree.
Committed on behalf of @vporpo (Vasileios Porpodas)
Don Hinton [Sat, 18 May 2019 20:46:35 +0000 (20:46 +0000)]
[CommandLine] Reduce size of Option class
Summary:
Reduce size of Option class from 184 bytes to 136 bytes by
placing more member variables in Bit Field (16 bytes), and
reducing the initial sizes of Categories and Subs to 1 (32 bytes).
Roman Lebedev [Sat, 18 May 2019 13:00:03 +0000 (13:00 +0000)]
UpdateTestChecks: fix AMDGPU handling
Summary:
Was looking into supporting `(srl (shl x, c1), c2)` with c1 != c2 in dagcombiner,
this test changes, but makes `update_llc_test_checks.py` unhappy.
**Many** AMDGPU tests specify `-march`, not `-mtriple`, which results in `update_llc_test_checks.py`
defaulting to x86 asm function detection heuristics, which don't work here.
I propose to fix this by adding an infrastructure to map from `-march` to `-mtriple`,
in the UpdateTestChecks tooling.
Roman Lebedev [Sat, 18 May 2019 12:59:56 +0000 (12:59 +0000)]
UpdateTestChecks: arm64-eabi handlind
Summary:
Was looking into supporting `(srl (shl x, c1), c2)` with c1 != c2 in dagcombiner,
this test changes, but makes `update_llc_test_checks.py` unhappy
Michael Trent [Sat, 18 May 2019 03:17:27 +0000 (03:17 +0000)]
Update llvm-nm -s to use a multi-var option
Summary:
Previously llvm-nm relied on a positional parameter to read two values
into the SegSect list. This worked, but required the "-s" paramater and
its arguments to be the last elements on the command-line.
The CommandLine library now supports mutli-var parameters, so it can
naturally deal with "-s" expecting two arguments, and now the input file
can appear anywhere (within reason) in the command line invocation. E.g.
Matt Arsenault [Fri, 17 May 2019 23:05:13 +0000 (23:05 +0000)]
GlobalISel: Implement lower for S64->S32 [SU]ITOFP
This is ported from the custom AMDGPU DAG implementation. I think this
is a better default expansion than what the DAG currently uses, at
least if the target has CTLZ.
This implements the signed version in terms of the unsigned
conversion, which is implemented with bit operations. SelectionDAG has
several other implementations that should eventually be ported
depending on what instructions are legal.
build: use clang-cl for runtimes when targeting Windows
When targeting Windows and building a runtime (subproject) prefer to use
`clang-cl` rather than the `clang` driver. This allows us to cross-compile
runtimes for the Windows environment from Linux.
Simon Pilgrim [Fri, 17 May 2019 17:25:55 +0000 (17:25 +0000)]
[X86][SSE] Match all-of bool scalar reductions into a bitcast/movmsk + cmp.
Same as what we do for vector reductions in combineHorizontalPredicateResult, use movmsk+cmp for scalar (and(extract(x,0),extract(x,1)) reduction patterns.
Roman Lebedev [Fri, 17 May 2019 15:52:58 +0000 (15:52 +0000)]
[DAGCombiner] visitShiftByConstant(): drop bogus signbit check
Summary:
That check claims that the transform is illegal otherwise.
That isn't true:
1. For `ISD::ADD`, we only process `ISD::SHL` outer shift => sign bit does not matter
https://rise4fun.com/Alive/K4A
2. For `ISD::AND`, there is no restriction on constants:
https://rise4fun.com/Alive/Wy3
3. For `ISD::OR`, there is no restriction on constants:
https://rise4fun.com/Alive/GOH
3. For `ISD::XOR`, there is no restriction on constants:
https://rise4fun.com/Alive/ml6
So, why is it there then?
This changes the testcase that was touched by @spatel in rL347478,
but i'm not sure that test tests anything particular?
Roman Lebedev [Fri, 17 May 2019 15:52:49 +0000 (15:52 +0000)]
[InstCombine] canShiftBinOpWithConstantRHS(): drop bogus signbit check
Summary:
In D61918 i was looking at dropping it in DAGCombiner `visitShiftByConstant()`,
but as @craig.topper pointed out, it was copied from here.
That check claims that the transform is illegal otherwise.
That isn't true:
1. For `ISD::ADD`, we only process `ISD::SHL` outer shift => sign bit does not matter
https://rise4fun.com/Alive/K4A
2. For `ISD::AND`, there is no restriction on constants:
https://rise4fun.com/Alive/Wy3
3. For `ISD::OR`, there is no restriction on constants:
https://rise4fun.com/Alive/GOH
3. For `ISD::XOR`, there is no restriction on constants:
https://rise4fun.com/Alive/ml6
So, why is it there then?
As far as i can tell, it dates all the way back to original check-in rL7793.
I think we should just drop it.