Amara Emerson [Tue, 2 Jan 2018 18:56:39 +0000 (18:56 +0000)]
[AArch64][GlobalISel] Fix assert fail with unknown intrinsic.
A call may have an intrinsic name but not have a valid intrinsic ID,
for example with llvm.invariant.group.barrier. If so, treat it as a
normal call like FastISel does.
Sanjay Patel [Tue, 2 Jan 2018 16:38:29 +0000 (16:38 +0000)]
[x86] allow pairs of PCMPEQ for vector-sized integer equality comparisons (PR33325)
This is an extension of D31156 with the goal that we'll allow memcmp() == 0 expansion
for x86 to use 2 pairs of loads per block.
The memcmp expansion pass (formerly part of CGP) will generate this kind of pattern
with oversized integer compares, so we want to transform these into x86-specific vector
nodes before legalization splits things into scalar chunks.
See PR33325 for more details:
https://bugs.llvm.org/show_bug.cgi?id=33325
Anna Thomas [Tue, 2 Jan 2018 16:25:50 +0000 (16:25 +0000)]
[BasicBlockUtils] Check for unreachable preds before updating LI in UpdateAnalysisInformation
Summary:
We are incorrectly updating the LI when loop-simplify generates
dedicated exit blocks for a loop. The issue is that there's an implicit
assumption that the Preds passed into UpdateAnalysisInformation are
reachable. However, this is not true and breaks LI by incorrectly
updating the header of a loop.
One such case is when we generate dedicated exits when the exit block is
a landing pad (through SplitLandingPadPredecessors). There maybe other
cases as well, since we do not guarantee that Preds passed in are
reachable basic blocks.
The added test case shows how loop-simplify breaks LI for the outer loop (and DT in turn)
after we try to generate the LoopSimplifyForm.
Craig Topper [Mon, 1 Jan 2018 19:21:35 +0000 (19:21 +0000)]
[SelectionDAG][X86][AArch64] Require targets to specify the promotion type when using setOperationAction Promote for INT_TO_FP and FP_TO_INT
Currently the promotion for these ignores the normal getTypeToPromoteTo and instead just tries to double the element width. This is because the default behavior of getTypeToPromote to just adds 1 to the SimpleVT, which has the affect of increasing the element count while keeping the scalar size the same.
If multiple steps are required to get to a legal operation type, int_to_fp will be promoted multiple times. And fp_to_int will keep trying wider types in a loop until it finds one that works.
getTypeToPromoteTo does have the ability to query a promotion map to get the type and not do the increasing behavior. It seems better to just let the target specify the promotion type in the map explicitly instead of letting the legalizer iterate via widening.
FWIW, it's worth I think for any other vector operations that need to be promoted, we have to specify the type explicitly because the default behavior of getTypeToPromote isn't useful for vectors. The other types of promotion already require either the element count is constant or the total vector width is constant, but neither happens by incrementing the SimpleVT enum.
Craig Topper [Mon, 1 Jan 2018 01:11:29 +0000 (01:11 +0000)]
[X86] Add patterns for using zmm registers for v8i32/v8f32 vselect with the false input being zero.
We can use zmm move with zero masking for this. We already had patterns for using a masked move, but we didn't check for the zero masking case separately.
Craig Topper [Sun, 31 Dec 2017 19:17:52 +0000 (19:17 +0000)]
[X86] Use CONCAT_VECTORS instead of INSERT_SUBVECTOR for padding v4i1/v2i1 vector to v8i1 pre-legalize.
The CONCAT_VECTORS will be lowered to INSERT_SUBVECTOR later. In the modified cases this seems to be enough to trick a later DAG combine into running in a different order than allows the ANDs to be removed.
I'll admit this is a bit of a hack that happens to work, but using CONCAT_VECTORS is more consistent with other legalization code anyway.
Simon Pilgrim [Sun, 31 Dec 2017 17:07:47 +0000 (17:07 +0000)]
[X86][SSE] Don't vectorize splat buildvector of binops (PR30780)
Don't combine buildvector(binop(),binop(),binop(),binop()) -> binop(buildvector(), buildvector()) if its a splat - keep the binop scalar and just splat the result to avoid large vector constants.
Craig Topper [Sun, 31 Dec 2017 07:38:41 +0000 (07:38 +0000)]
[X86] Prevent combining (v8i1 (bitconvert (i8 load)))->(v8i1 load) if we don't have DQI.
We end up using an i8 load via an isel pattern from v8i1 anyway. This just makes it more explicit. This seems to improve codgen in some cases and I'd like to kill off some of the load patterns.
Philip Reames [Sun, 31 Dec 2017 03:34:36 +0000 (03:34 +0000)]
2nd attempt at "fixing" amdgpu tests after r321575​
The test needs to be changed; it was exercising UB and that likely wasn't the intent of the test author. I simply removed the checks because I have absolutely no idea what this test was trying to accomplish. With multiple check patterns, no explanation, and no familiarity on my part with the ISA a true fix is going to have to come from someone familiar with the target.
Serge Pavlov [Sat, 30 Dec 2017 15:37:46 +0000 (15:37 +0000)]
Added support for reading configuration files
Configuration file is read as a response file in which file names in
the nested constructs `@file` are resolved relative to the directory
where the including file resides. Lines in which the first non-whitespace
character is '#' are considered as comments and are skipped. Trailing
backslashes are used to concatenate lines in the same way as they
are used in shell scripts.
Serge Pavlov [Sat, 30 Dec 2017 08:15:15 +0000 (08:15 +0000)]
Added support for reading configuration files
Configuration file is read as a response file in which file names in
the nested constructs `@file` are resolved relative to the directory
where the including file resides. Lines in which the first non-whitespace
character is '#' are considered as comments and are skipped. Trailing
backslashes are used to concatenate lines in the same way as they
are used in shell scripts.
Hiroshi Inoue [Sat, 30 Dec 2017 08:09:04 +0000 (08:09 +0000)]
[PowerPC] fix a bug in TCO eligibility check
If the callee and caller use different calling convensions, we cannot apply TCO if the callee requires arguments on stack; e.g. C calling convention and Fast CC use the same registers for parameter passing, but the stack offset is not necessarily same.
This patch also recommit r319218 "[PowerPC] Allow tail calls of fastcc functions from C CallingConv functions." by @sfertile since the problem reported in r320106 should be fixed.
Philip Reames [Sat, 30 Dec 2017 05:54:22 +0000 (05:54 +0000)]
[instsimplify] consistently handle undef and out of bound indices for insertelement and extractelement
In one case, we were handling out of bounds, but not undef indices. In the other, we were handling undef (with the comment making the analogy to out of bounds), but not out of bounds. Be consistent and treat both undef and constant out of bounds indices as producing undefined results.
As a side effect, this also protects instcombine from having to handle large constant indices as we always simplify first.
Geoff Berry [Fri, 29 Dec 2017 21:01:09 +0000 (21:01 +0000)]
[MachineOperand] Fix LiveDebugVariables code after isRenamable change.
Fix code in LiveDebugVariables that was changing def MachineOperands to
uses, which will hit an assert for dead operands after the change to add
the renamable bit to MachineOperands. Avoid the assert by clearing the
dead bit before changing the operand to a use.
Fixes issue reported in out of tree target by Jesper Antonsson at Ericsson.
Matt Arsenault [Fri, 29 Dec 2017 17:18:14 +0000 (17:18 +0000)]
AMDGPU: Implement getTgtMemIntrinsic for images
Currently all images are lowered to have a single
image PseudoSourceValue. Image stores happen to have
overly strict mayLoad/mayStore/hasSideEffects flags
set on them, so this happens to work. When these
are fixed to be correct, the scheduler breaks
this because the identical PSVs are assumed to
be the same address. These need to be unique
to the image resource value.
Nemanja Ivanovic [Fri, 29 Dec 2017 12:22:27 +0000 (12:22 +0000)]
[PowerPC] Fix for PR35688 - handle out-of-range values for r+r to r+i conversion
Revision 320791 introduced a pass that transforms reg+reg instructions to
reg+imm if they're fed by "load immediate". However, it didn't
handle out-of-range shifts correctly as reported in PR35688.
This patch fixes that and therefore the PR.
Furthermore, there was undefined behaviour in the patch where the RHS of an
initialization expression was 32 bits and constant `1` was shifted left 32
bits. This was fixed by ensuring the RHS is 64 bits just like the LHS.
Fedor Sergeev [Fri, 29 Dec 2017 08:16:06 +0000 (08:16 +0000)]
[PM] pass -debug-pass-manager flag into FunctionToLoopPassAdaptor's canonicalization PM
Summary:
New pass manager driver passes DebugPM (-debug-pass-manager) flag into
individual PassManager constructors in order to enable debug logging.
FunctionToLoopPassAdaptor has its own internal LoopCanonicalizationPM
which never gets its debug logging enabled and that means canonicalization
passes like LoopSimplify are never present in -debug-pass-manager output.
Extending FunctionToLoopPassAdaptor's constructor and
createFunctionToLoopPassAdaptor wrapper with an optional
boolean DebugLogging argument.
Dimitry Andric [Thu, 28 Dec 2017 23:42:44 +0000 (23:42 +0000)]
Avoid modifying DbgInfo while looping in salvageDebuginfo
Summary:
I have been getting rather difficult to reproduce SIGBUS crashes when
compiling certain FreeBSD sources, and their stack traces pointed
squarely at `SelectionDAG::salvageDebugInfo()`:
```
Core was generated by `/usr/obj/share/dim/src/freebsd/clang600-import/amd64.amd64/tmp/usr/bin/cc -cc1 -'.
Program terminated with signal SIGBUS, Bus error.
#0 isInvalidated () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SDNodeDbgValue.h:115
115 bool isInvalidated() const { return Invalid; }
(gdb) bt
#0 isInvalidated () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SDNodeDbgValue.h:115
#1 salvageDebugInfo () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:7116
#2 0x00000000033b2516 in operator() () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:3595
#3 __invoke<(lambda at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:3593:59) &, llvm::SDNode *, llvm::SDNode *> () at /usr/include/c++/v1/type_traits:4323
#4 __call<(lambda at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:3593:59) &, llvm::SDNode *, llvm::SDNode *> () at /usr/include/c++/v1/__functional_base:349
#5 operator() () at /usr/include/c++/v1/functional:1562
#6 0x00000000033b0817 in operator() () at /usr/include/c++/v1/functional:1916
#7 NodeDeleted () at /share/dim/src/freebsd/clang600-import/contrib/llvm/include/llvm/CodeGen/SelectionDAG.h:293
#8 0x0000000003529dde in RemoveDeadNodes () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:610
#9 0x00000000035556df in MorphNodeTo () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:6794
#10 0x00000000033a9acc in MorphNode () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:2594
#11 0x00000000033ac80b in SelectCodeCommon () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:3601
#12 0x00000000023d464b in SelectCode () at /usr/obj/share/dim/src/freebsd/clang600-import/amd64.amd64/tmp/obj-tools/lib/clang/libllvm/X86GenDAGISel.inc:282902
#13 Select () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/Target/X86/X86ISelDAGToDAG.cpp:3072
#14 0x00000000033a5afa in DoInstructionSelection () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:988
#15 0x00000000033a4e1a in CodeGenAndEmitDAG () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:868
#16 0x00000000033a2643 in SelectAllBasicBlocks () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:1624
#17 0x000000000339f158 in runOnMachineFunction () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:466
#18 0x00000000023d03c4 in runOnMachineFunction () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/Target/X86/X86ISelDAGToDAG.cpp:175
#19 0x00000000035cc8c2 in runOnFunction () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/MachineFunctionPass.cpp:62
#20 0x00000000030dca9a in runOnFunction () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/IR/LegacyPassManager.cpp:1520
#21 0x00000000030dccf3 in runOnModule () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/IR/LegacyPassManager.cpp:1541
#22 0x00000000030dd228 in runOnModule () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/IR/LegacyPassManager.cpp:1597
#23 run () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/IR/LegacyPassManager.cpp:1700
#24 0x00000000014db578 in EmitAssembly () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/CodeGen/BackendUtil.cpp:815
#25 EmitBackendOutput () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/CodeGen/BackendUtil.cpp:1181
#26 0x00000000014d5b26 in HandleTranslationUnit () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/CodeGen/CodeGenAction.cpp:292
#27 0x0000000001c4c332 in ParseAST () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/Parse/ParseAST.cpp:159
#28 0x00000000015d546c in Execute () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/Frontend/FrontendAction.cpp:897
#29 0x0000000001cec311 in ExecuteAction () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/Frontend/CompilerInstance.cpp:991
#30 0x00000000014b4f81 in ExecuteCompilerInvocation () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:252
#31 0x00000000014aa73f in cc1_main () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/tools/driver/cc1_main.cpp:221
#32 0x00000000014b2928 in ExecuteCC1Tool () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/tools/driver/driver.cpp:309
#33 main () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/tools/driver/driver.cpp:388
(gdb) frame 1
#1 salvageDebugInfo () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:7116
7116 if (DV->isInvalidated())
(gdb) disassemble
Dump of assembler code for function salvageDebugInfo():
[...]
0x0000000003557348 <+744>: nopl 0x0(%rax,%rax,1)
0x0000000003557350 <+752>: mov (%r12),%r13
=> 0x0000000003557354 <+756>: cmpb $0x0,0x31(%r13)
0x0000000003557359 <+761>: jne 0x35573b0 <salvageDebugInfo()+848>
(gdb) info registers
[...]
r13 0x5a5a5a5a5a5a5a5a 6510615555426900570
```
The `0x5a5a5a5a5a5a5a5a` value in `r13` indicates the memory was either
uninitialized, or already freed.
Unfortunately I do not have a simple self-contained test case for this.
However, it seems pretty clear that the call to `AddDbgValue()` in
`salvageDebugInfo()` causes the problems, since it modifies
`SelectionDag::DbgInfo` while looping through one of its DenseMaps:
```
void SelectionDAG::salvageDebugInfo(SDNode &N) {
[...]
for (auto DV : GetDbgValues(&N)) {
if (DV->isInvalidated())
continue;
[...]
AddDbgValue(Clone, N0.getNode(), false);
[...]
}
}
```
At least, if I comment out the `AddDbgValue()` call, the crashes go
away. I propose to change this function slightly, similar to the
`SelectionDAG::transferDbgValues()` function just above it, to save the
cloned SDDbgValues in a separate SmallVector, and only call
AddDbgValue() on them after the for loop is done.
Craig Topper [Thu, 28 Dec 2017 19:46:14 +0000 (19:46 +0000)]
[KnownBits] Remove asserts from KnownBits::makeNegative/makeNonNegative
Many of the callers don't guarantee there is no conflict before calling these and instead check for conflicts later.
The makeNegative/makeNonNegative methods replaced Known.One.setSignBit() and Known.Zero.setSignBit() calls that didn't have asserts originally. So removing the asserts is no worse than the original code.
Craig Topper [Thu, 28 Dec 2017 19:46:11 +0000 (19:46 +0000)]
[X86] When lowering extending loads from v2i1/v4i1, if we have VLX, use a narrower extend.
Previously we used an extend from v8i1 to v8i32/v8i64. Then extracted to the final width. But if we have VLX we should extract first. This way we don't end up with an overly large extend.
This allows us to use vcmpeq to make all ones for the sign extend when DQI isn't available. Otherwise we get a VPTERNLOG.
If we make v2i1/v4i1 legal like proposed in D41560, we could always do this and rely on the lowering of the extend to widen when necessary.
Gadi Haber [Thu, 28 Dec 2017 15:00:41 +0000 (15:00 +0000)]
[X86][PREFETCH]: Adding full coverage of MC encoding for the PREFETCH isa sets.<NFC>
NFC.
Adding MC regressions tests to cover the PREFETCH isa sets for both 32 and 64 bit.
This patch is part of a larger task to cover MC encoding of all X86 ISA Sets started in revision: https://reviews.llvm.org/D39952
[dsymutil][NFC] Replace calls to CoreFoundation with LLVM equivalent.
This patch replaces a block of logic that was implemented using
CoreFoundations calls with functionally equivalent logic that makes use
of LLVM libraries.
Max Kazantsev [Thu, 28 Dec 2017 12:03:12 +0000 (12:03 +0000)]
[RewriteStatepoints] Fix incorrect assertion
`RewriteStatepointsForGC` iterates over function blocks and their predecessors
in order of declaration. One of outcomes of this is that callsites are placed in
arbitrary order which has nothing to do with travelsar order.
On the other hand, function `recomputeLiveInValues` asserts that bases are
added to `Info.PointerToBase` before their deried pointers are updated. But
if call sites are processed in order different from RPOT, this is not necessarily
true. We cannot guarantee that the base was placed there before every
pointer derived from it. All we can guarantee is that this base was marked as
known base by this point.
This patch replaces the fact that we assert from checking that the base was
added to the map with assert that the base was marked as known base.
Simon Pilgrim [Thu, 28 Dec 2017 10:05:49 +0000 (10:05 +0000)]
[X86][SSE] Use PMADDWD for v4i32 multiplies with 17 or more leading zeros
If there are 17 or more leading zeros to the v4i32 elements, then we can use PMADD for the integer multiply when PMULLD is unavailable or slow.
The 17 bits need to be zero as the PMADDWD performs a v8i16 signed-mul-extend + pairwise-add - the upper 16 so we're adding a zero pair and the 17th bit so we don't incorrectly sign extend.
Reid Kleckner [Thu, 28 Dec 2017 05:10:33 +0000 (05:10 +0000)]
Revert "[memcpyopt] Teach memcpyopt to optimize across basic blocks"
This reverts r321138. It seems there are still underlying issues with
memdep. PR35519 seems to still be present if debug info is enabled. We
end up losing a memcpy. Somehow during store to memset merging, we
insert the memset after the memcpy or fail to update the memdep analysis
to account for the newly inserted memset of a pair.
int __attribute__((optnone)) main() {
// Put some data in the vector and then remove it so we take the push_back
// fast path.
std::vector<std::pair<std::string, std::vector<std::string>>> crl_set;
crl_set.push_back({"asdf", {}});
crl_set.pop_back();
printf("first word in vector storage: %p\n", *(void**)crl_set.data());
// Do the push_back which may fail to initialize the data.
do_push_back(&crl_set);
auto* first = &crl_set.back().first;
printf("first word in vector storage (should be zero): %p\n",
*(void**)crl_set.data());
assert(first->empty());
puts("ok");
}
Compile with libc++, enable optimizations, and enable debug info:
$ clang++ -stdlib=libc++ -g -O2 t.cpp -o t.exe -Wl,-rpath=llvm/build/lib