Tom Stellard [Mon, 17 Jun 2019 16:27:43 +0000 (16:27 +0000)]
AMDGPU/GlobalISel: Implement select for G_ICMP and G_SELECT
Reviewers: arsenm
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, hiraditya, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60640
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363576 91177308-0d34-0410-b5e6-96231b3b80d8
Francis Visoiu Mistrih [Mon, 17 Jun 2019 16:06:00 +0000 (16:06 +0000)]
[Remarks] Extend -fsave-optimization-record to specify the format
Use -fsave-optimization-record=<format> to specify a different format
than the default, which is YAML.
For now, only YAML is supported.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363573 91177308-0d34-0410-b5e6-96231b3b80d8
Simon Pilgrim [Mon, 17 Jun 2019 15:54:36 +0000 (15:54 +0000)]
[X86] combineLoad - begun making the load split code more generic. NFCI.
This is currently only used for ymm->xmm splitting but we shouldn't hardcode the offsets/alignment.
This is necessary for an upcoming patch to split under-aligned non-temporal vector loads.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363570 91177308-0d34-0410-b5e6-96231b3b80d8
Whitney Tsang [Mon, 17 Jun 2019 14:38:56 +0000 (14:38 +0000)]
PHINode: introduce setIncomingValueForBlock() function, and use it.
Summary:
There is PHINode::getBasicBlockIndex() and PHINode::setIncomingValue()
but no function to replace incoming value for a specified BasicBlock*
predecessor.
Clearly, there are a lot of places that could use that functionality.
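A minimal sketch of the lookup-then-replace pattern this helper wraps, assuming a toy model in which blocks are strings and values are ints. ToyPhi is hypothetical and is not llvm::PHINode; whether the real function updates one entry or every entry for a given block is not stated in this message.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a PHI node: (predecessor block name, incoming value) pairs.
// This is NOT llvm::PHINode; blocks are strings and values are ints.
struct ToyPhi {
  std::vector<std::pair<std::string, int>> Incoming;

  // Returns the index of the first entry for Block, or -1 if there is none.
  int getBasicBlockIndex(const std::string &Block) const {
    for (std::size_t I = 0; I < Incoming.size(); ++I)
      if (Incoming[I].first == Block)
        return static_cast<int>(I);
    return -1;
  }

  // Replaces the incoming value at a known index.
  void setIncomingValue(std::size_t Idx, int V) { Incoming[Idx].second = V; }

  // Convenience wrapper: look the block up, then replace its value.
  void setIncomingValueForBlock(const std::string &Block, int V) {
    int Idx = getBasicBlockIndex(Block);
    assert(Idx >= 0 && "Block is not a predecessor of this PHI");
    setIncomingValue(static_cast<std::size_t>(Idx), V);
  }
};
```

Callers that previously paired the index lookup with setIncomingValue() by hand can use the single call instead.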
Reviewer: craig.topper, lebedev.ri, Meinersbur, kbarton, fhahn
Reviewed By: Meinersbur, fhahn
Subscribers: fhahn, hiraditya, zzheng, jsji, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D63338
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363566 91177308-0d34-0410-b5e6-96231b3b80d8
Simon Pilgrim [Mon, 17 Jun 2019 14:38:17 +0000 (14:38 +0000)]
[X86][SSE] Add tests for underaligned nt loads
Test both 'unaligned' loads (for which we should just use regular unaligned loads) and 'subvector aligned' loads (which we should split)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363565 91177308-0d34-0410-b5e6-96231b3b80d8
Simon Pilgrim [Mon, 17 Jun 2019 14:26:10 +0000 (14:26 +0000)]
[X86][SSE] Prevent misaligned non-temporal vector load/store combines
For loads, pre-SSE41 we can't perform NT loads at all, and after that we can only perform vector aligned loads, so if the alignment is less than that of an xmm register we'll just end up using the regular unaligned vector loads anyway.
First step towards fixing PR42026 - the next step for stores will be to use SSE4A movntsd where possible and to avoid the stack spill on SSE2 targets.
Differential Revision: https://reviews.llvm.org/D63246
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363564 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Mon, 17 Jun 2019 14:13:29 +0000 (14:13 +0000)]
InferAddressSpaces: Fix cloning original addrspacecast
If an addrspacecast needed to be inserted again, this was creating a
clone of the original cast for each user. Just use the original, which
also saves losing the value name.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363562 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Mon, 17 Jun 2019 14:13:24 +0000 (14:13 +0000)]
AMDGPU: Ignore subtarget for InferAddressSpaces
Even if the target doesn't have flat instructions, addrspace(0) is
still flat. It just happens to not work.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363561 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Mon, 17 Jun 2019 13:52:24 +0000 (13:52 +0000)]
AMDGPU: Mark exp/exp.compr as inaccessiblememonly
Should also be marked writeonly, but I think that would require
splitting the version with done set to a separate intrinsic
Test change is only from renumbering the attribute group numbers,
which for some reason the generated check lines consider.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363560 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Mon, 17 Jun 2019 13:52:19 +0000 (13:52 +0000)]
AMDGPU/GlobalISel: Fix default mapping for non-register operands
Tests will be in future commits when new intrinsics are handled here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363559 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Mon, 17 Jun 2019 13:52:15 +0000 (13:52 +0000)]
AMDGPU: Cleanup custom PseudoSourceValue definitions
Use separate enums for each kind, avoid repeating overloads, and add
missing classof implementation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363558 91177308-0d34-0410-b5e6-96231b3b80d8
Sam Parker [Mon, 17 Jun 2019 13:39:28 +0000 (13:39 +0000)]
[CodeGen] Check for HardwareLoop Latch ExitBlock
The HardwareLoops pass finds exit blocks with a scevable exit count.
If the target specifies to update the loop counter in a register,
through a phi, we need to ensure that the exit block is a latch so
that we can insert the phi with the correct value for the incoming
edge.
Differential Revision: https://reviews.llvm.org/D63336
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363556 91177308-0d34-0410-b5e6-96231b3b80d8
Simon Pilgrim [Mon, 17 Jun 2019 12:35:26 +0000 (12:35 +0000)]
[X86][SSE] Avoid unnecessary stack codegen in NT store codegen tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363552 91177308-0d34-0410-b5e6-96231b3b80d8
Nicolai Haehnle [Mon, 17 Jun 2019 12:24:04 +0000 (12:24 +0000)]
AsmPrinter: add doc-string for EmitLinkage
Change-Id: I376fcbd58f84a2aac6aaf744bc1665c92d312b25
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363550 91177308-0d34-0410-b5e6-96231b3b80d8
Nico Weber [Mon, 17 Jun 2019 12:18:27 +0000 (12:18 +0000)]
gn build: Merge r363530
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363549 91177308-0d34-0410-b5e6-96231b3b80d8
Bjorn Pettersson [Mon, 17 Jun 2019 12:02:24 +0000 (12:02 +0000)]
[LV] Deny irregular types in interleavedAccessCanBeWidened
Summary:
Prevent the loop vectorizer from creating loads/stores of vectors
with "irregular" types when interleaving. An example of
an irregular type is x86_fp80 that is 80 bits, but that
may have an allocation size that is 96 bits. So an array
of x86_fp80 is not bitcast compatible with a vector
of the same type.
Not sure if interleavedAccessCanBeWidened is the best
place for this check, but it solves the problem seen
in the added test case. And it is the same kind of check
that already exists in memoryInstructionCanBeWidened.
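A small sketch of the size mismatch described above, assuming a toy type description rather than LLVM's DataLayout. ToyTypeSize, isIrregular, arrayBits and vectorBits are hypothetical names, and the vector is assumed to pack its elements without padding.

```cpp
#include <cstdint>

// Hypothetical size description for a scalar type, in bits.
struct ToyTypeSize {
  std::uint64_t SizeInBits;      // bits that carry the value
  std::uint64_t AllocSizeInBits; // bits one array element occupies, padding included
};

// "Irregular" in the sense used above: elements of an array carry padding.
bool isIrregular(const ToyTypeSize &T) {
  return T.SizeInBits != T.AllocSizeInBits;
}

// Bits spanned by an array of N elements (each element uses its alloc size).
std::uint64_t arrayBits(const ToyTypeSize &T, std::uint64_t N) {
  return N * T.AllocSizeInBits;
}

// Bits spanned by a vector of N elements, assuming tightly packed elements.
std::uint64_t vectorBits(const ToyTypeSize &T, std::uint64_t N) {
  return N * T.SizeInBits;
}
```

With the 80-bit size and 96-bit allocation size quoted above, four array elements span 384 bits while a packed 4-element vector spans 320, so the two layouts cannot be bitcast to each other.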
Reviewers: fhahn, Ayal, craig.topper
Reviewed By: fhahn
Subscribers: hiraditya, rkruppe, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63386
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363547 91177308-0d34-0410-b5e6-96231b3b80d8
Sander de Smalen [Mon, 17 Jun 2019 12:01:53 +0000 (12:01 +0000)]
Test forward references in IntrinsicEmitter on Neon LD(2|3|4)
This patch tests the forward-referencing added in D62995 by changing
some existing intrinsics to use forward referencing of overloadable
parameters, rather than backward referencing.
This patch changes the TableGen definition/implementation of
llvm.aarch64.neon.ld2lane and llvm.aarch64.neon.ld2lane intrinsics
(and similar for ld3 and ld4). This change is intended to be
non-functional, since the behaviour of the intrinsics is
expected to be the same.
Reviewers: arsenm, dmgreen, RKSimon, greened, rnk
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D63189
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363546 91177308-0d34-0410-b5e6-96231b3b80d8
Luis Marques [Mon, 17 Jun 2019 10:54:12 +0000 (10:54 +0000)]
[DAGCombiner] [CodeGenPrepare] More comprehensive GEP splitting
Some GEPs were not being split, presumably because that split would just be
undone by the DAGCombiner. Not performing those splits can prevent important
optimizations, such as folding the element indices / member offsets
(partially) into load/store instruction immediates. This patch:
- Makes the splits also occur in the cases where the base address and the GEP
are in the same BB.
- Ensures that the DAGCombiner doesn't reassociate them back again.
Differential Revision: https://reviews.llvm.org/D60294
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363544 91177308-0d34-0410-b5e6-96231b3b80d8
Fangrui Song [Mon, 17 Jun 2019 10:20:20 +0000 (10:20 +0000)]
Fix clang -Wcovered-switch-default after stack-id change by D60137
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363543 91177308-0d34-0410-b5e6-96231b3b80d8
Simon Pilgrim [Mon, 17 Jun 2019 10:14:52 +0000 (10:14 +0000)]
[SelectionDAG] Fold insert_subvector(undef, extract_subvector(v, c), c) -> v in getNode
This is already done in DAGCombiner::visitINSERT_SUBVECTOR, but this helps a number of shuffles across different vector widths recognise when they come from the same source.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363542 91177308-0d34-0410-b5e6-96231b3b80d8
Sam Parker [Mon, 17 Jun 2019 10:05:18 +0000 (10:05 +0000)]
[SCEV] Use NoWrapFlags when expanding a simple mul
Second functional change following on from rL362687. Pass the
NoWrapFlags from the MulExpr to InsertBinop when we're generating a
shl or mul.
Differential Revision: https://reviews.llvm.org/D61934
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363540 91177308-0d34-0410-b5e6-96231b3b80d8
Fangrui Song [Mon, 17 Jun 2019 09:59:55 +0000 (09:59 +0000)]
[llvm-objdump] Use %08 instead of %016 to print leading addresses for 32-bit binaries
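A hedged sketch of the width choice named in the title, using standard printf-style formatting. formatLeadingAddress and its Is64Bit parameter are hypothetical and are not taken from llvm-objdump's sources.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats an address as zero-padded hex: 8 digits for 32-bit binaries,
// 16 digits otherwise. Hypothetical helper, not llvm-objdump code.
std::string formatLeadingAddress(std::uint64_t Addr, bool Is64Bit) {
  char Buf[32];
  std::snprintf(Buf, sizeof(Buf), Is64Bit ? "%016" PRIx64 : "%08" PRIx64,
                Addr);
  return Buf;
}
```

For address 0x1000 this yields "00001000" in the 32-bit case and "0000000000001000" in the 64-bit case.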
Reviewed By: grimar
Differential Revision: https://reviews.llvm.org/D63398
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363539 91177308-0d34-0410-b5e6-96231b3b80d8
Fangrui Song [Mon, 17 Jun 2019 09:51:07 +0000 (09:51 +0000)]
[lit] Delete empty lines at the end of lit.local.cfg NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363538 91177308-0d34-0410-b5e6-96231b3b80d8
Roman Lebedev [Mon, 17 Jun 2019 09:50:50 +0000 (09:50 +0000)]
[NFC][Codegen] Standalone tests for icmp eq/ne (urem %x, C), 0 -> icmp eq/ne %x, 0 fold (D63390)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363537 91177308-0d34-0410-b5e6-96231b3b80d8
Fangrui Song [Mon, 17 Jun 2019 09:29:50 +0000 (09:29 +0000)]
[ARM] Fix another -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after D63265
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363535 91177308-0d34-0410-b5e6-96231b3b80d8
Fangrui Song [Mon, 17 Jun 2019 09:26:50 +0000 (09:26 +0000)]
[ARM] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after D63265
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363534 91177308-0d34-0410-b5e6-96231b3b80d8
Sander de Smalen [Mon, 17 Jun 2019 09:13:29 +0000 (09:13 +0000)]
Describe stack-id as an enum
This patch changes MIR stack-id from an integer to an enum,
and adds printing/parsing support for this in MIR files. The default
stack-id '0' is now renamed to 'default'.
This should make MIR tests that have stack objects with different stack-ids
more descriptive. It also clarifies code operating on StackID.
Reviewers: arsenm, thegameg, qcolombet
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D60137
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363533 91177308-0d34-0410-b5e6-96231b3b80d8
Sam Parker [Mon, 17 Jun 2019 09:13:10 +0000 (09:13 +0000)]
[ARM] Remove ARMComputeBlockSize
Forgot to remove file!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363532 91177308-0d34-0410-b5e6-96231b3b80d8
Sam Parker [Mon, 17 Jun 2019 09:05:43 +0000 (09:05 +0000)]
[ARM] Add ARMBasicBlockInfo.cpp
Forgot to add file!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363531 91177308-0d34-0410-b5e6-96231b3b80d8
Sam Parker [Mon, 17 Jun 2019 08:49:09 +0000 (08:49 +0000)]
[ARM] Extract some code from ARMConstantIslandPass
Create the ARMBasicBlockUtils class for tracking and querying basic
blocks sizes so we can use them when generating low-overhead loops.
Differential Revision: https://reviews.llvm.org/D63265
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363530 91177308-0d34-0410-b5e6-96231b3b80d8
Hans Wennborg [Mon, 17 Jun 2019 07:47:28 +0000 (07:47 +0000)]
Re-commit r357452 (take 3): "SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)"
Third time's the charm.
This was reverted in r363220 because it was suspected of causing an internal
benchmark regression and a test failure, neither of which turned out to be
caused by this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363529 91177308-0d34-0410-b5e6-96231b3b80d8
Yevgeny Rouban [Mon, 17 Jun 2019 05:55:12 +0000 (05:55 +0000)]
[SimplifyCFG] Fix prof branch_weights MD while removing unreachable switch cases
SimplifyCFG has a bug that results in inconsistent prof branch_weights metadata
if unreachable switch cases are removed. This patch fixes this bug by making use
of the newly introduced SwitchInstProfUpdateWrapper class (see patch D62122).
A new test is created.
Differential Revision: https://reviews.llvm.org/D62186
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363527 91177308-0d34-0410-b5e6-96231b3b80d8
Justin Hibbits [Mon, 17 Jun 2019 03:15:23 +0000 (03:15 +0000)]
PowerPC: Optimize SPE double parameter calling setup
Summary:
SPE passes doubles the same as soft-float, in register pairs as i32
types. This is all handled by the target-independent layer. However,
this is not optimal when splitting or reforming the doubles, as it
stores to the stack and loads back from it on either side.
For instance, to pass a double argument to a function, assuming the
double value is in r5, the sequence currently looks like this:
evstdd 5, X(1)
lwz 3, X(1)
lwz 4, X+4(1)
Likewise, to form a double into r5 from args in r3 and r4:
stw 3, X(1)
stw 4, X+4(1)
evldd 5, X(1)
This optimizes the fence to use SPE instructions. Now, to pass a double
to a function:
mr 4, 5
evmergehi 3, 5, 5
And to form a double into r5 from args in r3 and r4:
evmergelo 5, 3, 4
This is comparable to the way that gcc generates the double splits.
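The register shuffling above can be modelled in a few lines. This is only a sketch: it assumes evmergehi places the high words of its two sources into the high and low words of the destination, that evmergelo does the same with the low words, and that a 32-bit argument sees only the low word of a 64-bit register. Reg64, lowWord and the two functions are illustrative names, not compiler code.

```cpp
#include <cstdint>

// Toy model of a 64-bit SPE register; the "high" word is bits 63..32 here.
using Reg64 = std::uint64_t;

// Assumed evmergehi rD, rA, rB semantics:
// high word of rA -> high word of rD, high word of rB -> low word of rD.
Reg64 evmergehi(Reg64 A, Reg64 B) {
  return (A & 0xFFFFFFFF00000000ULL) | (B >> 32);
}

// Assumed evmergelo rD, rA, rB semantics:
// low word of rA -> high word of rD, low word of rB -> low word of rD.
Reg64 evmergelo(Reg64 A, Reg64 B) {
  return (A << 32) | (B & 0xFFFFFFFFULL);
}

// Low 32 bits of a register, i.e. what a 32-bit integer argument sees.
std::uint32_t lowWord(Reg64 R) { return static_cast<std::uint32_t>(R); }
```

Under these assumptions, splitting r5 with "mr 4, 5" and "evmergehi 3, 5, 5" leaves the high word in the low half of r3 and the low word in r4, and "evmergelo 5, 3, 4" reassembles the original 64-bit value.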
This also fixes a bug with expanding builtins to libcalls, where the
LowerCallTo() code path was generating intermediate illegal type nodes.
Reviewers: nemanjai, hfinkel, joerg
Subscribers: kbarton, jfb, jsji, llvm-commits
Differential Revision: https://reviews.llvm.org/D54583
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363526 91177308-0d34-0410-b5e6-96231b3b80d8
Seiya Nuta [Mon, 17 Jun 2019 02:07:20 +0000 (02:07 +0000)]
[yaml2obj][MachO] Don't fill dummy data for virtual sections
Summary:
Currently, MachOWriter::writeSectionData writes dummy data (0xdeadbeef) to fill section data areas in the file even if the section is a virtual one. Since virtual sections don't occupy any space in the file, writing dummy data could result in the "OS.tell() - fileStart <= Sec.offset" assertion failure.
This patch fixes the bug by simply not writing any dummy data for virtual sections.
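A minimal sketch of the fixed behaviour, assuming a toy section record. ToySection and this free-standing writeSectionData are hypothetical, not the MachOWriter code, and the byte order of the filler pattern is chosen only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical section record: size in bytes and whether it is virtual
// (occupies no space in the file).
struct ToySection {
  std::size_t Size;
  bool IsVirtual;
};

// Appends filler bytes for every non-virtual section and skips virtual ones,
// so the output offset never runs ahead of the next real section.
void writeSectionData(const std::vector<ToySection> &Sections,
                      std::vector<std::uint8_t> &Out) {
  static const std::uint8_t Filler[4] = {0xDE, 0xAD, 0xBE, 0xEF};
  for (const ToySection &Sec : Sections) {
    if (Sec.IsVirtual)
      continue; // virtual sections contribute nothing to the file
    for (std::size_t I = 0; I < Sec.Size; ++I)
      Out.push_back(Filler[I % 4]);
  }
}
```

With sections of 8, 16 (virtual) and 4 bytes, only 12 bytes are written.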
Reviewers: beanz, jhenderson, rupprecht, alexshap
Reviewed By: alexshap
Subscribers: compnerd, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62991
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363525 91177308-0d34-0410-b5e6-96231b3b80d8
Seiya Nuta [Mon, 17 Jun 2019 02:03:45 +0000 (02:03 +0000)]
[llvm-objcopy] Add elf32-sparc and elf32-sparcel target
Summary:
The "sparc"/"sparcel" architectures appear in ArchMap (used by -B option) but not in OutputFormatMap (used by -I/-O option). Add their targets into OutputFormatMap for consistency.
Note that AFAIK there're no targets for 32-bit little-endian SPARC ("elf32-sparcel") in GNU binutils.
Reviewers: espindola, alexshap, rupprecht, jhenderson, compnerd, jakehehrlich
Reviewed By: jhenderson, compnerd, jakehehrlich
Subscribers: jyknight, emaste, arichardson, fedor.sergeev, jakehehrlich, MaskRay, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63238
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363524 91177308-0d34-0410-b5e6-96231b3b80d8
Craig Topper [Sun, 16 Jun 2019 22:33:09 +0000 (22:33 +0000)]
[X86] Add TB_NO_REVERSE to some folding table entries where the register form uses the REX prefix, but the memory form does not.
It would not be safe to unfold the memory form to the register form
without checking that we are compiling for 64-bit mode.
This probably isn't a real functional issue since we are unlikely
to unfold any of these instructions since they don't have any
tied registers, aren't commutable, and don't have any inputs
other than the address.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363523 91177308-0d34-0410-b5e6-96231b3b80d8
Roman Lebedev [Sun, 16 Jun 2019 20:39:45 +0000 (20:39 +0000)]
[InstSimplify] Fix addo/subo undef folds (PR42209)
Fix folds of addo and subo with an undef operand to be:
`@llvm.{u,s}{add,sub}.with.overflow` all fold to `{ undef, false }`,
as per LLVM undef rules.
Same for commuted variants.
Based on the original version of the patch by @nikic.
Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=42209 | PR42209 ]]
Differential Revision: https://reviews.llvm.org/D63065
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363522 91177308-0d34-0410-b5e6-96231b3b80d8
Nicolai Haehnle [Sun, 16 Jun 2019 18:30:42 +0000 (18:30 +0000)]
[AsmPrinter] Make EmitLinkage and EmitVisibility public
Summary:
This allows targets to implement custom emission of global variables if
required. See subsequent patch for a use case.
Change-Id: I9654197e3df24503104a54c41fff06845aed37fe
Reviewers: arsenm, kzhuravl
Subscribers: wdng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61650
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363519 91177308-0d34-0410-b5e6-96231b3b80d8
Nicolai Haehnle [Sun, 16 Jun 2019 17:43:37 +0000 (17:43 +0000)]
AMDGPU: Prepare for explicit absolute relocations in code generation
Summary:
We will use absolute relocations for LDS symbols.
Change-Id: I9a32795ed0ea835e433a787129cfe3c57ee9a325
Reviewers: arsenm, rampitec
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61492
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363517 91177308-0d34-0410-b5e6-96231b3b80d8
Nicolai Haehnle [Sun, 16 Jun 2019 17:32:01 +0000 (17:32 +0000)]
AMDGPU: Be explicit about whether the high-word in SI_PC_ADD_REL_OFFSET is 0
Summary:
Instead of encoding a high-word of 0 using a fake TargetGlobalAddress,
just use a literal target constant. This simplifies some subsequent changes.
The generated assembly is now more explicit about the kind of relocation
that is to be used.
Change-Id: I066835202d23b5941fa7a358eb4b89e9b71ab6f8
Reviewers: arsenm, rampitec
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61491
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363516 91177308-0d34-0410-b5e6-96231b3b80d8
Nicolai Haehnle [Sun, 16 Jun 2019 17:14:12 +0000 (17:14 +0000)]
AMDGPU/GFX10: Support DLC bit in llvm.amdgcn.s.buffer.load intrinsic
Summary: Change-Id: Ie4c971462a7749740938c687144e77441dac2539
Reviewers: rampitec, arsenm
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62486
Change-Id: Iae59523edd75c74918d2118df6571a7b671717a0
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363514 91177308-0d34-0410-b5e6-96231b3b80d8
Stanislav Mekhanoshin [Sun, 16 Jun 2019 17:13:09 +0000 (17:13 +0000)]
[AMDGPU] gfx10 conditional registers handling
This is the cpp source part of wave32 support, excluding overridden
getRegClass().
Differential Revision: https://reviews.llvm.org/D63351
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363513 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjay Patel [Sun, 16 Jun 2019 15:29:03 +0000 (15:29 +0000)]
[CodeGenPrepare][x86] shift both sides of a vector select when profitable
This is based on the example/discussion in PR37428:
https://bugs.llvm.org/show_bug.cgi?id=37428
Proper vector shift instructions don't appear until AVX2, so we may generate several
extra instructions within a loop trying to compensate for that. It's difficult to
recover from that shift expansion later than this, so use the existing TLI hook and
splat analysis to enable better codegen.
This extends CGP functionality introduced with:
rL201655
Differential Revision: https://reviews.llvm.org/D63233
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363511 91177308-0d34-0410-b5e6-96231b3b80d8
Sanjay Patel [Sun, 16 Jun 2019 14:04:49 +0000 (14:04 +0000)]
[x86] split 256-bit vector selects if operands are vector concats
This is similar logic/motivation to the select splitting in D62969.
In D63233, the pattern changes so that we no longer have an extract_subvector of vselect,
but the operands of the select are still being concatenated.
The closest case is represented in either the first or last test diffs here - we have an
extra instruction, but we converted 3-4 ymm instructions into 4-5 xmm instructions.
I think that's the right trade-off for most AVX1 targets.
In the example based on PR37428:
https://bugs.llvm.org/show_bug.cgi?id=37428
...this makes the loop about 30% faster (tested on Haswell by compiling with -mavx).
Differential Revision: https://reviews.llvm.org/D63364
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363508 91177308-0d34-0410-b5e6-96231b3b80d8
Simon Pilgrim [Sun, 16 Jun 2019 08:00:41 +0000 (08:00 +0000)]
[X86] CombineShuffleWithExtract - handle cases with different vector extract sources
Insert the shorter vector source into an undef vector of the longer vector source's type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363507 91177308-0d34-0410-b5e6-96231b3b80d8
Nico Weber [Sun, 16 Jun 2019 02:24:01 +0000 (02:24 +0000)]
gn build: Merge r363444
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363505 91177308-0d34-0410-b5e6-96231b3b80d8
Simon Pilgrim [Sat, 15 Jun 2019 19:12:44 +0000 (19:12 +0000)]
[X86] CombineShuffleWithExtract - assert all src ops types are multiples of rootsize. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363501 91177308-0d34-0410-b5e6-96231b3b80d8
Simon Pilgrim [Sat, 15 Jun 2019 18:30:43 +0000 (18:30 +0000)]
[X86][AVX] Handle lane-crossing shuffle(extract_subvector(x,c1),extract_subvector(y,c2),m1) shuffles
Pull out the existing (non)lane-crossing fold into a helper lambda and use for lane-crossing unary shuffles as well.
Fixes PR34380
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363500 91177308-0d34-0410-b5e6-96231b3b80d8
Simon Pilgrim [Sat, 15 Jun 2019 17:05:24 +0000 (17:05 +0000)]
[X86][AVX] Decode constant bits from insert_subvector(c1, c2, c3)
This mostly happens due to SimplifyDemandedVectorElts reducing a vector to insert_subvector(undef, c1, 0)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363499 91177308-0d34-0410-b5e6-96231b3b80d8
Roman Lebedev [Sat, 15 Jun 2019 16:12:13 +0000 (16:12 +0000)]
[NFC][MCA][X86] Add one more 'clear super register' pattern - movss/movsd load clears high XMM bits
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363498 91177308-0d34-0410-b5e6-96231b3b80d8
Roman Lebedev [Sat, 15 Jun 2019 16:12:05 +0000 (16:12 +0000)]
[NFC][MCA][X86] Add baseline test coverage for AMD Barcelona (aka K10, fam10h)
Looking into sched model for that CPU ...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363497 91177308-0d34-0410-b5e6-96231b3b80d8
Aaron Puchert [Sat, 15 Jun 2019 15:38:51 +0000 (15:38 +0000)]
[Clang] Harmonize Split DWARF options with llc
Summary:
With Split DWARF the resulting object file (then called skeleton CU)
contains the file name of another ("DWO") file with the debug info.
This can be a problem for remote compilation, as it will contain the
name of the file on the compilation server, not on the client.
To use Split DWARF with remote compilation, one needs to either
* make sure only relative paths are used, and mirror the build directory
structure of the client on the server, or
* inject the desired file name on the client directly.
Since llc already supports the latter solution, we're just copying that
over. We allow setting the actual output filename separately from the
value of the DW_AT_[GNU_]dwo_name attribute in the skeleton CU.
Fixes PR40276.
Reviewers: dblaikie, echristo, tejohnson
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D59673
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363496 91177308-0d34-0410-b5e6-96231b3b80d8
Kang Zhang [Sat, 15 Jun 2019 15:10:24 +0000 (15:10 +0000)]
[PowerPC] Set the innermost hot loop to align 32 bytes
Summary:
If the nested loop is an innermost loop, prefer a 32-byte alignment, so that
we can decrease cache misses and branch-prediction misses. Actual alignment of
the loop will depend on the hotness check and other logic in alignBlocks.
The old code aligned a hot loop to 32 bytes only when the LoopSize was larger
than 16 bytes and smaller than 32 bytes. This patch aligns the innermost hot
loop to 32 bytes regardless of whether its size is in the 16~32 byte range.
Reviewed By: steven.zhang, jsji
Differential Revision: https://reviews.llvm.org/D61228
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363495 91177308-0d34-0410-b5e6-96231b3b80d8
Gauthier Harnisch [Sat, 15 Jun 2019 10:24:47 +0000 (10:24 +0000)]
[clang] Add storage for APValue in ConstantExpr
Summary:
When using ConstantExpr we often need the result of the expression to be kept in the AST. Currently this is done by the node that needs the result, and it has been done multiple times: for enumerators, for constexpr variables, and so on. This patch adds to ConstantExpr the ability to store the result of evaluating the expression. No functional changes expected.
Changes:
- Add a trailing object to ConstantExpr that can hold an APValue or a uint64_t. The uint64_t is here because most ConstantExpr yield integral values, so there is an optimized layout for integral values.
- Add basic* serialization support for the trailing result.
- Move conversion functions from an enum to a fltSemantics from clang::FloatingLiteral to llvm::APFloatBase. This change is to make them usable for serializing APValues.
- Add basic* Import support for the trailing result.
- ConstantExpr created in CheckConvertedConstantExpression now stores the result in the ConstantExpr Node.
- Adapt AST dump to print the result when present.
basic* : None, Indeterminate, Int, Float, FixedPoint, ComplexInt, ComplexFloat,
the result is not yet used anywhere but for -ast-dump.
Reviewers: rsmith, martong, shafik
Reviewed By: rsmith
Subscribers: rnkovacs, hiraditya, dexonsmith, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D62399
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363493 91177308-0d34-0410-b5e6-96231b3b80d8
Fangrui Song [Sat, 15 Jun 2019 10:09:59 +0000 (10:09 +0000)]
[BranchProbability] Delete a redundant overflow check
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363492 91177308-0d34-0410-b5e6-96231b3b80d8
Nikita Popov [Sat, 15 Jun 2019 09:15:52 +0000 (09:15 +0000)]
[SCEV] Use unsigned/signed intersection type in SCEV
Based on D59959, this switches SCEV to use unsigned/signed range
intersection based on the sign hint. This will prefer non-wrapping
ranges in the relevant domain. I've left the one intersection in
getRangeForAffineAR() to use the smallest intersection heuristic,
as there doesn't seem to be any obvious preference there.
Differential Revision: https://reviews.llvm.org/D60035
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363490 91177308-0d34-0410-b5e6-96231b3b80d8
Nikita Popov [Sat, 15 Jun 2019 08:48:52 +0000 (08:48 +0000)]
[SimplifyIndVar] Simplify non-overflowing saturating add/sub
If we can detect that saturating math that depends on an IV cannot
overflow, replace it with simple math. This is similar to the CVP
optimization from D62703, just based on a different underlying
analysis (SCEV vs LVI) that catches different cases.
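To illustrate why the replacement is sound, here is a sketch on 8-bit unsigned values: when the induction variable is known to stay in a range where the sum cannot exceed the maximum, the saturating add and the plain add agree. uaddSat and the loop bound of 100 are illustrative assumptions, not taken from the patch.

```cpp
#include <cstdint>

// Unsigned saturating add on 8 bits: clamps at 255 instead of wrapping.
std::uint8_t uaddSat(std::uint8_t A, std::uint8_t B) {
  unsigned Sum = static_cast<unsigned>(A) + static_cast<unsigned>(B);
  return Sum > 255u ? static_cast<std::uint8_t>(255)
                    : static_cast<std::uint8_t>(Sum);
}

// The induction variable I stays in [0, 100), so I + 1 can never exceed 255.
// In that range the saturating add equals the plain add on every iteration.
bool saturatingMatchesPlainInLoop() {
  for (unsigned I = 0; I < 100; ++I) {
    std::uint8_t IV = static_cast<std::uint8_t>(I);
    std::uint8_t Plain = static_cast<std::uint8_t>(IV + 1);
    if (uaddSat(IV, 1) != Plain)
      return false;
  }
  return true;
}
```

Outside such a range the two differ: uaddSat(250, 10) clamps to 255, whereas a plain 8-bit add would wrap.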
Differential Revision: https://reviews.llvm.org/D62792
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363489 91177308-0d34-0410-b5e6-96231b3b80d8
Fangrui Song [Sat, 15 Jun 2019 07:49:14 +0000 (07:49 +0000)]
[RISCV] Regenerate remat.ll and atomic-rmw.ll after D43256
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363487 91177308-0d34-0410-b5e6-96231b3b80d8
Fangrui Song [Sat, 15 Jun 2019 06:14:15 +0000 (06:14 +0000)]
[RISCV] Simplify RISCVAsmBackend::writeNopData(). NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363486 91177308-0d34-0410-b5e6-96231b3b80d8
Alex Brachet [Sat, 15 Jun 2019 05:32:23 +0000 (05:32 +0000)]
[objcopy] Error when --preserve-dates is specified with standard streams
Summary: llvm-objcopy/strip now report an error when -p is specified while reading from stdin or writing to stdout.
Reviewers: jhenderson, rupprecht, espindola, alexshap
Reviewed By: jhenderson, rupprecht
Subscribers: emaste, arichardson, jakehehrlich, MaskRay, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63090
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363485 91177308-0d34-0410-b5e6-96231b3b80d8
Michael Berg [Sat, 15 Jun 2019 04:53:51 +0000 (04:53 +0000)]
adding more fmf propagation for selects plus updated tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363484 91177308-0d34-0410-b5e6-96231b3b80d8
Fangrui Song [Sat, 15 Jun 2019 03:51:08 +0000 (03:51 +0000)]
Revert "adding more fmf propagation for selects plus tests"
This reverts rL363474. -debug-only=isel was added to some tests that
don't specify `REQUIRES: asserts`. This causes failures on
-DLLVM_ENABLE_ASSERTIONS=off builds.
I chose to revert instead of fixing the tests because I'm not sure
whether we should add `REQUIRES: asserts` to more tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363482 91177308-0d34-0410-b5e6-96231b3b80d8
Huihui Zhang [Sat, 15 Jun 2019 00:33:41 +0000 (00:33 +0000)]
[InstCombine] Add tests to show missing fold opportunity for "icmp and shift" (nfc).
Summary:
For icmp pred (and (sh X, Y), C), 0
When C is signbit, expect to fold (X << Y) & signbit ==/!= 0 into (X << Y) >=/< 0,
rather than (X & (signbit >> Y)) != 0.
When C+1 is power of 2, expect to fold (X << Y) & ~C ==/!= 0 into (X << Y) </>= C+1,
rather than (X & (~C >> Y)) == 0.
For icmp pred (and X, (sh signbit, Y)), 0
Expect to fold (X & (signbit l>> Y)) ==/!= 0 into (X << Y) >=/< 0
Expect to fold (X & (signbit << Y)) ==/!= 0 into (X l>> Y) >=/< 0
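The first expected fold can be checked exhaustively on 8-bit values. The sketch below compares the masked form, (X << Y) & signbit == 0, with the signed-comparison form, (X << Y) >= 0; the function names are illustrative, and the check covers i8 with shift amounts 0..7 only.

```cpp
#include <cstdint>

// Masked form: (X << Y) & signbit == 0, on 8 bits.
bool maskedSignBitIsZero(std::uint8_t X, unsigned Y) {
  std::uint8_t Shifted = static_cast<std::uint8_t>(X << Y);
  return (Shifted & 0x80u) == 0;
}

// Signed-comparison form: (X << Y) >= 0 with the result read as a signed i8.
bool shiftedIsNonNegative(std::uint8_t X, unsigned Y) {
  std::uint8_t Shifted = static_cast<std::uint8_t>(X << Y);
  // Reinterpret the 8-bit pattern as two's complement without relying on
  // implementation-defined narrowing conversions.
  int Signed = Shifted >= 128 ? static_cast<int>(Shifted) - 256
                              : static_cast<int>(Shifted);
  return Signed >= 0;
}

// Compares both forms for every 8-bit X and every shift amount 0..7.
bool formsAgreeOnI8() {
  for (unsigned X = 0; X < 256; ++X)
    for (unsigned Y = 0; Y < 8; ++Y)
      if (maskedSignBitIsZero(static_cast<std::uint8_t>(X), Y) !=
          shiftedIsNonNegative(static_cast<std::uint8_t>(X), Y))
        return false;
  return true;
}
```

Both forms test the same bit, so the comparison form saves the explicit mask.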
Reviewers: lebedev.ri, efriedma, spatel, craig.topper
Reviewed By: lebedev.ri
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63025
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363479 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Sat, 15 Jun 2019 00:33:26 +0000 (00:33 +0000)]
Reapply "GlobalISel: Avoid producing Illegal copies in RegBankSelect"
This reapplies r363410, avoiding null dereference if there is no
AltRegBank.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363478 91177308-0d34-0410-b5e6-96231b3b80d8
Richard Smith [Fri, 14 Jun 2019 23:56:40 +0000 (23:56 +0000)]
Add a map_range function for applying map_iterator to a range.
In preparation for use in Clang.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363477 91177308-0d34-0410-b5e6-96231b3b80d8
Mitch Phillips [Fri, 14 Jun 2019 23:45:34 +0000 (23:45 +0000)]
Revert "GlobalISel: Avoid producing Illegal copies in RegBankSelect"
This patch breaks UBSan build bots. See
https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild for
a guide as to how to reproduce the error.
This reverts commit c2864c0de07efb5451d32d27a7d4ff2984830929.
This reverts rL363410.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363476 91177308-0d34-0410-b5e6-96231b3b80d8
Michael Berg [Fri, 14 Jun 2019 23:30:52 +0000 (23:30 +0000)]
adding more fmf propagation for selects plus tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363474
91177308-0d34-0410-b5e6-
96231b3b80d8
Guozhi Wei [Fri, 14 Jun 2019 23:08:59 +0000 (23:08 +0000)]
[MBP] Move a latch block with conditional exit and multi predecessors to top of loop
The current findBestLoopTop can find and move only one kind of block to the top: a latch block that has one successor. Another common case is:
* a latch block
* it has two successors, one is the loop header, the other is an exit
* it has more than one predecessor
If it is placed below one of its predecessors P, only P can fall through to it; all other predecessors need a jump to it, followed by another conditional jump to the loop header. If it is moved before the loop header, all its predecessors jump to it and it then falls through to the loop header. So every predecessor except P saves one taken branch.
Differential Revision: https://reviews.llvm.org/D43256
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363471
91177308-0d34-0410-b5e6-
96231b3b80d8
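The branch-count argument in the MBP commit above can be written out as a small model. It is a sketch of the arithmetic only, under the assumption that each predecessor reaches the loop header through the latch exactly once; the function names are hypothetical and do not come from MachineBlockPlacement.

```cpp
#include <cassert>

// Taken branches on the path predecessor -> latch -> loop header.
// IsFallThroughPred: true for the one predecessor P laid out directly
//                    above the latch.
// LatchBeforeHeader: true once the latch has been moved to the loop top.
inline unsigned takenBranchesToHeader(bool IsFallThroughPred,
                                      bool LatchBeforeHeader) {
  if (LatchBeforeHeader) {
    // Every predecessor jumps to the latch; the latch falls through.
    return 1;
  }
  // Latch below P: P falls through, the others jump to the latch, and
  // the latch then takes its conditional jump back to the header.
  unsigned ToLatch = IsFallThroughPred ? 0u : 1u;
  return ToLatch + 1;
}

// Sum over all predecessors; NumPreds must be at least 1, and exactly
// one predecessor is assumed to be the fall-through predecessor P.
inline unsigned totalTakenBranches(unsigned NumPreds, bool LatchBeforeHeader) {
  unsigned Total = takenBranchesToHeader(true, LatchBeforeHeader);
  for (unsigned I = 1; I < NumPreds; ++I)
    Total += takenBranchesToHeader(false, LatchBeforeHeader);
  return Total;
}
```

With N predecessors the model gives 2N - 1 taken branches before the move and N after it, which matches the saving of one taken branch for every predecessor except P.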
Akira Hatanaka [Fri, 14 Jun 2019 22:06:32 +0000 (22:06 +0000)]
[ObjC][ARC] Delete ObjC runtime calls on global variables annotated
with 'objc_arc_inert'
Those calls are no-ops, so they can be safely deleted.
rdar://problem/49839633
Differential Revision: https://reviews.llvm.org/D62433
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363468
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 14 Jun 2019 21:52:26 +0000 (21:52 +0000)]
AMDGPU: Avoid most waitcnts before calls
Currently you get extra waits, because waits are inserted for the
register dependencies of the call, and the function prolog waits on
everything.
Currently waits are still inserted on returns. It may make sense to
not do this, and wait in the caller instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363465
91177308-0d34-0410-b5e6-
96231b3b80d8
Ziang Wan [Fri, 14 Jun 2019 21:42:21 +0000 (21:42 +0000)]
Add --print-supported-cpus flag for clang.
This patch allows clang users to print out a list of supported CPU models using
clang [--target=<target triple>] --print-supported-cpus
Then, users can select the CPU model to compile to using
clang --target=<triple> -mcpu=<model> a.c
It is a handy feature to help cross compilation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363464
91177308-0d34-0410-b5e6-
96231b3b80d8
Francis Visoiu Mistrih [Fri, 14 Jun 2019 21:38:57 +0000 (21:38 +0000)]
[Remarks][NFC] Improve testing and documentation of -foptimization-record-passes
This adds:
* documentation to the user manual
* nicer error message
* test for the error case
* test for the gold plugin
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363463
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 14 Jun 2019 21:38:31 +0000 (21:38 +0000)]
SROA: Allow eliminating addrspacecasted allocas
There is a circular dependency between SROA and InferAddressSpaces
today that requires running both multiple times in order to be able to
eliminate all simple allocas and addrspacecasts. InferAddressSpaces
can't remove addrspacecasts when written to memory, and SROA helps
move pointers out of memory.
This should avoid inserting new commuting addrspacecasts with GEPs,
since there are unresolved questions about pointer wrapping between
different address spaces.
For now, don't replace volatile operations that don't match the alloca
addrspace, as it would change the address space of the access. It may
be still OK to insert an addrspacecast from the new alloca, but be
more conservative for now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363462
91177308-0d34-0410-b5e6-
96231b3b80d8
Jinsong Ji [Fri, 14 Jun 2019 21:33:51 +0000 (21:33 +0000)]
[PowerPC][NFC] Comments update and remove some unused def
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363461
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 14 Jun 2019 21:22:26 +0000 (21:22 +0000)]
SROA: Add baseline test for addrspacecast changes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363460
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 14 Jun 2019 21:16:06 +0000 (21:16 +0000)]
AMDGPU: Fix capitalized register names in asm constraints
This was a workaround a long time ago, but the canonical lower case
names work now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363459
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 14 Jun 2019 21:01:24 +0000 (21:01 +0000)]
AMDGPU: Fix dropping memref for ds append/consume
The way SelectionDAG treats memory operands is very frustrating, and
by default drops them unless a property is set on the pattern. There
is no pattern for manually selected instructions, so this requires
manually setting them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363455
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 14 Jun 2019 21:01:24 +0000 (21:01 +0000)]
AMDGPU: Set isTrap on S_TRAP
This seems to only be used for generating some kind
of documentation, but might as well set it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363454
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 14 Jun 2019 21:01:23 +0000 (21:01 +0000)]
AMDGPU: Add baseline test for call waitcnt insertion
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363453
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 14 Jun 2019 20:40:15 +0000 (20:40 +0000)]
UpdateTestChecks: Consider .section as end of function for AMDGPU
Kernels seem to go directly to a section switch instead of emitting
.Lfunc_end. This fixes including all of the kernel metadata in the
check lines, which is undesirable most of the time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363452
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Fri, 14 Jun 2019 20:03:42 +0000 (20:03 +0000)]
[x86] add test for 256-bit blendv with AVX targets; NFC
This is a reduction of the pattern seen in D63233.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363448
91177308-0d34-0410-b5e6-
96231b3b80d8
Lang Hames [Fri, 14 Jun 2019 19:41:21 +0000 (19:41 +0000)]
[JITLink] Move JITLinkMemoryManager into its own header.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363444
91177308-0d34-0410-b5e6-
96231b3b80d8
Saleem Abdulrasool [Fri, 14 Jun 2019 18:28:57 +0000 (18:28 +0000)]
build: extract LLVM distribution target handling
This extracts the LLVM distribution target handling into a support module.
Extraction will enable us to restructure the builds to support multiple
distribution configurations (e.g. developer and user) to permit us to build the
development package and the user package at once.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363440
91177308-0d34-0410-b5e6-
96231b3b80d8
Francis Visoiu Mistrih [Fri, 14 Jun 2019 18:18:26 +0000 (18:18 +0000)]
[Remarks] Use the RemarkSetup error in setupOptimizationRemarks
Added the errors in r363415 but they were not used in the
RemarkStreamer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363439
91177308-0d34-0410-b5e6-
96231b3b80d8
Nico Weber [Fri, 14 Jun 2019 18:07:00 +0000 (18:07 +0000)]
gn build: Add NVPTX target
The NVPTX target is a bit unusual in that it's the only target without a
disassembler, and one of three targets without an asm parser (and the
first one of those three in the gn build). NVPTX doesn't have those
because it's not a binary format.
The CMake build checks for the existence of
{AsmParser,Disassembler}/CMakeLists.txt when setting
LLVM_ENUM_ASM_PARSERS / LLVM_ENUM_DISASSEMBLERS
(http://llvm-cs.pcc.me.uk/CMakeLists.txt#744). The GN build doesn't want
to hit the disk for things like this, so instead I'm adding explicit
`targets_with_asm_parsers` and `targets_with_disassemblers` lists. Since
both are needed rarely, they are defined in their own gni files.
Differential Revision: https://reviews.llvm.org/D63210
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363437
91177308-0d34-0410-b5e6-
96231b3b80d8
Nico Weber [Fri, 14 Jun 2019 17:58:34 +0000 (17:58 +0000)]
gn build: Simplify Target build files
Now that the cycle between MCTargetDesc and TargetInfo is gone
(see revisions 360709 360718 360722 360724 360726 360731 360733 360735 360736),
remove the dependency from TargetInfo on MCTargetDesc:tablegen. In most
targets, this makes MCTargetDesc:tablegen have just a single use, so
inline it there.
For AArch64, ARM, and RISCV there's still a similar cycle between
MCTargetDesc and Utils, so the MCTargetDesc:tablegen indirection is
still needed there.
Differential Revision: https://reviews.llvm.org/D63200
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363436
91177308-0d34-0410-b5e6-
96231b3b80d8
Amara Emerson [Fri, 14 Jun 2019 17:55:48 +0000 (17:55 +0000)]
[GlobalISel] Add a G_BRJT opcode.
This is a branch opcode that takes a jump table pointer, jump table index and an
index into the table to do an indirect branch.
We pass both the table pointer and JTI to allow targets like ARM64 to more
easily use the existing jump table compression optimization without having to
walk up the block to find a paired G_JUMP_TABLE.
Differential Revision: https://reviews.llvm.org/D63159
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363434
91177308-0d34-0410-b5e6-
96231b3b80d8
Florian Hahn [Fri, 14 Jun 2019 17:23:09 +0000 (17:23 +0000)]
Revert Fix a bug w/inbounds invalidation in LFTR
Reverting because it breaks a green dragon build:
http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/18208
This reverts r363289 (git commit
eb88badff96dacef8fce3f003dec34c2ef6900bf)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363427
91177308-0d34-0410-b5e6-
96231b3b80d8
Florian Hahn [Fri, 14 Jun 2019 17:22:56 +0000 (17:22 +0000)]
Revert [LFTR] Stylistic cleanup as suggested in last review comment of D62939 [NFC]
Reverting because it depends on r363289, which breaks a green dragon build:
http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/18208
This reverts r363292 (git commit
42a3fc133d3544b5c0c032fe99c6e8a469a836c2)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363426
91177308-0d34-0410-b5e6-
96231b3b80d8
Florian Hahn [Fri, 14 Jun 2019 17:22:49 +0000 (17:22 +0000)]
Revert [LFTR] Rename variable to minimize confusion [NFC]
Reverting because it depends on r363289, which breaks a green dragon
build:
http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/18208
This reverts r363293 (git commit
c37be29634214fb1cb4c823840bffc31e5ebfe40)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363425
91177308-0d34-0410-b5e6-
96231b3b80d8
Aditya Nandakumar [Fri, 14 Jun 2019 17:19:37 +0000 (17:19 +0000)]
[GISel]: Fix pattern matcher for m_OneUse
https://reviews.llvm.org/D63302
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363424
91177308-0d34-0410-b5e6-
96231b3b80d8
Jinsong Ji [Fri, 14 Jun 2019 17:04:24 +0000 (17:04 +0000)]
[PowerPC][NFC] Format comments in P9InstrResources.td
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363423
91177308-0d34-0410-b5e6-
96231b3b80d8
Shawn Landden [Fri, 14 Jun 2019 16:56:49 +0000 (16:56 +0000)]
[SimplifyCFG] NFC intended, remove GCD that was only used for powers of two
and replace with an equivalent countTrailingZeros.
GCD is much more expensive than this, with repeated division.
This depends on D60823
Differential Revision: https://reviews.llvm.org/D61151
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363422
91177308-0d34-0410-b5e6-
96231b3b80d8
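The identity behind the SimplifyCFG change above is that the GCD of two powers of two is the smaller of the two, which can be found by counting trailing zero bits. The sketch below illustrates that identity in portable C++; it is not the code from the patch, and the helper names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Reference implementation: Euclid's algorithm, one division per step.
inline uint64_t gcdByDivision(uint64_t A, uint64_t B) {
  while (B != 0) {
    uint64_t T = A % B;
    A = B;
    B = T;
  }
  return A;
}

// Portable trailing-zero count; V must be nonzero.
inline unsigned trailingZeros(uint64_t V) {
  unsigned N = 0;
  while ((V & 1) == 0) {
    V >>= 1;
    ++N;
  }
  return N;
}

// Valid only when A and B are both powers of two: the GCD is then
// 2 raised to the smaller trailing-zero count, with no division needed.
inline uint64_t gcdOfPowersOfTwo(uint64_t A, uint64_t B) {
  return uint64_t(1) << std::min(trailingZeros(A), trailingZeros(B));
}
```

On real targets the trailing-zero count is typically a single instruction, whereas Euclid's algorithm repeats a division, which is the cost difference the commit message points at.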
Saleem Abdulrasool [Fri, 14 Jun 2019 16:47:04 +0000 (16:47 +0000)]
build: don't attempt to run config.guess on Windows
When cross-compiling LLVM to Android from Windows (for LLVMSupport), we would
attempt to execute `config.guess` to determine the host triple since
`CMAKE_SYSTEM_NAME` is not Windows and `CMAKE_C_COMPILER` will be set to GNU or
Clang. This will fail as `config.guess` is a shell script which cannot be
executed on Windows. Simply log a warning instead. The user can specify the
value for this instead in those cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363420
91177308-0d34-0410-b5e6-
96231b3b80d8
Valery Pykhtin [Fri, 14 Jun 2019 16:37:33 +0000 (16:37 +0000)]
[AMDGPU] Don't constrain callees with inlinehint from inlining on MaxBB check
Summary: Function bodies marked inline in an OpenCL source are eliminated, but the MaxBB check may prevent inlining them, leaving undefined references.
Reviewers: rampitec, arsenm
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, Anastasia, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63337
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363418
91177308-0d34-0410-b5e6-
96231b3b80d8
Kevin P. Neal [Fri, 14 Jun 2019 16:28:55 +0000 (16:28 +0000)]
[FPEnv] Lower STRICT_FP_EXTEND and STRICT_FP_ROUND nodes in preprocess phase of ISelLowering to mirror non-strict nodes on x86.
I recently discovered a bug on the x86 platform: The fp80 type was not handled well by x86 for constrained floating point nodes, as their regular counterparts are replaced by extending loads and truncating stores during the preprocess phase. Normally, platforms don't have this issue, as they don't typically attempt to perform such legalizations during instruction selection preprocessing. Before this change, strict_fp nodes survived until they were mutated to normal nodes, which happened shortly after preprocessing on other platforms. This modification lowers these nodes at the same phase while properly utilizing the chain.
Submitted by: Drew Wock <drew.wock@sas.com>
Reviewed by: Craig Topper, Kevin P. Neal
Approved by: Craig Topper
Differential Revision: https://reviews.llvm.org/D63271
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363417
91177308-0d34-0410-b5e6-
96231b3b80d8
Stanislav Mekhanoshin [Fri, 14 Jun 2019 16:25:46 +0000 (16:25 +0000)]
[AMDGPU] gfx1010 BoolReg definition. NFC.
An earlier commit added AMDGPUOperand::isBoolReg(). It turns out that
gcc issues a warning about an unused function, since D63204 is not
yet submitted.
Added the NFC part of D63204 to have a use of that function and
mute the warning.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363416
91177308-0d34-0410-b5e6-
96231b3b80d8
Francis Visoiu Mistrih [Fri, 14 Jun 2019 16:20:51 +0000 (16:20 +0000)]
Reland: [Remarks] Refactor optimization remarks setup
* Add a common function to set up opt-remarks
* Rename common options to the same names
* Add error types to distinguish between file errors and regex errors
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363415
91177308-0d34-0410-b5e6-
96231b3b80d8
Sanjay Patel [Fri, 14 Jun 2019 15:23:09 +0000 (15:23 +0000)]
[x86] move vector shift tests for PR37428; NFC
As suggested in the post-commit thread for rL363392 - it's
wasteful to have so many runs for larger tests. AVX1/AVX2
is what shows the diff and probably what matters most going
forward.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363411
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 14 Jun 2019 15:22:25 +0000 (15:22 +0000)]
GlobalISel: Avoid producing Illegal copies in RegBankSelect
Avoid producing illegal register bank copies for reg_sequence and
phi. The default implementation assumes it is possible to pick any
operand's bank and use that for the result, introducing a copy for
operands with a different bank. This does not check for illegal
copies. It is not legal to introduce a VGPR->SGPR copy, so any VGPR
operand requires the result to be a VGPR.
The changes in getInstrMappingImpl aren't strictly necessary, since
AMDGPU now just bypasses this for reg_sequence/phi. This could be
replaced with an assert in case other targets run into this. It is
currently responsible for producing the error for unsatisfiable
copies, but this will be better served with a verifier check.
For phis, for now assume any undetermined operands must be
VGPRs. Eventually, this needs to be able to defer mapping these
operations. This also does not yet have a way to check for whether the
block is in a divergent region.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363410
91177308-0d34-0410-b5e6-
96231b3b80d8
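The rule stated in the RegBankSelect commit above (no VGPR->SGPR copies, and undetermined phi operands treated as VGPRs for now) can be modelled in a few lines. This is a simplified sketch under those two stated assumptions; the Bank enum and pickPhiResultBank are hypothetical names and do not reflect the actual RegisterBankInfo API.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified view of the register banks involved.
enum class Bank { SGPR, VGPR, Undetermined };

// Chooses the result bank for a phi-like instruction.
// A VGPR operand forces a VGPR result, because copying VGPR -> SGPR is
// not legal. Undetermined operands are conservatively assumed to be VGPRs.
inline Bank pickPhiResultBank(const std::vector<Bank> &Operands) {
  for (Bank B : Operands) {
    if (B == Bank::VGPR || B == Bank::Undetermined)
      return Bank::VGPR;
  }
  // Only SGPR operands (or none at all): an SGPR result needs no copies.
  return Bank::SGPR;
}
```

SGPR operands feeding a VGPR result are fine in this model, since an SGPR->VGPR copy is the legal direction.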