Sam Parker [Mon, 11 Feb 2019 11:35:42 +0000 (11:35 +0000)]
[ARM] Add v8m.base pattern for add negative imm
The v8m.base ISA contains movw, which can operate on an unsigned
16-bit value. Add a pattern that converts an add of a negative
immediate, whose negation fits in 16 bits, into a sub of that positive
value.
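A rough sketch of the immediate check behind this kind of fold (illustrative C++ helper, not the actual tablegen pattern; the helper name is made up):

    #include <cstdint>
    #include <optional>

    // If an add immediate is negative and its negation fits in an unsigned
    // 16-bit movw operand, return the value to use for an equivalent sub.
    std::optional<uint32_t> addImmAsSubImm(int32_t Imm) {
      if (Imm < 0 && -static_cast<int64_t>(Imm) <= 0xFFFF)
        return static_cast<uint32_t>(-static_cast<int64_t>(Imm));
      return std::nullopt; // keep the original add
    }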
Diana Picus [Mon, 11 Feb 2019 10:30:22 +0000 (10:30 +0000)]
test-release.sh: Add option to use ninja
Allow the use of ninja instead of make. This is useful on some
platforms where we'd like to be able to limit the number of link jobs
without slowing down the other steps of the release.
This patch adds a -use-ninja command line option, which sets the
generator to Ninja both for LLVM and the test-suite. It also deals with
some differences between make and ninja:
* DESTDIR handling - ninja doesn't like this to be listed after the
target, but both make and ninja can handle it before the command
* Verbose mode - ninja uses -v, make uses VERBOSE=1
* Keep going mode - make has a -k mode, which builds as much as possible
even when failures are encountered; for ninja we need to set a hard
limit (we use 100 since most people won't look at 100 failures anyway)
[DWARF] LLVM ERROR: Broken function found, while removing Debug Intrinsics.
Check that when SimplifyCFG is flattening a 'br', all related debug intrinsic instructions are removed, including any dbg.label referencing a label associated with the basic blocks being removed.
As the test case involves a CFG transformation, move it to the correct location.
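A minimal sketch of the kind of cleanup described (illustrative helper using the public LLVM API, not the actual SimplifyCFG change):

    #include "llvm/ADT/STLExtras.h"
    #include "llvm/IR/BasicBlock.h"
    #include "llvm/IR/IntrinsicInst.h"

    // Drop every debug intrinsic (dbg.value, dbg.declare, dbg.label, ...)
    // from a block that is about to be folded away, so no dbg.label is left
    // referencing a label of a removed block.
    static void dropDebugIntrinsics(llvm::BasicBlock &BB) {
      for (llvm::Instruction &I : llvm::make_early_inc_range(BB))
        if (llvm::isa<llvm::DbgInfoIntrinsic>(I))
          I.eraseFromParent();
    }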
Sjoerd Meijer [Mon, 11 Feb 2019 09:37:42 +0000 (09:37 +0000)]
[ARM] LoadStoreOptimizer: reorder limit
The whole design of generating LDMs/STMs is fragile and unreliable: it depends on
rescheduling here in the LoadStoreOptimizer, which is not register-pressure aware,
and on regalloc, which is not aware that LDMs/STMs are being generated.
This patch adds a (hidden) option to control the total number of instructions that
can be re-ordered. I appreciate this looks only a tiny bit better than a hard-coded
constant, but at least it allows easier experimentation with different values
for now. Ideally we calculate this reorder limit based on some heuristics, and take
register pressure into account. I might be looking into that next.
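The hidden option would typically be declared with cl::opt; a sketch with an assumed flag name and default value (both are guesses, not the ones in ARMLoadStoreOptimizer.cpp):

    #include "llvm/Support/CommandLine.h"

    // Hidden knob limiting how many instructions the pass may re-order.
    static llvm::cl::opt<unsigned> InstReorderLimit(
        "arm-load-store-reorder-limit", llvm::cl::Hidden,
        llvm::cl::desc("Maximum number of instructions the load/store "
                       "optimizer may re-order"),
        llvm::cl::init(8));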
Michal Gorny [Mon, 11 Feb 2019 09:07:07 +0000 (09:07 +0000)]
[llvm] [cmake] Use current directory in GenerateVersionFromVCS
Find dependent scripts of GenerateVersionFromVCS in the current directory
rather than in ../../cmake/modules. I do not see any reason why the former
would not work, and the latter is incorrect when GenerateVersionFromVCS
is used from the install directory (i.e. in stand-alone builds).
Chandler Carruth [Mon, 11 Feb 2019 07:42:30 +0000 (07:42 +0000)]
[CallSite removal] Migrate the statepoint GC infrastructure to use the
`CallBase` class rather than `CallSite` wrappers.
I pushed this change down through most of the statepoint infrastructure,
completely removing the use of CallSite where I could reasonably do so.
I ended up making a couple of cut-points: generic call handling
(instcombine, TLI, SDAG). As soon as it hit truly generic handling with
users outside the immediate code, I simply transitioned into or out of
a `CallSite` to make this a reasonably sized chunk.
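The flavor of the migration, as a hedged before/after sketch (the helper is illustrative, not taken from the statepoint code):

    #include "llvm/IR/Function.h"
    #include "llvm/IR/InstrTypes.h" // llvm::CallBase

    // Before: bool isGCCall(llvm::CallSite CS) {
    //           const llvm::Function *F = CS.getCalledFunction();
    //           return F && F->hasGC();
    //         }
    // After: take the call instruction itself, no wrapper needed.
    static bool isGCCall(const llvm::CallBase &Call) {
      const llvm::Function *F = Call.getCalledFunction();
      return F && F->hasGC();
    }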
Simon Pilgrim [Sun, 10 Feb 2019 17:04:00 +0000 (17:04 +0000)]
[DAGCombine] Simplify funnel shifts with undef/zero args to bitshifts
Now that we have SimplifyDemandedBits support for funnel shifts (rL353539), we need to simplify funnel shifts back to bitshifts in cases where either argument has been folded to undef/zero.
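For reference, restating the LangRef semantics being relied on (bit width w):

    fshl(X, Y, Z) = high w bits of ((X concat Y) << (Z mod w))
    fshr(X, Y, Z) = low  w bits of ((X concat Y) >> (Z mod w))

So fshl(X, zero/undef, Z) collapses to X << (Z mod w), and fshr(zero/undef, Y, Z) collapses to Y >> (Z mod w): plain bitshifts with a masked shift amount.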
Sanjay Patel [Sun, 10 Feb 2019 15:22:06 +0000 (15:22 +0000)]
[x86] narrow 256-bit horizontal ops via demanded elements
256-bit horizontal math ops are an x86 monstrosity (and thankfully have
not been extended to 512-bit AFAIK).
The two 128-bit halves operate on separate halves of the inputs. So if we
don't demand anything in the upper half of the result, we can extract the
low halves of the inputs, do the math, and then insert that result into a
256-bit output.
All of the extract/insert is free (ymm<-->xmm), so we're left with a
narrower (cheaper) version of the original op.
In the affected tests based on:
https://bugs.llvm.org/show_bug.cgi?id=33758
https://bugs.llvm.org/show_bug.cgi?id=38971
...we see that the h-op narrowing can result in further narrowing of other
math via existing generic transforms.
I originally drafted this patch as an exact pattern match starting from
extract_vector_elt, but I thought we might see diffs starting from
extract_subvector too, so I changed it to a more general demanded elements
solution. There are no extra existing regression test improvements from
that switch though, so we could go back.
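A hedged sketch of the shape of the transform (illustrative SelectionDAG code, not the actual X86ISelLowering change):

    #include "llvm/CodeGen/SelectionDAG.h"

    using namespace llvm;

    // If only the low 128 bits of a 256-bit horizontal op are demanded, run
    // the op on the low halves of its inputs and reinsert the narrow result;
    // the upper half of the output is left undef.
    static SDValue narrowHorizontalOp(SDValue HOp, SelectionDAG &DAG,
                                      const SDLoc &DL) {
      EVT VT = HOp.getValueType(); // e.g. v8f32
      EVT HalfVT = EVT::getVectorVT(*DAG.getContext(),
                                    VT.getVectorElementType(),
                                    VT.getVectorNumElements() / 2);
      SDValue Idx0 = DAG.getIntPtrConstant(0, DL);
      SDValue L = DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, HalfVT,
                              HOp.getOperand(0), Idx0);
      SDValue R = DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, HalfVT,
                              HOp.getOperand(1), Idx0);
      SDValue Narrow = DAG.getNode(HOp.getOpcode(), DL, HalfVT, L, R);
      return DAG.getNode(ISD::INSERT_SUBVECTOR, DL, VT, DAG.getUNDEF(VT),
                         Narrow, Idx0);
    }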
Sanjay Patel [Sun, 10 Feb 2019 14:29:57 +0000 (14:29 +0000)]
[TargetLowering] refactor setcc folds to fix another miscompile (PR40657)
SimplifySetCC still has much room for improvement, but this should
fix the remaining problem examples from:
https://bugs.llvm.org/show_bug.cgi?id=40657
George Rimar [Sun, 10 Feb 2019 08:35:38 +0000 (08:35 +0000)]
[yaml2obj] - Fix .dynamic section entries writing for 32bit targets.
This was introduced by me in r353613.
I tried to fix the big-endian bot and replaced
uintX_t with ELFT::Xword. But ELFT::Xword is a packed<uint64_t>,
so it is always 8 bytes, which was obviously incorrect.
My intention was to use something like packed<uint>, whose
size is target dependent.
The patch fixes this bug and adds a test case, since no bot seems to have reported this.
Simon Pilgrim [Sat, 9 Feb 2019 20:34:59 +0000 (20:34 +0000)]
[X86] CombineOr - fold to generic funnel shifts
As discussed on D57389, this is a first step towards moving the SHLD/SHRD matching code to DAGCombiner using FSHL/FSHR instead.
There's a bit of work to do before I can do that, so this just folds to FSHL/FSHR in the existing code (handling the different SHRD/FSHR argument ordering), which fixes the issue we had with i16 shift amounts not being correctly masked.
Sanjay Patel [Sat, 9 Feb 2019 17:03:59 +0000 (17:03 +0000)]
[TargetLowering] add tests to show effect of setcc sub->shift; NFC
There's effectively no difference for the cases with variables.
We just trade a sub for an add on those. But the case with a
subtract from a constant would require an extra move instruction
on x86, so this looks like a reasonable generic combine.
George Rimar [Sat, 9 Feb 2019 15:18:52 +0000 (15:18 +0000)]
[yaml2elf.cpp] - Fix compilation under Linux.
Fixes errors like:
/home/ssglocal/clang-cmake-x86_64-sde-avx512-linux/clang-cmake-x86_64-sde-avx512-linux/llvm/tools/yaml2obj/yaml2elf.cpp:597:5: error: need ‘typename’ before ‘ELFT:: Xword’ because ‘ELFT’ is a dependent scope
ELFT::Xword Tag = (ELFT::Xword)DE.Tag;
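The usual fix for this class of error (illustrative; not necessarily the exact change committed) is to mark the dependent type with 'typename':

    // Inside a template, a dependent member type such as ELFT::Xword must be
    // prefixed with 'typename' so the compiler knows it names a type.
    typename ELFT::Xword Tag = static_cast<typename ELFT::Xword>(DE.Tag);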
George Rimar [Sat, 9 Feb 2019 15:03:19 +0000 (15:03 +0000)]
[yaml2elf] - An attempt to fix the s390x BB after r353607.
s390x is big-endian, and it seems r353607 had an issue with endianness.
The bot was unhappy:
http://lab.llvm.org:8011/builders/clang-s390x-linux-lnt/builds/11168/steps/ninja%20check%201/logs/stdio
Simon Pilgrim [Sat, 9 Feb 2019 13:13:59 +0000 (13:13 +0000)]
[X86][SSE] Generalize X86ISD::BLENDI support to more value types
D42042 introduced the ability for the ExecutionDomainFixPass to more easily change between BLENDPD/BLENDPS/PBLENDW as the domains required.
With this ability, we can avoid most bitcasts/scaling in the DAG that occurred with X86ISD::BLENDI lowering/combining, blend with the vXi32/vXi64 vectors directly, and use isel patterns to lower to the equivalent float-vector forms.
This helps the shuffle combining and SimplifyDemandedVectorElts be more aggressive as we lose track of fewer UNDEF elements than when we go up/down through bitcasts.
I've introduced a basic blend(bitcast(x),bitcast(y)) -> bitcast(blend(x,y)) fold; there are more generalizations I can do there (e.g. widening/scaling and handling the tricky v16i16 repeated mask case).
The vector-reduce-smin/smax regressions will be fixed in a future improvement to SimplifyDemandedBits to peek through bitcasts and support X86ISD::BLENDV.
Fangrui Song [Sat, 9 Feb 2019 09:18:37 +0000 (09:18 +0000)]
[GlobalOpt] Simplify __cxa_atexit elimination
cxxDtorIsEmpty checks callers recursively to determine if the
__cxa_atexit-registered function is empty, and eliminates the
__cxa_atexit call accordingly.
This recursive check is unnecessary, as redundant instructions and
function calls can be removed by early-cse and the inliner. In addition,
cxxDtorIsEmpty does not mark visited functions and may visit a
function an exponential number of times (multiplication principle).
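A hedged sketch of what a non-recursive emptiness check could look like (illustrative, not the committed GlobalOpt code):

    #include "llvm/IR/Function.h"
    #include "llvm/IR/Instructions.h"

    // Treat a registered destructor as empty only if its entry block contains
    // nothing but a return; deeper cleanup is left to early-cse/inliner.
    static bool cxxDtorIsTriviallyEmpty(const llvm::Function &Dtor) {
      if (Dtor.isDeclaration() || !Dtor.hasExactDefinition())
        return false;
      const llvm::BasicBlock &Entry = Dtor.getEntryBlock();
      return Entry.size() == 1 && llvm::isa<llvm::ReturnInst>(Entry.front());
    }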
Hubert Tong [Sat, 9 Feb 2019 02:11:51 +0000 (02:11 +0000)]
[MC] Clean up unused inline function and non-anchor defaulted destructors; NFCI
Summary:
Take care of some missing clean-ups that belong with r249548 and some
other copy/paste that had happened. In particular, the destructors are
no longer vtable anchors after r249548; and `setSectionName` in
`MCSectionWasm` is private and unused since r313058 culled its only
caller. The destructors are now implicitly defined, and the unused
function is removed.
Gabor Buella [Sat, 9 Feb 2019 01:44:28 +0000 (01:44 +0000)]
Extra processing for BitCast + PHI in InstCombine
For some specific cases with bitcast A->B->A with intervening PHI nodes, the InstCombiner::optimizeBitCastFromPhi transformation creates extra PHI nodes, which are actually copies of an already-created PHI, or in other words, they are redundant. These extra PHI nodes could lead to extra move instructions generated after the DeSSA transformation. This happens when several conditions are met:
- SROA kicks in and creates new alloca;
- there is a simple assignment L = R, which falls under the 'canonicalize loads' transformation done by combineLoadToOperationType (this transformation is enabled by default). Exactly this transformation is the reason the bitcasts are generated;
- the alloca is then used in A->B->A + PHI chain;
- there is a loop unrolling.
As a result, optimizeBitCastFromPhi creates as many PHI nodes for each new SROA alloca as the loop unrolling factor. All of these extra PHI nodes except one are redundant and should not be created. Moreover, the idea of optimizeBitCastFromPhi is to get rid of the cast (when possible), but that does not happen under these conditions.
The proposed fix is to perform the cast replacement for the whole calculated/accumulated PHI closure, not only for the single cast that is passed as an argument to optimizeBitCastFromPhi. This helps accomplish several things: 1) extra PHI nodes are not generated, as all casts that may trigger the optimizeBitCastFromPhi transformation are replaced; 2) the bitcasts are replaced; and 3) more opportunities are created to remove dead code, which appears after the replacement.
A new test case shows that it's possible to get rid of all bitcasts completely and get quite good code reduction.
Author: Igor Tsimbalist <igor.v.tsimbalist@intel.com>
Craig Topper [Fri, 8 Feb 2019 20:48:56 +0000 (20:48 +0000)]
Implementation of asm-goto support in LLVM
This patch accompanies the RFC posted here:
http://lists.llvm.org/pipermail/llvm-dev/2018-October/127239.html
This patch adds a new CallBr IR instruction to support GCC-style asm-goto
inline assembly as used by the Linux kernel. This
instruction is both a call instruction and a terminator
instruction with multiple successors. Only inline assembly
usage is supported today.
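For context, the GCC construct being modeled looks roughly like this (illustrative C++ using the GNU inline-asm extension; not taken from the patch or its tests):

    // An "asm goto" statement may branch to one of the listed C labels in
    // addition to falling through; CallBr gives this a first-class IR
    // representation (a call that is also a terminator with successors).
    int is_nonzero(int x) {
      asm goto("testl %0, %0\n\t"
               "jz %l[was_zero]"
               : /* asm goto allows no outputs */
               : "r"(x)
               : "cc"
               : was_zero);
      return 1; // fallthrough path
    was_zero:
      return 0; // reached via the asm's jump
    }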
This also adds a new INLINEASM_BR opcode to SelectionDAG and
MachineIR to represent an INLINEASM block that is also
considered a terminator instruction.
There will likely be more bug fixes and optimizations to follow
this, but we felt it had reached a point where we would like to
switch to an incremental development model.
Patch by Craig Topper, Alexander Ivchenko, Mikhail Dvoretckii
Vedant Kumar [Fri, 8 Feb 2019 20:48:04 +0000 (20:48 +0000)]
[CodeExtractor] Restore outputs after creating exit stubs
When CodeExtractor saves the result of an InvokeInst at the first insertion
point of the 'normal destination' basic block, that block can be omitted
from the outlined region, so the store is placed outside of the function.
The suggested solution is to process the saving of outputs after creating
exit stubs for the new function; the stores will then be placed in those
blocks before the return.
Matt Arsenault [Fri, 8 Feb 2019 19:59:32 +0000 (19:59 +0000)]
AMDGPU: Eliminate GPU specific SubtargetFeatures
Inline compatibility is determined from the individual feature
bits. These are just sets of the separate features, but will always be
treated as incompatible unless they are specifically ignored.
Defining the ISA version number here in tablegen would be nice, but it
turns out this wasn't actually used.
[DAGCombine] Optimize pow(X, 0.75) to sqrt(X) * sqrt(sqrt(X))
The sqrt case is faster and we already do this for the case where
the exponent is 0.25. This adds the 0.75 case which is also not
sensitive to signed zeros.
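The identity being exploited (for inputs where pow is defined):

    X^0.75 = X^0.5 * X^0.25 = sqrt(X) * sqrt(sqrt(X))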
Summary:
The motivating use case is eliminating duplicate profile data registered
for the same inline function in two object files. Before this change,
users would observe multiple symbol definition errors with VC link, but
links with LLD would succeed.
Users (Mozilla) have reported that PGO works well with clang-cl and LLD,
but when using LLD without this static registration, we would get into a
"relocation against a discarded section" situation. I'm not sure what
happens in that situation, but I suspect that duplicate, unused profile
information was retained. If so, this change will reduce the size of
such binaries with LLD.
Now, Windows uses static registration and is in line with all the other
platforms.
Simon Pilgrim [Fri, 8 Feb 2019 18:57:38 +0000 (18:57 +0000)]
[TargetLowering] Use ISD::FSHR in expandFixedPointMul
Replace OR(SHL,SRL) pattern with ISD::FSHR (legalization expands this later if necessary) - this helps with the scale == 0 'undefined' drop-through case that was discussed on D55720.
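For reference, the equivalence being used (bit width w, 0 < Scale < w):

    fshr(Hi, Lo, Scale) = (Hi << (w - Scale)) | (Lo >> Scale)

while fshr(Hi, Lo, 0) is simply Lo, which is presumably what makes the scale == 0 drop-through case cleaner than the raw OR(SHL,SRL) form.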
Teresa Johnson [Fri, 8 Feb 2019 17:08:27 +0000 (17:08 +0000)]
ArgumentPromotion should copy all metadata to new Function
Summary:
ArgumentPromotion had code to specifically move the dbg metadata over to
the new function, but other metadata, such as the function_entry_count
!prof metadata, was not moved. Replace the code that moved dbg metadata with a call
to copyMetadata. The old metadata is automatically removed when the old
Function is removed.
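A hedged sketch of the kind of call involved (the wrapper function and its name are illustrative):

    #include "llvm/IR/Function.h"

    // GlobalObject::copyMetadata copies every metadata attachment
    // (!dbg, !prof, ...), so no kind-specific handling is needed.
    static void transferMetadata(llvm::Function &OldF, llvm::Function &NewF) {
      NewF.copyMetadata(&OldF, /*Offset=*/0);
    }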
Craig Topper [Fri, 8 Feb 2019 17:07:54 +0000 (17:07 +0000)]
[X86] Remove isReMaterializable from X87 floating point constant loads and constant pool loads.
Summary: These instructions update FPSW, so they aren't generically safe to rematerialize into any location if FPSW is live for a comparison result. They also use FPCW for exception masking control, though the only exception they can generate is stack overflow, and we manage the stack ourselves, so that's not really going to occur.
Carl Ritson [Fri, 8 Feb 2019 15:41:11 +0000 (15:41 +0000)]
[AMDGPU] Fix CS scratch setup on pre-GCN3 ASICs
Summary:
Prior to GCN3 s_load_dword offsets are in dwords rather than bytes.
Thus the scratch buffer descriptor offset must be adjusted for pre-GCN3 ASICs.
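The adjustment in rough form (hypothetical helper and names; the real change lives in the AMDGPU scratch setup code):

    // Pre-GCN3, s_load_dword takes its offset in dwords, so a byte offset
    // must be scaled down before being used to load the scratch descriptor.
    static unsigned scratchDescOffset(unsigned ByteOffset, bool IsGCN3OrLater) {
      return IsGCN3OrLater ? ByteOffset : ByteOffset / 4; // bytes -> dwords
    }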
Petar Avramovic [Fri, 8 Feb 2019 14:27:23 +0000 (14:27 +0000)]
[MIPS GlobalISel] Select any extending load and truncating store
Make the behavior of G_LOAD in widenScalar the same as for G_ZEXTLOAD and
G_SEXTLOAD, that is, perform widenScalarDst to the size given by the target
and avoid additional checks in common code. Targets can reorder or add
additional rules in the LegalizeRuleSet for the opcode to achieve the desired
behavior.
Select an extending load that does not have a specified extension type
into a zero-extending load.
Select a truncating store that stores the number of bytes indicated by the
size in the MachineMemOperand.