[AArch64][SVE2] Asm: support SVE2 Histogram Computation Groups

Summary:
Patch adds support for the following instructions:

SVE2 histogram generation (segment):
* HISTSEG

SVE2 histogram generation (vector):
* HISTCNT

The specification can be found here:
https://developer.arm.com/docs/ddi0602/latest

Reviewed By: chill

Differential Revision: https://reviews.llvm.org/D62306

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361796 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE2] Asm: support SVE2 Misc Group

Summary:
Patch adds support for the following instructions:

SVE2 bitwise exclusive-or interleaved:
    * EORBT, EORTB

SVE2 bitwise permute:
    * BEXT, BDEP, BGRP

SVE2 bitwise shift left long:
    * SSHLLB, SSHLLT, USHLLB, USHLLT

SVE2 integer add/subtract interleaved long:
    * SADDLBT, SSUBLBT, SSUBLTB

BDEP, BEXT and BGRP are enabled with SVE2 feature +bitperm, all other
instructions in this group are enabled with +sve2.

Reviewed By: chill

Differential Revision: https://reviews.llvm.org/D62304

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361795 91177308-0d34-0410-b5e6-96231b3b80d8

[InlineCost] Fix a couple comments. NFC

Replace "unary operator" with "unary instruction" in visitUnaryInstruction since
we now have a UnaryOperator class which might need its own visit function.

Fix a copy/paste in visitCastInst that appears to have been copied from
visitPtrToInt.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361794 91177308-0d34-0410-b5e6-96231b3b80d8

Revert [test] Fix plugin tests

This reverts r361790 (git commit fe5eaab2b5b4523886bd63aebcfea8cfce586fa1)

It's causing buildbot breakage, so reverting while I investigate.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361793 91177308-0d34-0410-b5e6-96231b3b80d8

[test] Fix plugin tests

Summary:
The following changes were required to fix these tests:

1) Change LLVM_ENABLE_PLUGINS to an option and move it to
   llvm/CMakeLists.txt with an appropriate default -- which matches
   the original default behavior.

2) Move the plugins directory from clang/test/Analysis to
   clang/lib/Analysis.  It's not enough to add an exclude to the
   lit.local.cfg file because add_lit_testsuites recurses the tree and
   automatically adds the appropriate `check-` targets, which don't
   make sense for the plugins because they aren't tests and don't
   have `RUN` statements.

   Here's a list of the `check-clang-analysis*` targets with this
   change:

```
  $ ninja -t targets all| sed -n "s/.*\/\(check[^:]*\):.*/\1/p" | sort -u | grep clang-analysis
  check-clang-analysis
  check-clang-analysis-checkers
  check-clang-analysis-copypaste
  check-clang-analysis-diagnostics
  check-clang-analysis-engine
  check-clang-analysis-exploration_order
  check-clang-analysis-html_diagnostics
  check-clang-analysis-html_diagnostics-relevant_lines
  check-clang-analysis-inlining
  check-clang-analysis-objc
  check-clang-analysis-unified-sources
  check-clang-analysis-z3
```

3) Simplify the logic and only include the subdirectories under
   clang/lib/Analysis/plugins if LLVM_ENABLE_PLUGINS is set.

Reviewed By: NoQ

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D62445

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361790 91177308-0d34-0410-b5e6-96231b3b80d8

[CostModel] Add really basic support for being able to query the cost of the FNeg instruction.

Summary:
This reuses getArithmeticInstrCost, but passes dummy values for the second
operand flags.

The X86 costs are wrong and can be improved in a follow up. I just wanted to
stop it from reporting an unknown cost first.

Reviewers: RKSimon, spatel, andrew.w.kaylor, cameron.mcinally

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62444

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361788 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-undname: Remove unreachable statement

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361786 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] add test to show volatile store splitting; NFC

From the LangRef:
"the backend should never split or merge target-legal
volatile load/store instructions."

See also:
D62498

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361785 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-undname: Extract demangleMD5Name() method; no behavior change

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361783 91177308-0d34-0410-b5e6-96231b3b80d8

[RuntimeDyld][ARM] Fix an incorrect assertion condition.

Fixes https://llvm.org/PR42036

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361782 91177308-0d34-0410-b5e6-96231b3b80d8

RegAllocFast: Set MayLiveAcrossBlocks when allocating uses

Setting mayLiveOut based only on use instructions after allocating the
def block did not work if the use block was allocated before the def
block, since the virtual register uses were already removed.

Fixes bug 41973.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361781 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] fold concat of extract subvectors

This is derived from the related fold for build vectors.
We also have a version of this in DAGCombiner. The benefit of
having this fold at node creation time is (1) efficiency and
(2) preventing infinite looping from creating patterns that
should not exist in the first place.

Currently, the inf-loop could happen with MergeConsecutiveStores()
because it naively creates concat of extracts when forming a wider
vector store. That could fight with target-specific store narrowing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361780 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] fix formatting and redundant comments; NFC

There's a possible missing fold here for extracting from the
same source vector. It's similar to a check that we use to
squash a build vector with all extracted elements from the
same source vector.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361778 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] Enhance the simplification of `copyto` from `implicit-def`.

Summary:
- The current implementation simplifies the case where the source of
  `copyto` is `implicit-def`ed. However, it only works when that
  `implicit-def` is single-used since it detects that from
  `implicit-def` and cannot determine which destination vreg should be
  used if there are multiple uses.
- This patch changes that detection when `copyto` is being emitted. If
  that `copyto`'s source is defined from `implicit-def`, it simplifies
  it. Hence, it works even when that `implicit-def` has multiple uses.
- Apart from simplifying the internal IR, it won't improve the quality of
  code generation. However, it helps to detect `implicit-def` in a
  straight-forward manner in some passes, such as `si-i1-copies`. A test
  case is added.

Reviewers: sunfish, nhaehnle

Subscribers: jvesely, hiraditya, asbirlea, llvm-commits, yaxunl

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62342

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361777 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Fix for the address sanitizer failure. Fixing typo

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361776 91177308-0d34-0410-b5e6-96231b3b80d8

NFC: Change usage of 'DenseSet' to 'DenseSetImpl' in DenseSetImpl::ConstIterator.

Summary:
Change usage of 'DenseSet' to 'DenseSetImpl' in a friend declaration within DenseSetImpl::ConstIterator. 'ConstIterator' was never updated when DenseSet was split into an impl when adding support for DenseSetImpl.

This fixes build errors on MSVC when forward declaring DenseSet as this friend decl does not declare the template arguments as well.

Reviewers: jpienaar

Reviewed By: jpienaar

Subscribers: jpienaar, lebedev.ri, dexonsmith, kristina, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62467

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361775 91177308-0d34-0410-b5e6-96231b3b80d8

Include what you use in AArch64AsmBackend.cpp

AArch64AsmBackend.cpp was not using any APIs from AArch64.h, and was
only including it for transitive dependencies. Doing so is problematic
from include-what-you-use perspective, but it is also a layering issue
(it creates a dependency cycle between the primary AArch64 target
library and the MCTargetDesc library).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361774 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] GetDemandedBits - add demanded elements wrapper implementation

The DemandedElts variable is pretty much inert at the moment - the original GetDemandedBits implementation calls it with an 'all ones' DemandedElts value so the function is active and behaves exactly as it used to.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361773 91177308-0d34-0410-b5e6-96231b3b80d8

[LLParser] Fix uninitialized flag variable warnings. NFCI.

Fixes a large number of warnings in the scan-build report on llvm builds.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361772 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Fix for the address sanitizer failure caused by the following commit:

1a8b2ea611cf4ca7cb09562e0238cfefa27c05b5 Divergence driven ISel. Assign register class for cross block values according to the divergence.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361770 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU][MC] Enabled constant expressions as operands of s_waitcnt

See bug 40820: https://bugs.llvm.org/show_bug.cgi?id=40820

Reviewers: artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D61017

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361763 91177308-0d34-0410-b5e6-96231b3b80d8

[MustExecute] Improve MustExecute to correctly handle loop nest

Summary:
for.outer:
  br for.inner
for.inner:
  LI <loop invariant load instruction>
for.inner.latch:
  br for.inner, for.outer.latch
for.outer.latch:
  br for.outer, for.outer.exit

LI is a loop invariant load instruction that post-dominates for.outer, so LI should be able to move out of the loop nest. However, there is a bug in allLoopPathsLeadToBlock().

Current algorithm of allLoopPathsLeadToBlock()

  1. get all the transitive predecessors of the basic block LI belongs to (for.inner) ==> for.outer, for.inner.latch
  2. if any successors of any of the predecessors are not for.inner or for.inner's predecessors, then return false
  3. return true

Although for.inner.latch is a predecessor of for.inner, for.inner dominates for.inner.latch, which means that if for.inner.latch is ever executed, for.inner is executed as well. The function should not return false for cases like this.
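
The check described above can be sketched roughly as follows. This is a simplified model, not the MustExecute code: blocks are plain strings, the dominance relation is supplied by hand instead of coming from a dominator tree, and the names `collectTransitivePreds`, `allLoopPathsLeadToBlock`, `SkipDominated` and `buildExample` are made up for this illustration.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

using Block = std::string;
using Successors = std::map<Block, std::vector<Block>>;
using DomPairs = std::set<std::pair<Block, Block>>; // (A, B): A dominates B

// Collect every block from which BB can be reached, without walking
// above Header (the header of the outermost loop being considered).
static std::set<Block> collectTransitivePreds(const Successors &Succs,
                                              const Block &Header,
                                              const Block &BB) {
  std::set<Block> Preds;
  std::vector<Block> WorkList{BB};
  while (!WorkList.empty()) {
    Block Cur = WorkList.back();
    WorkList.pop_back();
    if (Cur == Header)
      continue; // do not follow backedges or leave the loop
    for (const auto &Entry : Succs)
      for (const Block &S : Entry.second)
        if (S == Cur && Preds.insert(Entry.first).second)
          WorkList.push_back(Entry.first);
  }
  return Preds;
}

// Returns true if every path through the loop is forced through BB.
// With SkipDominated set, predecessors dominated by BB are ignored,
// which models the fix described in the summary.
static bool allLoopPathsLeadToBlock(const Successors &Succs,
                                    const Block &Header, const Block &BB,
                                    const DomPairs &Dominates,
                                    bool SkipDominated) {
  std::set<Block> Preds = collectTransitivePreds(Succs, Header, BB);
  for (const Block &Pred : Preds) {
    if (SkipDominated && Dominates.count({BB, Pred}))
      continue; // Pred only runs after BB has already run
    auto It = Succs.find(Pred);
    if (It == Succs.end())
      continue;
    for (const Block &S : It->second)
      if (S != BB && !Preds.count(S))
        return false; // some path escapes without reaching BB
  }
  return true;
}

// The loop nest from the summary, as a successor map.
static Successors buildExample() {
  Successors G;
  G["for.outer"] = {"for.inner"};
  G["for.inner"] = {"for.inner.latch"};
  G["for.inner.latch"] = {"for.inner", "for.outer.latch"};
  G["for.outer.latch"] = {"for.outer", "for.outer.exit"};
  return G;
}
```

On the example graph, with the single dominance fact that for.inner dominates for.inner.latch, the sketch returns false when `SkipDominated` is off (the edge from for.inner.latch to for.outer.latch trips the check) and true when it is on.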

Author: Whitney (committed by xingxue)

Reviewers: kbarton, jdoerfert, Meinersbur, hfinkel, fhahn

Reviewed By: jdoerfert

Subscribers: hiraditya, jsji, llvm-commits, etiotto, bmahjour

Tags: #LLVM

Differential Revision: https://reviews.llvm.org/D62418

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361762 91177308-0d34-0410-b5e6-96231b3b80d8

Test commit (NFC)

Add blank line.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361761 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM GlobalISel] Un-XFAIL some tests. NFC

It turns out we support big endian now (probably since r332449, but I
haven't bisected to confirm).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361756 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM GlobalISel] Cleanup CallLowering a bit

We never actually use the Offsets produced by ComputeValueVTs, so remove
them until we need them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361755 91177308-0d34-0410-b5e6-96231b3b80d8

Cmake: allow using LLVM_EXTERNAL_PROJECTS with LLVM_ENABLE_PROJECTS

The current code iterates over the combination of LLVM_EXTERNAL_PROJECTS
and LLVM_ENABLE_PROJECTS, but then disables projects that are only in
the former. If a project is in LLVM_EXTERNAL_PROJECTS, it should be
enabled.

See also llvm-commits thread on r354060.

Differential revision: https://reviews.llvm.org/D62289

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361751 91177308-0d34-0410-b5e6-96231b3b80d8

Make llvm-as --help great again

This is a follow-up to https://reviews.llvm.org/D60411, but for llvm-as.

New output:

    OVERVIEW: llvm .ll -> .bc assembler

    USAGE: llvm-as [options] <input .llvm file>

    OPTIONS:

    Generic Options:

      -help                        - Display available options (-help-hidden for more)
      -help-list                   - Display list of available options (-help-list-hidden for more)
      -version                     - Display the version of this program

    llvm-as Options:

      -data-layout=<layout-string> - data layout string to use
      -disable-output              - Disable output
      -f                           - Enable binary output on terminals
      -module-hash                 - Emit module hash
      -o=<filename>                - Override output filename

Differential Revision: https://reviews.llvm.org/D60603

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361750 91177308-0d34-0410-b5e6-96231b3b80d8

[test commit] Add my name to the CREDITS.TXT

This is my test commit. (NFC)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361748 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r361356: "[MIR] Add simple PRE pass to MachineCSE"

This is problematic on buildbots, as discussed here: https://reviews.llvm.org/rL361356

It seems like the plan already was to revert, but that hasn't happened yet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361746 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add test cases for D62444. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361745 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-undname: Make demangling of MD5 names more robust

Demangler::parse() for MD5 names would:

1. Put all remaining text into the MD5 name sight unseen
2. Not modify MangledName

This meant that if the demangler recursively called parse() (e.g. in
demangleLocallyScopedNamePiece()), every recursive call that started on
an MD5 name would add all remaining bytes to the output buffer but
only advance the input by a byte. For valid inputs, MD5 types are
never (well, see comments for 2 exceptions) nested, but for invalid
input this could cause memory use quadratic in the input size.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361744 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopInterchange] Fix handling of LCSSA nodes defined in headers and latches.

The code to preserve LCSSA PHIs currently only properly supports
reduction PHIs and PHIs for values defined outside the latches.

This patch improves the LCSSA PHI handling to cover PHIs for values
defined in the latches.

Fixes PR41725.

Reviewers: efriedma, mcrosier, davide, jdoerfert

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D61576

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361743 91177308-0d34-0410-b5e6-96231b3b80d8

[BPF] generate R_BPF_NONE relocation for BTF DataSec variables

The variables in BTF DataSec type encode in-section offset.
R_BPF_NONE should be generated instead of R_BPF_64_32.

Signed-off-by: Yonghong Song <yhs@fb.com>
Differential Revision: https://reviews.llvm.org/D62460

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361742 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence.

    Details: To make instruction selection really divergence driven it is necessary to assign
             the correct register classes to the cross block values beforehand. For divergent targets
             the same value type requires different register classes depending on the value divergence.

    Reviewers: rampitec, nhaehnle

    Differential Revision: https://reviews.llvm.org/D59990

    This commit was reverted because of the build failure.
    The reason was a malformed patch.
    Build failure fixed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361741 91177308-0d34-0410-b5e6-96231b3b80d8

[MCA][Scheduler] Improved critical memory dependency computation.

This fixes a problem where back-pressure increases caused by register
dependencies were not correctly notified if execution was also delayed by memory
dependencies.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361740 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] GetDemandedBits - cleanup to more closely match SimplifyDemandedBits. NFCI.

Prep work before adding demanded elts support.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361739 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] MaskedValueIsZero - add demanded elements implementation

Will be used in an upcoming patch but I've updated the original implementation to call this to ensure test coverage.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361738 91177308-0d34-0410-b5e6-96231b3b80d8

[MCA] Refactor the logic that computes the critical memory dependency info. NFCI

CriticalRegDep has been renamed CriticalDependency, and it is now used by class
Instruction to store information about the critical register dependency and the
critical memory dependency. No functional change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361737 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] back out all SwitchInst commits

They caused the sanitizer builds to fail.

My suspicion is the change to countLeadingZeros().

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361736 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Add shuffle combining support for ISD::ANY_EXTEND_VECTOR_INREG

Reuses what we already have in place for ISD::ZERO_EXTEND_VECTOR_INREG just with a different sentinel

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361734 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] NFC, one more fixed test from previous push.

The old test was checking for a stupid subtract one that is a transform that
makes the code worse.

The constant-islands-jump-table.ll test wants the code a specific way,
that makes sense, so I will submit code to fix that one.

Sorry that I really didn't know how to run the test suite before this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361733 91177308-0d34-0410-b5e6-96231b3b80d8

Revert rL361731 : [LLParser] Fix uninitialized variable warnings. NFCI.

These 3 variables cause quite a few warnings in the scan-build report on llvm.
........
Revert accidental commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361732 91177308-0d34-0410-b5e6-96231b3b80d8

[LLParser] Fix uninitialized variable warnings. NFCI.

These 3 variables cause quite a few warnings in the scan-build report on llvm.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361731 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] NFC, fix failing tests from last patches.

No problems with the transforms.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361730 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] prevent crashing with invalid extractelement index

This was found/reduced from a fuzzer report:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14956

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361729 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] ReduceSwitchRange: Improve on the case where the SubThreshold doesn't trigger

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361728 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] Run ReduceSwitchRange unconditionally, generalize

Rather than gating on "isSwitchDense" (resulting in necessarily
sparse lookup tables even when they were generated), always run
this quite cheap transform.

This transform is useful not just for generating tables.
LowerSwitch also wants this: read LowerSwitch.cpp:257.

Be careful to not generate worse code, by introducing a
SubThreshold heuristic.

Instead of just sorting by signed, generalize the finding of the
best base.

And now that it is run unconditionally, do not replicate its
functionality in SwitchToLookupTable (which could use a Sub
when having a hole is smaller, hence the SubThreshold
heuristic located in a single place).
This simplifies SwitchToLookupTable, and fixes
some ugly corner cases due to the use of signed numbers,
such as a table containing i16 32768 and 32769, of which
32769 would be interpreted as -32768, and now the code thinks
the table is size 65536.
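
One way to picture the signed-versus-generalized difference is the sketch below. It assumes that "best base" means the case value that minimizes the largest wrapped offset `(V - Base) mod 2^16`; that reading, the fixed 16-bit width, and the names `findBestBase`, `signedSpan` and `BaseAndSpan` are assumptions for illustration, not the SimplifyCFG implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct BaseAndSpan {
  uint16_t Base; // value subtracted from every case value
  uint16_t Span; // largest (V - Base) mod 2^16 over all case values V
};

// Sort the values as unsigned, find the largest circular gap between
// neighbours, and start the range right after that gap.
static BaseAndSpan findBestBase(std::vector<uint16_t> Values) {
  assert(!Values.empty() && "need at least one case value");
  std::sort(Values.begin(), Values.end());
  std::size_t BestIdx = 0;
  // Gap that wraps around from the largest value back to the smallest.
  uint16_t BestGap = uint16_t(Values.front() - Values.back());
  for (std::size_t I = 1; I < Values.size(); ++I) {
    uint16_t Gap = uint16_t(Values[I] - Values[I - 1]);
    if (Gap > BestGap) {
      BestGap = Gap;
      BestIdx = I;
    }
  }
  uint16_t Base = Values[BestIdx];
  uint16_t Last = Values[(BestIdx + Values.size() - 1) % Values.size()];
  return {Base, uint16_t(Last - Base)};
}

// Span obtained when the same bit patterns are treated as signed i16.
static uint32_t signedSpan(const std::vector<uint16_t> &Values) {
  assert(!Values.empty() && "need at least one case value");
  int32_t Min = INT16_MAX, Max = INT16_MIN;
  for (uint16_t V : Values) {
    // Two's complement reinterpretation, spelled out arithmetically.
    int32_t S = V >= 32768 ? int32_t(V) - 65536 : int32_t(V);
    Min = std::min(Min, S);
    Max = std::max(Max, S);
  }
  return uint32_t(Max - Min);
}
```

For instance, with i16 case values 32767 and 32768 the signed view sees 32767 and -32768, a span of 65535, while `findBestBase` picks base 32767 with a span of 1.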

(We still use unconditional subtraction when building a single-register mask,
but I think this whole block should go when the more general sparse
map is added, which doesn't leave empty holes in the table.)

And the reason test4 and test5 did not trigger was documented wrong:
it was because they were not considered sufficiently "dense".

Also, fix generation of invalid LLVM-IR: shl by bit-width.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361727 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] NFC, remove GCD that was only used for powers of two

and replace with an equivalent countTrailingZeros.

GCD is much more expensive than this, with repeated division.
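
As a rough illustration of why the swap is safe: for two powers of two, the GCD is simply the smaller one, which is `1 << min(ctz(A), ctz(B))`. The helpers below are hypothetical stand-ins written for this example, not the LLVM functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a trailing-zero count; not the LLVM helper.
static unsigned countTrailingZeros64(uint64_t V) {
  assert(V != 0 && "trailing-zero count of zero is not handled here");
  unsigned N = 0;
  while ((V & 1) == 0) {
    V >>= 1;
    ++N;
  }
  return N;
}

// GCD restricted to powers of two: no division is needed, only the
// smaller of the two trailing-zero counts.
static uint64_t gcdOfPowersOfTwo(uint64_t A, uint64_t B) {
  return uint64_t(1) << std::min(countTrailingZeros64(A),
                                 countTrailingZeros64(B));
}
```

For example, `gcdOfPowersOfTwo(8, 32)` yields 8, the same value Euclid's algorithm would reach after a division step.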

This depends on D60823

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361726 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] NFC, update Switch tests to HEAD so I can see if my changes change anything

Also add baseline tests to show effect of later patches.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361725 91177308-0d34-0410-b5e6-96231b3b80d8

[Support] make countLeadingZeros() and countTrailingZeros() return unsigned

This matches countLeadingOnes() and countTrailingOnes(), and
APInt's countLeadingZeros() and countTrailingZeros().

(as well as __builtin_clzll())

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361724 91177308-0d34-0410-b5e6-96231b3b80d8

[ValueTracking] Base computeOverflowForUnsignedMul() on ConstantRange code; NFCI

The implementations in ValueTracking and ConstantRange are equally
powerful, reuse the one in ConstantRange, which will make this easier
to extend.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361723 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r361664

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361722 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Refactor OptimizeOverflowCheck; NFCI

Extract method to compute overflow based on binop and signedness,
and then make the result handling code generic. This extends the
always-overflow handling to signed muls, but has currently no effect,
as we don't compute always overflow for them (thus NFC).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361721 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Remove OverflowCheckFlavor; NFC

Instead pass binary op and signedness. The extra enum only makes
things more complicated in this case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361720 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Select fp16 fma

This adds a pattern for fma, similar to the float and double patterns.

Differential Revision: https://reviews.llvm.org/D62330

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361719 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Select a number of fp16 rounding functions

This adds patterns for fp16 round and ceil etc. Same as the float and double
patterns.

Differential Revision: https://reviews.llvm.org/D62326

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361718 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Promote various fp16 math intrinsics

Promote a number of fp16 math intrinsics to float, so that the relevant float
math routines can be used. Copysign is expanded so as to be handled in-place.

Differential Revision: https://reviews.llvm.org/D62325

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361717 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX] combineBitcastvxi1 - peek through bitops to determine size of original vector

We were only testing for direct SETCC results - this allows us to peek through AND/OR/XOR combinations of the comparison results as well.

There's a missing SEXT(PACKSS) fold that I need to investigate for v8i1 cases before I can enable it there as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361716 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Select fp16 fabs

This adds a pattern for the fabs intrinsic, the same as float and double.

Differential Revision: https://reviews.llvm.org/D62324

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361715 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Select fp16 fsqrt

This adds a pattern for the sqrt intrinsic, the same as float and double.

Differential Revision: https://reviews.llvm.org/D62322

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361714 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Promote fp16 frem

Promote fp16 frem operations on ARM to floats so they call fmodf.

Differential Revision: https://reviews.llvm.org/D62321

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361713 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Add some base fullfp16 tests. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361712 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Add missing R_PPC_* relocation types

While people mostly care about 64-bit, some systems need basic lib32
support. The plan is to make lld (see PR40888) capable of linking some
applications (PR40888).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361711 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] Added condition assumption for unreachable blocks

Summary: PR41688

Reviewers: spatel, efriedma, craig.topper, hfinkel, reames

Reviewed By: hfinkel

Subscribers: javed.absar, dmgreen, fhahn, hfinkel, reames, nikic, lebedev.ri, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61409

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361707 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] lowerBuildVectorToBitOp - support build_vector(shift()) -> shift(build_vector(),C)

Commonly occurs in sign-extension cases

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361706 91177308-0d34-0410-b5e6-96231b3b80d8

[LLVM-C] Add Accessor for Mach-O Universal Binary Slices

Summary: Allow for retrieving an object file corresponding to an architecture-specific slice in a Mach-O universal binary file.

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60378

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361705 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Combine fminnum/fmaxnum with non-nan operand to fmin/fmax

If we have a known non-nan operand, place it in the second operand
of fmin/fmax that is returned if either operand is nan.
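
A small scalar model may help here. The SSE min/max instructions return their second operand whenever either input is NaN, which the hypothetical helpers `x86Min` and `x86Max` below imitate with a plain comparison; `std::fmin` stands in for fminnum. These helpers are assumptions for illustration and are not part of the patch.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Models the SSE MINSS selection rule: the comparison is false when
// either input is NaN, so the second operand is returned in that case.
static float x86Min(float A, float B) { return A < B ? A : B; }

// Same idea for MAXSS.
static float x86Max(float A, float B) { return A > B ? A : B; }

// If B is known not to be NaN, placing it second makes the x86-style
// operation agree with fminnum-style semantics (return the non-NaN input).
static bool agreesWithFMin(float A, float KnownNonNaNB) {
  assert(!std::isnan(KnownNonNaNB) && "second operand must not be NaN");
  return x86Min(A, KnownNonNaNB) == std::fmin(A, KnownNonNaNB);
}
```

With the operands the other way round, `x86Min(2.0f, NaN)` returns NaN, which is why the known non-NaN value has to go in the second position.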

Differential Revision: https://reviews.llvm.org/D62448

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361704 91177308-0d34-0410-b5e6-96231b3b80d8

[LVI][CVP] Add support for saturating add/sub

Adds support for the uadd.sat family of intrinsics in LVI, based on
ConstantRange methods from D60946.
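
To give a feel for the range arithmetic involved, here is a minimal sketch for 8-bit unsigned saturating add over closed, non-wrapping intervals. `Range8`, `uaddSat8` and `uaddSatRange` are invented for this example and are much simpler than the ConstantRange methods the patch relies on.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Closed interval [Lo, Hi] of 8-bit unsigned values, with Lo <= Hi.
struct Range8 {
  uint8_t Lo;
  uint8_t Hi;
};

// Scalar unsigned saturating add: clamps at 255 instead of wrapping.
static uint8_t uaddSat8(uint8_t A, uint8_t B) {
  unsigned Sum = unsigned(A) + unsigned(B);
  return uint8_t(std::min(Sum, 255u));
}

// Saturating add is monotonic in both operands, so the result range is
// obtained by applying the operation to the two lower and two upper bounds.
static Range8 uaddSatRange(Range8 X, Range8 Y) {
  assert(X.Lo <= X.Hi && Y.Lo <= Y.Hi && "wrapped ranges are not modelled");
  return {uaddSat8(X.Lo, Y.Lo), uaddSat8(X.Hi, Y.Hi)};
}
```

For example, adding [10, 20] and [100, 250] gives [110, 255]: the upper bound 270 saturates to 255 instead of wrapping around.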

Differential Revision: https://reviews.llvm.org/D62447

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361703 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] vector-sext - cleanup prefix lists

Add X32-SSE common prefix to merge some checks

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361702 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] define binops as a superset of commutative binops

The test diffs show improved vector narrowing for integer min/max opcodes because
those were all absent from the list. I'm not sure if we can expose functional diffs
for all of the moved/added opcodes though.

It seems like we are missing an AVX512 opportunity to use 256-bit ops in place of
512-bit ops on some tests/targets, but I think that can be a follow-up.

Preliminary steps to make sure the callers are not misusing these queries:
rL361268
rL361547

Differential Revision: https://reviews.llvm.org/D62191

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361701 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add tests for min/maxnum with const operand; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361700 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopVectorize] Fix test by regenerating checks

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361699 91177308-0d34-0410-b5e6-96231b3b80d8

[CVP] Remove unnecessary checks for empty GNWR; NFC

The guaranteed no-wrap region is never empty, it always contains at
least zero, so these optimizations don't ever apply.

To make this more obviously true, replace the conservative return
in makeGNWR with an assertion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361698 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Make tests more robust for new optimizations

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361697 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] soften assertion when legalizing narrow vector FP ops

The test based on PR42010:
https://bugs.llvm.org/show_bug.cgi?id=42010
...may show an inaccuracy for PPC's target defs, but we should not
be so aggressive with an assert here. There's no telling what out-of-tree
targets look like.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361696 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Update test checks

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361695 91177308-0d34-0410-b5e6-96231b3b80d8

[CVP] Add tests for saturating add/sub ranges; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361694 91177308-0d34-0410-b5e6-96231b3b80d8

[LVI][CVP] Calculate with.overflow result range

In LVI, calculate the range of extractvalue(op.with.overflow(%x, %y), 0)
as the range of op(%x, %y). This is mainly useful in conjunction with
D60650: If the result of the operation is extracted in a branch guarded
against overflow, then the value of %x will be appropriately constrained
and the result range of the operation will be calculated taking that
into account.

Differential Revision: https://reviews.llvm.org/D60656

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361693 91177308-0d34-0410-b5e6-96231b3b80d8

[LVI] Extract helper for binary range calculations; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361692 91177308-0d34-0410-b5e6-96231b3b80d8

[X86FixupLEAs] Turn optIncDec into a generic two address LEA optimizer. Support LEA64_32r properly.

INC/DEC is really a special case of a more generic issue. We should also turn leas into add reg/reg or add reg/imm regardless of the slow lea flags.

This also supports LEA64_32 which has 64 bit input registers and 32 bit output registers. So we need to convert the 64 bit inputs to their 32 bit equivalents to check if they are equal to base reg.

One thing to note, the original code preserved the kill flags by adding operands to the new instruction instead of using addReg. But I think tied operands aren't supposed to have the kill flag set. I dropped the kill flags, but I could probably try to preserve it in the add reg/reg case if we think it's important. Not sure which operand it's supposed to go on for the LEA64_32r instruction due to the super reg implicit uses. Though I'm also not sure those are needed since they were probably just created by an INSERT_SUBREG from a 32-bit input.

Differential Revision: https://reviews.llvm.org/D61472

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361691 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add zero idioms to the haswell, broadwell, and skylake schedule models. Add 256-bit fp xor to sandybridge zero idioms

This copies the Sandy Bridge zero idiom support to later CPUs. Adding the AVX2 and AVX512F/VL instructions as appropriate.

Differential Revision: https://reviews.llvm.org/D62360

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361690 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][llvm-mca] Add zero idiom tests for Intel CPUs. NFC

This pre-commits tests for D62360

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361689 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r361644, "[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence."

Broke sanitizer bots:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/21694/steps/bootstrap%20clang/logs/stdio
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/32478/steps/check-llvm%20asan/logs/stdio

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361688 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[Analysis] Link library dependencies to Analysis plugins"

This reverts commit r361340. The following builder has been broken for
the past few days because of this commit:

http://green.lab.llvm.org/green/job/clang-stage2-cmake-RgSan/

Also revert r361399, which was committed to fix r361340.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361685 91177308-0d34-0410-b5e6-96231b3b80d8

Rename clangToolingRefactor to clangToolingRefactoring for consistency with its directory

See "[cfe-dev] The name of clang/lib/Tooling/Refactoring".

Differential Revision: https://reviews.llvm.org/D62420

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361684 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-dwarfdump: Don't error on mixed units using/not using str_offsets

This led to errors when dumping binaries with v4 and v5 units linked
together (but could've also errored on v5 units that did/didn't use
str_offsets).

Also improves error handling and messages around invalid str_offsets
contributions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361683 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalISel][AArch64] Make FP constraint checks consider possible use/def banks

In a few places in getInstrMapping, we check if use/def instructions for the
instruction we're mapping have floating point constraints.

We can improve this check and reduce the number of copies in GISel-compiled code
if we make a couple observations:

- For a def instruction, it only matters if the def instruction must always
output a value stored in an FPR

- For a use instruction, it only matters if the use instruction must always
only take in values stored in FPRs

This adds two new functions:

- onlyUsesFP
- onlyDefinesFP

Then we can use those when we're checking the uses/defs instead.

Without this patch, the load, unmerge, store, and select in the added test
would have unnecessary copies.

Differential Revision: https://reviews.llvm.org/D62426

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361679 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalISel][AArch64] NFC: Factor out HasFPConstraints into a proper function

Factor it out into a function, and replace places where we had the same check
with the new function.

Differential Revision: https://reviews.llvm.org/D62421

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361677 91177308-0d34-0410-b5e6-96231b3b80d8

[dwarfdump] Add flag to limit the number of parent DIEs

This adds `-parent-recurse-depth` which limits the number of parent DIEs
being dumped.

Differential revision: https://reviews.llvm.org/D62359

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361671 91177308-0d34-0410-b5e6-96231b3b80d8

Implement call lowering without parameters on AIX

Summary:
This patch implements call lowering for calls without parameters
on AIX as initial support.

Reviewers: sfertile, hubert.reinterpretcast, aheejin, efriedma

Differential Revision: https://reviews.llvm.org/D61948

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361669 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalISel][AArch64] Improve register bank mappings for G_SELECT

The fcsel and csel instructions differ in only the register banks they work on.

So, they're entirely interchangeable otherwise.

With this in mind, this does two things:

- Teach AArch64RegisterBankInfo to consider the inputs to G_SELECT as well as
the outputs.
- Teach it to choose the best register bank mapping based off the constraints
of the inputs and outputs.

The "best" in this case means the one that requires the smallest number of
copies to properly emit a fcsel/csel.

For example, if the inputs are all already going to be on FPRs, we should
emit a fcsel, even if the output is a GPR. This costs one copy to produce the
result, but saves us from copying the inputs into GPRs.
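
The "fewest copies" rule can be sketched with a toy cost model. This is only an illustration, not the AArch64RegisterBankInfo implementation: `Bank`, `SelectOperands`, `copiesNeeded`, and `chooseSelectBank` are hypothetical names, and the condition operand is left out on the assumption that only the two selected values and the result matter for the choice.

```cpp
#include <cassert>

// Hypothetical stand-ins for the two register banks involved.
enum class Bank { GPR, FPR };

// Where each value of a select would prefer to live, given its constraints.
struct SelectOperands {
  Bank trueValue;
  Bank falseValue;
  Bank result;
};

// Number of cross-bank copies needed if the select runs on `chosen`.
unsigned copiesNeeded(const SelectOperands &ops, Bank chosen) {
  return (ops.trueValue != chosen) + (ops.falseValue != chosen) +
         (ops.result != chosen);
}

// Pick FPR (fcsel) or GPR (csel), whichever needs fewer copies.
Bank chooseSelectBank(const SelectOperands &ops) {
  unsigned fprCopies = copiesNeeded(ops, Bank::FPR);
  unsigned gprCopies = copiesNeeded(ops, Bank::GPR);
  // Each of the three operands mismatches exactly one bank, so no tie occurs.
  assert(fprCopies + gprCopies == 3 && "unexpected operand count");
  return fprCopies < gprCopies ? Bank::FPR : Bank::GPR;
}
```

With both inputs on FPRs and a GPR result, the FPR choice costs one copy and the GPR choice costs two, which matches the example in the text.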

Also update the regbank-select.mir to check that we end up with the right
select instruction.

Differential Revision: https://reviews.llvm.org/D62267

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361665 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] check for INLINEASM_BR along w/ INLINEASM

Summary:
It looks like since INLINEASM_BR was created off of INLINEASM, a few
checks for INLINEASM needed to be updated to check for either case.

pr/41999

Reviewers: t.p.northover, peter.smith

Reviewed By: peter.smith

Subscribers: craig.topper, javed.absar, kristof.beyls, hiraditya, llvm-commits, peter.smith, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62402

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361661 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] additionally check for ARM::INLINEASM_BR w/ ARM::INLINEASM

Summary:
We were observing failures for arm32 allyesconfigs of the Linux kernel
with the asm goto Clang patch, where ldr's were being generated to
offsets too far away to encode in imm12.

It looks like since INLINEASM_BR was created off of INLINEASM, a few
checks for INLINEASM needed to be updated to check for either case.

pr/41999

Link: https://github.com/ClangBuiltLinux/linux/issues/490
Reviewers: peter.smith, kristof.beyls, ostannard, rengolin, t.p.northover

Reviewed By: peter.smith

Subscribers: jyu2, javed.absar, hiraditya, llvm-commits, nathanchance, craig.topper, kees, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62400

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361659 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Activate all lanes when spilling CSR VGPR for SGPR spills

If some lanes weren't active on entry to the function, this could
clobber their VGPR values.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361655 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Boost inline threshold with addrspacecasted alloca arguments

This was skipping GetUnderlyingObject for nonprivate addresses, but an
alloca could also be found through an addrspacecast if it's flat.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361649 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopVectorize] update test to be independent of instcombine; NFC

This is a regression test for vectorization, so remove instcombine
from the RUN line and adjust the comparison predicates to show what
the vectorizer is creating rather than how instcombine cleans it up.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361648 91177308-0d34-0410-b5e6-96231b3b80d8

[CMake] Fix issues building runtimes

This resolves two issues:
(1) LIBCXX_HEADER_DIR is a very misleadingly named variable: it shouldn't be set to the header directory; instead, it needs to be the root binary dir.
(2) If you build runtimes without libcxx, we can't depend on the libcxx header target, so we should instead refer to it by the variable name, which will be unset if libcxx isn't present.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361646 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence.

Details: To make instruction selection really divergence driven, it is necessary to assign
the correct register classes to the cross block values beforehand. For divergent targets,
the same value type requires different register classes depending on the value's divergence.

Reviewers: rampitec, nhaehnle

Differential Revision: https://reviews.llvm.org/D59990

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361644 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] - Strip undefined symbols if they are no longer referenced following --only-section

This is https://bugs.llvm.org/show_bug.cgi?id=40004.

In this patch I teach llvm-objcopy to remove undefined symbols if
they are no longer used after applying the -j/--only-section option.
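
A minimal sketch of the filtering rule, under heavy simplification: it is not llvm-objcopy's code, and the `Symbol` struct and `stripUnreferencedUndefined` function are hypothetical. It assumes the caller has already collected the names still referenced by relocations in the sections that survive --only-section, and it only models the handling of undefined symbols.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified symbol table entry.
struct Symbol {
  std::string name;
  bool defined;
};

// Keeps every defined symbol, and keeps an undefined symbol only if a
// relocation in a surviving section still refers to it.
std::vector<Symbol>
stripUnreferencedUndefined(const std::vector<Symbol> &symbols,
                           const std::set<std::string> &stillReferenced) {
  std::vector<Symbol> kept;
  std::copy_if(symbols.begin(), symbols.end(), std::back_inserter(kept),
               [&](const Symbol &s) {
                 return s.defined || stillReferenced.count(s.name) != 0;
               });
  assert(kept.size() <= symbols.size() && "filtering never adds symbols");
  return kept;
}
```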

Differential revision: https://reviews.llvm.org/D62317

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361642 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r361607

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361640 91177308-0d34-0410-b5e6-96231b3b80d8