Matt Arsenault [Tue, 28 May 2019 16:46:02 +0000 (16:46 +0000)]
AMDGPU: Don't enable all lanes with non-CSR VGPR spills
If the only VGPRs used for SGPR spilling were not CSRs, this was
enabling all lanes and immediately restoring exec. This is the usual
situation in leaf functions.
Michael Liao [Tue, 28 May 2019 16:29:39 +0000 (16:29 +0000)]
[AMDGPU] Fix the mis-handling of `vreg_1` copied from scalar register.
Summary:
- Don't treat the use of a scalar register as `vreg_1` as a VGPR usage.
Otherwise, that promotes the scalar register into a vector one, which
breaks the assumption that the scalar register holds the lane mask.
- The issue is triggered in a complicated case, where the uses of that
(lane mask) scalar register are legalized before its definition, e.g.,
due to a mismatch between block placement and its topological order, or
a loop. In those cases, the legalization of PHI introduces the use of
that scalar register as `vreg_1`.
Simon Tatham [Tue, 28 May 2019 16:13:20 +0000 (16:13 +0000)]
[ARM] Replace fp-only-sp and d16 with fp64 and d32.
Those two subtarget features were awkward because their semantics are
reversed: each one indicates the _lack_ of support for something in
the architecture, rather than the presence. As a consequence, you
don't get the behavior you want if you combine two sets of feature
bits.
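As a hedged illustration of why inverted feature bits compose badly (the bit names below are made up for this sketch, not the real subtarget features):
  #include <cstdint>
  #include <cstdio>

  // Hypothetical feature bits; FEAT_NO_D32 encodes the *lack* of d32.
  enum : uint32_t { FEAT_VFP2 = 1u << 0, FEAT_NO_D32 = 1u << 1 };

  int main() {
    uint32_t FullFPU = FEAT_VFP2;                  // supports d32 (bit clear)
    uint32_t Restricted = FEAT_VFP2 | FEAT_NO_D32; // lacks d32
    // Unioning the two sets keeps the "lacks d32" bit, so the combination
    // claims d32 is missing; with a positive FEAT_D32 bit the union would
    // accumulate capabilities as expected.
    uint32_t Combined = FullFPU | Restricted;
    std::printf("combined lacks d32: %s\n",
                (Combined & FEAT_NO_D32) ? "yes" : "no");
    return 0;
  }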
Each SubtargetFeature for an FP architecture version now comes in four
versions, one for each combination of those options. So you can still
say (for example) '+vfp2' in a feature string and it will mean what
it's always meant, but there's a new string '+vfp2d16sp' meaning the
version without those extra options.
A lot of this change is just mechanically replacing positive checks
for the old features with negative checks for the new ones. But one
more interesting change is that I've rearranged getFPUFeatures() so
that the main FPU feature is appended to the output list *before*
rather than after the features derived from the Restriction field, so
that -fp64 and -d32 can override defaults added by the main feature.
Jason Liu [Tue, 28 May 2019 14:37:59 +0000 (14:37 +0000)]
[XCOFF] Implement parsing the symbol table for XCOFFObjectFile and outputting it in YAML format
Summary:
This patch implements parsing the symbol table for XCOFFObjectFile and
outputs it in YAML format. Parsing the auxiliary entries of a symbol
will be done in a separate patch.
The XCOFF object file (aix_xcoff.o) used in the test comes from:
-bash-4.2$ cat test.c
extern int i;
extern int TestforXcoff;
int main()
{
  i++;
  TestforXcoff--;
}
Sanjay Patel [Tue, 28 May 2019 13:54:17 +0000 (13:54 +0000)]
[x86] split 256-bit store of concatenated vectors
This shows up as a side issue to the main problem for the AVX target example from PR37428:
https://bugs.llvm.org/show_bug.cgi?id=37428 - https://godbolt.org/z/7tpRa3
But as we can see in the pile of existing test diffs, it's actually a widespread problem
that affects any AVX or later target. Apart from a couple of oddballs, I think these are
all improvements for the reasons stated in the code comment: we do not want to enable YMM
unnecessarily (avoid vzeroupper and frequency throttling) and some cores split 256-bit
stores anyway.
We could say that MergeConsecutiveStores() is going overboard on some of these examples,
but that won't solve the problem completely. It is also the reason I'm proposing this as
a lowering rather than a combine: we would infinite-loop fighting the merge code if we
tried this earlier.
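For reference, a hedged C++ sketch (not taken from the patch or its tests) of the source pattern this lowering targets: a 256-bit store whose value is just two 128-bit halves concatenated together.
  #include <immintrin.h>

  // A 256-bit store fed only by a concat of two 128-bit values. With this
  // change, an AVX target may emit it as two 16-byte stores instead of
  // touching a YMM register.
  void store_concat(float *dst, __m128 lo, __m128 hi) {
    __m256 v = _mm256_set_m128(hi, lo); // high half first, low half second
    _mm256_storeu_ps(dst, v);
  }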
David Stenberg [Tue, 28 May 2019 13:23:25 +0000 (13:23 +0000)]
Stop undef fragments from closing non-overlapping fragments
Summary:
When DwarfDebug::buildLocationList() encountered an undef debug value,
it would truncate all open values, regardless of whether they overlapped
or not. This patch fixes that so it is only done for overlapping fragments.
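A minimal sketch of the kind of overlap test this relies on (illustrative only; it assumes a fragment is described by a bit offset and size within the variable):
  #include <cstdint>

  struct Fragment {
    uint64_t OffsetInBits;
    uint64_t SizeInBits;
  };

  // Two fragments overlap iff neither ends at or before the other begins.
  bool fragmentsOverlap(const Fragment &A, const Fragment &B) {
    return A.OffsetInBits < B.OffsetInBits + B.SizeInBits &&
           B.OffsetInBits < A.OffsetInBits + A.SizeInBits;
  }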
This change unearthed a bug that I had introduced in D57511,
which I have fixed in this patch. The code in DebugHandlerBase that
changes labels for parameter debug values could break DwarfDebug's
assumption that the labels for the entries in the debug value history
are monotonically increasing. Before this patch, that bug could result
in location list entries whose ending address was lower than the
beginning address, and with the changes for undef debug values that this
patch introduces it could trigger an assertion, due to attempting to
emit location list entries with empty ranges. A reproducer for the bug
is added in param-reg-const-mix.mir.
Sanjay Patel [Tue, 28 May 2019 12:58:07 +0000 (12:58 +0000)]
[x86] fix 256-bit vector store splitting to honor 'volatile'
Forking this out of the discussion in D62498
(and assuming that will be committed later, so adding the helper function here).
The LangRef says:
"the backend should never split or merge target-legal volatile load/store instructions."
Hans Wennborg [Tue, 28 May 2019 12:19:38 +0000 (12:19 +0000)]
Re-commit r357452 (take 2): "SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)"
This was reverted in r360086 as it was suspected of causing mysterious test
failures internally. However, it was never concluded that this patch was the
root cause.
> The code was previously checking that candidates for sinking had exactly
> one use or were a store instruction (which can't have uses). This meant
> we could sink call instructions only if they had a use.
>
> That limitation seemed a bit arbitrary, so this patch changes it to
> "instruction has zero or one use" which seems more natural and removes
> the need to special-case stores.
>
> Differential revision: https://reviews.llvm.org/D59936
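A hedged C++ illustration (made-up function names) of what the relaxed check allows: both predecessors of the join point call the same function and ignore the result, so the calls can now be sunk into the common successor.
  // Stand-in definition so the sketch is self-contained; in real code this
  // would be some function whose result the callers ignore.
  int log_event(int kind) { return kind; }

  void notify(bool error) {
    if (error)
      log_event(1); // zero uses of the result: now a sinking candidate
    else
      log_event(2); // after sinking, the argument becomes a phi of 1 and 2
  }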
Yevgeny Rouban [Tue, 28 May 2019 11:33:50 +0000 (11:33 +0000)]
[CorrelatedValuePropagation] Fix prof branch_weights metadata handling for SwitchInst
This patch fixes the CorrelatedValuePropagation pass to keep
prof branch_weights metadata of SwitchInst consistent.
It makes use of SwitchInstProfUpdateWrapper.
New tests are added.
Craig Topper [Tue, 28 May 2019 07:25:27 +0000 (07:25 +0000)]
[InlineCost] Fix a couple comments. NFC
Replace "unary operator" with "unary instruction" in visitUnaryInstruction since
we now have a UnaryOperator class which might need its own visit function.
Fix a comment in visitCastInst that appears to have been copy/pasted from
visitPtrToInt.
Don Hinton [Tue, 28 May 2019 06:26:58 +0000 (06:26 +0000)]
[test] Fix plugin tests
Summary:
The following changes were required to fix these tests:
1) Change LLVM_ENABLE_PLUGINS to an option and move it to
llvm/CMakeLists.txt with an appropriate default -- which matches
the original default behavior.
2) Move the plugins directory from clang/test/Analysis to
clang/lib/Analysis. It's not enough to add an exclude to the
lit.local.cfg file because add_lit_testsuites recurses the tree and
automatically adds the appropriate `check-` targets, which don't
make sense for the plugins because they aren't tests and don't
have `RUN` statements.
Here's a list of the `clang-check-analysis*` targets with this
change:
Matt Arsenault [Mon, 27 May 2019 20:37:31 +0000 (20:37 +0000)]
RegAllocFast: Set MayLiveAcrossBlocks when allocating uses
Setting mayLiveOut based only on use instructions after allocating the
def block did not work if the use block was allocated before the def
block, since the virtual register uses were already removed.
Sanjay Patel [Mon, 27 May 2019 20:26:21 +0000 (20:26 +0000)]
[SelectionDAG] fold concat of extract subvectors
This is derived from the related fold for build vectors.
We also have a version of this in DAGCombiner. The benefit of
having this fold at node creation time is (1) efficiency and
(2) preventing infinite looping from creating patterns that
should not exist in the first place.
Currently, the inf-loop could happen with MergeConsecutiveStores()
because it naively creates concat of extracts when forming a wider
vector store. That could fight with target-specific store narrowing.
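As a hedged intrinsic-level illustration of the pattern (not the SelectionDAG code itself): extracting both halves of a vector and concatenating them back is just the original vector, so the concat node never needs to be created.
  #include <immintrin.h>

  // concat(extract(v, 0), extract(v, 1)) == v for a 256-bit vector split
  // into its two 128-bit subvectors.
  __m256 concat_of_extracts(__m256 v) {
    __m128 lo = _mm256_castps256_ps128(v);   // extract subvector at index 0
    __m128 hi = _mm256_extractf128_ps(v, 1); // extract subvector at index 1
    return _mm256_set_m128(hi, lo);          // folds back to v
  }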
Sanjay Patel [Mon, 27 May 2019 18:26:43 +0000 (18:26 +0000)]
[SelectionDAG] fix formatting and redundant comments; NFC
There's a possible missing fold here for extracting from the
same source vector. It's similar to a check that we use to
squash a build vector with all extracted elements from the
same source vector.
Michael Liao [Mon, 27 May 2019 18:26:29 +0000 (18:26 +0000)]
[SelectionDAG] Enhance the simplification of `copyto` from `implicit-def`.
Summary:
- The current implementation simplifies the case where the source of
`copyto` is `implicit-def`ed. However, it only works when that
`implicit-def` is single-used since it detects that from
`implicit-def` and cannot determine which destination vreg should be
used if there are multiple uses.
- This patch changes that detection to happen when the `copyto` is being
emitted. If that `copyto`'s source is defined by an `implicit-def`, it is
simplified. Hence, it works even when that `implicit-def` has multiple uses.
- Apart from simplifying the internal IR, this does not improve the quality
of code generation. However, it helps some passes, such as `si-i1-copies`,
to detect `implicit-def` in a straightforward manner. A test case is added.
Jacques Pienaar [Mon, 27 May 2019 17:38:41 +0000 (17:38 +0000)]
NFC: Change usage of 'DenseSet' to 'DenseSetImpl' in DenseSetImpl::ConstIterator.
Summary:
Change usage of 'DenseSet' to 'DenseSetImpl' in a friend declaration within DenseSetImpl::ConstIterator. 'ConstIterator' was never updated when DenseSet was split into an impl class to add support for DenseSetImpl.
This fixes build errors on MSVC when forward declaring DenseSet, since that friend declaration does not spell out the template arguments.
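A simplified sketch of the friendship pattern after the change (illustrative names, not the real ADT code): the nested iterator befriends the Impl class it lives in, so nothing about the derived wrapper's template arguments needs to be visible.
  template <typename DerivedT, typename T> class SetImpl {
  public:
    class ConstIterator {
      // Befriend the Impl (with its template arguments spelled out) rather
      // than the derived wrapper, so a forward declaration of the wrapper
      // elsewhere is never needed to resolve this friendship.
      friend class SetImpl<DerivedT, T>;
      const T *Ptr = nullptr;
    public:
      explicit ConstIterator(const T *P) : Ptr(P) {}
      const T &operator*() const { return *Ptr; }
    };
  };

  template <typename T> class Set : public SetImpl<Set<T>, T> {};

  int main() {
    int X = 42;
    SetImpl<Set<int>, int>::ConstIterator It(&X);
    return (*It == 42) ? 0 : 1;
  }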
Dmitri Gribenko [Mon, 27 May 2019 17:03:57 +0000 (17:03 +0000)]
Include what you use in AArch64AsmBackend.cpp
AArch64AsmBackend.cpp was not using any APIs from AArch64.h, and was
only including it for transitive dependencies. Doing so is problematic
from include-what-you-use perspective, but it is also a layering issue
(it creates a dependency cycle between the primary AArch64 target
library and the MCTargetDesc library).
Simon Pilgrim [Mon, 27 May 2019 16:39:25 +0000 (16:39 +0000)]
[SelectionDAG] GetDemandedBits - add demanded elements wrapper implementation
The DemandedElts argument is pretty much inert at the moment: the original GetDemandedBits implementation calls the new wrapper with an 'all ones' DemandedElts value, so the function is active and behaves exactly as it used to.
LI is a loop-invariant load instruction that post-dominates for.outer, so LI
should be able to move out of the loop nest. However, there is a bug in
allLoopPathsLeadToBlock().
The current algorithm of allLoopPathsLeadToBlock() is:
1. Get all the transitive predecessors of the basic block LI belongs to
(for.inner) ==> for.outer, for.inner.latch
2. If any successor of any of those predecessors is neither for.inner nor one
of for.inner's predecessors, return false
3. Return true
Although for.inner.latch is a predecessor of for.inner, for.inner dominates
for.inner.latch, which means that if for.inner.latch is ever executed,
for.inner must be as well. The function should not return false for cases
like this.
Hans Wennborg [Mon, 27 May 2019 09:03:00 +0000 (09:03 +0000)]
Cmake: allow using LLVM_EXTERNAL_PROJECTS with LLVM_ENABLE_PROJECTS
The current code iterates over the combination of LLVM_EXTERNAL_PROJECTS
and LLVM_ENABLE_PROJECTS, but then disables projects that are only in
the former. If a project is in LLVM_EXTERNAL_PROJECTS, it should be
enabled.
Serge Guelton [Mon, 27 May 2019 08:24:06 +0000 (08:24 +0000)]
Make llvm-as --help great again
This is a follow-up to https://reviews.llvm.org/D60411, but for llvm-as.
New output:
OVERVIEW: llvm .ll -> .bc assembler
USAGE: llvm-as [options] <input .llvm file>
OPTIONS:
Generic Options:
  -help      - Display available options (-help-hidden for more)
  -help-list - Display list of available options (-help-list-hidden for more)
  -version   - Display the version of this program
llvm-as Options:
  -data-layout=<layout-string> - data layout string to use
  -disable-output              - Disable output
  -f                           - Enable binary output on terminals
  -module-hash                 - Emit module hash
  -o=<filename>                - Override output filename
Nico Weber [Mon, 27 May 2019 00:48:59 +0000 (00:48 +0000)]
llvm-undname: Make demangling of MD5 names more robust
Demangler::parse() for MD5 names would:
1. Put all remaining text into the MD5 name sight unseen
2. Not modify MangledName
This meant that if the demangler recursively called parse() (e.g. in
demangleLocallyScopedNamePiece()), every recursive call that started on
an MD5 name would add all remaining bytes to the output buffer but
only advance the input by a byte. For valid inputs, MD5 types are
never (well, see comments for 2 exceptions) nested, but for invalid
input this could cause memory use quadratic in the input size.
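A small standalone sketch of why that behaviour is quadratic (this only models the growth, it is not the demangler): if every step copies everything that remains and then consumes a single input character, n input bytes produce n + (n-1) + ... + 1 output bytes.
  #include <cstdio>
  #include <string>

  int main() {
    std::string Input(1000, 'A'); // stand-in for a malformed mangled name
    std::string Output;
    for (size_t Pos = 0; Pos < Input.size(); ++Pos)
      Output += Input.substr(Pos); // "put all remaining text into the name"
    std::printf("input %zu bytes -> output %zu bytes\n",
                Input.size(), Output.size());
    return 0;
  }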
[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence.
Details: To make instruction selection really divergence driven, it is necessary to assign
the correct register classes to the cross-block values beforehand. For divergent targets,
the same value type requires different register classes depending on the divergence of the value.
This fixes a problem where back-pressure increases caused by register
dependencies were not correctly notified if execution was also delayed by memory
dependencies.
Andrea Di Biagio [Sun, 26 May 2019 18:41:35 +0000 (18:41 +0000)]
[MCA] Refactor the logic that computes the critical memory dependency info. NFCI
CriticalRegDep has been renamed CriticalDependency, and it is now used by class
Instruction to store information about the critical register dependency and the
critical memory dependency. No functional change intended.
Shawn Landden [Sun, 26 May 2019 13:55:14 +0000 (13:55 +0000)]
[SimplifyCFG] Run ReduceSwitchRange unconditionally, generalize
Rather than gating on "isSwitchDense" (resulting in necessarily
sparse lookup tables even when they were generated), always run
this quite cheap transform.
This transform is useful not just for generating tables.
LowerSwitch also wants this: read LowerSwitch.cpp:257.
To avoid generating worse code, introduce a
SubThreshold heuristic.
Instead of just sorting by signed value, generalize the finding of the
best base.
And now that it is run unconditionally, do not replicate its
functionality in SwitchToLookupTable (which could use a Sub
when having a hole is smaller, hence the SubThreshold
heuristic located in a single place).
This simplifies SwitchToLookupTable, and fixes
some ugly corner cases due to the use of signed numbers,
such as a table containing i16 32767 and 32768, of which
32768 would be interpreted as -32768, making the code think
the table has size 65536.
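A small C++ illustration of that signed-interpretation hazard (numbers chosen for the example): two case values that are adjacent as unsigned 16-bit integers span almost the whole range once one of them is reinterpreted as signed.
  #include <cstdint>
  #include <cstdio>

  int main() {
    uint16_t Cases[] = {32767, 32768};  // dense as unsigned: span of 2
    int16_t Lo = (int16_t)Cases[1];     // 32768 wraps to -32768
    int16_t Hi = (int16_t)Cases[0];     // stays 32767
    std::printf("unsigned span: %u\n", (unsigned)(Cases[1] - Cases[0]) + 1);
    std::printf("signed   span: %u\n", (unsigned)((int)Hi - (int)Lo) + 1);
    return 0;
  }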
(We still use unconditional subtraction when building a single-register mask,
but I think this whole block should go when the more general sparse
map is added, which doesn't leave empty holes in the table.)
And the reason test4 and test5 did not trigger was documented incorrectly:
it was because they were not considered sufficiently "dense".
Also, fix generation of invalid LLVM-IR: shl by bit-width.
Extract method to compute overflow based on binop and signedness,
and then make the result handling code generic. This extends the
always-overflow handling to signed muls, but has currently no effect,
as we don't compute always overflow for them (thus NFC).
David Green [Sun, 26 May 2019 10:59:21 +0000 (10:59 +0000)]
[ARM] Promote various fp16 math intrinsics
Promote a number of fp16 math intrinsics to float, so that the relevant float
math routines can be used. Copysign is expanded so as to be handled in-place.
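A hedged sketch of what the promotion amounts to at the source level (assuming a target and compiler where the _Float16 type is available; this is not the backend code): the operation is computed in float via the float libm routine and the result is narrowed back to half.
  #include <cmath>
  #include <cstdio>

  // Half-precision sine computed by promoting to float, calling the float
  // math routine, and truncating the result back to half.
  _Float16 half_sin(_Float16 x) {
    return static_cast<_Float16>(std::sin(static_cast<float>(x)));
  }

  int main() {
    std::printf("%f\n",
                static_cast<double>(half_sin(static_cast<_Float16>(1.0f))));
    return 0;
  }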
Fangrui Song [Sun, 26 May 2019 08:31:00 +0000 (08:31 +0000)]
[PowerPC] Add missing R_PPC_* relocation types
While people mostly care about 64-bit, some systems need basic lib32
support. The plan is to make lld capable of linking some lib32
applications (see PR40888).