[AArch64][SVE2] Asm: add SQRDMLAH/SQRDMLSH instructions

Summary:
This patch adds support for the indexed and unpredicated vector forms of the
SQRDMLAH and SQRDMLSH instructions.
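
As a rough illustration of what these instructions compute per lane, the following Python sketch models one reading of the Arm pseudocode for the signed saturating rounding doubling multiply-accumulate (high half). The function name `sqrdmlah_lane` and its parameters are hypothetical and are not part of the patch; the Arm specification remains the authority on the exact semantics.

```python
def sqrdmlah_lane(acc, n, m, esize=16, subtract=False):
    """Model one signed lane of SQRDMLAH (or SQRDMLSH when subtract=True)."""
    product = 2 * n * m                  # doubling multiply
    rounding = 1 << (esize - 1)          # rounds the bits that get discarded
    wide = (acc << esize) + (-product if subtract else product) + rounding
    high = wide >> esize                 # keep the high half (arithmetic shift)
    lo = -(1 << (esize - 1))
    hi = (1 << (esize - 1)) - 1
    return max(lo, min(hi, high))        # saturate to the signed lane range
```

Inputs are plain Python integers already interpreted as signed lane values, so the sketch does not model register packing or the indexed operand selection.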

The specification can be found here:
https://developer.arm.com/docs/ddi0602/latest

Reviewed By: rovka

Differential Revision: https://reviews.llvm.org/D61515

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360683 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE2] Asm: add integer multiply-add/subtract (indexed) instructions

Summary:
This patch adds support for the following instructions:

MLA mul-add, writing addend (Zda = Zda + Zn * Zm[idx])
MLS mul-sub, writing addend (Zda = Zda + -Zn * Zm[idx])

Predicated forms of these instructions were added in SVE.
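
The two formulas above can be sketched in Python as follows. This is only a model under stated assumptions: lanes are unsigned integers that wrap at the lane width, and the index selects the same element position within each 128-bit segment of Zm, which is how SVE indexed forms are generally described. The name `mla_indexed` and its parameters are hypothetical.

```python
def mla_indexed(zda, zn, zm, idx, bits=16, mls=False):
    """Model Zda = Zda + Zn * Zm[idx] (MLA) or Zda = Zda - Zn * Zm[idx] (MLS)."""
    per_segment = 128 // bits            # lanes in one 128-bit segment
    mask = (1 << bits) - 1               # results wrap at the lane width
    out = []
    for lane, (acc, n) in enumerate(zip(zda, zn)):
        segment_base = (lane // per_segment) * per_segment
        m = zm[segment_base + idx]       # indexed element of this lane's segment
        prod = n * m
        out.append((acc - prod if mls else acc + prod) & mask)
    return out
```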

The specification can be found here:
https://developer.arm.com/docs/ddi0602/latest

Reviewed By: rovka

Differential Revision: https://reviews.llvm.org/D61514

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360682 91177308-0d34-0410-b5e6-96231b3b80d8

Replace lit feature keyword 'not_COFF' with 'uses_COFF'.

Differential Revision: https://reviews.llvm.org/D61791

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360680 91177308-0d34-0410-b5e6-96231b3b80d8

DWARF v5: emit DW_AT_addr_base if DW_AT_low_pc references .debug_addr

The condition !AddrPool.empty() is tested before attachRangesOrLowHighPC(), which may add an entry to AddrPool. We emit DW_AT_low_pc (DW_FORM_addrx) but may incorrectly omit DW_AT_addr_base for LineTablesOnly. This can be easily reproduced:

clang -gdwarf-5 -gmlt -c a.cc

Fix this by moving !AddrPool.empty() below.
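
The ordering problem can be sketched in Python. This is a toy model, not the DwarfDebug code: `emit_cu_attributes` and its flag are hypothetical names, and the list `addr_pool` merely stands in for AddrPool.

```python
def emit_cu_attributes(check_before_low_pc):
    """Return the attributes emitted for a toy compile unit."""
    addr_pool = []                       # stands in for AddrPool
    attrs = []

    def attach_ranges_or_low_high_pc():
        # Emitting DW_AT_low_pc as DW_FORM_addrx adds an entry to the pool.
        addr_pool.append("low_pc address")
        attrs.append("DW_AT_low_pc")

    if check_before_low_pc:
        needs_addr_base = bool(addr_pool)    # pool is still empty here
        attach_ranges_or_low_high_pc()
    else:
        attach_ranges_or_low_high_pc()
        needs_addr_base = bool(addr_pool)    # sees the entry added above

    if needs_addr_base:
        attrs.append("DW_AT_addr_base")
    return attrs
```

With the check performed first, the model omits DW_AT_addr_base; with the check moved below, both attributes are emitted.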

This was discovered while investigating an lld crash (fixed by D61889) on such object files: ld.lld --gdb-index a.o

Reviewed By: probinson

Differential Revision: https://reviews.llvm.org/D61891

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360678 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Custom lower known CR bit spills

For known CRBit spills, CRSET/CRUNSET, it is more efficient to load and spill
the known value instead of extracting the bit.

E.g., this sequence is currently used to spill a CRUNSET:
    crclr   4*cr5+lt
    mfocrf  r3,4
    rlwinm  r3,r3,20,0,0
    stw     r3,132(r1)

This patch custom lowers it to:
    li  r3,0
    stw r3,132(r1)

Differential Revision: https://reviews.llvm.org/D61754

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360677 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-readobj] - Apply clang format. NFC.

I am a bit tired of the formatting issues.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360676 91177308-0d34-0410-b5e6-96231b3b80d8

[APFloat] APFloat::Storage::Storage - fix use after move

This was mentioned both in https://www.viva64.com/en/b/0629/ and by scan-build checks

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360675 91177308-0d34-0410-b5e6-96231b3b80d8

[lit][tests] Add feature libcxx-used and use it in llvm-*-fuzzer tests

When an LLVM binary such as llvm-*-fuzzer is built with libc++, it depends on libc++. The shared-library search path recorded in llvm-*-fuzzer is relative, so these binaries cannot be copied to an arbitrary directory and launched from there. This patch adds a LIT feature indicating that the build uses libc++ and, based on that feature, excludes test cases that copy llvm-*-fuzzer binaries to another directory.

Reviewers: hubert.reinterpretcast, dberris, amyk, jasonliu, EricWF

Reviewed By: hubert.reinterpretcast, amyk

Subscribers: javed.absar, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61265

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360672 91177308-0d34-0410-b5e6-96231b3b80d8

Save the induction binary operator in IVDescriptors for non FP induction variables.

Summary:
Currently InductionBinOps are only saved for FP induction variables. This patch extends that to non-FP induction variables, so users of IVDescriptors can query the InductionBinOps for integer induction variables.

The changes in hasUnsafeAlgebra() and getUnsafeAlgebraInst() are required for the existing LIT test cases to pass. As described in the comments of the two functions, one of the requirements for returning true is that the variable is an FP induction variable. The checks were not needed before because InductionBinOp was not set in non-FP cases.

https://reviews.llvm.org/D60565 depends on this patch.

Committed on behalf of @Whitney (Whitney Tsang).

Reviewers: jdoerfert, kbarton, fhahn, hfinkel, dmgreen, Meinersbur

Reviewed By: jdoerfert

Subscribers: mgorny, hiraditya, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61329

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360671 91177308-0d34-0410-b5e6-96231b3b80d8

TableGen: support #ifndef in addition to #ifdef.

TableGen has a limited preprocessor, which only really supports
easier.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360670 91177308-0d34-0410-b5e6-96231b3b80d8

Reinstate "FileCheck [5/12]: Introduce regular numeric variables"

This reinstates r360578 (git e47362c1ec1ea31b626336cc05822035601c3e57),
reverted in r360653 (git 004393681c25e34e921adccc69ae6378090dee54),
with a fix for the list added in FileCheck.rst to build without error.

Copyright:
    - Linaro (changes up to diff 183612 of revision D55940)
    - GraphCore (changes in later versions of revision D55940 and
                 in new revision created off D55940)

Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar,
arichardson, rnk

Subscribers: hiraditya, llvm-commits, probinson, dblaikie, grimar,
arichardson, tra, rnk, kristina, hfinkel, rogfer01, JonChesterfield

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60385

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360665 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] X86TargetLowering::LowerINTRINSIC_WO_CHAIN - ensure rounding control is initialized. NFCI.

Fixes scan-build warnings

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360664 91177308-0d34-0410-b5e6-96231b3b80d8

AArch64: support binutils-like things on arm64_32.

This adds support for the arm64_32 watchOS ABI to LLVM's low level tools,
teaching them about the specific MachO choices and constants needed to
disassemble things.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360663 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalOpt: do not promote globals used atomically to constants.

Some atomic loads are implemented as cmpxchg (particularly if large or
floating), and that usually requires write access to the memory involved
or it will segfault.

We can still propagate the constant value to users we understand though.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360662 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] Cache gnu_debuglink's target CRC

The .gnu_debuglink section contains information about the file with
debugging symbols, identified by its CRC32. This target file is not
intended to ever change, or it would invalidate the stored checksum, yet
the checksum is calculated over and over again for each of the objects
inside the archive, usually hundreds of times.

This patch precomputes the CRC32 of the target once and then reuses the
value where required, saving lots of redundant I/O.
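
The idea can be illustrated with a small Python sketch that uses `zlib.crc32` in place of the CRC routine in llvm-objcopy. The function names and the archive-member list are hypothetical; only the compute-once-and-reuse pattern is the point.

```python
import zlib


def debuglink_crcs_recomputed(member_names, target_bytes):
    """Old pattern in this sketch: one CRC32 pass over the target per member."""
    return {name: zlib.crc32(target_bytes) & 0xFFFFFFFF for name in member_names}


def debuglink_crcs_cached(member_names, target_bytes):
    """New pattern: compute the CRC32 once, then reuse it for every member."""
    crc = zlib.crc32(target_bytes) & 0xFFFFFFFF
    return {name: crc for name in member_names}
```

Both functions return the same mapping; the cached variant reads the target data once instead of once per archive member.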

The error message reported should stay the same, although now it might
be reported earlier.

Reviewed by: jhenderson, jakehehrlich, MaskRay

Differential Revision: https://reviews.llvm.org/D61343

Patch by Michal Janiszewski

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360661 91177308-0d34-0410-b5e6-96231b3b80d8

[test] Make test work on Windows

Previously, the test didn't work because '\' characters appeared in the
sed string, causing bogus escape characters to form in the substituted
string literal. Switching to using '%/p' causes the path to be emitted
with '/' characters instead, so that there are are no escaping issues.
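
The same class of problem can be reproduced in Python, whose `re.sub` also interprets backslash escapes in the replacement string. This is an analogy rather than the test's actual sed command: the `@PATH@` placeholder and both function names are hypothetical.

```python
import re


def substitute_path(template, path):
    """Replace the hypothetical @PATH@ placeholder, much as a sed 's' command would."""
    return re.sub("@PATH@", path, template)


def to_forward_slashes(path):
    """Same path with '/' separators, which is the effect '%/p' has on the path."""
    return path.replace("\\", "/")
```

Substituting the raw Windows path `C:\new\test` turns `\n` and `\t` into a newline and a tab, while the forward-slash form is inserted unchanged.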

Reviewed by: kzhuravl, grimar

Differential Revision: https://reviews.llvm.org/D61856

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360660 91177308-0d34-0410-b5e6-96231b3b80d8

[MemorySanitizer] getMMXVectorTy - assert valid element size. NFCI.

Fixes scan-build warnings

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360658 91177308-0d34-0410-b5e6-96231b3b80d8

[IRTranslator] Don't hardcode GEP index type

When breaking up loads and stores of aggregates, the IRTranslator uses
LLT::scalar(64) for the index type of the G_GEP instructions that
compute the addresses. This is unnecessarily large for 32-bit targets.
Use the int ptr type provided by the DataLayout instead.

Note that we're already doing the right thing when translating
getelementptr instructions from the IR. This is just an oversight when
generating new ones while translating loads/stores.

Both x86 and AArch64 already have tests confirming that the old
behaviour is preserved for 64-bit targets.

Differential Revision: https://reviews.llvm.org/D61852

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360656 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "FileCheck [5/12]: Introduce regular numeric variables"

This reverts r360578 (git e47362c1ec1ea31b626336cc05822035601c3e57) to
solve the sphinx build failure on
http://lab.llvm.org:8011/builders/llvm-sphinx-docs buildbot.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360653 91177308-0d34-0410-b5e6-96231b3b80d8

Add guidelines/recommendations for organizers of LLVM Socials

Differential Revision: https://reviews.llvm.org/D61550

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360651 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Prefer locked stack op over mfence for seq_cst 64-bit stores on 32-bit targets

This is a follow on to D58632, with the same logic. Given a memory operation which needs ordering, but doesn't need to modify any particular address, prefer to use a locked stack op over an mfence.

Differential Revision: https://reviews.llvm.org/D61863

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360649 91177308-0d34-0410-b5e6-96231b3b80d8

[Object] Change ObjectFile::getSectionContents to return Expected<ArrayRef<uint8_t>>

Change
std::error_code getSectionContents(DataRefImpl, StringRef &) const;
to
Expected<ArrayRef<uint8_t>> getSectionContents(DataRefImpl) const;

Many object formats use ArrayRef<uint8_t> as the underlying type, which
is generally better than StringRef to represent binary data, so change
the type to decrease the number of type conversions.

Reviewed By: ruiu, sbc100

Differential Revision: https://reviews.llvm.org/D61781

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360648 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: add Hexagon target

Differential Revision: https://reviews.llvm.org/D61819

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360647 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: add Sparc target

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360645 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: add Lanai target

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360644 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC][NFC] Fix typos in triples

Found by bzEq (Kai Luo).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360643 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Use X86 instead of X32 as a check prefix in atomic-idempotent.ll. NFC

X32 can refer to a 64-bit ABI that uses 32-bit ints, longs, and pointers.

I plan to add gnux32 command lines to this test so this prepares for that.

Also remove some check lines that have a prefix that is not in any run lines.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360642 91177308-0d34-0410-b5e6-96231b3b80d8

[SDAG] fix unused variable warning and unneeded indirection; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360640 91177308-0d34-0410-b5e6-96231b3b80d8

[SDAG, x86] allow targets to override test for binop opcodes

This follows the pattern of the existing isCommutativeBinOp().

x86 shows improvements from vector narrowing for the min/max opcodes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360639 91177308-0d34-0410-b5e6-96231b3b80d8

[coroutines] Fix spills of static array allocas

Summary:
CoroFrame was not considering static array allocas, and was only ever reserving a single element in the coroutine frame.
This meant that stores to the non-zero'th element would corrupt later frame data.

Store static array allocas as field arrays in the coroutine frame.
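
A simplified Python model of the frame layout shows why reserving a single element corrupts later fields. It ignores alignment and padding, and the helper names and the `(name, element_size, element_count)` tuple format are hypothetical.

```python
def frame_offsets(allocas, honor_array_count=True):
    """allocas: list of (name, element_size, element_count).

    Returns name -> (offset, reserved_size) for a packed frame.
    """
    layout, offset = {}, 0
    for name, elem_size, count in allocas:
        size = elem_size * (count if honor_array_count else 1)
        layout[name] = (offset, size)
        offset += size
    return layout


def store_overlaps_field(layout, name, index, elem_size, other):
    """True if a store to name[index] touches the bytes reserved for `other`."""
    start = layout[name][0] + index * elem_size
    other_start, other_size = layout[other]
    return start < other_start + other_size and start + elem_size > other_start
```

With a four-element array followed by a scalar, the single-element layout places the scalar where element 1 of the array is stored; the array-aware layout does not.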

Added test.

Committed by Gor Nishanov on behalf of ben-clayton
Reviewers: GorNishanov, modocache

Reviewed By: GorNishanov

Subscribers: Orlando, capn, EricWF, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61372

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360636 91177308-0d34-0410-b5e6-96231b3b80d8

[gn] Fix build

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360629 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Use ISD::MERGE_VALUES to return from lowerAtomicArith instead of calling ReplaceAllUsesOfValueWith and returning SDValue().

Returning SDValue() makes the caller think that nothing happened and it will
end up executing the Expand path. This generates extra nodes that will need to
be pruned as dead code.

Returning an ISD::MERGE_VALUES will tell the caller that we'd like to make a
change and it will take care of replacing uses. This will prevent falling into
the Expand path.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360627 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] gfx1010 SearchableTableEmitter patch for NSA

This part was accidentally missing from NSA image support commit.

Differential Revision: https://reviews.llvm.org/D61868

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360623 91177308-0d34-0410-b5e6-96231b3b80d8

[Pass Pipeline][NFC] Add a test prior to committing D61726

This patch just adds a test case to show the differences in code emitted
by opt before and after https://reviews.llvm.org/D61726.

The previous attempt to commit this did not include the registered target
requirement, so it caused buildbot breaks.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360620 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Various type corrections to the code that creates LOCK_OR32mi8/OR32mi8Locked to the stack for idempotent atomic rmw and atomic fence.

These are updates to match how isel table would emit a LOCK_OR32mi8 node.

-Use i32 for the immediate zero even though only 8 bits are encoded.
-Use i16 for segment register.
-Use LOCK_OR32mi8 for idempotent atomic operations in 32-bit mode to match
64-bit mode. I'm not sure why OR32mi8Locked and LOCK_OR32mi8 both exist. The
only difference seems to be that OR32mi8Locked is marked as UnmodeledSideEffects=1.
-Emit an extra i32 result for the flags output.

I don't know if the types here really matter; I just noticed it was inconsistent
with normal behavior.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360619 91177308-0d34-0410-b5e6-96231b3b80d8

[JITLink][MachO] Honor the no-dead-strip flag on nlist entries.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360618 91177308-0d34-0410-b5e6-96231b3b80d8

[Support] Ensure redirected outputs don't contain output from previous tests.

stdout may be buffered, and may not flush on every write. Explicitly flushing
before redirecting the output ensures that the captured output does not contain
output from other tests.
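
The effect can be demonstrated with a small Python sketch. It uses temporary files and `os.dup2` rather than the real stdout or the unit-test code, and `capture_after_redirect` is a hypothetical name.

```python
import os
import tempfile


def capture_after_redirect(flush_first):
    """Return what lands in the redirect target after switching descriptors."""
    with tempfile.TemporaryDirectory() as tmp:
        first = os.path.join(tmp, "first.txt")
        second = os.path.join(tmp, "second.txt")

        stream = open(first, "w")            # buffered, like stdout to a file
        stream.write("previous test\n")      # still sitting in the buffer
        if flush_first:
            stream.flush()                   # push it to the old destination

        redirect_fd = os.open(second, os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
        os.dup2(redirect_fd, stream.fileno())    # redirect the descriptor
        os.close(redirect_fd)

        stream.write("current test\n")
        stream.close()                       # flushes whatever is buffered

        with open(second) as captured:
            return captured.read()
```

Without the explicit flush, the buffered text from the earlier write is emitted only after the redirect and so ends up in the captured output.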

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360617 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Don't assume that zext/sext result is i32/i64 in fast isel (PR41841)

Usually this will abort fast-isel at the instruction using the
non-legal result, but if the only use is in a different basic block,
we'll incorrectly assume that the zext/sext is to i32 (rather than
i128 in this case).

Differential Revision: https://reviews.llvm.org/D61823

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360616 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] gfx1010 tests. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360615 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Reorder includes per coding standard. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360609 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Remove now unused V2FP16_ONE constant def. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360608 91177308-0d34-0410-b5e6-96231b3b80d8

Revert [X86] Avoid SFB - Fix inconsistent codegen with/without debug info

Revert r360436 as it is causing clang-x64-windows-msvc buildbot to fail.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360606 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] try harder to form rotate (funnel shift) (PR20750)

We have a similar match for patterns ending in a truncate. This
should be ok for all targets because the default expansion would
still likely be better from replacing 2 'and' ops with 1.

Attempt to show the logic equivalence in Alive (which doesn't
currently have funnel-shift in its vocabulary AFAICT):

  %shamt = zext i8 %i to i32
  %m = and i32 %shamt, 31
  %neg = sub i32 0, %shamt
  %and4 = and i32 %neg, 31
  %shl = shl i32 %v, %m
  %shr = lshr i32 %v, %and4
  %or = or i32 %shr, %shl
  =>
  %a = and i8 %i, 31
  %shamt2 = zext i8 %a to i32
  %neg2 = sub i32 0, %shamt2
  %and4 = and i32 %neg2, 31
  %shl = shl i32 %v, %shamt2
  %shr = lshr i32 %v, %and4
  %or = or i32 %shr, %shl

https://rise4fun.com/Alive/V9r
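
As an informal cross-check of the two IR sequences above, here is a direct Python transcription using 32-bit wrapping arithmetic. The function names `before` and `after` are hypothetical, and this brute-force comparison is not a substitute for the Alive proof.

```python
MASK32 = 0xFFFFFFFF


def before(v, i):
    shamt = i & 0xFF                  # zext i8 %i to i32
    m = shamt & 31
    neg = (0 - shamt) & MASK32        # sub i32 0, %shamt
    and4 = neg & 31
    shl = (v << m) & MASK32
    shr = v >> and4                   # lshr on an unsigned 32-bit value
    return shr | shl


def after(v, i):
    a = (i & 0xFF) & 31               # and i8 %i, 31
    shamt2 = a                        # zext i8 %a to i32
    neg2 = (0 - shamt2) & MASK32
    and4 = neg2 & 31
    shl = (v << shamt2) & MASK32
    shr = v >> and4
    return shr | shl


def sequences_agree(values):
    """Compare both sequences for every i8 shift amount and the given values."""
    return all(before(v, i) == after(v, i) for v in values for i in range(256))
```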

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360605 91177308-0d34-0410-b5e6-96231b3b80d8

[TargetLowering] Handle multi depth GEPs w/ inline asm constraints

Summary:
X86TargetLowering::LowerAsmOperandForConstraint had better support than
TargetLowering::LowerAsmOperandForConstraint for arbitrary depth
getelementpointers for "i", "n", and "s" extended inline assembly
constraints. Hoist its support from the derived class into the base
class.

Link: https://github.com/ClangBuiltLinux/linux/issues/469
Reviewers: echristo, t.p.northover

Reviewed By: t.p.northover

Subscribers: t.p.northover, E5ten, kees, jyknight, nemanjai, javed.absar, eraman, hiraditya, jsji, llvm-commits, void, craig.topper, nathanchance, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61560

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360604 91177308-0d34-0410-b5e6-96231b3b80d8

Stop defining negative versions of some lit feature keywords:
zlib/nozlib, asan/not_asan, msan/not_msan, ubsan/not_ubsan.

We still have two other ways to express the absence of a feature.
First, we have the '!' operator to invert the sense of a keyword.  For
example, given a feature that depends on zlib being unavailable, its
test can say:
    REQUIRES: !zlib

Second, if a test doesn't play well with some features, such as
sanitizers, that test can say:
    UNSUPPORTED: asan, msan

The different ways of writing these exclusions both have the same
technical effect, but have different implications to the reader.
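
A deliberately simplified Python model of how the two keywords decide whether a test runs is shown below. It handles only comma-separated names and a leading '!'; lit's real boolean expressions are richer, and the function names here are hypothetical.

```python
def feature_term_holds(term, available):
    """Evaluate one term such as 'zlib' or '!zlib' against the available features."""
    term = term.strip()
    if term.startswith("!"):
        return term[1:].strip() not in available
    return term in available


def should_run(requires, unsupported, available):
    """requires/unsupported: the comma-separated text written after each keyword."""
    req_terms = [t for t in requires.split(",") if t.strip()]
    uns_terms = [t for t in unsupported.split(",") if t.strip()]
    if not all(feature_term_holds(t, available) for t in req_terms):
        return False                      # a required condition is missing
    return not any(feature_term_holds(t, available) for t in uns_terms)
```

For instance, `should_run("!zlib", "", {"asan"})` is true, while `should_run("", "asan, msan", {"msan"})` is false.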

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360603 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add tests for rotates with narrow shift amount (PR20750); NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360601 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Fewer dependencies in llvm/lib/Target

The tablegen groups only need public_deps for inc files included
(possibly transitively) in other targets. Move inc files that are
internal to the MCTargetDesc libraries into regular deps.

Related to the changes that merged InstPrinter into MCTargetDesc
(360484, 360486 etc).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360600 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r360572

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360597 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] LowerBuildVectorv4x32 - don't insert MOVQ for undef elts

Fixes the regression noted in D61782 where a VZEXT_MOVL was being inserted because we weren't discriminating between 'zeroable' and 'all undef' for the upper elts.

Differential Revision: https://reviews.llvm.org/D61782

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360596 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Relax use limits for lowerAddSubToHorizontalOp (PR32433)

Now that we can use HADD/SUB for scalar additions from any pair of extracted elements (D61263), we can relax the one use limit as we will be able to merge multiple uses into using the same HADD/SUB op.

This exposes a couple of missed opportunities in LowerBuildVectorv4x32 which will be committed separately.

Differential Revision: https://reviews.llvm.org/D61782

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360594 91177308-0d34-0410-b5e6-96231b3b80d8

[TargetLowering] Add SimplifyDemandedBits support for ZERO_EXTEND_VECTOR_INREG

More work for PR39709.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360592 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add test case for mask register variant of PR41619 which should be fixed after r360552

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360591 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[LSR] Tweak setup cost depth threshold to 10."

Changing the threshold might not be the best long term approach. Revert for now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360589 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add SimplifyDemandedBits support for PEXTRB/PEXTRW (PR39709)

Test case will be included in a followup - it's being used, but it's tricky to show a case that isn't caught at a later stage anyway.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360588 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] narrow vector binop with inserts/extract

We catch most of these patterns (on x86 at least) by matching
a concat vectors opcode early in combining, but the pattern may
emerge later using insert subvector instead.

The AVX1 diffs for add/sub overflow show another missed narrowing
pattern. That one may be falling through the cracks because of
combine ordering and multiple uses.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360585 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] add test for insert/extract binop; NFC

This pattern is visible in the c-ray benchmark with an AVX target.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360582 91177308-0d34-0410-b5e6-96231b3b80d8

Add constrained fptrunc and fpext intrinsics.

The new fptrunc and fpext intrinsics are constrained versions of the
regular fptrunc and fpext instructions.

Reviewed by: Andrew Kaylor, Craig Topper, Cameron McInally, Conner Abbot
Approved by: Craig Topper
Differential Revision: https://reviews.llvm.org/D55897

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360581 91177308-0d34-0410-b5e6-96231b3b80d8

TargetLowering::SimplifyDemandedBits - early-out for UNDEF ops. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360579 91177308-0d34-0410-b5e6-96231b3b80d8

FileCheck [5/12]: Introduce regular numeric variables

Summary:
This patch is part of a patch series to add support for FileCheck
numeric expressions. This specific patch introduces regular numeric
variables which can be set on the command-line.

This commit introduces regular numeric variables that can be set on the
command-line with the -D option to a numeric value. They can then be
used in CHECK patterns in numeric expressions with the same shape as
@LINE numeric expressions, i.e. VAR, VAR+offset or VAR-offset where offset
is an integer literal.

The commit also enables strict whitespace in the verbose.txt testcase to
check the position of the location diagnostics. It fixes one of the
existing CHECK lines in the process, which was not accurately testing a
location diagnostic (i.e. the diagnostic was correct, not the CHECK).
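
The three expression shapes can be modelled with a few lines of Python. This is a sketch of the arithmetic only: it does not reproduce FileCheck's pattern syntax or its parser, and `eval_numeric_expr` and the `variables` dictionary are hypothetical.

```python
import re

# VAR, VAR+offset or VAR-offset, where offset is an integer literal.
_EXPR = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*(?:([+-])\s*(\d+))?\s*$")


def eval_numeric_expr(expr, variables):
    """Evaluate a numeric expression against variables set up front."""
    match = _EXPR.match(expr)
    if match is None:
        raise ValueError(f"unsupported numeric expression: {expr!r}")
    name, sign, offset = match.groups()
    value = variables[name]
    if sign is None:
        return value
    return value + int(offset) if sign == "+" else value - int(offset)
```

With `variables = {"VAR": 10}`, the expressions `VAR`, `VAR+2` and `VAR-3` evaluate to 10, 12 and 7.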

Copyright:
    - Linaro (changes up to diff 183612 of revision D55940)
    - GraphCore (changes in later versions of revision D55940 and
                 in new revision created off D55940)

Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk

Subscribers: hiraditya, llvm-commits, probinson, dblaikie, grimar, arichardson, tra, rnk, kristina, hfinkel, rogfer01, JonChesterfield

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60385

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360578 91177308-0d34-0410-b5e6-96231b3b80d8

[ThinLTO] Don't internalize weak writeable variables

Variables with linkonce_odr and weak_odr linkage shouldn't be internalized
if they're not readonly. Otherwise we may end up with multiple copies of
such a variable, so reads and writes will become inconsistent.

Differential revision: https://reviews.llvm.org/D61255

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360577 91177308-0d34-0410-b5e6-96231b3b80d8

Simplify llvm-cat help

Only output options that are directly relevant.

Differential Revision: https://reviews.llvm.org/D61740

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360575 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE2] Add SVE2 target features to backend and TargetParser

Summary:
This patch adds the following features defined by Arm SVE2 architecture
extension:

sve2, sve2-aes, sve2-sm4, sve2-sha3, bitperm

For existing CPUs these features are declared as unsupported to prevent
scheduler errors.

The specification can be found here:
https://developer.arm.com/docs/ddi0602/latest

Reviewers: SjoerdMeijer, sdesmalen, ostannard, rovka

Reviewed By: SjoerdMeijer, rovka

Subscribers: rovka, javed.absar, tschuett, kristof.beyls, kristina, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61513

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360573 91177308-0d34-0410-b5e6-96231b3b80d8

[SystemZ] Model floating-point control register

This adds the FPC (floating-point control register) as a reserved
physical register and models its use by SystemZ instructions.

Note that only the current rounding modes and the IEEE exception
masks are modeled. *Changes* of the FPC due to exceptions (in
particular the IEEE exception flags and the DXC) are not modeled.

At this point, this patch is mostly NFC, but it will prevent
scheduling of floating-point instructions across SPFC/LFPC etc.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360570 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM][ParallelDSP] Relax alias checks

When deciding the safety of generating smlad, we checked for any
writes within the block that may alias with any of the loads that
need to be widened. This is overly conservative because it only
matters when there's a potential aliasing write to a location
accessed by a pair of loads.

Now we check for aliasing writes only once, during setup. If two
loads are found to have an aliasing write between them, we don't add
these loads to LoadPairs. This means that later during the transform,
we can safely widen a pair without worrying about aliasing.
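
The setup-time check can be pictured with a toy Python model. It treats "may alias" as "same address" and pairs loads of consecutive addresses, which is far simpler than the pass's real alias analysis; `safe_load_pairs` and the `(kind, address)` encoding are hypothetical.

```python
def safe_load_pairs(ops):
    """ops: program-ordered list of ("load" | "store", address).

    Pair loads of consecutive addresses unless a store to either address
    sits between the two loads.
    """
    pairs = []
    loads = [(i, addr) for i, (kind, addr) in enumerate(ops) if kind == "load"]
    for i, base in loads:
        for j, addr in loads:
            if j > i and addr == base + 1:
                between = ops[i + 1:j]
                clobbered = any(kind == "store" and a in (base, addr)
                                for kind, a in between)
                if not clobbered:
                    pairs.append((base, addr))
    return pairs
```

Only pairs that survive this check would be recorded, so a later widening step in the model never needs to look for aliasing writes again.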

However, to maintain correctness, we also need to change the way that
wide loads are inserted because the order is now important.

The MatchSMLAD method has also been changed, absorbing
MatchReductions and AddMACCandidate to hopefully improve readability.

Differential Revision: https://reviews.llvm.org/D6102

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360567 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] Fix invalid alias analysis.

Summary:
When we know for sure whether two addresses do or do not alias, we
should immediately return from DAGCombiner::isAlias().

I think this comes from a bad copy/paste. Sorry for not catching that during the
code review.

Fixes PR41855.

Reviewers: niravd, gchatelet, EricWF

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61846

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360566 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner][NFC] Commit test to show fix in D61846.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360561 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Add dependency on WebAssemblyDesc to fix BUILD_SHARED_LIBS=on builds after rL360550

This fixes the link error

ld.lld: error: undefined symbol: llvm::WebAssembly::anyTypeToString(unsigned int)
>>> referenced by WebAssemblyDisassembler.cpp

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360558 91177308-0d34-0410-b5e6-96231b3b80d8

[BPF] emit BTF sections only if debuginfo available

Currently, without -g, BTF sections may still be emitted with
data sections, e.g., for the linux kernel bpf selftest
test_tcp_check_syncookie_kern.c. The issue was discovered by Martin
and is shown below.

-bash-4.4$ bpftool btf dump file test_tcp_check_syncookie_kern.o
[1] VAR 'results' type_id=0, linkage=global-alloc
[2] VAR '_license' type_id=0, linkage=global-alloc
[3] DATASEC 'license' size=0 vlen=1
type_id=2 offset=0 size=4
[4] DATASEC 'maps' size=0 vlen=1
type_id=1 offset=0 size=28

Let's disable BTF generation if there is no debuginfo, which is
the original design.

Signed-off-by: Yonghong Song <yhs@fb.com>
Differential Revision: https://reviews.llvm.org/D61826

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360556 91177308-0d34-0410-b5e6-96231b3b80d8

[JITLink] Track section alignment and make sure it is respected during layout.

Previously we had only honored alignments on individual atoms, but
tools/runtimes may assume that the section alignment is respected too.
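
A minimal Python sketch of a layout that respects both alignments follows. The names `align_to` and `layout_section`, and the `(name, size, alignment)` atom tuples, are hypothetical and do not mirror the JITLink data structures.

```python
def align_to(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def layout_section(start, section_alignment, atoms):
    """atoms: list of (name, size, alignment). Returns name -> address."""
    address = align_to(start, section_alignment)   # honor the section first
    placed = {}
    for name, size, alignment in atoms:
        address = align_to(address, alignment)     # then each atom
        placed[name] = address
        address += size
    return placed
```

Starting at 0x1004 with a 16-byte section alignment, the first atom is placed at 0x1010 even if its own alignment is 1.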

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360555 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: support host build on ppc64 (a.k.a. powerpc64le)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360553 91177308-0d34-0410-b5e6-96231b3b80d8

Recommit r358887 "[TargetLowering][AMDGPU][X86] Improve SimplifyDemandedBits bitcast handling"

I've included a new fix in X86RegisterInfo to prevent PR41619 without
reintroducing r359392. We might be able to improve that in the base class
implementation of shouldRewriteCopySrc somehow. But this hopefully enables
forward progress on SimplifyDemandedBits improvements for now.

Original commit message:

This patch adds support for BigBitWidth -> SmallBitWidth bitcasts, splitting the DemandedBits/Elts accordingly.

The AMDGPU backend needed an extra (srl (and x, c1 << c2), c2) -> (and (srl x, c2), c1) combine to encourage BFE creation. I investigated putting this in DAGComb
but it caused a lot of noise on other targets - some improvements, some regressions.
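
The shift/mask rewrite mentioned for AMDGPU can be sanity-checked by brute force on a narrow bit width. The Python below is only an illustration with hypothetical function names, assuming logical right shifts on fixed-width unsigned values.

```python
def srl_of_and(x, c1, c2, bits=8):
    """(srl (and x, c1 << c2), c2) on a bits-wide unsigned value."""
    mask = (1 << bits) - 1
    return (x & ((c1 << c2) & mask)) >> c2


def and_of_srl(x, c1, c2, bits=8):
    """(and (srl x, c2), c1) on a bits-wide unsigned value."""
    return (x >> c2) & c1


def combine_is_sound(bits=8):
    """Exhaustively compare both forms for every x, c1 and in-range c2."""
    values = range(1 << bits)
    return all(srl_of_and(x, c1, c2, bits) == and_of_srl(x, c1, c2, bits)
               for x in values for c1 in values for c2 in range(bits))
```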

The X86 changes are all definite wins.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360552 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: merge r360550

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360551 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Move InstPrinter files to MCTargetDesc. NFC

For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360550 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r360540

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360549 91177308-0d34-0410-b5e6-96231b3b80d8

[JITLink] Add a test for zero-filled content.

Also updates RuntimeDyldChecker and llvm-rtdyld to support zero-fill tests by
returning a content address of zero (but no error) for zero-fill atoms, and
treating loads from zero as returning zero.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360547 91177308-0d34-0410-b5e6-96231b3b80d8

[ORC] Fix some typos.

Patch by Praveen Velliengiri. Thanks Praveen!

https://reviews.llvm.org/D61793

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360546 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX] Split VZEXT_MOVL ymm/zmm if the upper elements are not demanded.

Removes unnecessary vzeroupper noted in D61806

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360543 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopVectorizer] add tests for FP minmax; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360542 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] try to move bitcast after extract_subvector

I noticed that we were failing to narrow an x86 ymm math op in a case similar
to the 'madd' test diff. That is because a bitcast is sitting between the math
and the extract subvector and thwarting our pattern matching for narrowing:

       t56: v8i32 = add t59, t58
      t68: v4i64 = bitcast t56
    t73: v2i64 = extract_subvector t68, Constant:i64<2>
  t96: v4i32 = bitcast t73
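
To see why the bitcast can be moved out of the way, the sketch below models the four nodes above on raw little-endian bytes and compares the result with extracting directly from the v8i32 value. The helper functions are hypothetical stand-ins for the DAG nodes, and little-endian layout is assumed.

```python
import struct


def bitcast(values, src_fmt, dst_fmt):
    # Reinterpret the same bytes with a different element type (little-endian).
    raw = struct.pack("<%d%s" % (len(values), src_fmt), *values)
    count = len(raw) // struct.calcsize("<" + dst_fmt)
    return list(struct.unpack("<%d%s" % (count, dst_fmt), raw))


def extract_subvector(values, index, count):
    # Take `count` consecutive elements starting at element `index`.
    return values[index:index + count]


t56 = list(range(100, 108))           # stand-in for the v8i32 add result
t68 = bitcast(t56, "I", "Q")          # v4i64 = bitcast t56
t73 = extract_subvector(t68, 2, 2)    # v2i64 = extract_subvector t68, 2
t96 = bitcast(t73, "Q", "I")          # v4i32 = bitcast t73

# Same bytes, without a bitcast between the math op and the extract:
narrowed = extract_subvector(t56, 4, 4)
assert t96 == narrowed
```

With the extract applied directly to the v8i32 value (at a rescaled index), the math op and the extract become adjacent, which is what the narrowing pattern needs.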

There are a few wins and neutral diffs in the other tests.

Differential Revision: https://reviews.llvm.org/D61806

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360541 91177308-0d34-0410-b5e6-96231b3b80d8

[utils] update_test_checks.py: allow opt-8, opt-9

Allow using Debian's opt-8, opt-9 with update_test_checks.py
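
A minimal sketch of the kind of check this implies, assuming the tool is recognized by its basename. The pattern and function name are hypothetical and are not taken from update_test_checks.py.

```python
import re

# Hypothetical pattern: "opt", optionally followed by "-<major version>",
# which is how Debian names its versioned LLVM binaries (opt-8, opt-9).
OPT_BASENAME_RE = re.compile(r"^opt(-\d+)?$")


def is_opt_binary(basename):
    """Return True if basename looks like opt or a versioned opt binary."""
    return OPT_BASENAME_RE.match(basename) is not None


print([name for name in ("opt", "opt-8", "opt-9", "llc") if is_opt_binary(name)])
```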

Patch by Shawn Landden!

Differential Revision: https://reviews.llvm.org/D61148

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360536 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] SimplifyDemandedBits - call PEXTRB/PEXTRW SimplifyDemandedVectorElts as well.

See if we can simplify the demanded vector elts from the extraction before trying to simplify the demanded bits.

This helps us with target shuffles and hops in particular.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360535 91177308-0d34-0410-b5e6-96231b3b80d8

[DAG] Add SimplifyDemandedBits support for BITREVERSE

Pulled out of D58017 while I continue to investigate the BSWAP regression on PPC
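
The idea behind demanded-bits handling for a bit reversal can be sketched on plain integers: if only some bits of the result are demanded, only the mirrored bits of the operand matter. The function names are illustrative and this is not the code from the patch.

```python
def bitreverse(value, width):
    """Reverse the low `width` bits of value."""
    result = 0
    for i in range(width):
        if value & (1 << i):
            result |= 1 << (width - 1 - i)
    return result


def demanded_src_bits(demanded_result_bits, width):
    # Result bit i comes from source bit (width - 1 - i), so the demanded
    # mask on the source is the bit-reversed demanded mask on the result.
    return bitreverse(demanded_result_bits, width)


# Only the low nibble of an 8-bit result is demanded:
demanded = 0x0F
src_mask = demanded_src_bits(demanded, 8)

# Clearing the non-demanded source bits never changes the demanded result bits.
for x in range(256):
    assert bitreverse(x, 8) & demanded == bitreverse(x & src_mask, 8) & demanded
```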

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360534 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Updated shift-mask test targets for D61830

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360533 91177308-0d34-0410-b5e6-96231b3b80d8

[CommandLine] Add long option flag for cl::ParseCommandLineOptions. Part 5 of 5

Summary:
If passed, the long option flag makes the CommandLine parser
mimic the behavior of GNU getopt_long. Short options are a single
character prefixed by a single dash, and long options are multiple
characters prefixed by a double dash.
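
The two rules above can be written down as a small classifier. This is only a sketch of the naming convention described in the summary; the function is hypothetical and does not reflect how the CommandLine parser is implemented.

```python
def classify(arg):
    """Classify an argument using only the two rules stated above."""
    if arg.startswith("--"):
        name = arg[2:]
        # Long option: multiple characters after a double dash.
        return "long" if len(name) > 1 else "other"
    if arg.startswith("-"):
        name = arg[1:]
        # Short option: a single character after a single dash.
        return "short" if len(name) == 1 else "other"
    return "other"


for arg in ("-o", "--output", "-output", "--o", "file.txt"):
    print(arg, "->", classify(arg))
```

Anything that fits neither rule is reported as "other"; the sketch deliberately does not guess how such arguments are handled.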

This patch was motivated by the discussion in the following thread:
http://lists.llvm.org/pipermail/llvm-dev/2019-April/131786.html

Reviewed By: MaskRay

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61294

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360532 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add scalar shl+lshr -> shift+mask tests (PR40758)

As discussed on D61068, many x86 targets can perform 2 immediate shifts quicker than a shift + mask
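
For reference, the two equivalent forms being compared can be sketched on 32-bit integers. The helper names are made up for illustration, and the sketch says nothing about which form is faster on a given target.

```python
MASK32 = 0xFFFFFFFF  # model i32 arithmetic with an explicit mask


def shl_lshr(x, c1, c2):
    # Two immediate shifts: shift left by c1, then logical shift right by c2.
    return ((x << c1) & MASK32) >> c2


def shift_mask(x, c1, c2):
    # One shift by the difference of the amounts, then a mask that clears
    # the top c2 bits (the bits the logical right shift would zero).
    mask = MASK32 >> c2
    if c1 >= c2:
        return (x << (c1 - c2)) & mask
    return (x >> (c2 - c1)) & mask


# Spot-check that both forms agree.
for x in (0, 1, 0xDEADBEEF, 0x80000001, MASK32):
    for c1 in (0, 1, 4, 8, 31):
        for c2 in (0, 1, 4, 8, 31):
            assert shl_lshr(x, c1, c2) == shift_mask(x, c1, c2)
```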

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360530 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add avx512f tests for boolean reduction

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360529 91177308-0d34-0410-b5e6-96231b3b80d8

[CostModel][X86] Add min/max reduction costs for all SSE targets

The original costs stopped at SSE42, I've added conservative estimates for everything down to SSE1/SSE2 and moved some of the SSE42 costs to SSE41 (really only the addition of PCMPGT makes any difference).

I've also added missing vXi8 costs (we use PHMINPOSUW for i8/i16 for scarily quick results) and 256-bit vector costs for AVX1.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360528 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] yaml2obj/yaml2elf.cpp whitespace changes: dos2unix removed CRs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360527 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Add SimplifyDemandedVectorElts HADD/HSUB handling.

Still missing PHADDW/PHSUBW tests because PEXTRW doesn't call SimplifyDemandedVectorElts

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360526 91177308-0d34-0410-b5e6-96231b3b80d8

FixupLEAPass::fixupIncDec - non-LEA opcodes should not happen here. NFCI.

Matches what we do in other functions and fixes scan-build warning about uninitialized NewOpcode variable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360525 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add CMOV_FR32X/CMOV_FR64X pseudo instructions. Use them in fast isel to fix a machine verifier error after adding test cases.

Fast isel picks the FR32X/FR64X register classes when lowering pseudo select, but it didn't have the right opcode to go with it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360524 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Sink some fast isel code into the only if that uses it. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360523 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Use TLI.getRegClassFor to simplify some more fast isel code. NFCI

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360522 91177308-0d34-0410-b5e6-96231b3b80d8

[MC][X86] Add test cases from PR14056

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360521 91177308-0d34-0410-b5e6-96231b3b80d8

HexagonConstEvaluator::evaluateHexExt - check incoming opcodes. NFCI.

Only certain extension opcodes are supported - fixes scan build warning.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360520 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Tweaked HADD/HSUB SimplifyDemandedVectorElts

Try to ensure we have LHS and RHS test coverage

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360519 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Add integer HADD/HSUB SimplifyDemandedVectorElts tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360518 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Add HADD/HSUB SimplifyDemandedVectorElts tests

Shows missed opportunities to simplify args.

Will add integer HADD/HSUB tests in a future commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360517 91177308-0d34-0410-b5e6-96231b3b80d8

Fix uninitialized variable analyzer warning. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360516 91177308-0d34-0410-b5e6-96231b3b80d8

SelectionDAGISel::CodeGenAndEmitDAG - remove unused variable. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360514 91177308-0d34-0410-b5e6-96231b3b80d8