
[llvm-objdump] Keep warning for --disassemble-functions in correct order.

Keep the warning in the correct order relative to normal output when
dumping archive files.

This prepares for PR35351.

Reviewers: jhenderson, grimar, MaskRay, rupprecht

Reviewed by: MaskRay, jhenderson

Differential Revision: https://reviews.llvm.org/D64165

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365564 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] gfx908 mAI instructions, MC part

Differential Revision: https://reviews.llvm.org/D64446

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365563 91177308-0d34-0410-b5e6-96231b3b80d8

[SLP] Optimize getSpillCost(); NFCI

For a given set of live values, the spill cost will always be the
same for each call. Compute the cost once and multiply it by the
number of calls.

(I'm not sure this spill cost modeling makes sense if there are
multiple calls, as the spill cost will likely be shared across
calls in that case. But that's how it currently works.)
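
As an illustration of the change described above, here is a minimal sketch of hoisting a call-invariant cost out of a per-call loop. The cost function and its parameters are hypothetical stand-ins, not the actual getSpillCost() code.

```cpp
#include <cstddef>

// Hypothetical stand-in for the cost of spilling a fixed set of live values
// around a single call. It depends only on the live set, not on the call.
int spillCostForOneCall(std::size_t numLiveValues, int costPerValue) {
  return static_cast<int>(numLiveValues) * costPerValue;
}

// Before: the same cost is recomputed for every call.
int totalSpillCostNaive(std::size_t numLiveValues, int costPerValue,
                        int numCalls) {
  int total = 0;
  for (int i = 0; i < numCalls; ++i)
    total += spillCostForOneCall(numLiveValues, costPerValue);
  return total;
}

// After: compute the cost once and multiply it by the number of calls.
int totalSpillCostHoisted(std::size_t numLiveValues, int costPerValue,
                          int numCalls) {
  return numCalls * spillCostForOneCall(numLiveValues, costPerValue);
}
```

Both functions return the same total; the second simply avoids the repeated work.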

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365552 91177308-0d34-0410-b5e6-96231b3b80d8

hwasan: Improve precision of checks using short granule tags.

A short granule is a granule of size between 1 and `TG-1` bytes. The size
of a short granule is stored at the location in shadow memory where the
granule's tag is normally stored, while the granule's actual tag is stored
in the last byte of the granule. This means that in order to verify that a
pointer tag matches a memory tag, HWASAN must check for two possibilities:

* the pointer tag is equal to the memory tag in shadow memory, or
* the shadow memory tag is actually a short granule size, the value being loaded
is in bounds of the granule and the pointer tag is equal to the last byte of
the granule.
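
The two possibilities above can be sketched as a single predicate. This is only an illustration: the function name, its parameters, and the granule size of 16 bytes are assumptions made for the example, and the real check is emitted as instrumentation rather than written like this.

```cpp
#include <cstdint>

// Assumed granule size (TG) for this sketch.
constexpr unsigned kGranuleSize = 16;

// Returns true if an access passes the tag check.
//   ptrTag            tag taken from the pointer
//   shadowTag         byte loaded from shadow memory for the granule
//   offsetInGranule   offset of the first accessed byte within the granule
//   accessSize        number of bytes accessed
//   lastByteOfGranule value stored in the last byte of the granule
bool tagsMatch(uint8_t ptrTag, uint8_t shadowTag, unsigned offsetInGranule,
               unsigned accessSize, uint8_t lastByteOfGranule) {
  // Possibility 1: the pointer tag equals the tag in shadow memory.
  if (ptrTag == shadowTag)
    return true;
  // Possibility 2: the shadow byte is a short granule size (1..TG-1) ...
  if (shadowTag == 0 || shadowTag >= kGranuleSize)
    return false;
  // ... the access stays within the short granule ...
  if (offsetInGranule + accessSize > shadowTag)
    return false;
  // ... and the real tag, kept in the granule's last byte, matches.
  return ptrTag == lastByteOfGranule;
}
```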

Pointer tags between 1 and `TG-1` are possible and are as likely as any other
tag. This means that these tags in memory have two interpretations: the full
tag interpretation (where the pointer tag is between 1 and `TG-1` and the
last byte of the granule is ordinary data) and the short tag interpretation
(where the pointer tag is stored in the granule).

When HWASAN detects an error near a memory tag between 1 and `TG-1`, it
will show both the memory tag and the last byte of the granule. Currently,
it is up to the user to disambiguate the two possibilities.

Because this functionality obsoletes the right aligned heap feature of
the HWASAN memory allocator (and because we can no longer easily test
it), the feature is removed.

Also update the documentation to cover both short granule tags and
outlined checks.

Differential Revision: https://reviews.llvm.org/D63908

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365551 91177308-0d34-0410-b5e6-96231b3b80d8

[PoisonChecking] Flesh out complete todo list for full coverage

Note: I don't actually plan to implement all of the cases at the moment, I'm just documenting them for completeness. There's a couple of cases left which are practically useful for me in debugging loop transforms, and I'll probably stop there for the moment.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365550 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AMDGPU][DAGCombiner] Move call to allowsMemoryAccess into isLoadBitCastBeneficial/isStoreBitCastBeneficial to allow X86 to bypass it

Basically the problem is that X86 doesn't set the Fast flag from
allowsMemoryAccess on certain CPUs due to slow unaligned memory
subtarget features. This prevents bitcasts from being folded into
loads and stores. But all vector loads and stores of the same width
are the same cost on X86.

This patch merges the allowsMemoryAccess call into isLoadBitCastBeneficial to allow X86 to skip it.

Differential Revision: https://reviews.llvm.org/D64295

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365549 91177308-0d34-0410-b5e6-96231b3b80d8

Fix build error for VC STL, use llvm::make_unique

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365548 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] gfx908 register file changes

Differential Revision: https://reviews.llvm.org/D64438

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365546 91177308-0d34-0410-b5e6-96231b3b80d8

[PoisonChecking] Support for out of bounds operands on shifts + insert/extractelement

These are sources of poison which don't come from flags, but are clearly documented in the LangRef. Left off support for scalable vectors for the moment, but should be easy to add if anyone is interested.
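
A minimal sketch of the two conditions involved, assuming fixed-length vectors only. The helper names are hypothetical and are not the pass's API; the pass emits equivalent checks as IR.

```cpp
#include <cstdint>

// A shift yields poison when the shift amount is greater than or equal to
// the bit width of the value being shifted.
bool shiftAmountIsPoison(unsigned bitWidth, uint64_t amount) {
  return amount >= bitWidth;
}

// insertelement/extractelement yield poison when the index is not less than
// the number of elements of a fixed-length vector.
bool elementIndexIsPoison(uint64_t numElements, uint64_t index) {
  return index >= numElements;
}
```

For example, shifting an i32 by 32 or indexing element 4 of a 4-element vector would be flagged.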

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365543 91177308-0d34-0410-b5e6-96231b3b80d8

Boilerplate for producing XCOFF object files from the PowerPC backend.

Stubs out a number of the classes needed to produce a new object file format
(XCOFF) for the powerpc-aix target. For testing input is an empty module which
produces an object file with just a file header.

Differential Revision: https://reviews.llvm.org/D61694

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365541 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] LowerToHorizontalOp - use count_if to count non-UNDEF ops. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365540 91177308-0d34-0410-b5e6-96231b3b80d8

[PoisonChecking] Add validation rules for "exact" on sdiv/udiv

As directly stated in the LangRef, no ambiguity here...
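
For reference, a small sketch of the rule: with "exact", the result is poison when the first operand is not a multiple of the second. The enum and function names are hypothetical; cases where the division itself is undefined behaviour are reported separately rather than as poison.

```cpp
#include <cstdint>
#include <limits>

enum class DivResult { Defined, Poison, ImmediateUB };

// udiv exact: poison if a is not a multiple of b.
DivResult classifyUDivExact(uint32_t a, uint32_t b) {
  if (b == 0)
    return DivResult::ImmediateUB; // division by zero is not poison
  return (a % b != 0) ? DivResult::Poison : DivResult::Defined;
}

// sdiv exact: same rule, with the extra signed-overflow case.
DivResult classifySDivExact(int32_t a, int32_t b) {
  if (b == 0)
    return DivResult::ImmediateUB;
  if (a == std::numeric_limits<int32_t>::min() && b == -1)
    return DivResult::ImmediateUB; // quotient does not fit in i32
  return (a % b != 0) ? DivResult::Poison : DivResult::Defined;
}
```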

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365538 91177308-0d34-0410-b5e6-96231b3b80d8

[ThinLTO] only emit used or referenced CFI records to index

Summary: We emit CFI_FUNCTION_DEFS and CFI_FUNCTION_DECLS to
distributed ThinLTO indices to implement indirect function call
checking. This change causes us to only emit entries for functions
that are either defined or used by the module we're writing the index
for (instead of all functions in the combined index), which can make
the indices substantially smaller.

Fixes PR42378.

Reviewers: pcc, vitalybuka, eugenis

Subscribers: mehdi_amini, hiraditya, dexonsmith, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63887

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365537 91177308-0d34-0410-b5e6-96231b3b80d8

Add a transform pass to make the executable semantics of poison explicit in the IR

Implements a transform pass which instruments IR such that poison semantics are made explicit. That is, it provides a (possibly partial) executable semantics for every instruction w.r.t. poison as specified in the LLVM LangRef. There are obvious parallels to the sanitizer tools, but this pass is focused purely on the semantics of LLVM IR, not any particular source language.

The target audience for this tool is developers working on or targeting LLVM from a frontend. The idea is to be able to take arbitrary IR (with the assumption of known inputs), and evaluate it concretely after having made poison semantics explicit to detect cases where either a) the original code executes UB, or b) a transform pass introduces UB which didn't exist in the original program.

At the moment, this is mostly the framework and still needs to be fleshed out. By reusing existing code we have decent coverage, but there are a lot of cases not yet handled. What's here is good enough to handle interesting cases though; for instance, one of the recent LFTR bugs, which involved UB being triggered by integer induction variables with nsw/nuw flags, would be reported by the current code.

(See comment in PoisonChecking.cpp for full explanation and context)

Differential Revision: https://reviews.llvm.org/D64215

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365536 91177308-0d34-0410-b5e6-96231b3b80d8

Try to appease the Windows build bots.

Several of the conditional operators committed in llvm-svn: 365524 fail to compile
on the Windows buildbots. Converting to an if and early return to try to fix.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365535 91177308-0d34-0410-b5e6-96231b3b80d8

[BPF] Fix a typo in the file name

Fixed the file name from BPFAbstrctMemberAccess.cpp to
BPFAbstractMemberAccess.cpp.

Signed-off-by: Yonghong Song <yhs@fb.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365532 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r365503.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365530 91177308-0d34-0410-b5e6-96231b3b80d8

[unittest] Add the missing bogus machine register info initialization.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365529 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] gfx908 target

Differential Revision: https://reviews.llvm.org/D64429

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365525 91177308-0d34-0410-b5e6-96231b3b80d8

[Object][XCOFF] Add support for 64-bit file header and section header dumping.

Adds a readobj dumper for 32-bit and 64-bit section header tables, and extends
support for the file-header dumping to include 64-bit object files. Also
refactors the binary file parsing to be done in a helper function in an attempt
to clean up error handling.

Differential Revision: https://reviews.llvm.org/D63843

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365524 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add tests for trunc(load); NFC

I'm not sure if transforming any of these is valid as
a target-independent fold, but we might as well have
a few tests here to confirm or deny our position.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365523 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix test failing since r365512

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365521 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[HardwareLoops] NFC - move hardware loop checking code to isHardwareLoopProfitable()"

This reverts commit d95557306585404893d610784edb3e32f1bfce18.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365520 91177308-0d34-0410-b5e6-96231b3b80d8

Add lit.local.cfg to llvm-objdump tests

Add configuration file to llvm-objdump tests to treat files with .yaml
extension as tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365519 91177308-0d34-0410-b5e6-96231b3b80d8

Remove a comment that has been obsolete since r327679

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365517 91177308-0d34-0410-b5e6-96231b3b80d8

[unittest] Add bogus register info.

Reviewers: dstenb

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64421

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365516 91177308-0d34-0410-b5e6-96231b3b80d8

Rename llvm/test/tools/llvm-pdbdump to llvm/test/tools/llvm-pdbutil

llvm-pdbdump was renamed to llvm-pdbutil long ago. This moves the tests
to where you'd expect them to be.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365515 91177308-0d34-0410-b5e6-96231b3b80d8

Make pdbdump-objfilename test work again

- The test had extension .yaml, which lit doesn't execute in this
directory. Rename to .test to make it run, and move the yaml bits
into a dedicated file, like with all other tests in this dir.

- llvm-pdbdump got renamed to llvm-pdbutil long ago, update test.

- -dbi-module-info got renamed in r305032, update test for this too.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365514 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Created a sub-register class for the return address operand in the return instruction.

Function return instruction lowering currently uses the fixed register pair s[30:31] for holding
the return address. It can be any SGPR pair other than the CSRs. Created an SGPR pair sub-register class
exclusive of the CSRs, and used this regclass while lowering the return instruction.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D63924

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365512 91177308-0d34-0410-b5e6-96231b3b80d8

[RISCV] Fix ICE in isDesirableToCommuteWithShift

Summary:
There was an error being thrown from isDesirableToCommuteWithShift in
some tests. This was tracked down to the method being called before
legalisation, with an extended value type, not a machine value type.

In the case I diagnosed, the error was only hit with an instruction sequence
involving `i24`s in the add and shift. `i24` is not a Machine ValueType, it is
instead an Extended ValueType which was causing the issue.

I have added a test to cover this case, and fixed the error in the callback.

Reviewers: asb, luismarques

Reviewed By: asb

Subscribers: hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64425

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365511 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][GlobalISel] Optimize conditional branches followed by unconditional branches

If we have an icmp->brcond->br sequence where the brcond just branches to the
next block jumping over the br, while the br takes the false edge, then we can
modify the conditional branch to jump to the br's target while inverting the
condition of the incoming icmp. This means we can eliminate the br as an
unconditional branch to the fallthrough block.
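
The rewrite can be pictured with a toy model of a block's terminators. Everything here (the `Pred` enum, `BlockTail`, the function name) is hypothetical and far simpler than the GlobalISel machine IR the patch operates on.

```cpp
#include <string>

enum class Pred { EQ, NE, LT, GE };

// Logical inverse of a comparison predicate.
Pred invert(Pred p) {
  switch (p) {
  case Pred::EQ: return Pred::NE;
  case Pred::NE: return Pred::EQ;
  case Pred::LT: return Pred::GE;
  case Pred::GE: return Pred::LT;
  }
  return p;
}

// Toy view of the end of a block: "brcond cond, condTarget; br brTarget".
struct BlockTail {
  Pred cond;
  std::string condTarget;
  bool hasBr;
  std::string brTarget;
};

// If the conditional branch only jumps to the next block (skipping the br),
// invert the condition, retarget the conditional branch to the br's target,
// and drop the br, which would now branch to the fallthrough block.
bool foldBranchOverFallthrough(BlockTail &t, const std::string &nextBlock) {
  if (!t.hasBr || t.condTarget != nextBlock)
    return false;
  t.cond = invert(t.cond);
  t.condTarget = t.brTarget;
  t.hasBr = false;
  t.brTarget.clear();
  return true;
}
```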

Differential Revision: https://reviews.llvm.org/D64354

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365510 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Show error in case of using FP64 mode on pre MIPS32R2 CPU

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365508 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Explicitly select `mips32r2` CPU for test cases that require 64-bit FPU. NFC

Support for 64-bit coprocessors on a 32-bit architecture
was added in `MIPS32 R2`.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365507 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Fixed tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365506 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombine] LoadedSlice - keep getOffsetFromBase() uint64_t offset. NFCI.

Keep the uint64_t type from getOffsetFromBase() to stop truncation/extension overflow warnings in MSVC in alignment math.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365504 91177308-0d34-0410-b5e6-96231b3b80d8

[BPF] Support for compile once and run everywhere

Introduction
============

This patch added initial support for bpf program compile once
and run everywhere (CO-RE).

The main motivation is for bpf program which depends on
kernel headers which may vary between different kernel versions.
The initial discussion can be found at https://lwn.net/Articles/773198/.

Currently, bpf program accesses kernel internal data structure
through bpf_probe_read() helper. The idea is to capture the
kernel data structure to be accessed through bpf_probe_read()
and relocate them on different kernel versions.

On each host, right before bpf program load, the bpf loader
will look at the types of the native linux through vmlinux BTF,
calculate the proper access offset and patch the instruction.

To accommodate this, three intrinsic functions
   preserve_{array,union,struct}_access_index
are introduced which in clang will preserve the base pointer,
struct/union/array access_index and struct/union debuginfo type
information. Later, bpf IR pass can reconstruct the whole gep
access chains without looking at gep itself.

This patch did the following:
  . An IR pass is added to convert preserve_*_access_index to
    a global variable whose name encodes the getelementptr
    access pattern. The global variable has metadata
    attached to describe the corresponding struct/union
    debuginfo type.
  . A SimplifyPatchable MachineInstruction pass is added
    to remove unnecessary loads.
  . The BTF output pass is enhanced to generate relocation
    records located in .BTF.ext section.

Typical CO-RE also needs support for global variables which can
be assigned different values on different hosts. For example,
the kernel version can be used to guard different versions of code.
This patch added the support for patchable externals as well.

Example
=======

The following is an example.

  struct pt_regs {
    long arg1;
    long arg2;
  };
  struct sk_buff {
    int i;
    struct net_device *dev;
  };

  #define _(x) (__builtin_preserve_access_index(x))
  static int (*bpf_probe_read)(void *dst, int size, const void *unsafe_ptr) =
          (void *) 4;
  extern __attribute__((section(".BPF.patchable_externs"))) unsigned __kernel_version;
  int bpf_prog(struct pt_regs *ctx) {
    struct net_device *dev = 0;

    // ctx->arg* does not need bpf_probe_read
    if (__kernel_version >= 41608)
      bpf_probe_read(&dev, sizeof(dev), _(&((struct sk_buff *)ctx->arg1)->dev));
    else
      bpf_probe_read(&dev, sizeof(dev), _(&((struct sk_buff *)ctx->arg2)->dev));
    return dev != 0;
  }

In the above, we want to translate the third argument of
bpf_probe_read() as relocations.

  -bash-4.4$ clang -target bpf -O2 -g -S trace.c

The compiler will generate two new subsections in .BTF.ext,
OffsetReloc and ExternReloc.
OffsetReloc records the structure member offset operations,
and ExternReloc records the external globals, where
only u8, u16, u32 and u64 are supported.

   BPFOffsetReloc Size
   struct SecOffsetReloc for ELF section #1
   A number of struct BPFOffsetReloc for ELF section #1
   struct SecOffsetReloc for ELF section #2
   A number of struct BPFOffsetReloc for ELF section #2
   ...
   BPFExternReloc Size
   struct SecExternReloc for ELF section #1
   A number of struct BPFExternReloc for ELF section #1
   struct SecExternReloc for ELF section #2
   A number of struct BPFExternReloc for ELF section #2

  struct BPFOffsetReloc {
    uint32_t InsnOffset;    ///< Byte offset in this section
    uint32_t TypeID;        ///< TypeID for the relocation
    uint32_t OffsetNameOff; ///< The string to traverse types
  };

  struct BPFExternReloc {
    uint32_t InsnOffset;    ///< Byte offset in this section
    uint32_t ExternNameOff; ///< The string for external variable
  };

Note that only externs with attribute section ".BPF.patchable_externs"
are considered for Extern Reloc which will be patched by bpf loader
right before the load.

For the above test case, two offset records and one extern record
will be generated:
  OffsetReloc records:
        .long   .Ltmp12                 # Insn Offset
        .long   7                       # TypeId
        .long   242                     # Type Decode String
        .long   .Ltmp18                 # Insn Offset
        .long   7                       # TypeId
        .long   242                     # Type Decode String

  ExternReloc record:
        .long   .Ltmp5                  # Insn Offset
        .long   165                     # External Variable

  In string table:
        .ascii  "0:1"                   # string offset=242
        .ascii  "__kernel_version"      # string offset=165

The default member offset can be calculated as
    the 2nd member offset (0 representing the 1st member) of struct "sk_buff".
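
To make the "0:1" string concrete, here is a rough sketch of how a consumer might split it into indices and look up a member offset. The helper names and the offset table are hypothetical; only the final index is interpreted (as the 0-based member index described above), and a real loader would take the member offsets from the host's BTF.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Split an access string such as "0:1" into its numeric components.
std::vector<uint32_t> parseAccessString(const std::string &s) {
  std::vector<uint32_t> indices;
  std::stringstream ss(s);
  std::string field;
  while (std::getline(ss, field, ':'))
    indices.push_back(static_cast<uint32_t>(std::stoul(field)));
  return indices;
}

// Look up the byte offset of the member named by the last index.
// memberOffsets is a hypothetical table of member offsets for the struct
// as laid out on the current host.
uint32_t memberOffsetFor(const std::string &accessString,
                         const std::vector<uint32_t> &memberOffsets) {
  const std::vector<uint32_t> indices = parseAccessString(accessString);
  return memberOffsets.at(indices.back());
}
```

With a table of {0, 8} for struct sk_buff (the layout the example's generated code assumes), "0:1" resolves to 8; a host with a different layout would supply a different table and get a different value to patch in.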

The asm code:
    .Ltmp5:
    .Ltmp6:
            r2 = 0
            r3 = 41608
    .Ltmp7:
    .Ltmp8:
            .loc    1 18 9 is_stmt 0        # t.c:18:9
    .Ltmp9:
            if r3 > r2 goto LBB0_2
    .Ltmp10:
    .Ltmp11:
            .loc    1 0 9                   # t.c:0:9
    .Ltmp12:
            r2 = 8
    .Ltmp13:
            .loc    1 19 66 is_stmt 1       # t.c:19:66
    .Ltmp14:
    .Ltmp15:
            r3 = *(u64 *)(r1 + 0)
            goto LBB0_3
    .Ltmp16:
    .Ltmp17:
    LBB0_2:
            .loc    1 0 66 is_stmt 0        # t.c:0:66
    .Ltmp18:
            r2 = 8
            .loc    1 21 66 is_stmt 1       # t.c:21:66
    .Ltmp19:
            r3 = *(u64 *)(r1 + 8)
    .Ltmp20:
    .Ltmp21:
    LBB0_3:
            .loc    1 0 66 is_stmt 0        # t.c:0:66
            r3 += r2
            r1 = r10
    .Ltmp22:
    .Ltmp23:
    .Ltmp24:
            r1 += -8
            r2 = 8
            call 4

For instruction .Ltmp12 and .Ltmp18, "r2 = 8", the number
8 is the structure offset based on the current BTF.
Loader needs to adjust it if it changes on the host.

For instruction .Ltmp5, "r2 = 0", the external variable
got a default value 0, loader needs to supply an appropriate
value for the particular host.

Compiling to generate object code and disassemble:
   0000000000000000 bpf_prog:
           0:       b7 02 00 00 00 00 00 00         r2 = 0
           1:       7b 2a f8 ff 00 00 00 00         *(u64 *)(r10 - 8) = r2
           2:       b7 02 00 00 00 00 00 00         r2 = 0
           3:       b7 03 00 00 88 a2 00 00         r3 = 41608
           4:       2d 23 03 00 00 00 00 00         if r3 > r2 goto +3 <LBB0_2>
           5:       b7 02 00 00 08 00 00 00         r2 = 8
           6:       79 13 00 00 00 00 00 00         r3 = *(u64 *)(r1 + 0)
           7:       05 00 02 00 00 00 00 00         goto +2 <LBB0_3>

    0000000000000040 LBB0_2:
           8:       b7 02 00 00 08 00 00 00         r2 = 8
           9:       79 13 08 00 00 00 00 00         r3 = *(u64 *)(r1 + 8)

    0000000000000050 LBB0_3:
          10:       0f 23 00 00 00 00 00 00         r3 += r2
          11:       bf a1 00 00 00 00 00 00         r1 = r10
          12:       07 01 00 00 f8 ff ff ff         r1 += -8
          13:       b7 02 00 00 08 00 00 00         r2 = 8
          14:       85 00 00 00 04 00 00 00         call 4

Instructions #2, #5 and #8 need relocation resolutions from the loader.

Signed-off-by: Yonghong Song <yhs@fb.com>
Differential Revision: https://reviews.llvm.org/D61524

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365503 91177308-0d34-0410-b5e6-96231b3b80d8

[ADT] Remove MSVC-only "no two-phase name lookup" typename path.

Now that we've dropped VS2015 support (D64326) we can use the regular codepath as VS2017+ correctly handles it

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365502 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Added tests for D64285

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365501 91177308-0d34-0410-b5e6-96231b3b80d8

[HardwareLoops] NFC - move hardware loop checking code to isHardwareLoopProfitable()

Differential Revision: https://reviews.llvm.org/D64197

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365497 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Add test for MVE and no floats. NFC

Adds a simple test that MVE with no floating point will be promoted correctly
to software float calls.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365496 91177308-0d34-0410-b5e6-96231b3b80d8

[InferFunctionAttrs] add more tests for dereferenceable; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365495 91177308-0d34-0410-b5e6-96231b3b80d8

[MIPS GlobalISel] Register bank select for G_PHI. Select i64 phi

Select gprb or fprb when def/use register operand of G_PHI is
used/defined by either:
copy to/from physical register or
instruction with only one mapping available for that use/def operand.

Integer s64 phi is handled with narrowScalar when mapping is applied,
produced artifacts are combined away. Manually set gprb to all register
operands of instructions created during narrowScalar.

Differential Revision: https://reviews.llvm.org/D64351

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365494 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Prepare some tests for store selection

Mostly these would fail due to trying to use SI with a flat
operation. Implementing global loads with MUBUF is more work than
flat, so these won't be handled in the initial load selection.

Others fail because store of s64 won't initially work, as the current
set of patterns expect everything to be turned into v2i32.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365493 91177308-0d34-0410-b5e6-96231b3b80d8

[MIPS GlobalISel] Regbanks for G_SELECT. Select i64, f32 and f64 select

Select gprb or fprb when def/use register operand of G_SELECT is
used/defined by either:
copy to/from physical register or
instruction with only one mapping available for that use/def operand.

Integer s64 select is handled with narrowScalar when mapping is applied,
produced artifacts are combined away. Manually set gprb to all register
operands of instructions created during narrowScalar.

For selection of floating point s32 or s64 select it is enough to set
fprb of appropriate size and selectImpl will do the rest.

Differential Revision: https://reviews.llvm.org/D64350

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365492 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Fix test

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365491 91177308-0d34-0410-b5e6-96231b3b80d8

[docs][llvm-dwarfdump] Fix wording

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365489 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Legalize more concat_vectors

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365488 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Improve regbankselect for icmp s16

Account for 64-bit scalar eq/ne when available.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365487 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Make s16 G_ICMP legal

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365486 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Select G_SUB

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365484 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Select G_UNMERGE_VALUES

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365483 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Select G_MERGE_VALUES

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365482 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r365453

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365481 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeGen] AccelTable - remove non-constexpr (MSVC) Atom defs

Now that we've dropped VS2015 support (D64326) we can enable the constexpr variables on MSVC builds as VS2017+ correctly handles them

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365477 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Implement sge/sgeu pseudo instructions

The `sge/sgeu Dst, Src1, Src2/Imm` pseudo instructions set register
`Dst` to 1 if register `Src1` is greater than or equal to `Src2/Imm` and
to 0 otherwise.
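
The semantics can be modelled in a few lines. This is a sketch of the behaviour only; deriving the result from a "set on less than" followed by a flip of the low bit is one natural way to express it and is not claimed to be the exact expansion the assembler emits.

```cpp
#include <cstdint>

// Models of "set on less than" for signed and unsigned operands.
uint32_t slt(int32_t a, int32_t b) { return a < b ? 1u : 0u; }
uint32_t sltu(uint32_t a, uint32_t b) { return a < b ? 1u : 0u; }

// sge: 1 if a >= b (signed), else 0. Uses a >= b  <=>  !(a < b).
uint32_t sge(int32_t a, int32_t b) { return slt(a, b) ^ 1u; }

// sgeu: 1 if a >= b (unsigned), else 0.
uint32_t sgeu(uint32_t a, uint32_t b) { return sltu(a, b) ^ 1u; }
```

The signed and unsigned forms differ only in how the operands are compared: `sge(-1, 0)` is 0, while `sgeu` on the same bit patterns is 1.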

Differential Revision: https://reviews.llvm.org/D64314

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365476 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Implement sgt/sgtu pseudo instructions with immediate operand

The `sgt/sgtu Dst, Src1, Src2/Imm` pseudo instructions set register
`Dst` to 1 if register `Src1` is greater than `Src2/Imm` and to 0 otherwise.

Differential Revision: https://reviews.llvm.org/D64313

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365475 91177308-0d34-0410-b5e6-96231b3b80d8

[docs][llvm-objdump] Make some wording improvements/simplifications.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365474 91177308-0d34-0410-b5e6-96231b3b80d8

OpaquePtr: pass type to CreateLoad. NFC.

This is the one place in LLVM itself that used the deprecated API for
CreateLoad, so I just added the type in.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365472 91177308-0d34-0410-b5e6-96231b3b80d8

[ADT] Enable ArrayRef/StringRef is_assignable tests on MSVC

Now that we've dropped VS2015 support (D64326) we can enable these static_asserts on MSVC builds as VS2017+ correctly handles them

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365471 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][AsmPrinter] Fix the formatting for the rL365467

In addition, fix the build failure for the 'unused'
variable. The variable was used inside the 'LLVM_DEBUG()'.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365469 91177308-0d34-0410-b5e6-96231b3b80d8

OpaquePtr: add Type parameter to Loads analysis API.

This makes the functions in Loads.h require a type to be specified
independently of the pointer Value so that when pointers have no structure
other than address-space, it can still do its job.

Most callers had an obvious memory operation handy to provide this type, but
SROA and ArgumentPromotion were doing more complicated analysis. They get
updated to merge the properties of the various instructions they were
considering.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365468 91177308-0d34-0410-b5e6-96231b3b80d8

[DwarfDebug] Dump call site debug info

Dump the DWARF information about call sites and call site parameters into
debug info sections.

The patch also provides an interface for the interpretation of instructions
that could load values of call site parameters in order to generate DWARF
about the call site parameters.

([13/13] Introduce the debug entry values.)

Co-authored-by: Ananth Sowda <asowda@cisco.com>
Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com>
Co-authored-by: Ivan Baev <ibaev@cisco.com>
Differential Revision: https://reviews.llvm.org/D60716

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365467 91177308-0d34-0410-b5e6-96231b3b80d8

[RISCV] Fix RISCVTTIImpl::getIntImmCost for immediates where getMinSignedBits() > 64

APInt::getSExtValue will assert if getMinSignedBits() > 64. This can happen,
for instance, if examining an i128. Avoid this assertion by checking
Imm.getMinSignedBits() <= 64 before doing
getTLI()->isLegalAddImmediate(Imm.getSExtValue()). We could directly check
getMinSignedBits() <= 12 but it seems better to reuse the isLegalAddImmediate
helper for this.
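
The guard can be illustrated without LLVM's APInt by modelling a 128-bit value as two 64-bit halves. All names here are hypothetical; the point is only that the narrowing to 64 bits is attempted after checking that it loses nothing.

```cpp
#include <cstdint>

// Toy two's-complement 128-bit integer.
struct Wide128 {
  int64_t hi;  // upper 64 bits
  uint64_t lo; // lower 64 bits
};

// True if the value is representable as a signed 64-bit integer, i.e. the
// upper half is a pure sign extension of the lower half.
bool fitsInSigned64(const Wide128 &v) {
  const bool loNegative = (v.lo >> 63) != 0;
  return v.hi == (loNegative ? -1 : 0);
}

// Stand-in for a 12-bit signed add-immediate legality check.
bool isLegalAddImmediate(int64_t imm) { return imm >= -2048 && imm <= 2047; }

// Check the width first; only then narrow and ask the legality helper.
bool isLegalWideAddImmediate(const Wide128 &v) {
  if (!fitsInSigned64(v))
    return false;
  return isLegalAddImmediate(static_cast<int64_t>(v.lo));
}
```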

Differential Revision: https://reviews.llvm.org/D64390

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365462 91177308-0d34-0410-b5e6-96231b3b80d8

[docs][llvm-nm] Improve some wording

In particular, the --debug-syms switch really doesn't have anything to
do with debuggers, so I've updated the document accordingly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365461 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] Simplify some calls to getSetCCResultType. NFC

DAGTypeLegalizer and SelectionDAGLegalize have helper
functions wrapping the call to TLI.getSetCCResultType(...).
Use those helpers in more places.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365456 91177308-0d34-0410-b5e6-96231b3b80d8

[LegalizeTypes] Fix saturation bug for smul.fix.sat

Summary:
Make sure we use SETGE instead of SETGT when checking
if the sign bit is zero at SMULFIXSAT expansion.

The faulty expansion occurred when doing "expand" of
SMULFIXSAT and the scale was exactly matching the
size of the smaller type. For example doing
  i64 Z = SMULFIXSAT X, Y, 32
and expanding X/Y/Z into using two i32 values.

The problem was that we sometimes did not saturate
to min when overflowing.

Here is an example using Q3.4 numbers:

Consider that we are multiplying X and Y.
  X = 0x80 (-8.0 as Q3.4)
  Y = 0x20 (2.0 as Q3.4)
To avoid loss of precision we do a widening
multiplication, getting a 16 bit result
  Z = 0xF000 (-16.0 as Q7.8)

To detect negative overflow we should check if
the five most significant bits in Z are less than -1.
Assume that we name the 4 most significant bits
as HH and the next 4 bits as HL. Then we can do the
check by examining if
(HH < -1) or (HH == -1 && "sign bit in HL is zero").

The fault was that we have been doing the check as
(HH < -1) or (HH == -1 && HL > 0)
instead of
(HH < -1) or (HH == -1 && HL >= 0).

In our example HH is -1 and HL is 0, so the old
code did not trigger saturation and simply truncated
the result to 0x00 (0.0). With the bugfix we instead
detect that we should saturate to min, and the result
will be set to 0x80 (-8.0).
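
The Q3.4 example can be replayed with a small model. This sketch is not the legalizer code: the function name and the `fixedCheck` switch are made up for illustration, and it relies on `>>` of a negative int being an arithmetic shift, which holds on common compilers.

```cpp
#include <cstdint>

// Saturating Q3.4 multiply using the HH/HL overflow test from the text.
// fixedCheck selects ">= 0" (the fix) or "> 0" (the old, faulty test).
int8_t smulFixSatQ34(int8_t x, int8_t y, bool fixedCheck) {
  // Widening multiply: the product is a Q7.8 value that fits in 16 bits.
  const int z = static_cast<int>(x) * static_cast<int>(y);
  // HH: the 4 most significant bits of the 16-bit product, as a signed value.
  const int hh = z >> 12;
  // HL: the next 4 bits, reinterpreted as a signed 4-bit value.
  int hl = (z >> 8) & 0xF;
  if (hl >= 8)
    hl -= 16;
  const bool hlSignBitZero = fixedCheck ? (hl >= 0) : (hl > 0);
  // Negative overflow: saturate to min.
  if (hh < -1 || (hh == -1 && hlSignBitZero))
    return INT8_MIN;
  // Positive overflow: saturate to max.
  if (hh > 0 || (hh == 0 && hl < 0))
    return INT8_MAX;
  // No overflow: drop the 4 extra fractional bits and truncate to 8 bits.
  return static_cast<int8_t>(static_cast<uint8_t>(z >> 4));
}
```

With X = 0x80 (-128) and Y = 0x20 (32), the fixed check returns -128 (0x80, i.e. -8.0), while the old check falls through to the truncation and returns 0.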

Reviewers: leonardchan, bevinh

Reviewed By: leonardchan

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64331

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365455 91177308-0d34-0410-b5e6-96231b3b80d8

Retire VS2015 Support

As proposed here: https://lists.llvm.org/pipermail/llvm-dev/2019-June/133147.html

This patch raises the minimum supported version to build LLVM/Clang to Visual Studio 2017.

Differential Revision: https://reviews.llvm.org/D64326

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365452 91177308-0d34-0410-b5e6-96231b3b80d8

[docs][llvm-dwarfdump] Make some option descriptions clearer and more precise

Some of the wording in the doc (taken largely from the help text), was a
little imprecise in some cases, so this patch makes it a little more
precise.

Reviewed by: JDevlieghere, probinson

Differential Revision: https://reviews.llvm.org/D64332

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365451 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-profdata] Don't make the output overwrite the input file.

Some file systems may not allow this behavior; the test fails on our internal
system ("Permission denied").

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365450 91177308-0d34-0410-b5e6-96231b3b80d8

Fixing @llvm.memcpy not honoring volatile.
This is explicitly not addressing target-specific code, or calls to memcpy.

Summary: https://bugs.llvm.org/show_bug.cgi?id=42254

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63215

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365449 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r364515 and r364524

Jordan reported a performance regression with r364515 on llvm-commits,
so back the patch out while it's investigated.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365448 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][PowerPC] Added a test to show current codegen of MachinePRE

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365447 91177308-0d34-0410-b5e6-96231b3b80d8

Reland "[LiveDebugValues] Emit the debug entry values"

Emit replacements for a clobbered parameter's location if the parameter
has an unmodified value throughout the function. This is the basic scenario
where we can use the debug entry values.

([12/13] Introduce the debug entry values.)

Co-authored-by: Ananth Sowda <asowda@cisco.com>
Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com>
Co-authored-by: Ivan Baev <ibaev@cisco.com>
Differential Revision: https://reviews.llvm.org/D58042

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365444 91177308-0d34-0410-b5e6-96231b3b80d8

[Loop Peeling] Add support for peeling of loops with multiple exits

This patch modifies the loop peeling transformation so that
it no longer expects the only loop exit to be from the latch.

It modifies only the transformation. Branch weights are still updated
only for the exit from the latch.

The motivation is that in a follow-up patch I plan to enable loop peeling for
loops with multiple exits, but only if every exit other than the one from the
latch goes to a block with a call to deopt.

For now this patch is NFC.

Reviewers: reames, mkuper, iajbar, fhahn
Reviewed By: reames, fhahn
Subscribers: zzheng, llvm-commits
Differential Revision: https://reviews.llvm.org/D63921

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365441 91177308-0d34-0410-b5e6-96231b3b80d8

Prepare for making SwitchInstProfUpdateWrapper strict

This patch removes the test part that relates to the non-strict
behavior of SwitchInstProfUpdateWrapper and changes
the assertion to llvm_unreachable() to allow the check in
release builds.
This patch prepares SwitchInstProfUpdateWrapper to become
strict with a one-line change. That is needed so that it can be
reverted easily if any failure arises.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365439 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopInfo] Update getExitEdges to accept vector of pairs for non const BasicBlock

D63921 requires getExitEdges to fill a vector of Edge pairs where
BasicBlocks are not constant.

The rest of the Loop API mostly returns non-const BasicBlocks, so for consistency
getExitEdges is modified to return non-const BasicBlocks as well.

This is an alternative solution to D64060.

Reviewers: reames, fhahn
Reviewed By: reames, fhahn
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D64309

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365437 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][PowerPC] Fixed unused variable 'NewInstr'.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365433 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Added td definitions for HW regs

Infrastructure work for future commit. NFC.

Differential Revision: https://reviews.llvm.org/D64370

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365432 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Always use s_memtime for readcyclecounter

Differential Revision: https://reviews.llvm.org/D64369

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365431 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC][Peephole] Combine extsw and sldi after instruction selection

Summary:
`extsw` and `sldi` are supposed to be combined if they are in the same
BB in instruction selection phase. This patch handles the case where
extsw and sldi are not in the same BB.

Differential Revision: https://reviews.llvm.org/D63806

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365430 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC][NFC] remove redundant function isVFReg().

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365429 91177308-0d34-0410-b5e6-96231b3b80d8

[MachinePipeliner] Fix Phi refers to Phi in same stage in 1st epilogue

Summary:
This is exposed by functional testing on PowerPC.
In some pipelined loops, a Phi referring to another Phi did not get the value
defined by that Phi, and hence got the wrong value later.

As the comment mentions, we should "use the value defined by the Phi,
unless we're generating the firstepilog and the Phi refers to a Phi
in a different stage.", so a Phi referring to a Phi in the same stage should
use the value defined by the Phi here.

Reviewers: bcahoon, hfinkel

Reviewed By: hfinkel

Subscribers: MaskRay, wuzish, nemanjai, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64035

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365428 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC][MachinePipeliner][NFC] Add a testcase for Phi bug.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365427 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Make sret parameter work with AddMissingPrototypes

Summary:
Even for functions with the `no-prototype` attribute, an argument can
carry the `sret` (structure return) attribute, which is an optimization
used when a function's return type is a struct. Fixes PR42420.

Reviewers: sbc100

Subscribers: dschuff, jgravelle-google, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64318

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365426 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopPred] Stylistic improvement to recently added NE/EQ normalization [NFC]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365425 91177308-0d34-0410-b5e6-96231b3b80d8

[BPF] add new intrinsics preserve_{array,union,struct}_access_index

For background of BPF CO-RE project, please refer to
  http://vger.kernel.org/bpfconf2019.html
In summary, BPF CO-RE intends to compile bpf programs
adjustable on struct/union layout change so the same
program can run on multiple kernels with adjustment
before loading based on native kernel structures.

In order to do this, we need to keep track of the GEP (getelementptr)
instruction base and result debuginfo types, so we
can adjust on the host based on kernel BTF info.
Capturing such information as an IR optimization is hard,
as various optimizations may have tweaked the GEP. Also, since a
union is replaced by a structure, it is impossible to track
the field index for union member accesses.

Three intrinsic functions, preserve_{array,union,struct}_access_index,
are introduced.
  addr = preserve_array_access_index(base, index, dimension)
  addr = preserve_union_access_index(base, di_index)
  addr = preserve_struct_access_index(base, gep_index, di_index)
here,
  base: the base pointer for the array/union/struct access.
  index: the last access index for array, the same for IR/DebugInfo layout.
  dimension: the array dimension.
  gep_index: the access index based on IR layout.
  di_index: the access index based on user/debuginfo types.

For example, consider the following program:
  $ cat test.c
  struct sk_buff {
     int i;
     int b1:1;
     int b2:2;
     union {
       struct {
         int o1;
         int o2;
       } o;
       struct {
         char flags;
         char dev_id;
       } dev;
       int netid;
     } u[10];
  };

  static int (*bpf_probe_read)(void *dst, int size, const void *unsafe_ptr)
      = (void *) 4;

  #define _(x) (__builtin_preserve_access_index(x))

  int bpf_prog(struct sk_buff *ctx) {
    char dev_id;
    bpf_probe_read(&dev_id, sizeof(char), _(&ctx->u[5].dev.dev_id));
    return dev_id;
  }
  $ clang -target bpf -O2 -g -emit-llvm -S -mllvm -print-before-all \
    test.c >& log

The generated IR looks like below:

  ...
  define dso_local i32 @bpf_prog(%struct.sk_buff*) #0 !dbg !15 {
    %2 = alloca %struct.sk_buff*, align 8
    %3 = alloca i8, align 1
    store %struct.sk_buff* %0, %struct.sk_buff** %2, align 8, !tbaa !45
    call void @llvm.dbg.declare(metadata %struct.sk_buff** %2, metadata !43, metadata !DIExpression()), !dbg !49
    call void @llvm.lifetime.start.p0i8(i64 1, i8* %3) #4, !dbg !50
    call void @llvm.dbg.declare(metadata i8* %3, metadata !44, metadata !DIExpression()), !dbg !51
    %4 = load i32 (i8*, i32, i8*)*, i32 (i8*, i32, i8*)** @bpf_probe_read, align 8, !dbg !52, !tbaa !45
    %5 = load %struct.sk_buff*, %struct.sk_buff** %2, align 8, !dbg !53, !tbaa !45
    %6 = call [10 x %union.anon]* @llvm.preserve.struct.access.index.p0a10s_union.anons.p0s_struct.sk_buffs(
         %struct.sk_buff* %5, i32 2, i32 3), !dbg !53, !llvm.preserve.access.index !19
    %7 = call %union.anon* @llvm.preserve.array.access.index.p0s_union.anons.p0a10s_union.anons(
         [10 x %union.anon]* %6, i32 1, i32 5), !dbg !53
    %8 = call %union.anon* @llvm.preserve.union.access.index.p0s_union.anons.p0s_union.anons(
         %union.anon* %7, i32 1), !dbg !53, !llvm.preserve.access.index !26
    %9 = bitcast %union.anon* %8 to %struct.anon.0*, !dbg !53
    %10 = call i8* @llvm.preserve.struct.access.index.p0i8.p0s_struct.anon.0s(
         %struct.anon.0* %9, i32 1, i32 1), !dbg !53, !llvm.preserve.access.index !34
    %11 = call i32 %4(i8* %3, i32 1, i8* %10), !dbg !52
    %12 = load i8, i8* %3, align 1, !dbg !54, !tbaa !55
    %13 = sext i8 %12 to i32, !dbg !54
    call void @llvm.lifetime.end.p0i8(i64 1, i8* %3) #4, !dbg !56
    ret i32 %13, !dbg !57
  }

  !19 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "sk_buff", file: !3, line: 1, size: 704, elements: !20)
  !26 = distinct !DICompositeType(tag: DW_TAG_union_type, scope: !19, file: !3, line: 5, size: 64, elements: !27)
  !34 = distinct !DICompositeType(tag: DW_TAG_structure_type, scope: !26, file: !3, line: 10, size: 16, elements: !35)

Note that @llvm.preserve.{struct,union}.access.index calls have metadata llvm.preserve.access.index
attached to instructions to provide struct/union debuginfo type information.

For &ctx->u[5].dev.dev_id,
  . The "%6 = ..." represents struct member "u" with index 2 for IR layout and index 3 for DI layout.
  . The "%7 = ..." represents array subscript "5".
  . The "%8 = ..." represents union member "dev" with index 1 for DI layout.
  . The "%10 = ..." represents struct member "dev_id" with index 1 for both IR and DI layout.

Basically, traversing the use-def chain recursively for the 3rd argument of bpf_probe_read() and
examining all preserve_*_access_index calls, the debuginfo struct/union/array access index
can be recovered.
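
As an illustration only, the traversal can be modeled on the IR above. This
is a sketch, not the BPF backend code: the DEFS table is a hand-written
stand-in for the use-def chain of %10, and di_access_path is a hypothetical
helper name. The walk is written as a loop rather than recursion because this
chain is linear.

```python
# Hand-written model of the defining instructions for the chain ending in %10:
# value -> (kind, base operand, remaining constant arguments)
DEFS = {
    "%6": ("struct", "%5", {"gep_index": 2, "di_index": 3}),
    "%7": ("array", "%6", {"dimension": 1, "index": 5}),
    "%8": ("union", "%7", {"di_index": 1}),
    "%9": ("bitcast", "%8", {}),
    "%10": ("struct", "%9", {"gep_index": 1, "di_index": 1}),
}


def di_access_path(value):
    """Walk from `value` back to its root, collecting debuginfo indices."""
    path = []
    while value in DEFS:
        kind, base, args = DEFS[value]
        if kind == "array":
            path.append(args["index"])      # same for IR and DI layout
        elif kind in ("struct", "union"):
            path.append(args["di_index"])   # index based on debuginfo types
        # Anything else (here, the bitcast) is looked through.
        value = base
    path.reverse()  # collected innermost-first; report outermost-first
    return value, path
```

For the 3rd argument of the call, di_access_path("%10") returns
("%5", [3, 5, 1, 1]), i.e. member "u", subscript 5, member "dev", member
"dev_id", starting from the loaded ctx pointer.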

The intrinsics also contain enough information to regenerate code for the IR layout.
For array and structure intrinsics, the proper GEP can be constructed.
For union intrinsics, replacing all uses of "addr" with "base" should be enough.

The test case ThinLTO/X86/lazyload_metadata.ll is adjusted to reflect the
new addition of the metadata.

Signed-off-by: Yonghong Song <yhs@fb.com>
Differential Revision: https://reviews.llvm.org/D61810

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365423 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopPred] Extend LFTR normalization to the inverse EQ case

A while back, I added support for NE latches formed by LFTR. I didn't think that quite through, as LFTR will also produce the inverse EQ form for some loops and I hadn't handled that. This change just adds handling for that case as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365419 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Fix a typo in a test file name

Reviewers: sbc100

Subscribers: dschuff, jgravelle-google, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64324

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365418 91177308-0d34-0410-b5e6-96231b3b80d8

Changing CodeView debug info type record representation in assembly files to make it more human-readable & editable & fixing bug introduced in r364987

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365417 91177308-0d34-0410-b5e6-96231b3b80d8

Let unaliased Args track which Alias they were created from, and use that in Arg::getAsString() for diagnostics

With this, `clang-cl /source-charset:utf-16 test.cc` now prints `invalid
value 'utf-16' in '/source-charset:utf-16'` instead of `invalid value
'utf-16' in '-finput-charset=utf-16'` before, and several other clang-cl
flags produce much less confusing output as well.

Fixes PR29106.

Since an arg and its alias can have different arg types (joined vs not)
and different values (because of AliasArgs<>), I chose to give the Alias
its own Arg object. For convenience, I just store the alias directly in
the unaliased arg – there aren't many arg objects at runtime, so that
seems ok.

Finally, I changed Arg::getAsString() to use the alias's representation
if it's present – that function was already documented as being the
suitable function for diagnostics, and most callers already used it for
diagnostics.

Implementation-wise, Arg::accept() previously used to parse things as
the unaliased option. The core of that switch is now extracted into a
new function acceptInternal() which parses as the _aliased_ option, and
the previously-intermingled unaliasing is now done as an explicit step
afterwards.

(This also changes one place in lld that didn't use getAsString() for
diagnostics, so that that one place now also prints the flag as the user
wrote it, not as it looks after it went through unaliasing.)
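
A rough model of the idea, for illustration only. This is not the LLVM Option
library API: the Arg class, its fields, and the parse() helper below are
simplified stand-ins, and the alias table is reduced to a prefix mapping.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Arg:
    spelling: str                   # option spelling, e.g. "-finput-charset="
    value: str                      # joined value, e.g. "utf-16"
    alias: Optional["Arg"] = None   # the arg as the user wrote it, if any

    def get_as_string(self) -> str:
        # Prefer the alias's representation for diagnostics when present.
        if self.alias is not None:
            return self.alias.get_as_string()
        return self.spelling + self.value


def parse(text, aliases, options):
    """Parse as the aliased option first, then unalias as an explicit step."""
    for alias_spelling, target_spelling in aliases.items():
        if text.startswith(alias_spelling):
            written = Arg(alias_spelling, text[len(alias_spelling):])
            return Arg(target_spelling, written.value, alias=written)
    for spelling in options:
        if text.startswith(spelling):
            return Arg(spelling, text[len(spelling):])
    raise ValueError("unknown argument: " + text)
```

With aliases = {"/source-charset:": "-finput-charset="}, parsing
"/source-charset:utf-16" yields an arg whose spelling is "-finput-charset="
but whose get_as_string() is "/source-charset:utf-16".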

Differential Revision: https://reviews.llvm.org/D64253

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365413 91177308-0d34-0410-b5e6-96231b3b80d8

[Attributor] Deduce the "returned" argument attribute

Deduce the "returned" argument attribute by collecting all potentially
returned values.

Not only the unique return value, if any, can be used by subsequent
attributes but also the set of all potentially returned values as well
as the mapping from returned values to return instructions that they
originate from (see AAReturnedValues::checkForallReturnedValues).

Change in statistics (-stats) for LLVM-TS + Spec2006, totaling ~19% more "returned" arguments.

  ADDED: attributor                   NumAttributesManifested                  n/a ->        637
  ADDED: attributor                   NumAttributesValidFixpoint               n/a ->      25545
  ADDED: attributor                   NumFnArgumentReturned                    n/a ->        637
  ADDED: attributor                   NumFnKnownReturns                        n/a ->      25545
  ADDED: attributor                   NumFnUniqueReturned                      n/a ->      14118
CHANGED: deadargelim                  NumRetValsEliminated                     470 ->        449 (    -4.468%)
REMOVED: functionattrs                NumReturned                              535 ->        n/a
CHANGED: indvars                      NumElimIdentity                          138 ->        164 (   +18.841%)

Reviewers: homerdin, hfinkel, fedor.sergeev, sanjoy, spatel, nlopes, nicholas, reames, efriedma, chandlerc

Subscribers: hiraditya, bollu, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D59919

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365407 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][GlobalISel] Use TST for comparisons when possible

Porting over the part of `emitComparison` in AArch64ISelLowering where we use
TST to represent a compare.

- Rename `tryOptCMN` to `tryFoldIntegerCompare`, since it now also emits TSTs
when possible.

- Add a utility function for emitting a TST with register operands.

- Rename opt-fold-cmn.mir to opt-fold-compare.mir, since it now also tests the
TST fold.

Differential Revision: https://reviews.llvm.org/D64371

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365404 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-profdata] Fix buildbot failure on llvm-clang-x86_64-expensive-checks-win

This fixes buildbot failure in LLVM on llvm-clang-x86_64-expensive-checks-win
from r365386.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365401 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Split extload/zextload local load patterns

This will help remove the custom load predicates, allowing the
global isel emitter to handle them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365398 91177308-0d34-0410-b5e6-96231b3b80d8

Add parentheses to silence warning.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365394 91177308-0d34-0410-b5e6-96231b3b80d8

Standardize on MSVC behavior for triples with no environment

Summary:
This makes it so that IR files using triples without an environment work
out of the box, without normalizing them.

Typically, the MSVC behavior is more desirable. For example, it tends to
enable things like constant merging, use of associative comdats, etc.

Addresses PR42491

Reviewers: compnerd

Subscribers: hiraditya, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64109

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365387 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-profdata] Handle the cases of overlapping input file and output file

Currently llvm-profdata does not expect the same file name for the input profile
and the output profile.
>llvm-profdata merge A.profraw B.profraw -o B.profraw
The above command runs successfully but the resulting B.profraw is not correct.
This patch fixes the issue by moving the initialization of writer after loading
the profile.

For the show command, the following will report a confusing error of
"Empty raw profile file":
>llvm-profdata show B.profraw -o B.profraw
It's harder to fix as we need to output something before loading the input profile.
I don't think that a fix for this is worth the effort. I just make the error explicit for
the show command.
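
The ordering problem in the merge case can be reproduced with plain text
files. This is an analogy under stated assumptions, not llvm-profdata code:
the files hold whitespace-separated counters instead of real profile data,
and all function names are invented for the sketch.

```python
import os
import tempfile


def load(path):
    """Read whitespace-separated counters from a text file."""
    with open(path) as f:
        return f.read().split()


def merge_writer_first(inputs, output):
    # Opening the output for writing truncates it before any input is read,
    # so an input that is also the output is already empty when loaded.
    with open(output, "w") as out:
        counts = [c for path in inputs for c in load(path)]
        out.write(" ".join(counts))


def merge_load_first(inputs, output):
    # Load every input first, and only then create the writer.
    counts = [c for path in inputs for c in load(path)]
    with open(output, "w") as out:
        out.write(" ".join(counts))


def demo(merge):
    """Merge A and B into B and return what B contains afterwards."""
    with tempfile.TemporaryDirectory() as tmp:
        a = os.path.join(tmp, "A.txt")
        b = os.path.join(tmp, "B.txt")
        with open(a, "w") as f:
            f.write("1 2")
        with open(b, "w") as f:
            f.write("3 4")
        merge([a, b], b)
        return load(b)
```

Here demo(merge_writer_first) returns ['1', '2'], silently losing B's
counters, while demo(merge_load_first) returns ['1', '2', '3', '4'].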

Differential Revision: https://reviews.llvm.org/D64360

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365386 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "Reapply [llvm-ar][test] Increase llvm-ar test coverage"

llvm-ar.extract.test has been failing on greendragon and gone unfixed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365383 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] fold insertelement into splat of same scalar

Forming the canonical splat shuffle improves analysis and
may allow follow-on transforms (although some possibilities
are missing as shown in the test diffs).

The backend generically turns these patterns into build_vector,
so there should be no codegen regressions. All targets are
expected to be able to lower splats efficiently.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365379 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix unused variable in release build

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365378 91177308-0d34-0410-b5e6-96231b3b80d8