Rename option to -lto-pass-remarks-output

The new option -pass-remarks-output broke LLVM_LINK_LLVM_DYLIB because
of the duplicate option name with opt.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287627 91177308-0d34-0410-b5e6-96231b3b80d8

[TableGen][ISel] When factoring ScopeMatcher, if the child of the ScopeMatcher we're working on is also a ScopeMatcher, merge all its children into the one we're working on.

There were several cases in X86 where we were unable to fully factor a ScopeMatcher but created nested ScopeMatchers for some portions of it. Then we created a SwitchType that split it up and further factored it so that we ended up with something like this:

SwitchType
  Scope
    Scope
      Sequence of matchers
      Some other sequence of matchers
    EndScope
    Another sequence of matchers
  EndScope
...Next type

This change turns it into this:

SwitchType
  Scope
    Sequence of matchers
    Some other sequence of matchers
    Another sequence of matchers
  EndScope
...Next type

Several other in-tree targets had similar nested scopes like this. Overall this doesn't save many bytes, but makes the isel output a little more regular.
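
The merge described above can be sketched roughly as follows. This is only an illustration: `Node` and `flattenScope` are hypothetical names rather than TableGen's real matcher classes, and the sketch also flattens deeper nesting recursively, which the commit message does not state.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a matcher node (not TableGen's Matcher class).
struct Node {
  bool IsScope;               // true for a Scope, false for a plain matcher
  std::string Name;           // label for plain matchers
  std::vector<Node> Children; // only used when IsScope is true
};

// Replace every child that is itself a scope with that child's children,
// keeping the original order.
void flattenScope(Node &Scope) {
  std::vector<Node> Flat;
  for (Node &Child : Scope.Children) {
    if (Child.IsScope) {
      flattenScope(Child); // assumption: flatten deeper levels too
      for (Node &Grand : Child.Children)
        Flat.push_back(std::move(Grand));
    } else {
      Flat.push_back(std::move(Child));
    }
  }
  Scope.Children = std::move(Flat);
}
```

Applied to the nested example, the inner scope's two sequences and the outer scope's remaining sequence end up as three direct children of one scope.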

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287624 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove alternate CodeGenOnly version of (v)movq that declared the load size as i128mem. Change all uses to use the i64mem version.

I'm sure this caused the load size to misprint in Intel syntax output. We were also inconsistent about which patterns used which instruction between VEX and EVEX.

There are two different reg/reg versions of movq, one from a GPR and one from the lower 64 bits of an XMM register. This changes the load folding table to use the single i64mem memory form for folding both cases. But we need to use TB_NO_REVERSE to prevent a duplicate entry in the unfolding table.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287622 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Add support for commuting VPERMT2(B/W/D/Q/PS/PD) to/from VPERMI2(B/W/D/Q/PS/PD).

Summary:
The index and one of the table operands can be swapped by changing the opcode to the other version. Neither of these operands is the one that can load from memory, so this can't be used to increase memory folding opportunities.

We need to handle the unmasked forms and the kz forms. Since the load operand isn't being commuted we can commute the load and broadcast instructions too.

Reviewers: igorb, delena, Ayal, Farhana, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25652

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287621 91177308-0d34-0410-b5e6-96231b3b80d8

MC: ensure that we have a section before accessing it

We would attempt to access the symbol section without ensuring that the symbol
was not absolute. The relocation referenced by the assembler is not evaluated to
an absolute value, but when we record the relocation, we would query the section.
Because the symbol is absolute, it does not have a section associated with it,
triggering an assertion. Just be more careful about the access of the section.

Addresses PR31064!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287619 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Add support for changing the element size of PALIGNR/VALIGND/VALIGNQ shuffles if they feed a vselect with a different type

Summary:
Shuffle lowering widens the element size of a shuffle if elements are contiguous. This is sometimes helpful because wider element types have more shuffle options. If the shuffle is one of the arguments to a vselect, this shuffle widening can introduce a bitcast between the vselect and the shuffle. This will prevent isel from selecting a masked operation. If the shuffle can be written equally efficiently with a different element size to match the vselect type, we should change the shuffle type to allow masking.

This patch does this conversion for all VALIGND/VALIGNQ sizes. It also supports turning 128-bit PALIGNR into VALIGND/VALIGNQ. This fixes the case shown in PR31018.

I plan to add support for more operations in future patches.

Reviewers: RKSimon, zvi, delena

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26902

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287612 91177308-0d34-0410-b5e6-96231b3b80d8

Object: Make SymbolicFile::symbol_{begin,end}() virtual and remove unnecessary wrappers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287611 91177308-0d34-0410-b5e6-96231b3b80d8

[ADT] Add initializer list support to SmallPtrSet so that sets can be
easily initialized with some initial values.
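
As a rough illustration of what initializer list support looks like, the toy class below adds a `std::initializer_list` constructor to a pointer set. `TinyPtrSet` is a hypothetical stand-in backed by `std::set`; it is not `llvm::SmallPtrSet` and says nothing about how the real constructor is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <set>

// Toy pointer set used only to show the constructor shape.
template <typename PtrT> class TinyPtrSet {
  std::set<PtrT> Storage;

public:
  TinyPtrSet() = default;

  // Insert every element of the braced list; duplicates collapse.
  TinyPtrSet(std::initializer_list<PtrT> IL) : Storage(IL.begin(), IL.end()) {}

  bool contains(PtrT P) const { return Storage.count(P) != 0; }
  std::size_t size() const { return Storage.size(); }
};
```

With such a constructor a set can be written as `TinyPtrSet<int *> S = {&A, &B};` instead of a default construction followed by repeated inserts.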

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287610 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Fix multiple vreg definitions in si-lower-control-flow

Differential Revision: https://reviews.llvm.org/D26939

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287608 91177308-0d34-0410-b5e6-96231b3b80d8

Analysis: gep inbounds (gep inbounds (...)) is inbounds.

Differential Revision: https://reviews.llvm.org/D26441

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287604 91177308-0d34-0410-b5e6-96231b3b80d8

Remove LLVM_NODISCARD in one more place.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287596 91177308-0d34-0410-b5e6-96231b3b80d8

Remove LLVM_NODISCARD from two more StringRef members.

This should be everything.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287594 91177308-0d34-0410-b5e6-96231b3b80d8

DAG: Ignore call site attributes when emitting target intrinsic

A target intrinsic may be defined as possibly reading memory,
but the call site may have additional knowledge that it doesn't read
memory. The intrinsic lowering will expect the pessimistic
assumption of the intrinsic definition, so the chain should
still be used.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287593 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64LoadStoreOptimizer] Don't treat write to XZR/WZR as a clobber.

Summary:
When searching for load/store instructions to pair/merge don't treat
writes to WZR/XZR as clobbers since they don't change the value read
from WZR/XZR (which is always 0).
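
A minimal sketch of the idea, assuming a made-up register enum and instruction record (none of these names come from the AArch64 backend): a write to the zero register never counts as a clobber, because a later read still yields 0.

```cpp
#include <cassert>
#include <vector>

// Hypothetical register numbering for illustration only.
enum Reg : unsigned { X0, X1, X2, XZR };

struct Inst {
  std::vector<Reg> Defs; // registers written
  std::vector<Reg> Uses; // registers read (unused by the check below)
};

// Returns true if some instruction in Between writes R in a way that
// changes what a later read of R would observe.
bool isClobbered(Reg R, const std::vector<Inst> &Between) {
  if (R == XZR)
    return false; // reads of the zero register always yield 0
  for (const Inst &I : Between)
    for (Reg D : I.Defs)
      if (D == R)
        return true;
  return false;
}
```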

Reviewers: mcrosier, junbuml, jmolloy, t.p.northover

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D26921

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287592 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeGenPrepare] Don't sink non-cheap addrspacecasts.

Summary:
Previously, CGP would unconditionally sink addrspacecast instructions,
even going so far as to sink them into a loop.

Now we check that the cast is "cheap", as defined by TLI.

We introduce a new "is-cheap" function to TLI rather than using
isNopAddrSpaceCast because some GPU platforms want the ability to ask
for non-nop casts to be sunk.

Reviewers: arsenm, tra

Subscribers: jholewinski, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D26923

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287591 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeGenPrepare] Rewrite a loop in terms of llvm::none_of. NFC.
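
For readers unfamiliar with the idiom, here is a small sketch of this kind of rewrite. It uses `std::none_of` to stay self-contained (`llvm::none_of` is the range-based wrapper); the predicate and function names are invented for the example and are unrelated to the actual loop in CodeGenPrepare.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hand-written loop: true when no element is negative.
bool noNegativesLoop(const std::vector<int> &V) {
  for (int X : V)
    if (X < 0)
      return false;
  return true;
}

// The same check expressed with the standard algorithm.
bool noNegativesAlgo(const std::vector<int> &V) {
  return std::none_of(V.begin(), V.end(), [](int X) { return X < 0; });
}
```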

Reviewers: arsenm

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D26924

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287590 91177308-0d34-0410-b5e6-96231b3b80d8

Remove LLVM_NODISCARD from getAsInteger().

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287589 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopReroll] Make root-finding more aggressive.

Allow using an instruction other than a mul or phi as the base for
root-finding. For example, the included testcase includes a loop
which requires using a getelementptr as the base for root-finding.

Differential Revision: https://reviews.llvm.org/D26529

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287588 91177308-0d34-0410-b5e6-96231b3b80d8

Fix attribute list syntax.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287587 91177308-0d34-0410-b5e6-96231b3b80d8

Remove LLVM_NODISCARD from StringRef.

This is a bit too aggressive of a warning, as it forces
ANY function which returns a StringRef to have its return
value checked. While useful on classes like llvm::Error which
are designed to require checking, this is not the case for
StringRef, and it is perfectly reasonable to have a function
return a StringRef for which the return value is not checked.

Move LLVM_NODISCARD to each of the individual member functions
where it makes sense instead.
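
The difference between the two placements can be sketched with the standard `[[nodiscard]]` attribute (C++17), which is assumed here to behave like the LLVM_NODISCARD macro. `TinyRef` and `logAndReturn` are hypothetical; which StringRef members actually keep the attribute is not listed in this message.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy string view, not llvm::StringRef. The class is not marked
// [[nodiscard]]; only the member whose result is pointless to ignore is.
class TinyRef {
  const char *Data;
  std::size_t Len;

public:
  TinyRef(const char *D, std::size_t L) : Data(D), Len(L) {}

  // Ignoring this result is almost certainly a bug, so warn about it.
  [[nodiscard]] TinyRef dropFront(std::size_t N) const {
    return N >= Len ? TinyRef(Data + Len, 0) : TinyRef(Data + N, Len - N);
  }

  std::size_t size() const { return Len; }
  std::string str() const { return std::string(Data, Len); }
};

// Returns a TinyRef; callers may ignore the result without a warning
// because the type itself carries no attribute.
TinyRef logAndReturn(TinyRef R) { return R; }
```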

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287586 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] canonicalize min/max constant to select's false value

This is a first step towards canonicalization and improved folding/codegen
for integer min/max as discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2016-November/106868.html

Here, we're just matching the simplest min/max patterns and adjusting the
icmp predicate while swapping the select operands.
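
A scalar sketch of the two shapes involved, written as plain C++ rather than IR. The constant 42 and the function names are arbitrary, and the exact predicate InstCombine chooses is not spelled out in this message; the point is only that both forms compute the same signed max.

```cpp
#include <cassert>

constexpr int C = 42; // arbitrary constant for the example

// Constant in the select's true arm: (x < C) ? C : x
int smaxConstTrue(int X) { return X < C ? C : X; }

// Predicate adjusted and operands swapped, so the constant is the
// select's false arm: (x > C) ? x : C
int smaxConstFalse(int X) { return X > C ? X : C; }
```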

I've included FIXME tests in test/Transforms/InstCombine/select_meta.ll
so it's easier to see how this might be extended (corresponds to the TODO
comment in the code). That's also why I'm using matchSelectPattern()
rather than a simpler check; once the backend is patched, we can just
remove some of the restrictions to allow the obfuscated min/max patterns
in the FIXME tests to be matched.

Differential Revision: https://reviews.llvm.org/D26525

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287585 91177308-0d34-0410-b5e6-96231b3b80d8

LSR debug fix.

Summary:
Dump instruction instead of address.
Reviewers: hfinkel

Differential Revision: http://reviews.llvm.org/D26877

From: Evgeny Stupachenko <evstupac@gmail.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287584 91177308-0d34-0410-b5e6-96231b3b80d8

reassociate-deadinst.ll: avoid accidental match on path

Pipe from stdin to avoid accidentally matching on the path.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287583 91177308-0d34-0410-b5e6-96231b3b80d8

fix formatting; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287582 91177308-0d34-0410-b5e6-96231b3b80d8

[asan] Make ASan compatible with linker dead stripping on Windows

Summary:
This is similar to what was done for Darwin in rL264645 /
http://reviews.llvm.org/D16737, but it uses COFF COMDATs to achieve the
same result instead of relying on new custom linker features.

As on MachO, this creates one metadata global per instrumented global.
The metadata global is placed in the custom .ASAN$GL section, which the
ASan runtime will iterate over during initialization. There are no other
references to the metadata, so normal linker dead stripping would
discard it. However, the metadata is put in a COMDAT group with the
instrumented global, so that it will be discarded if and only if the
instrumented global is discarded.

I didn't update the ASan ABI version check since this doesn't affect
non-Windows platforms, and the WinASan ABI isn't really stable yet.

Implementing this for ELF will require extending LLVM IR and MC a bit so
that we can use non-COMDAT section groups.

Reviewers: pcc, kcc, mehdi_amini, kubabrecka

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26770

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287576 91177308-0d34-0410-b5e6-96231b3b80d8

[MemorySSA] Fix unit tests broken by D26704

Summary:
D26704 fixed the non-determinism in codegen by sorting basic blocks before
iteration so as to have a defined iteration order. As a result we need to fix
the names (numbers) of the temporaries in the following unit tests:
test/Transforms/Util/MemorySSA/multi-edges.ll
test/Transforms/Util/MemorySSA/multiple-backedges-hal.ll

Reviewers: dberlin, david2050, mgrang

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26926

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287575 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Add tests for half precision floating point support.

These should have been part of r287349.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287574 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] seq macro support

This patch adds the seq macro.

This partially resolves PR/30381.

Thanks to Sean Bruno for reporting the issue!

Reviewers: zoran.jovanovic, vkalintiris, seanbruno

Differential Revision: https://reviews.llvm.org/D24607

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287573 91177308-0d34-0410-b5e6-96231b3b80d8

Check proper live range in extendPHIRanges

The function extendPHIRanges checks the main range of the original live
interval, even when dealing with a subrange. This could also lead to an
assert when the subrange is not live at the extension point, but the
main range is. To avoid this, check the corresponding subrange of the
original live range, instead of always checking the main range.

Review (as a part of a bigger set of changes):
https://reviews.llvm.org/D26359

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287571 91177308-0d34-0410-b5e6-96231b3b80d8

[TLI] Fix breakage introduced by D21739.

The initialize function has an early return for AMDGPU targets. If taken,
the ShouldExtI32* initialization code will not be executed, resulting in
invalid values in the corresponding fields. Fix this by moving the code
to the top of the function.
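
The shape of the fix can be sketched as below. `Info`, its fields and the parameters are hypothetical and much simpler than TargetLibraryInfo's real initialize function; the sketch only shows the ordering that makes the early return safe.

```cpp
#include <cassert>

// Hypothetical miniature of the state being initialized.
struct Info {
  bool ShouldExtI32Param = false;
  bool OtherSetupDone = false;
};

void initialize(Info &I, bool IsEarlyReturnTarget, bool TargetWantsExt) {
  // Set the extension flag first so that every path, including the
  // early return below, leaves it with a meaningful value.
  I.ShouldExtI32Param = TargetWantsExt;

  if (IsEarlyReturnTarget)
    return;

  I.OtherSetupDone = true;
}
```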

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287570 91177308-0d34-0410-b5e6-96231b3b80d8

[AsmPrinter] Enable codeview for windows-itanium

Enable codeview emission for windows-itanium targets. Co-opt an existing
test (which is derived from a C source file and should therefore be
identical across the Itanium and MS ABIs).

Differential Revision: https://reviews.llvm.org/D26693

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287567 91177308-0d34-0410-b5e6-96231b3b80d8

[MemorySSA] Fix for non-determinism in codegen

This patch fixes the non-determinism caused by iterating over SmallPtrSets,
which was uncovered by the experimental "reverse iteration order" patch:
https://reviews.llvm.org/D26718

The following unit tests failed because of the undefined order of iteration.
LLVM :: Transforms/Util/MemorySSA/cyclicphi.ll
LLVM :: Transforms/Util/MemorySSA/many-dom-backedge.ll
LLVM :: Transforms/Util/MemorySSA/many-doms.ll
LLVM :: Transforms/Util/MemorySSA/phi-translation.ll
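
One common way to get a defined order out of a pointer set is to copy the elements out and sort them by a stable key before visiting them. The sketch below assumes a hypothetical `Block` with a stable `Id`; the sort key the actual patch uses is not given in this message.

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_set>
#include <vector>

struct Block {
  int Id; // hypothetical stable number, independent of the address
};

// Copy the set's elements and sort by Id, so the visit order no longer
// depends on pointer values or hash-table layout.
std::vector<const Block *>
inStableOrder(const std::unordered_set<const Block *> &Set) {
  std::vector<const Block *> Order(Set.begin(), Set.end());
  std::sort(Order.begin(), Order.end(),
            [](const Block *A, const Block *B) { return A->Id < B->Id; });
  return Order;
}
```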

Reviewers: dberlin, mgrang

Subscribers: dberlin, llvm-commits, david2050

Differential Revision: https://reviews.llvm.org/D26704

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287563 91177308-0d34-0410-b5e6-96231b3b80d8

[VectorLegalizer] Remove EVT::getSizeInBits code duplications. NFCI.

We were calling SVT.getSizeInBits() several times in a row - just call it once and reuse the result.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287556 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeGenPrep] Skip merging empty case blocks

Summary: Merging an empty case block into the header block of switch could cause
ISel to add COPY instructions in the header of switch, instead of the case
block, if the case block is used as an incoming block of a PHI. This could
potentially increase dynamic instructions, especially when the switch is in a
loop. I added a test case which was reduced from the benchmark I was targeting.

Reviewers: t.p.northover, mcrosier, manmanren, wmi, davidxl

Subscribers: qcolombet, danielcdh, hfinkel, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D22696

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287553 91177308-0d34-0410-b5e6-96231b3b80d8

small fixup which enables the issuing of the aforementioned instruction (w/o operands), on MS/Intel syntax.

Differential Revision: https://reviews.llvm.org/D26913

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287548 91177308-0d34-0410-b5e6-96231b3b80d8

Fix known zero bits for addrspacecast.

Currently LLVM assumes that a pointer addrspacecasted to a different address space is bitwise equivalent to a trunc or zext, which is not true. For example, on the amdgcn target, when a null pointer is addrspacecasted from address space 4 to 0, its value changes from i64 0 to i32 -1.

This patch teaches LLVM not to derive the known bits of an addrspacecast instruction from its operand.

Differential Revision: https://reviews.llvm.org/D26803

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287545 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Add SSE reciprocal estimate tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287543 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] Add ComputeNumSignBits support for CONCAT_VECTORS opcode

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287541 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-cov] Avoid 0% when reporting something that's 0/0

This commit makes llvm-cov avoid showing 0% (0/0) coverage for things
like file function coverage, etc. in reports and HTML output. This can happen
for files like headers that have macros but no functions. This commit makes
llvm-cov report - (0/0) instead.
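
A minimal sketch of the reporting rule, assuming a hypothetical helper `formatCoverage` and a two-decimal percentage format. llvm-cov's real formatting code and precision are not shown in this message.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical formatter for a "covered out of total" pair.
std::string formatCoverage(unsigned Covered, unsigned Total) {
  char Buf[64];
  if (Total == 0) {
    // Nothing to cover: print a dash instead of a misleading 0%.
    std::snprintf(Buf, sizeof(Buf), "- (%u/%u)", Covered, Total);
  } else {
    double Pct = 100.0 * Covered / Total;
    std::snprintf(Buf, sizeof(Buf), "%.2f%% (%u/%u)", Pct, Covered, Total);
  }
  return Buf;
}
```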

rdar://29246480

Differential Revision: https://reviews.llvm.org/D26615

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287539 91177308-0d34-0410-b5e6-96231b3b80d8

Adjust arm64-irtranslator.ll test to changes from r287368

The test is currently broken, and this CL should fix it.

Patch by Adrian Kuegel!

Differential Revision: https://reviews.llvm.org/D26910

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287536 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Allow PACKSS to be used to truncate any type of all/none sign bits input

At the moment we only use truncateVectorCompareWithPACKSS with direct vector comparison results (just one example of a known all/none signbits input).

This change relaxes the direct matching of a SETCC opcode by moving the logic up into SelectionDAG::ComputeNumSignBits and accepting any input with a known splatted signbit.
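
To see why the input must be all/none sign bits, here is a scalar model of one lane, assuming the 32-to-16-bit case. The function names are made up; the model only relies on PACKSS performing signed saturation.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of one lane of a signed-saturating pack, 32 to 16 bits.
std::int16_t packss32to16(std::int32_t V) {
  if (V > INT16_MAX)
    return INT16_MAX;
  if (V < INT16_MIN)
    return INT16_MIN;
  return static_cast<std::int16_t>(V);
}

// Plain truncation keeps only the low 16 bits.
std::int16_t trunc32to16(std::int32_t V) {
  return static_cast<std::int16_t>(static_cast<std::uint16_t>(V));
}
```

For lanes that are all zeros or all ones (0 or -1) the two functions agree, so the pack can stand in for a truncate. For a value such as 0x10000 they differ: saturation gives 32767 while truncation gives 0.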

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287535 91177308-0d34-0410-b5e6-96231b3b80d8

[InstrProfiling] Mark __llvm_profile_instrument_target last parameter as i32 zeroext if appropriate.

On some architectures (s390x, ppc64, sparc64, mips), C-level int is passed
as i32 signext instead of plain i32. Likewise, unsigned int may be passed
as i32, i32 signext, or i32 zeroext depending on the platform. Mark
__llvm_profile_instrument_target properly (its last parameter is unsigned
int).

This (together with the clang change) makes compiler-rt profile testsuite pass
on s390x.

Differential Revision: http://reviews.llvm.org/D21736

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287534 91177308-0d34-0410-b5e6-96231b3b80d8

[TLI] Add functions determining if int parameters/returns should be zeroext/signext.

On some architectures (s390x, ppc64, sparc64, mips), C-level int is passed
as i32 signext instead of plain i32. Likewise, unsigned int may be passed
as i32, i32 signext, or i32 zeroext depending on the platform. Add this
information to TargetLibraryInfo, to be used whenever some LLVM pass
inserts a compiler-rt call to a function involving int parameters
or returns.

Differential Revision: http://reviews.llvm.org/D21739

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287533 91177308-0d34-0410-b5e6-96231b3b80d8

Fixing a small typo (A->U).
This seems to fix PR30992.

- HasAVX512 ? X86::VMOVAPSZ128rm_NOVLX
+ HasAVX512 ? X86::VMOVUPSZ128rm_NOVLX

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287532 91177308-0d34-0410-b5e6-96231b3b80d8

[Sparc] Use target name instead of namespace as prefix for MCRegisterClasses array

Summary:
For Sparc the namespace (SP) is different from the target name (Sparc),
which causes the name of the array in this declaration to differ from
the name used in the definition.

Patch by Daniel Cederman.

Reviewers: jyknight

Subscribers: llvm-commits, jyknight

Differential Revision: https://reviews.llvm.org/D23650

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287528 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Add EVEX form of VMOVZPQILo2PQIZrm to load folding tables to match SSE and AVX.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287523 91177308-0d34-0410-b5e6-96231b3b80d8

[bpf] attempt to fix big-endian bots

attempt to fix big-endian bots failing on new dwarfdump test

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287522 91177308-0d34-0410-b5e6-96231b3b80d8

[bpf] fix dwarf elf relocs and line numbers

- teach RelocVisitor to recognize bpf relocations
- fix AsmInfo->PointerSize to make sure dwarf is emitted correctly
- add a test for the above

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287521 91177308-0d34-0410-b5e6-96231b3b80d8

[TableGen][ISel] Do a better job of factoring ScopeMatchers created during creation of SwitchTypeMatcher.

Previously we were factoring when the ScopeMatcher was initially created, but it might get more Matchers added to it later. Delay factoring until we have fully created/populated the ScopeMatchers.

This reduces X86 isel tables by 154 bytes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287520 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove duplicate instructions for (v)movq and replace with patterns on other instructions. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287519 91177308-0d34-0410-b5e6-96231b3b80d8

[XRay][AArch64] Implemented a test for the compile-time sleds emitted, and fixed a bug in the jump instruction

This patch adds a test for the assembly code emitted with XRay
instrumentation. It also fixes a bug in the operand of a jump
instruction: the operand must be the number of 4-byte instructions
to jump over, not the number of bytes.
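
The unit conversion behind the fix can be sketched as below. `bytesToInstrCount` is a hypothetical helper, not code from the XRay lowering; it only shows that a byte distance has to be divided by the 4-byte instruction width before it is used as the operand.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kInstrBytes = 4; // fixed instruction width in bytes

// Convert a byte distance into the number of instructions to skip.
std::uint32_t bytesToInstrCount(std::uint32_t Bytes) {
  assert(Bytes % kInstrBytes == 0 && "distance must be whole instructions");
  return Bytes / kInstrBytes;
}
```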

Author: rSerge

Reviewers: dberris, rengolin

Differential Revision: https://reviews.llvm.org/D26805

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287516 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalSplit] Port to the new pass manager.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287511 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Restrict tail call optimization

The tail call optimization was being used without proper consideration of
ABI requirements for saving and restoring the GP. This patch restricts tail
call optimization to functions within the same translation unit.

Reviewers: vkalintiris

Differential Revision: https://reviews.llvm.org/D24763

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287505 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Add some initial combine tests that could (should?) use PACKSS

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287504 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Add tests for masked palignr/valignd/valignq shuffles, many of which show failures to fold the masking into the operation.

Many of these problems are because shuffle lowering widens element size and reduces element count when possible. This causes the shuffle to become separated from the select by a bitcast. Future patches will work to improve these cases by rewriting the shuffle back to a narrow element type if we think it can result in folding the mask.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287503 91177308-0d34-0410-b5e6-96231b3b80d8

The 'vpmultishiftqb' instruction was implemented incorrectly; this patch fixes it.
More specifically, the (MS dialect) broadcasting variants were implemented incorrectly.

Differential Revision: https://reviews.llvm.org/D26257

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287501 91177308-0d34-0410-b5e6-96231b3b80d8

Some instructions were missing, others were implemented incorrectly. This patch aims to fix those issues. Full list:

vcvtps2pd
vcvtudq2pd
vcvtps2qq
vcvttps2qq
vcvtps2uqq
vcvttps2uqq

variants are:

[Dst]XMM(zero-masked/merge-masked/unmasked)
[Src]Mem64

Differential Revision: https://reviews.llvm.org/D26799

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287500 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX512] Combine unary + zero target shuffles to VPERMV3 with a zero vector where possible

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287497 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX512] Add support for VBMI VPERMV3 target shuffle combines

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287496 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX512] Add support for VBMI VPERMV target shuffle combines

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287495 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX512] Add some initial VBMI target shuffle combine tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287494 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX512VL] Removed duplicate operation action

Basic AVX512F already declared uint_to_fp v4i32 as legal

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287493 91177308-0d34-0410-b5e6-96231b3b80d8

Strip trailing whitespace

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287492 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX512F] Add support for uint_to_fp v2i32 to v2f64 on AVX512F-only targets

Use 512-bit instructions (we already do something similar for uint_to_fp v4i32 to v4f64)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287491 91177308-0d34-0410-b5e6-96231b3b80d8

Fix comment typos. NFC.

Identified by Pedro Giffuni in PR27636.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287490 91177308-0d34-0410-b5e6-96231b3b80d8

Fix spelling mistakes in Tools/Tests comments. NFC.

Identified by Pedro Giffuni in PR27636.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287489 91177308-0d34-0410-b5e6-96231b3b80d8

Fix spelling mistakes in Transforms comments. NFC.

Identified by Pedro Giffuni in PR27636.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287488 91177308-0d34-0410-b5e6-96231b3b80d8

Fix spelling mistakes in SelectionDAG comments. NFC.

Identified by Pedro Giffuni in PR27636.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287487 91177308-0d34-0410-b5e6-96231b3b80d8

Fix comment typos. NFC.

Identified by Pedro Giffuni in PR27636.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287486 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] RegCall - Handling long double arguments

The change is part of RegCall calling convention support for LLVM.
Long double (f80) requires special treatment as the first f80 parameter is saved in FP0 (floating point stack).
This review presents the change and the corresponding tests.

Differential Revision: https://reviews.llvm.org/D26151

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287485 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][InlineAsm] Test commit.
Fixing a wrong comment on X86AsmParser.cpp::ParseZ: "true" --> "false"

Differential Revision: https://reviews.llvm.org/D26797

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287484 91177308-0d34-0410-b5e6-96231b3b80d8

Fix file name resolution in nested response files

If a response file in construct `@file` was specified by relative name,
constructs `@file` nested within it were resolved incorrectly if the
flag RelativeNames in call to ExpandResponseFile was set to true.
This feature is used in configuration files; tests for it are in the
corresponding change (see D24933).
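
The resolution rule can be sketched with `std::filesystem` (C++17), assuming that relative names mean "relative to the directory of the including response file". `resolveNested` is a hypothetical helper and does not reflect how ExpandResponseFile is written.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Resolve a file named inside Including relative to Including's directory.
// If Nested is absolute, operator/ returns it unchanged.
fs::path resolveNested(const fs::path &Including, const fs::path &Nested) {
  return Including.parent_path() / Nested;
}
```

For example, `@inner.rsp` found inside `cfg/outer.rsp` resolves to `cfg/inner.rsp` rather than to `inner.rsp` in the current directory.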

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287482 91177308-0d34-0410-b5e6-96231b3b80d8

ExceptionDemo: remove some undefined behaviour

The casting based reading of the LSDA could attempt to read unsuitably aligned
data. Avoid that case by explicitly using a memcpy. A similar approach is used
in libc++abi to address the same UB.
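
A minimal sketch of the memcpy approach, with a hypothetical helper name. It is not the code from ExceptionDemo or libc++abi, only the general pattern for reading a value from a possibly unaligned address.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Read a 32-bit value from an arbitrary byte address without assuming
// alignment. Dereferencing a cast pointer such as
// *reinterpret_cast<const std::uint32_t *>(P) would be undefined
// behaviour when P is not suitably aligned.
std::uint32_t readU32(const unsigned char *P) {
  std::uint32_t V;
  std::memcpy(&V, P, sizeof(V));
  return V;
}
```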

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287479 91177308-0d34-0410-b5e6-96231b3b80d8

ExceptionDemo: prefer headers over redeclarations

Rather than redeclaring the interfaces for exceptions, prefer using the
`unwind.h` header.  This is vended by at least gcc and clang, and can also be
found by an external unwinding library (e.g. libunwind).  Doing this simplifies
the example to the exception handling itself.  Minor tweaks are the result of
_Unwind_Context_t not being defined, which is just a typedef for struct
_Unwind_Context *.  NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287478 91177308-0d34-0410-b5e6-96231b3b80d8

[bpf] add BPF disassembler

add BPF disassembler, so tools like llvm-objdump can be used:
$ llvm-objdump -d -no-show-raw-insn ./sockex1_kern.o

./sockex1_kern.o: file format ELF64-BPF

Disassembly of section socket1:
bpf_prog1:
       0: r6 = r1
       8: r0 = *(u8 *)skb[23]
      10: *(u32 *)(r10 - 4) = r0
      18: r1 = *(u32 *)(r6 + 4)
      20: if r1 != 4 goto 8
      28: r2 = r10
      30: r2 += -4

ld_imm64 (the only 16-byte insn) and special ld_abs/ld_ind instructions
had to be treated in a special way. The decoders for the rest of the insns
are automatically generated.
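
The size special case can be sketched as below. The helper names are hypothetical and this is not the disassembler's code; the sketch assumes the opcode is the first byte of each instruction and that 0x18 (BPF_LD | BPF_IMM | BPF_DW) identifies ld_imm64.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Opcode byte of ld_imm64 in the eBPF encoding.
constexpr std::uint8_t kLdImm64 = 0x18;

// Size in bytes of the instruction starting at Bytes[0]:
// 16 for ld_imm64, 8 for everything else.
std::size_t insnSize(const std::uint8_t *Bytes) {
  return Bytes[0] == kLdImm64 ? 16 : 8;
}

// Count the instructions in a buffer by stepping with insnSize.
std::size_t countInsns(const std::uint8_t *Bytes, std::size_t Len) {
  std::size_t N = 0;
  for (std::size_t Off = 0; Off + 8 <= Len; Off += insnSize(Bytes + Off))
    ++N;
  return N;
}
```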

Add tests to cover new functionality.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287477 91177308-0d34-0410-b5e6-96231b3b80d8

Attempt to fix big-endian buildbots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287476 91177308-0d34-0410-b5e6-96231b3b80d8

Style fix. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287475 91177308-0d34-0410-b5e6-96231b3b80d8

Fix buildbot.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287474 91177308-0d34-0410-b5e6-96231b3b80d8

SHA1: unroll loop in hashBlock.

This code is taken from the public domain.
https://github.com/jsonn/src/blob/trunk/common/lib/libc/hash/sha1/sha1.c

I wrote a sha1 command and ran it on my Xeon E5-2680 v2 2.80GHz machine.
Here is the result. The new hash function takes about 37% less time than before.

Performance counter stats for './llvm-sha1-old /ssd/build/bin/lld' (10 runs):

       6640.503687 task-clock (msec)         #    1.001 CPUs utilized            ( +-  0.03% )
                54 context-switches          #    0.008 K/sec                    ( +-  5.03% )
                 5 cpu-migrations            #    0.001 K/sec                    ( +- 31.73% )
           183,803 page-faults               #    0.028 M/sec                    ( +-  0.00% )
    18,527,954,113 cycles                    #    2.790 GHz                      ( +-  0.03% )
     4,993,237,485 stalled-cycles-frontend   #   26.95% frontend cycles idle     ( +-  0.11% )
   <not supported> stalled-cycles-backend
    50,217,149,423 instructions              #    2.71  insns per cycle
                                             #    0.10  stalled cycles per insn  ( +-  0.00% )
     6,094,322,337 branches                  #  917.750 M/sec                    ( +-  0.00% )
        11,778,239 branch-misses             #    0.19% of all branches          ( +-  0.01% )

       6.634017401 seconds time elapsed                                          ( +-  0.03% )

Performance counter stats for './llvm-sha1-new /ssd/build/bin/lld' (10 runs):

       4167.062720 task-clock (msec)         #    1.001 CPUs utilized            ( +-  0.02% )
                52 context-switches          #    0.012 K/sec                    ( +- 16.45% )
                 7 cpu-migrations            #    0.002 K/sec                    ( +- 32.20% )
           183,804 page-faults               #    0.044 M/sec                    ( +-  0.00% )
    11,626,611,958 cycles                    #    2.790 GHz                      ( +-  0.02% )
     4,491,897,976 stalled-cycles-frontend   #   38.63% frontend cycles idle     ( +-  0.05% )
   <not supported> stalled-cycles-backend
    24,320,180,617 instructions              #    2.09  insns per cycle
                                             #    0.18  stalled cycles per insn  ( +-  0.00% )
     1,574,674,576 branches                  #  377.886 M/sec                    ( +-  0.00% )
        11,769,693 branch-misses             #    0.75% of all branches          ( +-  0.00% )

       4.163251552 seconds time elapsed                                          ( +-  0.02% )
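
As a rough reading of the two perf stat summaries above, the old/new ratios can be computed directly from the reported totals. This is a minimal sketch: the struct and function names are illustrative assumptions and not part of the patch, and only the numeric constants are taken from the output.

```cpp
#include <cassert>

// Totals copied from the two `perf stat` summaries (old binary vs. new binary).
struct PerfRun {
  double ElapsedSeconds;
  double Instructions;
  double Branches;
};

const PerfRun OldRun = {6.634017401, 50217149423.0, 6094322337.0};
const PerfRun NewRun = {4.163251552, 24320180617.0, 1574674576.0};

// Ratio of an old measurement to a new one; a value above 1.0 means the
// new run needed less of that resource.
inline double ratio(double OldValue, double NewValue) {
  assert(NewValue > 0.0 && "new measurement must be positive");
  return OldValue / NewValue;
}

inline double wallClockSpeedup() {
  return ratio(OldRun.ElapsedSeconds, NewRun.ElapsedSeconds);
}

inline double instructionReduction() {
  return ratio(OldRun.Instructions, NewRun.Instructions);
}

inline double branchReduction() {
  return ratio(OldRun.Branches, NewRun.Branches);
}
```

With these inputs the ratios come out to roughly 1.59x for elapsed time, 2.06x for instructions, and 3.87x for branches.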

Differential Revision: https://reviews.llvm.org/D26890

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287473 91177308-0d34-0410-b5e6-96231b3b80d8

Demangle: remove references to allocator for default allocator

The demangler had stopped using a custom allocator but had not been updated to
remove the use of the explicit allocator passing.  This removes that as we do
not need to do anything special here anymore.  This just makes the code more
compact.  NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287472 91177308-0d34-0410-b5e6-96231b3b80d8

Demangle: remove unnecessary typedef for std::vector

We created a local typedef for std::vector called Vector. Inline the use
of std::vector rather than use the typedef. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287471 91177308-0d34-0410-b5e6-96231b3b80d8

Demangle: replace custom typedef for std::string with std::string

We created a local typedef for `std::basic_string<char, std::char_traits<char>>`
which is just `std::string`. Remove the local typedef and propagate the type
information through the rest of the demangler. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287470 91177308-0d34-0410-b5e6-96231b3b80d8

Demangle: use direct member initialization (NFC)

Prefer direct member initialization over the explicit out-of-line initialization
for the construction of the local type. NFC.
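
A minimal sketch of the pattern, assuming "direct member initialization" here means C++11 default member initializers. The types below are hypothetical and are not the demangler's actual struct.

```cpp
#include <cassert>
#include <string>

// Hypothetical types for illustration only.

// Before: every default lives in an out-of-line constructor definition.
struct StateBefore {
  bool Flag;
  unsigned Depth;
  std::string Name;
  StateBefore();
};
inline StateBefore::StateBefore() : Flag(false), Depth(0), Name("none") {}

// After: each default sits next to the member it initializes, so the
// implicitly generated default constructor is sufficient.
struct StateAfter {
  bool Flag = false;
  unsigned Depth = 0;
  std::string Name = "none";
};

// Returns true when both spellings produce the same initial state.
inline bool defaultsMatch() {
  StateBefore Before;
  StateAfter After;
  assert(After.Depth == 0 && "default member initializer should have run");
  return Before.Flag == After.Flag && Before.Depth == After.Depth &&
         Before.Name == After.Name;
}
```

Both forms yield the same object state, which is consistent with the change being NFC.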

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287469 91177308-0d34-0410-b5e6-96231b3b80d8

Give some helper classes/functions internal linkage. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287462 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Improve PSHUFB lowering from either input

Canonicalization may leave the zeroable vector in the first input.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287461 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX512] Add VPERMV/VPERMV3 v64i8 byte shuffles on avx512vbmi targets

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287459 91177308-0d34-0410-b5e6-96231b3b80d8

[ThinLTO] Fix crash when importing an opaque type

It seems that because ThinLTO does not import the full module,
some invariants of the type mapper are broken.

In Monolithic LTO, we import every global: when calling
IRLinker::copyFunctionProto() on @foo(), we end up calling
TypeMapTy::get(FTy) on the type of @foo(), which will map
%0 and record the destination as opaque.

ThinLTO skips this because @foo is not imported and goes directly
to the next stage.

Next we call computeTypeMapping(), which maps the types for each
global, ends up checking for type isomorphism, and may add
type mappings. However, it doesn't record whether an opaque
destination type was resolved.

Instead of lazily "discovering" opaque types in the destination
module on the go, we change the TypeFinder to eagerly record all
types, not only the named ones.

Differential Revision: https://reviews.llvm.org/D26840

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287453 91177308-0d34-0410-b5e6-96231b3b80d8

[ThinLTO] Implement -pass-remarks-output in ThinLTOCodeGenerator

Summary:
This will also be added to the LTO API; for now, it brings
ThinLTO on par with Monolithic LTO on Darwin.

Reviewers: anemet

Subscribers: tejohnson, llvm-commits

Differential Revision: https://reviews.llvm.org/D26886

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287450 91177308-0d34-0410-b5e6-96231b3b80d8

Change setDiagnosticsOutputFile to take a unique_ptr instead of a raw pointer (NFC)

Summary:
This makes it explicit that ownership is taken. Also replace all `new`
with make_unique<> at call sites.
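
The ownership transfer described here can be sketched as follows. This is an illustration under assumptions: `OutputFile` and `Context` are hypothetical stand-ins rather than the real LLVM classes, and `std::make_unique` (C++14) is used in place of whatever helper the tree used at the time.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for the diagnostics output object.
struct OutputFile {
  int Id;
  explicit OutputFile(int FileId) : Id(FileId) {}
};

// Hypothetical stand-in for the context that owns the output file.
class Context {
  std::unique_ptr<OutputFile> DiagnosticsOutput;

public:
  // Taking the unique_ptr by value makes the ownership transfer visible in
  // the signature; a raw-pointer parameter would leave it to a comment.
  void setDiagnosticsOutputFile(std::unique_ptr<OutputFile> F) {
    DiagnosticsOutput = std::move(F);
  }

  const OutputFile *getDiagnosticsOutputFile() const {
    return DiagnosticsOutput.get();
  }
};

// Hands a file to the context and returns the id the context now owns.
inline int demoOwnershipTransfer() {
  Context Ctx;
  auto File = std::make_unique<OutputFile>(42); // instead of `new OutputFile(42)`
  Ctx.setDiagnosticsOutputFile(std::move(File));
  assert(File == nullptr && "caller no longer owns the file after the move");
  return Ctx.getDiagnosticsOutputFile()->Id;
}
```

The call site must spell out `std::move`, so a reader can tell that the caller gives up the object at that point.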

Reviewers: anemet

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26884

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287449 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX512] Add avx512vbmi tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287447 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX512] Added some more complex v64i8 shuffles

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287444 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Simplify some code a little by removing a duplicate variable and combining two if statements. NFCI

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287443 91177308-0d34-0410-b5e6-96231b3b80d8

Try again to fix unused variable warning on lld-x86_64-darwin13 after r287439.

The previous attempt didn't work. I assume LLVM_ATTRIBUTE_UNUSED isn't
available on that machine.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287442 91177308-0d34-0410-b5e6-96231b3b80d8

Try to fix unused variable warning on lld-x86_64-darwin13 after r287439.

Whether the variable is used or not depends on NDEBUG.
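
A minimal sketch of how this kind of warning arises and one common way to silence it. The function below is hypothetical, and the `(void)` cast is shown as a generic idiom, not necessarily the fix applied in this commit.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: returns the first element, or 0 for an empty vector.
inline int firstElement(const std::vector<int> &Values) {
  // This variable is only read inside the assert below. When NDEBUG is
  // defined, assert expands to nothing and the variable becomes unused.
  const bool NonEmpty = !Values.empty();
  assert(NonEmpty && "expected at least one element");

  // Casting to void counts as a use in both build modes, so the
  // unused-variable warning does not fire in release builds.
  (void)NonEmpty;

  return Values.empty() ? 0 : Values.front();
}
```

The cast works on any compiler, which matters when an attribute-based annotation is not available on a given builder.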

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287440 91177308-0d34-0410-b5e6-96231b3b80d8

Check that emitted instructions meet their predicates on all targets except ARM, Mips, and X86.

Summary:
* ARM is omitted from this patch because this check appears to expose bugs in this target.
* Mips is omitted from this patch because this check either detects bugs or deliberate
  emission of instructions that don't satisfy their predicates. One deliberate
  use is the SYNC instruction where the version with an operand is correctly
  defined as requiring MIPS32 while the version without an operand is defined
  as an alias of 'SYNC 0' and requires MIPS2.
* X86 is omitted from this patch because it doesn't use the tablegen-erated
  MCCodeEmitter infrastructure.

Patches for ARM and Mips will follow.

Depends on D25617

Reviewers: tstellarAMD, jmolloy

Subscribers: wdng, jmolloy, aemerson, rengolin, arsenm, jyknight, nemanjai, nhaehnle, tstellarAMD, llvm-commits

Differential Revision: https://reviews.llvm.org/D25618

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287439 91177308-0d34-0410-b5e6-96231b3b80d8

[tablegen] Merge duplicate definitions of getMinimalTypeForRange. NFC.

Summary: Depends on D25614

Reviewers: qcolombet

Subscribers: qcolombet, beanz, mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D25617

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287438 91177308-0d34-0410-b5e6-96231b3b80d8

[CMake] llvm-lto2 depends on intrinsics_gen

llvm-lto2.cpp has the following include chain:

llvm/LTO/Caching.h
llvm/LTO/LTO.h
llvm/CodeGen/Analysis.h
llvm/IR/CallSite.h
llvm/IR/Attributes.h
llvm/IR/Attributes.gen

This means llvm-lto2 needs to depend on intrinsics_gen.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287434 91177308-0d34-0410-b5e6-96231b3b80d8

[CMake] opt depends on intrinsics_gen

AnalysisWrappers.cpp has the following include chain:

llvm/Analysis/CallGraph.h
llvm/IR/CallSite.h
llvm/IR/Attributes.h
llvm/IR/Attributes.gen

This means opt needs to depend on intrinsics_gen.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287433 91177308-0d34-0410-b5e6-96231b3b80d8

[CMake] llvm-nm depends on intrinsics_gen

llvm-nm.cpp has the following include chain:

llvm/IR/Function.h
llvm/IR/Argument.h
llvm/IR/Attributes.h
llvm/IR/Attributes.gen

This means llvm-nm needs to depend on intrinsics_gen.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287432 91177308-0d34-0410-b5e6-96231b3b80d8

[CMake] llvm-link depends on intrinsics_gen

llvm-link.cpp has the following include chain:

llvm/Bitcode/BitcodeWriter.h
llvm/IR/ModuleSummaryIndex.h
llvm/IR/Module.h
llvm/IR/Function.h
llvm/IR/Argument.h
llvm/IR/Attributes.h
llvm/IR/Attributes.gen

This means llvm-link needs to depend on intrinsics_gen.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287431 91177308-0d34-0410-b5e6-96231b3b80d8