granicus.if.org Git

[DAGCombine] Match a pattern where a wide type scalar value is stored by several narrow stores
This opportunity is found from spec 2017 557.xz_r. And it is used by the sha encrypt/decrypt. See sha-2/sha512.c

static void store64(u64 x, unsigned char* y)
{
    for(int i = 0; i != 8; ++i)
        y[i] = (x >> ((7-i) * 8)) & 255;
}

static u64 load64(const unsigned char* y)
{
    u64 res = 0;
    for(int i = 0; i != 8; ++i)
        res |= (u64)(y[i]) << ((7-i) * 8);
    return res;
}
The load64 has been implemented by https://reviews.llvm.org/D26149
This patch is trying to implement the store pattern.

Match a pattern where a wide type scalar value is stored by several narrow
stores. Fold it into a single store or a BSWAP and a store if the targets
supports it.

Assuming little endian target:
i8 *p = ...
i32 val = ...
p[0] = (val >> 0) & 0xFF;
p[1] = (val >> 8) & 0xFF;
p[2] = (val >> 16) & 0xFF;
p[3] = (val >> 24) & 0xFF;

>
*((i32)p) = val;

i8 *p = ...
i32 val = ...
p[0] = (val >> 24) & 0xFF;
p[1] = (val >> 16) & 0xFF;
p[2] = (val >> 8) & 0xFF;
p[3] = (val >> 0) & 0xFF;

>
*((i32)p) = BSWAP(val);

Differential Revision: https://reviews.llvm.org/D62897

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362921 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] When promoting i16 compare with immediate to i32, try to use sign_extend for eq/ne if the input is truncated from a type with enough sign its.

Summary:
Our default behavior is to use sign_extend for signed comparisons and zero_extend for everything else. But for equality we have the freedom to use either extension. If we can prove the input has been truncated from something with enough sign bits, we can use sign_extend instead and let DAG combine optimize it out. A similar rule is used by type legalization in LegalizeIntegerTypes.

This gets rid of the movzx in PR42189. The immediate will still take 4 bytes instead of the 2 bytes plus 0x66 prefix a cmp di, 32767 would get, but it avoids a length changing prefix.

Reviewers: RKSimon, spatel, xbolva00

Reviewed By: xbolva00

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63032

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362920 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Disable f32->f64 extload when sse2 is enabled

Summary:
We can only use the memory form of cvtss2sd under optsize due to a partial register update. So previously we were emitting 2 instructions for extload when optimizing for speed. Also due to a late optimization in preprocessiseldag we had to handle (fpextend (loadf32)) under optsize.

This patch forces extload to expand so that it will always be in the (fpextend (loadf32)) form during isel. And when optimizing for speed we can just let each of those pieces select an instruction independently.

Reviewers: spatel, RKSimon

Reviewed By: RKSimon

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62710

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362919 91177308-0d34-0410-b5e6-96231b3b80d8

Do not derive no-recurse attribute if function does not have exact definition.
This is fix for https://bugs.llvm.org/show_bug.cgi?id=41336

Reviewers: jdoerfert
Reviewed by: jdoerfert

Differential Revision: https://reviews.llvm.org/D63045

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362918 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Test if commit access granted.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362917 91177308-0d34-0410-b5e6-96231b3b80d8

Make test not write to source directory

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362916 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Use EVEX instructions for f128 FAND/FOR/FXOR when avx512vl is enabled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362915 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Convert f32/f64 FANDN/FAND/FOR/FXOR to vector logic ops and scalar_to_vector/extract_vector_elts to reduce isel patterns.

Previously we did the equivalent operation in isel patterns with
COPY_TO_REGCLASS operations to transition. By inserting
scalar_to_vetors and extract_vector_elts before isel we can
allow each piece to be selected individually and accomplish the
same final result.

I ideally we'd use vector operations earlier in lowering/combine,
but that looks to be more difficult.

The scalar-fp-to-i64.ll changes are because we have a pattern for
using movlpd for store+extract_vector_elt. While an f64 store
uses movsd. The encoding sizes are the same.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362914 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r361953 "[SVE][IR] Scalable Vector IR Type"

This reverts commit f4fc01f8dd3a5dfd2060d1ad0df6b90e8351ddf7.
It caused a 3-4x slowdown when doing thinlto links, PR42210.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362913 91177308-0d34-0410-b5e6-96231b3b80d8

[TargetLowering] Simplify (ctpop x) == 1

Reviewers: craig.topper, spatel, RKSimon, bkramer

Reviewed By: spatel

Subscribers: javed.absar, lebedev.ri, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63004

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362912 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] foldICmpWithLowBitMaskedVal(): 'icmp sgt/sle': avoid miscompiles

A precondition 'x != 0' was forgotten by me:
https://rise4fun.com/Alive/JFNP
https://rise4fun.com/Alive/jHvL

These 4 folds with non-constants could be re-enabled,
but for now let's go for the simplest solution.

https://bugs.llvm.org/show_bug.cgi?id=42198

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362911 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] Revisit canonicalize-constant-low-bit-mask-and-icmp-s* tests in preparatio for PR42198.

The `icmp sgt`/`icmp sle` variants are, too, miscompiles:
https://rise4fun.com/Alive/JFNP
https://rise4fun.com/Alive/jHvL
A precondition 'x != 0' was forgotten by me.

While ensuring test coverage for `-1`, also add test coverage
for `0` mask. Mask `0` is allowed for all the folds,
mask `-1` is allowed for all the folds with unsigned `icmp` pred.
Constant mask `0` is missed though.

https://bugs.llvm.org/show_bug.cgi?id=42198

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362910 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] change canonicalization to fabs() to use FMF on fneg

This isn't the ideal fix (use FMF on the select), but it's still an
improvement until we have better FMF propagation to selects and other
FP math operators.

I don't think there's much risk of regression from this change by
not including the FMF on the fcmp any more. The nsz/nnan FMF
should be the same on the fcmp and the fneg (fsub) because they
have the same operand.

This works around the most glaring FMF logical inconsistency cited
in PR38086:
https://bugs.llvm.org/show_bug.cgi?id=38086

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362909 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Adjust test for D63004

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362908 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Added test from PR19758

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362907 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Added test from PR42084 for D63058

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362906 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Add tests for usub.sat(x,y)+y etc; NFC

For PR42178.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362905 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] reduce code duplication for fcmp folds; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362904 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] enhance fcmp fold with never-nan operand

This is another step towards correcting our usage of fast-math-flags when applied on an fcmp.
In this case, we are checking for 'nnan' on the fcmp itself rather than the operand of
the fcmp. But I'm leaving that clause in until we're more confident that we can stop
relying on fcmp's FMF.

By using the more general "isKnownNeverNaN()", we gain a simplification shown on the
tests with 'uitofp' regardless of the FMF on the fcmp (uitofp never produces a NaN).
On the tests with 'fabs', we are now relying on the FMF for the call fabs instruction
in addition to the FMF on the fcmp.

This is a continuation of D62979 / rL362879.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362903 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] add tests for fcmp with known-never-nan operands; NFC

Opposite predicate for rL362742 / rL362879 / D62979

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362902 91177308-0d34-0410-b5e6-96231b3b80d8

[MIR] Add simple PRE pass to MachineCSE

This is the second part of the commit fixing PR38917 (hoisting
partitially redundant machine instruction). Most of PRE (partitial
redundancy elimination) and CSE work is done on LLVM IR, but some of
redundancy arises during DAG legalization. Machine CSE is not enough
to deal with it. This simple PRE implementation works a little bit
intricately: it passes before CSE, looking for partitial redundancy
and transforming it to fully redundancy, anticipating that the next
CSE step will eliminate this created redundancy. If CSE doesn't
eliminate this, than created instruction will remain dead and eliminated
later by Remove Dead Machine Instructions pass.

The third part of the commit is supposed to refactor MachineCSE,
to make it more clear and to merge MachinePRE with MachineCSE,
so one need no rely on further Remove Dead pass to clear instrs
not eliminated by CSE.

First step: https://reviews.llvm.org/D54839

Fixes llvm.org/PR38917

This is fixed recommit of r361356 after PowerPC64 multistage build failure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362901 91177308-0d34-0410-b5e6-96231b3b80d8

[CaptureTracking] Don't let comparisons against null escape inbounds pointers

Pointers that are in-bounds (either through dereferenceable_or_null or
thorough a getelementptr inbounds) cannot be captured with a comparison
against null. There is no way to construct a pointer that is still in
bounds but also NULL.

This helps safe languages that insert null checks before load/store
instructions. Without this patch, almost all pointers would be
considered captured even for simple loads. With this patch, an icmp with
null will not be seen as escaping as long as certain conditions are met.

There was a lot of discussion about this patch. See the Phabricator
thread for detals.

Differential Revision: https://reviews.llvm.org/D60047

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362900 91177308-0d34-0410-b5e6-96231b3b80d8

[bindings/go] Add wrappers for atomic operations.

This patch adds Go bindings for atomic operations in LLVM.

Differential Revision: https://reviews.llvm.org/D61034

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362899 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] NFCI : Comment updation for EVEX to VEX translation.

Reviewers: llvm-commits, jbhateja

Reviewed By: jbhateja

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63055

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362898 91177308-0d34-0410-b5e6-96231b3b80d8

Use for-range loop. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362897 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][GlobalISel] Select immediate forms of cmp instructions.

A simple re-use of the immediate operand matcher and renderer functions.

rdar://43795178

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362896 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove (store (f32 (extractelt (v4f32))) isel patterns which is redundant.

We emit a MOVSSmr and a COPY_TO_REGCLASS, but that's what we would get from
selecting the store and extractelt independently.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362895 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Mutate scalar fceil/ffloor/ftrunc/fnearbyint/frint into X86ISD::RNDSCALE during PreProcessIselDAG to cut down on number of isel patterns.

Similar was done for vectors in r362535. Removes about 1200 bytes from the isel table.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362894 91177308-0d34-0410-b5e6-96231b3b80d8

[bindings/go] Add bindings to LLVMGet?CmpPredicate

Add bindings so that predicates on comparisons (icmp/fcmp) can be
inspected from IR.

Note: I considered adding Value.ICmpPredicate() etc. instead but
Value.IntPredicate() seemed easier to read and matches the name of the
returned type.

(This change was also pushed two commits ago but accidentally had the
wrong title and description.)

Revision: https://reviews.llvm.org/D53884

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362893 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[bindings/go] Add Go bindings for CalledValue"

This reverts commit f675a60ca7a93f22e22dd4209504a9846dd04630.
The commit had the wrong title/description. Sorry about the mess!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362892 91177308-0d34-0410-b5e6-96231b3b80d8

[bindings/go] Add Go bindings for CalledValue

This is very useful for inspecting generated IR, there appears to be no
other way to get the called function from a CallInst.

Differential Revision: https://reviews.llvm.org/D52972

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362891 91177308-0d34-0410-b5e6-96231b3b80d8

[bindings/go] Add Go bindings for CalledValue

This is very useful for inspecting generated IR, there appears to be no
other way to get the called function from a CallInst.

Revision: https://reviews.llvm.org/D52972

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362890 91177308-0d34-0410-b5e6-96231b3b80d8

[bindings/go] Add EraseFromParent

After using ReplaceAllUsesWith on an instruction, it may be necessary to
erase it even though it is dead.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362889 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Test commit

Add a newline, which is missing according to go fmt.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362888 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][Codegen] Add missed pattern that may be a lea+neg

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362886 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombine] visitAND - merge (zext_inreg ((s)extload x)) -> (zextload x) combines. NFCI.

Same codegen, only differ by the oneuse limit for the sextload case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362880 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] enhance fcmp fold with never-nan operand

This is 1 step towards correcting our usage of fast-math-flags when applied on an fcmp.
In this case, we are checking for 'nnan' on the fcmp itself rather than the operand of
the fcmp. But I'm leaving that clause in until we're more confident that we can stop
relying on fcmp's FMF.

By using the more general "isKnownNeverNaN()", we gain a simplification shown on the
tests with 'uitofp' regardless of the FMF on the fcmp (uitofp never produces a NaN).
On the tests with 'fabs', we are now relying on the FMF for the call fabs instruction
in addition to the FMF on the fcmp.

I'll update the 'ult' case below here as a follow-up assuming no problems here.

Differential Revision: https://reviews.llvm.org/D62979

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362879 91177308-0d34-0410-b5e6-96231b3b80d8

fix a typo unavaliable=>unavailable

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362878 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Added tests for D63038

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362875 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Adjust isLegalT1AddressImmediate for non-legal types

Types such as float and i64's do not have legal loads in Thumb1, but will still
be loaded with a LDR (or potentially multiple LDR's). As such we can treat the
cost of addressing mode calculations the same as an i32 and get some optimisation
benefits.

Differential Revision: https://reviews.llvm.org/D62968

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362874 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Add MVE addressing to isLegalT2AddressImmediate

Now with MVE being added, we can add the vector addressing mode costs for it.
These are generally imm7 multiplied by the size of the type being loaded /
stored.

Differential Revision: https://reviews.llvm.org/D62967

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362873 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Add fp16 addressing to isLegalT2AddressImmediate

The fp16 version of VLDR takes a imm8 multiplied by 2. This updates the costs
to account for those, and adds extra testing. It is dependant upon hasFPRegs16
as this is what the load/store instructions require.

Differential Revision: https://reviews.llvm.org/D62966

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362872 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Add extra gep costmodel tests for MVE and half float. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362871 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Add HasNEON for all Neon patterns in ARMInstrNEON.td. NFCI

We are starting to add an entirely separate vector architecture to the ARM
backend. To do that we need at least some separation between the existing NEON
and the new MVE code. This patch just goes through the Neon patterns and
ensures that they are predicated on HasNEON, giving MVE a stable place to start
from.

No tests yet as this is largely an NFC, and we don't have the other target that
will treat any of these intructions as legal.

Differential Revision: https://reviews.llvm.org/D62945

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362870 91177308-0d34-0410-b5e6-96231b3b80d8

[SystemZ] Fix CMakeLists.txt for alphabetical order (NFC).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362869 91177308-0d34-0410-b5e6-96231b3b80d8

[SystemZ, RegAlloc] Favor 3-address instructions during instruction selection.

This patch aims to reduce spilling and register moves by using the 3-address
versions of instructions per default instead of the 2-address equivalent
ones. It seems that both spilling and register moves are improved noticeably
generally.

Regalloc hints are passed to increase conversions to 2-address instructions
which are done in SystemZShortenInst.cpp (after regalloc).

Since the SystemZ reg/mem instructions are 2-address (dst and lhs regs are
the same), foldMemoryOperandImpl() can no longer trivially fold a spilled
source register since the reg/reg instruction is now 3-address. In order to
remedy this, new 3-address pseudo memory instructions are used to perform the
folding only when the dst and lhs virtual registers are known to be allocated
to the same physreg. In order to not let MachineCopyPropagation run and
change registers on these transformed instructions (making it 3-address), a
new target pass called SystemZPostRewrite.cpp is run just after
VirtRegRewriter, that immediately lowers the pseudo to a target instruction.

If it would have been possibe to insert a COPY instruction and change a
register operand (convert to 2-address) in foldMemoryOperandImpl() while
trusting that the caller (e.g. InlineSpiller) would update/repair the
involved LiveIntervals, the solution involving pseudo instructions would not
have been needed. This is perhaps a potential improvement (see Phabricator
post).

Common code changes:

* A new hook TargetPassConfig::addPostRewrite() is utilized to be able to run a
target pass immediately before MachineCopyPropagation.

* VirtRegMap is passed as an argument to foldMemoryOperand().

Review: Ulrich Weigand, Quentin Colombet
https://reviews.llvm.org/D60888

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362868 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r362857

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362864 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy][MachO] Recompute and update offset/size fields in the writer

Summary:
Recompute and update offset/size fields so that we can implement llvm-objcopy options like --only-section.

This patch is the first step and focuses on supporting load commands that covered by existing tests: executable files and
dynamic libraries are not supported.

Reviewers: alexshap, rupprecht, jhenderson

Reviewed By: alexshap, rupprecht

Subscribers: compnerd, jakehehrlich, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62652

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362863 91177308-0d34-0410-b5e6-96231b3b80d8

Visualizer for APInt and remove obsolete visualizer

Visualizer for the simple case of APInt (uints < 2^64)
as will be required for Clang ConstantArrayType visualizer.
Also, removed obsolete VS2013 SmallVectorVisualizer as VS2013
is no longer supported.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362860 91177308-0d34-0410-b5e6-96231b3b80d8

Factor out SelectionDAG's switch analysis and lowering into a separate component.

In order for GlobalISel to re-use the significant amount of analysis and
optimization code in SDAG's switch lowering, we first have to extract it and
create an interface to be used by both frameworks.

No test changes as it's NFC.

Differential Revision: https://reviews.llvm.org/D62745

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362857 91177308-0d34-0410-b5e6-96231b3b80d8

LoopDistribute: Add testcase where SCEV wants to insert a runtime
check.

Only the memory based checks were being tested. Prepare for fix in
convergent handling.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362854 91177308-0d34-0410-b5e6-96231b3b80d8

[GVN] non-functional code movement

Summary: Move some code around, in preparation for later fixes
to the non-integral addrspace handling (D59661)

Patch By Jameson Nash <jameson@juliacomputing.com>

Reviewed By: reames, loladiro
Differential Revision: https://reviews.llvm.org/D59729

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362853 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Force skips around traps

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362852 91177308-0d34-0410-b5e6-96231b3b80d8

[COFF] Fix /export:foo=bar when bar is a weak alias

Summary:
When handling exports from the command line or from .def files, the
linker does a "fuzzy" string lookup to allow finding mangled symbols.
However, when the symbol is re-exported under a new name, the linker has
to transfer the decorations from the exported symbol over to the new
name. This is implemented by taking the mangled symbol that was found in
the object and replacing the original symbol name with the export name.

Before this patch, LLD implemented the fuzzy search by adding an
undefined symbol with the unmangled name, and then during symbol
resolution, checking if similar mangled symbols had been added after the
last round of symbol resolution. If so, LLD makes the original symbol a
weak alias of the mangled symbol. Later, to get the original symbol
name, LLD would look through the weak alias and forward it on to the
import library writer, which copies the symbol decorations. This
approach doesn't work when bar is itself a weak alias, as is the case in
asan. It's especially bad when the aliasee of bar contains the string
"bar", consider "bar_default". In this case, we would end up exporting
the symbol "foo_default" when we should've exported just "foo".

To fix this, don't look through weak aliases to find the mangled name.
Save the mangled name earlier during fuzzy symbol lookup.

Fixes PR42074

Reviewers: mstorsjo, ruiu

Subscribers: thakis, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62984

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362849 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-lipo] Add docs for llvm-lipo

Add docs (llvm-lipo.rst) for llvm-lipo.

Test plan:
make -j8 sphinx
check that ./docs/html/CommandGuide/llvm-lipo.html is built correctly and looks okay.

Differential revision: https://reviews.llvm.org/D62706

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362848 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objdump] Fix Bugzilla ID 41862 to support checking addresses of disassembled object

Summary:
This fixes the bugzilla id,41862 to support dealing with checking
stop address against start address to support this not being a
proper object to check the disasembly against like gnu objdump
currently does.

Reviewers: jakehehrlich, rupprecht, echristo, jhenderson, grimar

Reviewed By: jhenderson

Subscribers: MaskRay, smeenai, rupprecht, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61969

Patch by Nicholas Krause!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362847 91177308-0d34-0410-b5e6-96231b3b80d8

Fix string literals to avoid deprecation warnings in regexp patterns

In LLDB, where tests run with the debug version of Python, we get a
series of deprecation warnings because escape sequences like `\(` are
being treated as part of the string literal rather than an escape for
the regexp pattern.

NFC intended.

Differential Revision: https://reviews.llvm.org/D62882

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362846 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-lipo] Drop unneeded braces. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362841 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-lipo] Implement -archs

Displays the architecture names of an input file.
Unknown architectures are represented by unknown(cputype,cpusubtype).

Patch by Anusha Basana <anusha.basana@gmail.com>

Differential Revision: https://reviews.llvm.org/D62753

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362840 91177308-0d34-0410-b5e6-96231b3b80d8

[DomTreeUpdater] Add all insert before all delete updates to reduce compile time.

Summary:
The cleanup in D62751 introduced a compile-time regression due to the way DT updates are performed.
Add all insert edges then all delete edges in DTU to match the previous compile time.
Compile time on the test provided by @mstorsjo before and after this patch on my machine:
113.046s vs 35.649s
Repro: clang -target x86_64-w64-mingw32 -c -O3 glew-preproc.c; on https://martin.st/temp/glew-preproc.c.

Reviewers: kuhar, NutshellySima, mstorsjo

Subscribers: jlebar, mstorsjo, dmgreen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62981

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362839 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objdump] Add warning if --disassemble-functions specifies an unknown symbol

Summary:
Fixes Bug 41904 https://bugs.llvm.org/show_bug.cgi?id=41904

Re-land r362768 after it was reverted in r362826.

Reviewers: jhenderson, rupprecht, grimar, MaskRay

Reviewed By: jhenderson, rupprecht, MaskRay

Subscribers: dexonsmith, rupprecht, kristina, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62275

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362838 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove unnecessary new line escape from the end of a macro. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362837 91177308-0d34-0410-b5e6-96231b3b80d8

[ADT] Enable set_difference() to be used on StringSet

Summary: Re-land r362766 after it was reverted in r362823.

Reviewers: jhenderson, dsanders, aaron.ballman, MatzeB, lhames, dblaikie

Reviewed By: dblaikie

Subscribers: smeenai, mgrang, mgorny, dexonsmith, kristina, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62369

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362835 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalISel] IRTranslator: Translate the intrinsics ignored by CodeGen

Summary:
Translate `llvm.assume`, `llvm.var.annotation` and `llvm.sideeffect` to nothing
as they have no effect on CodeGen.

Reviewers: qcolombet, aditya_nandakumar, dsanders, paquette, aemerson, arsenm

Reviewed By: arsenm

Subscribers: hiraditya, wdng, rovka, kristof.beyls, javed.absar, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63022

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362834 91177308-0d34-0410-b5e6-96231b3b80d8

[APFloat] APFloat::Storage::Storage - refix use after move

Summary:
Re-land r360675 after it was reverted in r360770.

This was reported in:
https://llvm.org/reports/scan-build/

Based on feedback in:
https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190513/652286.html

Reviewers: RKSimon, efriedma

Reviewed By: RKSimon, efriedma

Subscribers: eli.friedman, hiraditya, llvm-commits, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62767

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362833 91177308-0d34-0410-b5e6-96231b3b80d8

[ORC] Update symbol lookup to use a single callback with a required symbol state
rather than two callbacks.

The asynchronous lookup API (which the synchronous lookup API wraps for
convenience) used to take two callbacks: OnResolved (called once all requested
symbols had an address assigned) and OnReady to be called once all requested
symbols were safe to access). This patch updates the asynchronous lookup API to
take a single 'OnComplete' callback and a required state (SymbolState) to
determine when the callback should be made. This simplifies the common use case
(where the client is interested in a specific state) and will generalize neatly
as new states are introduced to track runtime initialization of symbols.

Clients who were making use of both callbacks in a single query will now need to
issue two queries (one for SymbolState::Resolved and another for
SymbolState::Ready). Synchronous lookup API clients who were explicitly passing
the WaitOnReady argument will now need neeed to pass a SymbolState instead (for
'WaitOnReady == true' use SymbolState::Ready, for 'WaitOnReady == false' use
SymbolState::Resolved). Synchronous lookup API clients who were using default
arugment values should see no change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362832 91177308-0d34-0410-b5e6-96231b3b80d8

[IR] Add UnaryOperator::CreateFNegFMF(...)

Differential Revision: https://reviews.llvm.org/D62705

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362828 91177308-0d34-0410-b5e6-96231b3b80d8

Unbreak 32-bit build.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362827 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[llvm-objdump] Add warning if --disassemble-functions specifies an unknown symbol"

This reverts commit 50f61af3f304a03f10d9ecb0828829f0a72d0099, it used
the function introduced in the previous revert of
0bddef79019a23ab14fcdb27028e55e484674c88.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362826 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombine] visitAND - fix local shadow variable warnings. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362825 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[ADT] Enable set_difference() to be used on StringSet"

This reverts commit 0bddef79019a23ab14fcdb27028e55e484674c88, it was
causing ASan failures on the sanitizer bots:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/32800

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362823 91177308-0d34-0410-b5e6-96231b3b80d8

Fix -Wunused-lambda-capture warning. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362822 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombine] Use APInt::extractBits in "sub-splat" constant mask detection. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362820 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-objcopy: Implement --extract-partition and --extract-main-partition.

This implements the functionality described in
https://lld.llvm.org/Partitions.html. It works as follows:

- Reads the section headers using the ELF header at file offset 0;
- If extracting a loadable partition:
  - Finds the section containing the required partition ELF header by looking it up in the section table;
  - Reads the ELF and program headers from the section.
- If extracting the main partition:
  - Reads the ELF and program headers from file offset 0.
- Filters the section table according to which sections are in the program headers that it read:
  - If ParentSegment != nullptr or section is not SHF_ALLOC, then it goes in.
  - Sections containing partition ELF headers or program headers are excluded as there are no headers for these in ordinary ELF files.

Differential Revision: https://reviews.llvm.org/D62364

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362818 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix MIR test verifier error

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362817 91177308-0d34-0410-b5e6-96231b3b80d8

[dsymutil] Use the number of threads specified.

Before this patch we used either a single thread, or the number of
hardware threads available, effectively ignoring the number of threads
specified on the command line.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362815 91177308-0d34-0410-b5e6-96231b3b80d8

[docs]Move llvm-readobj from "Developer Tools" to "Basic Commands"

On the Command Guide page, there are multiple sections with links to the
different documentation pages available for LLVM tools. The "Basic
Tools" section includes tools like llvm-objdump, llvm-nm and so on. The
"Developer Tools" section contains things like FileCheck and lit. This
change moves llvm-readobj into the former block, from the latter.

Reviewed by: MaskRay

Differential Revision: https://reviews.llvm.org/D63011

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362813 91177308-0d34-0410-b5e6-96231b3b80d8

[Analysis] simplify code for getSplatValue(); NFC

AFAIK, this is only currently called by TTI, but it could be
used from instcombine or CGP to help solve problems like:
https://bugs.llvm.org/show_bug.cgi?id=37428
https://bugs.llvm.org/show_bug.cgi?id=42174

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362810 91177308-0d34-0410-b5e6-96231b3b80d8

Attempt to fix nm-archive.test after r362798

llvm-lib now needs a `target triple` for bitcode, so add a new file
that's like trivial.ll but has one, and use that in the test.
(trivial.ll had a comment that looked like it wasn't supposed to be used
in tests directly, so I don't want to change that file.)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362809 91177308-0d34-0410-b5e6-96231b3b80d8

Build with _XOPEN_SOURCE defined on AIX

Summary:
It is useful to build with _XOPEN_SOURCE defined on AIX, enabling X/Open
and POSIX compatibility mode, to work around stray macros and other
bugs in the headers provided by the system and build compiler.

This patch adds the config to cmake to build with _XOPEN_SOURCE defined
on AIX with a few exceptions. Google Test internals require access to
platform specific thread info constructs on AIX so in that case we build
with _ALL_SOURCE defined instead. Libclang also uses header which needs
_ALL_SOURCE on AIX so we leave that as is as well.

We also add building on AIX with the large file API and doing CMake
header checks with X/OPEN definitions so the results are consistent with
the environment that will be present in the build.

Reviewers: hubert.reinterpretcast, xingxue, andusy

Reviewed By: hubert.reinterpretcast

Subscribers: mgorny, jsji, cfe-commits, llvm-commits

Tags: #llvm, #clang

Differential Revision: https://reviews.llvm.org/D62533

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362808 91177308-0d34-0410-b5e6-96231b3b80d8

[MachineScheduler] checkResourceLimit boundary condition update

When we call checkResourceLimit in bumpCycle or bumpNode, and we
know the resource count has just reached the limit (the equations
are equal). We should return true to mark that we are resource
limited for next schedule, or else we might continue to schedule
in favor of latency for 1 more schedule and create a schedule that
actually overbook the resource.

When we call checkResourceLimit to estimate the resource limite before
scheduling, we don't need to return true even if the equations are
equal, as it shouldn't limit the schedule for it .

Differential Revision: https://reviews.llvm.org/D62345

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362805 91177308-0d34-0410-b5e6-96231b3b80d8

test-commit

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362802 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Added tests for D63004

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362801 91177308-0d34-0410-b5e6-96231b3b80d8

TailDuplicator: Remove no-op analyzeBranch call

This could fail, which looked concerning. However nothing was actually
using the results of this. I assume this was intended to use the
anti-feature of analyzeBranch of removing instructions, but wasn't
actually calling it with AllowModify = true.

Fixes bug 42162.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362800 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Don't export helpers of ConstantFoldCall

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362799 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-lib: Disallow mixing object files with different machine types

lib.exe doesn't allow creating .lib files with object files that have
differing machine types. Update llvm-lib to match.

The motivation is to make it possible to infer the machine type of a
.lib file in lld, so that it can warn when e.g. a 32-bit .lib file is
passed to a 64-bit link (PR38965).

Fixes PR38782.

Differential Revision: https://reviews.llvm.org/D62913

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362798 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] narrow extract subvector of vector select

This is a potentially large perf win for AVX1 targets because of the way we
auto-vectorize to 256-bit but then expect the backend to legalize/optimize
for the half-implemented AVX1 ISA.

On the motivating example from PR37428 (even though this patch doesn't solve
the vector shift issue):
https://bugs.llvm.org/show_bug.cgi?id=37428
...there's a 16% speedup when compiling with "-mavx" (perf tested on Haswell)
because we eliminate the remaining 256-bit vblendv ops.

I added comments on a couple of tests that require further work. If we have
256-bit logic ops separating the vselect and extract, we should probably narrow
everything to 128-bit, but that requires a larger pattern match.

Differential Revision: https://reviews.llvm.org/D62969

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362797 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r362766

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362796 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r362774

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362795 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Run `git ls-files '*.gn' '*.gni' | xargs llvm/utils/gn/gn.py format`

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362794 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Fix bugs introduced by the fp64/d32 rework.

Change D60691 caused some knock-on failures that weren't caught by the
existing tests. Firstly, selecting a CPU that should have had a
restricted FPU (e.g. `-mcpu=cortex-m4`, which should have 16 d-regs
and no double precision) could give the unrestricted version, because
`ARM::getFPUFeatures` returned a list of features including subtracted
ones (here `-fp64`,`-d32`), but `ARMTargetInfo::initFeatureMap` threw
away all the ones that didn't start with `+`. Secondly, the
preprocessor macros didn't reliably match the actual compilation
settings: for example, `-mfpu=softvfp` could still set `__ARM_FP` as
if hardware FP was available, because the list of features on the cc1
command line would include things like `+vfp4`,`-vfp4d16` and clang
didn't realise that one of those cancelled out the other.

I've fixed both of these issues by rewriting `ARM::getFPUFeatures` so
that it returns a list that enables every FP-related feature
compatible with the selected FPU and disables every feature not
compatible, which is more verbose but means clang doesn't have to
understand the dependency relationships between the backend features.
Meanwhile, `ARMTargetInfo::handleTargetFeatures` is testing for all
the various forms of the FP feature names, so that it won't miss cases
where it should have set `HW_FP` to feed into feature test macros.

That in turn caused an ordering problem when handling `-mcpu=foo+bar`
together with `-mfpu=something_that_turns_off_bar`. To fix that, I've
arranged that the `+bar` suffixes on the end of `-mcpu` and `-march`
cause feature names to be put into a separate vector which is
concatenated after the output of `getFPUFeatures`.

Another side effect of all this is to fix a bug where `clang -target
armv8-eabi` by itself would fail to set `__ARM_FEATURE_FMA`, even
though `armv8` (aka Arm v8-A) implies FP-Armv8 which has FMA. That was
because `HW_FP` was being set to a value including only the `FPARMV8`
bit, but that feature test macro was testing only the `VFP4FPU` bit.
Now `HW_FP` ends up with all the bits set, so it gives the right
answer.

Changes to tests included in this patch:

* `arm-target-features.c`: I had to change basically all the expected
  results. (The Cortex-M4 test in there should function as a
  regression test for the accidental double-precision bug.)
* `arm-mfpu.c`, `armv8.1m.main.c`: switched to using `CHECK-DAG`
  everywhere so that those tests are no longer sensitive to the order
  of cc1 feature options on the command line.
* `arm-acle-6.5.c`: been updated to expect the right answer to that
  FMA test.
* `Preprocessor/arm-target-features.c`: added a regression test for
  the `mfpu=softvfp` issue.

Reviewers: SjoerdMeijer, dmgreen, ostannard, samparker, JamesNagurne

Reviewed By: ostannard

Subscribers: srhines, javed.absar, kristof.beyls, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D62998

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362791 91177308-0d34-0410-b5e6-96231b3b80d8

[RISCV] Support Bit-Preserving FP in F/D Extensions

Summary:
This allows some integer bitwise operations to instead be performed by
hardware fp instructions. This is correct because the RISC-V spec
requires the F and D extensions to use the IEEE-754 standard
representation, and fp register loads and stores to be bit-preserving.

This is tested against the soft-float ABI, but with hardware float
extensions enabled, so that the tests also ensure the optimisation also
fires in this case.

Reviewers: asb, luismarques

Reviewed By: asb

Subscribers: hiraditya, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62900

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362790 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Constrain the AMDGPU inliner on maximum number of basic blocks in a caller function (compile time performance)

Differential revision: https://reviews.llvm.org/D62917

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362789 91177308-0d34-0410-b5e6-96231b3b80d8

Work around a circular dependency between IR and MC introduced in r362735

I replaced the circular library dependency with a forward declaration,
but it is only a workaround, not a real fix.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362782 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][AsmParser] error on unexpected SVE predicate type suffix

Summary:
This patch fixes a bug in the assembler that permitted a type suffix on
predicate registers when not expected. For instance, the following was
previously valid:

    faddv h0, p0.q, z1.h

This bug was present in all SVE instructions containing predicates with
no type suffix and no predication form qualifier, i.e. /z or /m. The
latter instructions are already caught with an appropiate error message
by the assembler, e.g.:

            .text
    <stdin>:1:13: error: not expecting size suffix
    cmpne p1.s, p0.b/z, z2.s, 0
                ^

A similar issue for SVE vector registers was fixed in:

  https://reviews.llvm.org/D59636

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D62942

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362780 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][AsmParser] Provide better diagnostics for SVE predicates

Patch by Sander de Smalen (sdesmalen)

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D62941

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362779 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] - Emit error and don't crash if program header reaches past end of file.

This is https://bugs.llvm.org/show_bug.cgi?id=42122.

If an object file has a size less than program header's file [offset + size]
(i.e. if we have overflow), llvm-objcopy crashes instead of reporting a
error.

The patch fixes this issue.

Differential revision: https://reviews.llvm.org/D62898

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362778 91177308-0d34-0410-b5e6-96231b3b80d8

[yaml2elf] - Refactoring followup for D62809

This is a refactoring follow-up for D62809
"Change how we handle implicit sections.".
It allows to simplify the code.

Differential revision: https://reviews.llvm.org/D62912

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362777 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] -march=cooperlake (llvm)

Support intel -march=cooperlake in llvm

Patch by Shengchen Kan (skan)

Differential Revision: https://reviews.llvm.org/D62836

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362776 91177308-0d34-0410-b5e6-96231b3b80d8

Fix for lld buildbot

Removed unused (in non-debug builds) variable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362775 91177308-0d34-0410-b5e6-96231b3b80d8