
[WebAssembly] Split CFG-sorting into its own pass. NFC.

CFG sorting was already an independent algorithm from block/loop insertion;
this change makes it more convenient to debug.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296399 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r296366 "[InlineFunction] add nonnull assumptions based on argument attributes"

It causes miscompiles e.g. during self-host of Clang (PR32082).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296398 91177308-0d34-0410-b5e6-96231b3b80d8

Add missing namespace qualifier.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296397 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Support v2i16/v2f16 packed operations

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296396 91177308-0d34-0410-b5e6-96231b3b80d8

ISel: We need to notify FastIS of the IMPLICIT_DEF we created in createSwiftErrorEntriesInEntryBlock

Otherwise, it will insert instructions before it.

rdar://30536186

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296395 91177308-0d34-0410-b5e6-96231b3b80d8

[PDB] Partial resubmit of r296215, which improved PDB Stream Library.

This was reverted because it was breaking some builds, and
because of incorrect error code usage.  Since the CL was
large and contained many different things, I'm resubmitting
it in pieces.

This portion is NFC, and consists of:

1) Renaming classes to follow a consistent naming convention.
2) Fixing the const-ness of the interface methods.
3) Adding detailed doxygen comments.
4) Fixing a few instances of passing `const BinaryStream& X`.  These
   are now passed as `BinaryStreamRef X`.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296394 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "DAG: Check if extract_vector_elt is legal or custom"

This reverts r295782. This could potentially result in some
legalization loops and I avoided the need for this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296393 91177308-0d34-0410-b5e6-96231b3b80d8

Empty line. NFCI

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296392 91177308-0d34-0410-b5e6-96231b3b80d8

[PGO] Fix a bug in reading text format value profile.

Summary: Should use the Valuekind read from the profile.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits, xur

Differential Revision: https://reviews.llvm.org/D30420

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296391 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] don't transform an add(ext Cond), C to select unless there's a setcc of the condition

The transform in question claims to be doing:

// fold (add (select cc, 0, c), x) -> (select cc, x, (add, x, c))

...starting in PerformADDCombineWithOperands(), but it wasn't actually checking for a setcc node
for the sext/zext patterns.
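
As a rough illustration of the fold named in the comment above, the two forms can be written as plain C++ functions over 32-bit unsigned values. The function names are hypothetical and this models only the arithmetic identity, not the SelectionDAG code:

```cpp
#include <cassert>
#include <cstdint>

// Left-hand side of the fold: (add (select cc, 0, c), x).
std::uint32_t addOfSelect(bool cc, std::uint32_t c, std::uint32_t x) {
  return (cc ? 0u : c) + x;
}

// Right-hand side of the fold: (select cc, x, (add x, c)).
std::uint32_t selectOfAdd(bool cc, std::uint32_t c, std::uint32_t x) {
  return cc ? x : x + c;
}

// Asserts that both forms agree for one input triple (wrapping arithmetic).
void checkFold(bool cc, std::uint32_t c, std::uint32_t x) {
  assert(addOfSelect(cc, c, x) == selectOfAdd(cc, c, x));
}
```

Both forms compute the same value; the combine only pays off when cc really is a setcc, and that is the check the message says was missing for the sext/zext patterns.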

This is exactly the opposite of a transform I'd like to add to DAGCombiner's foldSelectOfConstants(),
so I was seeing infinite loops with my draft of a patch applied.

The changes in select_const.ll look positive (fewer instructions). The change in arm-and-tst-peephole.ll
is unrelated. We're changing the input IR in that test to preserve the intent of the test, but that's
not affected by this code change.

Differential Revision:
https://reviews.llvm.org/D30355

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296389 91177308-0d34-0410-b5e6-96231b3b80d8

[Support][Error] Add a 'cantFail' utility function for known-safe calls to
fallible functions.

Some fallible functions (those returning Error or Expected<T>) may only fail
for a subset of their inputs. For example, a "safe" square root function will
succeed for all finite non-negative inputs:

  Expected<double> safeSqrt(double d) {
    // Reject negative values, NaNs, and infinities.
    if (d < 0 || isnan(d) || isinf(d))
      return make_error<...>("Cannot sqrt -ve values, nans or infs");
    return sqrt(d);
  }

At a safe callsite for such a function, checking the error return value is
redundant:

  if (auto ValOrErr = safeSqrt(42.0)) {
    // use *ValOrErr.
  } else
    llvm_unreachable("safeSqrt should always succeed for +ve values");

The cantFail function wraps this check and extracts the contained value,
simplifying control flow:

  double Result = cantFail(safeSqrt(42.0));

This function should be used with care: it is a programmatic error to wrap a
call with cantFail if it can in fact fail. For debug builds this will
result in llvm_unreachable being called. For release builds the behavior is
undefined.

Use of this function is likely to be rare in library code, but more common
for tool and unit-test code where inputs and mock functions may be known to be
safe.
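
A self-contained sketch of the same idea without the LLVM headers, for readers who want to try the pattern in isolation. MaybeDouble, safeSqrtSketch and cantFailSketch are hypothetical stand-ins for Expected<double>, safeSqrt and cantFail; the real utility operates on Error and Expected<T>:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for Expected<double>: Ok == false plays the role of
// the error state.
struct MaybeDouble {
  bool Ok;
  double Value;
};

// Fails for negative values, NaNs, and infinities; succeeds otherwise.
MaybeDouble safeSqrtSketch(double d) {
  if (d < 0 || std::isnan(d) || std::isinf(d))
    return MaybeDouble{false, 0.0};
  return MaybeDouble{true, std::sqrt(d)};
}

// Hypothetical stand-in for cantFail: unwraps the value and treats a failure
// as a programmer error (checked only when assertions are enabled).
double cantFailSketch(MaybeDouble ValOrErr) {
  assert(ValOrErr.Ok && "cantFailSketch called on a failing result");
  return ValOrErr.Value;
}
```

A call such as cantFailSketch(safeSqrtSketch(42.0)) then reads like the one-liner above, with the failure path collapsed into the assertion.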

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296384 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Add some of the new gfx9 VOP3 instructions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296382 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Attempt to extract vector elements through target shuffles

DAGCombiner already supports peeking through shuffles to improve vector element extraction, but legalization often leaves us in situations where we need to extract vector elements after shuffles have already been lowered.

This patch adds support for VECTOR_EXTRACT_ELEMENT/PEXTRW/PEXTRB instructions to attempt to handle target shuffles as well. I've covered some basic scenarios, including handling shuffle mask scaling and the implicit zero-extension of PEXTRW/PEXTRB. There is more that could be done here (which I've mentioned in TODOs), but I haven't found many cases where it's worth it.

Differential Revision: https://reviews.llvm.org/D30176

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296381 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Support inlineasm for packed instructions

Add packed types as legal so they may be used with inlineasm.
Keep all operations expanded for now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296379 91177308-0d34-0410-b5e6-96231b3b80d8

[SLP] Use different flags in tests for reduction ops and extra args.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296376 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Don't fold immediate if clamp/omod are set

Doesn't fix any practical problems because clamp/omod
are currently folded after peephole optimizer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296375 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fold omod into instructions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296372 91177308-0d34-0410-b5e6-96231b3b80d8

[TailDuplicator] Maintain DebugLoc for branch instructions

Summary: The existing implementation of the duplicateSimpleBB function drops the DebugLoc metadata of branch instructions during the transformation. This patch addresses the issue by making newly created branch instructions keep the metadata of the branch instructions they replace.

Reviewers: qcolombet, craig.topper, aprantl, MatzeB, sanjoy, dblaikie

Reviewed By: dblaikie

Subscribers: dblaikie, llvm-commits

Differential Revision: https://reviews.llvm.org/D30026

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296371 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Add f16 to shader calling conventions

Mostly useful for writing tests for f16 features.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296370 91177308-0d34-0410-b5e6-96231b3b80d8

[SLP] Modify test to check IR flags propagation for extra args.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296369 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Add VOP3P instruction format

Add a few non-VOP3P instructions related to packed operations.

Includes hack with dummy operands for the benefit of the assembler

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296368 91177308-0d34-0410-b5e6-96231b3b80d8

Refactor xaluo.ll and xmulo.ll tests. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296367 91177308-0d34-0410-b5e6-96231b3b80d8

[InlineFunction] add nonnull assumptions based on argument attributes

This was suggested in D27855: have the inliner add assumptions, so we don't
lose nonnull info provided by argument attributes.

This still doesn't solve PR28430 (dyn_cast), but this gets us closer.

https://reviews.llvm.org/D29999

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296366 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Defs and clobbers can overlap

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296365 91177308-0d34-0410-b5e6-96231b3b80d8

Fix a bug when unswitching on partial LIV for SwitchInst

Summary: Fix a bug when unswitching on partial LIV for SwitchInst.

Reviewers: hfinkel, efriedma, sanjoy

Reviewed By: sanjoy

Subscribers: david2050, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D29107

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296363 91177308-0d34-0410-b5e6-96231b3b80d8

Fix comments. NFC.

Change "Thin-LTO" to "ThinLTO" in the comments for consistency.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296362 91177308-0d34-0410-b5e6-96231b3b80d8

Fix LLVM module build

Add WasmRelocs/WebAssembly.def to textual include header.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296356 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Use APInt instead of SmallBitVector tracking undef elements from getTargetConstantBitsFromNode and getConstVector.

Summary:
SmallBitVector uses a malloc for more than 58 bits on a 64-bit target and more than 27 bits on a 32-bit target. Some of the vector types we deal with here have more than that number of elements and therefore cause a malloc.

APInt on the other hand supports up to 64 bits without a malloc. That's the maximum number of bits we need here so we can avoid a malloc for all cases by using APInt.
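
To make the "no malloc up to 64 bits" point concrete, here is a minimal sketch of a fixed-width undef-element mask held in a single 64-bit word. UndefMask64 is a hypothetical type for illustration only; it is not APInt and not the code in this patch:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical fixed-width mask: one bit per vector element, up to 64
// elements, stored inline so no heap allocation is ever needed.
struct UndefMask64 {
  std::uint64_t Bits = 0;

  // Marks element Idx as undef.
  void set(unsigned Idx) {
    assert(Idx < 64 && "element index out of range");
    Bits |= std::uint64_t(1) << Idx;
  }

  // Returns true if element Idx is marked undef.
  bool test(unsigned Idx) const {
    assert(Idx < 64 && "element index out of range");
    return ((Bits >> Idx) & 1) != 0;
  }

  // Counts the undef elements by clearing the lowest set bit each iteration.
  unsigned count() const {
    unsigned N = 0;
    for (std::uint64_t B = Bits; B != 0; B &= B - 1)
      ++N;
    return N;
  }
};
```

The trade-off is a hard upper bound on the element count, which the message above says is acceptable here because 64 is the maximum number of bits needed.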

Reviewers: RKSimon

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30392

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296355 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Use APInt instead of SmallBitVector for tracking Zeroable elements in shuffle lowering

Summary:
SmallBitVector uses a malloc for more than 58 bits on a 64-bit target and more than 27 bits on a 32-bit target. Some of the vector types we deal with here have more than that number of elements and therefore cause a malloc.

APInt on the other hand supports up to 64 bits without a malloc. That's the maximum number of bits we need here so we can avoid a malloc for all cases by using APInt.

Reviewers: RKSimon

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30390

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296354 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Fix SmallVector sizes in constant pool shuffle decoding to avoid heap allocation

Some of the vectors were too small to avoid heap allocation. In one case the vector was oversized.

Differential Revision: https://reviews.llvm.org/D30387

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296353 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Use APInt instead of SmallBitVector for tracking undef elements in constant pool shuffle decoding

Summary:
SmallBitVector uses a malloc for more than 58 bits on a 64-bit target and more than 27 bits on a 32-bit target. Some of the vector types we deal with here have more than that number of elements and therefore cause a malloc.

APInt on the other hand supports up to 64 bits without a malloc. That's the maximum number of bits we need here so we can avoid a malloc for all cases by using APInt. This will incur a minor increase in stack usage due to APInt storing the bit count separately from the data bits unlike SmallBitVector, but that should be ok.

Reviewers: RKSimon

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30386

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296352 91177308-0d34-0410-b5e6-96231b3b80d8

Remove an empty line in icmp-illegal.ll . NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296350 91177308-0d34-0410-b5e6-96231b3b80d8

[SLP] A test for a fix of PR32038.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296349 91177308-0d34-0410-b5e6-96231b3b80d8

Loop predication expand both sides of the widened condition

This is a fix for a loop predication bug which resulted in malformed IR generation.

The loop-invariant side of the widened condition is not guaranteed to be available in the preheader as is, so we need to expand it as well. See the added unsigned_loop_0_to_n_hoist_length test for an example.

Reviewed By: sanjoy, mkazantsev

Differential Revision: https://reviews.llvm.org/D30099

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296345 91177308-0d34-0410-b5e6-96231b3b80d8

AArch64InstPrinter: rewrite of printSysAlias

This is a cleanup/rewrite of the printSysAlias function. This was not using the
tablegen instruction descriptions, but was "manually" decoding the
instructions. This has been replaced with calls to lookup_XYZ_ByEncoding
tablegen calls.

This revealed several problems. First, instruction IVAU had the wrong encoding.
This was cancelled out by the parser that incorrectly matched the wrong
encoding. Second, instruction CVAP was missing from the SystemOperands tablegen
descriptions, so this has been added. And third, the required target features
were not captured in the tablegen descriptions, so support for this has also
been added.

Differential Revision: https://reviews.llvm.org/D30329

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296343 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] LSL #0 is an alias of MOV

Currently we handle this correctly in arm, but in thumb we don't which leads to
an unpredictable instruction being emitted for LSL #0 in an IT block and SP not
being permitted in some cases when it should be.

For the thumb2 LSL we can handle this by making LSL #0 an alias of MOV in the
.td file, but for thumb1 we need to handle it in checkTargetMatchPredicate to
get the IT handling right. We also need to adjust the handling of
MOV rd, rn, LSL #0 to avoid generating the 16-bit encoding in an IT block. We
should also adjust it to allow SP in the same way that it is allowed in
MOV rd, rn, but I haven't done that here because it looks like it would take
quite a lot of work to get right.

Additionally correct the selection of the 16-bit shift instructions in
processInstruction, where it was checking if the two registers were equal when
it should have been checking if they were low. It appears that previously this
code was never executed and the 16-bit encoding was selected by default, but
the other changes I've done here have somehow made it start being used.

Differential Revision: https://reviews.llvm.org/D30294

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296342 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombine] Fix for a load combine bug with non-zero offset patterns on BE targets

This pattern is essentially an i16 load from the address p+1:

  %p1.i16 = bitcast i8* %p to i16*
  %p2.i8 = getelementptr i8, i8* %p, i64 2
  %v1 = load i16, i16* %p1.i16
  %v2.i8 = load i8, i8* %p2.i8
  %v2 = zext i8 %v2.i8 to i16
  %v1.shl = shl i16 %v1, 8
  %res = or i16 %v1.shl, %v2

The current implementation would identify the %v1 load as the first byte load and would mistakenly emit an i16 load from the %p1.i16 address. This patch adds a check that the first byte is loaded from a non-zero offset of the first load address. This way this address can be used as the base address for the combined value. Otherwise, just give up combining.
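
The claim that the pattern amounts to a load from p+1 on a big-endian target can be checked with a small byte-level model. This is only a sketch: loadBE16 and patternBE are hypothetical helpers that mimic the IR on a byte buffer and are not part of DAGCombiner:

```cpp
#include <cassert>
#include <cstdint>

// Big-endian i16 load from two consecutive bytes.
std::uint16_t loadBE16(const std::uint8_t *P) {
  return static_cast<std::uint16_t>((P[0] << 8) | P[1]);
}

// Mirrors the IR pattern as it would evaluate on a big-endian target.
std::uint16_t patternBE(const std::uint8_t *P) {
  std::uint16_t V1 = loadBE16(P);                             // %v1
  std::uint16_t V2 = P[2];                                    // %v2 (zext of %v2.i8)
  std::uint16_t V1Shl = static_cast<std::uint16_t>(V1 << 8);  // %v1.shl, truncated to i16
  return static_cast<std::uint16_t>(V1Shl | V2);              // %res
}
```

For the bytes 0x11, 0x22, 0x33 the pattern yields 0x2233, which equals loadBE16(p + 1) and differs from loadBE16(p), so a combined load has to start at offset 1.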

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296336 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombine] NFC. MatchLoadCombine extract MemoryByteOffset lambda helper

This refactoring will simplify the upcoming change to fix the bug in folding patterns with non-zero offsets on BE targets.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296332 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombine] NFC. MatchLoadCombine remember the first byte provider, not the load node

This refactoring will simplify the upcoming change to fix a bug in folding patterns with non-zero offsets on BE targets.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296331 91177308-0d34-0410-b5e6-96231b3b80d8

AArch64AsmParser: don't try to parse “[1]” for non-vector register operands

There are no instructions that have "[1]" as part of the assembly string;
FMOVXDhighr is out of date. This removes dead code.

Differential Revision: https://reviews.llvm.org/D30165

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296327 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Runtime metadata fixes:
  - Verify that runtime metadata is actually valid runtime metadata when assembling, otherwise we could accept the following when assembling, but ocl runtime will reject it:
    .amdgpu_runtime_metadata
    { amd.MDVersion: [ 2, 1 ], amd.RandomUnknownKey, amd.IsaInfo: ...
  - Make IsaInfo optional, and always emit it.

Differential Revision: https://reviews.llvm.org/D30349

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296324 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-mc-fuzzer: add support for assembly

This creates an llvm-mc-disassemble-fuzzer from the existing llvm-mc-fuzzer
and finishes the assembler support in llvm-mc-assemble-fuzzer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296323 91177308-0d34-0410-b5e6-96231b3b80d8

[APInt] Use UINT64_MAX instead of ~integerPart(0). NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296322 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Check for less than 0 rather than explicit compare with -1. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296321 91177308-0d34-0410-b5e6-96231b3b80d8

Do full codegen for various tests. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296305 91177308-0d34-0410-b5e6-96231b3b80d8

[APInt] Use UINT64_MAX instead of ~uint64_t(0ULL). NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296301 91177308-0d34-0410-b5e6-96231b3b80d8

[APInt] Use UINT64_MAX instead of ~0ULL. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296300 91177308-0d34-0410-b5e6-96231b3b80d8

[APInt] Remove unnecessary early out from getLowBitsSet. The same case is handled equally well by the next check.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296299 91177308-0d34-0410-b5e6-96231b3b80d8

Update comments. NFCI

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296298 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[CGP] Split some critical edges coming out of indirect branches"

This reverts commit r296149 as it leads to crashes when compiling for
PPC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296295 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopDeletion] Modernize and simplify a bit. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296294 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Fix execution domain for cmpss/sd instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296293 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Fix execution domain for scalar commutable min/max instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296292 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Fix execution domain for vmovhpd/lpd/hps/lps.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296291 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Fix the execution domain for AVX-512 integer broadcasts.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296290 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Disable the redundant patterns in the VPBROADCASTBr_Alt and VPBROADCASTWr_Alt instructions. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296289 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Fix execution domain for VPMADD52 instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296288 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Use update_llc_test_checks.py to regenerate a test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296287 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Fix the execution domain for VSCALEF instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296286 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Fix execution domain of scalar VRANGE/REDUCE/GETMANT with sae.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296285 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Fix the execution domain for scalar SQRT intrinsic instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296284 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add an additional CHECK prefix to a test. Some of the cases used it, but it wasn't on the FileCheck command lines.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296283 91177308-0d34-0410-b5e6-96231b3b80d8

[SCCP] Remove manual folding of terminator instructions.

Summary:
A BranchInst or SwitchInst (with a non-default case) with undef as its input is not
possible at this point, because we always default-fold the terminator to one target in
ResolvedUndefsIn and set the input accordingly.

So we should only have constantint/blockaddress here.

If ConstantFoldTerminator fails, that could mean two things:

1. ConstantFoldTerminator is doing something unexpected, i.e. not folding on constantint
or blockaddress and not making blocks that should be dead dead.
2. This is not a terminator on constantint or blockaddress. It's on a constant or
overdefined, in which case this block should not be dead.

In both cases, we should assert.

Reviewers: davide, efriedma, sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30381

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296281 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Clean up test/CodeGen/X86/2006-03-02-InstrSchedBug.ll

Summary:
Migrated from grep to FileCheck.
Re-indented code, removed boilerplate comments.
Added 'entry' label at beginning of basic block.

Patch by Jorge Gorbe!

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30320

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296280 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled."

This reverts commit r296252 until 256-bit operations are more efficiently generated in X86.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296279 91177308-0d34-0410-b5e6-96231b3b80d8

vec perm can go down either pipeline on P8.
No observable changes, spotted while looking at the scheduling description.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296277 91177308-0d34-0410-b5e6-96231b3b80d8

Fix signed-unsigned comparison warning

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296274 91177308-0d34-0410-b5e6-96231b3b80d8

[ValueTracking] Don't do an unchecked shift in ComputeNumSignBits

Summary:
Previously we used to return a bogus result, 0, for IR like `ashr %val,
-1`.

I've also added an assert checking that `ComputeNumSignBits` at least
returns 1. That assert found an already checked in test case where we
were returning a bad result for `ashr %val, -1`.

Fixes PR32045.
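
A minimal model of the idea, assuming a 32-bit type: a sign-bit counter that always returns at least 1, plus a bound for an arithmetic shift right that checks the shift amount before using it. Both functions are hypothetical illustrations and do not reproduce the ValueTracking code:

```cpp
#include <cassert>
#include <cstdint>

// Counts how many leading bits of a 32-bit value equal its sign bit.
// The result is always in the range [1, 32].
unsigned numSignBits(std::int32_t V) {
  std::uint32_t U = static_cast<std::uint32_t>(V);
  if (V < 0)
    U = ~U; // Flip negative values so the count becomes a leading-zero count.
  unsigned N = 0;
  for (std::uint32_t Bit = 0x80000000u; Bit != 0 && (U & Bit) == 0; Bit >>= 1)
    ++N;
  return N;
}

// Lower bound on the sign bits of `ashr X, ShAmt`, given the sign bits of X.
// An out-of-range shift amount gets the conservative answer 1 instead of
// feeding an unchecked value into the addition.
unsigned signBitsAfterAshr(unsigned SignBitsOfX, std::uint64_t ShAmt) {
  const unsigned TyBits = 32;
  assert(SignBitsOfX >= 1 && SignBitsOfX <= TyBits);
  if (ShAmt >= TyBits)
    return 1;
  unsigned Sum = SignBitsOfX + static_cast<unsigned>(ShAmt);
  return Sum > TyBits ? TyBits : Sum;
}
```

A shift amount of -1 corresponds to an all-ones value here, so the range check rejects it before the addition.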

Reviewers: spatel, majnemer

Reviewed By: spatel, majnemer

Subscribers: efriedma, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D30311

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296273 91177308-0d34-0410-b5e6-96231b3b80d8

[APInt] Add APInt::extractBits() method to extract APInt subrange (reapplied)

The current pattern for extracting a range of bits is typically:

Mask.lshr(BitOffset).trunc(SubSizeInBits);

This can be particularly slow for large APInts (MaskSizeInBits > 64), as it requires allocating memory for the temporary variable.

This is another of the compile time issues identified in PR32037 (see also D30265).

This patch adds the APInt::extractBits() helper method which avoids the temporary memory allocation.
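
The idea of reading a sub-range directly, rather than shifting the whole value into a temporary and truncating, can be sketched over a plain array of 64-bit words. extractBitsSketch is a hypothetical helper that returns at most one word; the real method returns an APInt and its signature is not reproduced here:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Reads NumBits (1..64) starting at BitPosition from a value stored as
// 64-bit words, least significant word first, without building a shifted
// full-width temporary.
std::uint64_t extractBitsSketch(const std::vector<std::uint64_t> &Words,
                                unsigned NumBits, unsigned BitPosition) {
  assert(NumBits >= 1 && NumBits <= 64 && "sketch returns at most one word");
  assert(BitPosition + NumBits <= Words.size() * 64 && "range out of bounds");
  unsigned LoWord = BitPosition / 64;
  unsigned LoBit = BitPosition % 64;
  std::uint64_t Result = Words[LoWord] >> LoBit;
  // Pull in the high part from the next word if the range straddles a word
  // boundary. LoBit is non-zero in that case, so the shift stays below 64.
  if (LoBit != 0 && LoBit + NumBits > 64)
    Result |= Words[LoWord + 1] << (64 - LoBit);
  // Mask the result down to NumBits.
  if (NumBits < 64)
    Result &= (std::uint64_t(1) << NumBits) - 1;
  return Result;
}
```

Only the one or two words that overlap the requested range are touched, which is where the saving over shifting the full-width value comes from.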

Differential Revision: https://reviews.llvm.org/D30336

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296272 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Fix the execution domain for scalar FMA instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296271 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Fix the execution domain on some instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296270 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Add an additional test case to show the execution domain for vrqsrtsd is wrong.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296269 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Use update_llc_test_checks.py to regenerate the avx512er intrinsic test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296268 91177308-0d34-0410-b5e6-96231b3b80d8

reenable accidentally disabled test NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296266 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Remove unnecessary masked versions of VCVTSS2SD and VCVTSD2SS using the scalar register class. We only have patterns for the masked intrinsics.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296264 91177308-0d34-0410-b5e6-96231b3b80d8

[ExecutionDepsFix] Don't make copies of LiveReg objects when collecting operands for soft instructions

Summary:
While collecting operands we make copies of the LiveReg objects which are stored in the LiveRegs array. If the instruction uses the same register multiple times we end up with multiple copies. Later we iterate through the collected list of LiveReg objects and merge DomainValues. In the process of doing this the merge function can change the contents of the original LiveReg object in the LiveRegs array, but not the copies that have been made. So when we get to the second usage of the register we end up seeing a stale copy of the LiveReg object.

To fix this I've stopped copying and now just store a pointer to the original LiveReg object. Another option might be to avoid adding the same register to the Regs array twice, but this approach seemed simpler.
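
The stale-copy problem can be reproduced in a few lines. LiveRegSketch and the helpers below are hypothetical and heavily simplified; they only model "collect the same register twice, then merge through the first entry":

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-in for the pass's per-register state.
struct LiveRegSketch {
  int Domain;
};

// Simulates a merge that rewrites the original entry.
void mergeInto(LiveRegSketch &Original, int NewDomain) {
  Original.Domain = NewDomain;
}

// Collects the same register twice by value, merges the original, and
// returns what the second use observes: the stale, pre-merge domain.
int secondUseSeenByCopy(LiveRegSketch &Reg, int NewDomain) {
  assert(Reg.Domain != NewDomain && "the demo needs an actual change");
  std::vector<LiveRegSketch> Regs = {Reg, Reg};
  mergeInto(Reg, NewDomain);
  return Regs[1].Domain;
}

// Same sequence, but the collected list stores pointers to the original, so
// the second use observes the merged domain.
int secondUseSeenByPointer(LiveRegSketch &Reg, int NewDomain) {
  assert(Reg.Domain != NewDomain && "the demo needs an actual change");
  std::vector<LiveRegSketch *> Regs = {&Reg, &Reg};
  mergeInto(*Regs[0], NewDomain);
  return Regs[1]->Domain;
}
```

Storing pointers keeps every collected entry in sync with the original object, which is the approach described above.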

The included test case exposes this bug due to an AVX-512 masked OR instruction using the same register for the passthru operand and one of the inputs to the OR operation.

Fixes PR30284.

Reviewers: RKSimon, stoklund, MatzeB, spatel, myatsina

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30242

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296260 91177308-0d34-0410-b5e6-96231b3b80d8

No need to copy the variable [NFC]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296259 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r296215, "[PDB] General improvements to Stream library." and follow-ups.

r296215, "[PDB] General improvements to Stream library."
r296217, "Disable BinaryStreamTest.StreamReaderObject temporarily."
r296220, "Re-enable BinaryStreamTest.StreamReaderObject."
r296244, "[PDB] Disable some tests that are breaking bots."
r296249, "Add static_cast to silence -Wc++11-narrowing."

std::errc::no_buffer_space should be used for OS-oriented errors for socket transmission.
(See the discussions around llvm/xray.)

I could substitute s/no_buffer_space/others/g, but I am reverting all of them for now.

Could we define and use LLVM errors there?

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296258 91177308-0d34-0410-b5e6-96231b3b80d8

Update various tests' codegen. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296257 91177308-0d34-0410-b5e6-96231b3b80d8

Add test for known bits in uaddo and saddo.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296255 91177308-0d34-0410-b5e6-96231b3b80d8

The automatic CHECK: to CHECK-LABEL: conversion, back in 2013,
had missed most labels in this test because they didn't end
with a colon.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296254 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] Skip implicit_const attributes when dumping .debug_info. NFC.

When dumping .debug_info section we loop through all attributes mentioned in
.debug_abbrev section and dump values using DWARFFormValue::extractValue().
We need to skip implicit_const attributes here as their values are not
really located in .debug_info but directly in .debug_abbrev. This patch fixes
an assert() triggered in DWARFFormValue::extractValue() by trying to
access implicit_const values from .debug_info.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296253 91177308-0d34-0410-b5e6-96231b3b80d8

In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.

    Recommitting after fixup of 32-bit aliasing sign offset bug in DAGCombiner.

    * Simplify Consecutive Merge Store Candidate Search

    Now that address aliasing is much less conservative, push through
    simplified store merging search and chain alias analysis which only
    checks for parallel stores through the chain subgraph. This is cleaner
    as it separates non-interfering loads/stores from the
    store-merging logic.

    When merging stores, search up the chain through a single load, and
    find all possible stores by looking down through a load and a
    TokenFactor to all stores visited.

    This improves the quality of the output SelectionDAG and the output
    Codegen (save perhaps for some ARM cases where we correctly construct
    wider loads, but then promote them to float operations which appear
    to require more expensive constant generation).

    Some minor peephole optimizations to deal with improved SubDAG shapes (listed below)

    Additional Minor Changes:

      1. Finishes removing unused AliasLoad code

      2. Unifies the chain aggregation in the merged stores across code
         paths

      3. Re-add the Store node to the worklist after calling
         SimplifyDemandedBits.

      4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
         arbitrary, but seems sufficient to not cause regressions in
         tests.

      5. Remove Chain dependencies of Memory operations on CopyfromReg
         nodes as these are captured by data dependence

      6. Forward load-store values through tokenfactors containing
         {CopyToReg,CopyFromReg} Values.

      7. Peephole to convert buildvector of extract_vector_elt to
         extract_subvector if possible (see
         CodeGen/AArch64/store-merge.ll)

      8. Store merging for the ARM target is restricted to 32-bit as
         in some contexts invalid 64-bit operations are being
         generated. This can be removed once appropriate checks are
         added.

    This finishes the change Matt Arsenault started in r246307 and
    jyknight's original patch.

    Many tests required some changes as memory operations are now
    reorderable, improving load-store forwarding. One test in
    particular is worth noting:

      CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store
      forwarding converts a load-store pair into a parallel store and
      a memory-realized bitcast of the same value. However, because we
      lose the sharing of the explicit and implicit store values we
      must create another local store. A similar transformation
      happens before SelectionDAG as well.

    Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296252 91177308-0d34-0410-b5e6-96231b3b80d8

[Doc] Modernize programmers manual

Summary:
Converted a bunch of for loops to range-based for loops
and replaced a bunch of redundant types with auto.
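
For illustration, the kind of rewrite described looks roughly like this. The functions are hypothetical examples and are not taken from the manual:

```cpp
#include <cassert>
#include <vector>

// Before: explicit iterator types and a hand-written iterator loop.
int sumOldStyle(const std::vector<int> &Values) {
  int Sum = 0;
  for (std::vector<int>::const_iterator I = Values.begin(), E = Values.end();
       I != E; ++I)
    Sum += *I;
  return Sum;
}

// After: a range-based for loop, with auto in place of the spelled-out
// element type.
int sumModern(const std::vector<int> &Values) {
  int Sum = 0;
  for (auto V : Values)
    Sum += V;
  return Sum;
}
```

Both functions compute the same sum; the second simply removes the iterator boilerplate.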

Reviewers: echristo, silvas, chandlerc

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D30338

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296251 91177308-0d34-0410-b5e6-96231b3b80d8

Empty line. NFCI

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296250 91177308-0d34-0410-b5e6-96231b3b80d8

Add static_cast to silence -Wc++11-narrowing.
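
The commit does not say which expression was affected. As a generic sketch
of the pattern, Clang reports -Wc++11-narrowing when a braced initializer
would narrow a value, and an explicit static_cast states that the
conversion is intended. The Header type and makeHeader function below are
hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical record with a 32-bit size field.
struct Header {
  uint32_t Size;
};

Header makeHeader(std::size_t NumBytes) {
  // On a 64-bit target, `Header{NumBytes}` narrows std::size_t to uint32_t
  // inside a braced initializer, which Clang diagnoses with
  // -Wc++11-narrowing.
  assert(NumBytes <= UINT32_MAX && "value must fit in 32 bits");
  // The cast makes the conversion explicit, so no diagnostic is issued.
  return Header{static_cast<uint32_t>(NumBytes)};
}
```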

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296249 91177308-0d34-0410-b5e6-96231b3b80d8

[PDB] Disable some tests that are breaking bots.

This has to do with big endian, but I can't fix it until
Monday. The code itself is fine, just the tests are wrong.
Disabling 3 tests for now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296244 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/SI: export s_waitcnt builtin

Differential Revision: https://reviews.llvm.org/D30358

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296228 91177308-0d34-0410-b5e6-96231b3b80d8

Minor code cleanup. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296222 91177308-0d34-0410-b5e6-96231b3b80d8

Re-enable BinaryStreamTest.StreamReaderObject.

I had an invalid pointer / size calculation that was causing
a stack smash. Should be fixed now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296220 91177308-0d34-0410-b5e6-96231b3b80d8

Remove redundant code. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296219 91177308-0d34-0410-b5e6-96231b3b80d8

Clean up ObjCARCOpts.cpp. NFC.

I removed unused functions and variables and moved variables closer to
their uses.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296218 91177308-0d34-0410-b5e6-96231b3b80d8

Disable BinaryStreamTest.StreamReaderObject temporarily.

This is crashing on some bots, so I need some time to investigate.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296217 91177308-0d34-0410-b5e6-96231b3b80d8

[PDB] General improvements to Stream library.

This adds various new functionality and cleanup surrounding the
use of the Stream library. Major changes include:

* Renaming of all classes for more consistency / meaningfulness
* Addition of some new methods for reading multiple values at once.
* Full suite of unit tests for reader / writer functionality.
* Full set of doxygen comments for all classes.
* Streams now store their own endianness.
* Fixed some bugs in a few of the classes that were discovered
  by the unit tests.
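
To illustrate what "streams store their own endianness" and "reading
multiple values at once" can mean in practice, here is a small
self-contained sketch. It is not the LLVM Stream library API; the
ByteReader class, its methods and the Endian enum are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

enum class Endian { Little, Big };

// Hypothetical reader that remembers the byte order of its data, so
// callers do not pass an endianness on every read.
class ByteReader {
public:
  ByteReader(std::vector<uint8_t> Bytes, Endian E)
      : Data(std::move(Bytes)), Order(E) {}

  // Reads one 32-bit value; returns false if fewer than 4 bytes remain.
  bool readU32(uint32_t &Out) {
    if (Data.size() - Offset < 4)
      return false;
    uint32_t V = 0;
    for (std::size_t I = 0; I < 4; ++I) {
      std::size_t Shift = (Order == Endian::Little) ? 8 * I : 8 * (3 - I);
      V |= static_cast<uint32_t>(Data[Offset + I]) << Shift;
    }
    Offset += 4;
    Out = V;
    return true;
  }

  // Reads two values in one call; consumes nothing if 8 bytes are not
  // available.
  bool readU32Pair(uint32_t &A, uint32_t &B) {
    if (Data.size() - Offset < 8)
      return false;
    return readU32(A) && readU32(B);
  }

private:
  std::vector<uint8_t> Data;
  Endian Order;
  std::size_t Offset = 0;
};
```

Keeping the byte order inside the reader means the same calling code works
for little-endian and big-endian data.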

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296215 91177308-0d34-0410-b5e6-96231b3b80d8

Remove svn:eol-style=native from Properties.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296212 91177308-0d34-0410-b5e6-96231b3b80d8

[PDB] Rename Stream related source files.

This is part of a larger effort to get the Stream code moved
up to Support. I don't want to do it in one large patch, in
part because the changes are so big that they would be treated
as file deletions and additions, losing history in the process.
Aside from that though, it's just a good idea in general to
make small changes.

So this change only changes the names of the Stream related
source files, and applies necessary source fix ups.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296211 91177308-0d34-0410-b5e6-96231b3b80d8

[XRAY] A Color Choosing helper for XRay Graph

Summary:
In preparation for graph comparison, this patch breaks out the color
choice code from xray-graph into a library and adds polynomials for
the Sequential and Difference sets from ColorBrewer.

Depends on D29005

Reviewers: dblaikie, chandlerc, dberris

Reviewed By: dberris

Subscribers: chandlerc, llvm-commits, mgorny

Differential Revision: https://reviews.llvm.org/D29363

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296210 91177308-0d34-0410-b5e6-96231b3b80d8

[InlineCost] Move the code in isGEPOffsetConstant to a lambda.

Differential revision: https://reviews.llvm.org/D30112

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296208 91177308-0d34-0410-b5e6-96231b3b80d8

Minor code cleanup. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296207 91177308-0d34-0410-b5e6-96231b3b80d8

[PGO] Directory name stripping in global identifier for static functions

The current internal option -static-func-full-module-prefix keeps the full
directory path in the profile counter names for static functions. The default
of this option is false, which strips the directory names from the source
filename. This is problematic:

(1) it creates linker errors for profile-generation compilation, exposed in
our internal benchmarks. We are seeing messages like
"warning: relocation refers to discarded section".
This is due to the name conflicts after the stripping.

(2) the stripping only applies to getPGOFuncName.
The current ThinLTO module importing for indirect calls assumes that
the source directory name is not stripped. The current default value
of this option can therefore prevent some inter-module
indirect-call promotions.

This patch turns the default value for -static-func-full-module-prefix to true.

The second part of the patch adds an alternative implementation under
the internal option -static-func-strip-dirname-prefix=<value>

This option specifies the number of directory levels to be stripped from the
source filename. Using a large value as the parameter has the same effect as
-static-func-full-module-prefix=false.
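
The effect of stripping directory levels, and the name conflict from (1),
can be sketched as follows. This is a simplified illustration, not the code
in the patch: stripDirLevels and staticFuncId are hypothetical helpers, '/'
is assumed to be the only path separator, and the "file:function" form of
the identifier is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: drop up to NumLevels leading directory components.
// If the path has fewer directories than NumLevels, only the file name
// remains.
std::string stripDirLevels(const std::string &Path, unsigned NumLevels) {
  std::size_t Start = 0;
  unsigned Stripped = 0;
  for (std::size_t I = 0; I < Path.size() && Stripped < NumLevels; ++I) {
    if (Path[I] == '/') {
      Start = I + 1; // the kept part begins after this separator
      ++Stripped;
    }
  }
  return Path.substr(Start);
}

// Hypothetical identifier for a static function: the (possibly stripped)
// source file name joined with the function name.
std::string staticFuncId(const std::string &SourceFile,
                         const std::string &FuncName, unsigned NumLevels) {
  return stripDirLevels(SourceFile, NumLevels) + ":" + FuncName;
}
```

With these helpers, staticFuncId("a/util.c", "init", 0) and
staticFuncId("b/util.c", "init", 0) differ, while stripping one or more
levels makes both "util.c:init", which is the kind of collision described
in (1).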

Differential Revision: http://reviews.llvm.org/D29512

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296206 91177308-0d34-0410-b5e6-96231b3b80d8