granicus.if.org Git

llvm-reduce: Follow-up to 372280, now with more-better msan fixing

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372349 91177308-0d34-0410-b5e6-96231b3b80d8

Don't use invalidated iterators in FlattenCFGPass

Summary:
FlattenCFG may erase unnecessary blocks, which also invalidates iterators to those erased blocks.
Before this patch, `iterativelyFlattenCFG` could try to increment a BB iterator after that BB has been removed and crash.

This patch makes FlattenCFGPass use `WeakVH` to skip over erased blocks.

Reviewers: dblaikie, tstellar, davide, sanjoy, asbirlea, grosser

Reviewed By: asbirlea

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67672

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372347 91177308-0d34-0410-b5e6-96231b3b80d8

[Analysis] Allow -scalar-evolution-max-iterations more than once

At present, `-scalar-evolution-max-iterations` is a `cl::Optional`
option, which means it demands to be passed exactly zero or one times.
Our build system makes it pretty tricky to guarantee this. We often
accidentally pass the flag more than once (but always with the same
value) which results in an error, after which compilation fails:

```
clang (LLVM option parsing): for the -scalar-evolution-max-iterations option: may only occur zero or one times!
```

It seems reasonable to allow -scalar-evolution-max-iterations to be
passed more than once. Quoting the [[ http://llvm.org/docs/CommandLine.html#controlling-the-number-of-occurrences-required-and-allowed | documentation ]]:

> The cl::ZeroOrMore modifier ... indicates that your program will allow the option to be specified zero or more times.
> ...
> If an option is specified multiple times for an option of the cl::opt class, only the last value will be retained.

Original patch by: Enrico Bern Hardy Tanuwidjaja <etanuwid@fb.com>

Differential Revision: https://reviews.llvm.org/D67512

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372346 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][PowerPC] Fast-isel VSX support test

We have fixed most of the VSX limitation in Fast-isel,
so we can remove the -mattr=-vsx for most testcases now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372345 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r372343

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372344 91177308-0d34-0410-b5e6-96231b3b80d8

[SVFS] Vector Function ABI demangling.

This patch implements the demangling functionality as described in the
Vector Function ABI. This patch will be used to implement the
SearchVectorFunctionSystem (SVFS) as described in the RFC:

http://lists.llvm.org/pipermail/llvm-dev/2019-June/133484.html

A fuzzer is added to test the demangling utility.

Patch by Sumedh Arani <sumedh.arani@arm.com>

Differential revision: https://reviews.llvm.org/D66024

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372343 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Simplify @llvm.usub.with.overflow+non-zero check (PR43251)

Summary:
This is again motivated by D67122 sanitizer check enhancement.
That patch seemingly worsens `-fsanitize=pointer-overflow`
overhead from 25% to 50%, which strongly implies missing folds.

In this particular case, given
```
char* test(char& base, unsigned long offset) {
  return &base - offset;
}
```
it will end up producing something like
https://godbolt.org/z/luGEju
which after optimizations reduces down to roughly
```
declare void @use64(i64)
define i1 @test(i8* dereferenceable(1) %base, i64 %offset) {
  %base_int = ptrtoint i8* %base to i64
  %adjusted = sub i64 %base_int, %offset
  call void @use64(i64 %adjusted)
  %not_null = icmp ne i64 %adjusted, 0
  %no_underflow = icmp ule i64 %adjusted, %base_int
  %no_underflow_and_not_null = and i1 %not_null, %no_underflow
  ret i1 %no_underflow_and_not_null
}
```
Without D67122 there was no `%not_null`,
and in this particular case we can "get rid of it", by merging two checks:
Here we are checking: `Base u>= Offset && (Base u- Offset) != 0`, but that is simply `Base u> Offset`

Alive proofs:
https://rise4fun.com/Alive/QOs

The `@llvm.usub.with.overflow` pattern itself is not handled here
because this is the main pattern, that we currently consider canonical.

https://bugs.llvm.org/show_bug.cgi?id=43251

Reviewers: spatel, nikic, xbolva00, majnemer

Reviewed By: xbolva00, majnemer

Subscribers: vsk, majnemer, xbolva00, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67356

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372341 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Unnecessary -amdgpu-scalarize-global-loads=false flag removed from min/max lit tests.

Reviewers: arsenm

Differential Revision: https://reviews.llvm.org/D67712

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372340 91177308-0d34-0410-b5e6-96231b3b80d8

[Float2Int] avoid crashing on unreachable code (PR38502)

In the example from:
https://bugs.llvm.org/show_bug.cgi?id=38502
...we hit infinite looping/crashing because we have non-standard IR -
an instruction operand is used before defined.
This and other unusual constructs are allowed in unreachable blocks,
so avoid the problem by using DominatorTree to step around landmines.

Differential Revision: https://reviews.llvm.org/D67766

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372339 91177308-0d34-0410-b5e6-96231b3b80d8

Reapply r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics"

This reverts r372314, reapplying r372285 and the commits which depend
on it (r372286-r372293, and r372296-r372297)

This was missing one switch to getTargetConstant in an untested case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372338 91177308-0d34-0410-b5e6-96231b3b80d8

[MCA] Improved cost computation for loop carried dependencies in the bottleneck analysis.

This patch introduces a cut-off threshold for dependency edge frequences with
the goal of simplifying the critical sequence computation. This patch also
removes the cost normalization for loop carried dependencies. We didn't really
need to artificially amplify the cost of loop-carried dependencies since it is
already computed as the integral over time of the delay (in cycle).

In the absence of backend stalls there is no need for computing a critical
sequence. With this patch we early exit from the critical sequence computation
if no bottleneck was reported during the simulation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372337 91177308-0d34-0410-b5e6-96231b3b80d8

Make appendCallNB lambda mutable

Lambdas are by deafult const so that they produce the same output every time they are run. This lambda needs to set the value on a captured promise which is a mutating operation, so it must be mutable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372336 91177308-0d34-0410-b5e6-96231b3b80d8

X86: Add missing test for vshli SimplifyDemandedBitsForTargetNode

This would have caught this regression which triggered the revert of
r372285: https://bugs.chromium.org/p/chromium/issues/detail?id=1005750

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372335 91177308-0d34-0410-b5e6-96231b3b80d8

[DAG][X86] Convert isNegatibleForFree/GetNegatedExpression to a target hook (PR42863)

This patch converts the DAGCombine isNegatibleForFree/GetNegatedExpression into overridable TLI hooks and includes a demonstration X86 implementation.

The intention is to let us extend existing FNEG combines to work more generally with negatible float ops, allowing it work with target specific combines and opcodes (e.g. X86's FMA variants).

Unlike the SimplifyDemandedBits, we can't just handle target nodes through a Target callback, we need to do this as an override to allow targets to handle generic opcodes as well. This does mean that the target implementations has to duplicate some checks (recursion depth etc.).

I've only begun to replace X86's FNEG handling here, handling FMADDSUB/FMSUBADD negation and some low impact codegen changes (some FMA negatation propagation). We can build on this in future patches.

Differential Revision: https://reviews.llvm.org/D67557

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372333 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] Add node to the worklist in topological order in scalarizeExtractedVectorLoad

Summary: As per title.

Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66661

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372327 91177308-0d34-0410-b5e6-96231b3b80d8

[docs] Break long (>80) line. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372326 91177308-0d34-0410-b5e6-96231b3b80d8

[Float2Int] auto-generate complete test checks; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372324 91177308-0d34-0410-b5e6-96231b3b80d8

[TableGen] Support encoding per-HwMode

Much like ValueTypeByHwMode/RegInfoByHwMode, this patch allows targets
to modify an instruction's encoding based on HwMode. When the
EncodingInfos field is non-empty the Inst and Size fields of the Instruction
are ignored and taken from EncodingInfos instead.

As part of this promote getHwMode() from TargetSubtargetInfo to MCSubtargetInfo.

This is NFC for all existing targets - new code is generated only if targets
use EncodingByHwMode.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372320 91177308-0d34-0410-b5e6-96231b3b80d8

[DAG] Add SelectionDAG::MaxRecursionDepth constant

As commented on D67557 we have a lot of uses of depth checks all using magic numbers.

This patch adds the SelectionDAG::MaxRecursionDepth constant and moves over some general cases to use this explicitly.

Differential Revision: https://reviews.llvm.org/D67711

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372315 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics"

This broke the Chromium build, causing it to fail with e.g.

fatal error: error in backend: Cannot select: t362: v4i32 = X86ISD::VSHLI t392, Constant:i8<15>

See llvm-commits thread of r372285 for details.

This also reverts r372286, r372287, r372288, r372289, r372290, r372291,
r372292, r372293, r372296, and r372297, which seemed to depend on the
main commit.

> Encode them directly as an imm argument to G_INTRINSIC*.
>
> Since now intrinsics can now define what parameters are required to be
> immediates, avoid using registers for them. Intrinsics could
> potentially want a constant that isn't a legal register type. Also,
> since G_CONSTANT is subject to CSE and legalization, transforms could
> potentially obscure the value (and create extra work for the
> selector). The register bank of a G_CONSTANT is also meaningful, so
> this could throw off future folding and legalization logic for AMDGPU.
>
> This will be much more convenient to work with than needing to call
> getConstantVRegVal and checking if it may have failed for every
> constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth
> immarg operands, many of which need inspection during lowering. Having
> to find the value in a register is going to add a lot of boilerplate
> and waste compile time.
>
> SelectionDAG has always provided TargetConstant for constants which
> should not be legalized or materialized in a register. The distinction
> between Constant and TargetConstant was somewhat fuzzy, and there was
> no automatic way to force usage of TargetConstant for certain
> intrinsic parameters. They were both ultimately ConstantSDNode, and it
> was inconsistently used. It was quite easy to mis-select an
> instruction requiring an immediate. For SelectionDAG, start emitting
> TargetConstant for these arguments, and using timm to match them.
>
> Most of the work here is to cleanup target handling of constants. Some
> targets process intrinsics through intermediate custom nodes, which
> need to preserve TargetConstant usage to match the intrinsic
> expectation. Pattern inputs now need to distinguish whether a constant
> is merely compatible with an operand or whether it is mandatory.
>
> The GlobalISelEmitter needs to treat timm as a special case of a leaf
> node, simlar to MachineBasicBlock operands. This should also enable
> handling of patterns for some G_* instructions with immediates, like
> G_FENCE or G_EXTRACT.
>
> This does include a workaround for a crash in GlobalISelEmitter when
> ARM tries to uses "imm" in an output with a "timm" pattern source.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372314 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] MVE i1 splat

We needn't BFI each lane individually into a predicate register when each lane
in the same. A simple sign extend and a vmsr will do.

Differential Revision: https://reviews.llvm.org/D67653

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372313 91177308-0d34-0410-b5e6-96231b3b80d8

Revert [llvm-ar] Include a line number when failing to parse an MRI script

Revert r372309 due to buildbot failures

Differential Revision: https://reviews.llvm.org/D67449

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372311 91177308-0d34-0410-b5e6-96231b3b80d8

Fix -Wdocumentation "@returns in a void function" warning. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372310 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-ar] Include a line number when failing to parse an MRI script

Errors that occur when reading an MRI script now include a corresponding
line number.

Differential Revision: https://reviews.llvm.org/D67449

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372309 91177308-0d34-0410-b5e6-96231b3b80d8

Fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warning. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372308 91177308-0d34-0410-b5e6-96231b3b80d8

[Unroll] Add an option to control complete unrolling

Add an ability to specify the max full unroll count for LoopUnrollPass pass
in pass options.

Reviewers: fhahn, fedor.sergeev
Reviewed By: fedor.sergeev
Subscribers: hiraditya, zzheng, dmgreen, llvm-commits
Differential Revision: https://reviews.llvm.org/D67701

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372305 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Prevent crash in LowerBUILD_VECTORvXi1 for v64i1 vectors on 32-bit targets when the vector is a mix of constants and non-constant.

We need to materialize the constants as two 32-bit values that
are casted to v32i1 and then concatenated.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372304 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Fix for buildbots

I had missed that massive.mir also needed updating.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372303 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Change a SmallVector& argument to SmallVectorImpl&. NFC

Avoids repeating the size.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372302 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove unused argument from a helper function. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372301 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/SILoadStoreOptimizer: Add const to more functions

Reviewers: arsenm, pendingchaos, rampitec, nhaehnle, vpykhtin

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65901

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372298 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: RegBankSelect llvm.amdgcn.ds.swizzle

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372297 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: RegBankSelect tbuffer load/store

These have the same operand structure as the non-t buffer operations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372296 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Select llvm.amdgcn.raw.buffer.store.format

This needs special handling due to some subtargets that have a
nonstandard register layout for f16 vectors

Also reject some illegal types on other targets.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372293 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Select llvm.amdgcn.raw.buffer.store

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372292 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: RegBankSelect struct buffer load/store

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372291 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: RegBankSelect llvm.amdgcn.raw.buffer.{load|store}

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372290 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Attempt to RegBankSelect image intrinsics

Images should always have 2 consecutive, mandatory SGPR arguments.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372289 91177308-0d34-0410-b5e6-96231b3b80d8

Fix typo

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372288 91177308-0d34-0410-b5e6-96231b3b80d8

MachineScheduler: Fix assert from not checking subregs

The assert would fail if there was a dead def of a subregister if
there was a previous use of a different subregister.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372287 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Fix RegBankSelect G_SMULH/G_UMULH pre-gfx9

The scalar versions were only introduced in gfx9.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372286 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: Don't materialize immarg arguments to intrinsics

Encode them directly as an imm argument to G_INTRINSIC*.

Since now intrinsics can now define what parameters are required to be
immediates, avoid using registers for them. Intrinsics could
potentially want a constant that isn't a legal register type. Also,
since G_CONSTANT is subject to CSE and legalization, transforms could
potentially obscure the value (and create extra work for the
selector). The register bank of a G_CONSTANT is also meaningful, so
this could throw off future folding and legalization logic for AMDGPU.

This will be much more convenient to work with than needing to call
getConstantVRegVal and checking if it may have failed for every
constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth
immarg operands, many of which need inspection during lowering. Having
to find the value in a register is going to add a lot of boilerplate
and waste compile time.

SelectionDAG has always provided TargetConstant for constants which
should not be legalized or materialized in a register. The distinction
between Constant and TargetConstant was somewhat fuzzy, and there was
no automatic way to force usage of TargetConstant for certain
intrinsic parameters. They were both ultimately ConstantSDNode, and it
was inconsistently used. It was quite easy to mis-select an
instruction requiring an immediate. For SelectionDAG, start emitting
TargetConstant for these arguments, and using timm to match them.

Most of the work here is to cleanup target handling of constants. Some
targets process intrinsics through intermediate custom nodes, which
need to preserve TargetConstant usage to match the intrinsic
expectation. Pattern inputs now need to distinguish whether a constant
is merely compatible with an operand or whether it is mandatory.

The GlobalISelEmitter needs to treat timm as a special case of a leaf
node, simlar to MachineBasicBlock operands. This should also enable
handling of patterns for some G_* instructions with immediates, like
G_FENCE or G_EXTRACT.

This does include a workaround for a crash in GlobalISelEmitter when
ARM tries to uses "imm" in an output with a "timm" pattern source.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372285 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r372282

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372283 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-reduce: Add pass to reduce instructions

Patch by Diego Treviño!

Differential Revision: https://reviews.llvm.org/D66263

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372282 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-reduce: Avoid use-after-free when removing a branch instruction

Found my msan buildbot & pointed out by Nico Weber - thanks Nico!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372280 91177308-0d34-0410-b5e6-96231b3b80d8

[Object] Extend MachOUniversalBinary::getObjectForArch

Make the method MachOUniversalBinary::getObjectForArch return MachOUniversalBinary::ObjectForArch
and add helper methods MachOUniversalBinary::getMachOObjectForArch, MachOUniversalBinary::getArchiveForArch
for those who explicitly expect to get a MachOObjectFile or an Archive.

Differential revision: https://reviews.llvm.org/D67700

Test plan: make check-all

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372278 91177308-0d34-0410-b5e6-96231b3b80d8

[utils] Add minimal support for MIR inputs to update_llc_test_checks.py

update_{llc,mir}_test_checks.py applicability is determined by the
output (assembly or MIR), not the input, which makes
update_llc_test_checks.py the right tool to generate tests that start at
MIR and stop at the final assembly.

This commit adds the minimal support for this path. Main limitation that
remains:

- MIR has to have LLVM IR section, and the CHECK lines will be inserted
  into the LLVM IR functions that correspond to the MIR functions.

Running
  ../utils/update_llc_test_checks.py --llc-binary ./bin/llc
on a slightly modified  ../test/CodeGen/X86/bad-tls-fold.mir

produces the following diff:

+# NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+# RUN: llc %s -o - | FileCheck %s
--- |
   target triple = "x86_64-unknown-linux-gnu"

@@ -6,17 +7,31 @@
   @i = external thread_local global i32

   define i32 @or() {
+  ; CHECK-LABEL: or:
+  ; CHECK:       # %bb.0: # %entry
+  ; CHECK-NEXT:    movq {{.*}}(%rip), %rax
+  ; CHECK-NEXT:    orq $7, %rax
+  ; CHECK-NEXT:    movq i@{{.*}}(%rip), %rcx
+  ; CHECK-NEXT:    orq %rax, %rcx
+  ; CHECK-NEXT:    movl %fs:(%rcx), %eax
+  ; CHECK-NEXT:    retq
   entry:
     ret i32 undef
   }
-
   define i32 @and() {
+  ; CHECK-LABEL: and:
+  ; CHECK:       # %bb.0: # %entry
+  ; CHECK-NEXT:    movq {{.*}}(%rip), %rax
+  ; CHECK-NEXT:    orq $7, %rax
+  ; CHECK-NEXT:    movq i@{{.*}}(%rip), %rcx
+  ; CHECK-NEXT:    andq %rax, %rcx
+  ; CHECK-NEXT:    movl %fs:(%rcx), %eax
+  ; CHECK-NEXT:    retq
   entry:
     ret i32 undef
   }
...

(not applied)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372277 91177308-0d34-0410-b5e6-96231b3b80d8

[utils] Amend update_llc_test_checks.py to non-llc tooling, NFC

Very minor change aiming to make it easier to extend the script
downstream to support non-llc, but llc-like tools. The main objective is
to decrease the probability of merge conflicts.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372276 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Restore defaults for stores per memop

Summary:
Large slowdowns were observed in Rust due to many small, constant
sized copies in conjunction with poorly-optimized memory.copy
implementations. Since memory.copy cannot be expected to be inlined
efficiently by engines at this time, stop using it for the smallest
copies. We continue to lower all memcpy intrinsics to memory.copy,
though.

Reviewers: aheejin, alexcrichton

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, JDevlieghere, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67639

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372275 91177308-0d34-0410-b5e6-96231b3b80d8

[Docs] Moves topics to new categories

This commit moves several topics to new categories. It also removes a few duplicate links in Subsystem Documentation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372274 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][GlobalISel] Support lowering musttail calls

Since we now lower most tail calls, it makes sense to support musttail.

Instead of always falling back to SelectionDAG, only fall back when a musttail
call was not able to be emitted as a tail call. Once we can handle most
incoming and outgoing arguments, we can change this to a `report_fatal_error`
like in ISelLowering.

Remove the assert that we don't have varargs and a musttail, and replace it
with a return false. Implementing this requires that we implement
`saveVarArgRegisters` from AArch64ISelLowering, which is an entirely different
patch.

Add GlobalISel lines to vararg-tallcall.ll to make sure that we produce correct
code. Right now we only fall back, but eventually this will be relevant.

Differential Revision: https://reviews.llvm.org/D67681

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372273 91177308-0d34-0410-b5e6-96231b3b80d8

Remove the obsolete BlockByRefStruct flag from LLVM IR

DIFlagBlockByRefStruct is an unused DIFlag that originally was used by
clang to express (Objective-)C block captures in debug info. For the
last year Clang has been emitting complex DIExpressions to describe
block captures instead, which makes all the code supporting this flag
redundant.

This patch removes the flag and all supporting "dead" code, so we can
reuse the bit for something else in the future.

Since this only affects debug info generated by Clang with the block
extension this mostly affects Apple platforms and I don't have any
bitcode compatibility concerns for removing this. The Verifier will
reject debug info that uses the bit and thus degrade gracefully when
LTO'ing older bitcode with a newer compiler.

rdar://problem/44304813

Differential Revision: https://reviews.llvm.org/D67453

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372272 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-reduce: Remove inaccurate doxy comment about a return that isn't returned

Addressing post-commit code review feedback from Dávid Bolvanský -
thanks!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372271 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-reduce: Fix inconsistencies between int/unsigned usage (standardize on int)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372270 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r372267

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372268 91177308-0d34-0410-b5e6-96231b3b80d8

Add AutoUpgrade function to add new address space datalayout string to existing datalayouts.

Summary:
Add function to AutoUpgrade to change the datalayout of old X86 datalayout strings.
This adds "-p270:32:32-p271:32:32-p272:64:64" to X86 datalayouts that are otherwise valid
and don't already contain it.

This also removes the compatibility changes in https://reviews.llvm.org/D66843.
Datalayout change in https://reviews.llvm.org/D64931.

Reviewers: rnk, echristo

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67631

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372267 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r372264

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372265 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-reduce: Add pass to reduce basic blocks

Patch by Diego Treviño!

Differential Revision: https://reviews.llvm.org/D66320

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372264 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] mergeConditionalStoreToAddress(): try to pacify MSAN

MSAN bot complains that there is use-of-uninitialized-value
of this FreeStores later in IsWorthwhile().
Perhaps FreeStores needs to be stored in a vector?

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372262 91177308-0d34-0410-b5e6-96231b3b80d8

On PowerPC, Secure-PLT by default for FreeBSD 13 and higher

Summary:
In https://svnweb.freebsd.org/changeset/base/349351, FreeBSD 13 and
higher transitioned to Secure-PLT for PowerPC. This part contains the
changes in llvm's PPC subtarget.

Reviewers: emaste, jhibbits, hfinkel

Reviewed By: jhibbits

Subscribers: wuzish, nemanjai, krytarowski, kbarton, MaskRay, jsji, shchenz, steven.zhang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67118

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372260 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombine][ARM][X86] (sub Carry, X) -> (addcarry (sub 0, X), 0, Carry) fold

Summary:
`DAGCombiner::visitADDLikeCommutative()` already has a sibling fold:
`(add X, Carry) -> (addcarry X, 0, Carry)`

This fold, as suggested by @efriedma, helps recover from //some//
of the regressions of D62266

Reviewers: efriedma, deadalnix

Subscribers: javed.absar, kristof.beyls, llvm-commits, efriedma

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62392

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372259 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeGen][X86][NFC] Tests for (sub Carry, X) -> (addcarry (sub 0, X), 0, Carry) fold (D62392)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372258 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] foldUnsignedUnderflowCheck(): handle last few cases (PR43251)

Summary:
I don't have a direct motivational case for this,
but it would be good to have this for completeness/symmetry.

This pattern is basically the motivational pattern from
https://bugs.llvm.org/show_bug.cgi?id=43251
but with different predicate that requires that the offset is non-zero.

The completeness bit comes from the fact that a similar pattern (offset != zero)
will be needed for https://bugs.llvm.org/show_bug.cgi?id=43259,
so it'd seem to be good to not overlook very similar patterns..

Proofs: https://rise4fun.com/Alive/21b

Also, there is something odd with `isKnownNonZero()`, if the non-zero
knowledge was specified as an assumption, it didn't pick it up (PR43267)

With this, i see no other missing folds for
https://bugs.llvm.org/show_bug.cgi?id=43251

Reviewers: spatel, nikic, xbolva00

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67412

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372257 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Don't implicitly enable global isel on Darwin if code-model==large.

Summary:
AArch64 GlobalISel doesn't support MachO's large code model, so this patch
adds a check for that combination before implicitly enabling it.

Reviewers: paquette

Subscribers: kristof.beyls, ributzka, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67724

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372256 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] mergeConditionalStoreToAddress(): consider cost, not instruction count

Summary:
As it can be see in the changed test, while `div` is really costly,
we were speculating it. This does not seem correct.

Also, the old code would run for every single insturuction in BB,
instead of eagerly bailing out as soon as there are too many instructions.

This function still has a problem that `PHINodeFoldingThreshold` is
per-basic-block, while it should be for all the basic blocks.

Reviewers: efriedma, craig.topper, dmgreen, jmolloy

Reviewed By: jmolloy

Subscribers: xbolva00, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67315

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372255 91177308-0d34-0410-b5e6-96231b3b80d8

[MIPS] For vectors, select `add %x, C` as `sub %x, -C` if it results in inline immediate

Summary:
As discussed in https://reviews.llvm.org/D62341#1515637,
for MIPS `add %x, -1` isn't optimal. Unlike X86 there
are no fastpaths to matearialize such `-1`/`1` vector constants,
and `sub %x, 1` results in better codegen,
so undo canonicalization

Reviewers: atanasyan, Petar.Avramovic, RKSimon

Reviewed By: atanasyan

Subscribers: sdardis, arichardson, hiraditya, jrtc27, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66805

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372254 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeGen][MIPS][NFC] Some standalone tests for D66805 "or vectors, select `add %x, C` as `sub %x, -C` if it results in inline immediate"

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372253 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Expand 'lw/sw' instructions for 32-bit GOT

In case of using 32-bit GOT access to the table requires two instructions
with attached %got_hi and %got_lo relocations. This patch implements
correct expansion of 'lw/sw' instructions in that case.

Differential Revision: https://reviews.llvm.org/D67705

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372251 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] dropRedundantMaskingOfLeftShiftInput(): some cleanup before upcoming patch

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372245 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] More tests for PR42563 "Dropping pointless masking before left shift"

For patterns c/d/e we too can deal with the pattern even if we can't
just drop the mask, we can just apply it afterwars:
https://rise4fun.com/Alive/gslRa

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372244 91177308-0d34-0410-b5e6-96231b3b80d8

Fix compile-time regression caused by rL371928

Summary:
Also fixup rL371928 for cases that occur on our out-of-tree backend

There were still quite a few intermediate APInts and this caused the
compile time of MCCodeEmitter for our target to jump from 16s up to
~5m40s. This patch, brings it back down to ~17s by eliminating pretty
much all of them using two new APInt functions (extractBitsAsZExtValue(),
insertBits() but with a uint64_t). The exact conditions for eliminating
them is that the field extracted/inserted must be <=64-bit which is
almost always true.

Note: The two new APInt API's assume that APInt::WordSize is at least
64-bit because that means they touch at most 2 APInt words. They
statically assert that's true. It seems very unlikely that someone
is patching it to be smaller so this should be fine.

Reviewers: jmolloy

Reviewed By: jmolloy

Subscribers: hiraditya, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67686

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372243 91177308-0d34-0410-b5e6-96231b3b80d8

[DDG] Break a cyclic dependency from Analysis to ScalarOpts

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372240 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r372238

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372239 91177308-0d34-0410-b5e6-96231b3b80d8

Data Dependence Graph Basics

Summary:
This is the first patch in a series of patches that will implement data dependence graph in LLVM. Many of the ideas used in this implementation are based on the following paper:
D. J. Kuck, R. H. Kuhn, D. A. Padua, B. Leasure, and M. Wolfe (1981). DEPENDENCE GRAPHS AND COMPILER OPTIMIZATIONS.
This patch contains support for a basic DDGs containing only atomic nodes (one node for each instruction). The edges are two fold: def-use edges and memory-dependence edges.
The implementation takes a list of basic-blocks and only considers dependencies among instructions in those basic blocks. Any dependencies coming into or going out of instructions that do not belong to those basic blocks are ignored.

The algorithm for building the graph involves the following steps in order:

  1. For each instruction in the range of basic blocks to consider, create an atomic node in the resulting graph.
  2. For each node in the graph establish def-use edges to/from other nodes in the graph.
  3. For each pair of nodes containing memory instruction(s) create memory edges between them. This part of the algorithm goes through the instructions in lexicographical order and creates edges in reverse order if the sink of the dependence occurs before the source of it.

Authored By: bmahjour

Reviewer: Meinersbur, fhahn, myhsu, xtian, dmgreen, kbarton, jdoerfert

Reviewed By: Meinersbur, fhahn, myhsu

Subscribers: ychen, arphaman, simoll, a.elovikov, mgorny, hiraditya, jfb, wuzish, llvm-commits, jsji, Whitney, etiotto

Tag: #llvm

Differential Revision: https://reviews.llvm.org/D65350

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372238 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] add tests for fma/fmuladd; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372236 91177308-0d34-0410-b5e6-96231b3b80d8

[Alignment][NFC] Align(1) to Align::None() conversions

Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67715

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372234 91177308-0d34-0410-b5e6-96231b3b80d8

[SampleFDO] Minimize performance impact when profile-sample-accurate
is enabled.

We can save memory and reduce binary size significantly by enabling
ProfileSampleAccurate. However when ProfileSampleAccurate is true,
function without sample will be regarded as cold and this could
potentially cause performance regression.

To minimize the potential negative performance impact, we want to be
a little conservative here saying if a function shows up in the profile,
no matter as outline instance, inline instance or call targets, treat
the function as not being cold. This will handle the cases such as most
callsites of a function are inlined in sampled binary (thus outline copy
don't get any sample) but not inlined in current build (because of source
code drift, imprecise debug information, or the callsites are all cold
individually but not cold accumulatively...), so that the outline function
showing up as cold in sampled binary will actually not be cold after current
build. After the change, such function will be treated as not cold even
profile-sample-accurate is enabled.

At the same time we lower the hot criteria of callsiteIsHot check when
profile-sample-accurate is enabled. callsiteIsHot is used to determined
whether a callsite is hot and qualified for early inlining. When
profile-sample-accurate is enabled, functions without profile will be
regarded as cold and much less inlining will happen in CGSCC inlining pass,
so we can worry less about size increase and be aggressive to allow more
early inlining to happen for warm callsites and it is helpful for performance
overall.

Differential Revision: https://reviews.llvm.org/D67561

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372232 91177308-0d34-0410-b5e6-96231b3b80d8

[Alignment][NFC] Remove LogAlignment functions

Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, MaskRay, atanasyan, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67620

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372231 91177308-0d34-0410-b5e6-96231b3b80d8

[Alignment][NFC] Use Align::None instead of 1

Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: sdardis, nemanjai, hiraditya, kbarton, jrtc27, MaskRay, atanasyan, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67704

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372230 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[AArch64][DebugInfo] Do not recompute CalleeSavedStackSize"

Summary:
This reverts commit r372204.

This change causes build bot failures under msan:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/35236/steps/check-llvm%20msan/logs/stdio:

```
FAIL: LLVM :: DebugInfo/AArch64/asan-stack-vars.mir (19531 of 33579)
******************** TEST 'LLVM :: DebugInfo/AArch64/asan-stack-vars.mir' FAILED ********************
Script:
--
: 'RUN: at line 1';   /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llc -O0 -start-before=livedebugvalues -filetype=obj -o - /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/DebugInfo/AArch64/asan-stack-vars.mir | /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llvm-dwarfdump -v - | /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/DebugInfo/AArch64/asan-stack-vars.mir
--
Exit Code: 2

Command Output (stderr):
--
==62894==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0xdfcafb in llvm::AArch64FrameLowering::resolveFrameOffsetReference(llvm::MachineFunction const&, int, bool, unsigned int&, bool, bool) const /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1658:3
    #1 0xdfae8a in resolveFrameIndexReference /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1580:10
    #2 0xdfae8a in llvm::AArch64FrameLowering::getFrameIndexReference(llvm::MachineFunction const&, int, unsigned int&) const /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1536
    #3 0x46642c1 in (anonymous namespace)::LiveDebugValues::extractSpillBaseRegAndOffset(llvm::MachineInstr const&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:582:21
    #4 0x4647cb3 in transferSpillOrRestoreInst /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:883:11
    #5 0x4647cb3 in process /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:1079
    #6 0x4647cb3 in (anonymous namespace)::LiveDebugValues::ExtendRanges(llvm::MachineFunction&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:1361
    #7 0x463ac0e in (anonymous namespace)::LiveDebugValues::runOnMachineFunction(llvm::MachineFunction&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:1415:18
    #8 0x4854ef0 in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/MachineFunctionPass.cpp:73:13
    #9 0x53b0b01 in llvm::FPPassManager::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1648:27
    #10 0x53b15f6 in llvm::FPPassManager::runOnModule(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1685:16
    #11 0x53b298d in runOnModule /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1750:27
    #12 0x53b298d in llvm::legacy::PassManagerImpl::run(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1863
    #13 0x905f21 in compileModule(char**, llvm::LLVMContext&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:601:8
    #14 0x8fdc4e in main /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:355:22
    #15 0x7f67673632e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0)
    #16 0x882369 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llc+0x882369)

MemorySanitizer: use-of-uninitialized-value /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1658:3 in llvm::AArch64FrameLowering::resolveFrameOffsetReference(llvm::MachineFunction const&, int, bool, unsigned int&, bool, bool) const
Exiting
error: -: The file was not recognized as a valid object file
FileCheck error: '-' is empty.
FileCheck command line:  /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/DebugInfo/AArch64/asan-stack-vars.mir
```

Reviewers: bkramer

Reviewed By: bkramer

Subscribers: sdardis, aprantl, kristof.beyls, jrtc27, atanasyan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67710

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372228 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyLibCalls] fix crash with empty function name (PR43347)

...and improve some variable names while here.

https://bugs.llvm.org/show_bug.cgi?id=43347

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372227 91177308-0d34-0410-b5e6-96231b3b80d8

Follow-up to r372209: Use single quotes for host_ldflags in the lit config

HOST_LDFLAGS is now using double quotes, and that would break the lit
config file.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372226 91177308-0d34-0410-b5e6-96231b3b80d8

[SDA] Don't stop divergence propagation at the IPD.

Summary:
This fixes B42473 and B42706.

This patch makes the SDA propagate branch divergence until the end of the RPO traversal. Before, the SyncDependenceAnalysis propagated divergence only until the IPD in rpo order. RPO is incompatible with post dominance in the presence of loops. This made the SDA crash because blocks were missed in the propagation.

Reviewers: foad, nhaehnle

Reviewed By: foad

Subscribers: jvesely, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65274

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372223 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Pass "xgot" flag as a subtarget feature

We need "xgot" flag in the MipsAsmParser to implement correct expansion
of some pseudo instructions in case of using 32-bit GOT (XGOT).
MipsAsmParser does not have reference to MipsSubtarget but has a
reference to "feature bit set".

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372220 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Mark tests for lw/sw expansion in PIC by a separate "check prefix". NFC

That simplify adding XGOT tests later.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372219 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Reduce code duplication in the `loadAndAddSymbolAddress`. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372218 91177308-0d34-0410-b5e6-96231b3b80d8

Fix -Wdocumentation warning. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372215 91177308-0d34-0410-b5e6-96231b3b80d8

Fix -Wdocumentation "empty paragraph passed to '\brief'" warning. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372214 91177308-0d34-0410-b5e6-96231b3b80d8

Fix -Wdocumentation "@returns in a void function" warning. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372212 91177308-0d34-0410-b5e6-96231b3b80d8

Fix -Wdocumentation "Unknown param" warning. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372211 91177308-0d34-0410-b5e6-96231b3b80d8

[cmake] Changes to get Windows self-host working with PGO

Fixes quoting of profile arguments to work on Windows
Suppresses adding profile arguments to linker flags when using lld-link
Avoids -fprofile-instr-use being added to rc.exe flags
Removes duplicated adding of -fprofile-instr-use to linker flags (since
r355541)
Move handling LLVM_PROFDATA_FILE to HandleLLVMOptions.cmake

Differential Revision: https://reviews.llvm.org/D62063

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372209 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Allow FP inline constant in v_madak_f16 and v_fmaak_f16

Differential Revision: https://reviews.llvm.org/D67680

Change-Id: Ic38f47cb2079c2c1070a441b5943854844d80a7c

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372208 91177308-0d34-0410-b5e6-96231b3b80d8

[Alignment] Add a None() member function

Summary:
This will allow writing `if(A != llvm::Align::None())` which is clearer than `if(A > llvm::Align(1))`

This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67697

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372207 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][DebugInfo] Do not recompute CalleeSavedStackSize

This patch fixes a bug exposed by D65653 where a subsequent invocation
of `determineCalleeSaves` ends up with a different size for the callee
save area, leading to different frame-offsets in debug information.

In the invocation by PEI, `determineCalleeSaves` tries to determine
whether it needs to spill an extra callee-saved register to get an
emergency spill slot. To do this, it calls 'estimateStackSize' and
manually adds the size of the callee-saves to this. PEI then allocates
the spill objects for the callee saves and the remaining frame layout
is calculated accordingly.

A second invocation in LiveDebugValues causes estimateStackSize to return
the size of the stack frame including the callee-saves. Given that the
size of the callee-saves is added to this, these callee-saves are counted
twice, which leads `determineCalleeSaves` to believe the stack has
become big enough to require spilling an extra callee-save as emergency
spillslot. It then updates CalleeSavedStackSize with a larger value.

Since CalleeSavedStackSize is used in the calculation of the frame
offset in getFrameIndexReference, this leads to incorrect offsets for
variables/locals when this information is recalculated after PEI.

Reviewers: omjavaid, eli.friedman, thegameg, efriedma

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D66935

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372204 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "r372201: [Support] Replace function with function_ref in writeFileAtomically. NFC"

function_ref causes calls to the function to be ambiguous, breaking
compilation.

Reverting for now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372202 91177308-0d34-0410-b5e6-96231b3b80d8

[Support] Replace function with function_ref in writeFileAtomically. NFC

Summary:
The latter is slightly more efficient and communicates the intent of the
API: writeFileAtomically does not own or copy the callback, it merely
calls it at some point.

Reviewers: jkorous

Reviewed By: jkorous

Subscribers: hiraditya, dexonsmith, jfb, llvm-commits, cfe-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67584

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372201 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Break non-power of 2 vXi1 vectors into scalars for argument passing with avx512.

This generates worse code, but matches what is done for avx2 and
prevents crashes when more arguments are passed than we have
registers for.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372200 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add test case for passing a v17i1 vector with avx512

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372199 91177308-0d34-0410-b5e6-96231b3b80d8

[BPF] Permit all user instructed offset relocatiions

Currently, not all user specified relocations
(with clang intrinsic __builtin_preserve_access_index())
will turn into relocations.

In the current implementation, a __builtin_preserve_access_index()
chain is turned into relocation only if the result of the clang
intrinsic is used in a function call or a nonzero offset computation
of getelementptr. For all other cases, the relocatiion request
is ignored and the __builtin_preserve_access_index() is turned
into regular getelementptr instructions.
The main reason is to mimic bpf_probe_read() requirement.

But there are other use cases where relocatable offset is
generated but not used for bpf_probe_read(). This patch
relaxed previous constraints when to generate relocations.
Now, all user __builtin_preserve_access_index() will have
relocations generated.

Differential Revision: https://reviews.llvm.org/D67688

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372198 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Prevent assertion when calling a function that returns double with -mno-sse2 on x86-64.

As seen in the most recent updates to PR10498

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372197 91177308-0d34-0410-b5e6-96231b3b80d8