Stefan Stipanovic [Mon, 22 Jul 2019 19:36:27 +0000 (19:36 +0000)]
[Attributor] NoAlias on return values.
Port the noalias function return value attribute to the Attributor.
This will be followed by a patch for call sites and function arguments.
Reviewers: jdoerfert
Subscribers: lebedev.ri, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D63067
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366728 91177308-0d34-0410-b5e6-96231b3b80d8
Sean Fertile [Mon, 22 Jul 2019 19:15:29 +0000 (19:15 +0000)]
Stubs out TLOF for AIX and adds support for common vars in assembly output.
Stubs out a TargetLoweringObjectFileXCOFF class, implementing only
SelectSectionForGlobal for common symbols. Also adds an override of
EmitGlobalVariable in PPCAIXAsmPrinter which adds a number of defensive errors
and adds support for emitting common globals.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366727 91177308-0d34-0410-b5e6-96231b3b80d8
Petr Hosek [Mon, 22 Jul 2019 18:52:42 +0000 (18:52 +0000)]
[SafeStack] Insert the deref after the offset
While debugging code that uses SafeStack, we've noticed that LLVM
produces invalid DWARF. Concretely, in the following example:
    int main(int argc, char* argv[]) {
      std::string value = "";
      printf("%s\n", value.c_str());
      return 0;
    }
DWARF would describe the value variable as being located at:
DW_OP_breg14 R14+0, DW_OP_deref, DW_OP_constu 0x20, DW_OP_minus
The assembly to get this variable is:
leaq -32(%r14), %rbx
The order of operations in the DWARF symbols is incorrect in this case.
Specifically, the deref is incorrect; this appears to be incorrectly
re-inserted in replaceOneDbgValueForAlloca.
With this change, which inserts the deref after the offset instead of
before it, LLVM produces correct DWARF:
DW_OP_breg14 R14-32
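The difference between the two expressions can be seen with a toy stack machine. This is only an illustrative sketch: the evaluator, the value in r14 and the memory contents are made up, and only the four operations quoted above are modelled.

```python
# Toy evaluator for the DWARF location operations quoted in this message.
# Register and memory values are hypothetical, chosen only for illustration.

def evaluate(ops, regs, memory):
    """Evaluate a list of (opcode, operand) pairs on a DWARF-style stack."""
    stack = []
    for op, arg in ops:
        if op == "breg":        # push register value plus a signed offset
            reg, offset = arg
            stack.append(regs[reg] + offset)
        elif op == "deref":     # replace the top of stack with the value stored there
            stack.append(memory[stack.pop()])
        elif op == "constu":    # push an unsigned constant
            stack.append(arg)
        elif op == "minus":     # second-from-top minus top
            top = stack.pop()
            stack.append(stack.pop() - top)
        else:
            raise ValueError(f"unsupported op: {op}")
    return stack[-1]

regs = {14: 0x7000}              # hypothetical value held in r14
memory = {0x7000: 0xDEAD0000}    # hypothetical contents of the word at [r14]

# Old expression: dereferences r14 first, then subtracts 0x20 from the loaded value.
old_expr = [("breg", (14, 0)), ("deref", None), ("constu", 0x20), ("minus", None)]
# New expression: r14 - 32, which matches `leaq -32(%r14), %rbx`.
new_expr = [("breg", (14, -32))]

old_location = evaluate(old_expr, regs, memory)
new_location = evaluate(new_expr, regs, memory)
expected = regs[14] - 32
```

In this model only the new expression lands on `expected`; the old one depends on whatever happens to be stored at the address in r14.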
Differential Revision: https://reviews.llvm.org/D64971
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366726 91177308-0d34-0410-b5e6-96231b3b80d8
Peter Collingbourne [Mon, 22 Jul 2019 18:50:45 +0000 (18:50 +0000)]
WholeProgramDevirt: Teach the pass to respect the global's alignment.
The bytes inserted before an overaligned global need to be padded according
to the alignment set on the original global in order for the initializer
to meet the global's alignment requirements. The previous implementation
that padded to the pointer width happened to be correct for vtables on most
platforms but may do the wrong thing if the vtable has a larger alignment.
This issue is visible with a prototype implementation of HWASAN for globals,
which will overalign all globals including vtables to 16 bytes.
There is also no padding requirement for the bytes inserted after the global
because they are never read from nor are they significant for alignment
purposes, so stop inserting padding there.
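A minimal sketch of the padding rule described above, not the pass's actual code: `align_to`, `initializer_offset` and the sample sizes are hypothetical, and the combined global is assumed to be allocated at the vtable's alignment.

```python
def align_to(size, alignment):
    """Round size up to the next multiple of alignment (a power of two)."""
    return (size + alignment - 1) & ~(alignment - 1)

def initializer_offset(before_bytes, alignment):
    """Offset of the original initializer once `before_bytes` are prepended
    and padded up to `alignment`."""
    return align_to(before_bytes, alignment)

before_bytes = 3        # hypothetical number of bytes inserted before the vtable
pointer_width = 8       # bytes; the padding unit used by the previous implementation
vtable_alignment = 16   # e.g. a global overaligned to 16 bytes

old_offset = initializer_offset(before_bytes, pointer_width)
new_offset = initializer_offset(before_bytes, vtable_alignment)
```

With these numbers the old rule places the initializer at offset 8, which is not a multiple of 16, while the new rule places it at offset 16.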
Differential Revision: https://reviews.llvm.org/D65031
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366725 91177308-0d34-0410-b5e6-96231b3b80d8
Sean Fertile [Mon, 22 Jul 2019 18:47:59 +0000 (18:47 +0000)]
[PowerPC] Fix comment on MO_PLT Target Operand Flag. [NFC]
Patch by Xiangling Liao.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366724 91177308-0d34-0410-b5e6-96231b3b80d8
Sean Fertile [Mon, 22 Jul 2019 18:47:55 +0000 (18:47 +0000)]
[Object][XCOFF] Remove extra includes from XCOFF related files. [NFC]
Differential Revision: https://reviews.llvm.org/D60885
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366723 91177308-0d34-0410-b5e6-96231b3b80d8
Peter Collingbourne [Mon, 22 Jul 2019 18:47:03 +0000 (18:47 +0000)]
LowerTypeTests: Teach the pass to respect global alignments.
We were previously ignoring alignment entirely when combining globals
together in this pass. There are two main things that we need to do here:
add additional padding before each global to meet the alignment requirements,
and set the combined global's alignment to the maximum of all of the original
globals' alignments.
Since we now need to calculate layout as we go anyway, use the calculated
layout to produce GlobalLayout instead of using StructLayout.
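The two requirements above (padding before each global, and a combined alignment equal to the maximum) can be sketched as follows. This is an assumption-laden illustration: `layout_globals` and the sample (size, alignment) pairs are hypothetical, and any other padding the pass may add is ignored.

```python
def layout_globals(globals_):
    """Return (offsets, total_size, combined_alignment) for (size, alignment) pairs."""
    offsets = []
    cursor = 0
    combined_alignment = 1
    for size, alignment in globals_:
        # Pad so that this global starts at a multiple of its own alignment.
        cursor = (cursor + alignment - 1) // alignment * alignment
        offsets.append(cursor)
        cursor += size
        # The combined global must satisfy the strictest member alignment.
        combined_alignment = max(combined_alignment, alignment)
    return offsets, cursor, combined_alignment

offsets, total_size, combined_alignment = layout_globals(
    [(4, 4), (24, 16), (1, 1), (8, 8)]
)
```

Computing the offsets while walking the globals is what makes a separate StructLayout query unnecessary in this sketch.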
Differential Revision: https://reviews.llvm.org/D65033
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366722 91177308-0d34-0410-b5e6-96231b3b80d8
Nilanjana Basu [Mon, 22 Jul 2019 18:22:55 +0000 (18:22 +0000)]
Changes to emit CodeView debug info nested type records properly using MCStreamer directives
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366720 91177308-0d34-0410-b5e6-96231b3b80d8
Stanislav Mekhanoshin [Mon, 22 Jul 2019 18:08:53 +0000 (18:08 +0000)]
[AMDGPU] Test update. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366715 91177308-0d34-0410-b5e6-96231b3b80d8
Simon Pilgrim [Mon, 22 Jul 2019 17:57:36 +0000 (17:57 +0000)]
[SLPVectorizer] Fix some MSVC/cppcheck uninitialized variable warnings. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366712 91177308-0d34-0410-b5e6-96231b3b80d8
Vlad Tsyrklevich [Mon, 22 Jul 2019 17:48:53 +0000 (17:48 +0000)]
Revert "Reland [ELF] Loose a condition for relocation with a symbol"
This reverts commit r366686 as it appears to be causing buildbot
failures on sanitizer-x86_64-linux-android and sanitizer-x86_64-linux.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366708 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Mon, 22 Jul 2019 15:02:34 +0000 (15:02 +0000)]
TableGen: Support physical register inputs > 255
This was truncating register values that didn't fit in an unsigned char.
Switch AMDGPU sendmsg intrinsics to using a tablegen pattern.
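As a rough model of the truncation (not TableGen's actual code), storing a register number in a single unsigned byte keeps only its low 8 bits. The helper name and the register number 300 are hypothetical.

```python
def store_as_unsigned_char(register_number):
    """Model an 8-bit unsigned store: only the low 8 bits survive."""
    return register_number & 0xFF

truncated = store_as_unsigned_char(300)   # hypothetical register number above 255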
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366695 91177308-0d34-0410-b5e6-96231b3b80d8
Sam Parker [Mon, 22 Jul 2019 14:16:40 +0000 (14:16 +0000)]
[ARM][LowOverheadLoops] Revert remaining pseudos
ARMLowOverheadLoops would assert a failure if it did not find all the
pseudo instructions that comprise the hardware loop. Instead of doing
this, iterate through all the instructions of the function and revert
any remaining pseudo instructions that haven't been converted.
Differential Revision: https://reviews.llvm.org/D65080
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366691 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Mon, 22 Jul 2019 13:33:11 +0000 (13:33 +0000)]
AMDGPU/GlobalISel: Fix broken tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366688 91177308-0d34-0410-b5e6-96231b3b80d8
Nikola Prica [Mon, 22 Jul 2019 13:07:01 +0000 (13:07 +0000)]
Reland [ELF] Loose a condition for relocation with a symbol
This patch was not the cause of the buildbot failure.
The deleted code was introduced as a workaround for a bug in the gold linker
(http://sourceware.org/PR16794). The test case given as the reason for
that code, the one at the link above, now works with gold.
The condition is too strict: when code is compiled with debug info,
it forces generation of numerous relocations with a symbol on architectures
that do not have relocation addends.
Reviewers: arsenm, espindola
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D64327
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366686 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Mon, 22 Jul 2019 13:05:25 +0000 (13:05 +0000)]
AMDGPU/GlobalISel: Remove unnecessary code
The minnum/maxnum cases are dead, and the cvt is handled by the
default.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366685 91177308-0d34-0410-b5e6-96231b3b80d8
David Green [Mon, 22 Jul 2019 12:51:38 +0000 (12:51 +0000)]
[ARM] Fix for MVE VPT block pass
We need to ensure that the number of T's is correct when adding multiple
instructions into the same VPT block.
Differential revision: https://reviews.llvm.org/D65049
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366684 91177308-0d34-0410-b5e6-96231b3b80d8
Simon Pilgrim [Mon, 22 Jul 2019 12:44:10 +0000 (12:44 +0000)]
[X86] EltsFromConsecutiveLoads - support common source loads (REAPPLIED)
This patch enables us to find the source loads for each element, splitting them into a Load and ByteOffset, and attempts to recognise consecutive loads that are in fact from the same source load.
A helper function, findEltLoadSrc, recurses to find a LoadSDNode and determines the element's byte offset within it. When attempting to match consecutive loads, byte-offset loads are then matched against a previous load that has already been confirmed to be a consecutive match.
Next step towards PR16739 - after this we just need to account for shuffling/repeated elements to create a vector load + shuffle.
Fixed out of bounds load assert identified in rL366501
Differential Revision: https://reviews.llvm.org/D64551
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366681 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Mon, 22 Jul 2019 12:43:41 +0000 (12:43 +0000)]
AMDGPU/GlobalISel: Fix tests without asserts
The legality check is only done in builds with assertions enabled (i.e. when
NDEBUG is not defined), so the failure cases are different in a release build.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366680 91177308-0d34-0410-b5e6-96231b3b80d8
Christudasan Devadasan [Mon, 22 Jul 2019 12:42:48 +0000 (12:42 +0000)]
Added address-space mangling for stack related intrinsics
Modified the following 3 intrinsics:
int_addressofreturnaddress,
int_frameaddress & int_sponentry.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D64561
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366679 91177308-0d34-0410-b5e6-96231b3b80d8
Simon Pilgrim [Mon, 22 Jul 2019 12:17:56 +0000 (12:17 +0000)]
[X86][SSE] Add EltsFromConsecutiveLoads test case identified in rL366501
Test case that led to rL366441 being reverted at rL366501
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366678 91177308-0d34-0410-b5e6-96231b3b80d8
George Rimar [Mon, 22 Jul 2019 12:01:52 +0000 (12:01 +0000)]
[yaml2obj] - Change how we handle implicit sections.
Instead of having the special list of implicit sections,
that are mixed with the sections read from YAML on late
stages, I just create the placeholders and add them to
the main sections list early.
That significantly simplifies the code.
Differential revision: https://reviews.llvm.org/D64999
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366677 91177308-0d34-0410-b5e6-96231b3b80d8
Stefan Granitz [Mon, 22 Jul 2019 09:47:40 +0000 (09:47 +0000)]
Add location of SVN staging dir to git-llvm error output
Summary:
In pre-monorepo times the svn staging directory was `.git/svn`. The error message below did not mention the new name yet.
Example before:
```
Can't push git rev 104cfa289d9 because svn status is not empty:
! llvm/trunk/include/llvm
```
Example after:
```
Can't push git rev 104cfa289d9 because status in svn staging dir (.git/llvm-upstream-svn) is not empty:
! llvm/trunk/include/llvm
```
Reviewers: mehdi_amini, jlebar, teemperor
Reviewed By: mehdi_amini
Subscribers: llvm-commits, #llvm
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65038
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366671 91177308-0d34-0410-b5e6-96231b3b80d8
Oliver Stannard [Mon, 22 Jul 2019 08:44:36 +0000 (08:44 +0000)]
[IPRA][ARM] Make use of the "returned" parameter attribute
ARM has code to recognise uses of the "returned" function parameter
attribute, which guarantees that the value passed to the function in r0
will be returned in r0 unmodified. IPRA replaces the regmask on call
instructions, so needs to be told about this to avoid reverting the
optimisation.
Differential revision: https://reviews.llvm.org/D64986
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366669 91177308-0d34-0410-b5e6-96231b3b80d8
George Rimar [Mon, 22 Jul 2019 08:10:02 +0000 (08:10 +0000)]
[llvm-readobj] - Stop using precompiled objects in file-headers.test
This converts all sub-tests except one to YAML instead of precompiled inputs.
Differential revision: https://reviews.llvm.org/D64800
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366668 91177308-0d34-0410-b5e6-96231b3b80d8
Jay Foad [Mon, 22 Jul 2019 07:19:44 +0000 (07:19 +0000)]
[AMDGPU] Save some work when an atomic op has no uses
Summary:
In the atomic optimizer, save doing a bunch of work and generating a
bunch of dead IR in the fairly common case where the result of an
atomic op (i.e. the value that was in memory before the atomic op was
performed) is not used. NFC.
Reviewers: arsenm, dstuttard, tpr
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, t-tye, hiraditya, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64981
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366667 91177308-0d34-0410-b5e6-96231b3b80d8
Kai Luo [Mon, 22 Jul 2019 05:32:20 +0000 (05:32 +0000)]
[PowerPC][NFC] Precommit a test case where ppc-mi-peepholes miscompiles extswsli
Added a test case to show codegen differences.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366666 91177308-0d34-0410-b5e6-96231b3b80d8
Serguei Katkov [Mon, 22 Jul 2019 05:15:34 +0000 (05:15 +0000)]
[Loop Peeling] Fix the handling of branch weights of peeled off branches.
The current algorithm for updating the branch weights of the latch block and its
copies assumes that the number of peeled iterations is approximately equal
to the trip count.
However, this is not always correct. According to the profitability check, we can decide to peel
when it helps to reduce the number of phi nodes. In that case the number of peeled iterations
can be less than the estimated trip count.
This patch introduces another way to set the branch weights of peeled-off branches.
Let F be the weight of the edge from latch to header.
Let E be the weight of the edge from latch to exit.
F/(F+E) is the probability of going to the loop and E/(F+E) is the probability of going to the exit.
Then, Estimated TripCount (TC) = F / E.
For the I-th (counting from 0) peeled-off iteration we set the weights for
the peeled latch as (TC - I, 1). This gives a reasonable distribution:
the probability of going to the exit, 1/(TC-I), increases. At the same time
the estimated trip count of the remaining loop reduces by I.
As a result, after peeling off N iterations the weights will be
(F - N * E, E) and the trip count of the loop becomes
F / E - N or TC - N.
The idea is taken from the review of the patch D63918 proposed by Philip.
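The weight assignment described above can be sketched numerically. This is an illustration under assumptions, not the patch's code: `peeled_branch_weights` is a hypothetical helper, the weights F=100 and E=10 are made up, and exact fractions are used where the real pass would work with integer branch weights.

```python
from fractions import Fraction

def peeled_branch_weights(f, e, n):
    """Return the (stay, exit) weights of each peeled latch and of the remaining loop.

    f: weight of the latch -> header edge
    e: weight of the latch -> exit edge
    n: number of peeled iterations
    """
    tc = Fraction(f, e)                        # estimated trip count, F / E
    peeled = [(tc - i, 1) for i in range(n)]   # I-th peeled latch gets (TC - I, 1)
    remaining = (f - n * e, e)                 # weights left on the loop's own latch
    return peeled, remaining

peeled, remaining = peeled_branch_weights(f=100, e=10, n=3)
remaining_trip_count = Fraction(remaining[0], remaining[1])
```

For F=100, E=10 and N=3 the peeled latches get (10, 1), (9, 1) and (8, 1), and the remaining loop keeps (70, 10), i.e. an estimated trip count of TC - N = 7.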
Reviewers: reames, mkuper, iajbar, fhahn
Reviewed By: reames
Subscribers: hiraditya, zzheng, llvm-commits
Differential Revision: https://reviews.llvm.org/D64235
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366665 91177308-0d34-0410-b5e6-96231b3b80d8
Fangrui Song [Mon, 22 Jul 2019 04:59:01 +0000 (04:59 +0000)]
[utils] Clean up UpdateTestChecks/common.py
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366664 91177308-0d34-0410-b5e6-96231b3b80d8
Craig Topper [Mon, 22 Jul 2019 02:43:43 +0000 (02:43 +0000)]
[InstCombine] Add foldAndOfICmps test cases inspired by PR42691.
icmp ne %x, INT_MIN can be treated similarly to icmp sgt %x, INT_MIN.
icmp ne %x, INT_MAX can be treated similarly to icmp slt %x, INT_MAX.
icmp ne %x, UINT_MAX can be treated similarly to icmp ult %x, UINT_MAX.
We already treat icmp ne %x, 0 similarly to icmp ugt %x, 0
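The four equivalences listed above can be checked exhaustively for a small bit width. The sketch below uses 8-bit values as an assumption for brevity; `to_signed` and `check_all` are hypothetical helper names.

```python
BITS = 8
INT_MIN = -(1 << (BITS - 1))
INT_MAX = (1 << (BITS - 1)) - 1
UINT_MAX = (1 << BITS) - 1

def to_signed(u):
    """Reinterpret an unsigned BITS-wide value as two's-complement signed."""
    return u - (1 << BITS) if u > INT_MAX else u

def check_all():
    """Return True if every `icmp ne` against an extreme value matches the
    corresponding strict comparison for all BITS-wide inputs."""
    for u in range(1 << BITS):
        s = to_signed(u)
        if (s != INT_MIN) != (s > INT_MIN):    # ne INT_MIN  <=> sgt INT_MIN
            return False
        if (s != INT_MAX) != (s < INT_MAX):    # ne INT_MAX  <=> slt INT_MAX
            return False
        if (u != UINT_MAX) != (u < UINT_MAX):  # ne UINT_MAX <=> ult UINT_MAX
            return False
        if (u != 0) != (u > 0):                # ne 0        <=> ugt 0
            return False
    return True

all_equivalent = check_all()
```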
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366662 91177308-0d34-0410-b5e6-96231b3b80d8
Nemanja Ivanovic [Sun, 21 Jul 2019 21:03:45 +0000 (21:03 +0000)]
[PowerPC][NFC] Precommit test case for upcoming patch
Just committing a test case for an upcoming patch so that the review can show
only the codegen differences.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366661 91177308-0d34-0410-b5e6-96231b3b80d8
Simon Pilgrim [Sun, 21 Jul 2019 19:04:44 +0000 (19:04 +0000)]
[X86] SimplifyDemandedVectorEltsForTargetNode - Move SUBV_BROADCAST narrowing handling. NFCI.
Move the narrowing of SUBV_BROADCAST to where we handle all the other opcodes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366660 91177308-0d34-0410-b5e6-96231b3b80d8
Nemanja Ivanovic [Sun, 21 Jul 2019 18:42:29 +0000 (18:42 +0000)]
[PowerPC][NFC] Regenerate test using script
This test case ended up as a hybrid of generated checks and manually inserted
checks. Regenerate using script to make it consistent.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366659 91177308-0d34-0410-b5e6-96231b3b80d8
Craig Topper [Sun, 21 Jul 2019 16:15:03 +0000 (16:15 +0000)]
[InstCombine] Update comment I missed in r366649. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366658 91177308-0d34-0410-b5e6-96231b3b80d8
Simon Pilgrim [Sun, 21 Jul 2019 16:06:26 +0000 (16:06 +0000)]
[SmallBitVector] Fix bug in find_next_unset for small types with indices >=32
We were creating a bitmask from a shift of unsigned instead of uintptr_t, meaning we couldn't create masks for indices above 31.
Noticed due to an MSVC analyzer warning.
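A rough model of why the narrower type cannot work: a mask covering indices above 31 simply does not fit in 32 bits. The sketch below only models the representable result (in C++ the oversized shift itself is undefined behaviour), and `low_bits_mask` is a hypothetical helper, not the SmallBitVector code.

```python
def low_bits_mask(count, type_bits):
    """Mask with the low `count` bits set, truncated to what a
    `type_bits`-wide unsigned type can represent."""
    return ((1 << count) - 1) & ((1 << type_bits) - 1)

narrow = low_bits_mask(40, 32)   # modelled as a 32-bit unsigned
wide = low_bits_mask(40, 64)     # modelled as a 64-bit uintptr_t
```

The 32-bit model loses bits 32 through 39, so any search that relies on the mask misses those indices.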
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366657 91177308-0d34-0410-b5e6-96231b3b80d8
Aditya Nandakumar [Sun, 21 Jul 2019 14:07:54 +0000 (14:07 +0000)]
[GISel]: Attach missing range metadata while translating G_LOADs
https://reviews.llvm.org/D65048
Attach range information to G_LOAD when only defining one register.
reviewed by: arsenm
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366656 91177308-0d34-0410-b5e6-96231b3b80d8
David Green [Sun, 21 Jul 2019 13:09:19 +0000 (13:09 +0000)]
[ARM] Move MVE VPT block tests into the Thumb2 directory. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366655 91177308-0d34-0410-b5e6-96231b3b80d8
Roman Lebedev [Sun, 21 Jul 2019 09:05:49 +0000 (09:05 +0000)]
[NFC][InstCombine] Add a few extra srem-by-power-of-two tests - extra uses
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366652 91177308-0d34-0410-b5e6-96231b3b80d8
Craig Topper [Sun, 21 Jul 2019 06:43:38 +0000 (06:43 +0000)]
[InstCombine] Remove insertRangeTest code that handles the equality case.
For equality, the function called getTrue/getFalse with the VT
of the comparison input. But getTrue/getFalse need the boolean VT.
So if this code ever executed, it would assert.
I believe these cases are removed by InstSimplify so we don't get here.
So this patch just fixes up an assert to exclude the equality
possibility and removes the broken code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366649 91177308-0d34-0410-b5e6-96231b3b80d8
Craig Topper [Sun, 21 Jul 2019 05:26:05 +0000 (05:26 +0000)]
[InstCombine] Don't use AddOne/SubOne to see if two APInts are 1 apart. Use APInt operations instead. NFCI
AddOne/SubOne create new Constant objects. That seems heavy for
comparing ConstantInts which wrap APInts. Just do the math
on the APInts and compare them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366648 91177308-0d34-0410-b5e6-96231b3b80d8
Nico Weber [Sun, 21 Jul 2019 00:03:55 +0000 (00:03 +0000)]
gn build: Merge r366622
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366646 91177308-0d34-0410-b5e6-96231b3b80d8
Roman Lebedev [Sat, 20 Jul 2019 21:34:00 +0000 (21:34 +0000)]
[NFC][InstCombine] Autogenerate a few tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366643 91177308-0d34-0410-b5e6-96231b3b80d8
Roman Lebedev [Sat, 20 Jul 2019 21:33:50 +0000 (21:33 +0000)]
[NFC][InstCombine] Add srem-by-signbit tests - still can fold to bittest
https://rise4fun.com/Alive/IIeS
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366642 91177308-0d34-0410-b5e6-96231b3b80d8
Roman Lebedev [Sat, 20 Jul 2019 19:25:44 +0000 (19:25 +0000)]
[NFC][Codegen][X86][AArch64] Add "(x s% C) == 0" tests
Much like with `urem`, the same optimization (albeit with slightly
different algorithm) applies for the signed case, too.
I'm simply copying the test coverage from `urem` case for now,
I believe it should be (close to?) sufficient.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366640 91177308-0d34-0410-b5e6-96231b3b80d8
Roman Lebedev [Sat, 20 Jul 2019 16:33:15 +0000 (16:33 +0000)]
[Codegen][SelectionDAG] X u% C == 0 fold: non-splat vector improvements
Summary:
Four things here:
1. Generalize the fold to handle non-splat divisors. Reasonably trivial.
2. Unban power-of-two divisors. I don't see any reason why they should
be illegal.
* There is no ban in Hacker's Delight
* I think the ban came from the same bug that caused the miscompile
in the base patch - in `floor((2^W - 1) / D)` we were dividing by
`D0` instead of `D`, and we **were** ensuring that `D0` is not `1`,
which made sense.
3. Unban `1` divisors. I no longer believe Hacker's Delight actually says
that the fold is invalid for `D = 0`. Further considerations:
* We know that
* `(X u% 1) == 0` can be constant-folded to `1`,
* `(X u% 1) != 0` can be constant-folded to `0`,
* Also, we know that
* `X u<= -1` can be constant-folded to `1`,
* `X u> -1` can be constant-folded to `0`,
* https://godbolt.org/z/7jnZJX https://rise4fun.com/Alive/oF6p
* We know we will end up with the following:
`(setule/setugt (rotr (mul N, P), K), Q)`
* Therefore, for given new DAG nodes and comparison predicates
(`ule`/`ugt`), we will still produce the correct answer if:
`Q` is an all-ones constant; and both `P` and `K` are *anything*
other than `undef`.
* The fold will indeed produce `Q = all-ones`.
4. Try to re-splat the `P` and `K` vectors - we don't care about
their values for the lanes where divisor was `1`.
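The scalar form of the fold, `(setule (rotr (mul N, P), K), Q)`, can be checked exhaustively for a small width, including divisor `1` and power-of-two divisors. This is a sketch under assumptions: it uses 8-bit scalars rather than vectors, the helper names are hypothetical, and `P`, `K`, `Q` are computed in the Hacker's Delight style (odd part's multiplicative inverse, trailing-zero count, and `floor((2^W - 1) / D)`). It needs Python 3.8+ for the modular inverse via `pow`.

```python
W = 8                       # bit width used for the exhaustive check
MASK = (1 << W) - 1

def rotr(value, amount):
    """Rotate a W-bit value right by `amount` bits."""
    amount %= W
    return ((value >> amount) | (value << (W - amount))) & MASK

def fold_params(d):
    """Return (P, K, Q) for divisor d, where d = d0 * 2^K with d0 odd."""
    k = (d & -d).bit_length() - 1   # number of trailing zero bits
    d0 = d >> k                     # odd part of the divisor
    p = pow(d0, -1, 1 << W)         # multiplicative inverse of d0 modulo 2^W
    q = MASK // d                   # floor((2^W - 1) / D)
    return p, k, q

def folded_is_divisible(x, d):
    """Evaluate rotr(x * P, K) u<= Q, the folded form of (x u% d) == 0."""
    p, k, q = fold_params(d)
    return rotr((x * p) & MASK, k) <= q

def check_all_divisors():
    """Compare the folded form against x % d == 0 for every 8-bit x and d >= 1."""
    return all(
        folded_is_divisible(x, d) == (x % d == 0)
        for d in range(1, 1 << W)
        for x in range(1 << W)
    )

fold_matches = check_all_divisors()
```

For `D = 1` the parameters come out as `P = 1`, `K = 0` and `Q` all-ones, so the comparison is always true, matching the constant-folding argument in point 3.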
Reviewers: RKSimon, hermord, craig.topper, spatel, xbolva00
Reviewed By: RKSimon
Subscribers: hiraditya, javed.absar, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63963
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366637 91177308-0d34-0410-b5e6-96231b3b80d8
Simon Pilgrim [Sat, 20 Jul 2019 15:20:11 +0000 (15:20 +0000)]
[X86][SSE] Use PSADBW to improve vXi8 sum reduction (PR42674)
As detailed on PR42674, we can reduce a vXi8 down until we have the final <8 x i8>, and then use PSADBW with zero, to sum those values. We then extract the bottom i8, discarding any overflow from the upper bits of the i16 result.
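The final step can be modelled in a few lines. This is only a sketch of the arithmetic, not of the DAG combine: `psadbw_with_zero` models one 8-byte lane of PSADBW against an all-zero operand, and the sample bytes are made up.

```python
def psadbw_with_zero(lane):
    """Sum of absolute differences of eight unsigned bytes against zero.

    The result fits in 16 bits because it is at most 8 * 255 = 2040.
    """
    if len(lane) != 8 or any(not 0 <= b <= 0xFF for b in lane):
        raise ValueError("expected eight unsigned bytes")
    return sum(abs(b - 0) for b in lane)

def reduce_add_v8i8(lane):
    """i8 add reduction: keep the bottom byte and discard the overflow bits."""
    return psadbw_with_zero(lane) & 0xFF

lane = [200, 100, 50, 25, 12, 6, 3, 1]   # hypothetical <8 x i8> input
wide_sum = psadbw_with_zero(lane)
result = reduce_add_v8i8(lane)
```

Taking the low byte gives the same value as adding the eight bytes with wrapping i8 arithmetic.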
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366636
91177308-0d34-0410-b5e6-
96231b3b80d8
Florian Hahn [Sat, 20 Jul 2019 12:25:47 +0000 (12:25 +0000)]
[Local] Zap blockaddress without users in ConstantFoldTerminator.
If the blockaddress is not destroyed, the destination block will still
be marked as having its address taken, limiting further transformations.
I think there are other places where dead blockaddress constants are kept
around; I'll look into that as a follow-up.
Reviewers: craig.topper, brzycki, davide
Reviewed By: brzycki, davide
Differential Revision: https://reviews.llvm.org/D64936
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366633 91177308-0d34-0410-b5e6-96231b3b80d8
Jessica Paquette [Sat, 20 Jul 2019 01:55:35 +0000 (01:55 +0000)]
[GlobalISel][AArch64] Contract trivial same-size cross-bank copies into G_STOREs
Sometimes, you can end up with cross-bank copies between same-sized GPRs and
FPRs, which feed into G_STOREs. When these copies feed only into stores, they
aren't necessary; we can just store using the original register bank.
This provides some minor code size savings for some floating point SPEC
benchmarks. (Around 0.2% for 453.povray and 450.soplex)
This issue doesn't seem to show up due to regbankselect or anything similar. So,
this patch introduces an early select function, `contractCrossBankCopyIntoStore`
which performs the contraction when possible. The selector then continues
normally and selects the correct store opcode, eliminating needless copies
along the way.
Differential Revision: https://reviews.llvm.org/D65024
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366625 91177308-0d34-0410-b5e6-96231b3b80d8
Guanzhong Chen [Fri, 19 Jul 2019 23:34:16 +0000 (23:34 +0000)]
[WebAssembly] Compute and export TLS block alignment
Summary:
Add immutable WASM global `__tls_align` which stores the alignment
requirements of the TLS segment.
Add `__builtin_wasm_tls_align()` intrinsic to get this alignment in Clang.
The expected usage has now changed to:
    __wasm_init_tls(memalign(__builtin_wasm_tls_align(),
                             __builtin_wasm_tls_size()));
Reviewers: tlively, aheejin, sbc100, sunfish, alexcrichton
Reviewed By: tlively
Subscribers: dschuff, jgravelle-google, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D65028
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366624 91177308-0d34-0410-b5e6-96231b3b80d8
Daniel Sanders [Fri, 19 Jul 2019 22:46:47 +0000 (22:46 +0000)]
Re-commit: r366610 and r366612: Expand pseudo-components before embedding in llvm-config
There were two main problems:
* The 'nativecodegen' pseudo-component was unconditionally adding
${native_tgt}CodeGen even though it conditionally added ${native_tgt}Info and
${native_tgt}Desc. This has been fixed by making ${native_tgt}CodeGen
conditional too.
* The 'all' pseudo-component was producing library names like LLVMLLVMDemangle
because the expansion was to a library name and not a component. There doesn't
seem to be a list of available components anywhere, so this has been fixed by
moving the expansion of 'all' back where it was before. This manifested in
different ways on different builders but it was the same root cause.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366622 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Fri, 19 Jul 2019 22:28:44 +0000 (22:28 +0000)]
AMDGPU/GlobalISel: Legalize GEP for other 32-bit address spaces
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366621 91177308-0d34-0410-b5e6-96231b3b80d8
Stanislav Mekhanoshin [Fri, 19 Jul 2019 21:43:42 +0000 (21:43 +0000)]
[AMDGPU] Autogenerate register sequences in tuples
Differential Revision: https://reviews.llvm.org/D65007
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366619 91177308-0d34-0410-b5e6-96231b3b80d8
Stanislav Mekhanoshin [Fri, 19 Jul 2019 21:29:51 +0000 (21:29 +0000)]
[AMDGPU] Fixed occupancy calculation for gfx10
Differential Revision: https://reviews.llvm.org/D65010
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366616 91177308-0d34-0410-b5e6-96231b3b80d8
Daniel Sanders [Fri, 19 Jul 2019 21:11:05 +0000 (21:11 +0000)]
Revert r366610 and r366612: Expand pseudo-components before embedding in llvm-config
Some targets are missing LLVMDemangle, one is adding the LLVM prefix twice, and two
are hitting the very error this patch fixes for my target. Reverting while I work
through the reports.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366615 91177308-0d34-0410-b5e6-96231b3b80d8
Craig Topper [Fri, 19 Jul 2019 21:09:21 +0000 (21:09 +0000)]
[InstCombine] Fix copy/paste mistake in the test cases I added for PR42691. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366614 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Fri, 19 Jul 2019 21:01:30 +0000 (21:01 +0000)]
AMDGPU: Avoid custom predicates for stores with glue
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366613 91177308-0d34-0410-b5e6-96231b3b80d8
Daniel Sanders [Fri, 19 Jul 2019 20:58:11 +0000 (20:58 +0000)]
Fix a latent bug discovered by r366610: nativecodegen includes X86CodeGen when X86 is not compiled
I believe this to have been a latent bug as the same expansion checks for the
existence of ${native_tgt}Info and ${native_tgt}Desc and only adds them if
they were compiled but unconditionally adds ${native_tgt}CodeGen.
This should fix llvm-clang-x86_64-win-fast which builds ARM only on an X86 host and similar builders.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366612 91177308-0d34-0410-b5e6-96231b3b80d8
Craig Topper [Fri, 19 Jul 2019 20:48:52 +0000 (20:48 +0000)]
[InstCombine] Add test cases for PR42691. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366611
91177308-0d34-0410-b5e6-
96231b3b80d8
Daniel Sanders [Fri, 19 Jul 2019 20:38:05 +0000 (20:38 +0000)]
Expand pseudo-components before embedding in llvm-config
Summary:
If you use pseudo-targets like AllTargetsCodeGens in LLVM_DYLIB_COMPONENTS
then a test will fail because `./bin/llvm-config --shared-mode` can't
handle these targets. We can fix this by expanding them before embedding
the string into llvm-config
Reviewers: bogner
Reviewed By: bogner
Subscribers: mgorny, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D65011
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366610 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Fri, 19 Jul 2019 20:24:40 +0000 (20:24 +0000)]
AMDGPU: Redefine setcc condition PatLeafs
Avoid using custom code predicates.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366609 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Fri, 19 Jul 2019 20:01:24 +0000 (20:01 +0000)]
AMDGPU: Don't rely on m0 being -1 for GWS offsets
This only works if the high bits of m0 are also 0, so m0 would have to
be set to 0xffff.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366608 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Fri, 19 Jul 2019 19:47:30 +0000 (19:47 +0000)]
AMDGPU: Force s_waitcnt after GWS instructions
This is apparently required to be the immediately following
instruction, so force it into a bundle with a waitcnt.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366607 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Fri, 19 Jul 2019 19:32:00 +0000 (19:32 +0000)]
LiveIntervals: Fix handleMove asserting on BUNDLE
The top-level BUNDLE instruction should behave as an ordinary
instruction. It is supposed to have all relevant registers as implicit
operands. Moving it should work as any other instruction. I believe
the assert intended to avoid moving instructions inside bundles.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366605 91177308-0d34-0410-b5e6-96231b3b80d8
Louis Dionne [Fri, 19 Jul 2019 18:52:46 +0000 (18:52 +0000)]
Revert "[libc++] Integrate the PSTL into libc++"
This reverts r366593, which caused unforeseen breakage on the build bots.
I'm reverting until the problems have been figured out and fixed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366603 91177308-0d34-0410-b5e6-96231b3b80d8
Michael Liao [Fri, 19 Jul 2019 18:50:53 +0000 (18:50 +0000)]
[AMDGPU] Add test case on crashing of `si-lower-sgpr-spills` pass
Reviewers: arsenm
Subscribers: qcolombet, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64273
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366602 91177308-0d34-0410-b5e6-96231b3b80d8
Nick Desaulniers [Fri, 19 Jul 2019 18:18:02 +0000 (18:18 +0000)]
Revert "Use the MachineBasicBlock symbol for a callbr target"
This reverts commit r366523/ccbffefccaff42b0d094c9ef0f49fc3e8c8456ea.
Two regressions were immediately reported:
- https://github.com/ClangBuiltLinux/linux/issues/614
- https://github.com/ClangBuiltLinux/linux/issues/615
Reported-by: nathanchance
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366600 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Morehouse [Fri, 19 Jul 2019 18:05:12 +0000 (18:05 +0000)]
[RISCV] Disable tests failing on buildbots.
r366399 enabled a couple tests that are failing on a few buildbots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366599 91177308-0d34-0410-b5e6-96231b3b80d8
Stanislav Mekhanoshin [Fri, 19 Jul 2019 18:05:01 +0000 (18:05 +0000)]
[AMDGPU] Allow register tuples to set asm names
This change reverts most of the previous register name generation.
The real problem is that RegisterTuple does not generate asm names.
Added optional operand to RegisterTuple. This way we can simplify
register name access and dramatically reduce the size of static
tables for the backend.
Differential Revision: https://reviews.llvm.org/D64967
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366598 91177308-0d34-0410-b5e6-96231b3b80d8
Matt Arsenault [Fri, 19 Jul 2019 17:52:56 +0000 (17:52 +0000)]
AMDGPU/GlobalISel: Fix MMO flags for kernel argument loads
The DAG lowering sets dereferenceable and invariant, not nontemporal.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366597
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 19 Jul 2019 17:32:19 +0000 (17:32 +0000)]
GlobalISel: Add GINodeEquiv for fcopysign
I don't need this at the moment, but it should be here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366596
91177308-0d34-0410-b5e6-
96231b3b80d8
Shoaib Meenai [Fri, 19 Jul 2019 17:19:57 +0000 (17:19 +0000)]
[llvm-lipo] Remove trailing whitespace. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366595
91177308-0d34-0410-b5e6-
96231b3b80d8
Louis Dionne [Fri, 19 Jul 2019 17:02:42 +0000 (17:02 +0000)]
[libc++] Integrate the PSTL into libc++
Summary:
This commit allows specifying LIBCXX_ENABLE_PARALLEL_ALGORITHMS when
configuring libc++ in CMake. When that option is enabled, libc++ will
assume that the PSTL can be found somewhere on the CMake module path,
and it will provide the C++17 parallel algorithms based on the PSTL
(that is assumed to be available).
The commit also adds support for running the PSTL tests as part of
the libc++ test suite.
Reviewers: rodgert, EricWF
Subscribers: mgorny, christof, jkorous, dexonsmith, libcxx-commits, mclow.lists, EricWF
Tags: #libc
Differential Revision: https://reviews.llvm.org/D60480
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366593
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 19 Jul 2019 16:45:48 +0000 (16:45 +0000)]
AMDGPU: Add some function return test cases
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366591
91177308-0d34-0410-b5e6-
96231b3b80d8
Simon Pilgrim [Fri, 19 Jul 2019 15:43:56 +0000 (15:43 +0000)]
[AMDGPU] Regenerate test file for upcoming patch. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366589
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 19 Jul 2019 14:56:24 +0000 (14:56 +0000)]
AMDGPU: Attempt to fix bot error
Manually remove file name from check line, since it somehow ends
up being different on an msvc bot.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366586
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 19 Jul 2019 14:42:40 +0000 (14:42 +0000)]
AMDGPU/GlobalISel: Selection for fminnum/fmaxnum
v2f16 case doesn't work yet because the VOP3P complex patterns haven't
been ported yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366585
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 19 Jul 2019 14:29:30 +0000 (14:29 +0000)]
AMDGPU/GlobalISel: Support arguments with multiple registers
Handles structs used directly in argument lists.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366584
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 19 Jul 2019 14:15:18 +0000 (14:15 +0000)]
AMDGPU/GlobalISel: Rewrite lowerFormalArguments
This should now handle everything except structs passed as multiple
registers.
I think most of the packing logic should be handled by
handleAssignments, but I'm unclear on what the contract is for
multiple registers. This is copying how x86 handles this.
This does change the behavior of the test_sgpr_alignment0 amdgpu_vs
test. I don't think shader arguments should try to follow the
alignment, and registers need to be repacked. I also don't think it
matters, since I think the pointers are packed to the beginning of the
argument list anyway.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366582
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 19 Jul 2019 13:57:44 +0000 (13:57 +0000)]
AMDGPU: Decompose all values to 32-bit pieces for calling conventions
This is the more natural lowering, and presents more opportunities to
reduce 64-bit ops to 32-bit.
This should also help avoid issues graphics shaders have had with
64-bit values, and simplify argument lowering in globalisel.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366578
91177308-0d34-0410-b5e6-
96231b3b80d8
Nico Weber [Fri, 19 Jul 2019 13:40:54 +0000 (13:40 +0000)]
gn build: Set +x on symlink_or_copy.py
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366576
91177308-0d34-0410-b5e6-
96231b3b80d8
Matt Arsenault [Fri, 19 Jul 2019 13:36:46 +0000 (13:36 +0000)]
DAG: Handle dbg_value for arguments split into multiple subregs
This was handled previously for arguments split due to not fitting in
an MVT. This was dropping the register for argument registers split
due to TLI::getRegisterTypeForCallingConv.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366574
91177308-0d34-0410-b5e6-
96231b3b80d8
Than McIntosh [Fri, 19 Jul 2019 13:13:54 +0000 (13:13 +0000)]
[NFC] include cstdint/string prior to using uint8_t/string
Summary: include proper header prior to use of uint8_t typedef
and std::string.
Subscribers: llvm-commits
Reviewers: cherry
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64937
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366572
91177308-0d34-0410-b5e6-
96231b3b80d8
Dmitry Preobrazhensky [Fri, 19 Jul 2019 13:12:47 +0000 (13:12 +0000)]
[AMDGPU][MC] Corrected parsing of branch offsets
See bug 40820: https://bugs.llvm.org/show_bug.cgi?id=40820
Reviewers: artem.tamazov, arsenm
Differential Revision: https://reviews.llvm.org/D64629
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366571
91177308-0d34-0410-b5e6-
96231b3b80d8
Kai Luo [Fri, 19 Jul 2019 12:58:16 +0000 (12:58 +0000)]
[MachineCSE][MachinePRE] Avoid hoisting code from cold regions into hot BBs.
Summary:
Current PRE hoists common computations into
CMBB = DT->findNearestCommonDominator(MBB, MBB1).
However, if CMBB is in a hot loop body, we might get performance
degradation.
Differential Revision: https://reviews.llvm.org/D64394
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366570
91177308-0d34-0410-b5e6-
96231b3b80d8
Than McIntosh [Fri, 19 Jul 2019 12:54:44 +0000 (12:54 +0000)]
[X86] For split-stack, do not save/restore the nested arg if unused
Summary:
For split-stack, if the nested argument (i.e. R10) is not used, there is no need to save/restore it in the prologue.
Reviewers: thanm
Reviewed By: thanm
Subscribers: mstorsjo, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64673
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366569
91177308-0d34-0410-b5e6-
96231b3b80d8
Roman Lebedev [Fri, 19 Jul 2019 11:29:18 +0000 (11:29 +0000)]
[NFC][InstCombine] Tests for 'rem' formation from sub-of-mul-by-'div' (PR42673)
https://rise4fun.com/Alive/8Rp
https://bugs.llvm.org/show_bug.cgi?id=42673
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366565
91177308-0d34-0410-b5e6-
96231b3b80d8
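Reading the title of the 'rem' formation commit above as the identity `x - (x / y) * y == x % y`, a minimal sketch can check it exhaustively on 8-bit values. The helper names (`sdiv`, `srem`, `identity_holds`) are hypothetical, and the signed helpers assume division that truncates toward zero with a remainder that takes the sign of the dividend:

```python
def sdiv(x, y):
    # Signed division truncating toward zero (Python's // floors instead).
    q = abs(x) // abs(y)
    return q if (x < 0) == (y < 0) else -q


def srem(x, y):
    # Signed remainder; the result takes the sign of the dividend.
    r = abs(x) % abs(y)
    return -r if x < 0 else r


def identity_holds():
    # Unsigned 8-bit: x - (x / y) * y == x % y for every non-zero y.
    for x in range(256):
        for y in range(1, 256):
            if x - (x // y) * y != x % y:
                return False
    # Signed 8-bit: skip y == 0 and the overflowing case -128 / -1.
    for x in range(-128, 128):
        for y in range(-128, 128):
            if y == 0 or (x == -128 and y == -1):
                continue
            if x - sdiv(x, y) * y != srem(x, y):
                return False
    return True
```

The check covers only the arithmetic identity; it says nothing about which IR shapes the tests exercise.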
Roman Lebedev [Fri, 19 Jul 2019 11:29:04 +0000 (11:29 +0000)]
[NFC][InstCombine] Redundant masking before left-shift: tests with assume
If the legality check is `(shiftNbits-maskNbits) s>= 0`,
then we can simplify it to `shiftNbits u>= maskNbits`,
which is easier to check for.
However, switching `dropRedundantMaskingOfLeftShiftInput()`
to `SimplifyICmpInst()` currently does not catch these cases and regresses
currently-handled cases, so I'll leave it as is for now.
https://rise4fun.com/Alive/25P
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366564
91177308-0d34-0410-b5e6-
96231b3b80d8
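The rewrite of `(shiftNbits-maskNbits) s>= 0` into `shiftNbits u>= maskNbits` mentioned in the commit above can be explored with a small brute-force sketch. It assumes 8-bit values and hypothetical helper names. The two checks agree when both amounts are valid shift amounts (below the bit width), but not for arbitrary 8-bit inputs:

```python
BITS = 8
MASK = (1 << BITS) - 1


def to_signed(v):
    # Reinterpret an 8-bit pattern as a two's-complement signed value.
    v &= MASK
    return v - (1 << BITS) if v & (1 << (BITS - 1)) else v


def signed_check(shift_n, mask_n):
    # (shiftNbits - maskNbits) s>= 0, with wrap-around subtraction.
    return to_signed((shift_n - mask_n) & MASK) >= 0


def unsigned_check(shift_n, mask_n):
    # shiftNbits u>= maskNbits
    return shift_n >= mask_n


def agree_for_valid_shift_amounts():
    # Both amounts restricted to 0 .. BITS-1.
    return all(
        signed_check(s, m) == unsigned_check(s, m)
        for s in range(BITS)
        for m in range(BITS)
    )
```

With unrestricted inputs such as `shift_n = 0` and `mask_n = 200`, the subtraction wraps to 56, so the signed check passes while the unsigned one fails.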
Simon Pilgrim [Fri, 19 Jul 2019 11:18:46 +0000 (11:18 +0000)]
Fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warning. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366563
91177308-0d34-0410-b5e6-
96231b3b80d8
Oliver Stannard [Fri, 19 Jul 2019 10:37:37 +0000 (10:37 +0000)]
Don't update NoTrappingFPMath and FPDenormalMode in resetTargetOptions
We'd like to remove this whole function, because these are properties of
functions, not the target as a whole. These two are easy to remove
because they are only used for emitting ARM build attributes, which
expect them to represent the defaults for the whole module, not just
the last function generated.
This is needed to get correct build attributes when using IPRA on ARM,
because IPRA causes resetTargetOptions to get called before
ARMAsmPrinter::emitAttributes.
Differential revision: https://reviews.llvm.org/D64929
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366562
91177308-0d34-0410-b5e6-
96231b3b80d8
George Rimar [Fri, 19 Jul 2019 10:15:03 +0000 (10:15 +0000)]
[llvm-readelf] - A fix for: "--hash-symbols asserts for 64-bit ELFs"
Fixes https://bugs.llvm.org/show_bug.cgi?id=42622.
(--hash-symbols switch is currently broken for 64-bit ELF files, due to r352630.)
Differential revision: https://reviews.llvm.org/D64788
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366558
91177308-0d34-0410-b5e6-
96231b3b80d8
Oliver Stannard [Fri, 19 Jul 2019 09:59:26 +0000 (09:59 +0000)]
[IPRA] Don't rely on non-exact function definitions
If a function definition is not exact, then the linker could select a
differently-compiled version of it, which could use different registers.
https://reviews.llvm.org/D64909
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366557
91177308-0d34-0410-b5e6-
96231b3b80d8
Mikhail Maltsev [Fri, 19 Jul 2019 09:46:28 +0000 (09:46 +0000)]
[ARM] Add <saturate> operand to SQRSHRL and UQRSHLL
Summary:
According to the new Armv8-M specification
https://static.docs.arm.com/ddi0553/bh/DDI0553B_h_armv8m_arm.pdf the
instructions SQRSHRL and UQRSHLL now have an additional immediate
operand <saturate>. The new assembly syntax is:
SQRSHRL<c> RdaLo, RdaHi, #<saturate>, Rm
UQRSHLL<c> RdaLo, RdaHi, #<saturate>, Rm
where <saturate> can be either 64 (the existing behavior) or 48, in
which case the result is saturated to 48 bits.
The new operand is encoded as follows:
#64 Encoded as sat = 0
#48 Encoded as sat = 1
sat is bit 7 of the instruction bit pattern.
This patch adds a new assembler operand class MveSaturateOperand which
implements parsing and encoding. Decoding is implemented in
DecodeMVEOverlappingLongShift.
Reviewers: ostannard, simon_tatham, t.p.northover, samparker, dmgreen, SjoerdMeijer
Reviewed By: simon_tatham
Subscribers: javed.absar, kristof.beyls, hiraditya, pbarrio, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64810
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366555
91177308-0d34-0410-b5e6-
96231b3b80d8
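The <saturate> encoding described in the commit above (#64 as sat = 0, #48 as sat = 1, with sat at bit 7) can be sketched as follows. The function names are hypothetical and are not the ones used by MveSaturateOperand or DecodeMVEOverlappingLongShift; only the bit position and the two values come from the commit message:

```python
SAT_BIT = 7  # "sat is bit 7 of the instruction bit pattern"


def encode_saturate(saturate):
    # Return the contribution of the #<saturate> operand to the encoding.
    if saturate == 64:
        return 0
    if saturate == 48:
        return 1 << SAT_BIT
    raise ValueError("saturate must be 64 or 48")


def decode_saturate(insn_bits):
    # Recover the #<saturate> value from an instruction bit pattern.
    return 48 if (insn_bits >> SAT_BIT) & 1 else 64
```

Any other operand value is rejected here with an exception; how the real assembler reports such an operand is not covered by this sketch.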
Hubert Tong [Fri, 19 Jul 2019 08:46:18 +0000 (08:46 +0000)]
[sanitizers] Use covering ObjectFormatType switches
Summary:
This patch removes the `default` case from some switches on
`llvm::Triple::ObjectFormatType`, and cases for the missing enumerators
(`UnknownObjectFormat`, `Wasm`, and `XCOFF`) are then added.
For `UnknownObjectFormat`, the effect of the action for the `default`
case is maintained; otherwise, where `llvm_unreachable` is called,
`report_fatal_error` is used instead.
Where the `default` case returns a default value, `report_fatal_error`
is used for XCOFF as a placeholder. For `Wasm`, the effect of the action
for the `default` case is maintained.
The code is structured to avoid strongly implying that the `Wasm` case
is present for any reason other than to make the switch cover all
`ObjectFormatType` enumerator values.
Reviewers: sfertile, jasonliu, daltenty
Reviewed By: sfertile
Subscribers: hiraditya, aheejin, sunfish, llvm-commits, cfe-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D64222
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366544
91177308-0d34-0410-b5e6-
96231b3b80d8
Jay Foad [Fri, 19 Jul 2019 08:40:37 +0000 (08:40 +0000)]
[AMDGPU] Simplify the exclusive scan used for optimized atomics
Summary:
Change the scan algorithm to use only power-of-two shifts (1, 2, 4, 8,
16, 32) instead of starting off shifting by 1, 2 and 3 and then doing
a 3-way ADD, because:
1. It simplifies the compiler a little.
2. It minimizes vgpr pressure because each instruction is now of the
form vn = vn + vn << c.
3. It is more friendly to the DPP combiner, which currently can't
combine into an ADD3 instruction.
Because of #2 and #3 the end result is improved from this:
v_add_u32_dpp v4, v3, v3 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0
v_mov_b32_dpp v5, v3 row_shr:2 row_mask:0xf bank_mask:0xf
v_mov_b32_dpp v1, v3 row_shr:3 row_mask:0xf bank_mask:0xf
v_add3_u32 v1, v4, v5, v1
s_nop 1
v_add_u32_dpp v1, v1, v1 row_shr:4 row_mask:0xf bank_mask:0xe
s_nop 1
v_add_u32_dpp v1, v1, v1 row_shr:8 row_mask:0xf bank_mask:0xc
s_nop 1
v_add_u32_dpp v1, v1, v1 row_bcast:15 row_mask:0xa bank_mask:0xf
s_nop 1
v_add_u32_dpp v1, v1, v1 row_bcast:31 row_mask:0xc bank_mask:0xf
To this:
v_add_u32_dpp v1, v1, v1 row_shr:1 row_mask:0xf bank_mask:0xf bound_ctrl:0
s_nop 1
v_add_u32_dpp v1, v1, v1 row_shr:2 row_mask:0xf bank_mask:0xf bound_ctrl:0
s_nop 1
v_add_u32_dpp v1, v1, v1 row_shr:4 row_mask:0xf bank_mask:0xe
s_nop 1
v_add_u32_dpp v1, v1, v1 row_shr:8 row_mask:0xf bank_mask:0xc
s_nop 1
v_add_u32_dpp v1, v1, v1 row_bcast:15 row_mask:0xa bank_mask:0xf
s_nop 1
v_add_u32_dpp v1, v1, v1 row_bcast:31 row_mask:0xc bank_mask:0xf
I.e. two fewer computational instructions, one extra nop where we could
schedule something else.
Reviewers: arsenm, sheredom, critson, rampitec, vpykhtin
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64411
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366543
91177308-0d34-0410-b5e6-
96231b3b80d8
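The power-of-two shift scheme from the commit above can be modelled on a flat list of lanes. This is a simplified sketch with hypothetical function names: it ignores the DPP row and bank masks and the row_bcast steps of the real instruction sequence, and it computes an inclusive scan, from which an exclusive scan is derived by shifting the result one lane:

```python
def inclusive_scan_pow2(lanes):
    # Each step has the form v = v + (v shifted by c lanes), c = 1, 2, 4, ...
    v = list(lanes)
    shift = 1
    while shift < len(v):
        v = [v[i] + (v[i - shift] if i >= shift else 0) for i in range(len(v))]
        shift *= 2
    return v


def exclusive_from_inclusive(inclusive):
    # Lane i of the exclusive scan is lane i-1 of the inclusive scan.
    return [0] + inclusive[:-1]
```

For 64 lanes the loop runs with shifts 1, 2, 4, 8, 16 and 32, matching the shift amounts listed in the commit message.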
Serguei Katkov [Fri, 19 Jul 2019 08:35:45 +0000 (08:35 +0000)]
[Loop Peeling] Enable peeling of multiple exits by default.
By default, enable loop peeling with multiple exits where all non-latch
exits end up with deopt.
Reviewers: reames, fhahn
Reviewed By: reames
Subscribers: xbolva00, hiraditya, zzheng, llvm-commits
Differential Revision: https://reviews.llvm.org/D64619
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366542
91177308-0d34-0410-b5e6-
96231b3b80d8
Roman Lebedev [Fri, 19 Jul 2019 08:26:58 +0000 (08:26 +0000)]
[InstCombine] Dropping redundant masking before left-shift [5/5] (PR42563)
Summary:
If we have some pattern that leaves only some low bits set, and then performs
left-shift of those bits, if none of the bits that are left after the final
shift are modified by the mask, we can omit the mask.
There are many variants to this pattern:
f. `((x << MaskShAmt) a>> MaskShAmt) << ShiftShAmt`
All these patterns can be simplified to just:
`x << ShiftShAmt`
iff:
f. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`)
Normally, the inner pattern is sign-extend,
but for our purposes it's no different to other patterns:
alive proofs:
f: https://rise4fun.com/Alive/7U3
For now let's start with patterns where both shift amounts are variable,
with a trivial constant "offset" between them, since I believe this is
both the simplest to handle and the most common.
But again, there are likely other variants where we could use
ValueTracking/ConstantRange to handle more cases.
https://bugs.llvm.org/show_bug.cgi?id=42563
Differential Revision: https://reviews.llvm.org/D64524
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366540
91177308-0d34-0410-b5e6-
96231b3b80d8
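Pattern f from the commit above, `((x << MaskShAmt) a>> MaskShAmt) << ShiftShAmt`, can be checked exhaustively on 8-bit values. This is a sketch with hypothetical helper names that assumes both amounts are below the bit width. It is no substitute for the linked Alive proof:

```python
BITS = 8
MASK = (1 << BITS) - 1


def shl(x, n):
    # Left shift, truncated to BITS bits.
    return (x << n) & MASK


def ashr(x, n):
    # Arithmetic right shift of a BITS-wide bit pattern.
    signed = x - (1 << BITS) if x & (1 << (BITS - 1)) else x
    return (signed >> n) & MASK


def sext_then_shift(x, mask_sh_amt, shift_sh_amt):
    # ((x << MaskShAmt) a>> MaskShAmt) << ShiftShAmt
    return shl(ashr(shl(x, mask_sh_amt), mask_sh_amt), shift_sh_amt)


def pattern_f_holds():
    # Legal when ShiftShAmt u>= MaskShAmt: the result equals x << ShiftShAmt.
    for m in range(BITS):
        for s in range(m, BITS):
            for x in range(1 << BITS):
                if sext_then_shift(x, m, s) != shl(x, s):
                    return False
    return True
```

When the condition does not hold, the sign-extended high bits survive the final shift: `sext_then_shift(0x04, 5, 0)` yields `0xFC`, not `0x04`.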
Roman Lebedev [Fri, 19 Jul 2019 08:26:47 +0000 (08:26 +0000)]
[InstCombine] Dropping redundant masking before left-shift [4/5] (PR42563)
Summary:
If we have some pattern that leaves only some low bits set, and then performs
left-shift of those bits, if none of the bits that are left after the final
shift are modified by the mask, we can omit the mask.
There are many variants to this pattern:
e. `((x << MaskShAmt) l>> MaskShAmt) << ShiftShAmt`
All these patterns can be simplified to just:
`x << ShiftShAmt`
iff:
e. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`)
alive proofs:
e: https://rise4fun.com/Alive/0FT
For now let's start with patterns where both shift amounts are variable,
with a trivial constant "offset" between them, since I believe this is
both the simplest to handle and the most common.
But again, there are likely other variants where we could use
ValueTracking/ConstantRange to handle more cases.
https://bugs.llvm.org/show_bug.cgi?id=42563
Differential Revision: https://reviews.llvm.org/D64521
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366539
91177308-0d34-0410-b5e6-
96231b3b80d8
Roman Lebedev [Fri, 19 Jul 2019 08:26:37 +0000 (08:26 +0000)]
[InstCombine] Dropping redundant masking before left-shift [3/5] (PR42563)
Summary:
If we have some pattern that leaves only some low bits set, and then performs
left-shift of those bits, if none of the bits that are left after the final
shift are modified by the mask, we can omit the mask.
There are many variants to this pattern:
d. `(x & ((-1 << MaskShAmt) >> MaskShAmt)) << ShiftShAmt`
All these patterns can be simplified to just:
`x << ShiftShAmt`
iff:
d. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`)
alive proofs:
d: https://rise4fun.com/Alive/I5Y
For now let's start with patterns where both shift amounts are variable,
with a trivial constant "offset" between them, since I believe this is
both the simplest to handle and the most common.
But again, there are likely other variants where we could use
ValueTracking/ConstantRange to handle more cases.
https://bugs.llvm.org/show_bug.cgi?id=42563
Differential Revision: https://reviews.llvm.org/D64519
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366538
91177308-0d34-0410-b5e6-
96231b3b80d8
Roman Lebedev [Fri, 19 Jul 2019 08:26:25 +0000 (08:26 +0000)]
[InstCombine] Dropping redundant masking before left-shift [2/5] (PR42563)
Summary:
If we have some pattern that leaves only some low bits set, and then performs
left-shift of those bits, if none of the bits that are left after the final
shift are modified by the mask, we can omit the mask.
There are many variants to this pattern:
c. `(x & (-1 >> MaskShAmt)) << ShiftShAmt`
All these patterns can be simplified to just:
`x << ShiftShAmt`
iff:
c. `(ShiftShAmt-MaskShAmt) s>= 0` (i.e. `ShiftShAmt u>= MaskShAmt`)
alive proofs:
c: https://rise4fun.com/Alive/RgJh
For now let's start with patterns where both shift amounts are variable,
with a trivial constant "offset" between them, since I believe this is
both the simplest to handle and the most common.
But again, there are likely other variants where we could use
ValueTracking/ConstantRange to handle more cases.
https://bugs.llvm.org/show_bug.cgi?id=42563
Differential Revision: https://reviews.llvm.org/D64517
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366537
91177308-0d34-0410-b5e6-
96231b3b80d8
Roman Lebedev [Fri, 19 Jul 2019 08:26:13 +0000 (08:26 +0000)]
[InstCombine] Dropping redundant masking before left-shift [1/5] (PR42563)
Summary:
If we have some pattern that leaves only some low bits set, and then performs
left-shift of those bits, if none of the bits that are left after the final
shift are modified by the mask, we can omit the mask.
There are many variants to this pattern:
b. `(x & (~(-1 << MaskShAmt))) << ShiftShAmt`
All these patterns can be simplified to just:
`x << ShiftShAmt`
iff:
b. `(MaskShAmt+ShiftShAmt) u>= bitwidth(x)`
alive proof:
b: https://rise4fun.com/Alive/y8M
For now let's start with patterns where both shift amounts are variable,
with a trivial constant "offset" between them, since I believe this is
both the simplest to handle and the most common.
But again, there are likely other variants where we could use
ValueTracking/ConstantRange to handle more cases.
https://bugs.llvm.org/show_bug.cgi?id=42563
Differential Revision: https://reviews.llvm.org/D64514
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366536
91177308-0d34-0410-b5e6-
96231b3b80d8
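Pattern b from the commit above keeps the low MaskShAmt bits of x and then shifts left. A brute-force sketch on 8-bit values, using hypothetical helper names and assuming both amounts are below the bit width, shows why the condition `(MaskShAmt+ShiftShAmt) u>= bitwidth(x)` makes the mask redundant:

```python
BITS = 8
MASK = (1 << BITS) - 1


def shl(x, n):
    # Left shift, truncated to BITS bits.
    return (x << n) & MASK


def masked_then_shifted(x, mask_sh_amt, shift_sh_amt):
    # (x & ~(-1 << MaskShAmt)) << ShiftShAmt
    low_mask = ~shl(MASK, mask_sh_amt) & MASK
    return shl(x & low_mask, shift_sh_amt)


def pattern_b_holds():
    # Check every (MaskShAmt, ShiftShAmt) pair that meets the condition.
    for m in range(BITS):
        for s in range(BITS):
            if m + s < BITS:
                continue
            for x in range(1 << BITS):
                if masked_then_shifted(x, m, s) != shl(x, s):
                    return False
    return True
```

The left shift discards the top ShiftShAmt bits of x, so only bits below `bitwidth - ShiftShAmt` survive; when MaskShAmt is at least that large, the mask clears nothing that survives. With MaskShAmt = 3 and ShiftShAmt = 4 the condition fails, and bit 3 of x is lost by the mask but kept by the plain shift.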