[Power9] Enable the Out-of-Order scheduling model for P9 hw

When switched to the MI scheduler for P9, the hardware is modeled as out of order.
However, inside the MI scheduler algorithm we still use the in-order scheduling model
because MicroOpBufferSize isn't set. The MI scheduler takes that to mean the hw cannot
buffer ops, so a pending instruction can only be scheduled once all the available
instructions have been issued. That is not true for our P9 hw.

This patch enables the Out-of-Order scheduling model. The buffer size of 44 is
picked from the P9 hw spec, and perf testing indicates that this value does not hurt cpu2017.

With this patch, 3 benchmarks improve by more than 3% and 1 benchmark degrades by more than 3%. The details are as follows:

x264_r: +6.95%
cactuBSSN_r: +6.94%
lbm_r: +4.11%
xz_r: -3.85%

The GEOMEAN for all the C/C++ benchmarks in spec2017 improves by about 0.18%.

Reviewer: Nemanjai
Differential Revision: https://reviews.llvm.org/D55810

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350285 91177308-0d34-0410-b5e6-96231b3b80d8

Teach ObjCARC optimizer about equivalent PHIs when eliminating autoreleaseRV/retainRV pairs

OptimizeAutoreleaseRVCall skips optimizing llvm.objc.autoreleaseReturnValue if it
sees a user which is llvm.objc.retainAutoreleasedReturnValue, and if they have
equivalent arguments (either identical or equivalent PHIs). It then assumes that
ObjCARCOpt::OptimizeRetainRVCall will optimize the pair instead.

Trouble is, ObjCARCOpt::OptimizeRetainRVCall doesn't know about equivalent PHIs,
so it optimizes in a different way and we are left with an unoptimized llvm.objc.autoreleaseReturnValue.

This teaches ObjCARCOpt::OptimizeRetainRVCall to also understand PHI equivalence.

rdar://problem/47005143

Reviewed By: ahatanak

Differential Revision: https://reviews.llvm.org/D56235

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350284 91177308-0d34-0410-b5e6-96231b3b80d8

Fix MSVC visualizer for PointerUnion4

Calculate which item is being held and then display it with the appropriate type. We also
optimize the display of PointerUnion3 to take advantage of our knowing that the IntMask is
always 1 in PointerUnion types.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350280 91177308-0d34-0410-b5e6-96231b3b80d8

[LLVM-C] Expand LLVMRelocMode

Summary: Add read[only|write] PIC relocation models to the C API and teach the TargetMachine API about it.

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D56187

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350279 91177308-0d34-0410-b5e6-96231b3b80d8

[tblgen][disasm] Emit record names again when decoder conflicts occur.

And add a test for it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350277 91177308-0d34-0410-b5e6-96231b3b80d8

[gold] emit assembly listing from gold plugin on LTO stage

Summary:
Sometimes it's useful to emit assembly after the LTO stage to modify it manually. Emitting a precodegen bitcode file (via the save-temps plugin option) and then feeding it to llc doesn't always give the same binary as the original.
This patch is a simpler alternative to https://reviews.llvm.org/D24020.

Patch by Denis Bakhvalov.

Reviewers: mehdi_amini, tejohnson

Reviewed By: tejohnson

Subscribers: MaskRay, inglorion, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D56114

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350276 91177308-0d34-0410-b5e6-96231b3b80d8

MSVC Visualizer for PointerUnion3

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350275 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add load folding support to the custom isel we do for X86ISD::UMUL/SMUL.

The peephole pass isn't always able to fold the load because it can't commute the implicit usage of AL/AX/EAX/RAX.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350272 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add test cases to show that we fail to fold loads into i8 smulo and i8/i16/i32/i64 umulo lowering without the assistance of the peephole pass. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350271 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] made assembler parse block_type

Summary:
This was previously ignored and an incorrect value generated.

Also fixed Disassembler's handling of block_type.

Reviewers: dschuff, aheejin

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D56092

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350270 91177308-0d34-0410-b5e6-96231b3b80d8

[ThinLTO] Scan all variants of vague symbol for reachability.

Summary:
An alias can make one variant (but not all) live; we still need to scan all the others if this symbol is reachable
from somewhere else.

Reviewers: tejohnson, grimar

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D56117

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350269 91177308-0d34-0410-b5e6-96231b3b80d8

[BDCE] Fix typo in test; NFC

shl by 32 is undefined. This was intended to be a shl by 31 as part
of a rotate sequence.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350265 91177308-0d34-0410-b5e6-96231b3b80d8

Fix assert in ObjCARC optimizer when deleting retainBlock of null or undef.

The caller to EraseInstruction had this conditional:

// ARC calls with null are no-ops. Delete them.
if (IsNullOrUndef(Arg))

but the assert inside EraseInstruction only allowed ConstantPointerNull and not
undef or bitcasts.

This adds support for both of these cases.

rdar://problem/47003805

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350261 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly][NFC] Elaborate on simd-noopt test comment

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350260 91177308-0d34-0410-b5e6-96231b3b80d8

[BDCE] Remove instructions without demanded bits

If an instruction has no demanded bits, remove it directly during BDCE,
instead of leaving it for something else to clean up.

Differential Revision: https://reviews.llvm.org/D56185

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350257 91177308-0d34-0410-b5e6-96231b3b80d8

Git ignore CLion project configuration files. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350256 91177308-0d34-0410-b5e6-96231b3b80d8

Format AggressiveInstCombine.cpp. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350255 91177308-0d34-0410-b5e6-96231b3b80d8

Fix MSVC PointerUnion visualizer

Differential Revision: https://reviews.llvm.org/D56186

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350250 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove X86ISD::INC/DEC. Just select them from X86ISD::ADD/SUB at isel time

INC/DEC are pretty much the same as ADD/SUB except that they don't update the C flag.

This patch removes the special nodes and just pattern matches from ADD/SUB during isel if the C flag isn't being used.

I had to avoid selecting DEC if the result isn't used. This will become a SUB immediate which will be turned into a CMP later by optimizeCompareInstr. This led to the one test change where we use a CMP instead of a DEC for an overflow intrinsic since we only checked the flag.

This also exposed a hole in our RMW flag matching use of hasNoCarryFlagUses. Our root node for the match is a store and there's no guarantee that all the flag users have been selected yet. So hasNoCarryFlagUses needs to check copyToReg and machine opcodes, but it also needs to check for the pre-match SETCC, SETCC_CARRY, BRCOND, and CMOV opcodes.

Differential Revision: https://reviews.llvm.org/D55975

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350245 91177308-0d34-0410-b5e6-96231b3b80d8

[MS Demangler] Add a flag for dumping types without tag specifier.

Sometimes it's useful to be able to output demangled names without
tag specifiers like "struct", "class", etc. This patch adds a
flag enabling this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350241 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] After performing the division by constant optimization for a DIV or REM node, replace the users of the corresponding REM or DIV node if it exists.

Currently we expand the two nodes separately. This gives DAG combiner an opportunity to optimize the expanded sequence taking into account only one set of users. When we expand the other node we'll create the expansion again, but might not be able to optimize it the same way. So the nodes won't CSE and we'll have two similarish sequences in the same basic block. By expanding both nodes at the same time we'll avoid prematurely optimizing the expansion until both the division and remainder have been replaced.
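
As a rough illustration of why sharing one expansion between the DIV and the REM is attractive, here is a minimal sketch of a divide-by-constant sequence for unsigned 32-bit division by 10. The helper names and the choice of divisor are assumptions made for this example and do not come from the patch; the constant 0xCCCCCCCD with a total right shift of 35 is the usual multiply-high form for this divisor. The remainder reuses the quotient instead of being expanded a second time.

```cpp
#include <cstdint>

// Hypothetical helper: x / 10 via a widening multiply and shift, the kind of
// sequence a divide-by-constant expansion produces for an unsigned 32-bit udiv.
inline uint32_t udiv10(uint32_t x) {
  return static_cast<uint32_t>(
      (static_cast<uint64_t>(x) * 0xCCCCCCCDull) >> 35);
}

// Hypothetical helper: x % 10 derived from the shared quotient, so only one
// multiply/shift sequence exists for the div/rem pair.
inline uint32_t urem10(uint32_t x) {
  return x - udiv10(x) * 10u;
}
```

If the two nodes were expanded independently and optimized against different sets of users, the two multiply/shift sequences could end up different enough that they no longer CSE.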

Improves the test case from PR38217. There may be additional opportunities after this.

Differential Revision: https://reviews.llvm.org/D56145

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350239 91177308-0d34-0410-b5e6-96231b3b80d8

[gn build] Add fuzzers in llvm/tools that are needed for check-llvm

Also add a fuzzer() template for defining fuzzers that's similar to
add_llvm_fuzzer in the CMake build, and a build file for dependency
llvm/lib/FuzzMutate.

Also make `assert(defined(...` error strings a bit more self-consistent.

Differential Revision: https://reviews.llvm.org/D56194

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350238 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Adding full coverage of MC encoding for the XOP and LWP ISAs.

Adding MC regressions tests to cover the XOP isa set.
This patch is part of a larger task to cover MC encoding of all X86 isa sets started in revision: https://reviews.llvm.org/D39952

Differential Revision: https://reviews.llvm.org/D41392

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350237 91177308-0d34-0410-b5e6-96231b3b80d8

[LegalizeIntegerTypes] When promoting the result of an extract_vector_elt also promote the input type if necessary

By also promoting the input type we get a better idea for what scalar type to use. This can provide better results if the result of the extract is sign extended. What was previously happening is that the extract result would be legalized, sometime later the input of the sign extend would be legalized using the result of the extract. Then later the extract input would be legalized forcing a truncate into the input of the sign extend using a replace all uses. This requires DAG combine to combine out the sext/truncate pair. But sometimes we visited the truncate first and messed things up before the sext could be combined.

By creating the extract with the correct scalar type when we legalize the result type, the truncate will be added right away. Then when the sign_extend input is legalized it will create an any_extend of the truncate, which can be optimized by getNode to maybe remove the truncate, followed by a sign_extend_inreg. Now DAG combine doesn't have to worry about getting rid of the extend.

This fixes the regression on X86 in D56156.

Differential Revision: https://reviews.llvm.org/D56176

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350236 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner][X86][PowerPC] Teach visitSIGN_EXTEND_INREG to fold (sext_in_reg (aext/sext x)) -> (sext x) when x has more than 1 sign bit and the sext_inreg is from one of them.

If x has multiple sign bits then it doesn't matter which one we extend from, so we can sext from x's msb instead.
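
The fold can be checked with plain integer arithmetic. The sketch below is not DAGCombiner code; the function names are made up for illustration, x is modelled as an i16, and the any-extend is modelled by filling the upper 16 bits with arbitrary garbage. When x fits in 8 bits it has at least 9 sign bits, so a sext_inreg from bit 7 lands on one of them.

```cpp
#include <cstdint>

// Sign-extend the low `bits` bits of v to 32 bits (a sext_inreg from `bits`).
// The final unsigned-to-signed conversion is assumed to be modular.
inline int32_t sextInReg(uint32_t v, unsigned bits) {
  const uint32_t mask = (bits == 32) ? 0xFFFFFFFFu : ((1u << bits) - 1u);
  const uint32_t sign = 1u << (bits - 1);
  return static_cast<int32_t>(((v & mask) ^ sign) - sign);
}

// Returns true when (sext_inreg (aext x), i8) equals (sext x) for this x.
inline bool foldHolds(int16_t x, uint32_t upperGarbage) {
  // Model the any-extend: low 16 bits are x, upper 16 bits are unspecified.
  const uint32_t aext = (upperGarbage << 16) | static_cast<uint16_t>(x);
  const int32_t viaInReg = sextInReg(aext, 8);
  const int32_t viaSext = static_cast<int32_t>(x);
  return viaInReg == viaSext;
}
```

For a value such as 300, which needs more than 8 bits, bit 7 is not a sign bit of x and the two results differ, which is why the combine requires the sext_inreg to be from one of the sign bits.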

The X86 setcc-combine.ll changes are a little weird. It appears we ended up with a (sext_inreg (aext (trunc (extractelt)))) after type legalization. The sext_inreg+aext now gets optimized by this combine to leave (sext (trunc (extractelt))). Then we visit the trunc before we visit the sext. This ends up changing the truncate to an extractvectorelt from a bitcasted vector. I have a follow up patch to fix this.

Differential Revision: https://reviews.llvm.org/D56156

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350235 91177308-0d34-0410-b5e6-96231b3b80d8

[gn build] Add build files for bugpoint-passes and LLVMHello plugins

These two plugins are loaded into a host process that contains all LLVM
symbols, so they don't link against anything. This required minor readjustments
to the tablegen() setup of IR.

Needed for check-llvm.

Differential Revision: https://reviews.llvm.org/D56204

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350234 91177308-0d34-0410-b5e6-96231b3b80d8

[gn build] Add some llvm/tools: lli, lli-child-target

Also add build files for dependencies llvm/lib/ExecutionEngine/{Interpreter,Orc}

Needed for check-llvm.

Differential Revision: https://reviews.llvm.org/D56193

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350226 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Remove SeenUse check when optimizing conditional branch in
PPCPreEmitPeephole pass.

PPCPreEmitPeephole will convert a BC to B when the conditional branch is
based on a constant CR by CRSET or CRUNSET. This is added in
https://reviews.llvm.org/rL343100.

When the conditional branch is known to be always taken, all branches will
be removed and a new unconditional branch will be inserted. However, when
SeenUse is false the original patch will not remove the branches, but still
insert the new unconditional branch, update the successors and create
inconsistent IR. Compiling the synthetic testcase included can show the
problem we run into.

The patch simply removes the SeenUse condition when adding branches into
InstrsToErase set.

Differential Revision: https://reviews.llvm.org/D56041

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350223 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Support SHLD/SHRD masked shift-counts (PR34641)

Peek through shift modulo masks while matching double shift patterns.

I was hoping to delay this until I could remove the X86 code with generic funnel shift matching (PR40081) but this will do for now.

Differential Revision: https://reviews.llvm.org/D56199

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350222 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] add more tests for potential horizontal ops; NFC

As discussed in D56011 - add runs for AVX512 and tests with extra uses.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350221 91177308-0d34-0410-b5e6-96231b3b80d8

[BasicAA] Support arbitrary pointer sizes (and fix an overflow bug)

Motivated by the discussion in D38499, this patch updates BasicAA to support
arbitrary pointer sizes by switching most remaining non-APInt calculations to
use APInt. The size of these APInts is set to the maximum pointer size (maximum
over all address spaces described by the data layout string).

Most of this translation is straightforward, but this patch contains a fix for
a bug that revealed itself during this translation process. In order for
test/Analysis/BasicAA/gep-and-alias.ll to pass, which is run with 32-bit
pointers, the intermediate calculations must be performed using 64-bit
integers. This is because, as noted in the patch, when GetLinearExpression
decomposes an expression into C1*V+C2, and we then multiply this by Scale, and
distribute, to get (C1*Scale)*V + C2*Scale, it can be the case that, even
though C1*V+C2 does not overflow for relevant values of V, (C2*Scale) can
overflow. If this happens, later logic will draw invalid conclusions from the
(base) offset value. Thus, when initially applying the APInt conversion,
because the maximum pointer size in this test is 32 bits, it started failing.
Suspicious, I created a 64-bit version of this test (included here), and that
failed (miscompiled) on trunk for a similar reason (the multiplication can
overflow).
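
The overflow is easy to reproduce with small numbers. The following is only a sketch with a made-up helper, not BasicAA code: it computes C2*Scale in a two's-complement integer of a chosen width, so the 32-bit and 64-bit results can be compared. The operand values in the usage note are assumptions picked to trigger the wrap.

```cpp
#include <cstdint>

// Hypothetical helper: multiply c2 by scale in a `bits`-wide two's-complement
// integer and return the result sign-extended to 64 bits.
inline int64_t mulWrapped(int64_t c2, int64_t scale, unsigned bits) {
  const uint64_t product =
      static_cast<uint64_t>(c2) * static_cast<uint64_t>(scale);
  if (bits == 64)
    return static_cast<int64_t>(product);
  const uint64_t mask = (1ull << bits) - 1;
  const uint64_t sign = 1ull << (bits - 1);
  // Truncate to `bits` bits, then sign-extend back to 64 bits.
  return static_cast<int64_t>(((product & mask) ^ sign) - sign);
}
```

With C2 = 0x20000000 and Scale = 4, the 32-bit product is -2147483648 while the 64-bit product is 2147483648, so any later reasoning on the sign or magnitude of the base offset would go wrong at 32 bits.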

After fixing this overflow bug, the first test case (at least) in
Analysis/BasicAA/q.bad.ll started failing. This is also a 32-bit test, and was
relying on having 64-bit intermediate values to have BasicAA return an accurate
result. In order to fix this problem, and because I believe that it is not
uncommon to use i64 indexing expressions in 32-bit code (especially portable
code using int64_t), it seems reasonable to always use at least 64-bit
integers. In this way, we won't regress our analysis capabilities (and there's
a command-line option added, so experimenting with this should be easy).

As pointed out by Eli during the review, there are other potential overflow
conditions that this patch does not address. Fixing those is left to follow-up
work.

Patch by me with contributions from Michael Ferguson (mferguson@cray.com).

Differential Revision: https://reviews.llvm.org/D38662

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350220 91177308-0d34-0410-b5e6-96231b3b80d8

Extend Module::getOrInsertGlobal to control the construction of the
GlobalVariable

Summary:
Extend Module::getOrInsertGlobal to accept a callback for creating a new
GlobalVariable if necessary instead of calling the GV constructor
directly using default arguments. Additionally overload
getOrInsertGlobal for the previous default behavior.
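
The shape of the change can be sketched with toy types. ToyModule and ToyGlobal below are stand-ins invented for this example and are not the LLVM classes; the real signatures are in the patch. The point is only the pattern: one overload takes a creation callback that runs when the name is missing, and a second overload keeps the old default-construction behavior by forwarding to the first.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>

// Toy stand-ins for illustration only; these are not the LLVM classes.
struct ToyGlobal {
  std::string Name;
  bool IsConstant = false;
};

class ToyModule {
  std::map<std::string, std::unique_ptr<ToyGlobal>> Globals;

public:
  // Return the existing global, or build one through the caller's callback.
  ToyGlobal *getOrInsertGlobal(
      const std::string &Name,
      const std::function<std::unique_ptr<ToyGlobal>()> &CreateCallback) {
    auto It = Globals.find(Name);
    if (It != Globals.end())
      return It->second.get();
    auto Inserted = Globals.emplace(Name, CreateCallback());
    return Inserted.first->second.get();
  }

  // Overload preserving the previous behavior: construct with defaults.
  ToyGlobal *getOrInsertGlobal(const std::string &Name) {
    return getOrInsertGlobal(Name, [&] {
      auto G = std::make_unique<ToyGlobal>();
      G->Name = Name;
      return G;
    });
  }
};
```

The callback is only invoked on a miss, so a caller that wants a non-default global pays nothing when the global already exists.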

Reviewers: chandlerc

Subscribers: hiraditya, llvm-commits, bollu

Differential Revision: https://reviews.llvm.org/D56130

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350219 91177308-0d34-0410-b5e6-96231b3b80d8

[MCA] Minor refactoring of method DefaultResourceStrategy::select. NFCI

Common code used by the default resource strategy to select pipeline resources
has been moved to a helper function.

The new selection logic has been slightly rewritten to get rid of a redundant
zero check on the `ReadyMask` value. Before this patch, method select internally
called function `PowerOf2Floor` to compute the next ready pipeline resource.
However, `PowerOf2Floor` forces an implicit (redundant) zero check on the input
value. By construction, `ReadyMask` can never be zero. This patch replaces the
call to `PowerOf2Floor` with an equivalent block of code which avoids the
redundant zero check. This gives a minor 3-3.5% speedup on a release build.
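
The idea can be sketched as follows. This is not the code from the patch: both functions are hypothetical, and `__builtin_clzll` is a GCC/Clang builtin assumed to be available. A general-purpose power-of-two floor has to guard against zero because counting leading zeros of 0 is undefined, whereas a caller that already knows the mask is nonzero can skip the guard.

```cpp
#include <cstdint>

// Highest power of two <= Value, with the guard a general-purpose helper
// needs because __builtin_clzll(0) is undefined.
inline uint64_t powerOf2FloorChecked(uint64_t Value) {
  if (Value == 0)
    return 0;
  return 1ull << (63 - __builtin_clzll(Value));
}

// Same computation under the precondition Mask != 0, so no guard is needed.
inline uint64_t highestSetBit(uint64_t Mask) {
  return 1ull << (63 - __builtin_clzll(Mask));
}
```

Both functions agree on every nonzero input, which is the only case `select` can see if `ReadyMask` is nonzero by construction.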

No functional change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350218 91177308-0d34-0410-b5e6-96231b3b80d8

[gn build] Add some llvm/tools: bugpoint, dsymutil, llvm-opt-report

Also add build file for dependency llvm/lib/OptRemarks.

Needed for check-llvm.

Differential Revision: https://reviews.llvm.org/D56192

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350217 91177308-0d34-0410-b5e6-96231b3b80d8

[gn build] Add some llvm/tools: llvm-c-test, llvm-cfi-verify, llvm-cov, llvm-cvtres

Needed for check-llvm.

Differential Revision: https://reviews.llvm.org/D56191

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350216 91177308-0d34-0410-b5e6-96231b3b80d8

[gn build] Add some llvm/tools: llvm-cxxdump, llvm-cxxfilt, llvm-cxxmap

Needed for check-llvm.

This is the last target reading llvm_install_binutils_symlinks.

Differential Revision: https://reviews.llvm.org/D56190

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350215 91177308-0d34-0410-b5e6-96231b3b80d8

[gn build] Add some llvm/tools: llvm-diff, llvm-dwp

Needed for check-llvm.

Differential Revision: https://reviews.llvm.org/D56189

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350214 91177308-0d34-0410-b5e6-96231b3b80d8

[gn build] Add some llvm/tools: llvm-mca, llvm-mt

Also add build file for dependency llvm/lib/MCA.

Needed for check-llvm.

Differential Revision: https://reviews.llvm.org/D56166

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350213 91177308-0d34-0410-b5e6-96231b3b80d8

[gn build] Add some llvm/tools: llvm-size, llvm-split, llvm-strings

Needed for check-llvm.

Differential Revision: https://reviews.llvm.org/D56164

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350212 91177308-0d34-0410-b5e6-96231b3b80d8

[gn build] Add some llvm/tools: llvm-xray, sancov, sanstats, verify-uselistorder, yaml-bench

Also add build file for dependency llvm/lib/XRay.

Needed for check-llvm.

(yaml-bench is an llvm/util, not an llvm/tool.)

Differential Revision: https://reviews.llvm.org/D56163

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350211 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Handle OR as operand of raw load/store

Summary:
Use isBaseWithConstantOffset() which handles OR as an operand
to llvm.amdgcn.raw.buffer.load and llvm.amdgcn.raw.buffer.store.

Change-Id: Ifefb9dc5ded8710d333df07ab1900b230e33539a

Reviewers: nhaehnle, mareko, arsenm

Reviewed By: arsenm

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D55999

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350208 91177308-0d34-0410-b5e6-96231b3b80d8

Revert rL350035 "[llvm-exegesis] Clustering: don't enqueue a point multiple times"

Let's discuss this on the review thread before submitting.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350207 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove the separate SMUL8/UMUL8 X86ISD opcodes by merging with SMUL/UMUL. Remove the second result from X86ISD::UMUL.

All of these use custom isel so we can pretty easily detect the differences in the custom code in X86ISelDAGToDAG. The ISD opcodes just need to express the desired semantics not the details of how they would be selected by isel. So unifying them lets us remove the special casing from lowering.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350206 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Allow LowerSELECT and LowerBRCOND to directly lower i8 UMULO/SMULO.

These require a different X86ISD node to be created than i16/i32/i64. I guess no one wanted to add the special code for that except in LowerXALUO. But now LowerXALUO, LowerSELECT, and LowerBRCOND all use a common helper function so they all share the special code.

Unfortunately, there are no test changes because we seem to correct the miss in a DAG combine later. I did verify it manually using test cases from xmulo.ll.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350205 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add i8/i16 smulo/umulo test cases where the overflow indication is used by a mask.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350204 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove KNL specific check prefix from xmulo.ll test. NFC

This was added at a time when i1 was a legal type with avx512f and there was a bug. i1 is no longer considered a legal type with avx512f so there should be no codegen difference.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350203 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] canonicalize raw IR rotate patterns to funnel shift

The final piece of IR-level analysis to allow this was committed with:
rL350188

Using the intrinsics should improve transforms based on cost models
like vectorization and inlining.

The backend should be prepared too, so we can now canonicalize more
sequences of shift/logic to the intrinsics and know that the end
result should be equal to or better than the original code even if the
target does not have an actual rotate instruction.
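
For reference, this is the kind of source-level idiom that produces a raw shift/or rotate pattern in IR. The function name is an assumption for this example. A funnel shift left with both data operands equal to x (llvm.fshl.i32(x, x, n)) computes the same value, since the shift amount is taken modulo the bit width.

```cpp
#include <cstdint>

// Rotate left written with shifts and an OR. The masks keep both shift
// amounts in [0, 31], so the expression is well defined for any n.
inline uint32_t rotl32(uint32_t x, uint32_t n) {
  return (x << (n & 31u)) | (x >> ((32u - n) & 31u));
}
```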

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350199 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Factor the core code out of LowerXALUO into a helper function. Use it in LowerBRCOND and LowerSELECT to avoid some duplicated code.

This makes it easier to keep the LowerBRCOND and LowerSELECT code in sync with LowerXALUO so they always pick the same operation for overflowing instructions.

This is inspired by the helper functions used by ARM and AArch64 for the same purpose.

The test change is because LowerSELECT was not in sync with LowerXALUO with regard to INC/DEC for SADDO/SSUBO.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350198 91177308-0d34-0410-b5e6-96231b3b80d8

[LLVM-C] bool -> LLVMBool

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350197 91177308-0d34-0410-b5e6-96231b3b80d8

[LLVM-C] Add Accessors for Discarding Value Names in the IR

Summary: Add accessors so the performance improvement from this setting is accessible to third parties.

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D56179

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350196 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove KNL specific check prefix from xaluo.ll test. NFC

This was added at a time when i1 was a legal type with avx512f and there was a bug. i1 is no longer considered a legal type with avx512f so there should be no codegen difference.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350195 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add test cases to show where LowerSELECT doesn't select SADDO/SSUBO to INC/DEC, but LowerXALUOOp does. Leading to duplicate code.

When SADDO/SSUBO is used as a part of a condition, the X86 backend has to lower the instruction twice: once for the flags use and once for the data use. These two selections should be kept in sync so they end up with one node providing the data and the flags. This doesn't seem to be happening for INC/DEC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350194 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] move/rename helper for horizontal op codegen; NFC

Preliminary commit as suggested in D56011.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350193 91177308-0d34-0410-b5e6-96231b3b80d8

[BDCE] Regenerate test checks; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350190 91177308-0d34-0410-b5e6-96231b3b80d8

[BDCE] Remove -instsimplify from BDCE test; NFC

To make it more obvious which part of the transformation is carried
out by BDCE. Also drop the CHECK-IO lines which only run -instsimplify
as they don't really seem meaningful if the main check doesn't run
-instsimplify either.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350189 91177308-0d34-0410-b5e6-96231b3b80d8

Reapply "[BDCE][DemandedBits] Detect dead uses of undead instructions"

This (mostly) fixes https://bugs.llvm.org/show_bug.cgi?id=39771.

BDCE currently detects instructions that don't have any demanded bits
and replaces their uses with zero. However, if an instruction has
multiple uses, then some of the uses may be dead (have no demanded bits)
even though the instruction itself is still live. This patch extends
DemandedBits/BDCE to detect such uses and replace them with zero.
While this will not immediately render any instructions dead, it may
lead to simplifications (in the motivating case, by converting a rotate
into a simple shift), break dependencies, etc.
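
A small source-level analogue of a dead use of a live value, written as a sketch with made-up function names. The value t is live because it is stored through `other`, but its use in the shift contributes nothing to the demanded low 8 bits of the result, so that one use can be replaced with zero without changing behavior.

```cpp
#include <cstdint>

// t is live (it is stored through `other`), but only the low 8 bits of the
// OR are demanded, and (t << 8) contributes none of them.
inline uint32_t before(uint32_t x, uint32_t y, uint32_t *other) {
  const uint32_t t = x * y;
  *other = t;
  return ((t << 8) | y) & 0xFFu;
}

// The same function after replacing the dead use of t with zero; the
// shift/or then simplifies away, leaving just y & 0xFF.
inline uint32_t after(uint32_t x, uint32_t y, uint32_t *other) {
  const uint32_t t = x * y;
  *other = t;
  return ((0u << 8) | y) & 0xFFu;
}
```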

The implementation tries to strike a balance between analysis power and
complexity/memory usage. Originally I wanted to track demanded bits on
a per-use level, but ultimately we're only really interested in whether
a use is entirely dead or not. I'm using an extra set to track which uses
are dead. However, as initially all uses are dead, I'm not storing uses
whose user is also dead. This case is checked separately instead.

The previous attempt to land this led to miscompiles, because cases
where uses were initially dead but were later found to be live during
further analysis were not always correctly removed from the DeadUses
set. This is fixed now and the added test case demonstrates such an
instance.

Differential Revision: https://reviews.llvm.org/D55563

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350188 91177308-0d34-0410-b5e6-96231b3b80d8

Revert the commit in revision 350186. The revision causes regressions in 4
tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350187 91177308-0d34-0410-b5e6-96231b3b80d8

Omit range checks from jump tables when lowering switches with unreachable
default

When lowering a switch to a jump table, a range check is performed before
indexing into the table: if the switch value is outside the jump table range,
a conditional branch jumps to the default block. When the default block is
unreachable, this conditional branch can be omitted. This patch implements
that omission for unreachable defaults.
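
A source-level example of the situation, as a sketch: the function name and case bodies are made up, and `__builtin_unreachable()` is a GCC/Clang builtin assumed to be available. The dense case values make a jump table a likely lowering, and the default is marked unreachable, so a range check on `op` before indexing the table is not needed for correctness.

```cpp
// Dense cases with an unreachable default. Calling this with an `op`
// outside 0..5 is undefined behavior, which is what allows the range
// check to be dropped.
inline int apply(int op, int a, int b) {
  switch (op) {
  case 0: return a + b;
  case 1: return a - b;
  case 2: return a * b;
  case 3: return a & b;
  case 4: return a | b;
  case 5: return a ^ b;
  default: __builtin_unreachable();
  }
}
```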

Review Reference: D52002

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350186 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] canonicalize MUL with NEG operand

-X * Y --> -(X * Y)
X * -Y --> -(X * Y)
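
The identity behind both rewrites holds in wrapping two's-complement arithmetic, which can be checked with unsigned integers. The helper name is an assumption for this example.

```cpp
#include <cstdint>

// Checks that (-X) * Y and X * (-Y) both equal -(X * Y) modulo 2^32.
inline bool negMulAgrees(uint32_t x, uint32_t y) {
  const uint32_t canonical = 0u - (x * y); // -(X * Y)
  return (0u - x) * y == canonical && x * (0u - y) == canonical;
}
```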

Differential Revision: https://reviews.llvm.org/D55961

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350185 91177308-0d34-0410-b5e6-96231b3b80d8

[gn build] Add some llvm/tools: llvm-exegesis, llvm-extract, llvm-link

Also add build file for dependency llvm/lib/ExecutionEngine/MCJIT.

The exegesis stuff is pretty hairy and knows a lot about Target internals (in
general, not specifically in the GN build). I put the llvm-tblgen -gen-exegesis
call in llvm/tools/llvm-exegesis/lib/X86, instead of in llvm/lib/Target/X86
where it is in CMake land, and asked on D52932 why it's in that place in the
CMake build.

Needed for check-llvm.

Differential Revision: https://reviews.llvm.org/D56167

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350184 91177308-0d34-0410-b5e6-96231b3b80d8

[gn build] Add some llvm/tools: llvm-rc, llvm-rtdyld

Also add build file for dependencies llvm/lib/ExecutionEngine,
llvm/lib/ExecutionEngine/RuntimeDyld.

Needed for check-llvm.

Differential Revision: https://reviews.llvm.org/D56165

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350183 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add PR34641 masked shld/shrd test cases

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350181 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add additional RUN lines to prepare for D56156. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350180 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] Add SIGN_EXTEND_VECTOR_INREG support to computeKnownBits.

Differential Revision: https://reviews.llvm.org/D56168

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350179 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add X86ISD::VSRAI to computeKnownBitsForTargetNode.

Differential Revision: https://reviews.llvm.org/D56169

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350178 91177308-0d34-0410-b5e6-96231b3b80d8

Keep tablegen commands in alphabetical order. NFCI.

Mentioned on D56167.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350176 91177308-0d34-0410-b5e6-96231b3b80d8

[test] Fix propagating HOME envvar to unittests

Propagate HOME environment variable to unittests. This is necessary
to fix test failures resulting from pw_home pointing to a non-existing
directory while being overridden with HOME. Apparently Gentoo users
hit this sometimes when they override build directory for Portage.

Original bug report: https://bugs.gentoo.org/674088

Differential Revision: https://reviews.llvm.org/D56162

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350175 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Accept "sve" as arch feature in assembler

Differential Revision: https://reviews.llvm.org/D56128

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350174 91177308-0d34-0410-b5e6-96231b3b80d8

[MSan] Handle llvm.is.constant intrinsic

MSan used to report false positives in the case the argument of
llvm.is.constant intrinsic was uninitialized.
In fact checking this argument is unnecessary, as the intrinsic is only
used at compile time, and its value doesn't depend on the value of the
argument.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350173 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] Add missing one use check on the shuffle in the bitcast(shuffle(bitcast(s0),bitcast(s1))) -> shuffle(s0,s1) transform.

Found while trying out some other changes so I don't really have a test case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350172 91177308-0d34-0410-b5e6-96231b3b80d8

[gn build] Make `ninja check-clang` also run Clang's unit tests

Also add a build file for clang/lib/ASTMatchers/Dynamic, which is only needed
by tests (and clang/tools/extra).

Also make llvm/utils/gn/build/sync_source_lists_from_cmake.py check that every
CMakeLists.txt file below {lld,clang}/unittests has a corresponding BUILD.gn
file, so we notice if new test binaries get added (since the failure mode for
missing GN build files for tests is just the tests silently not running in the
GN build).

Also add a unittest() macro for defining unit test targets, and add a lengthy
comment there about where the unit test binaries go and why.

With this, the build files for //clang are complete.

Differential Revision: https://reviews.llvm.org/D56116

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350171 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Implement the .arch_extension directive

Differential Revision: https://reviews.llvm.org/D56131

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350169 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] [COFF] Use Error/Expected returns instead of calling reportError. NFC.

Differential Revision: https://reviews.llvm.org/D55922

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350168 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Fix machine verifier "bad machine code" error for the PATCHPOINT pseudo instruction

Summary:
For SDAG, we pretend patchpoints aren't special at all until we emit the code for the pseudo.
Then the verifier runs and it seems like we have a use of an undefined register (the register will
be reserved later, but the verifier doesn't know that).

So this patch calls setUsesTOCBasePtr before emitting the code for the pseudo, so that the
verifier knows X2 is a reserved register.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D56148

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350165 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Fixed extra semicolon warning

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350162 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Fix ADDE, SUBE do not know how to promote operator

Summary:
This patch is created to fix the Bugzilla bug 39815:
https://bugs.llvm.org/show_bug.cgi?id=39815

This patch adds support for promoting the integer result of the ADDE and SUBE instructions.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D56119

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350161 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Don't mark SEXTLOAD from v4i8/v4i16/v8i8 as Custom on pre-sse4.1.

This seems to be getting in the way more than it's helping. This does mean we stop scalarizing some cases, but I'm not convinced the scalarization was really better.

Some of the changes to vsel-cmp-load.ll are a regression but D56156 should fix it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350159 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add custom type legalization for SIGN_EXTEND_VECTOR_INREG from v16i16/v32i8 to v4i64 when v4i64 needs splitting.

This allows us to sign extend to v4i32 first, and then share that extension to implement the final steps to v4i64 using pcmpgt, punpckl, and punpckh.

We already do something similar for SIGN_EXTEND with -x86-experimental-vector-widening-legalization.
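
As a rough illustration of the final step described above, the following is a scalar C++ model of widening four i32 lanes to i64 with a compare and an interleave. It is only a sketch: the helper name signExtendViaCompare is hypothetical, and the assumption that pcmpgt produces the sign mask while punpckl/punpckh interleave it with the values is an interpretation of the message, not code taken from the patch.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Scalar model of sign-extending four i32 lanes to four i64 lanes.
// Hypothetical helper for illustration only; this is not LLVM code.
std::array<int64_t, 4> signExtendViaCompare(const std::array<int32_t, 4> &v) {
  // Models pcmpgt(0, v): a lane becomes all-ones when 0 > v[i], else zero.
  std::array<uint32_t, 4> signMask{};
  for (int i = 0; i < 4; ++i)
    signMask[i] = (0 > v[i]) ? 0xFFFFFFFFu : 0u;

  // Models punpckl/punpckh: interleave each value with its sign mask, so the
  // value forms the low half and the mask the high half of a 64-bit lane.
  std::array<int64_t, 4> out{};
  for (int i = 0; i < 4; ++i) {
    uint64_t bits = (static_cast<uint64_t>(signMask[i]) << 32) |
                    static_cast<uint32_t>(v[i]);
    out[i] = static_cast<int64_t>(bits);
  }
  return out;
}
```

The point of the model is that the upper half of each widened lane is just the sign mask, so one compare result can serve both the low and the high unpack.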

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350158 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC][NFC] Macro for register set defs for the Asm Parser

We have some unfortunate code in the back end that defines a bunch of register
sets for the Asm Parser. Every time another class is needed in the parser, we
have to add another one of those definitions with explicit lists of registers.
This NFC patch simply provides macros to use to condense that code a little bit.

Differential revision: https://reviews.llvm.org/D54433

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350156 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Complete the custom legalization of vector int to fp conversion

A recent patch has added custom legalization of vector conversions of
v2i16 -> v2f64. This just rounds it out for other types where the input vector
has an illegal (narrower) type than the result vector. Specifically, this will
handle the following conversions:

v2i8 -> v2f64
v4i8 -> v4f32
v4i16 -> v4f32

Differential revision: https://reviews.llvm.org/D54663

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350155 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] [NFC] update testcases for canonicalize MUL with NEG operand

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350154 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Fix CR Bit spill pseudo expansion

The current CRBIT spill pseudo-op expansion creates a KILL instruction
that kills the CRBIT and defines the enclosing CR field. However, this
paints a false picture to the register allocator that all bits in the CR
field are killed so copies of other bits out of the field become dead and
removable.
This changes the expansion to preserve the KILL flag on the CRBIT as an
implicit use and to treat the CR field as an undef input.

Thanks to Hal Finkel for the review and Uli Weigand for implementation input.

Differential revision: https://reviews.llvm.org/D55996

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350153 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Show an error on attempt to use 64-bit PC-relative relocation

The following code requests a 64-bit PC-relative relocation, which is
unsupported by the MIPS ABI. Currently it triggers an assertion. It's
better to show an error message.
```
foo:
.quad bar - foo
```

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350152 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Show a regular error message on attempt to use one byte relocation

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350151 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add test case from PR38217. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350150 91177308-0d34-0410-b5e6-96231b3b80d8

Drop SE cache early because loop parent can change in LoopSimplifyCFG

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350145 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Fix comments in ExplicitLocals (NFC)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350144 91177308-0d34-0410-b5e6-96231b3b80d8

Add vtable anchor to classes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350142 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Don't mark SEXTLOAD v4i8->v4i64 and v8i8->v8i64 as custom under vector widening legalization.

This was tricking us into making these operations and then letting them get scalarized later. But I can't prove that the scalarized version is actually better.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350141 91177308-0d34-0410-b5e6-96231b3b80d8

[UnrollRuntime] NFC: Updated existing tests and added more tests

Added more tests for multiple exiting blocks to the LatchExit.
Today these cases are not supported. Patch to follow soon.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350135 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Directly emit X86ISD::PMULUDQ from the ReplaceNodeResults handling of v2i8/v2i16/v2i32 multiply.

Previously we emitted a multiply and some masking that was supposed to be matched to PMULUDQ, but the masking could sometimes be removed before we got a chance to match it. So instead just emit the PMULUDQ directly.

Remove the DAG combine that was added when the ReplaceNodeResults code was originally added. Add a new DAG combine to avoid regressions in shrink_vmul.ll

Some of the shrink_vmul.ll test cases now pick PMULUDQ instead of PMADDWD/PMULLD, but I think this should be an improvement on most CPUs.

I think all of this can go away if/when we switch to -x86-experimental-vector-widening-legalization
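
For readers unfamiliar with the instruction, here is a small scalar C++ model of one 64-bit lane of PMULUDQ. It is a sketch under assumptions: the helper names are hypothetical, and the idea that each narrow element sits in its own 64-bit lane after legalization is assumed for illustration rather than taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of one 64-bit lane of PMULUDQ: multiply the low 32 bits of
// each operand as unsigned values and keep the full 64-bit product.
uint64_t pmuludqLane(uint64_t a, uint64_t b) {
  return (a & 0xFFFFFFFFull) * (b & 0xFFFFFFFFull);
}

// A narrow multiply (for example one i32 element) only needs the low 32 bits
// of the product, so the upper halves of the input lanes do not matter.
// Hypothetical helper for illustration only.
uint32_t mul32ViaPmuludq(uint64_t laneA, uint64_t laneB) {
  return static_cast<uint32_t>(pmuludqLane(laneA, laneB));
}
```

Because the instruction ignores the upper 32 bits of each input lane by definition, no explicit masking has to survive in the DAG for the result to be correct, which is the situation the message describes.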

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350134 91177308-0d34-0410-b5e6-96231b3b80d8

[UnrollRuntime] NFC: Add comment and verify LCSSA

Added -verify-loop-lcssa to test cases.
Updated comments in ConnectProlog.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350131 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Add command-line option for SB

SB (Speculative Barrier) is only mandatory from 8.5
onwards but is optional from Armv8.0-A. This patch adds a command
line option to enable SB, as it was previously only possible to
enable by selecting -march=armv8.5-a.

This patch also moves to FeatureSB the old FeatureSpecRestrict.

Reviewers: pbarrio, olista01, t.p.northover, LukeCheeseman

Differential Revision: https://reviews.llvm.org/D55921

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350126 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeView] Extend the `MemberAttributes` interface with the `isStatic` method

Summary:
This patch extends the MemberAttributes interface with the isStatic method.
It is needed for D56126.

Reviewers: zturner, rnk

Reviewed By: zturner

Differential Revision: https://reviews.llvm.org/D56127

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350125 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU][MC][DOC] Updated AMD GPU assembler description.

Minor bugfixing and improvements.

See bug 36572: https://bugs.llvm.org/show_bug.cgi?id=36572

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350120 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Add failing test on LCSSA form preservation of LoopSimplifyCFG

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350119 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] handle ISD::TRUNCATE in BitPermutationSelector

This is the last one in a series of patches to support better code generation for bitfield insert.
BitPermutationSelector already supports ISD::ZERO_EXTEND but not TRUNCATE.
This patch adds support for ISD::TRUNCATE in BitPermutationSelector.

For example, for this test case,
struct s64b {
  int a:4;
  int b:16;
  int c:24;
};
void bitfieldinsert64b(struct s64b *p, unsigned char v) {
  p->b = v;
}

the selection DAG looks like:

t14: i32,ch = load<(load 4 from %ir.0)> t0, t2, undef:i64
       t18: i32 = and t14, Constant:i32<-1048561>
            t4: i64,ch = CopyFromReg t0, Register:i64 %1
          t22: i64 = AssertZext t4, ValueType:ch:i8
        t23: i32 = truncate t22
      t16: i32 = shl nuw nsw t23, Constant:i32<4>
    t19: i32 = or t18, t16
  t20: ch = store<(store 4 into %ir.0)> t14:1, t19, t2, undef:i64

By handling truncate in the BitPermutationSelector, we can use information from AssertZext when selecting t19 and skip the mask operation corresponding to t18.
So the generated sequences with and without this patch are

without this patch
rlwinm 5, 5, 0, 28, 11 # corresponding to t18
rlwimi 5, 4, 4, 20, 27
with this patch
rlwimi 5, 4, 4, 12, 27
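
The equivalence of the two sequences can be checked with a small C++ model of the rotate-and-mask instructions. This is a sketch, not code from the patch: the helpers rotl32, ppcMask, withoutPatch and withPatch are hypothetical names, the model covers only the 32-bit word being stored, and it assumes v is at most 255, which is what the AssertZext node guarantees.

```cpp
#include <cassert>
#include <cstdint>

// Rotate a 32-bit value left by sh bits.
uint32_t rotl32(uint32_t x, unsigned sh) {
  sh &= 31;
  return sh == 0 ? x : (x << sh) | (x >> (32 - sh));
}

// PowerPC-style mask covering bits mb..me, where bit 0 is the most
// significant bit. When mb > me the mask wraps around.
uint32_t ppcMask(unsigned mb, unsigned me) {
  uint32_t fromMb = 0xFFFFFFFFu >> mb;      // bits mb..31
  uint32_t toMe = 0xFFFFFFFFu << (31 - me); // bits 0..me
  return mb <= me ? (fromMb & toMe) : (fromMb | toMe);
}

// rlwinm: rotate left, then AND with the mask.
uint32_t rlwinm(uint32_t rs, unsigned sh, unsigned mb, unsigned me) {
  return rotl32(rs, sh) & ppcMask(mb, me);
}

// rlwimi: rotate left, then insert under the mask into ra.
uint32_t rlwimi(uint32_t ra, uint32_t rs, unsigned sh, unsigned mb,
                unsigned me) {
  uint32_t m = ppcMask(mb, me);
  return (rotl32(rs, sh) & m) | (ra & ~m);
}

// Two-instruction sequence: clear field b, then insert the 8 bits of v.
uint32_t withoutPatch(uint32_t word, uint32_t v) {
  word = rlwinm(word, 0, 28, 11); // the mask corresponding to t18
  return rlwimi(word, v, 4, 20, 27);
}

// Single instruction: insert v as a 16-bit field, relying on its upper
// bits being zero.
uint32_t withPatch(uint32_t word, uint32_t v) {
  return rlwimi(word, v, 4, 12, 27);
}
```

In this model ppcMask(28, 11) is 0xFFF0000F, the same value as the constant -1048561 in t18, and the wider insert mask 12..27 covers the whole 16-bit field, so the separate clearing step becomes redundant once the upper bits of v are known to be zero.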

Differential Revision: https://reviews.llvm.org/D49076

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350118 91177308-0d34-0410-b5e6-96231b3b80d8

Temporarily disable term folding in LoopSimplifyCFG, add tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350117 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopSimplifyCFG] Delete dead blocks in RPO

Deleting dead blocks in arbitrary order may trigger an assertion
failure in `DeleteDeadBlock`, which requires that all predecessors
have been deleted before the current block can be deleted.
We should instead delete them in RPO.
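
The ordering argument can be sketched on a toy graph. The code below is an illustration only: Graph, reversePostOrder and predecessorsDeletedFirst are hypothetical names rather than LLVM APIs, every block in the toy graph is treated as dead, and the graph is assumed to be acyclic.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Toy CFG: Succs[b] lists the successors of block b. Hypothetical structure
// for illustration only; this is not the LLVM API.
using Graph = std::vector<std::vector<int>>;

// Reverse post-order of the blocks reachable from Root.
std::vector<int> reversePostOrder(const Graph &Succs, int Root) {
  std::vector<bool> Visited(Succs.size(), false);
  std::vector<int> Order;
  std::function<void(int)> Visit = [&](int B) {
    Visited[B] = true;
    for (int S : Succs[B])
      if (!Visited[S])
        Visit(S);
    Order.push_back(B); // post-order position
  };
  Visit(Root);
  std::reverse(Order.begin(), Order.end());
  return Order;
}

// Returns true if deleting blocks in the given order never deletes a block
// that still has a predecessor which has not been deleted yet.
bool predecessorsDeletedFirst(const Graph &Succs,
                              const std::vector<int> &Order) {
  std::vector<int> LivePreds(Succs.size(), 0);
  for (const auto &Out : Succs)
    for (int S : Out)
      ++LivePreds[S];
  for (int B : Order) {
    if (LivePreds[B] != 0)
      return false; // analogue of the assertion firing
    for (int S : Succs[B])
      --LivePreds[S];
  }
  return true;
}
```

For a diamond-shaped graph such as {{1, 2}, {3}, {3}, {}}, the RPO starting at block 0 passes the check, while an order that begins with the join block 3 does not.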

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350116 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Remove the implicit use of the register if it is replaced by Imm

If we are changing the MI operand from Reg to Imm, we also need to handle its implicit use, if there is one.

Differential Revision: https://reviews.llvm.org/D56078

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350115 91177308-0d34-0410-b5e6-96231b3b80d8