Kai Nacke [Thu, 10 Oct 2019 13:24:00 +0000 (13:24 +0000)]
[Tests] Output of od can be lower or upper case (llvm-objcopy/yaml2obj).
The command `od -t x` is used to dump data in hex format.
The LIT tests assume that the hex characters are in lowercase.
However, there are also platforms which use uppercase letters.
To solve this issue the tests are updated to use the new
`--ignore-case` option of FileCheck.
Kai Nacke [Thu, 10 Oct 2019 13:15:41 +0000 (13:15 +0000)]
[FileCheck] Implement --ignore-case option.
The FileCheck utility is enhanced to support a `--ignore-case`
option. This is useful in cases where the output of Unix tools
differs in case (e.g. where the case is not specified by POSIX).
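A minimal sketch (not taken from the updated tests; the file and the checked value are placeholders) of how the new flag is used in a lit test:
```
# RUN: od -t x %t.o | FileCheck --ignore-case %s
# With --ignore-case the match succeeds whether the platform's od prints
# "deadbeef" or "DEADBEEF".
# CHECK: deadbeef
```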
Florian Hahn [Thu, 10 Oct 2019 13:07:01 +0000 (13:07 +0000)]
[LV][NFC] Factor out calculation of "best" estimated trip count.
This is just a small refactoring to minimize changes in an upcoming patch.
In the next patch I'm going to introduce changes to the heuristic for vectorization of "tiny trip count" loops.
Patch by Evgeniy Brevnov <evgueni.brevnov@gmail.com>
Pavel Labath [Thu, 10 Oct 2019 13:05:46 +0000 (13:05 +0000)]
MinidumpYAML: Add support for the memory info list stream
Summary:
The implementation is fairly straight-forward and uses the same patterns
as the existing streams. The yaml form does not attempt to preserve the
data in the "gaps" that can be created by setting a larger-than-required
header or entry size in the stream header, because the existing consumer
(lldb) does not make use of the information in the gap in any way, and
attempting to preserve that would make the implementation more
complicated.
Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Roman Lebedev [Thu, 10 Oct 2019 12:22:33 +0000 (12:22 +0000)]
[ADT] ArrayRefTest: disable SizeTSizedOperations test - it's UB.
What this test exercises is undefined behavior:
FAIL: LLVM-Unit :: ADT/./ADTTests/ArrayRefTest.SizeTSizedOperations (178 of 33926)
******************** TEST 'LLVM-Unit :: ADT/./ADTTests/ArrayRefTest.SizeTSizedOperations' FAILED ********************
Note: Google Test filter = ArrayRefTest.SizeTSizedOperations
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from ArrayRefTest
[ RUN ] ArrayRefTest.SizeTSizedOperations
/b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/llvm/include/llvm/ADT/ArrayRef.h:180:32: runtime error: applying non-zero offset 9223372036854775806 to null pointer
#0 0x5ae8dc in llvm::ArrayRef<char>::slice(unsigned long, unsigned long) const /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/llvm/include/llvm/ADT/ArrayRef.h:180:32
#1 0x5ae44c in (anonymous namespace)::ArrayRefTest_SizeTSizedOperations_Test::TestBody() /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/llvm/unittests/ADT/ArrayRefTest.cpp:85:3
#2 0x928a96 in testing::Test::Run() /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/llvm/utils/unittest/googletest/src/gtest.cc:2474:5
#3 0x929793 in testing::TestInfo::Run() /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/llvm/utils/unittest/googletest/src/gtest.cc:2656:11
#4 0x92a152 in testing::TestCase::Run() /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/llvm/utils/unittest/googletest/src/gtest.cc:2774:28
#5 0x9319d2 in testing::internal::UnitTestImpl::RunAllTests() /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/llvm/utils/unittest/googletest/src/gtest.cc:4649:43
#6 0x931416 in testing::UnitTest::Run() /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/llvm/utils/unittest/googletest/src/gtest.cc:4257:10
#7 0x920ac3 in RUN_ALL_TESTS /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/llvm/utils/unittest/googletest/include/gtest/gtest.h:2233:46
#8 0x920ac3 in main /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/llvm/utils/unittest/UnitTestMain/TestMain.cpp:50:10
#9 0x7f66135b72e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0)
#10 0x472c19 in _start (/b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm_build_ubsan/unittests/ADT/ADTTests+0x472c19)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /b/sanitizer-x86_64-linux-bootstrap-ubsan/build/llvm-project/llvm/include/llvm/ADT/ArrayRef.h:180:32 in
Mirko Brkusanin [Thu, 10 Oct 2019 12:02:14 +0000 (12:02 +0000)]
[Mips] Fix 374055
The EXPENSIVE_CHECKS build was failing on the new test.
This is fixed by marking $ra register as undef.
Test now has -verify-machineinstrs to check for operand flags.
Summary:
llvm-ar's mri-utf8.test test relies on the en_US.UTF-8 locale to be
installed for its last RUN line to work. If it is not installed, the Unicode
string gets encoded (interpreted) as ASCII, which fails since the most
significant byte is non-zero. This commit changes the test to only rely
on the system being able to encode the pound sign in its default
encoding (e.g. UTF-16 for Microsoft Windows) by always opening the file
via input/output redirection. This avoids forcing a given locale to be
present and supported. A Byte Order Mark is also added to help
recognize the encoding of the file and its endianness.
Oliver Stannard [Thu, 10 Oct 2019 09:58:28 +0000 (09:58 +0000)]
[IfCvt][ARM] Optimise diamond if-conversion for code size
Currently, the heuristics the if-conversion pass uses for diamond if-conversion
are based on execution time, with no consideration for code size. This adds a
new set of heuristics to be used when optimising for code size.
This is mostly target-independent, because the if-conversion pass can
see the code size of the instructions which it is removing. For Thumb,
there are a few passes (insertion of IT instructions, selection of
narrow branches, and selection of CBZ instructions) which are run after
if conversion and affect these heuristics, so I've added target hooks to
better predict the code-size effect of a proposed if-conversion.
Roman Lebedev [Thu, 10 Oct 2019 09:25:02 +0000 (09:25 +0000)]
[UBSan][clang][compiler-rt] Applying non-zero offset to nullptr is undefined behaviour
Summary:
Quote from http://eel.is/c++draft/expr.add#4:
```
4 When an expression J that has integral type is added to or subtracted
from an expression P of pointer type, the result has the type of P.
(4.1) If P evaluates to a null pointer value and J evaluates to 0,
the result is a null pointer value.
(4.2) Otherwise, if P points to an array element i of an array object x with n
elements ([dcl.array]), the expressions P + J and J + P
(where J has the value j) point to the (possibly-hypothetical) array
element i+j of x if 0≤i+j≤n and the expression P - J points to the
(possibly-hypothetical) array element i−j of x if 0≤i−j≤n.
(4.3) Otherwise, the behavior is undefined.
```
Therefore, as per the standard, applying a non-zero offset to `nullptr`
(or turning a non-`nullptr` into a `nullptr` by subtracting the pointer's
integral value from the pointer itself) is undefined behavior (*if* `nullptr`
is not defined, i.e. e.g. `-fno-delete-null-pointer-checks` was *not* specified).
To make things more fun, in C (6.5.6p8), applying *any* offset to a null pointer
is undefined, although the Clang front-end pessimizes the code by not lowering
that info, so this UB is "harmless".
Since rL369789 (D66608 `[InstCombine] icmp eq/ne (gep inbounds P, Idx..), null -> icmp eq/ne P, null`)
LLVM middle-end uses those guarantees for transformations.
If the source contains such UB, said code may now be miscompiled.
Such miscompilations were already observed:
* https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190826/687838.html
* https://github.com/google/filament/pull/1566
Surprisingly, UBSan does not catch those issues
... until now. This diff teaches UBSan about these forms of UB.
`getelementpointer inbounds` is a pretty frequent instruction,
so this does have a measurable impact on performance;
I've addressed most of the obvious missing folds (and thus decreased the performance impact by ~5%),
and then re-performed some performance measurements using my [[ https://github.com/darktable-org/rawspeed | RawSpeed ]] benchmark:
(all measurements done with LLVM ToT, the sanitizer never fired.)
* no sanitization vs. existing check: average `+21.62%` slowdown
* existing check vs. check after this patch: average `22.04%` slowdown
* no sanitization vs. this patch: average `48.42%` slowdown
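For illustration (not code from this patch), a minimal C++ program exhibiting the newly diagnosed UB; the file name and constants are hypothetical, and it assumes a compiler with this change and `-fsanitize=pointer-overflow`:
```
// null-offset.cpp (hypothetical): applies a non-zero offset to a null
// pointer, which [expr.add]p4 leaves undefined.
#include <cstdio>

int main() {
  char *base = nullptr;
  long offset = 16;
  // With -fsanitize=pointer-overflow, UBSan is expected to report this
  // addition at runtime.
  char *p = base + offset;
  std::printf("%p\n", static_cast<void *>(p));
  return 0;
}
```
Built e.g. with `clang++ -fsanitize=pointer-overflow null-offset.cpp`.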
to make the expected test output depend on as few optimization phases
as possible, for stability. But when you write a RUN line of this
form, you lose the ability to use update_cc_test_checks.py to
automatically generate the expected output, because it only supports
two-stage pipelines consisting of '%clang | FileCheck' (or %clang_cc1).
This change extends the set of supported RUN lines so that pipelines
with an invocation of `opt` in the middle can still be automatically
handled.
To implement it, I've adjusted `get_function_body()` so that it can
cope with an arbitrary sequence of intermediate pipeline commands. But
the code that decides which RUN lines to consider is more
conservative: it only adds clang | opt | FileCheck to the set of
supported lines, because I didn't want to accidentally include some
other kind of line that doesn't output IR at all.
(Also in this commit is the minimal change to make this script work at
all, after r373912 added an extra parameter to `add_ir_checks`.)
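As an illustration (not a test from this commit), a clang test of the shape the script now supports, using a three-stage clang | opt | FileCheck pipeline; the triple, flags, and function are placeholders:
```
// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -O0 -disable-O0-optnone \
// RUN:   -emit-llvm %s -o - | opt -S -mem2reg | FileCheck %s

// update_cc_test_checks.py inserts CHECK lines for the IR produced by the
// `opt` stage of the pipeline.
int add(int a, int b) {
  return a + b;
}
```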
Matt Arsenault [Thu, 10 Oct 2019 07:11:33 +0000 (07:11 +0000)]
AMDGPU: Use SGPR_128 instead of SReg_128 for vregs
SGPR_128 only includes the real allocatable SGPRs, and SReg_128 adds
the additional non-allocatable TTMP registers. There's no point in
allocating SReg_128 vregs. This shrinks the size of the classes
regalloc needs to consider, which is usually good.
[Attributor] Handle `null` differently in capture and alias logic
Summary:
`null` in the default address space (=AS 0) cannot be captured nor can
it alias anything. We make this clear now as it can be important for
callbacks and other cases later on. In addition, this patch improves the
debug output for noalias deduction.
Reid Kleckner [Thu, 10 Oct 2019 01:06:01 +0000 (01:06 +0000)]
[codeview] Try to avoid emitting .cv_loc with line zero
Summary:
Visual Studio doesn't like it while stepping. It kicks you out of the
source view of the file being stepped through and tries to fall back to
the disassembly view.
Fixes PR43530
The fix is incomplete, because it's possible to have a basic block with
no source locations at all. In this case, we don't emit a .cv_loc, but
that will result in wrong stepping behavior in the debugger if the
layout predecessor of the location-less BB has an unrelated source
location. We could try harder to find a valid location that dominates or
post-dominates the current BB, but in general it's a dataflow problem,
and one still might not exist. I left a FIXME about this.
As an alternative, we might want to consider having the middle-end check
if it's emitting codeview and get it to stop using line zero.
Philip Reames [Wed, 9 Oct 2019 23:43:33 +0000 (23:43 +0000)]
Conservatively add volatility and atomic checks in a few places
As background, starting in D66309, I'm working on supporting unordered atomics analogous to volatile flags on normal LoadSDNodes/StoreSDNodes for X86.
As part of that, I spent some time going through usages of LoadSDNode and StoreSDNode looking for cases where we might have missed a volatility check or need an atomic check. I couldn't find any cases that clearly miscompile - i.e. no test cases - but a couple of pieces of code look suspicious, though I can't figure out how to exercise them.
This patch adds defensive checks and asserts in the places my manual audit found. If anyone has any ideas on how to either a) disprove any of the checks, or b) hit the bug they might be fixing, I welcome suggestions.
Matt Arsenault [Wed, 9 Oct 2019 22:51:42 +0000 (22:51 +0000)]
AMDGPU: Don't fold copies to physregs
In a future patch, this will help cleanup m0 handling.
The register coalescer handles copies from a register that
materializes an immediate, but doesn't handle move immediates
itself. The virtual register uses will often be allocated to the same
register, so there ends up being no real copy.
Thomas Lively [Wed, 9 Oct 2019 21:42:08 +0000 (21:42 +0000)]
[WebAssembly] Make returns variadic
Summary:
This is necessary and sufficient to get simple cases of multiple
return working with multivalue enabled. More complex cases will
require block and loop signatures to be generalized to potentially be
type indices as well.
Wei Mi [Wed, 9 Oct 2019 21:36:03 +0000 (21:36 +0000)]
[SampleFDO] Add indexing for function profiles so they can be loaded on demand
in ExtBinary format
Currently for Text, Binary and ExtBinary format profiles, when we compile a
module with samplefdo, even if there is no function showing up in the profile,
we have to load all the function profiles from the profile input. That is a
waste of compile time.
The CompactBinary profile format already supports loading function
profiles on demand. In this patch, we add support to load profiles on
demand for the ExtBinary format. It works whether or not the sections in the
ExtBinary profile are compressed. Experiments show it reduces the time to
compile a server benchmark by 30%.
When profile remapping and loading function profiles on demand are both used,
extra work needs to be done so that the loading on demand process will take
the name remapping into consideration. It will be addressed in a follow-up
patch.
Adds links to Getting Started/Tutorials, User Guides, and Reference documentation pages to the sidebar. Also adds a new section for LLVM IR on the Reference documentation page.
David Greene [Wed, 9 Oct 2019 19:51:48 +0000 (19:51 +0000)]
[System Model] [TTI] Update cache and prefetch TTI interfaces
Re-apply 9fdfb045ae8b/r365676 with fixes for PPC and Hexagon. This involved
moving defaults from TargetTransformInfoImplBase to MCSubtargetInfo.
Rework the TTI cache and software prefetching APIs to prepare for the
introduction of a general system model. Changes include:
- Marking existing interfaces const and/or override as appropriate
- Adding comments
- Adding BasicTTIImpl interfaces that delegate to a subtarget
implementation
- Moving the default TargetTransformInfoImplBase implementation to a default
MCSubtarget implementation
Only a handful of targets use these interfaces currently: AArch64, Hexagon, PPC
and SystemZ. AArch64 already has a custom subtarget implementation, so its
custom TTI implementation is migrated to use the new facilities in BasicTTIImpl
to invoke its custom subtarget implementation. The custom TTI implementations
continue to exist for the other targets with this change. They are not moved
over to subtarget-based implementations.
The end goal is to have the default subtarget implementation defer to the system
model defined by the target. With this change, the default MCSubtargetInfo
implementation essentially returns the defaults that TargetTransformInfoImplBase used
to return. Existing users of TTI defaults will now hit the defaults in
MCSubtargetInfo. Targets that define their own custom TTI implementations won't
use the BasicTTIImpl implementations that route to the subtarget.
Once system models are in place for the targets that use these interfaces, their
custom TTI implementations can be removed.
David Blaikie [Wed, 9 Oct 2019 18:37:13 +0000 (18:37 +0000)]
DebugInfo: Shot in the dark attempt to fix ubsan error from r374122
(specifying an underlying type for the enum might also be suitable - but
this seems better/as good, since there's a clear expectation this can
contain values other than the actual enumerators of this enum)
Julian Lettner [Wed, 9 Oct 2019 18:23:30 +0000 (18:23 +0000)]
[lit] Refactor ProgressDisplay
Move progress display to separate file. Simplify some code paths.
Decouple from other components via progress callback. Remove unused
`_Display` class.
Thomas Lively [Wed, 9 Oct 2019 17:45:47 +0000 (17:45 +0000)]
[WebAssembly] Add builtin and intrinsic for v8x16.swizzle
Summary:
This clang builtin and corresponding LLVM intrinsic are necessary to
expose the exact semantics of the underlying WebAssembly instruction
to users. LLVM produces a poison value if the dynamic swizzle indices
are greater than the vector size, but the WebAssembly instruction sets
the corresponding output lane to zero. Users who depend on this
behavior can safely use this builtin.
Thomas Lively [Wed, 9 Oct 2019 17:39:19 +0000 (17:39 +0000)]
[WebAssembly] v8x16.swizzle and rewrite BUILD_VECTOR lowering
Summary:
Adds the new v8x16.swizzle SIMD instruction as specified at
https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#swizzling-using-variable-indices.
In addition to adding swizzles as a candidate lowering in
LowerBUILD_VECTOR, also rewrites and simplifies the lowering to
minimize the number of replace_lanes necessary rather than trying to
minimize code size. This leads to more uses of v128.const instead of
splats, which is expected to increase performance.
The new code will be easier to tune once V8 implements all the vector
construction operations, and it will also be easier to add new
candidate instructions in the future if necessary.
Kevin P. Neal [Wed, 9 Oct 2019 17:24:56 +0000 (17:24 +0000)]
[FPEnv][NFC] Change test to conform to strictfp attribute rules.
In particular, the function definition is not marked strictfp despite
containing a function call marked strictfp. Also, if any function call is marked
strictfp then all function calls in that function must be marked.
This change to move the one strictfp call to a new properly marked function
meets all the new rules.
Sanjay Patel [Wed, 9 Oct 2019 16:32:49 +0000 (16:32 +0000)]
[SLP] respect target register width for GEP vectorization (PR43578)
We failed to account for the target register width (max vector factor)
when vectorizing starting from GEPs. This causes vectorization to
proceed to obviously illegal widths as in:
https://bugs.llvm.org/show_bug.cgi?id=43578
For x86, this also means that SLP can produce rogue AVX or AVX512
code even when the user specifies a narrower vector width.
The AArch64 test in ext-trunc.ll appears to be better using the
narrower width. I'm not exactly sure what getelementptr.ll is trying
to do, but it's testing with "-slp-threshold=-18", so I'm not worried
about those diffs. The x86 test is an over-reduction from SPEC h264;
this patch appears to restore the perf loss caused by SLP when using
-march=haswell.
Momchil Velikov [Wed, 9 Oct 2019 16:31:50 +0000 (16:31 +0000)]
[AArch64] Ensure no tagged memory is left in the unallocated portion of the
stack
This patch makes sure that if we tag some memory, we untag that memory before
the function returns/throws via any exit, reachable from the tag operation. For
that we place the untag operation either at:
a) the lifetime end call for the alloca, if that call post-dominates the
lifetime start call (where the tag operation is placed), or it (the
lifetime end call) dominates all reachable exits, otherwise
b) at the reachable exits
Jason Liu [Wed, 9 Oct 2019 16:19:39 +0000 (16:19 +0000)]
[AIX][XCOFF][NFC] Change the SectionLen field name of CSect Auxiliary entry to SectionOrLength.
Summary:
According to the XCOFF document, the meaning of x_scnlen depends on the
symbol type:
  XTY_SD   x_scnlen contains the csect length.
  XTY_LD   x_scnlen contains the symbol table index of the containing csect.
  XTY_CM   x_scnlen contains the csect length.
  XTY_ER   x_scnlen contains 0.
Changing the SectionLen member name to SectionOrLength is therefore more reasonable.
Re-land "[dsymutil] Fix handling of common symbols in multiple object files."
The original patch got reverted because it hit a long-standing legacy
issue on Windows that prevents files from being named `com`. Thanks
Kristina & Jeremy for pointing this out.
Alina Sbirlea [Wed, 9 Oct 2019 15:54:24 +0000 (15:54 +0000)]
[MemorySSA] Make the use of moveAllAfterMergeBlocks consistent.
Summary:
The rule for the moveAllAfterMergeBlocks API is for all instructions
from `From` to have been moved to `To`, while keeping the CFG edges (and
block terminators) unchanged.
Update all the callsites for moveAllAfterMergeBlocks to follow this.
Pending follow-up: since the same behavior is needed every time, merge
all callsites into one. The common denominator may be the call to
`MergeBlockIntoPredecessor`.
Simon Atanasyan [Wed, 9 Oct 2019 13:12:21 +0000 (13:12 +0000)]
[mips] Split expandLoadImmReal into multiple methods. NFC
The `expandLoadImmReal` method handles four different and almost non-overlapping
cases: loading a "single" float immediate into a GPR, loading a "single"
float immediate into an FPR, and the same pair of cases for a "double" float
immediate.
It's better to move each `else if` branch into a separate method.
James Molloy [Wed, 9 Oct 2019 09:15:34 +0000 (09:15 +0000)]
[TableGen] Fix crash when using HwModes in CodeEmitterGen
When an instruction has an encoding definition for only a subset of
the available HwModes, ensure we just avoid generating an encoding
rather than crash.
Hans Wennborg [Wed, 9 Oct 2019 09:06:30 +0000 (09:06 +0000)]
Unify the two CRC implementations
David added the JamCRC implementation in r246590. More recently, Eugene
added a CRC-32 implementation in r357901, which falls back to zlib's
crc32 function if present.
These checksums are essentially the same, so having multiple
implementations seems unnecessary. This replaces the CRC-32
implementation with the simpler one from JamCRC, and implements the
JamCRC interface in terms of CRC-32 since this means it can use zlib's
implementation when available, saving a few bytes and potentially making
it faster.
JamCRC took an ArrayRef<char> argument, and CRC-32 took a StringRef.
This patch changes it to ArrayRef<uint8_t> which I think is the best
choice, and simplifies a few of the callers nicely.
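As a rough usage sketch (not part of the patch), assuming the unified interface lives in llvm/Support/CRC.h and exposes a free crc32() function plus a JamCRC class taking ArrayRef<uint8_t>, as described above:
```
#include "llvm/ADT/ArrayRef.h"
#include "llvm/Support/CRC.h"
#include <cstdint>
#include <cstdio>

int main() {
  const uint8_t Buf[] = {'h', 'e', 'l', 'l', 'o'};
  llvm::ArrayRef<uint8_t> Data(Buf, sizeof(Buf));

  // Plain CRC-32; backed by zlib's crc32 when LLVM is built with zlib.
  uint32_t Checksum = llvm::crc32(Data);

  // The JamCRC interface, now implemented in terms of the same CRC-32 code.
  llvm::JamCRC JC;
  JC.update(Data);

  std::printf("crc32=%08x jamcrc=%08x\n", Checksum, JC.getCRC());
  return 0;
}
```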
[dsymutil] Fix handling of common symbols in multiple object files.
For common symbols the linker emits only a single symbol entry in the
debug map. This caused dsymutil to not relocate common symbols when
linking DWARF coming form object files that did not have this entry.
This patch fixes that by keeping track of common symbols in the object
files and synthesizing a debug map entry for them using the address from
the main binary.
The verbose output for finding relocations assumed that we'd always dump
the DIE after (which starts with a newline) and therefore didn't include
one itself. However, this isn't always true, leading to garbled output.
This patch adds a newline to the verbose output and adds a line that
says that the DIE is being kept (which isn't obvious otherwise). It also
adds a 0x prefix to the relocations.
Roman Lebedev [Tue, 8 Oct 2019 20:29:48 +0000 (20:29 +0000)]
[CVP] Replace SExt with ZExt if the input is known-non-negative
Summary:
zero-extension is far more friendly for further analysis.
While this doesn't directly help with the shift-by-signext problem, this is not unrelated.
TLDR: we produce 0.11% fewer instructions, 6.84% fewer `sext`, and 10.75% more `zext`.
To be noted, clearly, not all new `zext`'s are produced by this fold.
(And now i guess it might have been interesting to measure this for D68103 :S)
Daniel Sanders [Tue, 8 Oct 2019 18:41:32 +0000 (18:41 +0000)]
[tblgen] Add getOperatorAsDef() to Record
Summary:
While working with DagInits, it's often the case that you expect the
operator to be a reference to a def. This patch adds a wrapper for this
common case to reduce the amount of boilerplate callers need to duplicate
repeatedly.
getOperatorAsDef() returns the record if the DagInit has an operator that is
a DefInit. Otherwise, it prints a fatal error.
There are only a few pre-existing examples in LLVM at the moment and I've
left a few instances of the code this simplifies as they had more specific
error messages than the generic one this produces. I'm going to be using
this a fair bit in my subsequent patches.
Yonghong Song [Tue, 8 Oct 2019 18:23:17 +0000 (18:23 +0000)]
[BPF] do compile-once run-everywhere relocation for bitfields
A BPF-specific clang intrinsic is introduced:
u32 __builtin_preserve_field_info(member_access, info_kind)
Depending on info_kind, different information will
be returned to the program. A relocation is also
recorded for this builtin so that the BPF loader can
patch the instruction on the target host.
This clang intrinsic is used to get certain information
to facilitate struct/union member relocations.
The offset relocation is extended by 4 bytes to
include relocation kind.
Currently supported relocation kinds are
enum {
FIELD_BYTE_OFFSET = 0,
FIELD_BYTE_SIZE,
FIELD_EXISTENCE,
FIELD_SIGNEDNESS,
FIELD_LSHIFT_U64,
FIELD_RSHIFT_U64,
};
for __builtin_preserve_field_info. The old
access offset relocation is covered by
FIELD_BYTE_OFFSET = 0.
An example:
struct s {
int a;
int b1:9;
int b2:4;
};
enum {
FIELD_BYTE_OFFSET = 0,
FIELD_BYTE_SIZE,
FIELD_EXISTENCE,
FIELD_SIGNEDNESS,
FIELD_LSHIFT_U64,
FIELD_RSHIFT_U64,
};
void bpf_probe_read(void *, unsigned, const void *);
int field_read(struct s *arg) {
unsigned long long ull = 0;
unsigned offset = __builtin_preserve_field_info(arg->b2, FIELD_BYTE_OFFSET);
unsigned size = __builtin_preserve_field_info(arg->b2, FIELD_BYTE_SIZE);
#ifdef USE_PROBE_READ
bpf_probe_read(&ull, size, (const void *)arg + offset);
unsigned lshift = __builtin_preserve_field_info(arg->b2, FIELD_LSHIFT_U64);
#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
lshift = lshift + (size << 3) - 64;
#endif
#else
switch(size) {
case 1:
ull = *(unsigned char *)((void *)arg + offset); break;
case 2:
ull = *(unsigned short *)((void *)arg + offset); break;
case 4:
ull = *(unsigned int *)((void *)arg + offset); break;
case 8:
ull = *(unsigned long long *)((void *)arg + offset); break;
}
unsigned lshift = __builtin_preserve_field_info(arg->b2, FIELD_LSHIFT_U64);
#endif
ull <<= lshift;
if (__builtin_preserve_field_info(arg->b2, FIELD_SIGNEDNESS))
return (long long)ull >> __builtin_preserve_field_info(arg->b2, FIELD_RSHIFT_U64);
return ull >> __builtin_preserve_field_info(arg->b2, FIELD_RSHIFT_U64);
}
There is a minor overhead for bpf_probe_read() on big endian.
The code and relocation generated for field_read where bpf_probe_read() is
used to access argument data in little endian mode:
r3 = r1
r1 = 0
r1 = 4 <=== relocation (FIELD_BYTE_OFFSET)
r3 += r1
r1 = r10
r1 += -8
r2 = 4 <=== relocation (FIELD_BYTE_SIZE)
call bpf_probe_read
r2 = 51 <=== relocation (FIELD_LSHIFT_U64)
r1 = *(u64 *)(r10 - 8)
r1 <<= r2
r2 = 60 <=== relocation (FIELD_RSHIFT_U64)
r0 = r1
r0 >>= r2
r3 = 1 <=== relocation (FIELD_SIGNEDNESS)
if r3 == 0 goto LBB0_2
r1 s>>= r2
r0 = r1
LBB0_2:
exit
Compared to the above code between the FIELD_LSHIFT_U64 and FIELD_RSHIFT_U64
relocations, the code in big endian mode has four more
instructions.
r1 = 41 <=== relocation (FIELD_LSHIFT_U64)
r6 += r1
r6 += -64
r6 <<= 32
r6 >>= 32
r1 = *(u64 *)(r10 - 8)
r1 <<= r6
r2 = 60 <=== relocation (FIELD_RSHIFT_U64)
Considering that the verifier is able to do limited constant
propagation following branches, the following is the
code actually traversed.
r2 = 0
r3 = 4 <=== relocation
r4 = 4 <=== relocation
if r4 s> 3 goto LBB0_3
LBB0_3: # %entry
if r4 == 4 goto LBB0_7
LBB0_7: # %sw.bb5
r1 += r3
r2 = *(u32 *)(r1 + 0)
LBB0_9: # %sw.epilog
r1 = 51 <=== relocation
r2 <<= r1
r1 = 60 <=== relocation
r0 = r2
r0 >>= r1
r3 = 1
if r3 == 0 goto LBB0_11
r2 s>>= r1
r0 = r2
LBB0_11: # %sw.epilog
exit
For the native load case, the load size is calculated to be the
same as the width of the load LLVM would otherwise use to load
the value that is then used to extract the bitfield value.
Matt Arsenault [Tue, 8 Oct 2019 17:36:38 +0000 (17:36 +0000)]
AMDGPU: Fix i16 arithmetic pattern redundancy
There were 2 problems here. First, these patterns were duplicated to
handle the inverted shift operands instead of using the commuted
PatFrags.
Second, the zext folding patterns don't apply to the subtargets that do not
zero the high bits. They should be skipped instead of inserting
the extension. The zeroing high code would be emitted when necessary
anyway. This was also emitting unnecessary zexts in cases where the
high bits were undefined.
Testing: check-llvm, and comparing the AMDGPUDisassembler.cpp.o binary
pre- vs. post-patch.
An alternate approach is to hide CodeExtractorAnalysisCache from clients
of CodeExtractor, and to recompute the analysis from scratch inside of
CodeExtractor::extractCodeRegion(). This eliminates some redundant work
in the shrinkwrapping legality check. However, some clients continue to
exhibit O(n^2) compile time behavior as computing the analysis is O(n).
Tom Stellard [Tue, 8 Oct 2019 17:04:51 +0000 (17:04 +0000)]
AMDGPU: Add offsets to MMO when lowering buffer intrinsics
Summary:
Without offsets on the MachineMemOperands (MMOs),
MachineInstr::mayAlias() will return true for all reads and writes to the
same resource descriptor. This leads to O(N^2) complexity in the MachineScheduler
when analyzing dependencies of buffer loads and stores. It also prevents
the SILoadStoreOptimizer from merging more instructions.
This patch reduces the compile time of one pathological compute shader
from 12 seconds to 1 second.