granicus.if.org Git

Make sure that the DAG combiner doesn't merge stores that we explicitly
asked not to be greater than the preferred vector width for the vectorizer.
Test for both 128 and 256 with a skylake architecture.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360183 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] allow sinking fneg operands through an FP min/max

Fundamentally/generally, we should not have to rely on bailouts/crippling of
folds. In this particular case, I think we always recognize the inverted
predicate min/max pattern, so there should not be any loss of optimization.
Codegen looks better because we are eliminating an fneg.
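
As a rough illustration of why sinking the negation is safe (a Python sketch,
not the InstCombine code; the three function names are hypothetical), the
compare-and-select min pattern with negated arms produces the same bits as a
single negation of the selected value, and the same bits as the
inverted-predicate max pattern on the negated operands:

```python
import itertools
import math
import struct

def bits(x):
    """IEEE-754 double bit pattern, so signed zeros and NaN signs are compared too."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def select_neg_operands(x, y):
    # Before: (x < y) ? -x : -y  (both select arms are negated)
    return -x if x < y else -y

def neg_of_select(x, y):
    # After: -((x < y) ? x : y)  (one negation applied to the min pattern)
    return -(x if x < y else y)

def max_of_negated(x, y):
    # Inverted-predicate form: (-x > -y) ? -x : -y
    nx, ny = -x, -y
    return nx if nx > ny else ny

samples = [0.0, -0.0, 1.5, -2.25, math.inf, -math.inf, math.nan]
for x, y in itertools.product(samples, repeat=2):
    expected = bits(select_neg_operands(x, y))
    assert bits(neg_of_select(x, y)) == expected
    assert bits(max_of_negated(x, y)) == expected
```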

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360180 91177308-0d34-0410-b5e6-96231b3b80d8

[CommandLine] Allow Options to specify multiple OptionCategory's.

Summary:
It's not uncommon for separate components to share common
Options, e.g., it's common for related Passes to share Options in
addition to the Pass specific ones.

With this change, components can use OptionCategory's to simplify help
output even if some of the options are shared.

Reviewed By: MaskRay

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61574

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360179 91177308-0d34-0410-b5e6-96231b3b80d8

[Tests] Yet more combination of tests for unordered.atomic memset

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360177 91177308-0d34-0410-b5e6-96231b3b80d8

Debug Info: Support address space attributes on rvalue references.

DWARF5, 2.12 20ff says that

Any debugging information entry representing a pointer or reference
type [may have a DW_AT_address_class attribute].

The existing code (https://reviews.llvm.org/D29670) seems to take a
quite literal interpretation of that wording. I don't see a reason why
an rvalue reference isn't a reference type in the spirit of that
paragraph. This patch allows rvalue references to also have address
spaces.

rdar://problem/50511483

Differential Revision: https://reviews.llvm.org/D61625

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360176 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC][NFC] Update build-vector-tests.ll using utils/update_llc_test_checks.py

build-vector-tests.ll is a huge testcase and is hard to maintain, e.g.
any fundamental change might need to update hundreds of lines. We should
leverage the script to maintain it.

This patch simply runs utils/update_llc_test_checks.py on it. There
should be no missing test points.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360175 91177308-0d34-0410-b5e6-96231b3b80d8

Guard __builtin_available() with __has_builtin to support older host compilers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360174 91177308-0d34-0410-b5e6-96231b3b80d8

Regenerate test to try and fix buildbots

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360173 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] Avoid creating large tokenfactors in visitTokenFactor

When simplifying TokenFactors, we potentially iterate over all
operands of a large number of TokenFactors. This causes quadratic
compile times in some cases and the large token factors cause additional
scalability problems elsewhere.

This patch adds some limits to the number of nodes explored for the
cases mentioned above.
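
The idea of capping the exploration can be sketched as follows. This is a
generic Python model, not the DAGCombiner code: `flatten_token_factor`, the
`operands_of` dictionary, and the `limit` value are all hypothetical. Once the
budget is used up, nested token factors are kept as opaque operands instead of
being inlined.

```python
def flatten_token_factor(root, operands_of, limit=8):
    """Collect operands of nested token factors, inlining at most `limit` of them.

    `operands_of` maps a token-factor node to its operand list; any node that
    is not a key is treated as an ordinary chain operand.
    """
    worklist = [root]
    result = []
    seen = set()
    inlined = 0
    while worklist:
        node = worklist.pop()
        if node in seen:
            continue  # each operand is recorded once
        seen.add(node)
        children = operands_of.get(node)
        if children is not None and inlined < limit:
            inlined += 1
            worklist.extend(reversed(children))  # keep left-to-right order
        else:
            result.append(node)  # leaf, or a token factor past the budget
    return result

graph = {"TF0": ["TF1", "c"], "TF1": ["a", "b"]}
assert flatten_token_factor("TF0", graph, limit=8) == ["a", "b", "c"]
assert flatten_token_factor("TF0", graph, limit=1) == ["TF1", "c"]
```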

Reviewers: niravd, spatel, craig.topper

Reviewed By: niravd

Differential Revision: https://reviews.llvm.org/D61397

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360171 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add tests for FP min/max with negated operands; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360170 91177308-0d34-0410-b5e6-96231b3b80d8

Avoid use-after-move warnings by using swap instead. NFCI.

Swap should be as quick in these cases, and leaves the original variables in a known (empty) state.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360164 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion

Summary:
Bug: https://bugs.llvm.org/show_bug.cgi?id=39024

The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here:

A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins.
B) Instructions in the middle block have different line numbers which give the impression of another iteration.

In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks.

Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel

Reviewed By: hfinkel

Subscribers: bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits

Tags: #llvm, #debug-info

Differential Revision: https://reviews.llvm.org/D60831

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360162 91177308-0d34-0410-b5e6-96231b3b80d8

[JITLink] Fix some copy/paste related typos in a test case.

Several X86_64_RELOC_SUBTRACTOR tests for subtrahend handling were incorrectly
labeled as tests for kinds of minuend handling.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360160 91177308-0d34-0410-b5e6-96231b3b80d8

[SCEV] Add explicit representations of umin/smin

Summary:
Currently we express umin as `~umax(~x, ~y)`. However, this becomes
a problem for operands in non-integral pointer spaces, because `~x`
is not something we can compute for `x` non-integral. However, since
comparisons are generally still allowed, we are actually able to
express `umin(x, y)` directly as long as we don't try to express it
as a umax. Support this by adding an explicit umin/smin representation
to SCEV. We do this by factoring the existing getUMax/getSMax functions
into a new function that does all four. The previous two functions were
largely identical.
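
For reference, the identity behind the old encoding can be checked
exhaustively on a small bit width. This is a Python sketch under the
assumption of 8-bit unsigned values; the helper names are hypothetical and
none of this is SCEV code. The direct form needs only a comparison, which is
the point for non-integral pointers.

```python
BITS = 8
MASK = (1 << BITS) - 1

def bitnot(x):
    """Bitwise NOT truncated to BITS bits."""
    return ~x & MASK

def umin_via_umax(x, y):
    # Old encoding: umin(x, y) == ~umax(~x, ~y); requires computing ~x.
    return bitnot(max(bitnot(x), bitnot(y)))

def umin_direct(x, y):
    # Explicit form: only compares the operands, never inverts them.
    return x if x < y else y

for x in range(1 << BITS):
    for y in range(1 << BITS):
        assert umin_via_umax(x, y) == umin_direct(x, y)
```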

Reviewed By: sanjoy
Differential Revision: https://reviews.llvm.org/D50167

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360159 91177308-0d34-0410-b5e6-96231b3b80d8

Fix local shadow variable warning. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360157 91177308-0d34-0410-b5e6-96231b3b80d8

Precommit tests for or/add transform. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360149 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Use the two-constant NR algorithm for refining estimates

The single-constant algorithm produces infinities on a lot of denormal values.
The precision of the two-constant algorithm is actually sufficient across the
range of denormals. We will switch to that algorithm for now to avoid the
infinities on denormals. In the future, we will re-evaluate the algorithm to
find the optimal one for PowerPC.
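
The commit does not spell out the formulas. As a sketch, the textbook
single-constant and two-constant Newton-Raphson steps for a reciprocal square
root estimate look like this in Python (double precision, hypothetical
function names; the operation order used by the backend and the denormal
overflow itself are not modelled here):

```python
def refine_one_const(a, x):
    # Single-constant form: x * (1.5 - (0.5 * a) * x * x)
    return x * (1.5 - (0.5 * a) * x * x)

def refine_two_const(a, x):
    # Two-constant form: (-0.5 * x) * (a * x * x + (-3.0))
    return (-0.5 * x) * (a * x * x + -3.0)

def rsqrt(a, estimate, refine, steps=5):
    """Refine an initial estimate of 1/sqrt(a) with the given step function."""
    x = estimate
    for _ in range(steps):
        x = refine(a, x)
    return x

exact = 2.0 ** -0.5
assert abs(rsqrt(2.0, 0.7, refine_one_const) - exact) < 1e-12
assert abs(rsqrt(2.0, 0.7, refine_two_const) - exact) < 1e-12
```

Both steps are algebraically the same update; they differ only in how the
intermediate products are formed.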

Differential revision: https://reviews.llvm.org/D60037

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360144 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objdump] - Print relocation record in a GNU format.

This fixes the https://bugs.llvm.org/show_bug.cgi?id=41355.

Previously, with -r we printed the relocation section name instead of the target section name.
It was like this: "RELOCATION RECORDS FOR [.rel.text]"
Now it is: "RELOCATION RECORDS FOR [.text]"

Also, when a relocation target section had more than one relocation section,
we did not combine the output. Now we do.
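
A minimal Python model of the new grouping, assuming each relocation section
is given as a (name, target section, records) tuple. The function name, the
tuple layout, and the record strings are hypothetical; this is not the
llvm-objdump code.

```python
from collections import OrderedDict

def relocation_report(reloc_sections):
    """Print one header per target section, merging all of its relocation sections."""
    grouped = OrderedDict()
    for _reloc_name, target, records in reloc_sections:
        grouped.setdefault(target, []).extend(records)
    lines = []
    for target, records in grouped.items():
        lines.append("RELOCATION RECORDS FOR [%s]" % target)
        lines.extend(records)
    return lines

report = relocation_report([
    (".rel.text", ".text", ["rec0"]),
    (".rela.text", ".text", ["rec1"]),
])
assert report == ["RELOCATION RECORDS FOR [.text]", "rec0", "rec1"]
```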

Differential revision: https://reviews.llvm.org/D61312

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360143 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r360116

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360141 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Run `git ls-files '*.gn' '*.gni' | xargs llvm/utils/gn/gn.py format`

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360140 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis] BenchmarkRunner::runConfiguration(): write small snippet to memory

It was previously writing this temporary snippet to file,
then reading it back, but leaving the tmp file in place.
This is both inefficient, and results in huge garbage pileup
in /tmp.

One would have thought it would have been caught during D60317..

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360138 91177308-0d34-0410-b5e6-96231b3b80d8

[yaml2obj] - Allow setting st_name explicitly for Symbol.

In some cases it is useful to explicitly set symbol's st_name value.
For example, I am using it in a patch for LLD to remove the broken
binary from a test case and replace it with a YAML test.

Differential revision: https://reviews.llvm.org/D61180

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360137 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[TableGen] Fix a typo"

Summary:
This reverts commit r360106.

The revision causes llvm-tblgen to hang while generating info for
RISCV.td. The root cause might be in the RISCV.td definition but I don't
know enough about this to investigate further.

Command that starts hanging after r360106:
`llvm-build/bin/llvm-tblgen -I llvm/include -I llvm/tools/clang/include -I llvm/lib/Target/RISCV -gen-instr-info llvm/lib/Target/RISCV/RISCV.td`

Reviewers: sammccall, yan_luo, craig.topper, gribozavr

Reviewed By: gribozavr

Subscribers: PkmX, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61632

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360136 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM GlobalISel] Widen G_SELECT operands

...except for the condition operand.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360135 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX] Fold concat(packus(),packus()) -> packus(concat(),concat()) (PR34773)

Basic "revectorization" combine, we can probably do more opcodes here but it can be a tricky cost-benefit depending on where the subvectors came from - but this case helps shuffle combining.
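
The fold relies on the 256-bit pack operating on each 128-bit half
independently. A small Python model of that lane behaviour, assuming the
word-to-byte variant (PACKUSWB/VPACKUSWB) and hypothetical helper names; it is
not the X86 lowering code:

```python
def sat_u8(v):
    """Saturate a signed 16-bit value to the unsigned 8-bit range."""
    return max(0, min(255, v))

def packus128(a, b):
    """128-bit pack: 8 + 8 signed words -> 16 unsigned bytes."""
    assert len(a) == len(b) == 8
    return [sat_u8(v) for v in a] + [sat_u8(v) for v in b]

def packus256(a, b):
    """256-bit pack: each 128-bit half is packed on its own."""
    assert len(a) == len(b) == 16
    return packus128(a[:8], b[:8]) + packus128(a[8:], b[8:])

lo_a, lo_b = list(range(-4, 4)), list(range(250, 258))
hi_a, hi_b = list(range(100, 108)), list(range(-300, -292))

# concat(packus(lo_a, lo_b), packus(hi_a, hi_b))
before = packus128(lo_a, lo_b) + packus128(hi_a, hi_b)
# packus(concat(lo_a, hi_a), concat(lo_b, hi_b))
after = packus256(lo_a + hi_a, lo_b + hi_b)
assert before == after
```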

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360134 91177308-0d34-0410-b5e6-96231b3b80d8

Fixed "Value stored to 'Opc' is never read" warning. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360133 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Reduce scope of variables where possible. NFCI.

Fixes cppcheck warnings.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360131 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM GlobalISel] Widen G_INTTOPTR/G_PTRTOINT

We actually have a couple of G_PTRTOINT to s8 when building clang, so
we should do something about them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360130 91177308-0d34-0410-b5e6-96231b3b80d8

Fix uninitialized variable warning. NFCI.

This also fixes a scan-build "array subscript is undefined" warning.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360128 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM GlobalISel] Widen G_GEP index operand

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360127 91177308-0d34-0410-b5e6-96231b3b80d8

Test commit access

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360125 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis] InstructionBenchmark::writeYamlTo(): don't forget to flush()

This *APPEARS* to fix a *very* infuriating issue of YAML files being corrupted,
partially written, or truncated. Or at least I'm not seeing the issue
on a new benchmark sweep.

The issue is somewhat rare, happens maybe once in 1000 benchmarks.
Which means there are up to hundreds of broken benchmarks
for a full x86 sweep in a single mode.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360124 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Verify that SOP2/SOPC instructions have at most one immediate operand

Summary:
No test case because I don't know of a way to trigger this, but I
accidentally caused this to fail while working on a different change.

Change-Id: I8015aa447fe27163cc4e4902205a203bd44bf7e3

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61490

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360123 91177308-0d34-0410-b5e6-96231b3b80d8

[FastISel][X86] If selectFNeg fails, fall back to SelectionDAG not treating it as an fsub.

Summary:
If fneg lowering for fsub -0.0, x fails we currently fall back to treating it as an fsub. This has different behavior for nans than the xor with sign bit trick we normally try to do. On X86, the xor trick for double fails fast-isel in 32-bit mode with sse2 due to 64 bit integer types not being available. With -O2 we would always use an xorpd for this case. If we use subsd, this creates an observable behavior difference between -O0 and -O2. So fall back to SelectionDAG if we can't fast-isel it, that way SelectionDAG will use the xorpd.

I believe this patch is restoring the behavior prior to r345295 from last October. This was missed then because our fast isel case in 32-bit mode aborted fast-isel earlier for another reason. But I've added new tests to cover that.
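
To make the NaN point concrete, here is a Python sketch of the sign-bit xor on
a double (helper names are hypothetical; this is not the FastISel code). The
xor flips the sign of every input, NaN included. What sign `-0.0 - x` yields
for a NaN input is decided by the hardware's NaN propagation, so it is
deliberately not asserted here.

```python
import math
import struct

SIGN_BIT = 1 << 63

def to_bits(x):
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def from_bits(b):
    return struct.unpack("<d", struct.pack("<Q", b))[0]

def fneg_xor(x):
    """Negate by flipping the sign bit, as the xorpd lowering does."""
    return from_bits(to_bits(x) ^ SIGN_BIT)

for x in [1.0, -0.0, math.inf, math.nan]:
    flipped = fneg_xor(x)
    # The sign always changes, even when the value is a NaN.
    assert math.copysign(1.0, flipped) == -math.copysign(1.0, x)
    assert math.isnan(flipped) == math.isnan(x)
```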

Reviewers: andrew.w.kaylor, cameron.mcinally, spatel, efriedma

Reviewed By: cameron.mcinally

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61622

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360111 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Add more test coverage for relocations against section symbols

The only known user of this relocation type and symbol type is
the debug info sections, but we were not testing the `--relocatable`
output path.

This change adds a minimal test case to cover relocations against
section symbols, including `--relocatable` output.

Differential Revision: https://reviews.llvm.org/D61623

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360110 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] Delete TypedDINodeRef

TypedDINodeRef<T> is a redundant wrapper of Metadata * that is actually a T *.

Accordingly, change DI{Node,Scope,Type}Ref uses to DI{Node,Scope,Type} * or their const variants.
This allows us to delete many resolve() calls that clutter the code.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D61369

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360108 91177308-0d34-0410-b5e6-96231b3b80d8

[SanitizerCoverage] Use different module ctor names for trace-pc-guard and inline-8bit-counters

Fixes the main issue in PR41693

When both modes are used, two functions are created:
`sancov.module_ctor`, `sancov.module_ctor.$LastUnique`, where
$LastUnique is the current LastUnique counter that may be different in
another module.

`sancov.module_ctor.$LastUnique` belongs to the comdat group of the same
name (due to the non-null third field of the ctor in llvm.global_ctors).

    COMDAT group section [    9] `.group' [sancov.module_ctor] contains 6 sections:
       [Index]    Name
       [   10]   .text.sancov.module_ctor
       [   11]   .rela.text.sancov.module_ctor
       [   12]   .text.sancov.module_ctor.6
       [   13]   .rela.text.sancov.module_ctor.6
       [   23]   .init_array.2
       [   24]   .rela.init_array.2

    # 2 problems:
    # 1) If sancov.module_ctor in this module is discarded, this group
    # has a relocation to a discarded section. ld.bfd and gold will
    # error. (Another issue: it is silently accepted by lld)
    # 2) The comdat group has an unstable name that may be different in
    # another translation unit. Even if the linker allows the dangling relocation
    # (with --noinhibit-exec), there will be many undesired .init_array entries
    COMDAT group section [   25] `.group' [sancov.module_ctor.6] contains 2 sections:
       [Index]    Name
       [   26]   .init_array.2
       [   27]   .rela.init_array.2

By using different module ctor names, the associated comdat group names
will also be different and thus stable across modules.

Reviewed By: morehouse, phosek

Differential Revision: https://reviews.llvm.org/D61510

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360107 91177308-0d34-0410-b5e6-96231b3b80d8

[TableGen] Fix a typo

Check "Big" instead of "Small" in the second condition.

Differential Revision: https://reviews.llvm.org/D61605

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360106 91177308-0d34-0410-b5e6-96231b3b80d8

Refactor UnaryOperator class

The UnaryOperator class was originally placed in llvm/IR/Instructions.h, with the other UnaryInstructions. However, I'm now thinking that it makes more sense for it to live in llvm/IR/InstrTypes.h, with BinaryOperator. It is more similar to BinaryOperator than any of the other UnaryInstructions.

NFCI

Differential Revision: https://reviews.llvm.org/D61614

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360103 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Use extended vector register classes in getRegForInlineAsmConstraint to support x/y/zmm16-31 when the type is mismatched.

The FR32/FR64/VR128/VR256 register classes don't contain the upper 16 registers. For most cases we use the default implementation which will find any register class that contains the register in question if the VT is legal for the register class. But if the VT is i32 or i64, we won't find a matching register class and will instead end up in the code modified in this patch.

If the requested register is x/y/zmm16-31 we weren't returning a register class that contains those registers and will hit an assertion in the caller.

To fix this, I've changed to use the extended register class instead. I don't believe we need a subtarget check to see if avx512 is enabled. The default implementation just picks whatever register class it finds first. I checked and we currently pick FR32X for XMM0 with an f32 type using the default implementation regardless of whether avx512 is enabled. So I assume it is ok to do the same for i32.

Differential Revision: https://reviews.llvm.org/D61457

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360102 91177308-0d34-0410-b5e6-96231b3b80d8

Fix bug in getCompleteTypeIndex in codeview debug info

Summary:
When there are multiple instances of a forward decl record type, only the first one is emitted with a type index, because
the type is added to a map with a null type index. Avoid this by reordering so that forward decl types aren't added to the map.

Reviewers: rnk

Subscribers: aprantl, hiraditya, arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61460

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360101 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Glue register copies to tail calls.

This generally follows what other targets do. I don't completely
understand why the special case for tail calls existed in the first
place; even when the code was committed in r105413, call lowering didn't
work in the way described in the comments.

Stack protector lowering breaks if the register copies are not glued to
a tail call: we have to insert the stack protector check before the tail
call, and we choose the location based on the assumption that all
physical register dependencies of a tail call are adjacent to the tail
call. (See FindSplitPointForStackProtector.) This is sort of fragile,
but I don't see any reason to break that assumption.

I'm guessing nobody has seen this before just because it's hard to
convince the scheduler to actually schedule the code in a way that
breaks; even without the glue, the only computation that could actually
be scheduled after the register copies is the computation of the call
address, and the scheduler usually prefers to schedule that before the
copies anyway.

Fixes https://bugs.llvm.org/show_bug.cgi?id=41417

Differential Revision: https://reviews.llvm.org/D60427

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360099 91177308-0d34-0410-b5e6-96231b3b80d8

[FastISel] Pass the fneg input operand to hasTrivialKill in FastISel::selectFNeg.

We're trying to calculate the kill flag for OpReg which is the input so we need to pass the input here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360097 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add test case to show that we don't set the kill flag properly for fast isel handling of fneg.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360096 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] gfx1010 verifier changes

Differential Revision: https://reviews.llvm.org/D61521

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360095 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] gfx1010: prefer V_MUL_LO_U32 over V_MUL_LO_I32

GFX10 deprecates v_mul_lo_i32 instruction, so choose u32 form for
all targets.

Differential Revision: https://reviews.llvm.org/D61525

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360094 91177308-0d34-0410-b5e6-96231b3b80d8

[Tests] Add tests for optimized lowerings of element.unordered.atomic memset/memmove/memcpy

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360093 91177308-0d34-0410-b5e6-96231b3b80d8

[Tests] Rename tests before adding new ones

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360092 91177308-0d34-0410-b5e6-96231b3b80d8

[Tests] Autogen a test in advance of updates

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360091 91177308-0d34-0410-b5e6-96231b3b80d8

Fix pr33010, a 2 year old crashing regression

The problem was that we were creating a CMOV64rr <TargetFrameIndex>, <TargetFrameIndex>. The entire point of a TFI is that address code is not generated, so there's no way to legalize/lower this. Instead, simply prevent its creation.

Arguably, we shouldn't be using *Target*FrameIndices in StatepointLowering at all, but that's a much deeper change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360090 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add more test cases for fast-isel handling of fneg.

The fneg double case is falling back to a subsd in 32-bit mode if you write a test that doesn't trigger a fast-isel abort on the return value.

The subsd lowering has different behavior with respect to nans than using an xor. This is inconsistent with what we would do in SelectionDAG
and can lead to differences between -O0 and -O2.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360088 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] gfx1010 memory legalizer

Differential Revision: https://reviews.llvm.org/D61535

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360087 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "Re-commit r357452: SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)"

This reverts r357452 (git commit 21eb771dcb5c11d7500fa6ad551c97a921997f05).

This was causing strange optimization-related test failures on an internal test. Will followup with more details offline.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360086 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove the suffix on vcvt[u]si2ss/sd register variants in assembly printing.

We require d/q suffixes on the memory form of these instructions to disambiguate the memory size.
We don't require it on the register forms, but need to support parsing both with and without it.

Previously we always printed the d/q suffix on the register forms, but it's redundant and
inconsistent with gcc and objdump.

After this patch we should support the d/q for parsing, but not print it when it's unneeded.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360085 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Default to SEH exception handling on MinGW

The SEH implementation is pretty mature at this point.

Differential Revision: https://reviews.llvm.org/D61590

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360080 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] sink FP negation of operands through select

We don't always get this:

Cond ? -X : -Y --> -(Cond ? X : Y)

...even with the legacy IR form of fneg in the case with extra uses,
and we miss matching with the newer 'fneg' instruction because we
are expecting binops through the rest of the path.

Differential Revision: https://reviews.llvm.org/D61604

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360075 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r360063.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360074 91177308-0d34-0410-b5e6-96231b3b80d8

Pull out repeated CI->getCalledFunction() calls. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360070 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG][X86] Support inline assembly returning an mmx register into a type with fewer than 64 bits.

It's possible to use the 'y' mmx constraint with a type narrower than 64-bits.

This patch supports this by bitcasting the mmx type to 64-bits and then
truncating to the desired type.

There are probably other missing type combinations we need to support, but this
is the case we have a bug report for.

Fixes PR41748.

Differential Revision: https://reviews.llvm.org/D61582

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360069 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalISel] Handle <1 x T> vector return types properly.

After support for dealing with types that need to be extended in some way was
added in r358032 we didn't correctly handle <1 x T> return types. These types
don't have a GISel direct representation, instead we just see them as scalars.
When we need to pad them into <2 x T> types however we need to use a
G_BUILD_VECTOR instead of trying to do a G_CONCAT_VECTOR.

This fixes PR41738.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360068 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r359392 and r358887

Reverts "[X86] Remove (V)MOV64toSDrr/m and (V)MOVDI2SSrr/m. Use 128-bit result MOVD/MOVQ and COPY_TO_REGCLASS instead"
Reverts "[TargetLowering][AMDGPU][X86] Improve SimplifyDemandedBits bitcast handling"

Eric Christopher and Jorge Gorbe Moya reported some issues with these patches to me off list.

Removing the CodeGenOnly instructions has changed how fneg is handled during fast-isel with sse/sse2. We're now emitting fsub -0.0, x instead of
moving to the integer domain (in a GPR), xoring the sign bit, and then moving back to xmm. This is because the fast isel table no longer
contains an entry for (f32/f64 bitcast (i32/i64)) so the target independent fneg code fails. The use of fsub changes the behavior of nan with
respect to -O2 codegen which will always use a pxor. NOTE: We still have a difference with double with -m32 since the move to GPR doesn't work
there. I'll file a separate PR for that and add test cases.

Since removing the CodeGenOnly instructions was fixing PR41619, I'm reverting r358887 which exposed that PR. Though I wouldn't be surprised
if that bug can still be hit independent of that.

This should hopefully get Google back to green. I'll work with Simon and other X86 folks to figure out how to move forward again.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360066 91177308-0d34-0410-b5e6-96231b3b80d8

Fix more Windows bots after r360015.
Depending on the environment, the directory separator might
appear as \ or \\ on different bots.

http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/17446/steps/test-check-all/logs/stdio

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360065 91177308-0d34-0410-b5e6-96231b3b80d8

Remove duplicate assignments. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360064 91177308-0d34-0410-b5e6-96231b3b80d8

Add libc++ to link XRay test cases if libc++ is used to build CLANG

Summary: When libc++ is used to build CLANG, its XRay libraries libclang_rt.xray-*.a have dependencies on libc++. Therefore, libc++ is needed to link and run XRay test cases. For Linux -rpath is also needed to specify where to load libc++. This change sets macro LLVM_LIBCXX_USED to 1 if libc++ is actually used in the build. XRay tests then check the flag and add -L<llvm_shlib_dir> -lc++ and -Wl,-rpath=<llvm_shlib_dir> if needed.

Reviewers: hubert.reinterpretcast, amyk, dberris, jasonliu, sfertile, EricWF

Subscribers: dberris, mgorny, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61016

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360060 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] reduce code duplication; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360059 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add tests for fneg+sel; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360058 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: More TODO tweaking

Differential Revision: https://reviews.llvm.org/D61468

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360057 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Update TODO now that libcxx libcxxabi libunwind clang-tools-extra are done

Differential Revision: https://reviews.llvm.org/D61468

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360056 91177308-0d34-0410-b5e6-96231b3b80d8

[ConstantRange] Add srem() support

Add support for srem() to ConstantRange so we can use it in LVI. For
srem the sign of the result matches the sign of the LHS. For the RHS
only the absolute value is important. Apart from that the logic is
like urem.

Just like for urem this is only an approximate implementation. The tests
check a few specific cases and run an exhaustive test for conservative
correctness (but not exactness).
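
The two properties above are enough for a conservative bound. A Python sketch
with inclusive integer bounds and hypothetical names follows; it ignores the
wrapping and bit-width handling a real ConstantRange needs, and it is not the
LLVM implementation.

```python
def srem(a, b):
    """Truncating signed remainder: the result takes the sign of `a`."""
    r = abs(a) % abs(b)
    return -r if a < 0 else r

def srem_bounds(lhs_lo, lhs_hi, rhs_lo, rhs_hi):
    """Conservative inclusive bounds for srem(a, b) with b != 0.

    Returns None when the RHS range contains only zero.
    """
    max_abs_rhs = max(abs(rhs_lo), abs(rhs_hi))
    if max_abs_rhs == 0:
        return None
    bound = max_abs_rhs - 1  # |result| < |b| <= max_abs_rhs
    lo = max(lhs_lo, -bound) if lhs_lo < 0 else 0  # |result| <= |a| as well
    hi = min(lhs_hi, bound) if lhs_hi > 0 else 0
    return lo, hi

def check_conservative(limit=4):
    """Exhaustively check that every concrete result lies inside the bounds."""
    for lhs_lo in range(-limit, limit + 1):
        for lhs_hi in range(lhs_lo, limit + 1):
            for rhs_lo in range(-limit, limit + 1):
                for rhs_hi in range(rhs_lo, limit + 1):
                    bounds = srem_bounds(lhs_lo, lhs_hi, rhs_lo, rhs_hi)
                    if bounds is None:
                        continue
                    lo, hi = bounds
                    for a in range(lhs_lo, lhs_hi + 1):
                        for b in range(rhs_lo, rhs_hi + 1):
                            if b != 0:
                                assert lo <= srem(a, b) <= hi
    return True

assert check_conservative()
```

As in the commit, the result is correct but not exact: an LHS of [5, 7] with
an RHS of [10, 10] is reported as [0, 7] although only 5 through 7 can occur.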

Differential Revision: https://reviews.llvm.org/D61207

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360055 91177308-0d34-0410-b5e6-96231b3b80d8

[SDAG][AArch64] Boolean and/or reduce to umax/min reduce (PR41635)

This addresses one half of https://bugs.llvm.org/show_bug.cgi?id=41635
by combining a VECREDUCE_AND/OR into VECREDUCE_UMIN/UMAX (if latter is
legal but former is not) for zero-or-all-ones boolean reductions (which
are detected based on sign bits).
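
The equivalence is easy to check on lanes that are known to be zero or all
ones. A Python sketch assuming four 8-bit lanes (the lane count and width are
arbitrary choices for the example, not taken from the patch):

```python
from functools import reduce
from itertools import product
import operator

ALL_ONES = 0xFF  # an 8-bit "true" lane

for lanes in product([0, ALL_ONES], repeat=4):
    and_reduce = reduce(operator.and_, lanes)
    or_reduce = reduce(operator.or_, lanes)
    # AND is all-ones only if every lane is, which is the unsigned minimum.
    assert and_reduce == min(lanes)
    # OR is all-ones if any lane is, which is the unsigned maximum.
    assert or_reduce == max(lanes)
```

The identity does not hold for arbitrary lane values (for example, 1 AND 2 is
0, while the minimum is 1), which is why the combine is restricted to
zero-or-all-ones booleans.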

Differential Revision: https://reviews.llvm.org/D61398

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360054 91177308-0d34-0410-b5e6-96231b3b80d8

Add FNeg support to InstructionSimplify

Differential Revision: https://reviews.llvm.org/D61573

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360053 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] regenerate test checks; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360052 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] reduce code duplication; NFCI

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360051 91177308-0d34-0410-b5e6-96231b3b80d8

Modernize repmovsb implementation of x86 memcpy and allow runtime sizes.

Summary: This is a prerequisite to RFC http://lists.llvm.org/pipermail/llvm-dev/2019-April/131973.html

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61593

Fix typo.

Turn this patch into an NFC.

Addressing comments

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360050 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r360018

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360049 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Fix uninitialized members in constructor warnings. NFCI.

Initialize all member variables in X86ATTInstPrinter and X86DAGToDAGISel constructors to fix cppcheck warning.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360047 91177308-0d34-0410-b5e6-96231b3b80d8

Fix CMake Invalid Escape Sequence

Patch by xoviat

Differential Revision: https://reviews.llvm.org/D60658

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360045 91177308-0d34-0410-b5e6-96231b3b80d8

Fix compilation warnings when compiling with GCC 7.3

Differential Revision: https://reviews.llvm.org/D61046

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360044 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Fix erroneous condition for converting uint-to-fp vector conversion

A condition for exiting the legalization of v4i32 conversion to v2f64 through
extract/convert/build erroneously checks for the extract having type i32.
This is not adequate as smaller extracts are actually legalized to i32 as well.
Furthermore, an early exit is missing which means that we only check that
both extracts are from the same vector if that check fails.
As a result, both cases in the included test case fail - the first gets a
select error and the second generates incorrect code.

The culprit commit is r274535.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360043 91177308-0d34-0410-b5e6-96231b3b80d8

X86DAGToDAGISel::tryVPTESTM - fix uninitialized variable warning. NFCI.

findBroadcastedOp should always initialize the value if it returns true, but the static analyzer isn't great at recognising this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360037 91177308-0d34-0410-b5e6-96231b3b80d8

[test] Remove redundant bracket in rL360035

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360036 91177308-0d34-0410-b5e6-96231b3b80d8

Try fix Windows bot after rL360015

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360035 91177308-0d34-0410-b5e6-96231b3b80d8

Try fix Windows bot after rL360015

http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast/builds/25599/steps/test/logs/stdio

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360034 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-c-test] Make include-all.c do what its name says it does

The purpose of this file is to make sure that all includes in llvm-c
work when included from a C source file (i.e. no C++isms have sneaked in).
To do this it must actually include all the include files.

Reviewed By: whitequark
Differential Revision: https://reviews.llvm.org/D61567

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360033 91177308-0d34-0410-b5e6-96231b3b80d8

[LoadStoreVectorizer] vectorizeStoreChain - ensure we find a store type.

Properly initialize store type to null then ensure we find a real store type in the chain.

Fixes scan-build null dereference warning and makes the code clearer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360031 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeGen] Move X86 tests under the X86 directory

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360029 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] X86InstrInfo::findThreeSrcCommutedOpIndices - fix unread variable warning.

scan-build was reporting that CommutableOpIdx1 never used its original initialized value - move it down to where it's first used to make the real initialization more obvious (this also matches the comment that's there).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360028 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] lowerVectorShuffle - use any_of to detect out of bounds shuffle indices. NFCI.

Fixes cppcheck local shadow warning as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360027 91177308-0d34-0410-b5e6-96231b3b80d8

[Analysis] Remove duplicated std::move from LocRange constructor

scan-build was reporting that we were referencing a moved variable - in fact we were moving it twice.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360025 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Update memcpy tests

Summary: Runs utils/update_llc_test_checks.py on a few memcpy files

Reviewers: courbet

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61507

Remove cfi noise by adding nounwind

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360023 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Move files to correct directories after D60552

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360022 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyLibCalls] Simplify bcmp too.

Summary: Fixes PR40699.

Reviewers: gchatelet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61585

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360021 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] This is a test for the commit access.

Summary: Signed-off-by: Pengfei Wang <pengfei.wang@intel.com>

Reviewers: LuoYuanke

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61580

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360019 91177308-0d34-0410-b5e6-96231b3b80d8

Enable AVX512_BF16 instructions, which are supported for BFLOAT16 in Cooper Lake

Summary:
1. Enable infrastructure of AVX512_BF16, which is supported for BFLOAT16 in Cooper Lake;
2. Enable VCVTNE2PS2BF16, VCVTNEPS2BF16 and DPBF16PS instructions, which are Vector Neural Network Instructions supporting BFLOAT16 inputs and conversion instructions from IEEE single precision.
VCVTNE2PS2BF16: Convert Two Packed Single Data to One Packed BF16 Data.
VCVTNEPS2BF16: Convert Packed Single Data to Packed BF16 Data.
VDPBF16PS: Dot Product of BF16 Pairs Accumulated into Packed Single Precision.
For more details about the BF16 ISA, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference
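
As a rough illustration of what the conversion and dot-product operations
compute, here is a sketch in Python. It is not the instruction semantics
from the ISE document: it assumes the "NE" in the mnemonics means
round-to-nearest-even, it ignores NaN and denormal handling, it accumulates
in Python doubles rather than single precision, and every function name is
hypothetical.

```python
import struct


def f32_bits(x):
    """Raw IEEE single-precision bit pattern of x as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def f32_to_bf16(x):
    """Keep the top 16 bits of the float32 pattern, rounding to nearest even.

    Assumption: NaN and denormal inputs are not treated specially here.
    """
    bits = f32_bits(x)
    lsb = (bits >> 16) & 1  # lowest bit that survives, used for tie-breaking
    return ((bits + 0x7FFF + lsb) >> 16) & 0xFFFF


def bf16_to_f32(h):
    """Widen a bf16 pattern back to a float by zero-filling the low 16 bits."""
    return struct.unpack("<f", struct.pack("<I", (h & 0xFFFF) << 16))[0]


def dot_bf16_pairs(acc, a, b):
    """For each accumulator lane i, add a[2i]*b[2i] + a[2i+1]*b[2i+1]."""
    out = []
    for i, s in enumerate(acc):
        pair = sum(bf16_to_f32(a[2 * i + k]) * bf16_to_f32(b[2 * i + k])
                   for k in range(2))
        out.append(s + pair)
    return out


a = [f32_to_bf16(1.0), f32_to_bf16(2.0)]
b = [f32_to_bf16(3.0), f32_to_bf16(4.0)]
result = dot_bf16_pairs([0.5], a, b)
```

The inputs are chosen to be exactly representable in BF16, so `result`
holds 0.5 + 1*3 + 2*4 with no rounding involved.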

Author: LiuTianle

Reviewers: craig.topper, smaslov, LuoYuanke, wxiao3, annita.zhang, RKSimon, spatel

Reviewed By: craig.topper

Subscribers: kristina, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60550

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360017 91177308-0d34-0410-b5e6-96231b3b80d8

DWARF v5: fix directory index in the line table

Summary:
Prior to DWARF v5, a directory index of 0 represents DW_AT_comp_dir.

In DWARF v5, the index starts with 0 and Entry.DirIdx is the index into
Prologue.IncludeDirectories.
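
The difference between the two schemes can be sketched as follows. This is
an illustrative Python model, not the DebugInfo code: the function and
parameter names are hypothetical, and the pre-v5 branch assumes (per the
DWARF specification rather than this commit message) that non-zero indices
are 1-based into the include directory list.

```python
def resolve_directory(dir_idx, include_directories, comp_dir, dwarf_version):
    """Map a line-table directory index to a directory name."""
    if dwarf_version >= 5:
        # v5: the index is used directly; entry 0 is part of the table itself.
        return include_directories[dir_idx]
    if dir_idx == 0:
        # Pre-v5: index 0 stands for the compilation directory (DW_AT_comp_dir).
        return comp_dir
    # Pre-v5: assumed 1-based indexing into the include directory list.
    return include_directories[dir_idx - 1]


v4_dirs = ["/usr/include", "/opt/include"]
v5_dirs = ["/src", "/usr/include", "/opt/include"]
```

With these sample tables, index 1 resolves to "/usr/include" under both
versions, while index 0 resolves to the compilation directory through
different paths: the separate `comp_dir` argument before v5, and entry 0
of the table in v5.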

Reviewed By: labath

Differential Revision: https://reviews.llvm.org/D61253

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360015 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] GlobalOpt DW_OP_deref_size instead of DW_OP_deref.

Optimization pass lib/Transforms/IPO/GlobalOpt.cpp needs to insert
DW_OP_deref_size instead of DW_OP_deref to be compatible with big-endian
targets for same reasons as in D59687.

Differential Revision: https://reviews.llvm.org/D60611

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360013 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-c] Make LLVMGetStringErrorTypeId a proper prototype

In C a function declaration with an empty argument list isn't a real
prototype, it will allow calling the function with any number of
arguments. It will also cause warnings when used in C code compiled with
'-Wstrict-prototypes'

Reviewed By: whitequark
Differential Revision: https://reviews.llvm.org/D61568

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360012 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] Replace llvm_unreachable at the end of getCopyFromParts with a report_fatal_error.

Based on PR41748, not all cases are handled in this function.

llvm_unreachable is treated as an optimization hint that can prune code paths
in a release build. This causes weird behavior when PR41748 is encountered on a
release build. It appears to generate an fp_round instruction from the floating
point code.

Making this a report_fatal_error prevents incorrect optimization of the code
and will instead generate a message to file a bug report.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360008 91177308-0d34-0410-b5e6-96231b3b80d8

[libcxxabi] Don't use -fvisibility-global-new-delete-hidden when not defining them

When building the hermetic static library, the compiler switch
-fvisibility-global-new-delete-hidden is necessary to get the new and
delete operator definitions made correctly. However, when those
definitions are not included in the library, then this switch does harm.
With lld (though not all linkers) setting STV_HIDDEN on SHN_UNDEF
symbols makes it an error to leave them undefined or defined via dynamic
linking that should generate PLTs for -shared linking (lld makes this a
hard error even without -z defs). Though leaving the symbols undefined
would usually work in practice if the linker were to allow it (and the
user didn't pass -z defs), this actually indicates a real problem that
could bite some target configurations more subtly at runtime. For
example, x86-32 ELF -fpic code generation uses hidden visibility on
declarations in the caller's scope as a signal that the call will never
be resolved to a PLT entry and so doesn't have to meet the special ABI
requirements for PLT calls (setting %ebx). Since these functions might
actually be resolved to PLT entries at link time (we don't know what the
user is linking in when the hermetic library doesn't provide all the
symbols itself), it's not safe for the compiler to treat their
declarations at call sites as having hidden visibility.

Differential Revision: https://reviews.llvm.org/D61572

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360004 91177308-0d34-0410-b5e6-96231b3b80d8

[libcxx] Don't use -fvisibility-global-new-delete-hidden when not defining them

When building the hermetic static library, the compiler switch
-fvisibility-global-new-delete-hidden is necessary to get the new and
delete operator definitions made correctly. However, when those
definitions are not included in the library, then this switch does harm.
With lld (though not all linkers) setting STV_HIDDEN on SHN_UNDEF
symbols makes it an error to leave them undefined or defined via dynamic
linking that should generate PLTs for -shared linking (lld makes this a
hard error even without -z defs). Though leaving the symbols undefined
would usually work in practice if the linker were to allow it (and the
user didn't pass -z defs), this actually indicates a real problem that
could bite some target configurations more subtly at runtime. For
example, x86-32 ELF -fpic code generation uses hidden visibility on
declarations in the caller's scope as a signal that the call will never
be resolved to a PLT entry and so doesn't have to meet the special ABI
requirements for PLT calls (setting %ebx). Since these functions might
actually be resolved to PLT entries at link time (we don't know what the
user is linking in when the hermetic library doesn't provide all the
symbols itself), it's not safe for the compiler to treat their
declarations at call sites as having hidden visibility.

Differential Revision: https://reviews.llvm.org/D61571

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360003 91177308-0d34-0410-b5e6-96231b3b80d8