granicus.if.org Git

[X86][CodeGen][NFC] Delay `combineIncDecVector()` from DAGCombine to X86DAGToDAGISel

Summary:
We were previously doing it in DAGCombine.
But we also want to do `sub %x, C` -> `add %x, (sub 0, C)` for vectors in DAGCombine.
So if we had `sub %x, -1`, we'll transform it to `add %x, 1`,
which `combineIncDecVector()` will immediately transform back into `sub %x, -1`,
and here we go again...

I've marked this as NFC since not a single test changes,
but since that 'changes' DAGCombine, probably this isn't fully NFC.
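
The ping-pong described above can be modelled with a minimal Python sketch.
The tuple encoding of nodes and the helper names `canonicalize_sub`,
`combine_inc_dec` and `run_combines` are hypothetical stand-ins, not the
actual DAGCombine code; the sketch only shows that the two rewrites undo each
other, so a fixpoint driver running both never settles.

```python
# Toy model of two rewrites that undo each other.
# A node is a tuple (opcode, operand, constant); all names are illustrative.

def canonicalize_sub(node):
    """Model of `sub %x, C` -> `add %x, (sub 0, C)`."""
    op, x, c = node
    if op == "sub":
        return ("add", x, -c)
    return node

def combine_inc_dec(node):
    """Model of the combine that turns `add %x, 1` back into `sub %x, -1`."""
    op, x, c = node
    if op == "add" and c == 1:
        return ("sub", x, -1)
    return node

def run_combines(node, combines, max_steps=8):
    """Apply the combines round-robin until nothing changes or the budget runs out."""
    for step in range(max_steps):
        changed = False
        for combine in combines:
            new_node = combine(node)
            if new_node != node:
                node, changed = new_node, True
        if not changed:
            return node, step, True   # reached a fixpoint
    return node, max_steps, False     # still changing: the rewrites ping-pong
```

With both rewrites in the same phase the driver exhausts its step budget on
`("sub", "x", -1)`; with only `canonicalize_sub` it converges, which mirrors
moving the inc/dec rewrite to a later phase.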

Reviewers: RKSimon, craig.topper, spatel

Reviewed By: craig.topper

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62327

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370327 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] (insert_vector_elt (vector_shuffle X, Y), (extract_vector_elt X, N), IdxC) -> (vector_shuffle X, Y)

Summary: This is beneficial when the shuffle is only used once and ends up being generated in a few places when some node is combined into a shuffle.
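
A minimal sketch of why the fold is sound, using Python lists as vectors. It
assumes the usual shuffle mask convention (indices 0..n-1 select from X,
n..2n-1 from Y) and assumes the resulting shuffle carries an updated mask
whose lane IdxC selects X[N]; the helper names are hypothetical.

```python
def vector_shuffle(x, y, mask):
    """Indices 0..n-1 pick from x, n..2n-1 pick from y, where n = len(x)."""
    both = x + y
    return [both[i] for i in mask]

def insert_vector_elt(vec, value, idx):
    """Return a copy of vec with lane idx replaced by value."""
    out = list(vec)
    out[idx] = value
    return out

def fold_insert_into_shuffle(mask, src_lane, idx_c):
    """Mask of the single shuffle equivalent to inserting x[src_lane] at idx_c."""
    new_mask = list(mask)
    new_mask[idx_c] = src_lane
    return new_mask

x, y, mask = [10, 11, 12, 13], [20, 21, 22, 23], [0, 5, 2, 7]
# Original pattern: shuffle, then insert an element extracted from x.
before = insert_vector_elt(vector_shuffle(x, y, mask), x[3], 1)
# Folded pattern: one shuffle with an adjusted mask.
after = vector_shuffle(x, y, fold_insert_into_shuffle(mask, 3, 1))
```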

Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66718

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370326 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Masked load and store and predicate tests. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370325 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Shift amount reassociation in bittest: trunc-of-lshr (PR42399)

Summary:
Finally, the fold i was looking forward to :)

The legality check is muddy, i doubt i've grokked the full generalization,
but it handles all the cases i care about, and can come up with:
https://rise4fun.com/Alive/26j

I.e. we can perform the fold if **any** of the following is true:
* The shift amount is either zero or one less than widest bitwidth
* Either of the values being shifted has at most lowest bit set
* The value that is being shifted by `shl` (which is not truncated) should have no fewer leading zeros than the total shift amount
* The value that is being shifted by `lshr` (which **is** truncated) should have no fewer leading zeros than the widest bit width minus total shift amount minus one

I strongly suspect there is some better generalization, but i'm not aware of it as of right now.
For now i also avoided using actual `computeKnownBits()`, but restricted it to constants.

Reviewers: spatel, nikic, xbolva00

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66383

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370324 91177308-0d34-0410-b5e6-96231b3b80d8

LegalizeSetCCCondCode - Reduce scope of NeedSwap to fix cppcheck warning. NFCI.

No need for this to be defined outside the only switch case it's used in.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370320 91177308-0d34-0410-b5e6-96231b3b80d8

Fix variable set but not used warnings on NDEBUG builds. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370319 91177308-0d34-0410-b5e6-96231b3b80d8

Fix variable set but not used warning on NDEBUG builds. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370317 91177308-0d34-0410-b5e6-96231b3b80d8

[COFF] Add a ResourceSectionRef method for getting the data entry, print it in llvm-readobj

Differential Revision: https://reviews.llvm.org/D66819

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370311 91177308-0d34-0410-b5e6-96231b3b80d8

[COFF] Add a bounds checking helper for iterating a coff_resource_dir_table

Instead of blindly incrementing pointers in llvm-readobj, use this
helper, which does bounds checking against the available section
data.

Differential Revision: https://reviews.llvm.org/D66818

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370310 91177308-0d34-0410-b5e6-96231b3b80d8

[COFF] Fix error handling in ResourceSectionRef

Previously, the expression (Reader.readFoo()) was expanded twice,
triggering asserts as one of the Error types ends up not checked
(and as it was expanded twice, the method would end up called twice
if it failed first).

Differential Revision: https://reviews.llvm.org/D66817

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370309 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-readobj] Print the resource type textually for .res files

This already is done when dumping resources from coff objects.

Differential Revision: https://reviews.llvm.org/D66816

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370308 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-readobj] Remove a leftover string trim operation. NFC.

This became unnecessary in SVN r359153.

Differential Revision: https://reviews.llvm.org/D66815

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370307 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove isel patterns with X86VBroadcast+scalar_to_vector+load.

The DAG should have these as X86VBroadcast+load.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370299 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove some unneeded X86VBroadcast isel patterns that have larger than 128 bit input types.

We should always be shrinking the input to 128 bits or smaller
when the node is created.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370296 91177308-0d34-0410-b5e6-96231b3b80d8

[Attributor] Deduce "noalias" attribute

Summary:
This patch adds very basic deduction for noalias.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Tags: LLVM

Differential Revision: https://reviews.llvm.org/D66207

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370295 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add a DAG combine to combine INSERTPS and VBROADCAST of a scalar load. Remove corresponding isel patterns.

We had an isel pattern to perform this, but it's better to
do it in DAG combine as a simplification. This also fixes the lack
of patterns for AVX512 targets.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370294 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Make inline assembly 'x' and 'v' constraints work for f128.

Including a type legalizer fix to make bitcast operand promotion
work correctly when getSoftenedFloat returns f128 instead of i128.

Fixes PR43157

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370293 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopUnroll] Use Lazy strategy for DTU used for MergeBlockIntoPredecessor.

We do not access the DT in the loop, so we do not have to apply updates
eagerly. We can apply them lazily and flush them after we are done
merging blocks.
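
The cost difference between the two strategies can be sketched with a toy
model. `ToyDomTreeUpdater` and `merge_blocks` are hypothetical names; the
class only counts how often an expensive recomputation would run and does not
model the real dominator tree or the real updater's behaviour.

```python
class ToyDomTreeUpdater:
    """Counts how often a (notionally expensive) tree update would run."""

    def __init__(self, lazy):
        self.lazy = lazy
        self.pending = []        # updates queued but not yet applied
        self.applied = []        # updates already reflected in the tree
        self.recomputations = 0

    def apply_updates(self, updates):
        self.pending.extend(updates)
        if not self.lazy:
            self.flush()         # eager: pay the cost on every call

    def flush(self):
        if self.pending:
            self.applied.extend(self.pending)
            self.pending.clear()
            self.recomputations += 1

def merge_blocks(updater, num_merges):
    """Merge blocks without reading the tree, then flush once at the end."""
    for i in range(num_merges):
        updater.apply_updates([("delete", f"pred{i}", f"bb{i}")])
    updater.flush()
    return updater.recomputations
```

In this model, 100 merges cost 100 recomputations with the eager strategy and
a single one with the lazy strategy, because nothing reads the tree in
between.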

As follow-up work, we might be able to use the DTU above as well,
instead of manually updating the DT.

This brings the example from PR43134 from ~100s to ~4s for a release +
assertions build on my machine.

Reviewers: efriedma, kuhar, asbirlea, brzycki

Reviewed By: kuhar, brzycki

Differential Revision: https://reviews.llvm.org/D66911

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370292 91177308-0d34-0410-b5e6-96231b3b80d8

[ObjectYAML] Fix lifetime issue in dumpDebugLines

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66901

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370289 91177308-0d34-0410-b5e6-96231b3b80d8

[Attributor] Improve messages in iteration verify mode

When we now verify the iteration count we will see the actual count
and the expected count before the assertion is triggered.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370285 91177308-0d34-0410-b5e6-96231b3b80d8

[Attributor][NFC] Add const to map key

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370284 91177308-0d34-0410-b5e6-96231b3b80d8

[Attributor][Fix] Indicate change correctly

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370283 91177308-0d34-0410-b5e6-96231b3b80d8

[Attributor] Fix typo

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370282 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Don't use frame virtual registers

SGPR spills aren't really handled after SILowerSGPRSpills. In order to
directly control what happens if the scavenger needs to spill, the
scavenger needs to be used directly. There is an alternative to
spilling in these contexts anyway since the frame register can be
incremented and restored.

This does present another possible issue if spilling is needed for the
unused carry out if an add is needed. I think this can be avoided by
using a scalar add (although that clobbers SCC, which happens anyway).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370281 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel/TableGen: Handle setcc patterns

This is a special case because one node maps to two different G_
instructions, and the operand order is changed.

This mostly enables G_FCMP for AMDGPU. G_ICMP is still manually
selected for now since it has the SALU and VALU complication to deal
with.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370280 91177308-0d34-0410-b5e6-96231b3b80d8

Add requirement to test.

The -debug-only option for llc is only available in debug builds, so
"REQUIRES: asserts" is needed in the test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370279 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Fix a couple isel patterns to not shrink a volatile load.

Also add a FIXME because I'm not sure why these patterns exist. Looks like a missing combine.

And another FIXME because the AVX512 equivalent of one of the patterns is missing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370276 91177308-0d34-0410-b5e6-96231b3b80d8

[RISCV] Avoid generating AssertZext for LP64 ABI when lowering floating LibCall

The patch fixes the issue that RV64 didn't clear the upper bits
when returning a complex floating-point value with the lp64 ABI.

float _Complex
complex_add(float _Complex a, float _Complex b)
{
    return a + b;
}

RealResult = zero_extend(RealA + RealB)
ImageResult = ImageA + ImageB
Return (RealResult | (ImageResult << 32))

The patch introduces shouldExtendTypeInLibCall target hook to suppress
the AssertZext generation when lowering floating LibCall.
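
The effect of the missing zero extension can be sketched in Python. The sketch
assumes that the register holding the real part has its upper 32 bits set
(for instance because a negative 32-bit pattern was sign-extended); the commit
message only states that the upper bits were not cleared. All helper names are
hypothetical.

```python
import math
import struct

MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def f32_bits(value):
    """Bit pattern of a single-precision float."""
    return struct.unpack("<I", struct.pack("<f", value))[0]

def bits_f32(bits):
    """Single-precision float for the low 32 bits of a pattern."""
    return struct.unpack("<f", struct.pack("<I", bits & MASK32))[0]

def sign_extend_32(bits):
    """Model a 64-bit register holding a sign-extended 32-bit value."""
    bits &= MASK32
    return bits | (MASK64 ^ MASK32) if bits & 0x80000000 else bits

def pack_complex(real_reg, imag_reg, clear_upper_bits):
    """Return (RealResult | (ImageResult << 32)) as a 64-bit value."""
    if clear_upper_bits:
        real_reg &= MASK32       # the explicit zero_extend
    return (real_reg | (imag_reg << 32)) & MASK64

def unpack_complex(packed):
    return bits_f32(packed & MASK32), bits_f32(packed >> 32)

real_reg = sign_extend_32(f32_bits(-1.0))   # upper 32 bits are all ones
imag_reg = f32_bits(2.0)
good = unpack_complex(pack_complex(real_reg, imag_reg, clear_upper_bits=True))
bad = unpack_complex(pack_complex(real_reg, imag_reg, clear_upper_bits=False))
```

Here `good` recovers both parts, while in `bad` the stale upper bits of the
real part are ORed into the imaginary part and corrupt it.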

Thanks to Eli's comments from the Bugzilla
https://bugs.llvm.org/show_bug.cgi?id=42820

Differential Revision: https://reviews.llvm.org/D65497

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370275 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Add atomic.fence instruction

Summary:
This adds `atomic.fence` instruction:
https://github.com/WebAssembly/threads/blob/master/proposals/threads/Overview.md#fence-operator

And we now emit the new `atomic.fence` instruction for multithread
fences, rather than the previous `atomic.rmw` hack.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, jfb, tlively, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66794

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370272 91177308-0d34-0410-b5e6-96231b3b80d8

[LLVM-C] Fix omission of INSTALL_WITH_TOOLCHAIN to llvm_add_library()

Due to a mistake in r365902, which tried to simplify the install-with-toolchain
logic, LLVM-C.dll was no longer being installed.

Patch By: Jakob Bornecrantz

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370271 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Add an empty line to separate different patterns. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370269 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Fix 64-bit address loading in case of applying 32-bit mask to the result

If the result of loading a 64-bit address is combined with a 32-bit mask, LLVM
tries to optimize the code and remove the "redundant" loading of the upper
32 bits of the address. This leads to incorrect code on MIPS64 targets.

The MIPS backend creates the following chain of commands to load a 64-bit
address in the `MipsTargetLowering::getAddrNonPICSym64` method:
```
(add (shl (add (shl (add %highest(sym), %higher(sym)),
                    16),
               %hi(sym)),
          16),
     %lo(%sym))
```

If the mask is present, LLVM decides to optimize the chain of commands. It
really does not make sense to load the upper 32 bits because the 0x0fffffff
mask clears them anyway. After removing the redundant commands we get this
chain:
```
(add (shl %hi(sym), 16), %lo(%sym))
```

There are no patterns matching `(MipsHi (i64 symbol))`. Due to a bug in the
`SYM_32` predicate definition, the backend incorrectly selects a pattern for
32-bit symbols and uses the `lui` instruction for loading `%hi(sym)`.

As a result we get an incorrect set of instructions with an unnecessary 16-bit
left shift:
```
lui     at,0x0
    R_MIPS_HI16     foo
dsll    at,at,0x10
daddiu  at,at,0
    R_MIPS_LO16     foo
```
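
The arithmetic behind both sequences can be checked with a small Python
model. The `%highest`/`%higher`/`%hi`/`%lo` formulas below are the usual N64
relocation definitions as I understand them and should be treated as an
assumption, as should the simplified instruction models; the helper names are
hypothetical.

```python
MASK64 = (1 << 64) - 1

def sext16(value):
    """Sign-extend a 16-bit immediate."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def reloc_parts(sym):
    """Assumed definitions of %highest, %higher, %hi and %lo."""
    lo = sym & 0xFFFF
    hi = ((sym + 0x8000) >> 16) & 0xFFFF
    higher = ((sym + 0x80008000) >> 32) & 0xFFFF
    highest = ((sym + 0x800080008000) >> 48) & 0xFFFF
    return highest, higher, hi, lo

def lui(imm16):
    return (sext16(imm16) << 16) & MASK64

def daddiu(reg, imm16):
    return (reg + sext16(imm16)) & MASK64

def dsll(reg, amount):
    return (reg << amount) & MASK64

def load_addr_full(sym):
    """The full four-part chain built for a 64-bit symbol."""
    highest, higher, hi, lo = reloc_parts(sym)
    at = daddiu(lui(highest), higher)
    at = daddiu(dsll(at, 16), hi)
    return daddiu(dsll(at, 16), lo)

def load_addr_buggy(sym):
    """lui %hi; dsll 16; daddiu %lo, as in the listing above."""
    _, _, hi, lo = reloc_parts(sym)
    return daddiu(dsll(lui(hi), 16), lo)
```

For `sym = 0x12345678` and the mask `0x0fffffff`, the full chain yields
`0x02345678` after masking, whereas the buggy sequence shifts `%hi` out of the
low 32 bits and leaves only `0x5678`.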

This patch resolves two problems:
- Fix `SYM_32/SYM_64` predicates to prevent selection of patterns dedicated
  to 32-bit symbols in case of using N64 ABI.
- Add missing patterns for 64-bit symbols for `%hi/%lo`.

Fix PR42736.

Differential Revision: https://reviews.llvm.org/D66228

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370268 91177308-0d34-0410-b5e6-96231b3b80d8

Add tie-breaker for register class sorting in getSuperRegForSubReg

llvm::stable_sort is apparently not sufficient.

Use the same tie-breaker/sorting style as TopoOrderRC to fix bot failures.
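
A minimal sketch of why a stable sort alone does not give a deterministic
result: ties keep their input order, so the output still depends on how the
input happened to be enumerated. The `RegClass` type and the use of the class
name as the tie-breaker are illustrative assumptions, not the actual
TopoOrderRC criteria.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RegClass:
    name: str
    num_regs: int

def sort_by_size_only(classes):
    # sorted() is stable: equal-sized classes stay in input order.
    return sorted(classes, key=lambda rc: -rc.num_regs)

def sort_with_tie_breaker(classes):
    # A total order: size first, then a unique name to break ties.
    return sorted(classes, key=lambda rc: (-rc.num_regs, rc.name))

classes = [RegClass("GPR32", 32), RegClass("FPR32", 32), RegClass("CC", 1)]
shuffled = list(reversed(classes))
```

Sorting `classes` and `shuffled` by size only produces two different orders,
while the tie-breaker version produces the same order for both.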

E.g.

http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/19401/steps/test-check-all/logs/stdio

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370267 91177308-0d34-0410-b5e6-96231b3b80d8

Fix for "DICompileUnit not listed in llvm.dbg.cu" verification error after ...

...cloning a function from a different module

Currently when a function with debug info is cloned from a different module, the
cloned function may have hanging DICompileUnits, so that the module with the
cloned function fails debug info verification.

The proposed fix inserts all DICompileUnits reachable from the cloned function
to "llvm.dbg.cu" metadata operands of the cloned function module.

Reviewed By: aprantl, efriedma

Differential Revision: https://reviews.llvm.org/D66510

Patch by Oleg Pliss (Oleg.Pliss@azul.com)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370265 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-readobj][XCOFF][NFC] Add return statement to avoid -Wimplicit-fallthrough warning

This is to fix the commit in r370097.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370260 91177308-0d34-0410-b5e6-96231b3b80d8

[ASan] Make insertion of version mismatch guard configurable

By default ASan calls a versioned function
`__asan_version_mismatch_check_vXXX` from the ASan module constructor to
check that the compiler ABI version and runtime ABI version are
compatible. This ensures that we get a predictable linker error instead
of hard-to-debug runtime errors.

Sometimes, however, we want to skip this safety guard. This new command
line option allows us to do just that.

rdar://47891956

Reviewed By: kubamracek

Differential Revision: https://reviews.llvm.org/D66826

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370258 91177308-0d34-0410-b5e6-96231b3b80d8

Ignore object files that lack coverage information.

Before this change, if multiple binary files were presented, all of them must have been instrumented or the load would fail with coverage_map_error::no_data_found.

Patch by Dean Sturtevant.

Differential Revision: https://reviews.llvm.org/D66763

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370257 91177308-0d34-0410-b5e6-96231b3b80d8

Use the handy --check-prefixes mechanism to de-verbosify a couple atomics tests [NFC]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370256 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalISel] Import patterns containing SUBREG_TO_REG

Reuse the logic for INSERT_SUBREG to also import SUBREG_TO_REG patterns.

- Split `inferSuperRegisterClass` into two functions, one which tries to use
  an existing TreePatternNode (`inferSuperRegisterClassForNode`), and one that
  doesn't. SUBREG_TO_REG doesn't have a node to leverage, which is the cause
  for the split.

- Rename GlobalISelEmitterInsertSubreg.td to GlobalISelEmitterSubreg.td and
  update it.

- Update impacted tests in the AArch64 and X86 backends.

This is kind of a hit/miss for code size improvements/regressions. E.g. in
add-ext.ll, we now get some identity copies. This isn't really anything the
importer can handle, since it's caused by a later pass introducing the copy for
the sake of correctness.

Differential Revision: https://reviews.llvm.org/D66769

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370254 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r370249

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370251 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Fix bug when calculating user_sgpr_count for Code Object V3 assembler

Stop counting explicitly disabled user SGPRs in the user_sgpr_count field of the kernel descriptor.

Differential Revision: https://reviews.llvm.org/D66900

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370250 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] clean up wrap propagation for reassociated ops; NFCI

Always true/false checks were flagged by static analysis;
https://bugs.llvm.org/show_bug.cgi?id=43143

I have not confirmed the logic difference in propagating nsw vs. nuw,
but presumably we would have noticed a bug by now if that was wrong.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370248 91177308-0d34-0410-b5e6-96231b3b80d8

[ValueMapper] NFC: Remove dead code to pause metadata mapping

Summary:
This functionality was added when Mapper::mapMetadata was recursive. It
is no longer needed after r265456, which switched it to be iterative.

Reviewers: dexonsmith, srhines

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66860

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370236 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][ReleaseNotes] Add a note about the switch to widening legalization for narrow vectors.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370233 91177308-0d34-0410-b5e6-96231b3b80d8

[Attributor] Regularly clear dependences to remove spurious ones

Dependences between abstract attributes can become stale, e.g., if
one was sufficient to imply another one at some point but has since
been weakened to the point that it is not usable for the formerly implied one.
To weed out spurious dependences, and thereby eliminate unneeded
updates, we introduce an option to determine how often the dependence
cache is cleared and recomputed during the fixpoint iteration.

Note that the initial value was determined such that we see a positive
result on our tests.

Differential Revision: https://reviews.llvm.org/D63315

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370230 91177308-0d34-0410-b5e6-96231b3b80d8

[FPEnv] Add fptosi and fptoui constrained intrinsics.

This implements constrained floating point intrinsics for FP to signed and
unsigned integers.

Quoting from D32319:
The purpose of the constrained intrinsics is to force the optimizer to
respect the restrictions that will be necessary to support things like the
STDC FENV_ACCESS ON pragma without interfering with optimizations when
these restrictions are not needed.

Reviewed by: Andrew Kaylor, Craig Topper, Hal Finkel, Cameron McInally, Roman Lebedev, Kit Barton
Approved by: Craig Topper
Differential Revision: http://reviews.llvm.org/D63782

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370228 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][GlobalISel] Fall back when translating musttail calls

These are currently translated as normal function calls in AArch64.

Until we have proper tail call lowering, we shouldn't translate these.

Differential Revision: https://reviews.llvm.org/D66842

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370225 91177308-0d34-0410-b5e6-96231b3b80d8

Reduce scope of variable only used in a local pattern match. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370224 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Added more tests for D66651

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370222 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Disable recursion in foldGEPICmp for vector pointer GEPs

Due to missing vector support in this function, recursion can
generate worse code in some cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370221 91177308-0d34-0410-b5e6-96231b3b80d8

Fix uninitialized variable warning in cppcheck. NFCI.

InstCombiner::MaxArraySizeForCombine is set outside the constructor so we need to ensure it has a default initialization value.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370220 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Added a comment to avoid possible confusion

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370217 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Adjust number of SGPRs available in Calling Convention

This reduces the number of SGPRs due to some concerns about running
out of SGPRs if you make all the SGPRs that aren't reserved available
for the calling convention.

Change-Id: Idb4ca4dc72f5b6808cb524ff7270915a8de5b4c1

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370215 91177308-0d34-0410-b5e6-96231b3b80d8

Remove duplicate 'BitWidth' variable. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370212 91177308-0d34-0410-b5e6-96231b3b80d8

[Attributor] Restrict liveness and return information to functions

Summary:
Until we have proper call-site information we should not recompute
liveness and return information for each call site. This patch directly
uses the function versions and introduces TODOs at the usage sites.

The required iterations to get to the fixpoint are most of the time
reduced by this change and we always avoid work duplication.

Reviewers: sstefan1, uenoku

Subscribers: hiraditya, bollu, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66562

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370208 91177308-0d34-0410-b5e6-96231b3b80d8

InstCombiner::visitSelectInst - rename Pred to MinMaxPred to stop shadow variable warning. NFCI.

We have a lot of Predicate variables, all similarly named....

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370207 91177308-0d34-0410-b5e6-96231b3b80d8

Reland "[yaml2obj] - Don't allow setting StOther and Other/Visibility at the same time."

This relands this commit. I mistakenly reverted the original change,
thinking it was the cause of the observed MSan failures, but it was not.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370206 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] Don't generate libcalls for wide shifts on Windows (PR42711)

Neither libgcc nor compiler-rt is usually used on Windows, so these
functions can't be called.
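
Without a libcall, a wide shift has to be expanded inline from narrower
operations. The sketch below shows one way a 128-bit left shift can be built
from 64-bit halves; it is an illustration of the general idea under that
assumption, not the sequence SelectionDAG actually emits, and `shl128_parts`
is a hypothetical name.

```python
MASK64 = (1 << 64) - 1

def shl128_parts(lo, hi, amount):
    """Shift the 128-bit value (hi:lo) left using only 64-bit pieces."""
    assert 0 <= amount < 128
    if amount == 0:
        # Handled separately: a 64-bit machine shift by 64 is not well defined.
        return lo, hi
    if amount < 64:
        new_hi = ((hi << amount) | (lo >> (64 - amount))) & MASK64
        new_lo = (lo << amount) & MASK64
    else:
        # Everything in lo moves into hi; lo becomes zero.
        new_hi = (lo << (amount - 64)) & MASK64
        new_lo = 0
    return new_lo, new_hi
```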

Differential revision: https://reviews.llvm.org/D66880

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370204 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add test for rotate combining when add X, X is used instead of shl X, 1. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370203 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[yaml2obj] - Don't allow setting StOther and Other/Visibility at the same time."

This reverts commit r370032; it was causing check-llvm failures on
sanitizer-x86_64-linux-bootstrap-msan

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370198 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombine] Fix cppcheck shadow variable warning. NFCI.

We already have an outer Ops variable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370197 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Use fewer registers to load address of TargetExternalSymbol

There is no pattern matching `add hi, (MipsLo texternalsym)`. As a result,
loading the address of a 32-bit symbol requires two registers and one
additional instruction:
```
addiu $1, $zero, %lo(foo)
lui   $2, %hi(foo)
addu  $25, $2, $1
```

This patch adds the missing pattern and enables generation of a more
efficient set of instructions:
```
lui   $1, %hi(foo)
addiu $25, $1, %lo(foo)
```

Differential Revision: https://reviews.llvm.org/D66771

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370196 91177308-0d34-0410-b5e6-96231b3b80d8

[TargetLowering] Add buildLegalVectorShuffle facility to help build legal shuffles

Summary: There are at least 2 ways to express the same shuffle. Various pieces of code explicitly check for both options, but other places do not when they would benefit from doing it. This patch refactors the codebase to use buildLegalVectorShuffle in order to make that behavior more consistent.

Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri

Subscribers: javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66804

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370190 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombine] Remove LoadedSlice::Cost default 'ForCodeSize' constructor arguments. NFCI.

These were always being passed in and it allowed me to add the explicit tag to stop a cppcheck warning about 1 argument constructors.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370189 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r370187

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370188 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Move MVEVPTBlockPass to a separate file. NFC

This just pulls the MVEVPTBlockPass into a separate file, as opposed to being
wrapped up in Thumb2ITBlockPass.

Differential revision: https://reviews.llvm.org/D66579

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370187 91177308-0d34-0410-b5e6-96231b3b80d8

[MVE] VMOVX patterns

This adds fp16 VMOVX patterns, using the same patterns as rL362482 with some
adjustments for MVE. It allows us to move fp16 registers without going into and
out of gprs.

VMOVX is able to move the top bits from a fp16 in a fp reg into the bottom bits
of another register, zeroing the rest. This can be used for odd MVE register
lanes. The top bits are not read by fp16 instructions, so no move is required
there if we are dealing with even lanes.
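
The behaviour described above can be modelled on a 32-bit register value that
holds two fp16 bit patterns. This is a sketch of the description only, not of
the instruction's full semantics; `extract_f16_lane` is a hypothetical helper.

```python
def vmovx(src_reg32):
    """Top 16 bits of the source go to the bottom 16 bits; the rest is zeroed."""
    return (src_reg32 >> 16) & 0xFFFF

def extract_f16_lane(src_reg32, lane_is_odd):
    """Return a register whose low 16 bits hold the requested fp16 lane."""
    # Even lanes already sit in the low bits, so no move is needed.
    return vmovx(src_reg32) if lane_is_odd else src_reg32
```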

Differential revision: https://reviews.llvm.org/D66793

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370184 91177308-0d34-0410-b5e6-96231b3b80d8

[LLVM-C] Fix ByVal Attribute crashing

With the introduction of the typed byval attribute change there was no
way that the LLVM-C API could create the correct class Attribute. If a
program that uses the C API creates a ByVal attribute and annotates a
function with that attribute, LLVM will crash when it assembles or writes
the module containing the function out as bitcode.

This change is a minimal fix to at least allow code to work. This is
because the byval change is on the 9.0 and I don't want to introduce new
LLVM-C API this late in the release cycle.

By Jakob Bornecrantz!

Differential revision: https://reviews.llvm.org/D66144

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370176 91177308-0d34-0410-b5e6-96231b3b80d8

[LV] Fold tail by masking - handle reductions

Allow vectorizing loops that have reductions when tail is folded by masking.
A select is introduced in VPlan, choosing between the last value carried by the
loop-exit/live-out instruction of the reduction, and the penultimate value
carried by the reduction phi, according to the "i < n" mask of fold-tail.
This select replaces the last value as the live-out value of the loop.
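
A minimal sketch of the idea for a sum reduction, with Python lists standing
in for vectors. `GARBAGE` is a made-up stand-in for whatever a masked-off lane
might contain, and the function and parameter names are hypothetical; the real
transformation operates on VPlan, not on code like this.

```python
GARBAGE = 10**6  # stand-in for the contents of a masked-off lane

def sum_fold_tail(a, vf=4, use_select=True):
    n = len(a)
    phi = [0] * vf                  # vector reduction phi
    live_out = phi
    for base in range(0, n, vf):
        mask = [base + lane < n for lane in range(vf)]   # the "i < n" mask
        loaded = [a[base + lane] if mask[lane] else GARBAGE
                  for lane in range(vf)]
        updated = [p + v for p, v in zip(phi, loaded)]   # loop-exit value
        if use_select:
            # Keep the phi's value in lanes that the mask disabled.
            live_out = [u if m else p
                        for u, m, p in zip(updated, mask, phi)]
        else:
            live_out = updated
        phi = updated
    return sum(live_out)            # final horizontal reduction
```

For ten elements and a vector width of four, the last iteration has two
disabled lanes; only the version with the select returns the correct sum.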

Differential Revision: https://reviews.llvm.org/D66720

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370173 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM][ParallelDSP] Change search for muls

rL369567 reverted a couple of recent changes made to ARMParallelDSP
because of a miscompilation error: PR43073.

The issue stemmed from an underlying bug that was caused by adding
muls into a reduction before it was proved that they could be executed
in parallel with another mul.

Most of the changes here are from the previously reverted commits.
The additional changes that have been made are:
1) The Search function now doesn't insert any muls into the Reduction
object. That now happens once the search has successfully finished.
2) For any muls added into the reduction but that weren't paired, we
accumulate their values as an input into the smlad.

Differential Revision: https://reviews.llvm.org/D66660

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370171 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Unbreak tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370170 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Updated test

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370169 91177308-0d34-0410-b5e6-96231b3b80d8

Annotate return values of allocation functions with dereferenceable_or_null

Summary:
Example
define dso_local noalias i8* @_Z6maixxnv() local_unnamed_addr #0 {
entry:
  %call = tail call noalias dereferenceable_or_null(64) i8* @malloc(i64 64) #6
  ret i8* %call
}

Reviewers: jdoerfert

Reviewed By: jdoerfert

Subscribers: aaron.ballman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66651

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370168 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objdump] Add the missing ARMv8 subarch detection

Differential Revision: https://reviews.llvm.org/D66849

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370163 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopFusion] Fix another -Wunused-function in -DLLVM_ENABLE_ASSERTIONS=off build

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370156 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Fix constraining scalar and/or/xor

If the result register already had a register class assigned, the
sources may not have been properly constrained.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370150 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r370105 - Update two x86 datalayouts for r370083, looks like racing commits

r370083 has been reverted, which this change depends on.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370147 91177308-0d34-0410-b5e6-96231b3b80d8

[test] Speculative fix for r369966 on llvm-clang-x86_64-win

Run the MIR pipeline in this test to completion to try and avoid a "Bad
machine code" error.

Build failure:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190826/688338.html

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370145 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "Change the X86 datalayout to add three address spaces for 32 bit signed,"

This reverts commit r370083 because it caused check-lld failures on
sanitizer-x86_64-linux-fast.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370142 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Implement addrspacecast for 32-bit constant addrspace

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370140 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Assert preconditions and merge all users into one codepath in Loads.cpp

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370128 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Disable some portions of foldGEPICmp for GEPs that return a vector of pointers. Fix other portions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370114 91177308-0d34-0410-b5e6-96231b3b80d8

[RISCV] Implement RISCVRegisterInfo::getPointerRegClass

Fixes bug 43041

Differential Revision: https://reviews.llvm.org/D66752

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370113 91177308-0d34-0410-b5e6-96231b3b80d8

[Analysis] Improve EmitGEPOffset handling of vector GEPs with scalar indices.

This patch splats the scalar index if necessary before using it
in any integer casts or other arithmetic.
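
As a rough illustration of the splat step (a standalone model, not the LLVM code; `splatIndex` and `scaleOffsets` are hypothetical names), a scalar index is first broadcast to every lane so that later arithmetic sees operands of matching width:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a splat: copy one scalar index into every lane.
std::vector<int64_t> splatIndex(int64_t Index, std::size_t NumLanes) {
  return std::vector<int64_t>(NumLanes, Index);
}

// Per-lane byte offset: each lane's index times the element size.
// This only makes sense once the index has one value per lane.
std::vector<int64_t> scaleOffsets(const std::vector<int64_t> &Indices,
                                  int64_t ElementSize) {
  std::vector<int64_t> Offsets;
  Offsets.reserve(Indices.size());
  for (int64_t I : Indices)
    Offsets.push_back(I * ElementSize);
  return Offsets;
}
```

For example, `scaleOffsets(splatIndex(3, 4), 8)` yields four lanes that each hold 24.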

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370112 91177308-0d34-0410-b5e6-96231b3b80d8

Update two x86 datalayouts for r370083, looks like racing commits

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370105 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalISel] Replace hard coded dynamic alloca handling with G_DYN_STACKALLOC.

This change moves the actual stack pointer manipulation into the legalizer,
available to targets via lower(). The codegen is slightly different because
we're using explicit masks instead of G_PTRMASK, and using G_SUB rather than
adding a negative amount via G_GEP.
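
The arithmetic described above can be sketched in plain C++. This is an illustration only, assuming a downward-growing stack and a power-of-two alignment; `allocateOnStack` is a hypothetical helper, not part of GlobalISel:

```cpp
#include <cstdint>

// Hypothetical model of a dynamic stack allocation.
// Assumes the stack grows downward and Align is a power of two.
uint64_t allocateOnStack(uint64_t SP, uint64_t Size, uint64_t Align) {
  uint64_t NewSP = SP - Size;  // subtract, rather than add a negative amount
  NewSP &= ~(Align - 1);       // explicit mask clears the low bits
  return NewSP;
}
```

With `SP = 0x1000`, `Size = 24` and `Align = 16`, the subtraction gives `0xFE8` and the mask rounds it down to `0xFE0`.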

Differential Revision: https://reviews.llvm.org/D66678

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370104 91177308-0d34-0410-b5e6-96231b3b80d8

[Loads/SROA] Remove blatantly incorrect code and fix a bug revealed in the process

The code we had in isSafeToLoadUnconditionally was blatantly wrong. This function takes a "Size" argument which is supposed to describe the span loaded from. Instead, the code uses the size of the pointer passed (which may be unrelated!) and only checks that span. For any Size > LoadSize, this can and does lead to miscompiles.

Worse, the generic code just a few lines above correctly handles the cases which *are* valid. So, let's delete said code.

Removing this code revealed two issues:
1) As noted by jdoerfert the removed code incorrectly handled external globals. The test update in SROA is to stop testing incorrect behavior.
2) SROA was confusing bytes and bits, but this wasn't obvious as the Size parameter was being essentially ignored anyway. Fixed.
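
Both problems can be shown with a small standalone model. It is not the LLVM implementation; `spanIsInBounds` and `bitsToBytes` are hypothetical helpers used only to illustrate checking the requested span, and keeping bits and bytes apart:

```cpp
#include <cstdint>

// Hypothetical check: does [Offset, Offset + SizeInBytes) fit inside an
// object of ObjectSizeInBytes? Written to avoid unsigned overflow.
bool spanIsInBounds(uint64_t Offset, uint64_t SizeInBytes,
                    uint64_t ObjectSizeInBytes) {
  return Offset <= ObjectSizeInBytes &&
         SizeInBytes <= ObjectSizeInBytes - Offset;
}

// Convert a width in bits to the number of bytes needed, rounding up.
uint64_t bitsToBytes(uint64_t SizeInBits) { return (SizeInBits + 7) / 8; }
```

For an 8-byte object, checking only a 4-byte pointee (`spanIsInBounds(0, 4, 8)`) succeeds even when the caller asked about 16 bytes, for which `spanIsInBounds(0, 16, 8)` correctly fails. Likewise, passing a 64-bit width as if it were 64 bytes rejects a load that `bitsToBytes(64)` shows to be exactly 8 bytes.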

Differential Revision: https://reviews.llvm.org/D66778

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370102 91177308-0d34-0410-b5e6-96231b3b80d8

DAG: computeNumSignBits for MUL

Copied directly from the IR version.

Most of the testcases I've added for this are somewhat problematic
because they really end up testing the yet to be implemented version
for MUL_I24/MUL_U24.
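
For readers unfamiliar with the rule, here is a standalone sketch of a conservative sign-bit bound for a multiply. The formula is an assumption about what such a rule looks like (the product needs at most the sum of the operands' valid bits), not a copy of the committed code; `numSignBits` and `mulSignBitsLowerBound` are hypothetical names:

```cpp
#include <cstdint>

// Number of leading bits equal to the sign bit (sign bit included)
// for a concrete 32-bit value.
unsigned numSignBits(int32_t V) {
  uint32_t U = static_cast<uint32_t>(V);
  uint32_t Sign = U >> 31;
  unsigned N = 1;
  for (int Bit = 30; Bit >= 0 && ((U >> Bit) & 1u) == Sign; --Bit)
    ++N;
  return N;
}

// Conservative lower bound on the sign bits of a product, given lower
// bounds on the sign bits of both operands (assumed rule, see lead-in).
unsigned mulSignBitsLowerBound(unsigned TyBits, unsigned SignBits0,
                               unsigned SignBits1) {
  if (SignBits0 == 1 || SignBits1 == 1)
    return 1;
  // Valid (non-redundant) bits of each operand, summed.
  unsigned OutValidBits = (TyBits - SignBits0 + 1) + (TyBits - SignBits1 + 1);
  return OutValidBits > TyBits ? 1 : TyBits - OutValidBits + 1;
}
```

With two 32-bit operands that each have 20 sign bits, the bound is 7, which the product `-4096 * -4096` meets exactly.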

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370099 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Add baseline test for num sign bits of mul

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370098 91177308-0d34-0410-b5e6-96231b3b80d8

[XCOFF][AIX] Generate symbol table entries with llvm-readobj

Summary:

This patch implements generation of the main and auxiliary symbol table entries for llvm-readobj on AIX.
The source code of aix_xcoff_xlc_test8.o (compiled with xlc) is:

-bash-4.2$ cat test8.c
extern int i;
extern int TestforXcoff;
extern int fun(int i);
static int static_i;
char* p="abcd";
int fun1(int j) {
  static_i++;
  j++;
  j=j+*p;
  return j;
}
int main() {
  i++;
  fun(i);
  return fun1(i);
}

Patch provided by DiggerLin

Differential Revision: https://reviews.llvm.org/D65240

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370097 91177308-0d34-0410-b5e6-96231b3b80d8

Revert Autogenerate the shebang lines for tools/opt-viewer

This reverts r369486 (git commit 8d18384809957cc923752e10a86adab129e3df48)

The opt-viewer tests don't pass after this change, and fixing them isn't
trivial. opt-viewer.py imports optmap, which requires adjusting
pythonpath, which is more work than I'm willing to do to fix forward.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370095 91177308-0d34-0410-b5e6-96231b3b80d8

[ORCv2] - New Speculate Query Implementation

Summary:
This patch introduces SequenceBBQuery, a new heuristic for finding the functions most likely to be called next. It tries to find the blocks containing calls in the order in which the blocks execute.

It still uses BlockFrequencyAnalysis to find high-frequency blocks. For a handful of the hottest blocks (planned to be customizable), the algorithm traverses the CFG and discovers the caller blocks along the way to the Entry Basic Block and the Exit Basic Block. It uses a Block Hint to stop traversing already-visited blocks in both directions. It implicitly assumes that once a block has been visited while discovering entry or exit nodes, revisiting it does not add much. It also uses branch probability info (a cached result) to traverse only hot edges (planned to be customizable) from hot blocks. Without BPI, the algorithm mostly returns all the blocks in the CFG with calls.

It also changes the heuristic queries so that they don't maintain state. Hence they are safe to call from multiple threads.

It also implements new instrumentation to avoid jumping into the JIT on every call to a function, with the help of _orc_speculate.decision.block and _orc_speculate.block.

"Speculator Registration Mechanism is also changed" - kudos to @lhames

Open to review; mostly looking to change the implementation of the SequenceBBQuery heuristics with good data structure choices.

Reviewers: lhames, dblaikie

Reviewed By: lhames

Subscribers: mgorny, hiraditya, mgrang, llvm-commits, lhames

Tags: #speculative_compilation_in_orc, #llvm

Differential Revision: https://reviews.llvm.org/D66399

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370092 91177308-0d34-0410-b5e6-96231b3b80d8

[Tblgen][MCA] Add the ability to mark groups as LoadQueue and StoreQueue. NFCI

Before this patch, users were not allowed to optionally mark processor resource
groups as load/store queues. That is because tablegen class MemoryQueue was
originally declared as expecting a ProcResource template argument (instead of a
more generic ProcResourceKind).

That was an oversight, since the original intention from D54957 was to let users
mark any processor resource as either a load or a store queue. This patch adds
the ability to use processor resource groups in MemoryQueue definitions. This is
not a user-visible change.

Differential Revision: https://reviews.llvm.org/D66810

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370091 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Add amdgpu-32bit-address-high-bits to MIR serialization

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370089 91177308-0d34-0410-b5e6-96231b3b80d8

[JITLink] Fix bogus TimerGroup constructor call.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370088 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix crash from inconsistent register types for v3i16/v3f16

This is something of a workaround since computeRegisterProperties
seems to be doing the wrong thing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370086 91177308-0d34-0410-b5e6-96231b3b80d8

[ORC] NFC remove unimplemented query

Summary: CFGWalk Query is unimplemented for valid reasons, but its declaration got included in the committed file.

Reviewers: lhames, dblaikie

Reviewed By: dblaikie

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66289

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370085 91177308-0d34-0410-b5e6-96231b3b80d8

Recommit "[GlobalISel] Import patterns containing INSERT_SUBREG"

I thought `llvm::sort` was stable for some reason but it's not.

Use `llvm::stable_sort` in `CodeGenTarget::getSuperRegForSubReg`.
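
The difference matters whenever elements compare equal. A minimal sketch using the standard library's `std::stable_sort` as a stand-in (the `Entry` type and `sortedNames` are hypothetical, not the TableGen code):

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record: several entries may share the same Rank.
struct Entry {
  int Rank;
  std::string Name;
};

// Sort by Rank only. A stable sort guarantees that entries with equal
// Rank keep their input order, so the output is deterministic.
std::vector<std::string> sortedNames(std::vector<Entry> Entries) {
  std::stable_sort(
      Entries.begin(), Entries.end(),
      [](const Entry &A, const Entry &B) { return A.Rank < B.Rank; });
  std::vector<std::string> Names;
  for (const Entry &E : Entries)
    Names.push_back(E.Name);
  return Names;
}
```

With an unstable sort, the relative order of the equal-Rank entries would be unspecified, which is the kind of nondeterminism that makes generated output differ between runs.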

Original patch: https://reviews.llvm.org/D66498

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370084 91177308-0d34-0410-b5e6-96231b3b80d8

Change the X86 datalayout to add three address spaces for 32 bit signed,
32 bit unsigned, and 64 bit pointers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370083 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[GlobalISel] Import patterns containing INSERT_SUBREG"

When EXPENSIVE_CHECKS are enabled, GlobalISelEmitterSubreg.td doesn't get
stable output.

Reverting while I debug it.

See: https://reviews.llvm.org/D66498

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370080 91177308-0d34-0410-b5e6-96231b3b80d8