Sanjay Patel [Tue, 15 Aug 2017 18:25:52 +0000 (18:25 +0000)]
[InstCombine] sink sext after ashr
Narrow ops are better for bit-tracking, and in the case of vectors,
may enable better codegen.
As the trunc test shows, this can allow follow-on simplifications.
There's a block of code in visitTrunc that deals with shifted ops
with FIXME comments. It may be possible to remove some of that now,
but I want to make sure there are no problems with this step first.
http://rise4fun.com/Alive/Y3a
Name: hoist_ashr_ahead_of_sext_1
%s = sext i8 %x to i32
%r = ashr i32 %s, 3 ; shift value is less than the source bit width
=>
%a = ashr i8 %x, 3
%r = sext i8 %a to i32
Name: hoist_ashr_ahead_of_sext_2
%s = sext i8 %x to i32
%r = ashr i32 %s, 8 ; shift value is >= the source bit width
=>
%a = ashr i8 %x, 7 ; so clamp this shift value
%r = sext i8 %a to i32
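The identity behind those proofs can also be checked exhaustively in plain C++; this is an illustrative sanity check only (assuming arithmetic right shift for signed types, which all mainstream compilers provide and C++20 guarantees), not code from the patch:
  // Exhaustive check that ashr(sext i8->i32 x, c) == sext(ashr i8 x, min(c, 7)).
  #include <algorithm>
  #include <cassert>
  #include <cstdint>
  int main() {
    for (int x = -128; x <= 127; ++x) {
      for (unsigned c = 0; c < 32; ++c) {
        int32_t wide = static_cast<int32_t>(static_cast<int8_t>(x)) >> c;
        unsigned clamped = std::min(c, 7u);  // clamp as in the second proof
        int32_t narrow = static_cast<int8_t>(static_cast<int8_t>(x) >> clamped);
        assert(wide == narrow);
      }
    }
    return 0;
  }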
Jakub Kuderski [Tue, 15 Aug 2017 18:14:57 +0000 (18:14 +0000)]
[Dominators] Include infinite loops in PostDominatorTree
Summary:
This patch teaches PostDominatorTree about infinite loops. It is built on top of D29705 by @dberlin which includes a very detailed motivation for this change.
What's new is that the patch also teaches the incremental updater how to deal with reverse-unreachable regions and how to properly maintain and verify tree roots. Before that, the incremental algorithm sometimes ended up preserving reverse-unreachable regions after updates, even though those regions wouldn't appear in a tree constructed from scratch on the same CFG.
This patch makes the following assumptions:
- A sequence of updates should produce the same tree as recalculating it from scratch.
- Any sequence of the same updates should lead to the same tree.
- Siblings and roots are unordered.
The last two properties are essential to efficiently perform batch updates in the future.
When it comes to the first one, we can later decide that consistency between a freshly built tree and an updated one doesn't matter much, as there are many correct ways to pick roots in infinite loops, and relax this assumption. That should enable us to recalculate postdominators less frequently.
This patch is pretty conservative when it comes to incremental updates on reverse-unreachable regions and ends up recalculating the whole tree in many cases. It should be possible to improve the performance in many cases, if we decide that it's important enough.
That being said, my experiments showed that reverse-unreachable regions are very rare in the IR emitted by clang when bootstrapping clang. Here are the statistics I collected by analyzing IR between passes and after each removePredecessor call:
Most of the reverse-unreachable regions come from invalid IR where it wouldn't be possible to construct a PostDomTree anyway.
I would like to commit this patch in the next week in order to be able to complete the work that depends on it before the end of my internship, so please don't wait long to voice your concerns :).
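To make "reverse-unreachable" concrete, here is an illustrative C++ function (not taken from the patch) whose loop body has no path to the function's exit; such blocks are unreachable when walking the CFG backwards from the exit, which is exactly the kind of region the updater now has to handle:
  // Illustrative only: blocks inside this loop cannot reach the return, so
  // they are unreachable in the reverse CFG and need an extra (virtual) root.
  void spin(volatile int *flag) {
    if (*flag) {
      for (;;)      // infinite loop: no edge from this region to the exit
        *flag = 0;  // volatile store keeps the loop from being optimized away
    }
  }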
Tom Stellard [Tue, 15 Aug 2017 18:11:56 +0000 (18:11 +0000)]
test-release.sh: Move test-suite setup to beginning of the script
Summary:
We want to catch failures early, before doing the full 3-stage build.
The goal here is to avoid running through the whole build process only to have
it fail at the end (and not create the binary packages) just because
some prerequisites failed to install.
Daniel Sanders [Tue, 15 Aug 2017 15:10:31 +0000 (15:10 +0000)]
Revert r310919 - [globalisel][tablegen] Support zero-instruction emission.
As expected, this failed on the windows bots but the instrumentation showed
something interesting. The ADD8ri and INC8r rules are never directly compared
on the windows machines. That implies that the issue lies in transitivity of
the Compare predicate. I believe I've already verified that but maybe I missed
something.
Daniel Sanders [Tue, 15 Aug 2017 13:50:09 +0000 (13:50 +0000)]
Re-commit with some instrumentation: [globalisel][tablegen] Support zero-instruction emission.
Summary:
Support the case where an operand of a pattern is also the whole of the
result pattern. In this case the original result and all its uses must be
replaced by the operand. However, register class restrictions can require
a COPY. This patch handles both cases by always emitting the copy and
leaving it for the register allocator to optimize.
The previous commit failed on the windows bots and this one is likely to fail
on those same bots. However, the added instrumentation should reveal a particular
isHigherPriorityThan() evaluation which I expect will show that
these machines weigh the priority of two rules differently from the
non-windows machines.
George Rimar [Tue, 15 Aug 2017 13:26:12 +0000 (13:26 +0000)]
[DebugInfo] - Attempt to fix BB after r310915.
Not sure what BB does not like.
While building module 'LLVM_DebugInfo_DWARF' imported from /home/buildbot/modules-slave-2/clang-x86_64-linux-selfhost-modules-2/llvm.src/lib/DebugInfo/DWARF/DWARFAbbreviationDeclaration.cpp:10:
In file included from <module-includes>:7:
In file included from /home/buildbot/modules-slave-2/clang-x86_64-linux-selfhost-modules-2/llvm.src/include/llvm/DebugInfo/DWARF/DWARFContext.h:29:
/home/buildbot/modules-slave-2/clang-x86_64-linux-selfhost-modules-2/llvm.src/include/llvm/DebugInfo/DWARF/DWARFObject.h:30:17: error: declaration of 'object' must be imported from module 'LLVM_Object.Decompressor' before it is required
virtual const object::ObjectFile *getFile() const { return nullptr; }
^
/home/buildbot/modules-slave-2/clang-x86_64-linux-selfhost-modules-2/llvm.src/include/llvm/Object/Decompressor.h:18:11: note: previous declaration is here
namespace object {
John Baldwin [Mon, 14 Aug 2017 21:49:38 +0000 (21:49 +0000)]
[MIPS] Implement support for -mstack-alignment.
Summary:
This is modeled on the implementation for x86 which stores the command line
option in a 'StackAlignOverride' field in MipsSubtarget and then uses this
to compute a 'stackAlignment' value in
MipsSubtarget::initializeSubtargetDependencies.
The stackAlignment() method in MipsSubtarget is renamed to getStackAlignment()
and returns the computed 'stackAlignment'.
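A minimal sketch of the scheme described above; the helper name, parameters, and ABI defaults are illustrative assumptions, not the actual MipsSubtarget members:
  // Sketch: an explicit -mstack-alignment=N value overrides the ABI default.
  unsigned computeStackAlignment(unsigned StackAlignOverride, bool IsN32OrN64) {
    if (StackAlignOverride != 0)  // value recorded from the command line option
      return StackAlignOverride;
    return IsN32OrN64 ? 16 : 8;   // assumed ABI defaults, for illustration only
  }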
Craig Topper [Mon, 14 Aug 2017 21:39:51 +0000 (21:39 +0000)]
Recommit r310869, "[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify"
This recommits r310869, with the moved files and no extra changes.
Original commit message:
This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too.
I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself.
I also had to make decomposeBitTest support vectors since InstSimplify needs that.
As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library.
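As a self-contained illustration of the kind of decomposition involved (a paraphrase of the idea, not the helper's actual code or signature), an unsigned compare against a power of two is equivalent to a masked equality test, which is the form the newly handled ult/ugt predicates reduce to:
  // Check that x u< C  <=>  (x & -C) == 0  and  x u> C-1  <=>  (x & -C) != 0
  // for a power-of-two C; the mask is what the decomposition hands back.
  #include <cassert>
  #include <cstdint>
  int main() {
    const uint32_t C = 8;                      // any power of two
    for (uint32_t x = 0; x < 1024; ++x) {
      assert((x < C) == ((x & -C) == 0));      // ult becomes a masked "== 0"
      assert((x > C - 1) == ((x & -C) != 0));  // ugt becomes a masked "!= 0"
    }
    return 0;
  }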
Chandler Carruth [Mon, 14 Aug 2017 21:25:00 +0000 (21:25 +0000)]
[InlineCost] Refactor the checks for different analyses to be a bit more
localized to the code that uses those analyses.
Technically, this can change behavior as we no longer require the
existence of the ProfileSummaryInfo analysis to use local profile
information via BFI. We didn't actually require the PSI to have an
interesting profile though, so this only really impacts the behavior in
non-default pass pipelines.
IMO, this makes it substantially less surprising how everything works --
before, an analysis that wasn't actually used had to exist to trigger
*any* profile-aware inlining. I think the new organization makes it more
obvious where the various checks for profile signals happen.
Matt Arsenault [Mon, 14 Aug 2017 19:54:45 +0000 (19:54 +0000)]
IPRA: Run RegUsageInfoPropagate much later
This was running immediately after isel, before
isel pseudos were even expanded, which is really
unreasonable. Move this to before the pre-regalloc
passes in case some other pre-regalloc pass wants to
use the updated regmask info.
Fixes one of the reasons IPRA doesn't do anything on
AMDGPU currently. Tests will be included with future
patch after a few more are fixed.
Craig Topper [Mon, 14 Aug 2017 18:49:42 +0000 (18:49 +0000)]
[InstSimplify][InstCombine] Modify the interface of decomposeBitTestICmp and use it in the InstSimplify
This addresses a fixme in InstSimplify about using decomposeBitTest. This also fixes InstSimplify to handle ugt and ult compares too.
I've modified the interface a little to return only the APInt version of the mask that InstSimplify needs. InstCombine now has a small wrapper routine to create a Constant out of it. I've also dropped the returning of 0 since InstSimplify doesn't need that. So InstCombine creates a zero constant itself.
I also had to make decomposeBitTest support vectors since InstSimplify needs that.
As InstSimplify can't use something from the Transforms library, I've moved the CmpInstAnalysis code to the Analysis library.
Lei Huang [Mon, 14 Aug 2017 18:09:29 +0000 (18:09 +0000)]
[PowerPC] Add codegen for VSX word extract convert to FP
Add codegen for VSX word extract conversion from signed/unsigned to single/double
precision.
For UINT_TO_FP:
Extract word unsigned and convert to float was implemented in https://reviews.llvm.org/D20239.
Here we add the missing integer extract and conversion to double. This
utilizes the new P9 instruction xxextractuw to extract an integer element
when the result will be converted to double, thereby saving 2 direct moves
(VSR <-> GPR).
For SINT_TO_FP:
We will implement the following sequence which will also reduce the number of
instructions by saving 2 direct moves.
Hal Finkel [Mon, 14 Aug 2017 17:11:43 +0000 (17:11 +0000)]
[ValueTracking] Don't delete assumes of side-effectful instructions
ValueTracking has to strike a balance when attempting to propagate information
backwards from assumes, because if the information is trivially propagated
backwards, it can appear to LLVM that the assumption is known to be true, and
therefore can be removed.
This is sound (because an assumption has no semantic effect except for causing
UB), but prevents the assume from allowing further optimizations.
The isEphemeralValueOf check exists to try and prevent this issue by not
removing the source of an assumption. This tries to make it a little bit more
general to handle the case of side-effectful instructions, such as in
Sanjay Patel [Mon, 14 Aug 2017 15:55:43 +0000 (15:55 +0000)]
[x86] fold the mask op on 8- and 16-bit rotates
Ref the post-commit thread for r310770:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170807/478507.html
The motivating cases as 'C' source examples can look like this:
unsigned char rotate_right_8(unsigned char v, int shift) {
  // shift &= 7;
  v = (v >> shift) | (v << (8 - shift));
  return v;
}
https://godbolt.org/g/K6rc1A
Notice that the source doesn't contain UB-safe masked shift amounts, but instcombine created those
in order to produce narrow rotate patterns. This should be the last step needed to resolve PR34046:
https://bugs.llvm.org/show_bug.cgi?id=34046
Craig Topper [Mon, 14 Aug 2017 15:28:49 +0000 (15:28 +0000)]
[X86] Remove flag setting ISD nodes from computeKnownBitsForTargetNode
Summary:
The flag result is an i32 type. But it's only really used for connectivity. I don't think anything even assumes a particular format. We don't ever do any real operations on it. So known bits don't help us optimize anything.
My main motivation is that the UMUL behavior is actually wrong. I was going to fix this in D36654, but then realized there was just no reason for it to be here.
Sanjay Patel [Mon, 14 Aug 2017 15:13:46 +0000 (15:13 +0000)]
[BDCE] reduce scope of an assert (PR34179)
The assert was added with r310779 and is usually correct,
but as the test shows, not always. The 'volatile' on the
load is needed to expose the faulty path because without
it, DemandedBits would return that the load is just dead
rather than not demanded, and so we wouldn't hit the
bogus assert.
Also, since the lambda is just a single line now, get rid
of it and inline the DB.isAllOnesValue() calls.
This should fix (prevent execution of a faulty assert):
https://bugs.llvm.org/show_bug.cgi?id=34179
Simon Dardis [Mon, 14 Aug 2017 12:28:00 +0000 (12:28 +0000)]
Reland "[mips][mt][6/7] Add support for mftr, mttr instructions."
This adjusts the tests to hopefully pacify the llvm-clang-x86_64-expensive-checks-win
buildbot.
Unlike many other instructions, these instructions have aliases which
take coprocessor registers, GPR registers, accumulator (and DSP accumulator)
registers, floating point registers, floating point control registers, and
coprocessor 2 data and control operands.
For the moment, these aliases are treated as pseudo instructions which are
expanded into the underlying instruction. As a result, disassembling these
instructions shows the underlying instruction and not the alias.
Elad Cohen [Mon, 14 Aug 2017 10:49:45 +0000 (10:49 +0000)]
[SelectionDAG] combine vextract (v1iX extract_subvector(vNiX, Idx))
into vextract(vNiX,Idx) when creating vextract with getNode().
This case appeared in AVX512 after fixing pr33349 in r310552.
Sean Eveson [Mon, 14 Aug 2017 10:20:12 +0000 (10:20 +0000)]
[llvm-cov] Add an option which maps the location of source directories on another machine to your local copies
Summary:
This patch adds the -path-equivalence option (example: llvm-cov show -path-equivalence=/origin/path,/local/path) which maps the source code path from one machine to another when using `llvm-cov show`. This is similar to the -filename-equivalence option, but doesn't require you to specify all the source files on the command line.
This allows you to generate the coverage data on one machine (e.g. in a CI system), and then use llvm-cov on another machine where you have the same code base on a different path.
Sam Parker [Mon, 14 Aug 2017 09:25:26 +0000 (09:25 +0000)]
[LoopUnroll] Enable option to peel remainder loop
On some targets, the penalty of executing the runtime unrolling checks
and then not executing the unrolled loop can be significantly detrimental to
performance. This results in the need to be more conservative with
the unroll count; keeping a trip count of 2 reduces the overhead as
well as increasing the chance of the unrolled body being executed. But
being conservative leaves performance gains on the table.
This patch enables the unrolling of the remainder loop introduced by
runtime unrolling. This can help reduce the overhead of misunrolled
loops because the cost of non-taken branches is much less than the
cost of the backedge that would normally be executed in the remainder
loop. This allows larger unroll factors to be used without suffering
performance losses with smaller iteration counts.
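As a rough hand-written illustration (a C-style sketch with assumed names, not compiler output), with an unroll factor of 4 the peeled remainder becomes at most three straight-line iterations instead of a loop with its own backedge:
  // Sketch: by-4 runtime unroll with the remainder peeled rather than looped.
  void saxpy(float *y, const float *x, float a, int n) {
    int i = 0;
    for (; i <= n - 4; i += 4) {           // unrolled main loop
      y[i]     += a * x[i];
      y[i + 1] += a * x[i + 1];
      y[i + 2] += a * x[i + 2];
      y[i + 3] += a * x[i + 3];
    }
    if (i < n) { y[i] += a * x[i]; ++i; }  // peeled remainder: no backedge,
    if (i < n) { y[i] += a * x[i]; ++i; }  // just cheap not-taken branches
    if (i < n) { y[i] += a * x[i]; }       // when the trip count divides evenly
  }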
Chandler Carruth [Mon, 14 Aug 2017 07:03:24 +0000 (07:03 +0000)]
[ValueTracking] Revert r310583 which enabled functionality that still is
causing compile time issues.
Moreover, the patch *deleted* the flag in addition to changing the
default, and links to a code review that doesn't even discuss the flag
and just has an update to a Clang test case.
I've followed up on the commit thread to ask for numbers on compile time
at this point, leaving the flag in place until things stabilize, and
pointing at specific code that seems to exhibit excessive compile time
with this patch.
Original commit message for r310583:
"""
[ValueTracking] Enabling ValueTracking patch by default (recommit). Part 2.
The original patch was an improvement to IR ValueTracking on
non-negative integers. It has been checked in to trunk (D18777,
r284022). But was disabled by default due to performance regressions.
Perf impact has improved. The patch would be enabled by default.
""""
Brian Gesiak [Mon, 14 Aug 2017 04:16:43 +0000 (04:16 +0000)]
[opt-viewer] Listify `dict_items` for Py3 indexing
Summary:
In Python 2, calling `dict.items()` returns an indexable `list`, whereas
on Python 3 it returns a set-like `dict_items` object, which cannot be
indexed. Explicitly convert the `dict_items` object so that it can be
indexed when using Python 3.
In combination with D36622, D36623, and D36624, this change allows
`opt-viewer.py` to exit successfully when run with Python 3.4.
Test Plan:
Run `opt-viewer.py` using Python 3.4 and confirm it does not encounter a
runtime error when indexing into `dict.items()`.
Chandler Carruth [Mon, 14 Aug 2017 03:41:00 +0000 (03:41 +0000)]
[PowerPC] Revert r310346 (and followups r310356 & r310424) which
introduce a miscompile bug.
There appears to be a bug where the generated code to extract the sign
bit doesn't work correctly for 32-bit inputs. I've replied to the
original commit pointing out the problem. I think I see by inspection
(and reading the manual for PPC) how to fix this, but I can't be 100%
confident and I also don't know what the best way to test this is.
Currently it seems nearly impossible to get the backend to hit this code
path, but the patch author is likely in a better position to craft such
test cases than I am, and based on where the bug is it should be easily
done.
Original commit message for r310346:
"""
[PowerPC] Eliminate compares - add i32 sext/zext handling for SETLE/SETGE
Adds handling for SETLE/SETGE comparisons on i32 values. Furthermore, it
adds the handling for the special case where RHS == 0.
Craig Topper [Mon, 14 Aug 2017 00:04:21 +0000 (00:04 +0000)]
[InstCombine] Simplify and inline FoldOrWithConstants/FoldXorWithConstants
Summary:
These functions were overly complicated. Their bodies were rechecking for an And operation to find the constant, but we already knew we were looking at two Ands ORed together and the pieces are in variables. We already had earlier nearby code that checked for ConstantInts. So just inline the remaining parts into the earlier code.
Next step is to use m_APInt instead of ConstantInt.
Martin Storsjo [Sun, 13 Aug 2017 19:42:05 +0000 (19:42 +0000)]
[COFF, ARM64] Use '//' as comment character in assembly files in GNU environments
This allows using semicolons for bundling up more than one
statement per line. This is used within the mingw-w64 project in some
assembly files that contain code for multiple architectures.
Alex Bradbury [Sun, 13 Aug 2017 18:49:33 +0000 (18:49 +0000)]
Remove RISCV from LLVM_ALL_TARGETS in CMakeLists.txt
It was mistakenly added to that list in D23560 (committed in rL285712). RISCV
is an experimental backend and should never have been in that list; I
mistakenly interpreted LLVM_ALL_TARGETS as a list of all targets rather than
the targets to build by default. Unfortunately, because of this the RISCV backend
has been building by default when it shouldn't be.
This commit adds a description comment, which should help to avoid such
mistakes in the future.
See my message to llvm-dev for more information and analysis
<http://lists.llvm.org/pipermail/llvm-dev/2017-August/116347.html>.
Craig Topper [Sun, 13 Aug 2017 17:40:02 +0000 (17:40 +0000)]
[AVX512] Correct isExtractSubvectorCheap so that it will return the correct answers for extracting 128-bits from a 512-bit vector and for mask registers.
Previously it would not return true for extracting either of the upper quarters of a 512-bit register.
For mask registers we support extracting anything from index 0. And otherwise we only support extracting the upper half of a register.
Craig Topper [Sun, 13 Aug 2017 17:29:07 +0000 (17:29 +0000)]
[X86][ARM][TargetLowering] Add SrcVT to isExtractSubvectorCheap
Summary:
Without the SrcVT it's hard to know what is really being asked for. For example, if your target has 128, 256, and 512 bit vectors, maybe extracting 128 from 256 is cheap, but maybe extracting 128 from 512 is not.
For x86 we do support extracting a quarter of a 512-bit register. But for i1 vectors we don't have isel patterns for extracting arbitrary pieces. So we need this to have a correct implementation of isExtractSubvectorCheap for mask vectors.
Gadi Haber [Sun, 13 Aug 2017 13:59:24 +0000 (13:59 +0000)]
[X86][SandyBridge] Additional updates to the SNB instructions scheduling information
This is a continuation patch for commit r307529 which completely replaces the scheduling information for the SandyBridge architecture target by modifying the file X86SchedSandyBridge.td located under the X86 Target (see also https://reviews.llvm.org/D35019).
In this patch we added the scheduling information of additional SNB instructions that were missing from commit r307529, fixed the scheduling of several resource groups that included only port0 instead of port05 (i.e., port0 OR port5), and fixed several instructions whose scheduling was incorrect in the r307529 commit.
The patch also includes the X87 instructions which were missing in previous patch commit r307529 as reported in bugzilla bug 34080.
Reviewers: zvi, RKSimon, chandlerc, igorb, m_zuckerman, craig.topper, aymanmus, dim
Craig Topper [Sat, 12 Aug 2017 20:19:44 +0000 (20:19 +0000)]
[X86] When handling addcarry intrinsic, create the flag result with the correct type so we don't crash if we use a memory instruction
Summary:
Previously we were creating the flag result with MVT::Other, which is interpreted as a Chain node. If we used a memory form of the instruction we would end up with a copyToReg that consumed the chain result of the adcx instruction instead of the flag result.
Pretty sure we should be using MVT::i32 here; that's what we do in other places where we create these node types.
Florian Hahn [Sat, 12 Aug 2017 17:40:18 +0000 (17:40 +0000)]
[Triple] Add isThumb and isARM functions.
Summary:
isThumb returns true for Thumb triples (little and big endian), isARM
returns true for ARM triples (little and big endian).
There are a few more checks using arm/thumb that are not covered by
those functions, e.g. that the architecture is either ARM or Thumb
(little endian) or ARM/Thumb little endian only.
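A hedged sketch of what such predicates boil down to, written against the public Triple API for illustration rather than copied from the patch:
  #include "llvm/ADT/Triple.h"
  // Illustration only: a Thumb triple is thumb or thumbeb, an ARM triple is
  // arm or armeb; the new isThumb()/isARM() members express the same check.
  static bool looksLikeThumb(const llvm::Triple &T) {
    return T.getArch() == llvm::Triple::thumb ||
           T.getArch() == llvm::Triple::thumbeb;
  }
  static bool looksLikeARM(const llvm::Triple &T) {
    return T.getArch() == llvm::Triple::arm ||
           T.getArch() == llvm::Triple::armeb;
  }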
Sanjay Patel [Sat, 12 Aug 2017 16:41:08 +0000 (16:41 +0000)]
[BDCE] clear poison generators after turning a value into zero (PR33695, PR34037)
nsw, nuw, and exact carry implicit assumptions about their operands, so we need
to clear those after trivializing a value. We decided there was no danger for
llvm.assume or metadata, so there's just a comment about that.
This fixes miscompiles as shown in:
https://bugs.llvm.org/show_bug.cgi?id=33695
https://bugs.llvm.org/show_bug.cgi?id=34037
Sanjay Patel [Fri, 11 Aug 2017 22:38:40 +0000 (22:38 +0000)]
[x86] add tests for rotate left/right with masked shifter; NFC
As noted in the test comment, instcombine now produces the masked
shift value even when it's not included in the source, so we should
handle this.
Although the AMD/Intel docs don't say it explicitly, over-rotating
the narrow ops produces the same results. An existence proof that
this works as expected on all x86 comes from gcc 4.9 or later:
https://godbolt.org/g/K6rc1A
Eli Friedman [Fri, 11 Aug 2017 21:12:04 +0000 (21:12 +0000)]
[OptDiag] Updating Remarks in SampleProfile
Updating the remark API to the newer OptimizationDiagnosticInfo API. This
allows remarks to show up in the diagnostic YAML file, and enables use
of the opt-viewer tool.
Remarks (L505 and L751) do not display hotness
information, most likely due to profile information not being
propagated yet. Unsure if this is the desired outcome.
Zachary Turner [Fri, 11 Aug 2017 20:46:28 +0000 (20:46 +0000)]
Output S_SECTION symbols to the Linker module.
PDBs need to contain 1 module for each object file/compiland,
and a special one synthesized by the linker. This one contains
a symbol record for each output section in the executable with
its address information. This patch adds such symbols to the
linker module. Note that we also are supposed to add an
S_COFFGROUP symbol for what appears to be each input section that
contributes to each output section, but it's not entirely clear
how to generate these yet, so I'm leaving that for a separate
patch.