]> granicus.if.org Git - llvm/log
llvm
5 years ago[SimplifyLibCalls] Mark known arguments with nonnull
David Bolvansky [Tue, 17 Sep 2019 09:32:52 +0000 (09:32 +0000)]
[SimplifyLibCalls] Mark known arguments with nonnull

Reviewers: efriedma, jdoerfert

Reviewed By: jdoerfert

Subscribers: ychen, rsmith, joerg, aaron.ballman, lebedev.ri, uenoku, jdoerfert, hfinkel, javed.absar, spatel, dmgreen, llvm-commits

Differential Revision: https://reviews.llvm.org/D53342

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372091 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[llvm-readobj] - Fix BB after r372087.
George Rimar [Tue, 17 Sep 2019 09:26:49 +0000 (09:26 +0000)]
[llvm-readobj] - Fix BB after r372087.

Seems I forgot to update the number of bytes checked.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372089 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[llvm-ar] Parse 'h' and '-h': display help and exit
Fangrui Song [Tue, 17 Sep 2019 09:25:52 +0000 (09:25 +0000)]
[llvm-ar] Parse 'h' and '-h': display help and exit

Support `llvm-ar h` and `llvm-ar -h` because they may be what users try
at first. Note, operation 'h' is undocumented in GNU ar.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D67560

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372088 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[llvm-readobj] - Fix a TODO in elf-reloc-zero-name-or-value.test.
George Rimar [Tue, 17 Sep 2019 09:12:10 +0000 (09:12 +0000)]
[llvm-readobj] - Fix a TODO in elf-reloc-zero-name-or-value.test.

The "TODO" mentioned was:

"Add test for symbol with no name but with a value once yaml2obj allows
referencing symbols with no name from relocations."

We can do it now.

Differential revision: https://reviews.llvm.org/D67609

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372087 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[AMDGPU]: PHI Elimination hooks added for custom COPY insertion. Fixed
Alexander Timofeev [Tue, 17 Sep 2019 09:08:58 +0000 (09:08 +0000)]
[AMDGPU]: PHI Elimination hooks added for custom COPY insertion. Fixed

Defferential Revision: https://reviews.llvm.org/D67101

Reviewers: rampitec, vpykhtin

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372086 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[ARM] LE support in ConstantIslands
Sam Parker [Tue, 17 Sep 2019 09:08:05 +0000 (09:08 +0000)]
[ARM] LE support in ConstantIslands

The low-overhead branch extension provides a loop-end 'LE' instruction
that performs no decrement nor compare, it just jumps backwards. This
patch modifies the constant islands pass to try to insert LE
instructions in place of a Thumb2 conditional branch, instead of
shrinking it. This only happens if a cmp can be converted to a cbn/z
and used to exit the loop.

Differential Revision: https://reviews.llvm.org/D67404

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372085 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[LoopUnroll] Use LoopSize+1 as threshold, to allow unrolling loops matching LoopSize.
Florian Hahn [Tue, 17 Sep 2019 09:02:48 +0000 (09:02 +0000)]
[LoopUnroll] Use LoopSize+1 as threshold, to allow unrolling loops matching LoopSize.

We use `< UP.Threshold` later on, so we should use LoopSize + 1, to
allow unrolling if the result won't exceed to loop size.

Fixes PR43305.

Reviewers: efriedma, dmgreen, paquette

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D67594

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372084 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[llvm-readobj] - Refactor the code.
George Rimar [Tue, 17 Sep 2019 08:53:18 +0000 (08:53 +0000)]
[llvm-readobj] - Refactor the code.

It's a straightforward refactoring that allows to simplify and encapsulate the code.

Differential revision: https://reviews.llvm.org/D67624

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372083 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[llvm-objcopy] - Remove python invocations from 2 test cases.
George Rimar [Tue, 17 Sep 2019 08:38:53 +0000 (08:38 +0000)]
[llvm-objcopy] - Remove python invocations from 2 test cases.

It is possible to use yaml2obj to create sections with overlapping sh_offset now.
This patch does that.

Differential revision: https://reviews.llvm.org/D67610

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372081 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[bugpoint] Add support for -Oz and properly enable -Os.
Florian Hahn [Tue, 17 Sep 2019 08:14:09 +0000 (08:14 +0000)]
[bugpoint] Add support for -Oz and properly enable -Os.

This patch adds -Oz as option and also properly enables support for -Os.
Currently, the existing check for -Os is dead, because the enclosing if
only checks of O1, O2 and O3.

There is still a difference between the -Oz pipeline compared to opt,
but I have not been able to track that down yet.

Reviewers: bogner, sebpop, efriedma

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D67593

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372079 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[ARM][MVE] Add invalidForTailPredication to TSFlags
Sam Parker [Tue, 17 Sep 2019 07:43:04 +0000 (07:43 +0000)]
[ARM][MVE] Add invalidForTailPredication to TSFlags

Set this bit for the MVE reduction instructions to prevent a loop from
becoming tail predicated in their presence.

Differential Revision: https://reviews.llvm.org/D67444

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372076 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[Attributor] Use Alias Analysis in noalias callsite argument deduction
Hideto Ueno [Tue, 17 Sep 2019 06:53:27 +0000 (06:53 +0000)]
[Attributor] Use Alias Analysis in noalias callsite argument deduction

Summary: This patch adds a check of alias analysis in `noalias` callsite argument deduction.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67604

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372075 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[Attributor] Create helper struct for handling analysis getters
Hideto Ueno [Tue, 17 Sep 2019 05:45:18 +0000 (05:45 +0000)]
[Attributor] Create helper struct for handling analysis getters

Summary: This patch introduces a helper struct `AnalysisGetter` to put together analysis getters. In this patch, a getter for `AAResult` is also added for  `noalias`.

Reviewers: jdoerfert, sstefan1

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67603

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372072 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[git-llvm] Do not reinvent `@{upstream}` (take 2)
David Zarzycki [Tue, 17 Sep 2019 04:44:13 +0000 (04:44 +0000)]
[git-llvm] Do not reinvent `@{upstream}` (take 2)

This makes git-llvm more of a thin wrapper around git while temporarily
maintaining backwards compatibility with past git-llvm behavior.

Using @{upstream} makes git-llvm more robust when used with a nontrivial
local repository.

https://reviews.llvm.org/D67389

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372070 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[X86] Split oversized vXi1 vector arguments and return values into scalars on avx512...
Craig Topper [Tue, 17 Sep 2019 04:41:14 +0000 (04:41 +0000)]
[X86] Split oversized vXi1 vector arguments and return values into scalars on avx512 targets.

Previously we tried to split them into narrower v64i1 or v16i1
pieces that each got promoted to vXi8 and then passed in a zmm
or xmm register. But this crashes when you need to pass more
pieces than available registers reserved for argument passing.

The scalarizing done here generates much longer and slower code,
but is consistent with the behavior of avx2 and earlier targets
for these types.

Fixes PR43323.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372069 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[X86] Allow masked VBROADCAST instructions to be turned into BLENDM with a broadcast...
Craig Topper [Tue, 17 Sep 2019 04:41:10 +0000 (04:41 +0000)]
[X86] Allow masked VBROADCAST instructions to be turned into BLENDM with a broadcast load to avoid a copy.

The BLENDM instructions allow an 2 sources and an independent
destination while masked VBROADCAST has the destination tied
to the source.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372068 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[X86] Add support for commuting EVEX VCMP instructons with any immediate value.
Craig Topper [Tue, 17 Sep 2019 04:41:05 +0000 (04:41 +0000)]
[X86] Add support for commuting EVEX VCMP instructons with any immediate value.

Previously we limited to the EQ/NE/TRUE/FALSE/ORD/UNORD immediates.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372067 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[X86] Add test case for missed opportunity to commute a VCMP instruction after unfold...
Craig Topper [Tue, 17 Sep 2019 04:41:01 +0000 (04:41 +0000)]
[X86] Add test case for missed opportunity to commute a VCMP instruction after unfolding one load in order to fold another load.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372066 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[X86] Enable commuting of EVEX VCMP for all immediate values during isel.
Craig Topper [Tue, 17 Sep 2019 04:40:58 +0000 (04:40 +0000)]
[X86] Enable commuting of EVEX VCMP for all immediate values during isel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372065 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agollvm-reduce: Clean out previous test temp/output dir, since it was a dir and now...
David Blaikie [Mon, 16 Sep 2019 23:56:26 +0000 (23:56 +0000)]
llvm-reduce: Clean out previous test temp/output dir, since it was a dir and now it's used as just a single file

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372054 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agollvm-reduce: Remove some string copies
David Blaikie [Mon, 16 Sep 2019 23:54:57 +0000 (23:54 +0000)]
llvm-reduce: Remove some string copies

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372053 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoRevert r372035: "[lit] Make internal diff work in pipelines"
Joel E. Denny [Mon, 16 Sep 2019 23:47:46 +0000 (23:47 +0000)]
Revert r372035: "[lit] Make internal diff work in pipelines"

This breaks a Windows bot.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372051 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[GlobalISel] Partially revert r371901.
Amara Emerson [Mon, 16 Sep 2019 23:46:03 +0000 (23:46 +0000)]
[GlobalISel] Partially revert r371901.

r371901 was overeager and widenScalarDst() and the like in the legalizer
attempt to increment the insert point given in order to add new instructions
after the currently legalizing inst. In cases where the insertion point is not
exactly the current instruction, then callers need to de-compensate for the
behaviour by decrementing the insertion iterator before calling them. It's not
a nice state of affairs, for now just undo the problematic parts of the change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372050 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agollvm-reduce: Make tests shell-independent by passing the interpreter on the command...
David Blaikie [Mon, 16 Sep 2019 23:41:19 +0000 (23:41 +0000)]
llvm-reduce: Make tests shell-independent by passing the interpreter on the command line rather than using #! in the test file

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372049 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoAdd libc to path mappings in git-llvm.
David L. Jones [Mon, 16 Sep 2019 23:36:35 +0000 (23:36 +0000)]
Add libc to path mappings in git-llvm.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372048 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[PowerPC] Cust lower fpext v2f32 to v2f64 from extract_subvector v4f32
Nemanja Ivanovic [Mon, 16 Sep 2019 22:54:52 +0000 (22:54 +0000)]
[PowerPC] Cust lower fpext v2f32 to v2f64 from extract_subvector v4f32

Add the missing piece of r372029.
Somehow when the patch for review D61961 was committed, only the test case
went in and the code didn't. This of course caused all kinds of build bot
breaks.
This patch just adds the code for that patch.

Author: Lei Huang
Differential revision: https://reviews.llvm.org/D61961

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372043 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[Remarks] Allow remarks::Format::YAML to take a string table
Francis Visoiu Mistrih [Mon, 16 Sep 2019 22:45:17 +0000 (22:45 +0000)]
[Remarks] Allow remarks::Format::YAML to take a string table

It should be allowed to take a string table in case all the strings in
the remarks point there, but it shouldn't use it during serialization.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372042 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[lit] Make internal diff work in pipelines
Joel E. Denny [Mon, 16 Sep 2019 21:22:29 +0000 (21:22 +0000)]
[lit] Make internal diff work in pipelines

When using lit's internal shell, RUN lines like the following
accidentally execute an external `diff` instead of lit's internal
`diff`:

```
 # RUN: program | diff file -
 # RUN: not diff file1 file2 | FileCheck %s
```

Such cases exist now, in `clang/test/Analysis` for example.  We are
preparing patches to ensure lit's internal `diff` is called in such
cases, which will then fail because lit's internal `diff` cannot
currently be used in pipelines and doesn't recognize `-` as a
command-line option.

To enable pipelines, this patch moves lit's `diff` implementation into
an out-of-process script, similar to lit's `cat` implementation.  A
follow-up patch will implement `-` to mean stdin.

Reviewed By: probinson, stella.stamenova

Differential Revision: https://reviews.llvm.org/D66574

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372035 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[NFC] Test commit access
Bardia Mahjour [Mon, 16 Sep 2019 20:44:15 +0000 (20:44 +0000)]
[NFC] Test commit access

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372033 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[Docs] Bug fix for docs homepage
DeForest Richards [Mon, 16 Sep 2019 20:29:56 +0000 (20:29 +0000)]
[Docs] Bug fix for docs homepage

Removes reference to non-existent Reference Documentation page.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372032 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[Docs] Adds Getting Started/Tutorials, Reference to LLVM docs homepage
DeForest Richards [Mon, 16 Sep 2019 20:19:32 +0000 (20:19 +0000)]
[Docs] Adds Getting Started/Tutorials, Reference to LLVM docs homepage

Adds a section for Getting Started/Tutorials and Reference topics to the LLVM docs homepage.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372031 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[PowerPC] Cust lower fpext v2f32 to v2f64 from extract_subvector v4f32
Lei Huang [Mon, 16 Sep 2019 20:04:15 +0000 (20:04 +0000)]
[PowerPC] Cust lower fpext v2f32 to v2f64 from extract_subvector v4f32

This is a follow up patch from https://reviews.llvm.org/D57857 to handle
extract_subvector v4f32.  For cases where we fpext of v2f32 to v2f64 from
extract_subvector we currently generate on P9 the following:

  lxv 0, 0(3)
  xxsldwi 1, 0, 0, 1
  xscvspdpn 2, 0
  xxsldwi 3, 0, 0, 3
  xxswapd 0, 0
  xscvspdpn 1, 1
  xscvspdpn 3, 3
  xscvspdpn 0, 0
  xxmrghd 0, 0, 3
  xxmrghd 1, 2, 1
  stxv 0, 0(4)
  stxv 1, 0(5)

This patch custom lower it to the following sequence:

  lxv 0, 0(3)       # load the v4f32 <w0, w1, w2, w3>
  xxmrghw 2, 0, 0   # Produce the following vector <w0, w0, w1, w1>
  xxmrglw 3, 0, 0   # Produce the following vector <w2, w2, w3, w3>
  xvcvspdp 2, 2     # FP-extend to <d0, d1>
  xvcvspdp 3, 3     # FP-extend to <d2, d3>
  stxv 2, 0(5)      # Store <d0, d1> (%vecinit11)
  stxv 3, 0(4)      # Store <d2, d3> (%vecinit4)

Differential Revision: https://reviews.llvm.org/D61961

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372029 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[Coverage] Speed up file-based queries for coverage info, NFC
Vedant Kumar [Mon, 16 Sep 2019 19:08:44 +0000 (19:08 +0000)]
[Coverage] Speed up file-based queries for coverage info, NFC

Speed up queries for coverage info in a file by reducing the amount of
time spent determining whether a function record corresponds to a file.

This gives a 36% speedup when generating a coverage report for `llc`.
The reduction is entirely in user time.

rdar://54758110

Differential Revision: https://reviews.llvm.org/D67575

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372025 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[Coverage] Assert that filenames in a TU are unique, NFC
Vedant Kumar [Mon, 16 Sep 2019 19:08:41 +0000 (19:08 +0000)]
[Coverage] Assert that filenames in a TU are unique, NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372024 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[LTO][Legacy] Add new C inferface to query libcall functions
Steven Wu [Mon, 16 Sep 2019 18:49:54 +0000 (18:49 +0000)]
[LTO][Legacy] Add new C inferface to query libcall functions

Summary:
This is needed to implemented the same approach as lld (implemented in r338434)
for how to handling symbols that can be generated by LTO code generator
but not present in the symbol table for linker that uses legacy C APIs.

libLTO is in charge of providing the list of symbols. Linker is in
charge of implementing the eager loading from static libraries using
the list of symbols.

rdar://problem/52853974

Reviewers: tejohnson, bd1976llvm, deadalnix, espindola

Reviewed By: tejohnson

Subscribers: emaste, arichardson, hiraditya, MaskRay, dang, kledzik, mehdi_amini, inglorion, jkorous, dexonsmith, ributzka, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67568

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372021 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[PGO] Use linkonce_odr linkage for __profd_ variables in comdat groups
Reid Kleckner [Mon, 16 Sep 2019 18:49:09 +0000 (18:49 +0000)]
[PGO] Use linkonce_odr linkage for __profd_ variables in comdat groups

This fixes relocations against __profd_ symbols in discarded sections,
which is PR41380.

In general, instrumentation happens very early, and optimization and
inlining happens afterwards. The counters for a function are calculated
early, and after inlining, counters for an inlined function may be
widely referenced by other functions.

For C++ inline functions of all kinds (linkonce_odr &
available_externally mainly), instr profiling wants to deduplicate these
__profc_ and __profd_ globals. Otherwise the binary would be quite
large.

I made __profd_ and __profc_ comdat in r355044, but I chose to make
__profd_ internal. At the time, I was only dealing with coverage, and in
that case, none of the instrumentation needs to reference __profd_.
However, if you use PGO, then instrumentation passes add calls to
__llvm_profile_instrument_range which reference __profd_ globals. The
solution is to make these globals externally visible by using
linkonce_odr linkage for data as was done for counters.

This is safe because PGO adds a CFG hash to the names of the data and
counter globals, so if different TUs have different globals, they will
get different data and counter arrays.

Reviewers: xur, hans

Differential Revision: https://reviews.llvm.org/D67579

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372020 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[ARM][Codegen] Autogenerate arm-cgp-casts.ll test.
Roman Lebedev [Mon, 16 Sep 2019 18:28:22 +0000 (18:28 +0000)]
[ARM][Codegen] Autogenerate arm-cgp-casts.ll test.

Apparently it got broken by r372009 while i thought it was r372012.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372019 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[X86][AVX] matchShuffleWithSHUFPD - add support for zeroable operands
Simon Pilgrim [Mon, 16 Sep 2019 17:30:33 +0000 (17:30 +0000)]
[X86][AVX] matchShuffleWithSHUFPD - add support for zeroable operands

Determine if all of the uses of LHS/RHS operands can be replaced with a zero vector.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372013 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[ARM] A predicate cast of a predicate cast is a predicate cast
David Green [Mon, 16 Sep 2019 17:29:07 +0000 (17:29 +0000)]
[ARM] A predicate cast of a predicate cast is a predicate cast

The adds some very basic folding of PREDICATE_CASTS, removing cases when they
are chained together. These would already be removed eventually, as these are
lowered to copies. This just allows it to happen earlier, which can help other
simplifications.

Differential Revision: https://reviews.llvm.org/D67591

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372012 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[SimplifyCFG] FoldTwoEntryPHINode(): consider *total* speculation cost, not per-BB...
Roman Lebedev [Mon, 16 Sep 2019 16:18:24 +0000 (16:18 +0000)]
[SimplifyCFG] FoldTwoEntryPHINode(): consider *total* speculation cost, not per-BB cost

Summary:
Previously, if the threshold was 2, we were willing to speculatively
execute 2 cheap instructions in both basic blocks (thus we were willing
to speculatively execute cost = 4), but weren't willing to speculate
when one BB had 3 instructions and other one had no instructions,
even thought that would have total cost of 3.

This looks inconsistent to me.
I don't think `cmov`-like instructions will start executing
until both of it's inputs are available: https://godbolt.org/z/zgHePf
So i don't see why the existing behavior is the correct one.

Also, let's add it's own `cl::opt` for this threshold,
with default=4, so it is not stricter than the previous threshold:
will allow to fold when there are 2 BB's each with cost=2.
And since the logic has changed, it will also allow to fold when
one BB has cost=3 and other cost=1, or there is only one BB with cost=4.

This is an alternative solution to D65148:
This fix is mainly motivated by `signbit-like-value-extension.ll` test.
That pattern comes up in JPEG decoding, see e.g.
`Figure F.12 â€“ Extending the sign bit of a decoded value in V`
of `ITU T.81` (JPEG specification).
That branch is not predictable, and it is within the innermost loop,
so the fact that that pattern ends up being stuck with a branch
instead of `select` (i.e. `CMOV` for x86) is unlikely to be beneficial.

This has great results on the final assembly (vanilla test-suite + RawSpeed): (metric pass - D67240)
| metric                                 |     old |     new | delta |      % |
| x86-mi-counting.NumMachineFunctions    |   37720 |   37721 |     1 |  0.00% |
| x86-mi-counting.NumMachineBasicBlocks  |  773545 |  771181 | -2364 | -0.31% |
| x86-mi-counting.NumMachineInstructions | 7488843 | 7486442 | -2401 | -0.03% |
| x86-mi-counting.NumUncondBR            |  135770 |  135543 |  -227 | -0.17% |
| x86-mi-counting.NumCondBR              |  423753 |  422187 | -1566 | -0.37% |
| x86-mi-counting.NumCMOV                |   24815 |   25731 |   916 |  3.69% |
| x86-mi-counting.NumVecBlend            |      17 |      17 |     0 |  0.00% |

We significantly decrease basic block count, notably decrease instruction count,
significantly decrease branch count and very significantly increase `cmov` count.

Performance-wise, unsurprisingly, this has great effect on
target RawSpeed benchmark. I'm seeing 5 **major** improvements:
```
Benchmark                                                                                             Time             CPU      Time Old      Time New       CPU Old       CPU New
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_pvalue                                 0.0000          0.0000      U Test, Repetitions: 49 vs 49
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_mean                                  -0.3064         -0.3064      226.9913      157.4452      226.9800      157.4384
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_median                                -0.3057         -0.3057      226.8407      157.4926      226.8282      157.4828
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_stddev                                -0.4985         -0.4954        0.3051        0.1530        0.3040        0.1534
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_pvalue                                  0.0000          0.0000      U Test, Repetitions: 49 vs 49
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_mean                                   -0.1747         -0.1747       80.4787       66.4227       80.4771       66.4146
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_median                                 -0.1742         -0.1743       80.4686       66.4542       80.4690       66.4436
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_stddev                                 +0.6089         +0.5797        0.0670        0.1078        0.0673        0.1062
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_pvalue                                 0.0000          0.0000      U Test, Repetitions: 49 vs 49
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_mean                                  -0.1598         -0.1598      171.6996      144.2575      171.6915      144.2538
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_median                                -0.1598         -0.1597      171.7109      144.2755      171.7018      144.2766
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_stddev                                +0.4024         +0.3850        0.0847        0.1187        0.0848        0.1175
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_pvalue                                  0.0000          0.0000      U Test, Repetitions: 49 vs 49
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_mean                                   -0.0550         -0.0551      280.3046      264.8800      280.3017      264.8559
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_median                                 -0.0554         -0.0554      280.2628      264.7360      280.2574      264.7297
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_stddev                                 +0.7005         +0.7041        0.2779        0.4725        0.2775        0.4729
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_pvalue                                  0.0000          0.0000      U Test, Repetitions: 49 vs 49
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_mean                                   -0.0354         -0.0355      316.7396      305.5208      316.7342      305.4890
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_median                                 -0.0354         -0.0356      316.6969      305.4798      316.6917      305.4324
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_stddev                                 +0.0493         +0.0330        0.3562        0.3737        0.3563        0.3681
```

That being said, it's always best-effort, so there will likely
be cases where this worsens things.

Reviewers: efriedma, craig.topper, dmgreen, jmolloy, fhahn, Carrot, hfinkel, chandlerc

Reviewed By: jmolloy

Subscribers: xbolva00, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67318

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372009 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[InstCombine] remove unneeded one-use checks for icmp fold
Sanjay Patel [Mon, 16 Sep 2019 16:15:25 +0000 (16:15 +0000)]
[InstCombine] remove unneeded one-use checks for icmp fold

Related folds were added in:
rL125734
...the code comment about register pressure is discussed in
more detail in:
https://bugs.llvm.org/show_bug.cgi?id=2698

But 10 years later, perf testing bzip2 with this change now
shows a slight (0.2% average) improvement on Haswell although
that's probably within test noise.

Given that this is IR canonicalization, we shouldn't be worried
about register pressure though; the backend should be able to
adjust for that as needed.

This is part of solving PR43310 the theoretically right way:
https://bugs.llvm.org/show_bug.cgi?id=43310
...ie, if we don't cripple basic transforms, then we won't
need to add special-case code to detect larger patterns.

rL371940 and rL371981 are related patches in this series.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372007 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[InstCombine] move tests for icmp+add; NFC
Sanjay Patel [Mon, 16 Sep 2019 15:33:40 +0000 (15:33 +0000)]
[InstCombine] move tests for icmp+add; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372004 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[ARM] Add patterns for BSWAP intrinsic on MVE
Oliver Cruickshank [Mon, 16 Sep 2019 15:20:10 +0000 (15:20 +0000)]
[ARM] Add patterns for BSWAP intrinsic on MVE

BSWAP can use the VREV instruction on MVE to produce better results than
expanding.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372002 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[ARM] Add patterns for bitreverse intrinsic on MVE
Oliver Cruickshank [Mon, 16 Sep 2019 15:20:03 +0000 (15:20 +0000)]
[ARM] Add patterns for bitreverse intrinsic on MVE

BITREVERSE can use the VBRSR which will reverse and right shift.
Shifting right by 0 will just reverse the bits.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372001 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[ARM] Lower CTTZ on MVE
Oliver Cruickshank [Mon, 16 Sep 2019 15:19:56 +0000 (15:19 +0000)]
[ARM] Lower CTTZ on MVE

Lower CTTZ on MVE using VBRSR and VCLS which will reverse the bits and
count the leading zeros, equivalent to a count trailing zeros (CTTZ).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372000 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[ARM] Add patterns for CTLZ on MVE
Oliver Cruickshank [Mon, 16 Sep 2019 15:19:49 +0000 (15:19 +0000)]
[ARM] Add patterns for CTLZ on MVE

CTLZ intrinsic can use the VCLS instruction on MVE, which produces
better results than expanding.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371999 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[ExecutionEngine] Don't dereference a dyn_cast result. NFCI.
Simon Pilgrim [Mon, 16 Sep 2019 15:19:11 +0000 (15:19 +0000)]
[ExecutionEngine] Don't dereference a dyn_cast result. NFCI.

The static analyzer is warning about potential null dereferences of dyn_cast<> results - in these cases we can safely use cast<> directly as we know that these cases should all be the correct type, which is why its working atm and anyway cast<> will assert if they aren't.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371998 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[LV] Add ARM MVE tail-folding tests
Sjoerd Meijer [Mon, 16 Sep 2019 14:56:26 +0000 (14:56 +0000)]
[LV] Add ARM MVE tail-folding tests

Now that the vectorizer can do tail-folding (rL367592), and the ARM backend
understands MVE masked loads/stores (rL371932), it's time to add the MVE
tail-folding equivalent of the X86 tests that I added.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371996 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[SystemZ] Call erase() on the right MBB in SystemZTargetLowering::emitSelect()
Jonas Paulsson [Mon, 16 Sep 2019 14:49:36 +0000 (14:49 +0000)]
[SystemZ]  Call erase() on the right MBB in SystemZTargetLowering::emitSelect()

Since MBB was split *before* MI, the MI(s) will reside in JoinMBB (MBB) at
the point of erasing them, so calling StartMBB->erase() is actually wrong,
although it is "working" by all appearances.

Review: Ulrich Weigand

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371995 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[NFC] remove unused functions
Guillaume Chatelet [Mon, 16 Sep 2019 14:48:58 +0000 (14:48 +0000)]
[NFC] remove unused functions

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67616

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371994 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoAMDGPU/GlobalISel: Fail select of G_INSERT non-32-bit source
Matt Arsenault [Mon, 16 Sep 2019 14:26:14 +0000 (14:26 +0000)]
AMDGPU/GlobalISel: Fail select of G_INSERT non-32-bit source

This was producing an illegal copy which would hit an assert
later. Error on selection for now until this is implemented.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371993 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoAMDGPU/GlobalISel: Fix some broken run lines
Matt Arsenault [Mon, 16 Sep 2019 14:14:40 +0000 (14:14 +0000)]
AMDGPU/GlobalISel: Fix some broken run lines

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371992 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoAMDGPU/GlobalISel: Fix RegBankSelect for G_FRINT and G_FCEIL
Matt Arsenault [Mon, 16 Sep 2019 14:14:37 +0000 (14:14 +0000)]
AMDGPU/GlobalISel: Fix RegBankSelect for G_FRINT and G_FCEIL

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371991 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoAMDGPU/GlobalISel: Remove another illegal select test
Matt Arsenault [Mon, 16 Sep 2019 14:14:31 +0000 (14:14 +0000)]
AMDGPU/GlobalISel: Remove another illegal select test

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371990 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[X86][NFC] Add a `use-aa` feature.
Clement Courbet [Mon, 16 Sep 2019 14:05:28 +0000 (14:05 +0000)]
[X86][NFC] Add a `use-aa` feature.

Summary:
This allows enabling useaa on the command-line and will allow enabling the
feature on a per-CPU basis where benchmarking shows improvements.

This is modelled after the ARM/AArch64 target.

Reviewers: RKSimon, andreadb, craig.topper

Subscribers: javed.absar, kristof.beyls, hiraditya, ychen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67266

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371989 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[InstCombine] add/move tests for icmp with add operand; NFC
Sanjay Patel [Mon, 16 Sep 2019 14:05:19 +0000 (14:05 +0000)]
[InstCombine] add/move tests for icmp with add operand; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371988 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[docs][llvm-strings] Write llvm-strings documentation
James Henderson [Mon, 16 Sep 2019 13:56:12 +0000 (13:56 +0000)]
[docs][llvm-strings] Write llvm-strings documentation

Previously we only had a stub document.

Reviewed by: MaskRay

Differential Revision: https://reviews.llvm.org/D67554

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371984 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[docs][llvm-size] Write llvm-size documentation
James Henderson [Mon, 16 Sep 2019 13:20:37 +0000 (13:20 +0000)]
[docs][llvm-size] Write llvm-size documentation

Previously we only had a stub document.

Reviewed by: serge-sans-paille, MaskRay

Differential Revision: https://reviews.llvm.org/D67555

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371983 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[ARM] Fold VCMP into VPT
David Green [Mon, 16 Sep 2019 13:02:41 +0000 (13:02 +0000)]
[ARM] Fold VCMP into VPT

MVE has VPT instructions, which perform the duties of both a VCMP and a VPST in
a single instruction, performing the compare and starting the VPT block in one.
This teaches the MVEVPTBlockPass to fold them, searching back through the
basicblock for a valid VCMP and creating the VPT from its operands.

There are some changes to the VPT instructions to accommodate this, altering
the order of the operands to match the VCMP better, and changing P0 register
defs to be VPR defs, as is used in other places.

Differential Revision: https://reviews.llvm.org/D66577

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371982 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[InstCombine] remove unneeded one-use checks for icmp fold
Sanjay Patel [Mon, 16 Sep 2019 12:54:34 +0000 (12:54 +0000)]
[InstCombine] remove unneeded one-use checks for icmp fold

This fold and several others were added in:
rL125734 <https://reviews.llvm.org/rL125734>
...with no explanation for the one-use checks other than the code
comments about register pressure.

Given that this is IR canonicalization, we shouldn't be worried
about register pressure though; the backend should be able to
adjust for that as needed.

This is part of solving PR43310 the theoretically right way:
https://bugs.llvm.org/show_bug.cgi?id=43310
...ie, if we don't cripple basic transforms, then we won't
need to add special-case code to detect larger patterns.

rL371940 is a related patch in this series.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371981 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[InstCombine] add icmp tests with extra uses; NFC
Sanjay Patel [Mon, 16 Sep 2019 12:19:18 +0000 (12:19 +0000)]
[InstCombine] add icmp tests with extra uses; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371979 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[InstCombine] fix comments to match code; NFC
Sanjay Patel [Mon, 16 Sep 2019 12:12:05 +0000 (12:12 +0000)]
[InstCombine] fix comments to match code; NFC

This blob was written before match() existed, so it
could probably be reduced significantly.

But I suspect it isn't well tested, so tests would have
to be added to reduce risk from logic changes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371978 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agogn build: Merge r371976
Nico Weber [Mon, 16 Sep 2019 11:33:54 +0000 (11:33 +0000)]
gn build: Merge r371976

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371977 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[VPlanSLP] Don't dereference a cast_or_null<VPInstruction> result. NFCI.
Simon Pilgrim [Mon, 16 Sep 2019 11:22:44 +0000 (11:22 +0000)]
[VPlanSLP] Don't dereference a cast_or_null<VPInstruction> result. NFCI.

The static analyzer is warning about a potential null dereference of the cast_or_null result, I've split the cast_or_null check from the ->getUnderlyingInstr() call to avoid this, but it appears that we weren't seeing any null pointers in the dumped bundles in the first place.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371975 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[SLPVectorizer] Assert that we find a LastInst to silence analyzer null dereference...
Simon Pilgrim [Mon, 16 Sep 2019 10:48:16 +0000 (10:48 +0000)]
[SLPVectorizer] Assert that we find a LastInst to silence analyzer null dereference warning. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371974 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[SLPVectorizer] Don't dereference a dyn_cast result. NFCI.
Simon Pilgrim [Mon, 16 Sep 2019 10:35:09 +0000 (10:35 +0000)]
[SLPVectorizer] Don't dereference a dyn_cast result. NFCI.

The static analyzer is warning about potential null dereferences of dyn_cast<> results - in these cases we can safely use cast<> directly as we know that these cases should all be the correct type, which is why its working atm and anyway cast<> will assert if they aren't.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371973 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoAdded return statement to fix compile and build warning:
Sjoerd Meijer [Mon, 16 Sep 2019 10:30:37 +0000 (10:30 +0000)]
Added return statement to fix compile and build warning:

llvm-rtdyld.cpp:966:7: warning: variable â€˜Result’ set but not used

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371972 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[SVE][Inline-Asm] Add constraints for SVE predicate registers
Kerry McLaughlin [Mon, 16 Sep 2019 09:45:27 +0000 (09:45 +0000)]
[SVE][Inline-Asm] Add constraints for SVE predicate registers

Summary:
Adds the following inline asm constraints for SVE:
  - Upl: One of the low eight SVE predicate registers, P0 to P7 inclusive
  - Upa: SVE predicate register with full range, P0 to P15

Reviewers: t.p.northover, sdesmalen, rovka, momchil.velikov, cameron.mcinally, greened, rengolin

Reviewed By: rovka

Subscribers: javed.absar, tschuett, rkruppe, psnobl, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66524

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371967 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agogn build: Merge r371965
Nico Weber [Mon, 16 Sep 2019 09:43:26 +0000 (09:43 +0000)]
gn build: Merge r371965

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371966 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agogn build: Merge r371959
Nico Weber [Mon, 16 Sep 2019 07:34:23 +0000 (07:34 +0000)]
gn build: Merge r371959

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371961 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[AArch64] Some more FP16 FMA pattern matching
Sjoerd Meijer [Mon, 16 Sep 2019 07:32:13 +0000 (07:32 +0000)]
[AArch64] Some more FP16 FMA pattern matching

After our previous machinecombiner exercises (rL371321, rL371818, rL371833), we
were still missing a few FP16 FMA patterns.

Differential Revision: https://reviews.llvm.org/D67576

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371960 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[SystemZ] Merge the SystemZExpandPseudo pass into SystemZPostRewrite.
Jonas Paulsson [Mon, 16 Sep 2019 07:29:37 +0000 (07:29 +0000)]
[SystemZ]  Merge the SystemZExpandPseudo pass into SystemZPostRewrite.

SystemZExpandPseudo:s only job was to expand LOCRMux instructions into jump
sequences. This needs to be done if expandLOCRPseudo() or expandSELRPseudo()
fails to find a legal opcode (all registers "high" or "low"). This task has
now been moved to SystemZPostRewrite while removing the SystemZExpandPseudo
pass.

It is in fact preferred to expand these pseudos directly after register
allocation in SystemZPostRewrite since the hinted register combinations are
then not subject to later optimizations.

Review: Ulrich Weigand
https://reviews.llvm.org/D67432

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371959 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoAMDGPU/GlobalISel: Remove illegal select tests
Matt Arsenault [Mon, 16 Sep 2019 04:21:10 +0000 (04:21 +0000)]
AMDGPU/GlobalISel: Remove illegal select tests

These fail in a release build.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371955 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoAMDGPU/GlobalISel: Select SMRD loads for more types
Matt Arsenault [Mon, 16 Sep 2019 00:54:07 +0000 (00:54 +0000)]
AMDGPU/GlobalISel: Select SMRD loads for more types

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371954 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoAMDGPU/GlobalISel: RegBankSelect for kill
Matt Arsenault [Mon, 16 Sep 2019 00:48:37 +0000 (00:48 +0000)]
AMDGPU/GlobalISel: RegBankSelect for kill

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371953 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoAMDGPU/GlobalISel: Legalize s1 source G_[SU]ITOFP
Matt Arsenault [Mon, 16 Sep 2019 00:37:10 +0000 (00:37 +0000)]
AMDGPU/GlobalISel: Legalize s1 source G_[SU]ITOFP

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371952 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoAMDGPU/GlobalISel: Set type on vgpr live in special arguments
Matt Arsenault [Mon, 16 Sep 2019 00:33:00 +0000 (00:33 +0000)]
AMDGPU/GlobalISel: Set type on vgpr live in special arguments

Fixes assertion with workitem ID intrinsics used in non-kernel
functions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371951 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoAMDGPU/GlobalISel: Select S16->S32 fptoint
Matt Arsenault [Mon, 16 Sep 2019 00:32:56 +0000 (00:32 +0000)]
AMDGPU/GlobalISel: Select S16->S32 fptoint

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371950 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoAMDGPU/GlobalISel: Select s32->s16 G_[US]ITOFP
Matt Arsenault [Mon, 16 Sep 2019 00:29:12 +0000 (00:29 +0000)]
AMDGPU/GlobalISel: Select s32->s16 G_[US]ITOFP

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371949 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoAMDGPU/GlobalISel: Fix VALU s16 fneg
Matt Arsenault [Mon, 16 Sep 2019 00:20:54 +0000 (00:20 +0000)]
AMDGPU/GlobalISel: Fix VALU s16 fneg

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371948 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[Attributor] Heap-To-Stack Conversion
Stefan Stipanovic [Sun, 15 Sep 2019 21:47:41 +0000 (21:47 +0000)]
[Attributor] Heap-To-Stack Conversion

D53362 gives a prototype heap-to-stack conversion pass. With addition of new attributes in the attributor, this can now be revisted and improved. This will place it in the Attributor to make it easier to use new attributes (eg. nofree, nosync, willreturn, etc.) and other attributor features.

Reviewers: jdoerfert, uenoku, hfinkel, efriedma

Subscribers: lebedev.ri, xbolva00, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D65408

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371942 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[InstCombine] remove unneeded one-use checks for icmp fold
Sanjay Patel [Sun, 15 Sep 2019 20:56:34 +0000 (20:56 +0000)]
[InstCombine] remove unneeded one-use checks for icmp fold

This fold and several others were added in:
rL125734
...with no explanation for the one-use checks other than the code
comments about register pressure.

Given that this is IR canonicalization, we shouldn't be worried
about register pressure though; the backend should be able to
adjust for that as needed.

There are similar checks as noted with the TODO comments. I'm
hoping to remove those restrictions too, but if any of these
does cause a regression, it should be easier to correct by making
small, individual commits.

This is part of solving PR43310 the theoretically right way:
https://bugs.llvm.org/show_bug.cgi?id=43310
...ie, if we don't cripple basic transforms, then we won't
need to add special-case code to detect larger patterns.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371940 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[InstCombine] add icmp tests with extra uses; NFC
Sanjay Patel [Sun, 15 Sep 2019 20:13:27 +0000 (20:13 +0000)]
[InstCombine] add icmp tests with extra uses; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371939 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[PowerPC][NFC] Add a testcase for fdiv expansion.
Jinsong Ji [Sun, 15 Sep 2019 20:02:25 +0000 (20:02 +0000)]
[PowerPC][NFC] Add a testcase for fdiv expansion.

Pre-commit for following patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371938 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[GlobalISel] findGISelOptimalMemOpLowering - remove dead initalization. NFCI.
Simon Pilgrim [Sun, 15 Sep 2019 16:56:06 +0000 (16:56 +0000)]
[GlobalISel] findGISelOptimalMemOpLowering - remove dead initalization. NFCI.

Fixes static analyzer warning that "Value stored to 'NewTySize' during its initialization is never read".

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371937 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[LoadStoreVectorizer] vectorizeLoadChain - ensure we find a valid Type down the load...
Simon Pilgrim [Sun, 15 Sep 2019 16:44:35 +0000 (16:44 +0000)]
[LoadStoreVectorizer] vectorizeLoadChain - ensure we find a valid Type down the load chain. NFCI.

Silence static analyzer uninitialized variable warning by setting the LoadTy to null and then asserting we find a real value.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371936 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoInterleavedLoadCombine - merge isa<> and dyn_cast<> duplicates. NFCI.
Simon Pilgrim [Sun, 15 Sep 2019 16:20:12 +0000 (16:20 +0000)]
InterleavedLoadCombine - merge isa<> and dyn_cast<> duplicates. NFCI.

Silence static analyzer null dereference warning of *dyn_cast<BinaryOperator> by merging with the isa<BinaryOperator> above.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371935 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[DebugInfo] Don't dereference a dyn_cast<PDBSymbolData> result. NFCI.
Simon Pilgrim [Sun, 15 Sep 2019 15:38:26 +0000 (15:38 +0000)]
[DebugInfo] Don't dereference a dyn_cast<PDBSymbolData> result. NFCI.

The static analyzer is warning about a potential null dereference - but as we're in DataMemberLayoutItem we should be able to guarantee that the Symbol is a PDBSymbolData type, allowing us to use cast<PDBSymbolData> - and if not assert will fire for us.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371933 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[ARM] Masked loads and stores
David Green [Sun, 15 Sep 2019 14:14:47 +0000 (14:14 +0000)]
[ARM] Masked loads and stores

Masked loads and store fit naturally with MVE, the instructions being easily
predicated. This adds lowering for the simple cases of masked loads and stores.
It does not yet deal with widening/narrowing or pre/post inc, and so is
currently behind an option.

The llvm masked load intrinsic will accept a "passthru" value, dictating the
values used for the zero masked lanes. In MVE the instructions write 0 to the
zero predicated lanes, so we need to match a passthru that isn't 0 (or undef)
with a select instruction to pull in the correct data after the load.

Differential Revision: https://reviews.llvm.org/D67186

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371932 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[SLP] limit vectorization of Constant subclasses (PR33958)
Sanjay Patel [Sun, 15 Sep 2019 13:03:24 +0000 (13:03 +0000)]
[SLP] limit vectorization of Constant subclasses (PR33958)

This is a fix for:
https://bugs.llvm.org/show_bug.cgi?id=33958

It seems universally true that we would not want to transform this kind of
sequence on any target, but if that's not correct, then we could view this
as a target-specific cost model problem. We could also white-list ConstantInt,
ConstantFP, etc. rather than blacklist Global and ConstantExpr.

Differential Revision: https://reviews.llvm.org/D67362

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371931 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[ARM] Simplify and update vmla test. NFC
David Green [Sun, 15 Sep 2019 11:53:05 +0000 (11:53 +0000)]
[ARM] Simplify and update vmla test. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371930 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[CodeEmitter] Improve testing for APInt encoding
James Molloy [Sun, 15 Sep 2019 08:44:40 +0000 (08:44 +0000)]
[CodeEmitter] Improve testing for APInt encoding

I missed Artem's comment in D67487 before committing.

Differential Revision: https://reviews.llvm.org/D67487

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371929 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[CodeEmitter] Support instruction widths > 64 bits
James Molloy [Sun, 15 Sep 2019 08:35:08 +0000 (08:35 +0000)]
[CodeEmitter] Support instruction widths > 64 bits

Some VLIW instruction sets are Very Long Indeed. Using uint64_t constricts the Inst encoding to 64 bits (naturally).

This change switches CodeEmitter to a mode that uses APInts when Inst's bitwidth is > 64 bits (NFC for existing targets).

When Inst.BitWidth > 64 the prototype changes to:

  void TargetMCCodeEmitter::getBinaryCodeForInstr(const MCInst &MI,
                                                  SmallVectorImpl<MCFixup> &Fixups,
                                                  APInt &Inst,
                                                  APInt &Scratch,
                                                  const MCSubtargetInfo &STI);

The Inst parameter returns the encoded instruction, the Scratch parameter is used internally for manipulating operands and is exposed so that the underlying storage can be reused between calls to getBinaryCodeForInstr. The goal is to elide any APInt constructions that we can.

Similarly the operand encoding prototype changes to:

  getMachineOpValue(const MCInst &MI, const MCOperand &MO, APInt &op, SmallVectorImpl<MCFixup> &Fixups, const MCSubtargetInfo &STI);

That is, the operand is passed by reference as APInt rather than returned as uint64_t.

To reiterate, this APInt mode is enabled only when Inst.BitWidth > 64, so this change is NFC for existing targets.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371928 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[TargetLowering] SimplifyDemandedBits - add EXTRACT_SUBVECTOR support.
Simon Pilgrim [Sat, 14 Sep 2019 16:38:26 +0000 (16:38 +0000)]
[TargetLowering] SimplifyDemandedBits - add EXTRACT_SUBVECTOR support.

Call SimplifyDemandedBits on the source vector.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371923 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[InstSimplify] simplifyUnsignedRangeCheck(): handle few tautological cases (PR43251)
Roman Lebedev [Sat, 14 Sep 2019 13:47:27 +0000 (13:47 +0000)]
[InstSimplify] simplifyUnsignedRangeCheck(): handle few tautological cases (PR43251)

Summary:
This is split off from D67356, since these cases produce a constant,
no real need to keep them in instcombine.

Alive proofs:
https://rise4fun.com/Alive/u7Fk
https://rise4fun.com/Alive/4lV

https://bugs.llvm.org/show_bug.cgi?id=43251

Reviewers: spatel, nikic, xbolva00

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67498

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371921 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[ScheduleDAGMILive] Fix typo in comment.
Mingjie Xing [Sat, 14 Sep 2019 03:27:38 +0000 (03:27 +0000)]
[ScheduleDAGMILive] Fix typo in comment.

Differential Revision: https://reviews.llvm.org/D67478

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371916 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[Attributor][Fix] Use right type to replace expressions
Johannes Doerfert [Sat, 14 Sep 2019 02:57:50 +0000 (02:57 +0000)]
[Attributor][Fix] Use right type to replace expressions

Summary: This should be obsolete once the functionality in D66967 is integrated.

Reviewers: uenoku, sstefan1

Subscribers: hiraditya, bollu, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67231

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371915 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[llvm-objcopy] Ignore -B --binary-architecture=
Fangrui Song [Sat, 14 Sep 2019 01:36:31 +0000 (01:36 +0000)]
[llvm-objcopy] Ignore -B --binary-architecture=

GNU objcopy documents that -B is only useful with architecture-less
input (i.e. "binary" or "ihex"). After D67144, -O defaults to -I, and
-B is essentially a NOP.

* If -O is binary/ihex, GNU objcopy ignores -B.
* If -O is elf*, -B provides the e_machine field in GNU objcopy.

So to convert a blob to an ELF, `-I binary -B i386:x86-64 -O elf64-x86-64` has to be specified.

`-I binary -B i386:x86-64 -O elf64-x86-64` creates an ELF with its
e_machine field set to EM_NONE in GNU objcopy, but a regular x86_64 ELF
in elftoolchain elfcopy. Follow the elftoolchain approach (ignoring -B)
to simplify code. Users that expect their command line portable should
specify -B.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D67215

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371914 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[llvm-objcopy] Default --output-target to --input-target when unspecified
Fangrui Song [Sat, 14 Sep 2019 01:36:16 +0000 (01:36 +0000)]
[llvm-objcopy] Default --output-target to --input-target when unspecified

Fixes PR42171.

In GNU objcopy, if -O (--output-target) is not specified, the value is
copied from -I (--input-target).

```
objcopy -I binary -B i386:x86-64 a.txt b       # b is copied from a.txt
llvm-objcopy -I binary -B i386:x86-64 a.txt b  # b is an x86-64 object file
```

This patch changes our behavior to match GNU. With this change, we can
delete code related to -B handling (D67215).

Reviewed By: jakehehrlich

Differential Revision: https://reviews.llvm.org/D67144

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371913 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[llvm-ar] Uncapitalize error messages and delete full stop
Fangrui Song [Sat, 14 Sep 2019 01:18:47 +0000 (01:18 +0000)]
[llvm-ar] Uncapitalize error messages and delete full stop

Most GNU binutils don't append full stops in error messages. This
convention has been adopted by a bunch of LLVM binary utilities. Make
llvm-ar follow the convention as well.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D67558

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371912 91177308-0d34-0410-b5e6-96231b3b80d8