granicus.if.org Git

[WebAssembly] Optimize Irreducible Control Flow

Summary:
Irreducible control flow is not that rare, e.g. it happens in malloc and
3 other places in the libc portions linked in to a hello world program.
This patch improves how we handle that code: it emits a br_table to
dispatch to only the minimal necessary number of blocks. This reduces
the size of malloc by 33%, and makes it comparable in size to asm2wasm's
malloc output.

Added some tests, and verified this passes the emscripten-wasm tests run
on the waterfall (binaryen2, wasmobj2, other).

Reviewers: aheejin, sunfish

Subscribers: mgrang, jgravelle-google, sbc100, dschuff, llvm-commits

Differential Revision: https://reviews.llvm.org/D55467

Patch by Alon Zakai (kripken)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350367 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Fixed disassembler not knowing about new brlist operand

Summary:
The previously introduced new operand type for br_table didn't have
a disassembler implementation, causing an assert.

Reviewers: dschuff, aheejin

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D56227

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350366 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Made InstPrinter more robust

Summary:
Instead of asserting on certain kinds of malformed instructions, it
now still prints, but adds an annotation indicating the
problem, and/or indicates invalid_type etc.

We're using the InstPrinter from many contexts that can't always
guarantee values are within range (e.g. the disassembler), where having
output is more valuable than asserting.

Reviewers: dschuff, aheejin

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D56223

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350365 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] add 512-bit vector tests for horizontal ops; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350364 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] add AVX512 runs for horizontal ops; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350362 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add test case for D56283.

This tests a case where we need to be able to compute sign bits for two insert_subvectors that are live out of a basic block. The result is then used as a boolean vector in another basic block.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350359 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] remove dead CHECK lines from test file; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350358 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] split tests for FP and integer horizontal math

These are similar patterns, but when you throw AVX512 onto the pile,
the number of variations explodes. For FP, we really don't care about
AVX1 vs. AVX2 for FP ops. There may be some superficial shuffle diffs,
but that's not what we're testing for here, so I removed those RUNs.

Separating by type also lets us specify 'sse3' for the FP file vs. 'ssse3'
for the integer file...because x86.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350357 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] add common FileCheck prefix to reduce assert duplication; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350356 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove terrible DX Register parsing hack in parse operand. NFCI.

Fold hack special casing of (%dx) operand parsing into the related
hack for out*/in* instruction parsing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350355 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner][x86] scalarize binop followed by extractelement

As noted in PR39973 and D55558:
https://bugs.llvm.org/show_bug.cgi?id=39973
...this is a partial implementation of a fold that we do as an IR canonicalization in instcombine:

// extelt (binop X, Y), Index --> binop (extelt X, Index), (extelt Y, Index)

We want to have this in the DAG too because as we can see in some of the test diffs (reductions),
the pattern may not be visible in IR.

Given that this is already an IR canonicalization, any backend that would prefer a vector op over
a scalar op is expected to already have the reverse transform in DAG lowering (not sure if that's
a realistic expectation though). The transform is limited with a TLI hook because there's an
existing transform in CodeGenPrepare that tries to do the opposite transform.
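
As an illustration only, the fold can be modelled outside LLVM by treating vectors as Python lists. This is a minimal sketch, not the DAGCombiner code; the helper names `extract_of_binop` and `binop_of_extracts` are hypothetical and do not appear in the patch.

```python
import operator


def extract_of_binop(op, x, y, index):
    # Vector form: apply op to every lane, then extract a single lane.
    return [op(a, b) for a, b in zip(x, y)][index]


def binop_of_extracts(op, x, y, index):
    # Scalarized form: extract the lane from each operand, then apply op once.
    return op(x[index], y[index])


x = [1, 2, 3, 4]
y = [10, 20, 30, 40]
# Both forms agree for every lane index, which is what makes the fold legal.
assert all(
    extract_of_binop(operator.add, x, y, i) == binop_of_extracts(operator.add, x, y, i)
    for i in range(len(x))
)
```

The scalarized form does one operation instead of one per lane, which is why it is usually preferred when only a single element is needed.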

Differential Revision: https://reviews.llvm.org/D55722

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350354 91177308-0d34-0410-b5e6-96231b3b80d8

[AVR] Update integration/blink.ll as we now generate sbi/cbi instructions.

Silence long standing test failure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350353 91177308-0d34-0410-b5e6-96231b3b80d8

[CaptureTracking] Add a unit test for MaxUsesToExplore

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350351 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Fix scalar operand folding bug that causes SHOC performance regression.

Detailed description: SIFoldOperands::foldInstOperand iterates over the
operand uses, calling a function that changes the def-use iterators on the
way. As a result, the loop exits immediately when the def-use iterator is
changed. Hence, the operand is folded into the very first use instruction
only. This keeps the VGPR live along the whole basic block and increases
register pressure significantly. The performance drop observed in SHOC
DeviceMemory test is caused by this bug.

Proposed fix: collect uses to separate container for further processing
in another loop.
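
The hazard and the remedy can be sketched by analogy in Python. This is not the SIFoldOperands code: `fold_into` is a hypothetical stand-in for a fold that edits the use list as a side effect. The failure mode differs in detail (Python skips elements rather than leaving the loop after the first use), but the remedy is the same: collect the uses first, then process the collected copy.

```python
def fold_into(use, uses):
    # Stand-in for folding: rewriting a use removes it from the def-use list.
    uses.remove(use)
    return use


def fold_while_iterating(uses):
    # Buggy pattern: the list is mutated while it is being iterated.
    return [fold_into(use, uses) for use in uses]


def fold_after_collecting(uses):
    # Fixed pattern: snapshot the uses into a separate container first.
    return [fold_into(use, uses) for use in list(uses)]


print(fold_while_iterating(["u0", "u1", "u2", "u3"]))
print(fold_after_collecting(["u0", "u1", "u2", "u3"]))
```

Only the second function visits every use.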

Testing: make check-llvm
SHOC performance test.

Reviewers: rampitec, ronlieb

Differential Revision: https://reviews.llvm.org/D56161

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350350 91177308-0d34-0410-b5e6-96231b3b80d8

[UnrollRuntime] Move the DomTree verification under expensive checks

Suggested by Hal as done in r349871.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350349 91177308-0d34-0410-b5e6-96231b3b80d8

Remove unused %host_cc lit pattern

It was added in r257236 but then the one use was removed in r309517. Since no
test should call %host_cc, remove the pattern.

Differential Revision: https://reviews.llvm.org/D56200

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350348 91177308-0d34-0410-b5e6-96231b3b80d8

Reflow module.modulemap for readability

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350347 91177308-0d34-0410-b5e6-96231b3b80d8

Unbreak the modules build by splitting Target out into its own top-level module

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350346 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "Resubmit rL345008 "Split MachinePipeliner code into header and cpp files""

This reverts commit r350290.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350345 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[MachinePipeliner] Add missing header file to MachinePipeliner.h"

This reverts commit r350296.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350344 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] Fix buildbots on older compilers

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350343 91177308-0d34-0410-b5e6-96231b3b80d8

[MCStreamer] Use report_fatal_error in EmitRawTextImpl

Use report_fatal_error in MCStreamer::EmitRawTextImpl instead of
using errs() and explain the rationale behind it not being
llvm_unreachable() to save confusion for any future maintainers.

Differential Revision: https://reviews.llvm.org/D56245

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350342 91177308-0d34-0410-b5e6-96231b3b80d8

[elfabi] Introduce tool for ELF TextAPI

Follow up for D53051

This patch introduces the tool associated with the ELF implementation of
TextAPI (previously llvm-tapi, renamed for better distinction). This
tool will house a number of features related to analysis and
manipulation of shared object's exposed interfaces. The first major
feature for this tool is support for producing binary stubs that are
useful for compile-time linking of shared objects. This patch introduces
beginnings of support for reading binary ELF objects to work towards
that goal.

Added:

- elfabi tool.
- support for reading architecture from a binary ELF file into an
ELFStub.
- Support for writing .tbe files.

Differential Revision: https://reviews.llvm.org/D55352

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350341 91177308-0d34-0410-b5e6-96231b3b80d8

Rename TapiTests to TextAPITests

This makes the target name consistent with how all the other unit tests are
named.

Differential Revision: https://reviews.llvm.org/D56216

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350339 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] add tests for buildvector with extracted element; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350338 91177308-0d34-0410-b5e6-96231b3b80d8

Fix typos in comments

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350337 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy][ELF] Implement a mutable section visitor that updates size-related fields (Size, EntrySize, Align) before layout.

Summary:
Fix EntrySize, Size, and Align before doing layout calculation.

As a side cleanup, this removes a dependence on sizeof(Elf_Sym) within BinaryReader, so we can untemplatize that.

This unblocks a cleaner implementation of handling the -O<format> flag. See D53667 for a previous attempt. Actual implementation of the -O<format> flag will come in an upcoming commit, this is largely a NFC (although not _totally_ one, because alignment on binary input was actually wrong before).

Reviewers: jakehehrlich, jhenderson, alexshap, espindola

Reviewed By: jhenderson

Subscribers: emaste, arichardson, llvm-commits

Differential Revision: https://reviews.llvm.org/D56211

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350336 91177308-0d34-0410-b5e6-96231b3b80d8

[UnrollRuntime] Add DomTree verification under debug mode

NFC: This adds the dom tree verification under debug mode at a point
just before we start unrolling the loop. This allows us to verify dom
tree at a state where it is much smaller and before the unrolling
actually happens.
This also implies we do not need to run -verify-dom-info every time to
see if the DT is in a valid state when we transform the loop for runtime
unrolling.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350334 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Add new scheduling predicates

Add new scheduling predicates to identify the ASIMD loads and stores using the post indexed addressing mode.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350332 91177308-0d34-0410-b5e6-96231b3b80d8

Python compat - no explicit reference to Python version

Update documentation and shebang.

Differential Revision: https://reviews.llvm.org/D56252

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350327 91177308-0d34-0410-b5e6-96231b3b80d8

Python compat - iterator protocol

In Python2 next() is used while it's __next__ in Python3.
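
One common way to support both names is to define `__next__` and alias `next` to it. The sketch below assumes that approach; the `Countdown` class is hypothetical and is not taken from the patch.

```python
class Countdown(object):
    """Iterator that works under both the Python 2 and Python 3 protocols."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        # Python 3 looks up __next__ when advancing the iterator.
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1

    # Python 2 looks up next instead, so expose the same method under both names.
    next = __next__


print(list(Countdown(3)))
```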

Differential Revision: https://reviews.llvm.org/D56250

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350326 91177308-0d34-0410-b5e6-96231b3b80d8

[CostModel][X86] Add truncate cost tests to cover all legal destination types

We were only testing costs for legal source vector element counts

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350323 91177308-0d34-0410-b5e6-96231b3b80d8

[MCA] Improve code comment and reuse a helper function in ResourceManager. NFCI

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350322 91177308-0d34-0410-b5e6-96231b3b80d8

[RISCV][MC] Accept %lo and %pcrel_lo on operands to li

This matches GNU assembler behaviour.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350321 91177308-0d34-0410-b5e6-96231b3b80d8

Python compat - decode/encode string

Differential Revision: https://reviews.llvm.org/D56258

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350320 91177308-0d34-0410-b5e6-96231b3b80d8

Python compat - test if type is integral

Rely on numbers.Integral instead of int/long
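
A minimal sketch of such a check follows. `numbers.Integral` covers `int` on both versions and `long` on Python 2; the helper name `is_integral` is hypothetical.

```python
import numbers


def is_integral(value):
    # True for int (and long on Python 2); note that bool also counts as Integral.
    return isinstance(value, numbers.Integral)


print(is_integral(2 ** 70), is_integral(3.0))
```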

Differential Revision: https://reviews.llvm.org/D56262

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350316 91177308-0d34-0410-b5e6-96231b3b80d8

Python compat - urllib

Differential Revision: https://reviews.llvm.org/D56261

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350315 91177308-0d34-0410-b5e6-96231b3b80d8

Python compat - has_key vs. in operator

Use portable `in` operator instead of `has_key(...)` method.
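
For illustration, with a hypothetical `options` dictionary:

```python
options = {"opt": "-O2"}

# Python 2 only: options.has_key("opt")
# Portable spelling that works on both Python 2 and Python 3:
has_opt = "opt" in options
has_debug = "debug" in options

print(has_opt, has_debug)
```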

Differential Revision: https://reviews.llvm.org/D56260

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350314 91177308-0d34-0410-b5e6-96231b3b80d8

Python compat - map/filter

Differential Revision: https://reviews.llvm.org/D56259

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350313 91177308-0d34-0410-b5e6-96231b3b80d8

Python compat - iteritems() vs. items()

Always use `items()` and introduce extra `list(...)` call when needed.
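
A sketch of when the extra `list(...)` matters, assuming the usual reason (the dictionary is modified while it is iterated). The function names are hypothetical.

```python
def drop_empty(counts):
    # list(...) takes a snapshot, so deleting keys inside the loop is safe
    # on Python 3, where items() returns a live view of the dictionary.
    for name, count in list(counts.items()):
        if count == 0:
            del counts[name]
    return counts


def total(counts):
    # Read-only iteration needs no snapshot.
    return sum(count for _, count in counts.items())


print(drop_empty({"a": 1, "b": 0, "c": 2}))
```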

Differential Revision: https://reviews.llvm.org/D56257

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350312 91177308-0d34-0410-b5e6-96231b3b80d8

Python compat - portable way of raising exceptions

Differential Revision: https://reviews.llvm.org/D56256

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350311 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Remove unused Python import

Differential Revision: https://reviews.llvm.org/D56254

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350310 91177308-0d34-0410-b5e6-96231b3b80d8

Python compat - range vs. xrange

Use range instead of xrange whenever possible. The extra list creation in Python2
is generally not a performance bottleneck.
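
For example, a loop written with `range` runs unchanged on both versions; on Python 2 it builds a temporary list, on Python 3 it iterates lazily. The `squares` helper is hypothetical.

```python
def squares(n):
    # range works on both versions; xrange exists only on Python 2.
    return [i * i for i in range(n)]


print(squares(4))
```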

Differential Revision: https://reviews.llvm.org/D56253

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350309 91177308-0d34-0410-b5e6-96231b3b80d8

Python compat - assertRaisesRegex

Python3 uses assertRaisesRegex instead of assertRaisesRegexp.
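
One possible way to bridge the two spellings is to alias the new name onto the old method when it is missing. This is a sketch of that idea only; whether the patch uses this approach is an assumption, and `ParseTest` and `run_parse_test` are hypothetical.

```python
import unittest

# Python 2.7 only provides assertRaisesRegexp; add the Python 3 name if absent.
if not hasattr(unittest.TestCase, "assertRaisesRegex"):
    unittest.TestCase.assertRaisesRegex = unittest.TestCase.assertRaisesRegexp


class ParseTest(unittest.TestCase):
    def test_bad_literal(self):
        # int() raises ValueError with a message containing "invalid literal".
        with self.assertRaisesRegex(ValueError, "invalid literal"):
            int("not a number")


def run_parse_test():
    # Run the test case quietly and return the collected result object.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ParseTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```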

Differential Revision: https://reviews.llvm.org/D56251

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350308 91177308-0d34-0410-b5e6-96231b3b80d8

Python compat - print statement

Make sure all print statements are compatible with Python 2 and Python3 using
the `from __future__ import print_function` statement.

Differential Revision: https://reviews.llvm.org/D56249

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350307 91177308-0d34-0410-b5e6-96231b3b80d8

[NewPM] Port Msan

Summary:
Keeping msan a function pass requires replacing the module level initialization:
That means, don't define a ctor function which calls __msan_init, instead just
declare the init function at the first access, and add that to the global ctors
list.

Changes:
- Pull the actual sanitizer and the wrapper pass apart.
- Add a newpm msan pass. The function pass inserts calls to runtime
library functions, for which it inserts declarations as necessary.
- Update tests.

Caveats:
- There is one test that I dropped, because it specifically tested the
definition of the ctor.

Reviewers: chandlerc, fedor.sergeev, leonardchan, vitalybuka

Subscribers: sdardis, nemanjai, javed.absar, hiraditya, kbarton, bollu, atanasyan, jsji

Differential Revision: https://reviews.llvm.org/D55647

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350305 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Fix missing testfile change of rL350299

This file was missing from the patch

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350302 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Cleanup saturated add/sub tests
Use X86/X64 check prefixes
Use nounwind to reduce cfi noise

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350301 91177308-0d34-0410-b5e6-96231b3b80d8

[SLPVectorizer] Flag ADD/SUB SSAT/USAT intrinsics trivially vectorizable (PR40123)

Enables SLP vectorization for the SSE2 PADDS/PADDUS/PSUBS/PSUBUS style intrinsics

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350300 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Add command-line option for SB

SB (Speculative Barrier) is only mandatory from 8.5
onwards but is optional from Armv8.0-A. This patch adds a command
line option to enable SB, as it was previously only possible to
enable by selecting -march=armv8.5-a.

This patch also renames FeatureSpecRestrict to FeatureSB.

Reviewed By: olista01, LukeCheeseman

Differential Revision: https://reviews.llvm.org/D55990

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350299 91177308-0d34-0410-b5e6-96231b3b80d8

[SLPVectorizer][X86] Add ADD/SUB SSAT/USAT tests (PR40123)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350297 91177308-0d34-0410-b5e6-96231b3b80d8

[MachinePipeliner] Add missing header file to MachinePipeliner.h

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350296 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add ADD/SUB SSAT/USAT vector costs (PR40123)

Costs for real SSE2 instructions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350295 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add ADD/SUB SSAT/USAT cost tests (PR40123)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350293 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Change section name with metadata access

Summary:
The commit rL348922 introduced a means to set Metadata
section kind for a global variable, if its explicit section
name was prefixed with ".AMDGPU.metadata.".

This patch changes that prefix to ".AMDGPU.comment.",
as "metadata" in the section name might lead to
ambiguity with metadata used by AMD PAL runtime.

Change-Id: Idd4748800d6fe801441d91595fc21e5a4171e668

Reviewers: kzhuravl

Reviewed By: kzhuravl

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D56197

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350292 91177308-0d34-0410-b5e6-96231b3b80d8

Resubmit rL345008 "Split MachinePipeliner code into header and cpp files"

The commit caused unclear failures in http://green.lab.llvm.org/green//job/lldb-cmake/
will revert if the error reappears

Differential Revision: https://reviews.llvm.org/D56084

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350290 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeGen] Skip over dbg-instr in twoaddr pass

A DBG_VALUE between a two-address instruction and a following COPY
would prevent rescheduleMIBelowKill optimization inside
TwoAddressInstructionPass.

Differential Revision: https://reviews.llvm.org/D55987

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350289 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-readobj] [COFF] Print the symbol index for relocations

There can be multiple local symbols with the same name (for e.g.
comdat sections), and thus the symbol name itself isn't enough
to disambiguate symbols.

Differential Revision: https://reviews.llvm.org/D56140

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350288 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add test cases for opportunities to use KTEST when checking if the result of ANDing two mask registers is zero.

The test cases are constructed to avoid folding the AND into a masked compare operation.

Currently we emit a KAND and a KORTEST for these cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350287 91177308-0d34-0410-b5e6-96231b3b80d8

Don't go over 80 chars in MCStreamer.cpp. NFC.

Fixing up style issues around the area to prepare for
a larger differential.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350286 91177308-0d34-0410-b5e6-96231b3b80d8

[Power9] Enable the Out-of-Order scheduling model for P9 hw

When switched to the MI scheduler for P9, the hardware is modeled as out of order.
However, inside the MI Scheduler algorithm, we still use the in-order scheduling model
as the MicroOpBufferSize isn't set. The MI scheduler takes this to mean that the hw cannot buffer
the op, so a pending instruction can be scheduled only once all the available instructions have
been issued. That is not true for our P9 hw.

This patch enables the Out-of-Order scheduling model. The buffer size 44 is
picked from the P9 hw spec, and the perf tests indicate that this value won't hurt cpu2017.

With this patch, 3 benchmarks improve by over 3% and 1 benchmark degrades by over 3%. The details are as follows:

x264_r: +6.95%
cactuBSSN_r: +6.94%
lbm_r: +4.11%
xz_r: -3.85%

And the GEOMEAN for all the C/C++ spec in spec2017 is about 0.18% improved.

Reviewer: Nemanjai
Differential Revision: https://reviews.llvm.org/D55810

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350285 91177308-0d34-0410-b5e6-96231b3b80d8

Teach ObjCARC optimizer about equivalent PHIs when eliminating autoreleaseRV/retainRV pairs

OptimizeAutoreleaseRVCall skips optimizing llvm.objc.autoreleaseReturnValue if it
sees a user which is llvm.objc.retainAutoreleasedReturnValue, and if they have
equivalent arguments (either identical or equivalent PHIs). It then assumes that
ObjCARCOpt::OptimizeRetainRVCall will optimize the pair instead.

Trouble is, ObjCARCOpt::OptimizeRetainRVCall doesn't know about equivalent PHIs
so optimizes in a different way and we are left with an unoptimized llvm.objc.autoreleaseReturnValue.

This teaches ObjCARCOpt::OptimizeRetainRVCall to also understand PHI equivalence.

rdar://problem/47005143

Reviewed By: ahatanak

Differential Revision: https://reviews.llvm.org/D56235

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350284 91177308-0d34-0410-b5e6-96231b3b80d8

Fix MSVC visualizer for PointerUnion4

Calculate which item is being held and then display it with the appropriate type. We also
optimize the display of PointerUnion3 to take advantage of our knowing that the IntMask is
always 1 in PointerUnion types

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350280 91177308-0d34-0410-b5e6-96231b3b80d8

[LLVM-C] Expand LLVMRelocMode

Summary: Add read[only|write] PIC relocation models to the C API and teach the TargetMachine API about it.

Reviewers: whitequark, deadalnix

Reviewed By: whitequark

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D56187

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350279 91177308-0d34-0410-b5e6-96231b3b80d8

[tblgen][disasm] Emit record names again when decoder conflicts occur.

And add a test for it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350277 91177308-0d34-0410-b5e6-96231b3b80d8

[gold] emit assembly listing from gold plugin on LTO stage

Summary:
Sometimes it's useful to emit assembly after LTO stage to modify it manually. Emitting precodegen bitcode file (via save-temps plugin option) and then feeding it to llc doesn't always give the same binary as original.
This patch is simpler alternative to https://reviews.llvm.org/D24020.

Patch by Denis Bakhvalov.

Reviewers: mehdi_amini, tejohnson

Reviewed By: tejohnson

Subscribers: MaskRay, inglorion, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D56114

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350276 91177308-0d34-0410-b5e6-96231b3b80d8

MSVC Visualizer for PointerUnion3

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350275 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add load folding support to the custom isel we do for X86ISD::UMUL/SMUL.

The peephole pass isn't always able to fold the load because it can't commute the implicit usage of AL/AX/EAX/RAX.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350272 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add test cases to show that we fail to fold loads into i8 smulo and i8/i16/i32/i64 umulo lowering without the assistance of the peephole pass. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350271 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] made assembler parse block_type

Summary:
This was previously ignored and an incorrect value generated.

Also fixed Disassembler's handling of block_type.

Reviewers: dschuff, aheejin

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D56092

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350270 91177308-0d34-0410-b5e6-96231b3b80d8

[ThinLTO] Scan all variants of vague symbol for reachability.

Summary:
An alias can make one variant (but not all) live; we still need to scan all the others if this symbol is reachable
from somewhere else.

Reviewers: tejohnson, grimar

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D56117

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350269 91177308-0d34-0410-b5e6-96231b3b80d8

[BDCE] Fix typo in test; NFC

shl by 32 is undefined. This was intended to be a shl by 31 as part
of a rotate sequence.
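
The test's exact IR is not shown here. Assuming a 32-bit rotate right by one, the sequence pairs a shift right by 1 with a shift left by 31, which can be sketched as follows (`rotr32_by_1` is a hypothetical name):

```python
MASK32 = 0xFFFFFFFF


def rotr32_by_1(x):
    # Rotate right by one: the low bit wraps around to bit 31 via a shift left
    # by 31. The two shift amounts sum to the bit width, so neither reaches 32.
    return ((x >> 1) | (x << 31)) & MASK32


print(hex(rotr32_by_1(3)))
```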

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350265 91177308-0d34-0410-b5e6-96231b3b80d8

Fix assert in ObjCARC optimizer when deleting retainBlock of null or undef.

The caller to EraseInstruction had this conditional:

// ARC calls with null are no-ops. Delete them.
if (IsNullOrUndef(Arg))

but the assert inside EraseInstruction only allowed ConstantPointerNull and not
undef or bitcasts.

This adds support for both of these cases.

rdar://problem/47003805

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350261 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly][NFC] Elaborate on simd-noopt test comment

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350260 91177308-0d34-0410-b5e6-96231b3b80d8

[BDCE] Remove instructions without demanded bits

If an instruction has no demanded bits, remove it directly during BDCE,
instead of leaving it for something else to clean up.

Differential Revision: https://reviews.llvm.org/D56185

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350257 91177308-0d34-0410-b5e6-96231b3b80d8

Git ignore CLion project configuration files. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350256 91177308-0d34-0410-b5e6-96231b3b80d8

Format AggressiveInstCombine.cpp. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350255 91177308-0d34-0410-b5e6-96231b3b80d8

Fix MSVC PointerUnion visualizer

Differential Revision: https://reviews.llvm.org/D56186

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350250 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove X86ISD::INC/DEC. Just select them from X86ISD::ADD/SUB at isel time

INC/DEC are pretty much the same as ADD/SUB except that they don't update the C flag.

This patch removes the special nodes and just pattern matches from ADD/SUB during isel if the C flag isn't being used.

I had to avoid selecting DEC if the result isn't used. This will become a SUB immediate, which will be turned into a CMP later by optimizeCompareInstr. This led to the one test change where we use a CMP instead of a DEC for an overflow intrinsic since we only checked the flag.

This also exposed a hole in our RMW flag matching use of hasNoCarryFlagUses. Our root node for the match is a store and there's no guarantee that all the flag users have been selected yet. So hasNoCarryFlagUses needs to check copyToReg and machine opcodes, but it also needs to check for the pre-match SETCC, SETCC_CARRY, BRCOND, and CMOV opcodes.

Differential Revision: https://reviews.llvm.org/D55975

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350245 91177308-0d34-0410-b5e6-96231b3b80d8

[MS Demangler] Add a flag for dumping types without tag specifier.

Sometimes it's useful to be able to output demangled names without
tag specifiers like "struct", "class", etc. This patch adds a
flag enabling this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350241 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] After performing the division by constant optimization for a DIV or REM node, replace the users of the corresponding REM or DIV node if it exists.

Currently we expand the two nodes separately. This gives DAG combiner an opportunity to optimize the expanded sequence taking into account only one set of users. When we expand the other node we'll create the expansion again, but might not be able to optimize it the same way. So the nodes won't CSE and we'll have two similarish sequences in the same basic block. By expanding both nodes at the same time we'll avoid prematurely optimizing the expansion until both the division and remainder have been replaced.

Improves the test case from PR38217. There may be additional opportunities after this.

Differential Revision: https://reviews.llvm.org/D56145

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350239 91177308-0d34-0410-b5e6-96231b3b80d8

[gn build] Add fuzzers in llvm/tools that are needed for check-llvm

Also add a fuzzer() template for defining fuzzers that's similar to
add_llvm_fuzzer in the CMake build, and a build file for dependency
llvm/lib/FuzzMutate.

Also make `assert(defined(...` error strings a bit more self-consistent.

Differential Revision: https://reviews.llvm.org/D56194

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350238 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Adding full coverage of MC encoding for the XOP and LWP ISAs.

Adding MC regression tests to cover the XOP isa set.
This patch is part of a larger task to cover MC encoding of all X86 isa sets started in revision: https://reviews.llvm.org/D39952

Differential Revision: https://reviews.llvm.org/D41392

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350237 91177308-0d34-0410-b5e6-96231b3b80d8

[LegalizeIntegerTypes] When promoting the result of an extract_vector_elt also promote the input type if necessary

By also promoting the input type we get a better idea for what scalar type to use. This can provide better results if the result of the extract is sign extended. What was previously happening is that the extract result would be legalized, sometime later the input of the sign extend would be legalized using the result of the extract. Then later the extract input would be legalized forcing a truncate into the input of the sign extend using a replace all uses. This requires DAG combine to combine out the sext/truncate pair. But sometimes we visited the truncate first and messed things up before the sext could be combined.

By creating the extract with the correct scalar type when we legalize the result type, the truncate will be added right away. Then when the sign_extend input is legalized it will create an any_extend of the truncate which can be optimized by getNode to maybe remove the truncate. And then a sign_extend_inreg. Now DAG combine doesn't have to worry about getting rid of the extend.

This fixes the regression on X86 in D56156.

Differential Revision: https://reviews.llvm.org/D56176

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350236 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner][X86][PowerPC] Teach visitSIGN_EXTEND_INREG to fold (sext_in_reg (aext/sext x)) -> (sext x) when x has more than 1 sign bit and the sext_inreg is from one of them.

If x has multiple sign bits then it doesn't matter which one we extend from so we can sext from x's msb instead.
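
A small bit-level model can make this concrete. It is a sketch under stated assumptions: values are plain Python integers, the widths (i8 widened to i16, sext_in_reg from i5) are chosen for illustration, and `sign_extend` and `any_extend` are hypothetical helpers rather than LLVM APIs.

```python
def sign_extend(value, from_bits, to_bits):
    # Treat the low from_bits of value as signed; return a to_bits-wide pattern.
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):
        value -= 1 << from_bits
    return value & ((1 << to_bits) - 1)


def any_extend(value, from_bits, garbage):
    # aext leaves the high bits unspecified; model them with caller-chosen garbage.
    return (garbage << from_bits) | (value & ((1 << from_bits) - 1))


x = 0xF3  # i8 1111_0011: bits 7..4 are equal, so x has 4 sign bits
folded = sign_extend(x, 8, 16)                          # (sext x)
unfolded = sign_extend(any_extend(x, 8, 0xAB), 5, 16)   # (sext_in_reg (aext x), i5)
assert folded == unfolded == 0xFFF3
```

With x = 0x13 (0001_0011), bit 4 is not one of the sign bits, and the two forms disagree, which is why the fold is guarded by the sign-bit count.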

The X86 setcc-combine.ll changes are a little weird. It appears we ended up with a (sext_inreg (aext (trunc (extractelt)))) after type legalization. The sext_inreg+aext now gets optimized by this combine to leave (sext (trunc (extractelt))). Then we visit the trunc before we visit the sext. This ends up changing the truncate to an extractvectorelt from a bitcasted vector. I have a follow up patch to fix this.

Differential Revision: https://reviews.llvm.org/D56156

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350235 91177308-0d34-0410-b5e6-96231b3b80d8

[gn build] Add build files for bugpoint-passes and LLVMHello plugins

These two plugins are loaded into a host process that contains all LLVM
symbols, so they don't link against anything. This required minor readjustments
to the tablegen() setup of IR.

Needed for check-llvm.

Differential Revision: https://reviews.llvm.org/D56204

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350234 91177308-0d34-0410-b5e6-96231b3b80d8

[gn build] Add some llvm/tools: lli, lli-child-target

Also add build files for dependencies llvm/lib/ExecutionEngine/{Interpreter,Orc}

Needed for check-llvm.

Differential Revision: https://reviews.llvm.org/D56193

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350226 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Remove SeenUse check when optimizing conditional branch in
PPCPreEmitPeephole pass.

PPCPreEmitPeephole will convert a BC to B when the conditional branch is
based on a constant CR by CRSET or CRUNSET. This is added in
https://reviews.llvm.org/rL343100.

When the conditional branch is known to be always taken, all branches will
be removed and a new unconditional branch will be inserted. However, when
SeenUse is false the original patch will not remove the branches, but still
insert the new unconditional branch, update the successors and create
inconsistent IR. Compiling the synthetic testcase included can show the
problem we run into.

The patch simply removes the SeenUse condition when adding branches into
InstrsToErase set.

Differential Revision: https://reviews.llvm.org/D56041

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350223 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Support SHLD/SHRD masked shift-counts (PR34641)

Peek through shift modulo masks while matching double shift patterns.
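
As an illustration only, here is one plausible source-level shape of a double shift whose count is masked modulo the bit width. The function name and the exact expression are assumptions, not taken from the patch or its tests. The low half is shifted in two steps so that the C++ stays well defined when the masked count is zero.

```cpp
#include <cassert>
#include <cstdint>

// Shift Hi left by (Amt mod 32), filling the vacated low bits from the top
// of Lo. This is the operation a 32-bit SHLD performs.
uint32_t shiftLeftDouble32(uint32_t Hi, uint32_t Lo, uint32_t Amt) {
  const uint32_t Masked = Amt & 31; // the "shift modulo mask"
  assert(Masked < 32 && "masked count is always in range");
  // (Lo >> 1) >> (31 - Masked) equals Lo >> (32 - Masked) for Masked > 0
  // and yields 0 for Masked == 0, avoiding an out-of-range shift by 32.
  return (Hi << Masked) | ((Lo >> 1) >> (31 - Masked));
}
```

With this shape, a count of 40 behaves like a count of 8, and a count of 0 or 32 returns Hi unchanged.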

I was hoping to delay this until I could replace the X86 code with generic funnel shift matching (PR40081), but this will do for now.

Differential Revision: https://reviews.llvm.org/D56199

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350222 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] add more tests for potential horizontal ops; NFC

As discussed in D56011 - add runs for AVX512 and tests with extra uses.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350221 91177308-0d34-0410-b5e6-96231b3b80d8

[BasicAA] Support arbitrary pointer sizes (and fix an overflow bug)

Motivated by the discussion in D38499, this patch updates BasicAA to support
arbitrary pointer sizes by switching most remaining non-APInt calculations to
use APInt. The size of these APInts is set to the maximum pointer size (maximum
over all address spaces described by the data layout string).

Most of this translation is straightforward, but this patch contains a fix for
a bug that revealed itself during this translation process. In order for
test/Analysis/BasicAA/gep-and-alias.ll to pass, which is run with 32-bit
pointers, the intermediate calculations must be performed using 64-bit
integers. This is because, as noted in the patch, when GetLinearExpression
decomposes an expression into C1*V+C2, and we then multiply this by Scale, and
distribute, to get (C1*Scale)*V + C2*Scale, it can be the case that, even
though C1*V+C2 does not overflow for relevant values of V, (C2*Scale) can
overflow. If this happens, later logic will draw invalid conclusions from the
(base) offset value. Thus, when initially applying the APInt conversion,
because the maximum pointer size in this test is 32 bits, it started failing.
Suspicious, I created a 64-bit version of this test (included here), and that
failed (miscompiled) on trunk for a similar reason (the multiplication can
overflow).
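
The overflow described above can be reproduced with ordinary integers. This is a minimal sketch, not BasicAA code: the helper names and the sample values of C2 and Scale are assumptions chosen to make the wraparound visible.

```cpp
#include <cassert>
#include <cstdint>

// C2 * Scale computed in 32 bits with wraparound. The multiply is done on
// unsigned values so the wrap is well defined in C++.
int32_t mulWrap32(int32_t C2, int32_t Scale) {
  return static_cast<int32_t>(static_cast<uint32_t>(C2) *
                              static_cast<uint32_t>(Scale));
}

// The same product computed with 64-bit intermediates.
int64_t mulWide64(int32_t C2, int32_t Scale) {
  return static_cast<int64_t>(C2) * static_cast<int64_t>(Scale);
}

// True when the 32-bit product no longer matches the exact value.
bool overflowsIn32Bits(int32_t C2, int32_t Scale) {
  return mulWide64(C2, Scale) != static_cast<int64_t>(mulWrap32(C2, Scale));
}
```

With C2 = 2^30 and Scale = 4, the 32-bit product wraps to 0 while the 64-bit product is 2^32. Any logic that reasons from that base offset of 0 would reach a wrong conclusion, which is the kind of failure the message describes.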

After fixing this overflow bug, the first test case (at least) in
Analysis/BasicAA/q.bad.ll started failing. This is also a 32-bit test, and was
relying on having 64-bit intermediate values to have BasicAA return an accurate
result. In order to fix this problem, and because I believe that it is not
uncommon to use i64 indexing expressions in 32-bit code (especially portable
code using int64_t), it seems reasonable to always use at least 64-bit
integers. In this way, we won't regress our analysis capabilities (and there's
a command-line option added, so experimenting with this should be easy).

As pointed out by Eli during the review, there are other potential overflow
conditions that this patch does not address. Fixing those is left to follow-up
work.

Patch by me with contributions from Michael Ferguson (mferguson@cray.com).

Differential Revision: https://reviews.llvm.org/D38662

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350220 91177308-0d34-0410-b5e6-96231b3b80d8

Extend Module::getOrInsertGlobal to control the construction of the
GlobalVariable

Summary:
Extend Module::getOrInsertGlobal to accept a callback for creating a new
GlobalVariable if necessary instead of calling the GV constructor
directly using default arguments. Additionally overload
getOrInsertGlobal for the previous default behavior.
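
The shape of this change can be sketched with a toy, standalone model. `ToyModule` and `ToyGlobal` are hypothetical stand-ins, not LLVM classes, and the signatures below are assumptions made for illustration rather than the ones added by the patch.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Toy stand-in for a global variable.
struct ToyGlobal {
  std::string Name;
  bool IsConstant;
};

// Toy stand-in for a module that owns its globals by name.
class ToyModule {
  std::map<std::string, std::unique_ptr<ToyGlobal>> Globals;

public:
  // Returns the existing global, or calls Create to build a new one.
  // Create runs only when no global with that name exists yet.
  ToyGlobal *getOrInsertGlobal(
      const std::string &Name,
      const std::function<std::unique_ptr<ToyGlobal>()> &Create) {
    auto It = Globals.find(Name);
    if (It != Globals.end())
      return It->second.get();
    std::unique_ptr<ToyGlobal> New = Create();
    assert(New && "callback must return a global");
    ToyGlobal *Raw = New.get();
    Globals[Name] = std::move(New);
    return Raw;
  }

  // Overload that keeps the previous behavior: default construction.
  ToyGlobal *getOrInsertGlobal(const std::string &Name) {
    return getOrInsertGlobal(Name, [&Name] {
      return std::unique_ptr<ToyGlobal>(new ToyGlobal{Name, false});
    });
  }
};
```

The point of the callback form is that the caller decides how a missing global is constructed, while lookups of an existing name never invoke the callback.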

Reviewers: chandlerc

Subscribers: hiraditya, llvm-commits, bollu

Differential Revision: https://reviews.llvm.org/D56130

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350219 91177308-0d34-0410-b5e6-96231b3b80d8

[MCA] Minor refactoring of method DefaultResourceStrategy::select. NFCI

Common code used by the default resource strategy to select pipeline resources
has been moved to a helper function.

The new selection logic has been slightly rewritten to get rid of a redundant
zero check on the `ReadyMask` value. Before this patch, method select internally
called function `PowerOf2Floor` to compute the next ready pipeline resource.
However, `PowerOf2Floor` forces an implicit (redundant) zero check on the input
value. By construction, `ReadyMask` can never be zero. This patch replaces the
call to `PowerOf2Floor` with an equivalent block of code which avoids the
redundant zero check. This gives a minor 3-3.5% speedup on a release build.
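
The difference can be sketched as follows. This is an illustration, not the code from the patch: both function names are hypothetical, and `__builtin_clzll` is a GCC/Clang builtin used here as an assumption in place of LLVM's own bit utilities.

```cpp
#include <cassert>
#include <cstdint>

// Highest set bit of Mask, with an explicit zero check, in the spirit of a
// PowerOf2Floor-style helper that must tolerate a zero input.
uint64_t highestBitChecked(uint64_t Mask) {
  if (Mask == 0)
    return 0;
  return 1ULL << (63 - __builtin_clzll(Mask));
}

// Highest set bit of a mask that is nonzero by construction. The zero
// check is dropped; the assert only documents the precondition, since
// __builtin_clzll is undefined for a zero argument.
uint64_t highestReadyUnit(uint64_t ReadyMask) {
  assert(ReadyMask != 0 && "ReadyMask can never be zero");
  return 1ULL << (63 - __builtin_clzll(ReadyMask));
}
```

For any nonzero mask the two functions agree, so dropping the check changes no results; it only removes a branch from the selection path.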

No functional change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350218 91177308-0d34-0410-b5e6-96231b3b80d8

[gn build] Add some llvm/tools: bugpoint, dsymutil, llvm-opt-report

Also add build file for dependency llvm/lib/OptRemarks.

Needed for check-llvm.

Differential Revision: https://reviews.llvm.org/D56192

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350217 91177308-0d34-0410-b5e6-96231b3b80d8

[gn build] Add some llvm/tools: llvm-c-test, llvm-cfi-verify, llvm-cov, llvm-cvtres

Needed for check-llvm.

Differential Revision: https://reviews.llvm.org/D56191

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350216 91177308-0d34-0410-b5e6-96231b3b80d8

[gn build] Add some llvm/tools: llvm-cxxdump, llvm-cxxfilt, llvm-cxxmap

Needed for check-llvm.

This is the last target reading llvm_install_binutils_symlinks.

Differential Revision: https://reviews.llvm.org/D56190

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350215 91177308-0d34-0410-b5e6-96231b3b80d8

[gn build] Add some llvm/tools: llvm-diff, llvm-dwp

Needed for check-llvm.

Differential Revision: https://reviews.llvm.org/D56189

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350214 91177308-0d34-0410-b5e6-96231b3b80d8

[gn build] Add some llvm/tools: llvm-mca, llvm-mt

Also add build file for dependency llvm/lib/MCA.

Needed for check-llvm.

Differential Revision: https://reviews.llvm.org/D56166

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350213 91177308-0d34-0410-b5e6-96231b3b80d8

[gn build] Add some llvm/tools: llvm-size, llvm-split, llvm-strings

Needed for check-llvm.

Differential Revision: https://reviews.llvm.org/D56164

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350212 91177308-0d34-0410-b5e6-96231b3b80d8

[gn build] Add some llvm/tools: llvm-xray, sancov, sanstats, verify-uselistorder, yaml-bench

Also add build file for dependency llvm/lib/XRay.

Needed for check-llvm.

(yaml-bench is an llvm/util, not an llvm/tool.)

Differential Revision: https://reviews.llvm.org/D56163

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350211 91177308-0d34-0410-b5e6-96231b3b80d8