Bill Wendling [Tue, 8 Oct 2019 04:39:52 +0000 (04:39 +0000)]
[IA] Recognize hexadecimal escape sequences
Summary:
Implement support for hexadecimal escape sequences to match how GNU 'as'
handles them. I.e., read all hexadecimal characters and truncate to the
lower 16 bits.
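A minimal standalone sketch of the described behavior (the helper name and surrounding lexer context are hypothetical, not the patch's code): read every hexadecimal digit following the escape and keep only the low 16 bits, as described above.

#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: decode the digits following a "\x" escape.
// All hex digits are consumed; the accumulated value is then truncated
// to the lower 16 bits, matching the behavior described above.
static uint16_t parseHexEscape(const std::string &S, size_t &Pos) {
  uint32_t Value = 0;
  while (Pos < S.size() &&
         std::isxdigit(static_cast<unsigned char>(S[Pos]))) {
    char C = S[Pos++];
    unsigned Digit =
        std::isdigit(static_cast<unsigned char>(C))
            ? unsigned(C - '0')
            : unsigned(std::tolower(static_cast<unsigned char>(C)) - 'a' + 10);
    Value = Value * 16 + Digit; // wraps harmlessly; only low bits are kept
  }
  return static_cast<uint16_t>(Value); // truncate to the lower 16 bits
}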
Zi Xuan Wu [Tue, 8 Oct 2019 03:28:33 +0000 (03:28 +0000)]
[LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize
In loop-vectorize, the interleave count and vectorization factor depend on the number of target registers. Currently, register
pressure is not estimated separately for each register class (in particular, for scalar types,
float should not be counted in the same class as int), so the estimate is not accurate. Specifically,
this causes excessive interleaving/unrolling, resulting in too many register spills in the loop body and hurting performance.
So we need to classify register classes at the IR level. Importantly, these are abstract register classes,
not the backend target register classes provided in the td files. They are used to establish a mapping between
the types of IR values and the number of simultaneous live ranges to which we'd like to limit some set of those types.
For example, on the POWER target the register count is special when VSX is enabled: there are 32 int scalar registers (GPR)
and 64 float scalar registers (VSR), while int and float vector registers both number 64 (VSR). So there should be 2 kinds of register class when VSX is enabled,
and 3 kinds of register class when VSX is NOT enabled.
On the POWER target this yields a large (~+30%) performance improvement in one specific SPEC2017 benchmark (503.bwaves_r), with no other obvious regressions.
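As a rough illustration (the names below are hypothetical, not the actual TTI interface), the IR-level classification amounts to mapping each value type onto an abstract register-class id whose register count the target then reports:

// Hypothetical abstract register classes; the target decides how many
// physical registers back each one (e.g. on POWER with VSX, scalar float
// and all vectors share the 64-entry VSR pool while ints use 32 GPRs).
enum AbstractRegClass : unsigned { ScalarIntClass, ScalarFPClass, VectorClass };

// Hypothetical classifier: scalar ints, scalar floats and vectors are
// tracked separately so their register pressure is estimated separately.
static unsigned classifyIRType(bool IsVector, bool IsFloat) {
  if (IsVector)
    return VectorClass;
  return IsFloat ? ScalarFPClass : ScalarIntClass;
}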
Andrew Trick [Tue, 8 Oct 2019 01:31:02 +0000 (01:31 +0000)]
[LitConfig] Silenced notes/warnings on quiet.
Lit has a "quiet" option, -q, which is documented to "suppress no
error output". Previously, LitConfig displayed notes and warnings when
the quiet option was specified. As a result, it was not possible to
get only the pertinent file/line information to be used by an editor
to jump to the location where checks were failing without first
wading through a number of unhelpful locations. Here, the
implementations of LitConfig.note and LitConfig.warning are modified
to account for the quiet flag and avoid displaying output if the flag
has indeed been set.
[Attributor][FIX] Remove initialize calls and add undefs
The initialization logic has become part of the Attributor, but the
patches that introduced these calls were still in development when
that transition happened.
We also now clean up (undefine) the macros used to create attributes.
Craig Topper [Mon, 7 Oct 2019 23:03:12 +0000 (23:03 +0000)]
[X86] Shrink zero extends of gather indices from type less than i32 to types larger than i32.
Gather instructions can use i32 or i64 elements for indices. If
the index is zero extended from a type smaller than i32 to i64, we
can shrink the extend to just extend to i32.
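A source-level illustration of the case (not taken from the patch): when the indices start out narrower than 32 bits, widening them to 32 bits is enough for the gather, so extending all the way to 64 bits is wasted work.

#include <cstdint>

// The indices are only 16 bits wide, so zero-extending them to 32 bits
// loses nothing; the gather does not need 64-bit indices here.
void gather(float *dst, const float *src, const uint16_t *idx, int n) {
  for (int i = 0; i < n; ++i)
    dst[i] = src[idx[i]]; // index zero-extends; i32 gather indices suffice
}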
Reid Kleckner [Mon, 7 Oct 2019 22:28:58 +0000 (22:28 +0000)]
[X86] Add new calling convention that guarantees tail call optimization
When the target option GuaranteedTailCallOpt is specified, calls with
the fastcc calling convention will be transformed into tail calls if
they are in tail position. This diff adds a new calling convention,
tailcc, currently supported only on X86, which behaves the same way as
fastcc, except that the GuaranteedTailCallOpt flag does not need to
be enabled in order to enable tail call optimization.
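A hedged sketch of the resulting eligibility check (assuming the usual enum spelling CallingConv::Tail for tailcc; the real backend logic is more involved):

#include "llvm/IR/CallingConv.h"

// Sketch: tailcc always requests guaranteed tail calls; fastcc only does
// so when the GuaranteedTailCallOpt target option is enabled.
static bool guaranteesTailCall(llvm::CallingConv::ID CC,
                               bool GuaranteedTailCallOpt) {
  if (CC == llvm::CallingConv::Tail)
    return true;
  return GuaranteedTailCallOpt && CC == llvm::CallingConv::Fast;
}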
Patch by Dwight Guth <dwight.guth@runtimeverification.com>!
Heejin Ahn [Mon, 7 Oct 2019 22:19:40 +0000 (22:19 +0000)]
[WebAssembly] Fix unwind mismatch stat computation
Summary:
There was a bug when computing the number of unwind destination
mismatches in CFGStackify. When there are many mismatched calls that
share the same (original) destination BB, they have to be counted
separately.
This also fixes a typo and runs `fixUnwindMismatches` only when wasm
exception handling is enabled. This is to prevent unnecessary
computation and does not change behavior.
Heejin Ahn [Mon, 7 Oct 2019 21:14:45 +0000 (21:14 +0000)]
[WebAssembly] Add memory intrinsics handling to mayThrow()
Summary:
Previously, `WebAssembly::mayThrow()` assumed all inputs are global
addresses. But intrinsics such as `memcpy`, `memmove`, or `memset` are
lowered to external symbols during instruction selection and later
emitted as library calls, and these functions don't throw.
This patch adds handling for those memory intrinsics to the `mayThrow`
function. While most libcalls don't throw, we can't guarantee that all
of them don't, so currently we conservatively return true for all
other external symbols.
I think a better way to solve this problem is to embed 'nounwind' info
in `TargetLowering::CallLoweringInfo`, so that we can access the info
from the backend. This will also enable transferring 'nounwind'
properties of LLVM IR instructions. Currently we don't transfer that
info and we can only access properties of callee functions, if the
callees are within the module. Other targets don't need this info in the
backend because they do all the processing before isel, but it will help
us because that info will reduce code size increase in fixing unwind
destination mismatches in CFGStackify.
But for now we return false for these memory intrinsics and true for all
other libcalls conservatively.
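A rough sketch of the external-symbol handling described above (simplified; not the exact operand walk in `mayThrow`):

#include "llvm/ADT/StringRef.h"
#include "llvm/CodeGen/MachineOperand.h"

// Sketch: libcalls produced from the memcpy/memmove/memset intrinsics are
// known not to throw; every other external symbol is conservatively
// treated as potentially throwing.
static bool externalSymbolMayThrow(const llvm::MachineOperand &MO) {
  if (!MO.isSymbol())
    return true;
  llvm::StringRef Name = MO.getSymbolName();
  if (Name == "memcpy" || Name == "memmove" || Name == "memset")
    return false;
  return true; // can't guarantee other libcalls don't throw
}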
[llvm-lipo] Relax the check of the specified input file architecture
cctools lipo compares only the cputypes when verifying that the input
file specified via -arch matches the given architecture.
This diff adjusts the behavior of llvm-lipo accordingly.
Roman Lebedev [Mon, 7 Oct 2019 20:53:27 +0000 (20:53 +0000)]
[InstCombine] Fold conditional sign-extend of high-bit-extract into high-bit-extract-with-signext (PR42389)
This can come up in Bit Stream abstractions.
The pattern looks big/scary, but it can't be simplified any further.
It is only this simple because a number of my preparatory folds have
already happened (shift amount reassociation / shift amount
reassociation in bit test, sign bit test detection).
Highlights:
* There are two main flavors: https://rise4fun.com/Alive/zWi
The difference is add vs. sub, and left-shift of -1 vs. 1
* Since we only change the shift opcode,
we can preserve the exact-ness: https://rise4fun.com/Alive/4u4
* There can be truncation after high-bit-extraction:
https://rise4fun.com/Alive/slHc1 (the main pattern I'm after!)
Which means that we need to ignore zext of shift amounts and of NBits.
* The sign-extending magic can be extended itself (in the add pattern
via sext, in the sub pattern via zext - not the other way around!)
https://rise4fun.com/Alive/NhG
(or those sext/zext can be sunk into `select`!)
Which again means we should pay attention when matching NBits.
* We can have both truncation of extraction and widening of magic:
https://rise4fun.com/Alive/XTw
In other words, I don't believe we need to have any checks on the
bitwidths of any of these constructs.
This is worsened in general by the fact that we may have `sext` instead
of `zext` for shift amounts, and we don't yet canonicalize to `zext`,
although we should. I have not done anything about that here.
Also, we really should have something to weed out `sub` patterns like
these by folding them into the `add` variant.
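For orientation, the source-level shape of the fold (a simplified scalar example, not the exact IR pattern being matched): extracting the top nbits bits with a logical shift and then sign-extending the result is the same as doing the extraction with an arithmetic shift.

#include <cstdint>

// Assumes 1 <= nbits <= 32.
// Before: logical high-bit extract followed by an explicit sign-extension.
int32_t extractHighSigned(uint32_t x, unsigned nbits) {
  uint32_t hi = x >> (32 - nbits);      // high-bit extract (zero-filled)
  uint32_t sign = 1u << (nbits - 1);
  return (int32_t)((hi ^ sign) - sign); // sign-extend the nbits-wide value
}

// After: a single sign-extending high-bit extract.
int32_t extractHighSignedFolded(uint32_t x, unsigned nbits) {
  return (int32_t)x >> (32 - nbits);    // arithmetic shift does both steps
}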
Roman Lebedev [Mon, 7 Oct 2019 20:53:08 +0000 (20:53 +0000)]
[InstCombine] Move isSignBitCheck(), handle rest of the predicates
True, no test coverage is being added here. But those non-canonical
predicates that are already handled here have no test coverage either,
as far as I can tell. I tried to add tests for them, but all the
patterns already get handled elsewhere.
Roman Lebedev [Mon, 7 Oct 2019 20:53:00 +0000 (20:53 +0000)]
[InstCombine][NFC] dropRedundantMaskingOfLeftShiftInput(): change how we deal with mask
Summary:
Currently, we pre-check whether we need to produce a mask or not.
This involves some rather magical constants.
I'd like to extend this fold to also handle the situation
when there's also a `trunc` before the outer shift.
That will require another set of magical constants.
It's ugly.
Instead, we can just compute the mask, and check
whether the mask is a pass-through (all-ones) or not.
This way we don't need to have any magical numbers.
This change is NFC other than the fact that we now compute
the mask and then check if we need (and can!) apply it.
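A condensed sketch of that idea (illustrative only; the magical constants and the surrounding pattern matching are omitted): build the mask the transform would need and test whether it is a no-op.

#include "llvm/ADT/APInt.h"

// Illustrative: for a left shift by ShAmt, only the low (BitWidth - ShAmt)
// bits of the shifted value survive, so that is the mask we would apply.
// If it is all-ones (ShAmt == 0 here), it is a pass-through and no masking
// is needed.
static bool maskIsPassThrough(unsigned BitWidth, unsigned ShAmt) {
  llvm::APInt Mask = llvm::APInt::getLowBitsSet(BitWidth, BitWidth - ShAmt);
  return Mask.isAllOnesValue();
}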
Summary:
When we do `ConstantExpr::getZExt()`, that "extends" `undef` to `0`,
which means that for patterns a/b we'd assume that we must not produce
any bits for that channel, while in reality we simply didn't care
about that channel - i.e. we don't need to mask it.
Matt Arsenault [Mon, 7 Oct 2019 19:16:26 +0000 (19:16 +0000)]
AMDGPU/GlobalISel: Handle more G_INSERT cases
Start manually writing a table to get the subreg index. TableGen
should probably generate this, but I'm not sure what it looks like in
the arbitrary case where subregisters are allowed to not fully cover
the super-registers.
Matt Arsenault [Mon, 7 Oct 2019 18:43:29 +0000 (18:43 +0000)]
GlobalISel: Add target pre-isel instructions
Allows targets to introduce regbankselectable
pseudo-instructions. Currently the closest feature to this is an
intrinsic. However this requires creating a public intrinsic
declaration. This litters the public intrinsic namespace with
operations we don't necessarily want to expose to IR producers, and
would rather leave as private to the backend.
Use a new instruction bit. A previous attempt tried to keep using enum
value ranges, but it turned into a mess.
Jordan Rose [Mon, 7 Oct 2019 18:14:24 +0000 (18:14 +0000)]
Second attempt to add iterator_range::empty()
Doing this makes MSVC complain that `empty(someRange)` could refer to
either C++17's std::empty or LLVM's llvm::empty, which previously we
avoided via SFINAE because std::empty is defined in terms of an empty
member rather than begin and end. So, switch callers over to the new
method as it is added.
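Roughly, the addition looks like this (a trimmed sketch of iterator_range, not the full class):

#include <utility>

template <typename IteratorT> class iterator_range {
  IteratorT begin_iterator, end_iterator;

public:
  iterator_range(IteratorT begin_it, IteratorT end_it)
      : begin_iterator(std::move(begin_it)), end_iterator(std::move(end_it)) {}

  IteratorT begin() const { return begin_iterator; }
  IteratorT end() const { return end_iterator; }

  // The new member: callers write Range.empty() instead of the free
  // function, sidestepping the std::empty/llvm::empty ambiguity on MSVC.
  bool empty() const { return begin_iterator == end_iterator; }
};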
Erich Keane [Mon, 7 Oct 2019 17:28:03 +0000 (17:28 +0000)]
Fix Calling Convention through aliases
r369697 changed the behavior of stripPointerCasts to no longer look
through aliases. However, the code in CGDeclCXX.cpp's createAtExitStub
counted on looking through aliases to properly set the calling
convention of a call.
The result of the change was that the call, now seen as having a
calling convention mismatch, would be replaced with an llvm.trap,
causing a runtime crash.
Florian Hahn [Mon, 7 Oct 2019 17:05:09 +0000 (17:05 +0000)]
[Remarks] Pass StringBlockValue as StringRef.
After changing the remark serialization, we now pass StringRefs to the
serializer. We should use StringRef for StringBlockVal, to avoid
creating temporary objects, which then cause StringBlockVal.Value to
point to invalid memory.
Wei Mi [Mon, 7 Oct 2019 16:12:37 +0000 (16:12 +0000)]
[SampleFDO] Add compression support for any section in ExtBinary profile format
Previously ExtBinary profile format only supports compression using zlib for
profile symbol list. In this patch, we extend the compression support to any
section. User can select some or all of the sections to compress. In an
experiment, for a 45M profile in ExtBinary format, compressing name table
reduced its size to 24M, and compressing all the sections reduced its size
to 11M.
David Greene [Mon, 7 Oct 2019 14:37:20 +0000 (14:37 +0000)]
Allow update_test_checks.py to not scrub names.
Add a --preserve-names option to tell the script not to replace IR names.
Sometimes tests want those names. For example, if a test is looking for a
modification to an existing instruction, we'll want to keep the names.
Simon Atanasyan [Mon, 7 Oct 2019 14:01:37 +0000 (14:01 +0000)]
[Mips] Always save RA when disabling frame pointer elimination
This ensures that frame-based unwinding will continue to work when
calling a noreturn function; there is not much use having the caller's
frame pointer saved if you don't also have the caller's program counter.
Simon Atanasyan [Mon, 7 Oct 2019 14:01:22 +0000 (14:01 +0000)]
[Mips] Fix evaluating J-format branch targets
J/JAL/JALX/JALS are absolute branches, but stay within the current
256 MB-aligned region, so we must include the high bits of the
instruction address when calculating the branch target.
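For reference, a small sketch of the computation (standard MIPS J-format semantics, not code lifted from the patch): the 26-bit instruction index is shifted left by 2 and combined with the high 4 bits of the instruction's address, i.e. the current 256 MB region.

#include <cstdint>

// Sketch: keep the high 4 bits (the 256 MB region) from the instruction's
// address and replace the low 28 bits with the shifted 26-bit index.
static uint64_t evalJumpTarget(uint64_t InstrAddr, uint32_t Index26) {
  return (InstrAddr & ~UINT64_C(0x0FFFFFFF)) | (uint64_t(Index26) << 2);
}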
Nico Weber [Mon, 7 Oct 2019 13:13:31 +0000 (13:13 +0000)]
gn build: use better triple on windows
The CMake build uses "x86_64-pc-windows-msvc". The "-msvc" suffix is
important because e.g. clang/test/lit.cfg.py matches against the
suffix "windows-msvc" to compute the presence of the "ms-sdk" and
the absence of the "LP64" feature.
Jay Foad [Mon, 7 Oct 2019 10:57:41 +0000 (10:57 +0000)]
[AMDGPU] Fix test checks
The GFX10-DENORM-STRICT checks were only passing by accident. Fix them
to make the test more robust in the face of scheduling or register
allocation changes.
George Rimar [Mon, 7 Oct 2019 10:29:38 +0000 (10:29 +0000)]
[llvm-readelf/llvm-objdump] - Improve/refactor the implementation of SHT_LLVM_ADDRSIG section dumping.
This patch:
* Adds a llvm-readobj/llvm-readelf test file for SHT_LLVM_ADDRSIG sections (we do not have any).
* Enables dumping of SHT_LLVM_ADDRSIG with --all.
* Changes the logic to report a warning instead of an error when something goes wrong during dumping
(this allows dumping of SHT_LLVM_ADDRSIG and other sections to continue after an error).
* Refactors a piece of logic into a new toULEB128Array helper which might be used for a GNU-style
dumping implementation.
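For context, the section contents are a sequence of ULEB128-encoded symbol indexes; a generic decoding sketch (not the patch's toULEB128Array itself, and with no error handling) looks like:

#include <cstddef>
#include <cstdint>
#include <vector>

// Sketch: walk the SHT_LLVM_ADDRSIG payload and decode one ULEB128-encoded
// symbol index at a time.
static std::vector<uint64_t> decodeAddrsig(const uint8_t *Data, size_t Size) {
  std::vector<uint64_t> Syms;
  size_t I = 0;
  while (I < Size) {
    uint64_t Value = 0;
    unsigned Shift = 0;
    uint8_t Byte;
    do {
      Byte = Data[I++];
      Value |= uint64_t(Byte & 0x7f) << Shift;
      Shift += 7;
    } while ((Byte & 0x80) && I < Size);
    Syms.push_back(Value);
  }
  return Syms;
}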
Bill Wendling [Mon, 7 Oct 2019 09:54:53 +0000 (09:54 +0000)]
[IA] Recognize hexadecimal escape sequences
Summary:
Implement support for hexadecimal escape sequences to match how GNU 'as'
handles them. I.e., read all hexadecimal characters and truncate to the
lower 16 bits.
Martin Storsjo [Mon, 7 Oct 2019 08:21:37 +0000 (08:21 +0000)]
Revert "[SLP] avoid reduction transform on patterns that the backend can load-combine"
This reverts SVN r373833, as it caused a failed assert "Non-zero loop
cost expected" on building numerous projects, see PR43582 for details
and reproduction samples.
Craig Topper [Mon, 7 Oct 2019 06:27:55 +0000 (06:27 +0000)]
[X86] Support LEA64_32r in processInstrForSlow3OpLEA and use INC/DEC when possible.
Move the erasing and iterator updating inside to match the
other slow LEA function.
I've adapted code from optTwoAddrLEA and basically rebuilt the
implementation here. We do lose the kill flags now just like
optTwoAddrLEA. This runs late enough in the pipeline that it
shouldn't really be a problem.
Simon Pilgrim [Sun, 6 Oct 2019 21:11:45 +0000 (21:11 +0000)]
[X86][AVX] Access a scalar float/double as a free extract from a broadcast load (PR43217)
If a fp scalar is loaded and then used as both a scalar and a vector broadcast, perform the load as a broadcast and then extract the scalar for 'free' from the 0th element.
This involved switching the order of the X86ISD::BROADCAST combines so we only convert to X86ISD::BROADCAST_LOAD once all other canonicalizations have been attempted.
Adds a DAGCombinerInfo::recursivelyDeleteUnusedNodes wrapper.
Simon Pilgrim [Sun, 6 Oct 2019 19:06:45 +0000 (19:06 +0000)]
[X86][SSE] Don't merge known undef/zero elements into target shuffle masks.
Replaces setTargetShuffleZeroElements with getTargetShuffleAndZeroables which reports the Zeroable elements but doesn't merge them into the decoded target shuffle mask (the merging has been moved up into getTargetShuffleInputs until we can get rid of it entirely).
This is part of the work to fix PR43024 and allow us to use SimplifyDemandedElts to simplify shuffle chains - we need to get to a point where the target shuffle mask isn't adjusted by its source inputs but instead we cache them in a parallel Zeroable mask.
Craig Topper [Sun, 6 Oct 2019 18:43:08 +0000 (18:43 +0000)]
[X86] Add custom type legalization for v16i64->v16i8 truncate and v8i64->v8i8 truncate when v8i64 isn't legal
Summary:
The default legalization for v16i64->v16i8 tries to create a multiple-stage truncate, concatenating after each stage and truncating again. But avx512 implements truncates with multiple uops. So it should be better to truncate all the way to the desired element size and then concatenate the pieces using unpckl instructions. This minimizes the number of 2-uop truncates. The unpcks are all single-uop instructions.
I tried to handle this by just custom splitting the v16i64->v16i8 shuffle. And hoped that the DAG combiner would leave the two halves in the state needed to make D68374 do the job for each half. This worked for the first half, but the second half got messed up. So I've implemented custom handling for v8i64->v8i8 when v8i64 needs to be split to produce the VTRUNCs directly.
Craig Topper [Sun, 6 Oct 2019 18:43:03 +0000 (18:43 +0000)]
[LegalizeTypes][X86] When splitting a vselect for type legalization, don't split a setcc condition if the setcc input is legal and vXi1 conditions are supported
Summary: The VSELECT splitting code tries to split a setcc input as well. But on avx512 where mask registers are well supported it should be better to just split the mask and use a single compare.
Whitney Tsang [Sun, 6 Oct 2019 16:39:43 +0000 (16:39 +0000)]
[LOOPGUARD] Remove asserts in getLoopGuardBranch
Summary: The assertion in getLoopGuardBranch can be replaced by a
'return nullptr' under an if condition.
Authored By: DTharun
Reviewer: Whitney, fhahn
Reviewed By: Whitney, fhahn
Subscribers: fhahn, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D66084
Sanjay Patel [Sun, 6 Oct 2019 13:08:08 +0000 (13:08 +0000)]
[InstCombine] don't assume 'inbounds' for bitcast pointer to GEP transform (PR43501)
https://bugs.llvm.org/show_bug.cgi?id=43501
We can't declare a GEP 'inbounds' in general. But we may salvage that information if
we have known dereferenceable bytes on the source pointer.
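A hedged sketch of the salvage condition (names illustrative; the actual check in the patch is more involved): only keep 'inbounds' when the accessed offset is covered by the known dereferenceable bytes.

#include "llvm/ADT/APInt.h"

// Illustrative: only mark the new GEP 'inbounds' when the constant byte
// offset is known to lie within the number of bytes the source pointer is
// known to be dereferenceable for.
static bool canMarkInbounds(const llvm::APInt &OffsetBytes,
                            uint64_t DerefBytes) {
  return DerefBytes != 0 && OffsetBytes.ule(DerefBytes);
}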
Simon Pilgrim [Sun, 6 Oct 2019 12:38:38 +0000 (12:38 +0000)]
[X86][SSE] matchVectorShuffleAsBlend - use Zeroable element mask directly.
We can make use of the Zeroable mask to indicate which elements we can safely set to zero instead of creating a target shuffle mask on the fly.
This allows us to remove createTargetShuffleMask.
This is part of the work to fix PR43024 and allow us to use SimplifyDemandedElts to simplify shuffle chains - we need to get to a point where the target shuffle mask isn't adjusted by its source inputs in setTargetShuffleZeroElements but instead we cache them in a parallel Zeroable mask.
Simon Pilgrim [Sat, 5 Oct 2019 20:49:34 +0000 (20:49 +0000)]
[X86][AVX] Push sign extensions of comparison bool results through bitops (PR42025)
As discussed on PR42025, with more complex boolean math we can end up with many truncations/extensions of the comparison results through each bitop.
This patch handles the cases introduced in combineBitcastvxi1 by pushing the sign extension through the AND/OR/XOR ops so it's just the original SETCC ops that get extended.
Sanjay Patel [Sat, 5 Oct 2019 18:03:58 +0000 (18:03 +0000)]
[SLP] avoid reduction transform on patterns that the backend can load-combine
I don't see an ideal solution to these 2 related, potentially large, perf regressions:
https://bugs.llvm.org/show_bug.cgi?id=42708
https://bugs.llvm.org/show_bug.cgi?id=43146
We decided that load combining was unsuitable for IR because it could obscure other
optimizations in IR. So we removed the LoadCombiner pass and deferred to the backend.
Therefore, preventing SLP from destroying load combine opportunities requires that it
recognize patterns that could be combined later, but not do the optimization itself
(it's not a vector combine anyway, so it's probably out of scope for SLP).
Here, we add a scalar cost model adjustment with a conservative pattern match and cost
summation for a multi-instruction sequence that can probably be reduced later.
This should prevent SLP from creating a vector reduction unless that sequence is
extremely cheap.
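As a concrete example of the kind of pattern in question (a sketch in the spirit of the bug reports above, not the test source verbatim): assembling a 64-bit value from adjacent bytes, which the backend can turn into a single 8-byte load (mov, or movbe for the byte-swapped variant).

#include <cstdint>

// Eight adjacent byte loads OR'd together; SDAG load-combining collapses
// this into one 64-bit load, so SLP should leave the scalar form alone.
uint64_t load_le64(const uint8_t *p) {
  return  (uint64_t)p[0]        | ((uint64_t)p[1] << 8)  |
         ((uint64_t)p[2] << 16) | ((uint64_t)p[3] << 24) |
         ((uint64_t)p[4] << 32) | ((uint64_t)p[5] << 40) |
         ((uint64_t)p[6] << 48) | ((uint64_t)p[7] << 56);
}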
In the x86 tests shown (and discussed in more detail in the bug reports), SDAG combining
will produce a single instruction on these tests like:
movbe rax, qword ptr [rdi]
or:
mov rax, qword ptr [rdi]
Not some (half) vector monstrosity as we currently do using SLP: