granicus.if.org Git

[SLPVectorizer] Add initial alternate opcode support for cast instructions. (REAPPLIED)

We currently only support binary instructions in the alternate opcode shuffles.

This patch is an initial attempt at adding cast instructions as well, this raises several issues that we probably want to address as we continue to generalize the alternate mechanism:

1 - Duplication of cost determination - we should probably add scalar/vector costs helper functions and get BoUpSLP::getEntryCost to use them instead of determining costs directly.
2 - Support alternate instructions with the same opcode (e.g. casts with different src types) - alternate vectorization of calls with different IntrinsicIDs will require this.
3 - Allow alternates to be a different instruction type - mixing binary/cast/call etc.
4 - Allow passthrough of unsupported alternate instructions - related to PR30787/D28907 'copyable' elements.

Reapplied with fix to only accept 2 different casts if they come from the same source type.

Differential Revision: https://reviews.llvm.org/D49135

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336812 91177308-0d34-0410-b5e6-96231b3b80d8

[SLPVectorizer] Ensure alternate/passthrough doesn't vectorize sdiv with undef elts

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336809 91177308-0d34-0410-b5e6-96231b3b80d8

[SLPVectorizer] Add some additional alternate cast tests

Initial attempt at D49135 failed as we weren't correctly handling casts with different source types.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336808 91177308-0d34-0410-b5e6-96231b3b80d8

Revert rL336804: [SLPVectorizer] Add initial alternate opcode support for cast instructions.

Reverting due to buildbot failures

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336806 91177308-0d34-0410-b5e6-96231b3b80d8

Recommit r334887: [SmallSet] Add SmallSetIterator.

This version now uses the subset of is_trivially_XXX provided by
GCC 4.8 and llvm/Support/type_traits.h

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336805 91177308-0d34-0410-b5e6-96231b3b80d8

[SLPVectorizer] Add initial alternate opcode support for cast instructions.

We currently only support binary instructions in the alternate opcode shuffles.

This patch is an initial attempt at adding cast instructions as well, this raises several issues that we probably want to address as we continue to generalize the alternate mechanism:

1 - Duplication of cost determination - we should probably add scalar/vector costs helper functions and get BoUpSLP::getEntryCost to use them instead of determining costs directly.
2 - Support alternate instructions with the same opcode (e.g. casts with different src types) - alternate vectorization of calls with different IntrinsicIDs will require this.
3 - Allow alternates to be a different instruction type - mixing binary/cast/call etc.
4 - Allow passthrough of unsupported alternate instructions - related to PR30787/D28907 'copyable' elements.

Differential Revision: https://reviews.llvm.org/D49135

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336804 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeGen] Ignore debug uses in MachineCopyPropagation

Debug uses should not count as real uses, since the presence of debug
information could affect the generated code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336803 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Update the P5600 scheduler model not to use instruction itineraries.

This mostly brings the P5600 scheduler model to a mostly complete
status. There are a number of instructions which trigger the
`error:'MipsP5600Model' lacks information for` error. These are certain
codegen only instructions relating to MIPS64 which can be addressed by
using the correct predicates for them. That will be done in a full-up
patch.

Patch by Simon Dardis.

Differential revision: https://reviews.llvm.org/D45245

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336802 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] Converts isLegalNarrowLoad into isLegalNarrowLdSt

Reuse this function as to test correctness and profitability of
reducing width of either load or store operations.

Reviewsers: samparker

Differential Revision: https://reviews.llvm.org/D48624

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336800 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] Use a different character to flag instructions with side-effects in the Instruction Info View. NFC

This makes easier to identify changes in the instruction info flags. It also
helps spotting potential regressions similar to the one recently introduced at
r336728.

Using the same character to mark MayLoad/MayStore/HasSideEffects is problematic
for llvm-lit. When pattern matching substrings, llvm-lit consumes tabs and
spaces. A change in position of the flag marker may not trigger a test failure.

This patch only changes the character used for flag `hasSideEffects`. The reason
why I didn't touch other flags is because I want to avoid spamming the mailing
because of the massive diff due to the numerous tests affected by this change.

In future, each instruction flag should be associated with a different character
in the Instruction Info View.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336797 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] Tests for x & (-1 >> y) == x -> x u<= (-1 >> y) fold

https://bugs.llvm.org/show_bug.cgi?id=38123

This pattern will be produced by Implicit Integer Truncation sanitizer,
https://reviews.llvm.org/D48958
https://bugs.llvm.org/show_bug.cgi?id=21530
in unsigned case, therefore it is probably a good idea to improve it.

https://rise4fun.com/Alive/Rny

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336796 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] ParallelDSP: multiple reduction stmts in loop

This fixes an issue that we were not properly supporting multiple reduction
stmts in a loop, and not generating SMLADs for these cases. The alias analysis
checks were done too early, making it too conservative.

Differential revision: https://reviews.llvm.org/D49125

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336795 91177308-0d34-0410-b5e6-96231b3b80d8

Use debug-prefix-map for AT_NAME

AT_NAME was being emitted before the directory paths were remapped. This
ensures that all paths are remapped before anything is emitted.

An additional test case has been added.

Note that this only works if the replacement string is an absolute path.
If not, then AT_decl_file believes the new path is a relative path, and
joins that path with the compilation directory. I do not know of a good
way to resolve this.

Patch by: Siddhartha Bagaria (starsid)

Differential revision: https://reviews.llvm.org/D49169

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336793 91177308-0d34-0410-b5e6-96231b3b80d8

Recommit r336653: [VPlan] Add VPlanTestBase.h with helper

The original version caused a memsan failure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336792 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE] Asm: Support for COMPACT instruction.

The compact instruction shuffles active elements of vector
into lowest numbered elements and sets remaining elements
to zero.

e.g.
compact z0.s, p0, z1.s

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336789 91177308-0d34-0410-b5e6-96231b3b80d8

Fix check-prefix vs check-prefixes typo in updated test

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336787 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Regenerate SDIV tests

Will make codegen diffs much easier to grok in a future patch

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336786 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] icmp-logical.ll: add a few more tests.

The @masked_and_notA_slightly_optimized and @masked_or_A
will break when PR38123 will be fixed:
https://rise4fun.com/Alive/Rny
Clearly, they aren't optimized currently.

https://rise4fun.com/Alive/ERo

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336784 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE] Asm: Support for LAST(A|B) and CLAST(A|B) instructions.

The LASTB and LASTA instructions extract the last active element,
or element after the last active, from the source vector.

The added variants are:

  Scalar:
  last(a|b)  w0, p0, z0.b
  last(a|b)  w0, p0, z0.h
  last(a|b)  w0, p0, z0.s
  last(a|b)  x0, p0, z0.d

  SIMD & FP Scalar:
  last(a|b)  b0, p0, z0.b
  last(a|b)  h0, p0, z0.h
  last(a|b)  s0, p0, z0.s
  last(a|b)  d0, p0, z0.d

The CLASTB and CLASTA conditionally extract the last or element after
the last active element from the source vector.

The added variants are:

  Scalar:
  clast(a|b)  w0, p0, w0, z0.b
  clast(a|b)  w0, p0, w0, z0.h
  clast(a|b)  w0, p0, w0, z0.s
  clast(a|b)  x0, p0, x0, z0.d

  SIMD & FP Scalar:
  clast(a|b)  b0, p0, b0, z0.b
  clast(a|b)  h0, p0, h0, z0.h
  clast(a|b)  s0, p0, s0, z0.s
  clast(a|b)  d0, p0, d0, z0.d

  Vector:
  clast(a|b)  z0.b, p0, z0.b, z1.b
  clast(a|b)  z0.h, p0, z0.h, z1.h
  clast(a|b)  z0.s, p0, z0.s, z1.s
  clast(a|b)  z0.d, p0, z0.d, z1.d

Please refer to the architecture specification for more details on
the semantics of the added instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336783 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-readobj] Add -hex-dump (-x) option

Differential Revision: https://reviews.llvm.org/D48281

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336782 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] Fix extra space padding in icmp-mul-zext.ll test

update_test_checks will drop it anyway, creating noise..

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336781 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] Add variable names and regenerate icmp-logical.ll test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336780 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] Add constant buildvector support to isKnownNeverZero

This allows us to use SelectionDAG::isKnownNeverZero in DAGCombiner::visitREM (visitSDIVLike/visitUDIVLike handle the checking for constants).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336779 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-mca] Add tests for partial register writes.

llvm-mca doesn't know that on modern AMD processors, portions of a general
purpose register are not treated independently. So, a partial register write has
a false dependency on the super-register.

The issue with partial register writes will be addressed by a follow-up patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336778 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Remove dead code. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336777 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] Support non-uniform X%C -> X-(X/C)*C folds

First stage in PR38057 - support non-uniform constant vectors in the combine to reuse the division-by-constant logic.

We can definitely do better for srem pow2 remainders (and avoid that extra multiply....) but this at least helps keep everything on the vector unit.

Differential Revision: https://reviews.llvm.org/D48975

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336774 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] Add (urem X, -1) -> select(X == -1, 0, x) fold

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336773 91177308-0d34-0410-b5e6-96231b3b80d8

[TableGen] Add missing std::moves to fix build failure.

gcc 4.7 seems to disagree with gcc 5.3 about whether you need to say
'return std::move(thing)' instead of just 'return thing'. All the
json::Arrays and json::Objects that I was implicitly turning into
json::Values by returning them from functions now have explicit
std::move wrappers, so hopefully 4.7 will be happy now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336772 91177308-0d34-0410-b5e6-96231b3b80d8

[TableGen] Add a general-purpose JSON backend.

The aim of this backend is to output everything TableGen knows about
the record set, similarly to the default -print-records backend. But
where -print-records produces output in TableGen's input syntax
(convenient for humans to read), this backend produces it as
structured JSON data, which is convenient for loading into standard
scripting languages such as Python, in order to extract information
from the data set in an automated way.

The output data contains a JSON representation of the variable
definitions in output 'def' records, and a few pieces of metadata such
as which of those definitions are tagged with the 'field' prefix and
which defs are derived from which classes. It doesn't dump out
absolutely every piece of knowledge it _could_ produce, such as type
information and complicated arithmetic operator nodes in abstract
superclasses; the main aim is to allow consumers of this JSON dump to
essentially act as new backends, and backends don't generally need to
depend on that kind of data.

The new backend is implemented as an EmitJSON() function similar to
all of llvm-tblgen's other EmitFoo functions, except that it lives in
lib/TableGen instead of utils/TableGen on the basis that I'm expecting
to add it to clang-tblgen too in a future patch.

To test it, I've written a Python script that loads the JSON output
and tests properties of it based on comments in the .td source - more
or less like FileCheck, except that the CHECK: lines have Python
expressions after them instead of textual pattern matches.

Reviewers: nhaehnle

Reviewed By: nhaehnle

Subscribers: arichardson, labath, mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D46054

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336771 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Only call llvm::value::dump() in debug build.

This fixes compile error in r336759. llvm::value::dump is not available
in released build.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336770 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] The TEST instruction is eliminated when BSF/TZCNT is used

Summary:
These changes cover the PR#31399.
Now the ffs(x) function is lowered to (x != 0) ? llvm.cttz(x) + 1 : 0
and it corresponds to the following llvm code:
  %cnt = tail call i32 @llvm.cttz.i32(i32 %v, i1 true)
  %tobool = icmp eq i32 %v, 0
  %.op = add nuw nsw i32 %cnt, 1
  %add = select i1 %tobool, i32 0, i32 %.op
and x86 asm code:
  bsfl     %edi, %ecx
  addl     $1, %ecx
  testl    %edi, %edi
  movl     $0, %eax
  cmovnel  %ecx, %eax
In this case the 'test' instruction can't be eliminated because
the 'add' instruction modifies the EFLAGS, namely, ZF flag
that is set by the 'bsf' instruction when 'x' is zero.

We now produce the following code:
  bsfl     %edi, %ecx
  movl     $-1, %eax
  cmovnel  %ecx, %eax
  addl     $1, %eax

Patch by Ivan Kulagin

Reviewers: davide, craig.topper, spatel, RKSimon

Reviewed By: craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D48765

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336768 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r336760: "[ORC] Add unit tests for the reexports utility that were..."

This patch broke a few buildbots. I will investigate and re-apply when I have
a fix.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336767 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove some composite MOVSS/MOVSD isel patterns.

These patterns looked for a MOVSS/SD followed by a scalar_to_vector. Or a scalar_to_vector followed by a load.

In both cases we emitted a MOVSS/SD for the MOVSS/SD part, a REG_CLASS for the scalar_to_vector, and a MOVSS/SD for the load.

But we have patterns that do each of those 3 things individually so there's no reason to build large patterns.

Most of the test changes are just reorderings. The one test that had a meaningful change is pr30430.ll and it appears to be a regression. But its doing -O0 so I think it missed a lot of opportunities and was just getting lucky before.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336762 91177308-0d34-0410-b5e6-96231b3b80d8

[ORC] Remove a shadowing definition.

There is already a VSO member V in the CoreAPIsStandardTest test fixture.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336761 91177308-0d34-0410-b5e6-96231b3b80d8

[ORC] Add unit tests for the reexports utility that were left out of r336741,
and fix a bug that these exposed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336760 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Add pass to infer prototypes for prototype-less functions

See https://bugs.llvm.org/show_bug.cgi?id=35385

Differential Revision: https://reviews.llvm.org/D48471

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336759 91177308-0d34-0410-b5e6-96231b3b80d8

[ORC] Drop constexpr in unit test to appease a bot.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336758 91177308-0d34-0410-b5e6-96231b3b80d8

[ORC] Use a gtest fixture to remove a bunch of boilerplate in CoreAPIsTest.cpp.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336757 91177308-0d34-0410-b5e6-96231b3b80d8

[Power9] Add remaining __flaot128 builtin support for FMA round to odd

Implement this as it is done on GCC:

__float128 a, b, c, d;
a = __builtin_fmaf128_round_to_odd (b, c, d);         // generates xsmaddqpo
a = __builtin_fmaf128_round_to_odd (b, c, -d);        // generates xsmsubqpo
a = - __builtin_fmaf128_round_to_odd (b, c, d);       // generates xsnmaddqpo
a = - __builtin_fmaf128_round_to_odd (b, c, -d);      // generates xsnmsubpqp

Differential Revision: https://reviews.llvm.org/D48218

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336754 91177308-0d34-0410-b5e6-96231b3b80d8

[test cases] add test cases for find more abs pattern

Differential Revision: https://reviews.llvm.org/D49123

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336752 91177308-0d34-0410-b5e6-96231b3b80d8

[TableGen] Fix some bad formatting. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336751 91177308-0d34-0410-b5e6-96231b3b80d8

[LangRef] Clarify alloca of zero bytes.

Let's be conservative here; it matches what we actually implemented, and
it should be rare in practice anyway.

Differential Revision: https://reviews.llvm.org/D49042

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336744 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Treat cmn immediates as legal in isLegalICmpImmediate.

The original code attempted to do this, but the std::abs() call didn't
actually do anything due to implicit type conversions. Fix the type
conversions, and perform the correct check for negative immediates.

This probably has very little practical impact, but it's worth fixing
just to avoid confusion in the future, I think.

Differential Revision: https://reviews.llvm.org/D48907

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336742 91177308-0d34-0410-b5e6-96231b3b80d8

[ORC] Generalize alias materialization to support re-exports (i.e. aliasing of
symbols in another VSO).

Also fixes a bug where chained aliases within a single VSO would deadlock on
materialization.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336741 91177308-0d34-0410-b5e6-96231b3b80d8

Sort includes + include a missing `extern "C"` header

If we don't include Initialization.h,
`LLVMInitializeAggressiveInstCombiner` won't see its `extern "C"` decl.
This causes sadness, name mangling, and linker errors.

Reported on the mailing lists by Vladimir Vissoultchev. Thanks!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336736 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove AddedComplexity from all patterns that use X86vzmovl as their root.

Some added 20 and some added 15. Its unclear when to use which value and whether they are required at all.

This patch removes them all. If we start finding real world issues we may need to add them back with proper tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336735 91177308-0d34-0410-b5e6-96231b3b80d8

Fix -Wmismatched-tags warning

class -> struct in forward declaration.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336733 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Teach X86InstrInfo::commuteInstructionImpl to use MOVSD/MOVSS for BLEND under optsize when the immediate allows it.

Isel currently emits movss/movsd a lot of the time and an accidental double commute turns it into a blend.

Ideally we'd select blend directly in isel under optspeed and not rely on the double commute to create blend.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336731 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] typo

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336730 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove X86ISD::MOVLPS and X86ISD::MOVLPD. NFCI

These ISD nodes try to select the MOVLPS and MOVLPD instructions which are special load only instructions. They load data and merge it into the lower 64-bits of an XMM register. They are logically equivalent to our MOVSD node plus a load.

There was only one place in X86ISelLowering that used MOVLPD and no places that selected MOVLPS. The one place that selected MOVLPD had to choose between it and MOVSD based on whether there was a load. But lowering is too early to tell if the load can really be folded. So in isel we have patterns that use MOVSD for MOVLPD if we can't find a load.

We also had patterns that select the MOVLPD instruction for a MOVSD if we can find a load, but didn't choose the MOVLPD ISD opcode for some reason.

So it seems better to just standardize on MOVSD ISD opcode and manage MOVSD vs MOVLPD instruction with isel patterns.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336728 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Fix layering issue with AMDGPUHSAMetadataStreamer (NFC)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336722 91177308-0d34-0410-b5e6-96231b3b80d8

[ThinLTO] Use std::map to get determistic imports files

Summary:
I noticed that the .imports files emitted for distributed ThinLTO
backends do not have consistent ordering. This is because StringMap
iteration order is not guaranteed to be deterministic. Since we already
have a std::map with this information, used when emitting the individual
index files (ModuleToSummariesForIndex), use it for the imports files as
well.

This issue is likely causing some unnecessary rebuilds of the ThinLTO
backends in our distributed build system as the imports files are inputs
to those backends.

Reviewers: pcc, steven_wu, mehdi_amini

Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits

Differential Revision: https://reviews.llvm.org/D48783

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336721 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove dead SDNode object from X86InstrFragmentsSIMD.td. NFC

It points to an opcode that doesn't exist.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336720 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r336653 "[VPlan] Add VPlanTestBase.h with helper class to build VPlan for tests."

Memory leaks in tests.
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/6289/steps/check-llvm%20asan/logs/stdio

Direct leak of 192 byte(s) in 1 object(s) allocated from:
    #0 0x554ea8 in operator new(unsigned long) /b/sanitizer-x86_64-linux-bootstrap/build/llvm/projects/compiler-rt/lib/asan/asan_new_delete.cc:106
    #1 0x56cef1 in llvm::VPlanTestBase::doAnalysis(llvm::Function&) /b/sanitizer-x86_64-linux-bootstrap/build/llvm/unittests/Transforms/Vectorize/VPlanTestBase.h:53:14
    #2 0x56bec4 in llvm::VPlanTestBase::buildHCFG(llvm::BasicBlock*) /b/sanitizer-x86_64-linux-bootstrap/build/llvm/unittests/Transforms/Vectorize/VPlanTestBase.h:57:3
    #3 0x571f1e in llvm::(anonymous namespace)::VPlanHCFGTest_testVPInstructionToVPRecipesInner_Test::TestBody() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/unittests/Transforms/Vectorize/VPlanHCFGTest.cpp:119:15
    #4 0xed2291 in testing::Test::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc
    #5 0xed44c8 in testing::TestInfo::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:2656:11
    #6 0xed5890 in testing::TestCase::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:2774:28
    #7 0xef3634 in testing::internal::UnitTestImpl::RunAllTests() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:4649:43
    #8 0xef27e0 in testing::UnitTest::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc
    #9 0xebbc23 in RUN_ALL_TESTS /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/include/gtest/gtest.h:2233:46
    #10 0xebbc23 in main /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/UnitTestMain/TestMain.cpp:51
    #11 0x7f65569592e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0)

and more.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336718 91177308-0d34-0410-b5e6-96231b3b80d8

[CMake] Set per-runtime library directory suffix in runtimes build

Do not use LLVM_RUNTIMES_LIBDIR_SUFFIX variable which is an internal
variable used by the runtimes build from individual runtimes, instead
set per-runtime librarhy directory suffix variable which is necessary
for the sanitized runtimes build to install libraries into correct
location.

Differential Revision: https://reviews.llvm.org/D49121

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336713 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove AddedComplexity from register form of NOT. NFCI

I believe isProfitableToFold will stop the load folding that this was intended to overcome.

Given an (xor load, -1), isProfitableToFold will see that the immediate can be folded with the xor using a one byte immediate since it can be sign extended. It doesn't know about NOT, but the one byte immediate check is enough to stop the fold.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336712 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove AddedComplexity from MMX_X86movw2d patterns.

There were only 3 patterns with this node as a root and they all the same AddedComplexity. So this doesn't really do anything.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336711 91177308-0d34-0410-b5e6-96231b3b80d8

[CMake] Teach the build system to codesign built products

Automatically codesign all executables and dynamic libraries if a
codesigning identity is given (via LLVM_CODESIGNING_IDENTITY). This
option is darwin only for now.

Also update platforms/iOS.cmake to pick up the right versions of
codesign and codesign_allocate.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336708 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Refactor HSAMetadataStream::emitKernel (NFC)

Move all metadata construction into AMDGPUHSAMetadataStreamer.

Differential Revision: https://reviews.llvm.org/D48176

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336707 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalISel][X86_64] Support for G_SITOFP

The instruction selection is automatically handled by tablegen

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336703 91177308-0d34-0410-b5e6-96231b3b80d8

[Evaluator] Examine alias when evaluating function call

This fixes PR38120

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336702 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] Add special case fast paths for udiv x,1 and udiv x,-1

udiv x,-1 was going down the (slow) BuildUDIV route resulting in unnecessary shifts.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336701 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[AccelTable] Provide abstraction for emitting DWARF5 accelerator tables."

This reverts r336529 because an alternative approach turned out to be a
better fit for dsymuil.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336698 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Make hidden argument metadata consistent with
amdgpu-implicitarg-num-bytes attribute

Differential Revision: https://reviews.llvm.org/D49096

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336697 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] allow flag propagation when using safe constant

This corresponds with the code for the single binop pattern
added in rL336684.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336696 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add srem/udiv/urem by constant tests

Match the tests in combine-sdiv.ll

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336694 91177308-0d34-0410-b5e6-96231b3b80d8

[gcov] Fix ABI when calling llvm_gcov_... routines from instrumentation code

The llvm_gcov_... routines in compiler-rt are regular C functions that
need to be called using the proper C ABI for the target. The current
code simply calls them using plain LLVM IR types. Since the type are
mostly simple, this happens to just work on certain targets. But other
targets still need special handling; in particular, it may be necessary
to sign- or zero-extended sub-word values to comply with the ABI. This
caused gcov failures on SystemZ in particular.

Now the very same problem was already fixed for the llvm_profile_ calls
here: https://reviews.llvm.org/D21736

This patch uses the same method to fix the llvm_gcov_ calls, in
particular calls to llvm_gcda_start_file, llvm_gcda_emit_function, and
llvm_gcda_emit_arcs.

Reviewed By: marco-c

Differential Revision: https://reviews.llvm.org/D49134

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336692 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Add missing a few {{$}}s to a test

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336691 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/NFC: Fix typo in test name

hsa-metadata-enqueu-kernel.ll ->
hsa-metadata-enqueue-kernel.ll

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336689 91177308-0d34-0410-b5e6-96231b3b80d8

[MC] Add interface to finish pending labels.

When manually finishing the object writer in dsymutil, it's possible
that there are pending labels that haven't been resolved. This results
in an assertion when the assembler tries to fixup a label that doesn't
have an address yet.

Differential revision: https://reviews.llvm.org/D49131

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336688 91177308-0d34-0410-b5e6-96231b3b80d8

Update test to work on Windows

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336687 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] safely allow non-commutative binop identity constant folds

This was originally intended with D48893, but as discussed there, we
have to make the folds safe from producing extra poison. This should
give the single binop folds the same capabilities as the existing
folds for 2-binops+shuffle.

LLVM binary opcode review: there are a total of 18 binops. There are 7
commutative binops (add, mul, and, or, xor, fadd, fmul) which we already
fold. We're able to fold 6 more opcodes with this patch (shl, lshr, ashr,
fdiv, udiv, sdiv). There are no folds for srem/urem/frem AFAIK. We don't
bother with sub/fsub with constant operand 1 because those are
canonicalized to add/fadd. 7 + 6 + 3 + 2 = 18.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336684 91177308-0d34-0410-b5e6-96231b3b80d8

Add CachedHashStringRef::data().

This accessor is useful and could be slightly more efficient than
Str.val().data() because you can avoid StringRef instantiation.

Differential Revision: https://reviews.llvm.org/D49133

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336683 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Change .mir testcase to make sure function is not in SSA form

If a machine function satisfies SSA, the IsSSA property is assumed even
if the pass to be executed runs after existing from SSA. If the pass
output then does not conform to SSA, a verifier error will be flagged
(with expensive checks enabled).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336682 91177308-0d34-0410-b5e6-96231b3b80d8

Support -fdebug-prefix-map in llvm-mc. This is useful to omit the
debug compilation dir when compiling assembly files with -g.
Part of PR38050.

Patch by Siddhartha Bagaria!

Differential Revision: https://reviews.llvm.org/D48988

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336680 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] drop poison flags when shuffle mask undef propagates to constant

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336679 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE] Asm: Support for predicated unary operations.

This patch adds support for the following instructions:
  CLS  (Count Leading Sign bits)
  CLZ  (Count Leading Zeros)
  CNT  (Count non-zero bits)
  CNOT (Logically invert boolean condition in vector)
  NOT  (Bitwise invert vector)
  FABS (Floating-point absolute value)
  FNEG (Floating-point negate)

All operations are predicated and unary, e.g.
  clz  z0.s, p0/m, z1.s

- CLS, CLZ, CNT, CNOT and NOT have variants for 8, 16, 32
  and 64 bit elements.

- FABS and FNEG have variants for 16, 32 and 64 bit elements.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336677 91177308-0d34-0410-b5e6-96231b3b80d8

Reapply "AMDGPU: Force inlining if LDS global address is used"

This reverts commit r336623

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336675 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] allow more shuffle-binop folds with safe constants

The case with 2 variables is more complicated than the case where
we eliminate the shuffle entirely because a shuffle with an undef
mask element creates an undef result.

I'm not aware of any current analysis/transform that recognizes that
undef propagating to a div/rem/shift, but we have to guard against
the possibility.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336668 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo][LoopVectorize] Preserve DL in induction PHI and Add

Differential Revision: https://reviews.llvm.org/D48968

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336667 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] visitREM - call visitSDIVLike/visitUDIVLike directly to avoid recursive combining.

As suggested by @efriedma on D48975 use the visitSDIVLike/visitUDIVLike functions introduced at rL336656.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336664 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Add implicit uses even when untied explicit uses are present

An explicit untied use is not sufficient to maintain liveness of a
register redefined in a predicated instruction. For example
  %1 = COPY %0
  ...
  %1 = A2_paddif %2, %1, 1
could become
  $r1 = COPY $r0
  ...
  $r1 = A2_paddif $p0, $r1, 1
and later
  $r1 = COPY $r0                ;; this is not really dead!
  ...
  $r1 = A2_paddif $p0, $r0, 1

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336662 91177308-0d34-0410-b5e6-96231b3b80d8

[LowerSwitch] Fixed faulty PHI nodes

Summary:
Fixed two cases of where PHI nodes need to be updated by lowerswitch.

When lowerswitch find out that the switch default branch is not
reachable it remove the old default and replace it with the most
popular block from the cases, but it forget to update the PHI
nodes in the default block.

The PHI nodes also need to be updated when the switch is replaced
with a single branch.

Reviewers: hans, reames, arsenm

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D47203

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336659 91177308-0d34-0410-b5e6-96231b3b80d8

[Support] Harded JSON against invalid UTF-8.

Parsing invalid UTF-8 input is now a parse error.
Creating JSON values from invalid UTF-8 now triggers an assertion, and
(in no-assert builds) substitutes the unicode replacement character.
Strings retrieved from json::Value are always valid UTF-8.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336657 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] Split SDIV/UDIV optimization expansions from the rest of the combines. NFCI.

As suggested by @efriedma on D48975, this patch separates the BuildDiv/Pow2 style optimizations from the rest of the visitSDIV/visitUDIV to make it easier to reuse the combines and will allow us to avoid some rather nasty node recursive combining in visitREM.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336656 91177308-0d34-0410-b5e6-96231b3b80d8

[VPlan] Add VPlanTestBase.h with helper class to build VPlan for tests.

Reviewers: dcaballe, hsaito, rengolin

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D49032

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336653 91177308-0d34-0410-b5e6-96231b3b80d8

Fix MSVC "signed/unsigned mismatch" warning. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336649 91177308-0d34-0410-b5e6-96231b3b80d8

[PM/Unswitch] Fix unused variable in r336646.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336647 91177308-0d34-0410-b5e6-96231b3b80d8

[PM/Unswitch] Fix a collection of closely related issues with trivial
switch unswitching.

The core problem was that the way we handled unswitching trivial exit
edges through the default successor of a switch. For some reason
I thought the right way to do this was to add a block containing
unreachable and point the default successor at this block. In
retrospect, this has an amazing number of problems.

The first issue is the one that this pass has always worked around -- we
have to *detect* such edges and avoid unswitching them again. This
seemed pretty easy really. You juts look for an edge to a block
containing unreachable. However, this pattern is woefully unsound. So
many things can break it. The amazing thing is that I found a test case
where *simple-loop-unswitch itself* breaks this! When we do
a *non-trivial* unswitch of a switch we will end up splitting this exit
edge. The result will be a default successor that is an exit and
terminates in ... a perfectly normal branch. So the first test case that
I started trying to fix is added to the nontrivial test cases. This is
a ridiculous example that did just amazing things previously. With just
unswitch, it would create 10+ copies of this stuff stamped out. But if
you combine it *just right* with a bunch of other passes (like
simplify-cfg, loop rotate, and some LICM) you can get it to do this
infinitely. Or at least, I never got it to finish. =[

This, in turn, uncovered another related issue. When we are manipulating
these switches after doing a trivial unswitch we never correctly updated
PHI nodes to reflect our edits. As soon as I started changing how these
edges were managed, it became obvious there were more issues that
I couldn't realistically leave unaddressed, so I wrote more test cases
around PHI updates here and ensured all of that works now.

And this, in turn, required some adjustment to how we collect and manage
the exit successor when it is the default successor. That showed a clear
bug where we failed to include it in our search for the outer-most loop
reached by an unswitched exit edge. This was actually already tested and
the test case didn't work. I (wrongly) thought that was due to SCEV
failing to analyze the switch. In fact, it was just a simple bug in the
code that skipped the default successor. While changing this, I handled
it correctly and have updated the test to reflect that we now get
precise SCEV analysis of trip counts for the outer loop in one of these
cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336646 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Fast-isel tests for lowered truncation intrinsics

This patch adds fast-isel tests for the IR patterns produced for truncation
intrinsics in rC336643.

Differential Revision: https://reviews.llvm.org/D48822

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336645 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Prefer BLEND(SHL(v,c1),SHL(v,c2)) over MUL(v, c3)

Now that rL336250 has landed, we should prefer 2 immediate shifts + a shuffle blend over performing a multiply. Despite the increase in instructions, this is quicker (especially for slow v4i32 multiplies), avoid loads and constant pool usage. It does mean however that we increase register pressure. The code size will go up a little but by less than what we save on the constant pool data.

This patch also adds support for v16i16 to the BLEND(SHIFT(v,c1),SHIFT(v,c2)) combine, and also prevents blending on pre-SSE41 shifts if it would introduce extra blend masks/constant pool usage.

Differential Revision: https://reviews.llvm.org/D48936

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336642 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Regenerate vector-shuffle-512-v8.ll so the script will merge the 32 and 64 bit checks together. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336641 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Use IsProfitableToFold to block vinsertf128rm in favor of insert_subreg instead of artifically increasing pattern complexity to give priority.

This is a much more direct way to solve the issue than just giving extra priority.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336639 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove some seemingly unnecessary patterns.

We're missing the EVEX equivalents of these patterns and seem to get along fine.

I think we end up with X86vzload for the obvious IR cases that would produce this DAG.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336638 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add back GCCBuiltin on mask_div_ss/sd_round.

We no longer need custom handling in clang.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336627 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Correct vfixupimm load patterns to look for an integer load, not a floating point load bitcasted to integer.

DAG combine wouldn't let a floating point load bitcasted to integer exist. It would just be an integer load.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336626 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add test cases that show failure to fold load into vfixupimm instructions due to bad isel pattern.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336625 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove FloatVT from X86VectorVTInfo in X86InstrAVX512.td

The only places it was used where places where VT was the same as FloatVT. So switch those uses to VT and drop it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336624 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "AMDGPU: Force inlining if LDS global address is used"

This reverts commit r336587, it was causing test failures on the
sanitizer bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336623 91177308-0d34-0410-b5e6-96231b3b80d8

[DWARF][NFC] Refactor range list emission to use a static helper

This is prep for DWARF v5 range list emission. Emission of a single range list is moved
to a static helper function.

Reviewer: jdevlieghere

Differential Revision: https://reviews.llvm.org/D49098

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336621 91177308-0d34-0410-b5e6-96231b3b80d8