granicus.if.org Git

[ConstantRange] makeGuaranteedNoWrapRegion(): `shl` support

Summary:
If all the shifts amount are already poison-producing,
then we can add more poison-producing flags ontop:
https://rise4fun.com/Alive/Ocwi

Otherwise, we should only consider the possible range of shift amts that don't result in poison.

For unsigned range not not overflow, we must not shift out any set bits,
and the actual limit for `x` can be computed by backtransforming
the maximal value we could ever get out of the `shl` - `-1` through
`lshr`. If the `x` is any larger than that then it will overflow.

Likewise for signed range, but just in signed domain..

This is based on the general idea outlined by @nikic in https://reviews.llvm.org/D68672#1714990

Reviewers: nikic, sanjoy

Reviewed By: nikic

Subscribers: hiraditya, llvm-commits, nikic

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69217

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375370 91177308-0d34-0410-b5e6-96231b3b80d8

[ConstantRange] Optimize nowrap region test, remove redundant tests; NFC

Enumerate one less constant range in TestNoWrapRegionExhaustive,
which was unnecessary. This allows us to bump the bit count from
3 to 5 while keeping reasonable timing.

Drop four tests for multiply nowrap regions, as these cover subsets
of the exhaustive test. They do use a wider bitwidth, but I don't
think it's worthwhile to have them additionally now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375369 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Increase vcc liveness scan threshold

Avoids a test regression in a future patch. Also add debug printing on
this case, so I waste less time debugging folds in the future.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375367 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Split flat offsets that don't fit in DAG

We handle it this way for some other address spaces.

Since r349196, SILoadStoreOptimizer has been trying to do this. This
is after SIFoldOperands runs, which can change the addressing
patterns. It's simpler to just split this earlier.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375366 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix missing OPERAND_IMMEDIATE

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375365 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Add baseline tests for flat offset splitting

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375364 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Don't re-get the subtarget

It's already available in the class.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375363 91177308-0d34-0410-b5e6-96231b3b80d8

[yaml2obj][obj2yaml] - Do not create a symbol table by default.

This patch tries to resolve problems faced in D68943
and uses some of the code written by Konrad Wilhelm Kleine
in that patch.

Previously, yaml2obj tool always created a .symtab section.
This patch changes that. With it we only create it when
have a "Symbols:" tag in the YAML document or when
we need to create it because it is used by another section(s).

obj2yaml follows the new behavior and does not print "Symbols:"
anymore when there is no symbol table.

Differential revision: https://reviews.llvm.org/D69041

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375361 91177308-0d34-0410-b5e6-96231b3b80d8

Fix minor warning in DWARFVerifier.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375357 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Don't error on calls to null or undef

Calls to constants should probably be generally handled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375356 91177308-0d34-0410-b5e6-96231b3b80d8

[SCEV] Simplify umin/max of zext and sext of the same value

This is a common idiom which arises after induction variables are widened, and we have two or more exit conditions. Interestingly, we don't have instcombine or instsimplify support for this either.

Differential Revision: https://reviews.llvm.org/D69006

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375349 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Pulled out helper to decode target shuffle element sentinel values to 'Zeroable' known undef/zero bits. NFCI.

Renamed 'resolveTargetShuffleAndZeroables' to 'resolveTargetShuffleFromZeroables' to match.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375348 91177308-0d34-0410-b5e6-96231b3b80d8

[TargetLowering][DAGCombine][MSP430] add/use hook for Shift Amount Threshold (1/2)

Provides a TLI hook to allow targets to relax the emission of shifts, thus enabling
codegen improvements on targets with no multiple shift instructions and cheap selects
or branches.

Contributes to a Fix for PR43559:
https://bugs.llvm.org/show_bug.cgi?id=43559

Patch by: @joanlluch (Joan LLuch)

Differential Revision: https://reviews.llvm.org/D69116

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375347 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Add dependency on GlobalISel for unit tests to fix shared libs build

The unit test uses GlobalISel but the dependency is not listed in the
CMakeLists.txt file which causes failures in shared libs build with GCC.

This just adds the dependency.

Differential revision: https://reviews.llvm.org/D69064

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375346 91177308-0d34-0410-b5e6-96231b3b80d8

[MSP430] Shift Amount Threshold in DAGCombine (Baseline Tests); NFC

Patch by: @joanlluch (Joan LLuch)

Differential Revision: https://reviews.llvm.org/D69099

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375345 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] lowerV16I8Shuffle - tryToWidenViaDuplication - undef unpack args

tryToWidenViaDuplication lowers using the shuffle_v8i16(unpack_v16i8(shuffle_v8i16(x),shuffle_v8i16(x))) pattern, but the unpack only needs the even/odd 16i8 args if the original v16i8 shuffle mask references the even/odd elements - which isn't true for many extension style shuffles.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375342 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] LowerUINT_TO_FP_i64 - only use HADDPD for size/fast-hops

We were always generating a single source HADDPD, but really we should only do this if shouldUseHorizontalOp says its a good idea.

Differential Revision: https://reviews.llvm.org/D69175

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375341 91177308-0d34-0410-b5e6-96231b3b80d8

Explicit in the doc the current list of projects (with easy copy and paste)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375339 91177308-0d34-0410-b5e6-96231b3b80d8

Make it clear in the doc that 'all' in LLVM_ENABLE_PROJECTS does install ALL projects

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375337 91177308-0d34-0410-b5e6-96231b3b80d8

Avoid including CodeView/SymbolRecord.h from MCStreamer.h

Move the types needed out so they can be forward declared instead.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375325 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Remove optnone from a test

It's not clear why the test had this. I'm unable to break the original
case with the original patch reverted with or without optnone.

This avoids a failure in a future commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375321 91177308-0d34-0410-b5e6-96231b3b80d8

Prune a LegacyDivergenceAnalysis and MachineLoopInfo include each

Now X86ISelLowering doesn't depend on many IR analyses.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375320 91177308-0d34-0410-b5e6-96231b3b80d8

Prune Analysis includes from SelectionDAG.h

Only forward declarations are needed here. Follow-on to r375311.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375319 91177308-0d34-0410-b5e6-96231b3b80d8

Move endian constant from Host.h to SwapByteOrder.h, prune include

Works on this dependency chain:
  ArrayRef.h ->
  Hashing.h -> --CUT--
  Host.h ->
  StringMap.h / StringRef.h

ArrayRef is very popular, but Host.h is rarely needed. Move the
IsBigEndianHost constant to SwapByteOrder.h. Clients of that header are
more likely to need it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375316 91177308-0d34-0410-b5e6-96231b3b80d8

Prune two MachineInstr.h includes, fix up deps

MachineInstr.h included AliasAnalysis.h, which includes a world of IR
constructs mostly unneeded in CodeGen. Prune it. Same for
DebugInfoMetadata.h.

Noticed with -ftime-trace.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375311 91177308-0d34-0410-b5e6-96231b3b80d8

LiveIntervals: Fix handleMoveUp with subreg def moving across a def

If a subregister def was moved across another subregister def and
another use, the main range was not correctly updated. The end point
of the moved interval ended too early and missed the use from theh
other lanes in the subreg def.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375300 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Build compiler-rt code with -fvisibility=hidden.

This matches the CMake build.

Differential Revision: https://reviews.llvm.org/D69202

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375299 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] move PHI nodes to AGPR class

If all uses of a PHI are in AGPR register class we should
avoid unneeded copies via VGPRs.

Differential Revision: https://reviews.llvm.org/D69200

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375297 91177308-0d34-0410-b5e6-96231b3b80d8

[SampleFDO] Add profile remapping support for profile on-demand loading used
by ExtBinary format profile

Profile on-demand loading was added for ExtBinary format profile in rL374233,
but currently profile on-demand loading doesn't work well with profile
remapping. The patch adds the support.

Suppose a function in the current module has outline instance in the profile.
The function name in the module is different from the name of the outline
instance, but remapper knows the two names are equal. When loading profile
on-demand, the outline instance has to be loaded with remapper's help.

At the same time SampleProfileReaderItaniumRemapper is changed from a proxy
of SampleProfileReader to a helper member in SampleProfileReader.

Differential Revision: https://reviews.llvm.org/D68901

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375295 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Remove -amdgpu-spill-sgpr-to-smem.

Summary: The implementation was never completed and never used except in tests.

Reviewers: arsenm, mareko

Subscribers: qcolombet, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69163

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375293 91177308-0d34-0410-b5e6-96231b3b80d8

[CVP] setDeducedOverflowingFlags(): actually inc per-opcode stats

This is really embarrassing. Those are pointers, so that offsets the
pointers, not the statistics pointed-by the pointer...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375290 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r375288

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375289 91177308-0d34-0410-b5e6-96231b3b80d8

Disable exit-on-SIGPIPE in lldb

Occasionally, during test teardown, LLDB writes to a closed pipe.
Sometimes the communication is inherently unreliable, so LLDB tries to
avoid being killed due to SIGPIPE (it calls `signal(SIGPIPE, SIG_IGN)`).
However, LLVM's default SIGPIPE behavior overrides LLDB's, causing it to
exit with IO_ERR.

Opt LLDB out of the default SIGPIPE behavior. I expect that this will
resolve some LLDB test suite flakiness (tests randomly failing with
IO_ERR) that we've seen since r344372.

rdar://55750240

Differential Revision: https://reviews.llvm.org/D69148

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375288 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Fix register parsing in .seh_* in Intel syntax

Previously, the parser checked for a '%' prefix to indicate a register.
In Intel syntax mode, LLVM does not print a '%' prefix on registers, so
LLVM could not parse its own assembly output. Instead, require that
register numbers be integer literals, or at least start with an integer
literal, which is consistent with .cfi_* directive register parsing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375287 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][CVP] Some tests for `mul` no-wrap deduction

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375285 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Allow multivalue signatures in object files

Summary:
Also changes the wasm YAML format to reflect the possibility of having
multiple return types and to put the returns after the params for
consistency with the binary encoding.

Reviewers: aheejin, sbc100

Subscribers: dschuff, jgravelle-google, hiraditya, sunfish, arphaman, rupprecht, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69156

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375283 91177308-0d34-0410-b5e6-96231b3b80d8

[GISel][CallLowering] Make isIncomingArgumentHandler a pure virtual method

The default implementation of isIncomingArgumentHandler could lead
to generating incorrect code.
Make it a pure virtual method, so that targets know they have to
override it to produce correct code.

NFC

Differential Revision: https://reviews.llvm.org/D69187

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375277 91177308-0d34-0410-b5e6-96231b3b80d8

[CVP] After proving that @llvm.with.overflow()/@llvm.sat() don't overflow, also try to prove other no-wrap

Summary:
CVP, unlike InstCombine, does not run till exaustion.
It only does a single pass.

When dealing with those special binops, if we prove that they can
safely be demoted into their usual binop form,
we do set the no-wrap we deduced. But when dealing with usual binops,
we try to deduce both no-wraps.

So if we convert e.g. @llvm.uadd.with.overflow() to `add nuw`,
we won't attempt to check whether it can be `add nuw nsw`.

This patch proposes to call `processBinOp()` on newly-created binop,
which is identical to what we do for div/rem already.

Reviewers: nikic, spatel, reames

Reviewed By: nikic

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69183

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375273 91177308-0d34-0410-b5e6-96231b3b80d8

[examples] Fix some comments in the LLJITWithJITLink example

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375269 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Relax 32-bit SGPR register class

Mostly use SReg_32 instead of SReg_32_XM0 for arbitrary values. This
will allow the register coalescer to do a better job eliminating
copies to m0.

For GlobalISel, as a terrible hack, use SGPR_32 for things that should
use SCC until booleans are solved.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375267 91177308-0d34-0410-b5e6-96231b3b80d8

[examples] Add an example of how to use JITLink and small-code-model with LLJIT.

JITLink is LLVM's newer jit-linker. It is an alternative to (and hopefully
eventually a replacement for) LLVM's older jit-linker, RuntimeDyld. Unlike
RuntimeDyld which requries JIT'd code to be complied with the large code
model, JITlink can link code compiled with the small code model, which is
the native code model for a number of targets (including all supported MachO
targets).

This example shows how to:

-- Create a JITLink InProcessMemoryManager
-- Set the code model to small
-- Use a JITLink backed ObjectLinkingLayer as the linking layer for LLJIT
(rather than the default RTDyldObjectLinkingLayer).

Note: This example will only work on platforms supported by JITLink. As of
this commit that's MachO/x86-64 and MachO/arm64.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375266 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix SMEM WAR hazard for gfx10 readlane

Summary: Hazard recognizer fails to see hazard with V_READLANE_B32_gfx10.

Reviewers: rampitec

Reviewed By: rampitec

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69172

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375265 91177308-0d34-0410-b5e6-96231b3b80d8

[lit] Reduce value of synthesized timeouts

Large timeout values (one year, positive infinity) trip up Python on
Windows with "OverflowError: timeout value is too large". One week
seems to work and is still large enough in practice.

Thanks to Simon Pilgrim for helping me test this.
https://reviews.llvm.org/rL375171

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375264 91177308-0d34-0410-b5e6-96231b3b80d8

[lit] Remove unnecessary tracking of test_index

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375263 91177308-0d34-0410-b5e6-96231b3b80d8

[lit] Only send back test result from worker process

Avoid sending back the whole run.Test object (which needs to be pickled)
from the worker process when we are only interested in the test result.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375262 91177308-0d34-0410-b5e6-96231b3b80d8

[Codegen] Link MIRParser into CodeGenTests to fix MachineSizeOptsTest building

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375261 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][CVP] Add @llvm.*.sat tests where we could prove both no-overflows

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375260 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r375254

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375256 91177308-0d34-0410-b5e6-96231b3b80d8

[PGO][PGSO] SizeOpts changes.

Summary:
(Split of off D67120)

SizeOpts/MachineSizeOpts changes for profile guided size optimization.

Reviewers: davidxl

Subscribers: mgorny, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69070

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375254 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] combineX86ShufflesRecursively - pull out isTargetShuffleVariableMask. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375253 91177308-0d34-0410-b5e6-96231b3b80d8

[IR] Reimplement FPMathOperator::classof as a whitelist.

Summary:
This makes it much easier to verify that the implementation matches the
documentation. It uncovered a bug in the unit tests where we were
accidentally setting fast math flags on a load instruction.

Reviewers: spatel, wristow, arsenm, hfinkel, aemerson, efriedma, cameron.mcinally, mcberg2017, jmolloy

Subscribers: wdng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69176

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375252 91177308-0d34-0410-b5e6-96231b3b80d8

Update docs for fast-math flags.

This adds fneg, phi and select to the list of operations that may use
fast-math flags.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375250 91177308-0d34-0410-b5e6-96231b3b80d8

Update MinidumpYAML to use minidump::Exception for exception stream

Reviewers: labath, jhenderson, clayborg, MaskRay, grimar

Reviewed By: grimar

Subscribers: lldb-commits, grimar, MaskRay, hiraditya, llvm-commits

Tags: #llvm, #lldb

Differential Revision: https://reviews.llvm.org/D68657

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375242 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU][MC][GFX10] Added sdwa/dpp versions of v_cndmask_b32

See https://bugs.llvm.org/show_bug.cgi?id=43608

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D69096

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375241 91177308-0d34-0410-b5e6-96231b3b80d8

[DFAPacketizer] Fix large compile-time regression for VLIW targets

D68992 / rL375086 refactored the packetizer and removed a bunch of logic. Unfortunately it creates an Automaton object whenever a DFAPacketizer is required. These objects have no longevity, and in particular on a debug build the population of the Automaton's transition map from the underlying table is very slow (because it is called ~10 times per MachineFunction, in the testcase I'm looking at).

This patch changes Automaton to wrap its underlying constant data in std::shared_ptr, which allows trivial copy construction. The DFAPacketizer creation function now creates a static archetypical Automaton and copies that whenever a new DFAPacketizer is required.

This takes a testcase down from ~20s to ~0.5s in debug mode.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375240 91177308-0d34-0410-b5e6-96231b3b80d8

Add ExceptionStream to llvm::Object::minidump

Summary:
This will allow updating MinidumpYAML and LLDB to use this common
definition.

Reviewers: labath, jhenderson, clayborg

Reviewed By: labath

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68656

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375239 91177308-0d34-0410-b5e6-96231b3b80d8

One more attempt to fix PS4 buildbot after r375219

PS4 buildbot seems to be dropping variable names for some reason

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375237 91177308-0d34-0410-b5e6-96231b3b80d8

Attempt to fix PS4 buildbot after r375219

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375235 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r375152 as it is causing failures on EXPENSIVE_CHECKS bot

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375233 91177308-0d34-0410-b5e6-96231b3b80d8

[SCEV] Removing deprecated comment in ScalarEvolutionExpander

Removing a comment in the ScalarEvolutionExpander.cpp file that was about the
class SCEVSDivExpr, which has been long gone from LLVM.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375232 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU][MC][GFX9] Corrected parsing of v_cndmask_b32_sdwa

See https://bugs.llvm.org/show_bug.cgi?id=43607

Reviewers: arsenm, rampitec

Differential Revision: https://reviews.llvm.org/D69095

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375231 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][CVP] Count all the no-wraps we proved

Summary:
It looks like this is the only missing statistic in the CVP pass.
Since we prove NSW and NUW separately i'd think we should count them separately too.

Reviewers: nikic, spatel, reames

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68740

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375230 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Adding support for PMMIR_EL1 register

Summary:
The PMMIR_EL1 register is present in Armv8.4 with PMU extension.
This patch adds support for it.

Reviewers: t.p.northover, dnsampaio

Reviewed By: dnsampaio

Subscribers: kristof.beyls, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68940

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375228 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE] Add SPLAT_VECTOR ISD Node

Adds a new ISD node to replicate a scalar value across all elements of
a vector. This is needed for scalable vectors, since BUILD_VECTOR cannot
be used.

Fixes up default type legalization for scalable vectors after the
new MVT type ranges were introduced.

At present I only use this node for scalable vectors. A DAGCombine has
been added to transform a BUILD_VECTOR into a SPLAT_VECTOR if all
elements are the same, but only if the default operation action of
Expand has been overridden by the target.

I've only added result promotion legalization for scalable vector
i8/i16/i32/i64 types in AArch64 for now.

Reviewers: t.p.northover, javed.absar, greened, cameron.mcinally, jmolloy

Reviewed By: jmolloy

Differential Revision: https://reviews.llvm.org/D47775

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375222 91177308-0d34-0410-b5e6-96231b3b80d8

[ThinLTOCodeGenerator] Add support for index-based WPD

Differential revision: https://reviews.llvm.org/D68950

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375219 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Don't combine callee-save and local stack adjustment when optimizing for size

For arm64, D18619 introduced the ability to combine bumping the stack pointer
upfront in case it needs to be bumped for both the callee-save area as well as
the local stack area.

That diff already remarks that "This change can cause an increase in
instructions", but argues that even when that happens, it should be still be a
performance benefit because the number of micro-ops is reduced.

We have observed that this code-size increase can be significant in practice.
This diff disables combining stack bumping for methods that are marked as
optimize-for-size.

Example of a prologue with the behavior before this diff (combining stack bumping when possible):
  sub        sp, sp, #0x40
  stp        d9, d8, [sp, #0x10]
  stp        x20, x19, [sp, #0x20]
  stp        x29, x30, [sp, #0x30]
  add        x29, sp, #0x30
  [... compute x8 somehow ...]
  stp        x0, x8, [sp]

And after this  diff, if the method is marked as optimize-for-size:
  stp        d9, d8, [sp, #-0x30]!
  stp        x20, x19, [sp, #0x10]
  stp        x29, x30, [sp, #0x20]
  add        x29, sp, #0x20
  [... compute x8 somehow ...]
  stp        x0, x8, [sp, #-0x10]!

Note that without combining the stack bump there are two auto-decrements,
nicely folded into the stp instructions, whereas otherwise there is a single
sub sp, ... instruction, but not folded.

Patch by Nikolai Tillmann!

Differential Revision: https://reviews.llvm.org/D68530

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375217 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Regenerate memcmp tests and add X64-AVX512 common prefix

Should help make the changes in D69157 clearer

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375215 91177308-0d34-0410-b5e6-96231b3b80d8

Fix MSVC "not all control paths return a value" warning. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375214 91177308-0d34-0410-b5e6-96231b3b80d8

Fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warnings. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375213 91177308-0d34-0410-b5e6-96231b3b80d8

[Codegen] Alter the default promotion for saturating adds and subs

The default promotion for the add_sat/sub_sat nodes currently does:
    ANY_EXTEND iN to iM
    SHL by M-N
    [US][ADD|SUB]SAT
    L/ASHR by M-N

If the promoted add_sat or sub_sat node is not legal, this can produce code
that effectively does a lot of shifting (and requiring large constants to be
materialised) just to use the overflow flag. It is simpler to just do the
saturation manually, using the higher bitwidth addition and a min/max against
the saturating bounds. That is what this patch attempts to do.

Differential Revision: https://reviews.llvm.org/D68926

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375211 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][SVE] Implement unpack intrinsics

Summary:
Implements the following intrinsics:
  - int_aarch64_sve_sunpkhi
  - int_aarch64_sve_sunpklo
  - int_aarch64_sve_uunpkhi
  - int_aarch64_sve_uunpklo

This patch also adds AArch64ISD nodes for UNPK instead of implementing
the intrinsics directly, as they are required for a future patch which
implements the sign/zero extension of legal vectors.

This patch includes tests for the Subdivide2Argument type added by D67549

Reviewers: sdesmalen, SjoerdMeijer, greened, rengolin, rovka

Reviewed By: greened

Subscribers: tschuett, kristof.beyls, rkruppe, psnobl, cfe-commits, llvm-commits

Differential Revision: https://reviews.llvm.org/D67550

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375210 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Fix miscompile bug in canEvaluateShuffled

Summary:
Add restrictions in canEvaluateShuffled to prevent that we for example
transform

  %0 = insertelement <2 x i16> undef, i16 %a, i32 0
  %1 = srem <2 x i16> %0, <i16 2, i16 1>
  %2 = shufflevector <2 x i16> %1, <2 x i16> undef, <2 x i32> <i32 undef, i32 0>

into

   %1 = insertelement <2 x i16> undef, i16 %a, i32 1
   %2 = srem <2 x i16> %1, <i16 undef, i16 2>

as having an undef denominator makes the srem undefined (for all
vector elements).

Fixes: https://bugs.llvm.org/show_bug.cgi?id=43689
Reviewers: spatel, lebedev.ri

Reviewed By: spatel, lebedev.ri

Subscribers: lebedev.ri, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69038

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375208 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Pre-commit of test case showing miscompile bug in canEvaluateShuffled

Adding the reproducer from  https://bugs.llvm.org/show_bug.cgi?id=43689,
showing that instcombine is doing a bad transform. It transforms

  %0 = insertelement <2 x i16> undef, i16 %a, i32 0
  %1 = srem <2 x i16> %0, <i16 2, i16 1>
  %2 = shufflevector <2 x i16> %1, <2 x i16> undef, <2 x i32> <i32 undef, i32 0>

into

   %1 = insertelement <2 x i16> undef, i16 %a, i32 1
   %2 = srem <2 x i16> %1, <i16 undef, i16 2>

The undef denominator makes the whole srem undefined.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375207 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Emit KTEST when possible

https://reviews.llvm.org/D69111

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375197 91177308-0d34-0410-b5e6-96231b3b80d8

[lit] Move resolving of XFAIL result codes out of Test.setResult

This will allow us to serialize just the result object instead of the
whole lit.Test object back from the worker to the main lit process.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375195 91177308-0d34-0410-b5e6-96231b3b80d8

[lit] worker.py: Improve code for executing a single test

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375194 91177308-0d34-0410-b5e6-96231b3b80d8

[IndVars] Factor out some common code into a utility function

As requested in review of D69009

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375191 91177308-0d34-0410-b5e6-96231b3b80d8

[Test] Precommit test for D69006

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375190 91177308-0d34-0410-b5e6-96231b3b80d8

DebugInfo: Move loclist base address from DwarfFile to DebugLocStream

There's no need to have more than one of these (there can be two
DwarfFiles - one for the .o, one for the .dwo - but only one loc/loclist
section (either in the .o or the .dwo) & certainly one per
DebugLocStream, which is currently singular in DwarfDebug)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375183 91177308-0d34-0410-b5e6-96231b3b80d8

DebugInfo: Remove unused parameter (from DwarfDebug.cpp:emitListsTableHeaderStart)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375180 91177308-0d34-0410-b5e6-96231b3b80d8

Reland [llvm-objdump] Use a counter for llvm-objdump -h instead of the section index.

This relands r374931 (reverted in r375088). It fixes 32-bit builds by using the right format string specifier for uint64_t (PRIu64) instead of `%d`.

Original description:

When listing the index in `llvm-objdump -h`, use a zero-based counter instead of the actual section index (e.g. shdr->sh_index for ELF).

While this is effectively a noop for now (except one unit test for XCOFF), the index values will change in a future patch that filters certain sections out (e.g. symbol tables). See D68669 for more context. Note: the test case in `test/tools/llvm-objdump/X86/section-index.s` already covers the case of incrementing the section index counter when sections are skipped.

Reviewers: grimar, jhenderson, espindola

Reviewed By: grimar

Subscribers: emaste, sbc100, arichardson, aheejin, arphaman, seiya, llvm-commits, MaskRay

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68848

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375178 91177308-0d34-0410-b5e6-96231b3b80d8

[Error] Make llvm::cantFail include the original error messages

Summary:
The current implementation eats the current errors and just outputs
the message parameter passed to llvm::cantFail. This change appends
the original error message(s), so the user can see exactly why
cantFail failed. New logic is conditional on NDEBUG.

Reviewed By: lhames

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69057

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375176 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] drop getIsFP td helper

We already have isFloatType helper, and they are out of sync.
Drop one and merge the type list.

Differential Revision: https://reviews.llvm.org/D69138

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375175 91177308-0d34-0410-b5e6-96231b3b80d8

[lit] Move computation of deadline up into base class

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375171 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] Add support for shell wildcards

Summary: GNU objcopy accepts the --wildcard flag to allow wildcard matching on symbol-related flags. (Note: it's implicitly true for section flags).

The basic syntax is to allow *, ?, \, and [] which work similarly to how they work in a shell. Additionally, starting a wildcard with ! causes that wildcard to prevent it from matching a flag.

Use an updated GlobPattern in libSupport to handle these patterns. It does not fully match the `fnmatch` used by GNU objcopy since named character classes (e.g. `[[:digit:]]`) are not supported, but this should support most existing use cases (mostly just `*` is what's used anyway).

Reviewers: jhenderson, MaskRay, evgeny777, espindola, alexshap

Reviewed By: MaskRay

Subscribers: nickdesaulniers, emaste, arichardson, hiraditya, jakehehrlich, abrachet, seiya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66613

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375169 91177308-0d34-0410-b5e6-96231b3b80d8

Reland "[lit] Synthesize artificial deadline"

We always want to use a deadline when calling `result.await`. Let's
synthesize an artificial deadline (now plus one year) to simplify code
and do less busy waiting.

Thanks to Reid Kleckner for diagnosing that a deadline for of "positive
infinity" does not work with Python 3 anymore. See commit:
4ff1e34b606d9a9fcfd8b8b5449a558315af94e5

I tested this patch with Python 2 and Python 3.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375165 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] add test for setcc to shift transform; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375158 91177308-0d34-0410-b5e6-96231b3b80d8

[cmake] Pass external project source directories to sub-configures

We're passing LLVM_EXTERNAL_PROJECTS to cross-compilation configures, so
we also need to pass the source directories of those projects, otherwise
configuration can fail from not finding them.

Differential Revision: https://reviews.llvm.org/D69076

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375157 91177308-0d34-0410-b5e6-96231b3b80d8

[Object] Fix the return type of getOffset/getSize

Header64.offset/Header64.size are uint64_t, thus we should not
truncate them to unit32_t. Moreover, there are a number of places
where we sum the offset and the size (e.g. in various checks in MachOUniversal.cpp),
the truncation causes issues since the offset/size can perfectly fit into uint32_t,
while the sum overflows.

Differential revision: https://reviews.llvm.org/D69126

Test plan: make check-all

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375154 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] Some more preparatory cleanup for dropRedundantMaskingOfLeftShiftInput()

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375153 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Turn on CR-Logical reducer pass

Quite a while ago, we implemented a pass that will reduce the number of
CR-logical operations we emit. It does so by converting a CR-logical operation
into a branch. We have kept this off by default because it seemed to cause a
significant regression with one benchmark.
However, that regression turned out to be due to a completely unrelated
reason - AADB introducing a self-copy that is a priority-setting nop and it was
just exacerbated by this pass.

Now that we understand the reason for the only degradation, we can turn this
pass on by default. We have long since fixed the cause for the degradation.

Differential revision: https://reviews.llvm.org/D52431

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375152 91177308-0d34-0410-b5e6-96231b3b80d8

Reapply r375051: [support] GlobPattern: add support for `\` and `[!...]`, and allow `]` in more places

Reland r375051 (reverted in r375052) after fixing lld tests on Windows in r375126 and r375131.

Original description: Update GlobPattern in libSupport to handle a few more cases. It does not fully match the `fnmatch` used by GNU objcopy since named character classes (e.g. `[[:digit:]]`) are not supported, but this should support most existing use cases (mostly just `*` is what's used anyway).

This will be used to implement the `--wildcard` flag in llvm-objcopy to be more compatible with GNU objcopy.

This is split off of D66613 to land the libSupport changes separately. The llvm-objcopy part will land soon.

Reviewers: jhenderson, MaskRay, evgeny777, espindola, alexshap

Reviewed By: MaskRay

Subscribers: nickdesaulniers, emaste, arichardson, hiraditya, jakehehrlich, abrachet, seiya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66613

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375149 91177308-0d34-0410-b5e6-96231b3b80d8

NFC: Fix variable only used in asserts by propagating the value.

Summary:
This fixes builds with assertions disabled that would otherwise
fail with unused variable warnings

Subscribers: nemanjai, hiraditya, kbarton, MaskRay, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69123

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375148 91177308-0d34-0410-b5e6-96231b3b80d8

Revert [lit] Synthesize artificial deadline

Python on Windows raises this OverflowError:
gotit = waiter.acquire(True, timeout)
OverflowError: timestamp too large to convert to C _PyTime_t

So it seems this API behave the same way on every OS.

Also reverts the dependent commit a660dc590a5e8dafa1ba6ed56447ede151d17bd9.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375143 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] add tests for popcount with zext; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375142 91177308-0d34-0410-b5e6-96231b3b80d8

[IndVars] Split loop predication out of optimizeLoopExits [NFC]

In the process of writing D69009, I realized we have two distinct sets of invariants within this single function, and basically no shared logic. The optimize loop exit transforms (including the new one in D69009) only care about *analyzeable* exits. Loop predication, on the other hand, has to reason about *all* exits. At the moment, we have the property (due to the requirement for an exact btc) that all exits are analyzeable, but that will likely change in the future as we add widenable condition support.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375138 91177308-0d34-0410-b5e6-96231b3b80d8

[codeview] Workaround for PR43479, don't re-emit instr labels

Summary:
In the long run we should come up with another mechanism for marking
call instructions as heap allocation sites, and remove this workaround.
For now, we've had two bug reports about this, so let's apply this
workaround. SLH (the other client of instruction labels) probably has
the same bug, but the solution there is more likely to be to mark the
call instruction as not duplicatable, which doesn't work for debug info.

Reviewers: akhuang

Subscribers: aprantl, hiraditya, aganea, chandlerc, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69068

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375137 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC][InstCombine] Tests for "fold variable mask before variable shift-of-trunc" (PR42563)

https://bugs.llvm.org/show_bug.cgi?id=42563

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375135 91177308-0d34-0410-b5e6-96231b3b80d8

[IndVars] Factor out a helper function for readability [NFC]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375133 91177308-0d34-0410-b5e6-96231b3b80d8

[lit] Move computation of deadline up into base class

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375130 91177308-0d34-0410-b5e6-96231b3b80d8