granicus.if.org Git

[NewGVN] NFC fixes

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290679 91177308-0d34-0410-b5e6-96231b3b80d8

[WinEH] Don't assume endFunction is called while in .text

Jump table emission can switch to .rdata before
WinException::endFunction gets called. Just remember the appropriate
text section we started in and reset back to it when we end the
function. We were already switching sections back from .xdata anyway.

Fixes the first problem in PR31488, so that now COFF switch tables can
live in .rdata if we want them to.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290678 91177308-0d34-0410-b5e6-96231b3b80d8

[NewGVN] Global sweep replacing NULL with nullptr. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290670 91177308-0d34-0410-b5e6-96231b3b80d8

[NewGVN] Remove redundant code. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290669 91177308-0d34-0410-b5e6-96231b3b80d8

[NewGVN] equals() for loads/stores is the same. Unify.

Differential Revision: https://reviews.llvm.org/D28116

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290667 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Introduce a devirtualization iteration layer for the new PM.

This is an orthogonal and separated layer instead of being embedded
inside the pass manager. While it adds a small amount of complexity, it
is fairly minimal and the composability and control seems worth the
cost.

The logic for this ends up being nicely isolated and targeted. It should
be easy to experiment with different iteration strategies wrapped around
the CGSCC bottom-up walk using this kind of facility.

The mechanism used to track devirtualization is the simplest one I came
up with. I think it handles most of the cases the existing iteration
machinery handles, but I haven't done a *very* in depth analysis. It
does however match the basic intended semantics, and we can tweak or
tune its exact behavior incrementally as necessary. One thing that we
may want to revisit is freshly building the value handle set on each
iteration. While I don't think this will be a significant cost (it is
strictly fewer value handles but more churn of value handes than the old
call graph), it is conceivable that we'll want a somewhat more clever
tracking mechanism. My hope is to layer that on as a follow up patch
with data supporting any implementation complexity it adds.

This code also provides for a basic count heuristic: if the number of
indirect calls decreases and the number of direct calls increases for
a given function in the SCC, we assume devirtualization is responsible.
This matches the heuristics currently used in the legacy pass manager.

Differential Revision: https://reviews.llvm.org/D23114

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290665 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Teach the CGSCC's CG update utility to more carefully invalidate
analyses when we're about to break apart an SCC.

We can't wait until after breaking apart the SCC to invalidate things:
1) Which SCC do we then invalidate? All of them?
2) Even if we invalidate all of them, a newly created SCC may not have
a proxy that will convey the invalidation to functions!

Previously we only invalidated one of the SCCs and too late. This led to
stale analyses remaining in the cache. And because the caching strategy
actually works, they would get used and chaos would ensue.

Doing invalidation early is somewhat pessimizing though if we *know*
that the SCC structure won't change. So it turns out that the design to
make the mutation API force the caller to know the *kind* of mutation in
advance was indeed 100% correct and we didn't do enough of it. So this
change also splits two cases of switching a call edge to a ref edge into
two separate APIs so that callers can clearly test for this and take the
easy path without invalidating when appropriate. This is particularly
important in this case as we expect most inlines to be between functions
in separate SCCs and so the common case is that we don't have to so
aggressively invalidate analyses.

The LCG API change in turn needed some basic cleanups and better testing
in its unittest. No interesting functionality changed there other than
more coverage of the returned sequence of SCCs.

While this seems like an obvious improvement over the current state, I'd
like to revisit the core concept of invalidating within the CG-update
layer at all. I'm wondering if we would be better served forcing the
callers to handle the invalidation beforehand in the cases that they
can handle it. An interesting example is when we want to teach the
inliner to *update and preserve* analyses. But we can cross that bridge
when we get there.

With this patch, the new pass manager an build all of the LLVM test
suite at -O3 and everything passes. =D I haven't bootstrapped yet and
I'm sure there are still plenty of bugs, but this gives a nice baseline
so I'm going to increasingly focus on fleshing out the missing
functionality, especially the bits that are just turned off right now in
order to let us establish this baseline.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290664 91177308-0d34-0410-b5e6-96231b3b80d8

This is a large patch for X86 AVX-512 of an optimization for reducing code size by encoding EVEX AVX-512 instructions using the shorter VEX encoding when possible.

There are cases of AVX-512 instructions that have two possible encodings. This is the case with instructions that use vector registers with low indexes of 0 - 15 and do not use the zmm registers or the mask k registers.
The EVEX encoding prefix requires 4 bytes whereas the VEX prefix can take only up to 3 bytes. Consequently, using the VEX encoding for these instructions results in a code size reduction of ~2 bytes even though it is compiled with the AVX-512 features enabled.

Reviewers: Craig Topper, Zvi Rackoover, Elena Demikhovsky
Differential Revision: https://reviews.llvm.org/D27901

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290663 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Teach the inliner's call graph update to handle inserting new edges
when they are call edges at the leaf but may (transitively) be reached
via ref edges.

It turns out there is a simple rule: insert everything as a ref edge
which is a safe conservative default. Then we let the existing update
logic handle promoting some of those to call edges.

Note that it would be fairly cheap to make these call edges right away
if that is desirable by testing whether there is some existing call path
from the source to the target. It just seemed like slightly more
complexity in this code path that isn't strictly necessary. If anyone
feels strongly about handling this differently I'm happy to change it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290649 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Remove a piece of a comment that said that InstCombiner contains pass infrastructure. That hasn't been true since r226618. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290648 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Actually commit the test update that was supposed to accompany
r290644. Sorry for this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290646 91177308-0d34-0410-b5e6-96231b3b80d8

[LCG] Teach the ref edge removal to handle a ref edge that is trivial
due to a call cycle.

This actually crashed the ref removal before.

I've added a unittest that covers this kind of interesting graph
structure and mutation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290645 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Disable the loop vectorizer from the new PM's pipeline as it
currenty relies on the old PM's dependency system forming LCSSA.

The new PM will require a different design for this, and for now this is
causing most of the issues I'm currently seeing in testing. I'd like to
get to a testable baseline and then work on re-enabling things one at
a time.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290644 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Canonicalize insert splat sequences into an insert + shuffle

This adds a combine that canonicalizes a chain of inserts which broadcasts
a value into a single insert + a splat shufflevector.

This fixes PR31286.

Differential Revision: https://reviews.llvm.org/D27992

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290641 91177308-0d34-0410-b5e6-96231b3b80d8

[libFuzzer] add an experimental flag -experimental_len_control=1 that sets max_len to 1M and tries to increases the actual max sizes of mutations very gradually (second attempt)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290637 91177308-0d34-0410-b5e6-96231b3b80d8

Mark comparator call operator as const

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290636 91177308-0d34-0410-b5e6-96231b3b80d8

[libFuzzer] don't create large random mutations when given an empty seed

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290634 91177308-0d34-0410-b5e6-96231b3b80d8

[sanitizer-coverage] sort the switch cases

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290628 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-readobj: ELF: Make DT tags machine aware

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290623 91177308-0d34-0410-b5e6-96231b3b80d8

[libFuzzer] fix UB and simplify the computation of the RNG seed (https://llvm.org/bugs/show_bug.cgi?id=31456)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290622 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Teach MemDep to invalidate its result object when its cached
analysis handles become invalid.

Add a test case for its invalidation logic.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290620 91177308-0d34-0410-b5e6-96231b3b80d8

DebugInfo: add explicit casts for -Wqual-cast

Fix a warning detected by gcc 6:
warning: cast from type 'const void*' to type 'uint8_t* {aka unsigned char*}' casts away qualifiers [-Wcast-qual]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290618 91177308-0d34-0410-b5e6-96231b3b80d8

ASMParser: use range-based for loops (NFC)

Convert the verify method to use a few more range based for loops,
converting to const iterators in the process.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290617 91177308-0d34-0410-b5e6-96231b3b80d8

test: modernise ARM CodeGen tests

Replace the use of grep with FileCheck. Tidy up some of the tests. A
few of the tests have been left as weak as previously, though some have
been made more stringent.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290616 91177308-0d34-0410-b5e6-96231b3b80d8

[NewGVN] Simplify a bit removing else after return. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290615 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Remove a pointless optimization.

There is no need to do this within an analysis. That method shouldn't
even be reached if this predicate holds as the actual useful
optimization is in the analysis manager itself.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290614 91177308-0d34-0410-b5e6-96231b3b80d8

Attempt to make the Windows bots green after r290609.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290613 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Add more dedicated testing to cover the invalidation logic added to
BasicAA in r290603.

I've kept the basic testing in the new PM test file as that also covers
the AAManager invalidation logic. If/when there is a good place for
broader AA testing it could move there.

This test is somewhat unsatisfying as I can't get it to fail even with
ASan outside of explicit checks of the invalidation. Apparently we don't
yet have any test coverage of the BasicAA code paths using either the
domtree or loopinfo -- I made both of them always be null and check-llvm
passed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290612 91177308-0d34-0410-b5e6-96231b3b80d8

[MemCpyOpt] Don't sink LoadInst below possible clobber.

Differential Revision: https://reviews.llvm.org/D26811

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290611 91177308-0d34-0410-b5e6-96231b3b80d8

[ThinLTO] Fix "||" vs "|" mixup.

The effect of the bug was that we would incorrectly create summaries
for global and weak values defined in module asm (since we were
essentially testing for bit 1 which is SF_Undefined, and the
RecordStreamer ignores local undefined references). This would have
resulted in conservatively disabling importing of anything referencing
globals and weaks defined in module asm. Added these cases to the test
which now fails without this bug fix.

Fixes PR31459.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290610 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64][AsmParser] Add support for parsing shift/extend operands with symbols.

Differential Revision: https://reviews.llvm.org/D27953

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290609 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU][llvm-mc] Predefined symbols to access register counts (.kernel.{v|s}gpr_count)

The feature allows for conditional assembly, filling the entries
of .amd_kernel_code_t etc.

Symbols are defined with value 0 at the beginning of each kernel scope.
After each register usage, the respective symbol is set to:
value = max( value, ( register index + 1 ) )
Thus, at the end of scope the value represents a count of used registers.

Kernel scopes begin at .amdgpu_hsa_kernel directive, end at the
next .amdgpu_hsa_kernel (or EOF, whichever comes first). There is also
dummy scope that lies from the beginning of source file til the
first .amdgpu_hsa_kernel.

Test added.

Differential Revision: https://reviews.llvm.org/D27859

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290608 91177308-0d34-0410-b5e6-96231b3b80d8

[MemDep] Operand visited twice bugfix

Because operand was not marked as seen it was visited twice.
It doesn't change behavior of optimization, it just saves redudant
visit, so no test changes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290607 91177308-0d34-0410-b5e6-96231b3b80d8

RuntimeDyldELF: refactor AArch64 relocations. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290606 91177308-0d34-0410-b5e6-96231b3b80d8

Fix unit test in NDEBUG build

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290604 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Teach BasicAA how to invalidate its result object.

This requires custom handling because BasicAA caches handles to other
analyses and so it needs to trigger indirect invalidation.

This fixes one of the common crashes when using the new PM in real
pipelines. I've also tweaked a regression test to check that we are at
least handling the most immediate case.

I'm going to work at re-structuring this test some to both scale better
(rather than all being in one file) and check more invalidation paths in
a follow-up commit, but I wanted to get the basic bug fix in place.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290603 91177308-0d34-0410-b5e6-96231b3b80d8

Attempt to fix build bot after r290597

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290602 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Disable more of the loop passes -- LCSSA and LoopSimplify are also
not really wired into the loop pass manager in a way that will let us
productively use these passes yet.

This lets the new PM get farther in basic testing which is useful for
establishing a good baseline of "doesn't explode". There are still
plenty of crashers in basic testing though, this just gets rid of some
noise that is well understood and not representing a specific or narrow
bug.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290601 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Assembler: support SDWA and DPP for VOP2b instructions

Reviewers: nhaustov, artem.tamazov, vpykhtin, tstellarAMD

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D28051

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290599 91177308-0d34-0410-b5e6-96231b3b80d8

RuntimeDyldELF: add R_AARCH64_ADD_ABS_LO12_NC reloc

Differential revision: https://reviews.llvm.org/D28115

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290598 91177308-0d34-0410-b5e6-96231b3b80d8

Allow setting multiple debug types

Differential revision: https://reviews.llvm.org/D28109

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290597 91177308-0d34-0410-b5e6-96231b3b80d8

Change a std::vector to SmallVector in NewGVN

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290596 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Teach the AAManager and AAResults layer (the worst offender for
inter-analysis dependencies) to use the new invalidation infrastructure.

This teaches it to invalidate itself when any of the peer function
AA results that it uses become invalid. We do this by just tracking the
originating IDs. I've kept it in a somewhat clunky API since some users
of AAResults are outside the new PM right now. We can clean this API up
if/when those users go away.

Secondly, it uses the registration on the outer analysis manager proxy
to trigger deferred invalidation when a module analysis result becomes
invalid.

I've included test cases that specifically try to trigger use-after-free
in both of these cases and they would crash or hang pretty horribly for
me even without ASan. Now they work nicely.

The `InvalidateAnalysis` utility pass required some tweaking to be
useful in this context and it still is pretty garbage. I'd like to
switch it back to the previous implementation and teach the explicit
invalidate method on the AnalysisManager to take care of correctly
triggering indirect invalidation, but I wanted to go ahead and send this
out so folks could see how all of this stuff works together in practice.
And, you know, that it does actually work. =]

Differential Revision: https://reviews.llvm.org/D27205

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290595 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Introduce the facilities for registering cross-IR-unit dependencies
that require deferred invalidation.

This handles the other real-world invalidation scenario that we have
cases of: a function analysis which caches references to a module
analysis. We currently do this in the AA aggregation layer and might
well do this in other places as well.

Since this is relative rare, the technique is somewhat more cumbersome.
Analyses need to register themselves when accessing the outer analysis
manager's proxy. This proxy is already necessarily present to allow
access to the outer IR unit's analyses. By registering here we can track
and trigger invalidation when that outer analysis goes away.

To make this work we need to enhance the PreservedAnalyses
infrastructure to support a (slightly) more explicit model for "sets" of
analyses, and allow abandoning a single specific analyses even when
a set covering that analysis is preserved. That allows us to describe
the scenario of preserving all Function analyses *except* for the one
where deferred invalidation has triggered.

We also need to teach the invalidator API to support direct ID calls
instead of always going through a template to dispatch so that we can
just record the ID mapping.

I've introduced testing of all of this both for simple module<->function
cases as well as for more complex cases involving a CGSCC layer.

Much like the previous patch I've not tried to fully update the loop
pass management layer because that layer is due to be heavily reworked
to use similar techniques to the CGSCC to handle updates. As that
happens, we'll have a better testing basis for adding support like this.

Many thanks to both Justin and Sean for the extensive reviews on this to
help bring the API design and documentation into a better state.

Differential Revision: https://reviews.llvm.org/D27198

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290594 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Turn on the new PM's inliner in addition to the current one for
most of the inliner test cases.

The inliner involves a bunch of interesting code and tends to be where
most of the issues I've seen experimenting with the new PM lie. All of
these test cases pass, but I'd like to keep some more thorough coverage
here so doing a fairly blanket enabling.

There are a handful of interesting tests I've not enabled yet because
they're focused on the always inliner, or on functionality that doesn't
(yet) exist in the inliner.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290592 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Add all forms of VPALIGNR, VALIGND, and VALIGNQ to the load folding tables.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290591 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Add one of the features left out of the initial inliner patch:
skipping indirectly recursive inline chains.

To do this, we implicitly build an inline stack for each callsite and
check prior to inlining that doing so would not form a cycle. This uses
the exact same technique and even shares some code with the legacy PM
inliner.

This solution remains deeply unsatisfying to me because it means we
cannot actually iterate the inliner externally. Doing so would not be
able to easily detect and avoid such cycles. Some day I would very much
like to have a solution that works without this internal state to detect
cycles, but this is not that day.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290590 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Wire up another test to the new pass manager.

Nothing really interesting here, but I had to improve the test to use
variables rather than hard coding value names as we happen to end up
with different value names in the new PM.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290589 91177308-0d34-0410-b5e6-96231b3b80d8

[Analysis] Ignore `nobuiltin` on `allocsize` function calls.

We currently ignore the `allocsize` attribute on functions calls with
the `nobuiltin` attribute when trying to lower `@llvm.objectsize`. We
shouldn't care about `nobuiltin` here: `allocsize` is explicitly added
by the user, not inferred based on a function's symbol.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290588 91177308-0d34-0410-b5e6-96231b3b80d8

[Analysis] Refactor as promised in r290397.

This also makes us no longer check for `allocsize` on intrinsic calls.
This shouldn't matter, since intrinsics should provide the information
we get from `allocsize` on their own.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290585 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Remove masked pmuldq and pmuludq intrinsics and autoupgrade them to unmasked intrinsics plus a select.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290583 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine][X86] Add DemandedElts support for 512-bit PMULDQ/PMULUDQ instructions

PMULDQ/PMULUDQ vXi64 instructions only use the even numbered v2Xi32 input elements which SimplifyDemandedVectorElts should try and use.

This builds on r290554 which added supported for 128 and 256-bit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290582 91177308-0d34-0410-b5e6-96231b3b80d8

[LCG] Teach the LazyCallGraph to handle visiting the blockaddress
constant expression and to correctly form function reference edges
through them without crashing because one of the operands (the
`BasicBlock` isn't actually a constant despite being an operand of
a constant).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290581 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Add 512-bit unmasked intrinsics for pmuldq and pmuludq so we can add them to InstCombine with the 128 and 256 bit versions.

The 128 and 256 bit masked intrinsics are currently unused by clang. The sse and avx2 unmasked intrinsics are used instead. The new 512-bit intrinsic will be used to do the same. Then all masked versions will removed and autoupgraded.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290573 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Teach the inliner in the new PM to merge attributes after inlining.

Also enable the new PM in the attributes test case which caught this
issue.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290572 91177308-0d34-0410-b5e6-96231b3b80d8

[Inliner] Modernize all of the inliner tests that were using grep.

This mostly involved converting from grep to FileCheck and tidying up
the IR used.

In one case (invoke_test-3.ll) the test had become completely pointless
as we use 'resume' rather than 'unwind' now, and even then it did not
occur at the end of the line.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290570 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512][InstCombine] Teach InstCombine to turn masked scalar add/sub/mul/div with rounding intrinsics into normal IR operations if the rounding mode is CUR_DIRECTION.

An earlier commit added support for unmasked scalar operations. At that time isel wouldn't generate an optimal sequence for masked operations, but that has now been fixed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290566 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine][AVX-512] Add masked scalar add/sub/mul/div intrinsic test cases that don't have a CUR_DIRECTION rounding mode.

The CUR_DIRECTION case will be optimized in a future commit so this provides coverage for the other cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290565 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Add isel patterns to turn native masked scalar add/sub/mul/div into masked instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290564 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Add tests to show missed opportunities for combining masking with scalar arithmetic operations.

These particular sequences will be generated after a future change to teach InstCombine to turn masked scalar arithmetic intrinsics into native IR.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290563 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Move the collection of call sites to a more appropriate place
inside of `InlineFunction`. Prior to this, call instructions are
specifically being rewritten and replaced within the inlined region,
invalidating some of the call sites.

Several of these regions are using the same technique to walk the
inlined region so this seems clearly safe up to this point.

I've also added a short circuit to the scan for call sites based on what
other code is doing.

With this, the most common crash I've found in the new inliner code is
fixed. I've turned it on for another test case that covers this
scenario.

I'll make my way through most of the other inliner test cases
just to get some easy coverage next.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290562 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512][InstCombine] Teach InstCombine to turn packed add/sub/mul/div with rounding intrinsics into normal IR operations if the rounding mode is CUR_DIRECTION.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290559 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Teach the always inliner in the new pass manager to support
removing fully-dead comdats without removing dead entries in comdats
with live members.

This factors the core logic out of the current inliner's internals to
a reusable utility and leverages that in both places. The factored out
code should also be (minorly) more efficient in cases where we have very
few dead functions or dead comdats to consider.

I've added a test case to cover this behavior of the always inliner.
This is the last significant bug in the new PM's always inliner I've
found (so far).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290557 91177308-0d34-0410-b5e6-96231b3b80d8

[doc] Add mention of the difference in optimization level between Release and RelWithDebInfo in Cmake.rst

This is surprising to many people.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290556 91177308-0d34-0410-b5e6-96231b3b80d8

[ADT] Add an llvm::erase_if utility to make the standard erase+remove_if
pattern easier to write.

Differential Revision: https://reviews.llvm.org/D28120

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290555 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine][X86] Add DemandedElts support for PMULDQ/PMULUDQ instructions

PMULDQ/PMULUDQ vXi64 instructions only use the even numbered v2Xi32 input elements which SimplifyDemandedVectorElts should try and use.

Differential Revision: https://reviews.llvm.org/D28119

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290554 91177308-0d34-0410-b5e6-96231b3b80d8

[ADT] Add a boring std::partition wrapper similar to our std::remove_if
wrapper.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290553 91177308-0d34-0410-b5e6-96231b3b80d8

clang-format NewGVN files

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290551 91177308-0d34-0410-b5e6-96231b3b80d8

Misc cleanups and simplifications for NewGVN.
Mostly use a bit more idiomatic C++ where we can,
so we can combine some things later.

Reviewers: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28111

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290550 91177308-0d34-0410-b5e6-96231b3b80d8

Don't use our own incorrect version of isTriviallyDeadInstruction in NewGVN. Fixes PR/31472

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290549 91177308-0d34-0410-b5e6-96231b3b80d8

[NewGVN] Add a flag to enable the pass via `-mllvm`.

NewGVN can be tested passing `-mllvm -enable-newgvn` to clang.

Differential Revision: https://reviews.llvm.org/D28059

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290548 91177308-0d34-0410-b5e6-96231b3b80d8

[NewGVN] Change test to reflect difference between GVN and NewGVN.

The current GVN algorithm folds unconditional branches to, it claims,
expose more PRE oportunities. The folding, if really needed,
(which is not sure, as it's not really proved it improves analysis)
can be done by an earlier cleanup pass instead of GVN itself.
Ack'ed/SGTM'd by Daniel Berlin.

Differential Revision: https://reviews.llvm.org/D28117

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290546 91177308-0d34-0410-b5e6-96231b3b80d8

Wdocumentation fix

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290545 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX512] Added v64i8 reverse shuffle test (PR31470)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290544 91177308-0d34-0410-b5e6-96231b3b80d8

[NewGVN] Fold lookupOperandLeader() when there's only one use. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290543 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombiner] Simplify lib calls to `round{,f}`

Differential Revision: https://reviews.llvm.org/D28110

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290542 91177308-0d34-0410-b5e6-96231b3b80d8

Test the different scenarios of GlobalDCE and comdats more
systematically and document in the test what all is going on.

This replaces the PR-named test that was the only coverage for GlobalDCE
and comdats previously. I wrote this because I wasn't certain how
comdat DCE was supposed to work and wanted to step through what
GlobalDCE did to fully understand it. After talking to folks and reading
the code and really staring at things it all makes sense but it seemed
good to help write down some of this in a more explicit and fully
covering test case.

For example, it seemed like a bug that GlobalDCE didn't consider comdat
participation of ifuncs. Specifically it seemed like an accident because
testing didn't really cover that case. But in fact, ifuncs specifically
cannot participate in a comdat despite having that API. The new test
case covers this and explicitly documents that DCE gets to fire here
even though there are comdats involved.

Also, we didn't have any positive tests for the challenging cases such
as usage cycles between comdat participants that might make them seem
alive except that there is no external edge into the cycle.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290537 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Fix some patterns to use extended register classes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290536 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512][InstCombine] Teach InstCombine to turn scalar add/sub/mul/div with rounding intrinsics into normal IR operations if the rounding mode is CUR_DIRECTION.

Summary:
I only do this for unmasked cases for now because isel is failing to fold the mask. I'll try to fix that soon.

I'll do the same thing for packed add/sub/mul/div in a future patch.

Reviewers: delena, RKSimon, zvi, craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27879

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290535 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512] Don't assume that the rounding mode argument to intrinsics is a constant. While clang will guarantee this, nothing in the backend will.

A non-constant value will now result in an isel error instead of just asserting or crashing due to a bad cast during lowering.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290532 91177308-0d34-0410-b5e6-96231b3b80d8

Fix some bad indentation that I or another introduced somehow.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290531 91177308-0d34-0410-b5e6-96231b3b80d8

[AVX-512][InstCombine] Teach InstCombine to converted masked vpermv intrinsics into shufflevector instructions

Summary:
This patch adds support for converting the masked vpermv intrinsics into shufflevector instructions if the indices are constants.

We also need to wrap a select instruction around the shuffle to take care of the masking part. InstCombine will take care of optimizing the select if the mask is constant so I didn't bother checking for that.

Reviewers: zvi, delena, spatel, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27825

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290530 91177308-0d34-0410-b5e6-96231b3b80d8

Fix `update_test_checks.py` bug that incorrectly truncates IR body.

Differential Revision: https://reviews.llvm.org/D26619

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290529 91177308-0d34-0410-b5e6-96231b3b80d8

[ADT] Add a generic concatenating iterator and range (take 2).

This recommits r290512 that was reverted when MSVC failed to compile it. Since
then I've played with various approaches using rextester.com (where I was able
to reproduce the failure) and think that I have a solution thanks in part to
the help of Dave Blaikie! It seems MSVC just has a defective `decltype` in this
version. Manually writing out the type seems to do the trick, even though it is
.... quite complicated.

Original commit message:
This allows both defining convenience iterator/range accessors on types
which walk across N different independent ranges within the object, and
more direct and simple usages with range based for loops such as shown
in the unittest. The same facilities are used for both. They end up
quite small and simple as it happens.

I've also switched an iterator on `Module` to use this. I would like to
add another convenience iterator that includes even more sequences as
part of it and seeing this one already present motivated me to actually
abstract it away and introduce a general utility.

Differential Revision: https://reviews.llvm.org/D28093

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290528 91177308-0d34-0410-b5e6-96231b3b80d8

[MemorySSA] Define a restricted upward AccessList splice.

Differential Revision: https://reviews.llvm.org/D26661

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290527 91177308-0d34-0410-b5e6-96231b3b80d8

[AliasAnalysis] Teach BasicAA about memcpy.

Differential Revision: https://reviews.llvm.org/D27034

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290526 91177308-0d34-0410-b5e6-96231b3b80d8

Value number stores and memory states so we can detect when memory states are equivalent (IE store of same value to memory).

Reviewers: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D28084

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290525 91177308-0d34-0410-b5e6-96231b3b80d8

Rename GVNExpression *ops_ members to *op_* to match conventions in the rest of LLVM

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290524 91177308-0d34-0410-b5e6-96231b3b80d8

[Orc][RPC] Add a ParallelCallGroup utility for dispatching and waiting on
multiple asynchronous RPC calls.

ParallelCallGroup allows multiple asynchronous calls to be dispatched,
and provides a wait method that blocks until all asynchronous calls have
been executed on the remote and all return value handlers run on the
local machine.

This will allow, for example, the JIT client to issue memory allocation calls
for all sections in parallel, then block until all memory has been allocated
on the remote and the allocated addresses registered with the client, at which
point the JIT client can proceed to applying relocations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290523 91177308-0d34-0410-b5e6-96231b3b80d8

[Orc][RPC] Clang-format RPCUtils header.

Some of the recent RPC call type-checking changes weren't formatted prior to
commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290520 91177308-0d34-0410-b5e6-96231b3b80d8

Add newline to end of file to quiet warnings.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290519 91177308-0d34-0410-b5e6-96231b3b80d8

revert commit 290516

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290517 91177308-0d34-0410-b5e6-96231b3b80d8

Commit try added new empty line

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290516 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] Added support for Checksum debug info feature.

Differential Revision: https://reviews.llvm.org/D27642

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290514 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r290512: [ADT] Add a generic concatenating iterator and range.

This code doesn't work on MSVC for reasons that elude me and I've not
yet covinced a workaround to compile cleanly so reverting for now while
I play with it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290513 91177308-0d34-0410-b5e6-96231b3b80d8

[ADT] Add a generic concatenating iterator and range.

This allows both defining convenience iterator/range accessors on types
which walk across N different independent ranges within the object, and
more direct and simple usages with range based for loops such as shown
in the unittest. The same facilities are used for both. They end up
quite small and simple as it happens.

I've also switched an iterator on `Module` to use this. I would like to
add another convenience iterator that includes even more sequences as
part of it and seeing this one already present motivated me to actually
abstract it away and introduce a general utility.

Differential Revision: https://reviews.llvm.org/D28093

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290512 91177308-0d34-0410-b5e6-96231b3b80d8

MetadataLoader: replace the tracking of ForwardReferences and UnresolvedNodes with a set-based solution (NFC)

This makes it explicit what is the exact list to handle, and it
looks much more easy to manipulate and understand that the
previous custom tracking of min/max to express the range where
to look for.

Differential Revision: https://reviews.llvm.org/D28089

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290507 91177308-0d34-0410-b5e6-96231b3b80d8

MetadataLoader: add an extra assertion in Placeholders flush (NFC)

We don't expect any forward reference at this point.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290506 91177308-0d34-0410-b5e6-96231b3b80d8

Add range iterator for blocks in MemoryPhi

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290504 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine][X86] Add tests showing missed opportunities to simplify PMULUDQ/PMULDQ inputs.

PMULUDQ/PMULDQ - only the even elements (0, 2, 4, 6) of the vXi32 inputs are required.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290502 91177308-0d34-0410-b5e6-96231b3b80d8