[PGO] Preserve GlobalsAA in pgo-memop-opt pass.

Preserve GlobalsAA analysis in memory intrinsic calls optimization based on
profiled size.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299707 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-extract] Add option for recursive extraction

Summary:
Particularly, with --delete, this can be very useful for testing
new optimizations on some hotspots, without having to run it on the whole
application. E.g. as such:
```
llvm-extract app.bc --recursive --rfunc .*hotspot.* > hotspot.bc
llvm-extract app.bc --recursive --delete --rfunc .*hotspot.* > residual.bc
llc -filetype=obj residual.bc > residual.o
llc -filetype=obj hotspot.bc > hotspot.o
cc -o app residual.o hotspot.o
```

Reviewed By: davide
Differential Revision: https://reviews.llvm.org/D31722

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299706 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Remove redundant combine from visitAnd

This combine is fully handled by SimplifyDemandedInstructionBits as of r299658 where I fixed this code to ensure the Add/Sub had only a single user. Otherwise it would fire and create additional instructions. That fix resulted in an improvement to code generated for tsan which is why I committed it before deleting.

Differential Revision: https://reviews.llvm.org/D31543

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299704 91177308-0d34-0410-b5e6-96231b3b80d8

[BFIterator] Remove an assertion that doesn't hold. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299703 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "Turn some C-style vararg into variadic templates"

This reverts commit r299699; the examples need to be updated.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299702 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] [ARM CodeGen] Fix chain information of LowerMUL

In LowerMUL, the chain information is not preserved for the newly
created Load SDNodes.

For example, suppose a Store aliases one of the operands of the Mul.
The Load for that operand needs to be scheduled before the Store.
The dependence is recorded in the chain of the Store, in a TokenFactor.
However, when lowering MUL, the SDNodes for the new Loads for
VMULL are not updated in the TokenFactor for the Store. Thus the
chain is not preserved for the lowered VMULL.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299701 91177308-0d34-0410-b5e6-96231b3b80d8

Turn some C-style vararg into variadic templates

Module::getOrInsertFunction is using C-style vararg instead of
variadic templates.

From a user perspective, it forces the use of an annoying nullptr
to mark the end of the vararg, and there's no type checking on the
arguments. The variadic template is an obvious solution to both
issues.

Patch by: Serge Guelton <serge.guelton@telecom-bretagne.eu>

Differential Revision: https://reviews.llvm.org/D31070

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299699 91177308-0d34-0410-b5e6-96231b3b80d8

[asan] Fix dead stripping of globals on Linux.

Use a combination of !associated, comdat, @llvm.compiler.used and
custom sections to allow dead stripping of globals and their asan
metadata. Sometimes.

Currently this works on LLD, which supports SHF_LINK_ORDER with
sh_link pointing to the associated section.

This also works on BFD, which seems to treat comdats as
all-or-nothing with respect to linker GC. There is a weird quirk
where the "first" global in each link is never GC-ed because of the
section symbols.

At this moment it does not work on Gold (as in the globals are never
stripped).

This is a re-land of r298158 rebased on D31358. This time,
asan.module_ctor is put in a comdat as well to avoid quadratic
behavior in Gold.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299697 91177308-0d34-0410-b5e6-96231b3b80d8

[asan] Put ctor/dtor in comdat.

When possible, put ASan ctor/dtor in comdat.

The only reason not to is global registration, which can be
TU-specific. This is not the case when there are no instrumented
globals. This is also limited to ELF targets, because MachO does
not have comdat, and COFF linkers may GC comdat constructors.

The benefit of this is a lot less __asan_init() calls: one per DSO
instead of one per TU. It's also necessary for the upcoming
gc-sections-for-globals change on Linux, where multiple references to
section start symbols trigger quadratic behaviour in gold linker.

This is a rebase of r298756.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299696 91177308-0d34-0410-b5e6-96231b3b80d8

[asan] Delay creation of asan ctor.

Create the constructor in the module pass.
This is needed for the GC-friendly globals change, where the constructor can be
put in a comdat in some cases, but we don't know about that in the function
pass.

This is a rebase of r298731 which was reverted due to a false alarm.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299695 91177308-0d34-0410-b5e6-96231b3b80d8

Bitcode: Do not create FNENTRYs for aliases of functions.

There doesn't seem to be any point in doing this.

Differential Revision: https://reviews.llvm.org/D31691

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299694 91177308-0d34-0410-b5e6-96231b3b80d8

[StripDeadDebugInfo] Drop dead CUs entirely

Summary:
Prior to this, while it would delete the dead DIGlobalVariables, it would
leave dead DICompileUnits and everything referenced therefrom. For a big
bitcode file with thousands of compile units, those dead nodes easily
outnumbered the real ones. Clean that up.

Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D31720

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299692 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Temporarily change constant address space from 4 to 2

Our final address space mapping will make the constant address space 4, to match nvptx.
For now, however, we make it 2 to avoid unnecessary work in FE/BE/devlib
on intrinsics returning constant pointers.

Differential Revision: https://reviews.llvm.org/D31770

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299690 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[ARM] Add Kryo to available targets"

This reverts commit 942d6e6f58bf7e63810dd7cbcbce1fdfa5ebc6d4.

Build breakage.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299689 91177308-0d34-0410-b5e6-96231b3b80d8

[SDAG] Fix visitAND optimization to deal with vector extract case again.

Summary:
Fix case elided by rL298920.

Fixes PR32545.

Reviewers: eli.friedman, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31759

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299688 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] Remove unreachable default from SimplifyBinOp.

We have dedicated handlers for every opcode so nothing can get here anymore. The switch doesn't get detected as fully covered because Opcode is an unsigned. Casting to Instruction::BinaryOps still doesn't detect it because BinaryOpsEnd is in the enum and 1 past the last opcode.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299687 91177308-0d34-0410-b5e6-96231b3b80d8

NewGVN: Rename some functions for consistency

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299685 91177308-0d34-0410-b5e6-96231b3b80d8

NewGVN: Fixup some small issues

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299684 91177308-0d34-0410-b5e6-96231b3b80d8

NewGVN: Fix a small formatting issue in performSymbolicLoadEvaluation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299683 91177308-0d34-0410-b5e6-96231b3b80d8

NewGVN: This patch makes memory congruence work for all types of
memorydefs, not just stores. Along the way, we audit and fixup issues
about how we were tracking memory leaders, and improve the verifier
to notice more memory congruency issues.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299682 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Add Kryo to available targets

Summary:
Host CPU detection now supports Kryo, so we need to recognize it in ARM
target.

Reviewers: mcrosier, t.p.northover, rengolin, echristo, srhines

Reviewed By: t.p.northover, echristo

Subscribers: aemerson

Differential Revision: https://reviews.llvm.org/D31775

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299674 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Stop using CCAssignToRegWithShadow

CCAssignToRegWithShadow does not do what we are attempting to use it for
and requires workarounds in LowerFormalArguments.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299667 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] Teach SimplifyMulInst to recognize vectors of i1 as And. Not just scalar i1.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299665 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Change the vector scaling for vector offsets

Keep full offset value on MI-level instructions, but have it scaled down
in the MC-level instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299664 91177308-0d34-0410-b5e6-96231b3b80d8

[ADT] Add a generic breadth-first-search graph iterator.

This will be used in LCSSA to speed up the canonicalization.
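
As a rough illustration of what a generic breadth-first iterator does, here is
a Python sketch. It is not the ADT interface added by this commit; bfs_order,
the successors callback and the sample graph are hypothetical names chosen for
the example.

```python
from collections import deque


def bfs_order(root, successors):
    """Yield nodes reachable from root in breadth-first order.

    `successors` is a caller-supplied function mapping a node to an
    iterable of its children, which keeps the traversal graph-agnostic.
    """
    visited = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        yield node
        for child in successors(node):
            if child not in visited:
                visited.add(child)
                queue.append(child)


# Hypothetical diamond-shaped graph: a -> b, a -> c, b -> d, c -> d.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
levels = list(bfs_order("a", graph.__getitem__))
```

Taking the successor function as a parameter is one way to make the traversal
reusable across graph types; the C++ iterator may well be structured
differently.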

Differential Revision: https://reviews.llvm.org/D31694

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299660 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Eliminate barrier if workgroup size is not greater than wavefront size

If a workgroup size is known to be not greater than wavefront size
the s_barrier instruction is not needed since all threads are guaranteed
to reach the same point at the same time.

Differential Revision: https://reviews.llvm.org/D31731

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299659 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Fix a case where we weren't checking that an instruction had a single use resulting in extra instructions being created.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299658 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Resubmit SDWA peephole: enable by default

Reviewers: vpykhtin, rampitec, arsenm

Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D31671

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299654 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] NFC patch removing a redundant check.

Since the BUILD_VECTOR has already been checked by
isBuildVectorOfConstantSDNodes() in SelectionDAG::getNode() for a
SIGN_EXTEND_INREG, it can be assumed that Op is always either undef or a
ConstantSDNode, and Ops.size() will always equal VT.getVectorNumElements().

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299647 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][MMX] Test showing failure to create MMX non-temporal store

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299640 91177308-0d34-0410-b5e6-96231b3b80d8

[globalisel][tablegen] Move <Target>InstructionSelector declarations to anonymous namespaces

Summary: This resolves the issue of tablegen-erated includes in the headers for non-GlobalISel builds in a simpler way than before.

Reviewers: qcolombet, ab

Reviewed By: ab

Subscribers: igorb, ab, mgorny, dberris, rovka, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D30998

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299637 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Remove a dead ADD during the creation of TBBs

During the optimisation of jump tables in the constant island pass,
an extra ADD could be left over, now dead but not removed.

Differential Revision: https://reviews.llvm.org/D31389

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299634 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] Add test cases for mixing add/sub i1 with xor of i1. Seems we can simplify in one direction but not the other.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299627 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] Teach SimplifyAddInst and SimplifySubInst that vectors of i1 can be treated as Xor too.
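
The identity behind this can be checked exhaustively, since i1 has only two
values. The sketch below is an illustration in Python, not code from
InstSimplify; the helper names are hypothetical. It wraps results to one bit
and compares add and sub against xor, both for scalars and element-wise for a
small vector.

```python
def i1(value):
    """Wrap an integer to a 1-bit value, as i1 arithmetic does."""
    return value & 1


def scalar_identities_hold():
    """Check add == xor and sub == xor for every pair of i1 values."""
    for a in (0, 1):
        for b in (0, 1):
            if i1(a + b) != a ^ b or i1(a - b) != a ^ b:
                return False
    return True


def vector_add(lhs, rhs):
    """Element-wise i1 addition over two equal-length vectors."""
    return [i1(x + y) for x, y in zip(lhs, rhs)]


def vector_xor(lhs, rhs):
    """Element-wise xor over two equal-length vectors."""
    return [x ^ y for x, y in zip(lhs, rhs)]
```

Because each lane is independent, the scalar identity carries over to vectors
of i1 unchanged.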

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299626 91177308-0d34-0410-b5e6-96231b3b80d8

[XRay][docs] Fix hyperlink to XRay doc

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299624 91177308-0d34-0410-b5e6-96231b3b80d8

[Orc] Add missing header include for r299611.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299623 91177308-0d34-0410-b5e6-96231b3b80d8

Revert accidental commit of r299619.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299622 91177308-0d34-0410-b5e6-96231b3b80d8

Revert accidental commit of r299618

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299621 91177308-0d34-0410-b5e6-96231b3b80d8

[IR] Add commutable matchers for Add and Mul to go with the logic operations that are already present. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299620 91177308-0d34-0410-b5e6-96231b3b80d8

bar

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299619 91177308-0d34-0410-b5e6-96231b3b80d8

foo

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299618 91177308-0d34-0410-b5e6-96231b3b80d8

[XRay] - Fix spelling error to test commit access.

Just a spelling change in a comment intended to test svn commit access.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299616 91177308-0d34-0410-b5e6-96231b3b80d8

[Orc] Break QueueChannel out into its own header and add a utility,
createPairedQueueChannels, to simplify channel creation in the RPC unit tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299611 91177308-0d34-0410-b5e6-96231b3b80d8

[Orc] Make orcError return an error_code rather than Error.

This will allow orcError to be used in convertToErrorCode implementations,
which will help in transitioning Orc RPC to Error.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299610 91177308-0d34-0410-b5e6-96231b3b80d8

[lit] Implement timeouts and max_time for process pool testing

This is necessary to pass the lit test suite at llvm/utils/lit/tests.

There are some pre-existing failures here, but now switching to pools
doesn't regress any tests.

I had to change test-data/lit.cfg to import DummyConfig from a module to
fix pickling problems, but I think it'll be OK if we require test
formats to be written in real .py modules outside lit.cfg files.

I also discovered that in some circumstances AsyncResult.wait() will not
raise KeyboardInterrupt in a timely manner, but you can pass a non-zero
timeout to work around this. This makes threading.Condition.wait use a
polling loop that runs through the interpreter, so it's capable of
asynchronously raising KeyboardInterrupt.
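
A minimal sketch of the polling idea, not the lit implementation: waiting on an
AsyncResult with a short non-zero timeout inside a loop lets the caller enforce
an overall time limit and keeps control returning to the interpreter.
run_with_max_time and poll_interval are hypothetical names, and the sketch
assumes multiprocessing.pool.ThreadPool (which exposes the same AsyncResult
interface as multiprocessing.Pool) to stay self-contained.

```python
import time
from multiprocessing.pool import ThreadPool


def run_with_max_time(func, args, max_time, poll_interval=0.05):
    """Run func(*args) on a pool worker, giving up after max_time seconds.

    Returns (True, result) if the call finished in time, else (False, None).
    """
    pool = ThreadPool(1)
    try:
        async_result = pool.apply_async(func, args)
        deadline = time.time() + max_time
        while not async_result.ready():
            remaining = deadline - time.time()
            if remaining <= 0:
                return (False, None)
            # A non-zero timeout makes each wait return to the
            # interpreter, so Ctrl-C can be delivered between polls.
            async_result.wait(min(poll_interval, remaining))
        return (True, async_result.get())
    finally:
        pool.terminate()
```

With a real process pool the worker function and its arguments would also have
to be picklable, which is the constraint the DummyConfig change above works
around.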

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299605 91177308-0d34-0410-b5e6-96231b3b80d8

StringTableBuilder: Don't assert when writing an empty raw string table.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299602 91177308-0d34-0410-b5e6-96231b3b80d8

Bitcode: Remove an unused declaration. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299598 91177308-0d34-0410-b5e6-96231b3b80d8

[Bugpoint] Use `unique_ptr` correctly.

Moving Modules into `testMergedProgram` is incorrect (and causes segmentation
faults) since all callers expect to retain ownership. This is evidenced by the
later calls to `unique_ptr<Module>::get` in the same function.

Differential Revision: https://reviews.llvm.org/D31727

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299596 91177308-0d34-0410-b5e6-96231b3b80d8

[X86 TTI] Implement LSV hook

Summary:
LSV wants to know the maximum size that can be loaded to a vector register.
On X86, this always matches the maximum register width. Implement this
accordingly and add a test to make sure that LSV can vectorize up to the
maximum permissible width on X86.

Reviewers: delena, arsenm

Reviewed By: arsenm

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D31504

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299589 91177308-0d34-0410-b5e6-96231b3b80d8

Remove accidental debug printf. Follow up to r299583.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299584 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r299536. [AMDGPU] SDWA peephole: enable by default.

Reason: breaks multiple bots:

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/3988
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/1173

Original Review URL: https://reviews.llvm.org/D31671

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299583 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Use -mattr to select HVX mode in a testcase, NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299582 91177308-0d34-0410-b5e6-96231b3b80d8

MemorySSA: Remove MemorySSA walker caching.

Summary:
Remove all the caching the clobber walker does, and that the
caching walker does. With the patch to enable storing clobbering
access results for stores, I can find no improvement with the cache
turned on, and a number of degradations, in both time and memory, from
the cost of caching. For a large program I have, we do millions of
lookups and inserts with zero hits.

I haven't tried to rename or simplify the walker otherwise yet.

(Appreciate some perf testing on this past my own testing)

Reviewers: george.burgess.iv, davide

Subscribers: Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D31576

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299578 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-readobj] Only print the real size of the note

Note payloads are padded to a multiple of 4 bytes in size, but the size
of the string that should be printed can be smaller. E.g. the n_descsz
field in gold's version note is 9, so that's the whole size of the
string that should be printed. The padding is part of the format of a
SHT_NOTE section or PT_NOTE segment, but it's not part of the note
itself.

Printing the extra null bytes may confuse some tools, e.g. when the
llvm-readobj output is piped to grep, grep treats it as binary because
it contains a null byte.
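
The distinction between the padded layout and the real descriptor size can be
sketched in Python as follows. This is an illustration, not llvm-readobj code:
it assumes a little-endian note with 32-bit n_namesz/n_descsz/n_type fields,
and the sample bytes (including the type value 4) are made up for the example.

```python
import struct


def align4(size):
    """Round size up to the next multiple of 4."""
    return (size + 3) & ~3


def parse_note(data, offset=0):
    """Parse one note; return (name, desc, offset of the next note)."""
    namesz, descsz, _note_type = struct.unpack_from("<III", data, offset)
    offset += 12
    name = data[offset:offset + namesz]
    offset += align4(namesz)
    # Take exactly n_descsz bytes: the padding belongs to the section
    # layout, not to the note itself.
    desc = data[offset:offset + descsz]
    offset += align4(descsz)
    return name, desc, offset


# Hypothetical note with a 9-byte descriptor padded to 12 bytes.
payload = b"gold 1.11"
sample = struct.pack("<III", 4, len(payload), 4) + b"GNU\0" + payload + b"\0" * 3
note_name, note_desc, next_offset = parse_note(sample)
```

The padding is still consumed when advancing to the next note; it is only
excluded from what gets printed.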

Differential Revision: https://reviews.llvm.org/D30804

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299576 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombine] Support FMF contract in fused multiply-and-sub too

This is a follow-on to r299096 which added support for fmadd.

Unlike add, subtract has no case where, given two multiply operands, we
commute in order to fuse with the multiply that has fewer uses.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299572 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombine] Remove commented-out code from r299096

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299571 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add fold for icmp with or mask of low bits (PR32542)

We already have these 'and' folds:

// X & -C == -C -> X >u ~C
// X & -C != -C -> X <=u ~C
// iff C is a power of 2

...but we were missing the 'or' siblings.
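
The 'and' folds quoted above can be confirmed by brute force over 8-bit
values, as in the Python sketch below. The 'or' form it also checks
(X | C == C is equivalent to X <=u C when C is a mask of low bits) is an
assumption about what the siblings look like, since this message does not
spell them out. All names are hypothetical.

```python
BITS = 8
ALL_ONES = (1 << BITS) - 1


def and_folds_hold():
    """Check (X & -C == -C) <=> (X >u ~C) for every power-of-2 C."""
    for shift in range(BITS):
        c = 1 << shift
        neg_c = -c & ALL_ONES   # two's-complement negation in BITS bits
        not_c = ~c & ALL_ONES   # bitwise complement in BITS bits
        for x in range(1 << BITS):
            if ((x & neg_c) == neg_c) != (x > not_c):
                return False
    return True


def or_folds_hold():
    """Check (X | C == C) <=> (X <=u C) for every low-bit mask C."""
    for width in range(BITS + 1):
        c = (1 << width) - 1    # mask of the low `width` bits
        for x in range(1 << BITS):
            if ((x | c) == c) != (x <= c):
                return False
    return True
```

The != variants follow by negating both sides, so they need no separate check.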

http://rise4fun.com/Alive/n6

This should improve:
https://bugs.llvm.org/show_bug.cgi?id=32524
...but there are 2 or more other pieces to fix still.

Differential Revision: https://reviews.llvm.org/D31712

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299570 91177308-0d34-0410-b5e6-96231b3b80d8

[ExecutionDepsFix] Don't recurse over the CFG

Summary:
Use an explicit work queue instead, to avoid accidentally
causing stack overflows for input with very large CFGs.
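
The general technique can be sketched in Python as below. This is not the
ExecutionDepsFix code; reachable_blocks and the dictionary-based CFG are
hypothetical. The traversal keeps its pending blocks in an explicit list, so
its depth is bounded by heap memory rather than by the call stack.

```python
def reachable_blocks(successors, entry):
    """Return the set of blocks reachable from entry, without recursion."""
    visited = {entry}
    worklist = [entry]
    while worklist:
        block = worklist.pop()
        for succ in successors.get(block, ()):
            if succ not in visited:
                visited.add(succ)
                worklist.append(succ)
    return visited


# Hypothetical straight-line CFG, far deeper than Python's default
# recursion limit, so a recursive walk over it would fail.
DEPTH = 100000
chain = {block: [block + 1] for block in range(DEPTH)}
```

Blocks are marked visited when they are pushed rather than when they are
popped, which keeps any block from entering the worklist twice.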

Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D31681

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299569 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] fix formatting and variable names; NFCI

There must be some opportunity to refactor big chunks of nearly duplicated code in FoldOrOfICmps / FoldAndOfICmps.
Also, none of this works with vectors, but it should.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299568 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU][MC] Fix for Bug 28158 + LIT tests

Added support of the following instructions:
- s_cbranch_cdbgsys
- s_cbranch_cdbgsys_and_user
- s_cbranch_cdbgsys_or_user
- s_cbranch_cdbguser
- s_setkill

Reviewers: vpykhtin

Differential Revision: https://reviews.llvm.org/D31469

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299567 91177308-0d34-0410-b5e6-96231b3b80d8

MemorySSA: Fix and use optimized_def_chain

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299566 91177308-0d34-0410-b5e6-96231b3b80d8

[lit] Revert to old execution strategy while I debug these pickling errors

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299565 91177308-0d34-0410-b5e6-96231b3b80d8

[lit] Use Python 3 style print to satisfy some bots

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299564 91177308-0d34-0410-b5e6-96231b3b80d8

ARMFrameLowering: Slight cleanups; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299562 91177308-0d34-0410-b5e6-96231b3b80d8

[lit] Use process pools for test execution by default

Summary:
This drastically reduces lit test execution startup time on Windows. Our
previous strategy was to manually create one Process per job and manage
the worker pool ourselves. Instead, let's use the worker pool provided
by multiprocessing. multiprocessing.Pool(jobs) returns almost
immediately, and initializes the appropriate number of workers, so they
can all start executing tests immediately. This avoids the ramp-up
period that the old implementation suffers from. This appears to speed
up small test runs.

Here are some timings of the llvm-readobj tests on Windows using the
various execution strategies:

# multiprocessing.Pool:
$ for i in `seq 1 3`; do tim python ./bin/llvm-lit.py -sv ../llvm/test/tools/llvm-readobj/ --use-process-pool |& grep real: ; done
real: 0m1.156s
real: 0m1.078s
real: 0m1.094s

# multiprocessing.Process:
$ for i in `seq 1 3`; do tim python ./bin/llvm-lit.py -sv ../llvm/test/tools/llvm-readobj/ --use-processes |& grep real: ; done
real: 0m6.062s
real: 0m5.860s
real: 0m5.984s

# threading.Thread:
$ for i in `seq 1 3`; do tim python ./bin/llvm-lit.py -sv ../llvm/test/tools/llvm-readobj/ --use-threads |& grep real: ; done
real: 0m9.438s
real: 0m10.765s
real: 0m11.079s

I kept the old code to launch processes in case this change doesn't work
on all platforms that LLVM supports, but at some point I would like to
remove both the threading and old multiprocessing execution strategies.

Reviewers: modocache, rafael

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31677

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299560 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Try to re-enable MachineBranchProb.ll for ARM/AArch64

Commit r298799 changed code that made the XFAIL on MachineBranchProb.ll
irrelevant, but some configurations still failed. I can't reproduce it
locally, so I'm hoping that enabling this will tell me if some
configurations will really fail or if they were just too slow.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299558 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add tests for missing icmp fold (PR32524)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299557 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU][MC] Fix for Bug 28167 + LIT tests

Corrected src0 for v_writelane_b32:
- Enabled inline constants and literals for SI/CI (VOP2)
- Enabled inline constants for VI (VOP3)

Reviewers: vpykhtin, arsenm

https://reviews.llvm.org/D31463

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299555 91177308-0d34-0410-b5e6-96231b3b80d8

[SystemZ] Prevent Merging Bitcast with non-normal loads

Fixes PR32505.

Reviewers: uweigand, jonpa

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D31609

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299552 91177308-0d34-0410-b5e6-96231b3b80d8

[yaml2obj] Factor out error handling code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299551 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-ar] Remove unneeded std::, NFCI.

This makes it more consistent with other exit() calls in llvm-ar
(and the tools in general).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299549 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-ar] errors go on stderr and not on stdout.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299548 91177308-0d34-0410-b5e6-96231b3b80d8

Respect CMAKE_INSTALL_MANDIR for sphinx generated manpages

This is a re-work of r297516, which was reverted in r297545.

https://reviews.llvm.org/D30906

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299547 91177308-0d34-0410-b5e6-96231b3b80d8

[yaml2obj] Improve error message when output file cannot be opened.

Patch by Sam Clegg!

Differential Revision: https://reviews.llvm.org/D31351

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299546 91177308-0d34-0410-b5e6-96231b3b80d8

[LV] Make test case more robust

This test case depends on the loop being vectorized without forcing the
vectorization factor. If the profitability ever changes in the future (due to
cost model improvements), the test may no longer work as intended. Instead of
checking the resulting IR, we should just check the instruction costs. The
costs will be computed regardless of whether vectorization is profitable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299545 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] add and use TLI hook to convert and-of-seteq / or-of-setne to bitwise logic+setcc (PR32401)

This is a generic combine enabled via target hook to reduce icmp logic as discussed in:
https://bugs.llvm.org/show_bug.cgi?id=32401

It's likely that other targets will want to enable this hook for scalar transforms,
and there are probably other patterns that can use bitwise logic to reduce comparisons.

Note that we are missing an IR canonicalization for these patterns, and we will probably
prefer the pair-of-compares form in IR (shorter, more likely to fold).
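
To illustrate the kind of rewrite meant here, the Python sketch below checks
one such equivalence exhaustively on 3-bit values: two equality compares
combined with 'and' give the same answer as a single compare of
(A ^ B) | (C ^ D) against zero, and likewise for 'or' of inequalities. This is
an assumed reading of the pattern, not code from DAGCombiner, and the function
name is hypothetical.

```python
from itertools import product


def folds_agree(bits=3):
    """Check both rewrites for every combination of `bits`-wide values."""
    values = range(1 << bits)
    for a, b, c, d in product(values, repeat=4):
        merged = (a ^ b) | (c ^ d)
        # and-of-seteq: both pairs equal <=> no differing bit anywhere.
        if ((a == b) and (c == d)) != (merged == 0):
            return False
        # or-of-setne: some pair differs <=> at least one differing bit.
        if ((a != b) or (c != d)) != (merged != 0):
            return False
    return True
```

The rewritten form trades two compares for two xors, an or and one compare,
which is why it is left to a target hook to decide whether it is a win.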

Differential Revision: https://reviews.llvm.org/D31483

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299542 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] Don't make a BUILD_VECTOR with operands of illegal type.

When DAGCombiner visits a SIGN_EXTEND_INREG of a BUILD_VECTOR with
constant operands, a new BUILD_VECTOR node will be created with
transformed constants.

llvm-stress found a case where the new BUILD_VECTOR had constant operands
of an illegal type, because the (legal) element type is in fact not a legal
scalar type.

This patch changes this so that the new BUILD_VECTOR has the same operand
type as the old one.

Review: Eli Friedman, Nirav Dave
https://bugs.llvm.org//show_bug.cgi?id=32422

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299540 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add tests for missing add canonicalization; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299539 91177308-0d34-0410-b5e6-96231b3b80d8

[globalisel][tablegen] Fix patterns involving multiple ComplexPatterns.

Summary:
Temporaries are now allocated to operands instead of predicates and this
allocation is used to correctly pair up the rendered operands with the
matched operands.

Previously, ComplexPatterns were allocated temporaries independently in the
Src Pattern and Dst Pattern, leading to mismatches. Additionally, the Dst
Pattern failed to account for the allocated index and therefore always used
temporary 0, 1, ... when it should have used base+0, base+1, ...

Thanks to Aditya Nandakumar for noticing the bug.

Depends on D30539

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar

Reviewed By: rovka

Subscribers: igorb, dberris, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D31054

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299538 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] SDWA peephole: enable by default

Reviewers: vpykhtin, rampitec, arsenm

Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D31671

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299536 91177308-0d34-0410-b5e6-96231b3b80d8

Fix WebAssembly after r299529.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299535 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Renamed combine to make it clear that it only handles the vector shift by immediate opcodes. NFCI

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299532 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Crypto requires FP.

So if FP is disabled, crypto should also be disabled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299531 91177308-0d34-0410-b5e6-96231b3b80d8

Add MCContext argument to MCAsmBackend::applyFixup for error reporting

A number of backends (AArch64, MIPS, ARM) have been using
MCContext::reportError to report issues such as out-of-range fixup values in
their TgtAsmBackend. This is great, but because MCContext couldn't easily be
threaded through to the adjustFixupValue helper function from its usual
callsite (applyFixup), these backends ended up adding an MCContext* argument
and adding another call to applyFixup to processFixupValue. Adding an
MCContext parameter to applyFixup makes this unnecessary, and even better -
applyFixup can take a reference to MCContext rather than a potentially null
pointer.

Differential Revision: https://reviews.llvm.org/D30264

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299529 91177308-0d34-0410-b5e6-96231b3b80d8

[LAA] Correctly return a half-open range in expandBounds

This is a latent bug that's been hanging around for a while. For a loop-invariant
pointer, expandBounds would return the range {Ptr, Ptr}, but this was interpreted
as a half-open range, not a closed range. So we ended up planting incorrect
bounds checks. Even worse, they were tautological, so we ended up incorrectly
executing the optimized loop.
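
The failure mode can be reproduced with a small Python model. It is a sketch
under assumptions, not the LAA code: ranges_conflict stands in for a half-open
overlap check, and access_size is a hypothetical parameter, since the message
does not state how far past Ptr the corrected range extends.

```python
def ranges_conflict(first, second):
    """Half-open overlap test: [start, end) against [start, end)."""
    first_start, first_end = first
    second_start, second_end = second
    return first_start < second_end and second_start < first_end


def invariant_bounds_buggy(ptr):
    """Old behaviour: {Ptr, Ptr}, which is empty when read as half-open."""
    return (ptr, ptr)


def invariant_bounds_fixed(ptr, access_size=1):
    """Half-open range whose end is one past the last byte touched."""
    return (ptr, ptr + access_size)
```

With the buggy bounds, two accesses to the very same address never conflict,
so the check always passes; with a genuinely half-open range the same pair is
correctly flagged.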

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299526 91177308-0d34-0410-b5e6-96231b3b80d8

[coroutines] Add syntax coloring to examples in Coroutines.rst

Subscribers: EricWF

Differential Revision: https://reviews.llvm.org/D31699

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299517 91177308-0d34-0410-b5e6-96231b3b80d8

[ObjCArc] Do not dereference an invalidated iterator.

Fix a bug in ARC contract pass where an iterator that pointed to a
deleted instruction was dereferenced.

It appears that tryToContractReleaseIntoStoreStrong was incorrectly
assuming that a call to objc_retain would not immediately follow a call
to objc_release.

rdar://problem/25276306

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299507 91177308-0d34-0410-b5e6-96231b3b80d8

[RuntimeDyld] Remove an unused static member left over from r299449.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299497 91177308-0d34-0410-b5e6-96231b3b80d8

ThinLTOBitcodeWriter: handle aliases first in filterModule

Summary: This change fixes a "local linkage requires default visibility" assert when attempting to build LLVM with ThinLTO on Windows.

Reviewers: pcc, tejohnson, mehdi_amini

Reviewed By: pcc

Subscribers: llvm-commits, Prazek

Differential Revision: https://reviews.llvm.org/D31632

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299491 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Relax assert in broadcast-of-subvector lowering.

Before r294774, there was a problem when lowering broadcasts to use
128-bit subvectors.

When we looked through a bitcast to find the broadcast input, we'd keep
using the original type, so you'd end up with things like:
  (v8f32 (broadcast
    (v4f32 (extract_subvector
      (v8i32 V),
      ...))
    ))

r294774 fixed it to always emit subvectors with the scalar type of the
original source.

It also introduced some asserts, to check that we use scalars with
the same size, and vectors with the same number of elements.

The scalar size equality is checked earlier when looking through bitcasts,
and is a useful assert.

However, the number of elements doesn't have to be identical: we're always
going to extract a 128-bit subvector, and we can have different size
inputs if we looked through a concat_vector to find a 256-bit source.

Relax the overzealous assert.

Replace it with a check of the original source vector being 256 or 512
bits.  If it's 128 bits, we can't extract_subvector from it.

Fixes PR32371.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299490 91177308-0d34-0410-b5e6-96231b3b80d8

Allow targets to opt-in to codegen in SCC order

Decouple this setting from EnableIRPA.

To support function calls on AMDGPU, it is necessary to
report the global register usage throughout the kernel's
call graph, so callees need to be handled first.
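
Why callees must come first can be sketched with a toy call graph. This is
only an illustration under stated assumptions: the Function struct, the use of
a maximum to combine register counts, and the function names are hypothetical,
and the graph is assumed to be acyclic (recursion would need real SCC
handling):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical call-graph node: registers used locally plus callee indices.
struct Function {
  unsigned localRegs;
  std::vector<std::size_t> callees;
};

// Post-order DFS: a function's total is computed only after its callees'.
// Assumes an acyclic call graph.
static unsigned visit(const std::vector<Function> &graph, std::size_t f,
                      std::vector<bool> &done, std::vector<unsigned> &total) {
  if (done[f])
    return total[f];
  unsigned regs = graph[f].localRegs;
  for (std::size_t callee : graph[f].callees)
    regs = std::max(regs, visit(graph, callee, done, total));
  done[f] = true;
  total[f] = regs;
  return regs;
}

// Returns, per function, the largest register count among the function
// itself and everything reachable from it.
std::vector<unsigned> propagateRegisterUsage(const std::vector<Function> &graph) {
  std::vector<bool> done(graph.size(), false);
  std::vector<unsigned> total(graph.size(), 0);
  for (std::size_t f = 0; f < graph.size(); ++f)
    visit(graph, f, done, total);
  return total;
}
```

A kernel that uses 10 registers but calls a function using 64 is reported as
needing 64, which is only known once the callee has been processed.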

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299487 91177308-0d34-0410-b5e6-96231b3b80d8

Re-apply MemorySSA: Add support for caching clobbering access in
stores with some fixes.

Summary:
This enables us to cache the clobbering access for stores, despite the
fact that we can't rewrite the use-def chains themselves.

Early testing shows that, after this change, for larger testcases, it
will be a significant net positive (memory and time) to remove the
walker caching.

Reviewers: george.burgess.iv, davide

Subscribers: Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D31567

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299486 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "MemorySSA: Add support for caching clobbering access in stores"

This reverts revision r299322.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299485 91177308-0d34-0410-b5e6-96231b3b80d8

[MC] Set defaults based on section names and support name suffixes

Set the correct default flags and section type based on the section name
for .text, .data, .bss, .init_array, .fini_array, .preinit_array, .tdata,
and .tbss, and support section name suffixes for .data.*, .rodata.*,
.text.*, .bss.*, .tdata.*, and .tbss.*, which matches the behavior of GAS.
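
The name-based lookup can be sketched as follows. This is not the MC code:
the struct and function names are hypothetical, the type and flag values are
the ones the ELF gABI lists for these special sections, and the fallback for
unrecognized names is an assumption made for the sketch:

```cpp
#include <cstdint>
#include <string>

// ELF section types and flags, with values from the ELF gABI.
constexpr std::uint32_t SHT_PROGBITS = 1, SHT_NOBITS = 8, SHT_INIT_ARRAY = 14,
                        SHT_FINI_ARRAY = 15, SHT_PREINIT_ARRAY = 16;
constexpr std::uint32_t SHF_WRITE = 0x1, SHF_ALLOC = 0x2, SHF_EXECINSTR = 0x4,
                        SHF_TLS = 0x400;

struct SectionDefaults {
  std::uint32_t type;
  std::uint32_t flags;
};

// True if name is exactly base, or base followed by a ".suffix".
static bool hasBaseName(const std::string &name, const std::string &base) {
  return name == base || name.compare(0, base.size() + 1, base + ".") == 0;
}

SectionDefaults defaultsFor(const std::string &name) {
  if (hasBaseName(name, ".text"))
    return {SHT_PROGBITS, SHF_ALLOC | SHF_EXECINSTR};
  if (hasBaseName(name, ".data"))
    return {SHT_PROGBITS, SHF_ALLOC | SHF_WRITE};
  if (hasBaseName(name, ".rodata"))
    return {SHT_PROGBITS, SHF_ALLOC};
  if (hasBaseName(name, ".bss"))
    return {SHT_NOBITS, SHF_ALLOC | SHF_WRITE};
  if (hasBaseName(name, ".tdata"))
    return {SHT_PROGBITS, SHF_ALLOC | SHF_WRITE | SHF_TLS};
  if (hasBaseName(name, ".tbss"))
    return {SHT_NOBITS, SHF_ALLOC | SHF_WRITE | SHF_TLS};
  if (name == ".init_array")
    return {SHT_INIT_ARRAY, SHF_ALLOC | SHF_WRITE};
  if (name == ".fini_array")
    return {SHT_FINI_ARRAY, SHF_ALLOC | SHF_WRITE};
  if (name == ".preinit_array")
    return {SHT_PREINIT_ARRAY, SHF_ALLOC | SHF_WRITE};
  // Assumed fallback for the sketch: plain PROGBITS with no flags.
  return {SHT_PROGBITS, 0};
}
```

Matching on the name plus a "." separator keeps ".text.hot" in the .text
family while leaving an unrelated name such as ".textual" alone.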

Fixes PR31888.

Differential Revision: https://reviews.llvm.org/D30229

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299484 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Avoid partial register deps on insertelt of load into lane 0.

This improves upon r246462: that prevented FMOVs from being emitted
for the cross-class INSERT_SUBREGs by disabling the formation of
INSERT_SUBREGs of LOAD. But the ld1.s that we started selecting
caused us to introduce partial dependencies on the vector register.

Avoid that by using SCALAR_TO_VECTOR: it's a first-class citizen that
is folded away by many patterns, including the scalar LDRS that we
want in this case.

Credit goes to Adam for finding the issue!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299482 91177308-0d34-0410-b5e6-96231b3b80d8

Change section flag character for SHF_LINK_ORDER to "o".

GAS uses "m" as a compatibility alias for "M" (SHF_MERGE).

"o" is free, except on ia64, where it already means SHF_LINK_ORDER.
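
A rough sketch of how flag characters map to section flags, with "o"
selecting SHF_LINK_ORDER. The function name parseSectionFlags is hypothetical,
only a subset of the flag letters is modeled, and the numeric values are the
ELF gABI ones; this is not the assembler's parser:

```cpp
#include <cstdint>
#include <string>

// ELF section flag values from the ELF gABI.
constexpr std::uint64_t SHF_WRITE = 0x1, SHF_ALLOC = 0x2, SHF_EXECINSTR = 0x4,
                        SHF_MERGE = 0x10, SHF_STRINGS = 0x20,
                        SHF_LINK_ORDER = 0x80, SHF_TLS = 0x400;

// Translates a flag string such as "ao" into a flag mask.
// Returns false on a character this sketch does not model.
bool parseSectionFlags(const std::string &text, std::uint64_t &flags) {
  std::uint64_t result = 0;
  for (char c : text) {
    switch (c) {
    case 'a': result |= SHF_ALLOC; break;
    case 'w': result |= SHF_WRITE; break;
    case 'x': result |= SHF_EXECINSTR; break;
    case 'M': result |= SHF_MERGE; break;
    case 'S': result |= SHF_STRINGS; break;
    case 'T': result |= SHF_TLS; break;
    case 'o': result |= SHF_LINK_ORDER; break;
    default: return false;
    }
  }
  flags = result;
  return true;
}
```

Under these assumptions, "ao" yields SHF_ALLOC | SHF_LINK_ORDER, while "M"
stays reserved for SHF_MERGE.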

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299479 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Add test cases for various add/subtracts of constants (scalar, splat, and vector) with phis and selects. Improvements coming in a future commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299476 91177308-0d34-0410-b5e6-96231b3b80d8

[lit] Add a minimum export implementation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299475 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] rename variable for easier reading; NFC

We usually give constants a 'C' somewhere in the name...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299474 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Turn subtract of vectors of i1 into xor like we do for scalar i1. Matches what we already do for add.
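
The identity behind this fold can be checked with a small model of i1
arithmetic, where results wrap modulo 2. This is only a sketch: the helper
names and the fixed 4-lane vector are hypothetical and nothing here is
InstCombine code:

```cpp
#include <array>
#include <cstddef>

// Models LLVM's i1 type: arithmetic wraps modulo 2, so keep only the low bit.
bool subI1(bool a, bool b) {
  return ((static_cast<unsigned>(a) - static_cast<unsigned>(b)) & 1u) != 0;
}

bool addI1(bool a, bool b) {
  return ((static_cast<unsigned>(a) + static_cast<unsigned>(b)) & 1u) != 0;
}

bool xorI1(bool a, bool b) { return a != b; }

// Lane-wise subtract on a hypothetical <4 x i1> vector.
std::array<bool, 4> subV4I1(const std::array<bool, 4> &a,
                            const std::array<bool, 4> &b) {
  std::array<bool, 4> r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = subI1(a[i], b[i]);
  return r;
}

// Lane-wise xor on the same hypothetical vector type.
std::array<bool, 4> xorV4I1(const std::array<bool, 4> &a,
                            const std::array<bool, 4> &b) {
  std::array<bool, 4> r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = xorI1(a[i], b[i]);
  return r;
}
```

Since every lane is independent, the scalar identity sub == xor (and
add == xor) carries over to vectors of i1 unchanged.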

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299472 91177308-0d34-0410-b5e6-96231b3b80d8