Fix for https://bugs.llvm.org/show_bug.cgi?id=41477. On the x32 ABI
with stack probing a dynamic alloca will result in a WIN_ALLOCA_32
with a 32-bit size. The current implementation tries to copy it into
RAX, resulting in a physreg copy error. Fix this by copying to EAX
instead. Also fix incorrect opcodes and registers used in the related SUB
instructions.
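A sketch of the shape of the fix (opcode and operand details are assumed, not the verbatim patch):
  // In the WIN_ALLOCA lowering: match the copy's register to the node's
  // size instead of unconditionally using RAX on 64-bit targets.
  bool Is64BitSize = MI.getOpcode() == X86::WIN_ALLOCA_64;
  unsigned SizeReg = Is64BitSize ? X86::RAX : X86::EAX;
  BuildMI(*MBB, MI, DL, TII->get(TargetOpcode::COPY), SizeReg)
      .addReg(MI.getOperand(0).getReg());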
[X86] Don't turn (and (shl X, C1), C2) into (shl (and X, (C2 >> C1)), C1) if the original AND can be represented by MOVZX.
The MOVZX doesn't require an immediate to be encoded at all. It does use
a 2-byte opcode, so it's the same size as a 1-byte immediate, but it has
separate source and destination registers, which can help avoid copies.
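A sketch of the guard this implies (assumed form, not the verbatim patch):
  // If the original AND mask keeps exactly the low 8/16/32 bits, isel can
  // select it as MOVZX, which encodes no immediate, so keep the AND as-is.
  const APInt &Mask = C2->getAPIntValue();
  if (Mask.isMask(8) || Mask.isMask(16) || Mask.isMask(32))
    return SDValue(); // bail out of the (shl (and ...)) rewrite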
Sam Clegg [Fri, 19 Apr 2019 22:43:32 +0000 (22:43 +0000)]
[WebAssembly] FastISel: Don't fallback to SelectionDAG after BuildMI in selectCall
My understanding is that once BuildMI has been called we can't fall back
to SelectionDAG.
This change moves the fallback for when getRegForValue() fails on the
target of an indirect call so that it happens before BuildMI is called.
This was failing in -fPIC mode when the callee is a GlobalValue.
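Roughly the resulting shape (simplified sketch; exact names differ):
  // Resolve the callee's register before emitting anything: if this fails
  // we can still return false and let SelectionDAG handle the call.
  unsigned CalleeReg = getRegForValue(Call->getCalledValue());
  if (CalleeReg == 0)
    return false; // nothing built yet, fallback is still safe
  // Only now start building the indirect-call MachineInstr.
  auto MIB = BuildMI(*FuncInfo.MBB, FuncInfo.InsertPt, DbgLoc, TII.get(Opc));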
Summary:
This emits labels around heapallocsite calls and S_HEAPALLOCSITE debug
info in codeview. Currently this only changes FastISel, so emitting labels
still needs to be implemented in SelectionDAG.
These are general queries, so they should not die when given
a degenerate input like an all undef mask. Callers should be
able to deal with an op that will eventually be simplified away.
[LTO] Add plumbing to save stats during LTO on Darwin.
Gold and ld on Linux already support saving stats, but the
infrastructure is missing on Darwin. Unfortunately it seems like the
configuration from lib/LTO/LTO.cpp is not used.
This patch adds a new LTOStatsFile option and adds plumbing in Clang to
use it on Darwin, similar to the way remarks are handled.
Currently the handling of LTO flags seems quite spread out, with a bunch
of duplication. But I am not sure if there is an easy way to improve
that?
Igor Kudrin [Fri, 19 Apr 2019 10:12:56 +0000 (10:12 +0000)]
[llvm-symbolizer] Make the output with -output-style=GNU closer to addr2line's
This patch addresses two differences in the output of llvm-symbolizer
and GNU's addr2line:
* llvm-symbolizer prints an empty line after the report for an address.
* With "-f -i=0", llvm-symbolizer replaces the name of an inlined
function with the name from the symbol table, i. e., the top caller
function in the inlining chain. addr2line preserves the name of the
inlined function.
Summary:
The basic idea here is to make it possible to use
MachineInstr::mayAlias also when the MachineInstr
is const (or the "Other" MachineInstr is const).
The addition of const in MachineInstr::mayAlias
then rippled down to the need for adding const
in several other places, such as
TargetInstrInfo::getMemOperandWithOffset.
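The resulting signatures look roughly like this (simplified):
  bool MachineInstr::mayAlias(AliasAnalysis *AA, const MachineInstr &Other,
                              bool UseTBAA) const;
  bool TargetInstrInfo::getMemOperandWithOffset(
      const MachineInstr &MI, const MachineOperand *&BaseOp, int64_t &Offset,
      const TargetRegisterInfo *TRI) const;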
James Molloy [Fri, 19 Apr 2019 09:00:55 +0000 (09:00 +0000)]
[PATCH] [MachineScheduler] Check pending instructions when an instruction is scheduled
Pending instructions that were blocked from becoming available by the HazardRecognizer may no longer be blocked when another instruction is scheduled; pending instructions should be re-checked in this case.
This is primarily aimed at VLIW targets with large parallelism and esoteric constraints.
No testcase as no in-tree targets have this behavior.
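Conceptually (names are approximate, not the verbatim patch):
  // After an instruction is scheduled, hazards may have cleared: re-check
  // the Pending queue and release anything no longer blocked.
  for (SUnit *SU : Pending)
    if (!checkHazard(SU))
      releaseToAvailable(SU); // hypothetical helper: Pending -> Available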
Fangrui Song [Fri, 19 Apr 2019 07:57:51 +0000 (07:57 +0000)]
[MergeFunc] removeUsers: call remove() only on direct users
removeUsers used a work list to collect indirect users and called remove()
on those functions. However, it had a bug in its visited check
(`if (!Visited.insert(UU).second)`).
Actually, we don't have to collect indirect users at all.
After the merge of F and G, G's callers will be considered (added to
Deferred). If G's callers can be merged, G's callers' callers will be
considered.
Update the test unnamed-addr-reprocessing.ll to make it clear we can
still merge indirect callers.
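The simplified logic is roughly (sketch):
  // Only direct users need removal; transitive callers are reconsidered
  // via the Deferred list after each successful merge.
  void MergeFunctions::removeUsers(Value *V) {
    for (User *U : V->users())
      if (auto *I = dyn_cast<Instruction>(U))
        remove(I->getFunction());
  }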
Piotr Sobczak [Fri, 19 Apr 2019 06:19:14 +0000 (06:19 +0000)]
[AMDGPU] Ignore non-SUnits edges
Summary:
Ignore edges to non-SUnits (e.g. ExitSU) when checking
for low latency instructions.
When calling the function isLowLatencyInstruction(),
an ExitSU could be on the list of successors, not necessarily
a regular SU. In other places in the code there is a check
"Succ->NodeNum >= DAGSize" to prevent further processing of
ExitSU as "Succ->getInstr()" is NULL in such a case.
Also, 8 out of 9 cases of "SUnit *Succ = SuccDep.getSUnit()"
have the guard, so it is clearly an omission here.
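The added guard follows the existing pattern (sketch):
  for (const SDep &SuccDep : SU->Succs) {
    SUnit *Succ = SuccDep.getSUnit();
    if (Succ->NodeNum >= DAGSize)
      continue; // ExitSU: Succ->getInstr() is null here, skip it
    // ... low-latency handling of Succ->getInstr() ...
  }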
[CallSite removal] Move the legacy PM, call graph, and some inliner
code to `CallBase`.
This patch moves the legacy PM, the call graph, and some of the inliner and
legacy passes that interact with those APIs from `CallSite` to the new
`CallBase` class.
No interesting changes.
[X86] Make sure we copy the HandleSDNode back to N before executing the default code after the switch in matchAddressRecursively
Summary:
There are two places where we create a HandleSDNode in address matching in order to handle the case where N is changed by CSE. But if we end up not matching, we fall back to code at the bottom of the switch that really would like N to point to something that wasn't CSEd away. So we should make sure we copy the handle back to N on any paths that can reach that code.
This appears to be the true reason we needed to check DELETED_NODE in the negation matching. In pr32329.ll we had two subtracts back to back. We recursed through the first subtract, and onto the second subtract. The second subtract called matchAddressRecursively on its LHS which caused that subtract to CSE. We ultimately failed the match and ended up in the default code. But N was pointing at the old node that had been deleted, but the default code didn't know that and took it as the base register. Then we unwound back to the first subtract and tried to access this bogus base reg requiring the check for deleted node. With this patch we now use the CSE result as the base reg instead.
matchAdd has been broken since sometime in 2015 when it was pulled out of the switch into a helper function. The assignment to N at the end was still there, but N was passed by value and not by reference so the update didn't go anywhere.
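In sketch form (simplified; `matched` is a stand-in for the real match result):
  HandleSDNode Handle(N);
  // ... recursive matching may CSE away the node N originally pointed to ...
  if (!matched) {
    N = Handle.getValue(); // refresh N so the default code sees a live node
    break;                 // fall through to the code after the switch
  }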
Ali Tamur [Fri, 19 Apr 2019 02:26:56 +0000 (02:26 +0000)]
[llvm] Prevent duplicate files in debug line header in dwarf 5: another attempt
Another attempt to land the changes in debug line header to prevent duplicate
files in Dwarf 5. I rolled back my previous commit because of a mistake in
generating the object file in a test. Meanwhile, I addressed some offline
comments and changed the implementation; the largest difference is that
MCDwarfLineTableHeader does not keep DwarfVersion but gets it as a parameter. I
also merged into this patch the fix for two lld tests that would otherwise
start to fail.
Original Commit:
https://reviews.llvm.org/D59515
Original Message:
Motivation: In previous dwarf versions, file name indexes started from 1, and
the primary source file was not explicit. Dwarf 5 standard (6.2.4) prescribes
the primary source file to be explicitly given an entry with an index number 0.
The current implementation honors the specification by just duplicating the
main source file, once with index number 0, and later maybe with another
index number. While this is compliant with the letter of the standard, the
duplication causes problems for consumers of this information such as lldb.
(Some files appear duplicated, and only some of the copies have a line
table, although all refer to the same file.)
With this change, dwarf 5 debug line section files always start from 0, and
the zeroth entry is not duplicated whenever possible. This requires different
handling of dwarf 4 and dwarf 5 during generation (e.g. when a function returns
an index zero for a file name, it signals an error in dwarf 4, but not in dwarf
5). However, I think the minor complication is worth it, because it enables all
consumers (lldb, gdb, dwarfdump, objdump, and so on) to treat all files in the
file name list homogeneously.
MergeFunc: preserve COMDAT information when creating a thunk
We would previously drop the COMDAT on the thunk we generated when replacing a
function body with the forwarding thunk. A function that had been emitted and
merged in multiple translation units could then be re-emitted under the same
name without the COMDAT. This is a hard error with PE/COFF, where the COMDAT is
used for the deduplication of Value Witness functions for Swift.
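The essence of the fix is a single line when building the thunk (sketch; variable names assumed):
  // Carry the original function's COMDAT over to the forwarding thunk.
  NewG->setComdat(G->getComdat());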
Adrian Prantl [Thu, 18 Apr 2019 21:22:50 +0000 (21:22 +0000)]
Implement sys::fs::copy_file using the macOS copyfile(3) API
to support APFS clones.
This patch adds a Darwin-specific implementation of
llvm::sys::fs::copy_file() that uses the macOS copyfile(3) API to
support APFS copy-on-write clones, which should be faster and much
more space efficient.
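The Darwin path is roughly (simplified sketch):
  #include <copyfile.h>
  // COPYFILE_CLONE makes an APFS clone where possible and degrades to a
  // regular copy on filesystems that cannot clone.
  if (copyfile(From.c_str(), To.c_str(), /*state=*/NULL,
               COPYFILE_DATA | COPYFILE_CLONE) < 0)
    return std::error_code(errno, std::generic_category());
  return std::error_code();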
[GlobalISel][AArch64] Legalize/select G_(S/Z/ANY)_EXT for v8s8s
This adds legalization for G_SEXT, G_ZEXT, and G_ANYEXT for v8s8s.
We were falling back on G_ZEXT in arm64-vabs.ll before, preventing us from
selecting the @llvm.aarch64.neon.sabd.v8i8 intrinsic.
This adds legalizer support for those 3, which gives us selection via the
importer. Update the relevant tests (legalize-ext.mir, select-int-ext.mir) and
add a GISel line to arm64-vabs.ll.
[BlockExtractor] Extend the file format to support the grouping of basic blocks
Prior to this patch, each basic block listed in the extract-blocks-file
would be extracted to a different function.
This patch adds support for comma-separated lists of basic blocks
to form groups.
When the region formed by a group is not extractable, e.g., not single
entry, all the blocks of that group are left untouched.
Let us see this new format in action (comments are not part of the
file format):
;; funcName bbName[,bbName...]
foo bb1 ;; Extract bb1 in its own function
foo bb2,bb3 ;; Extract bb2,bb3 in their own function
bar bb1,bb4 ;; Extract bb1,bb4 in their own function
bar bb2 ;; Extract bb2 in its own function
Assuming all regions are extractable, this will create one function and
thus one call per region.
Simon Pilgrim [Thu, 18 Apr 2019 17:23:09 +0000 (17:23 +0000)]
[X86] combineVectorTruncationWithPACKUS - remove split/concatenation of mask
combineVectorTruncationWithPACKUS currently splits the upper-bit masking into 128-bit subregs and then concatenates them back together.
This was originally done to avoid regressions that caused existing subregs to be concatenated to the larger type just for the AND masking before being extracted again. This was fixed by @spatel (notably rL303997 and rL347356).
This also lets SimplifyDemandedBits do some further improvements before it hits the recursive depth limit.
My only annoyance with this is that we were broadcasting some xmm masks but we seem to have lost them by moving to ymm - but that's a known issue as the logic in lowerBuildVectorAsBroadcast isn't great.
Philip Reames [Thu, 18 Apr 2019 17:01:19 +0000 (17:01 +0000)]
[LoopPred] Fix a blatantly obvious bug in r358684
The bug is that I didn't check whether the operands of the invariant_loads were themselves invariant. I don't know how this got missed in the patch and review. I even had an unreduced test case locally, and I remember handling this case, but I must have lost it in one of the rebases. Oops.
Stefan Granitz [Thu, 18 Apr 2019 16:37:07 +0000 (16:37 +0000)]
[CMake] Allow custom extensions for externalized debug info
Summary:
Extra flexibility for emitting debug info to external files (remains Darwin only for now).
LLDB needs this functionality to emit an LLDB.framework.dSYM instead of LLDB.dSYM when building the framework, because the latter could conflict with the driver's lldb.dSYM when emitted in the same directory on case-insensitive file systems.
Philip Reames [Thu, 18 Apr 2019 16:33:17 +0000 (16:33 +0000)]
[LoopPredication] Allow predication of loop invariant computations (within the loop)
The purpose of this patch is to eliminate a pass ordering dependence between LoopPredication and LICM. To understand the purpose, consider the following snippet of code inside some loop 'L' with IV 'i':
A = _a.length;
guard (i < A);
a = _a[i];
B = _b.length;
guard (i < B);
b = _b[i];
...
Z = _z.length;
guard (i < Z);
z = _z[i];
accum += a + b + ... + z;
Today, we need LICM to hoist the length loads, LoopPredication to make the guards loop invariant, and TrivialUnswitch to eliminate the loop invariant guard to establish must execute for the next length load. Today, if we can't prove speculation safety, we'd have to iterate these three passes 26 times to reduce this example down to the minimal form.
Using the fact that the array lengths are known to be invariant, we can short circuit this iteration. By forming the loop invariant form of all the guards at once, we remove the need for LoopPredication from the iterative cycle. At the moment, we'd still have to iterate LICM and TrivialUnswitch; we'll leave that part for later.
As a secondary benefit, this allows LoopPred to expose peeling opportunities in a much more obvious manner. See the udiv test changes as an example. If the udiv was not hoistable (i.e. we couldn't prove speculation safety) this would be an example where peeling becomes obviously profitable whereas it wasn't before.
A couple of subtleties in the implementation:
- SCEV's isSafeToExpand guarantees speculation safety (i.e. let's us expand at a new point). It is not a precondition for expansion if we know the SCEV corresponds to a Value which dominates the requested expansion point.
- SCEV's isLoopInvariant returns true for expressions which compute the same value across all iterations executed, regardless of where the original Value is located. (i.e. it can be in the loop) This implies we have a speculation burden to prove before expanding them outside loops.
- invariant_loads and AA->pointsToConstantMemory are two cases that SCEV currently does not handle, but which meet the SCEV definition of invariance. I plan to sink this part into SCEV once this has baked for a bit.
[SDA] Bug fix: Use IPD outside the loop as divergence bound
Summary:
The immediate post-dominator of the loop header may be part of the divergent loop.
Since this /was/ used as the divergence propagation bound, the SDA would not detect joins of divergent paths outside the loop.
Philip Reames [Thu, 18 Apr 2019 16:10:21 +0000 (16:10 +0000)]
Fix a bug in SCEV's isSafeToExpand around speculation safety
isSafeToExpand was making a common, but dangerously wrong, mistake in assuming that if any instruction within a basic block executes, all instructions within that block must execute. This can be trivially shown to be false by considering the following small example:
bb:
add x, y <-- InsertionPoint
call @throws()
udiv x, y <-- SCEV* S
br ...
It's clearly not legal to expand S above the throwing call, but the previous logic would do so, since S dominates (but does not properlyDominate) the block containing the InsertionPoint.
Since iterating the instructions within a block is expensive, this change special-cases two situations: 1) S is an operand of InsertionPoint, and 2) InsertionPoint is the terminator of its block. These two together are enough to keep all current optimizations triggering while fixing the latent correctness issue.
As best I can tell, this is a silent bug in current ToT. Given that, there are no tests with this change. It was noticed in an upcoming optimization change (D60093), and was reviewed as part of that. That change will include the test which caused me to notice the issue. I'm submitting this separately so that anyone bisecting a problem gets a clear explanation.
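The tightened check looks roughly like this (sketch; names approximate):
  if (SE.properlyDominates(S, InsertionPoint->getParent()))
    return true; // S is available on entry to the block
  if (SE.dominates(S, InsertionPoint->getParent())) {
    // Same-block case: only the two cheap-to-check patterns are safe.
    if (InsertionPoint->getParent()->getTerminator() == InsertionPoint)
      return true;
    if (const auto *U = dyn_cast<SCEVUnknown>(S))
      if (llvm::is_contained(InsertionPoint->operands(), U->getValue()))
        return true;
  }
  return false;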
Pavel Labath [Thu, 18 Apr 2019 14:57:31 +0000 (14:57 +0000)]
MinidumpYAML: Add support for ModuleList stream
Summary:
This patch adds support for yaml (de)serialization of the minidump
ModuleList stream. It's a fairly straightforward application of the
existing patterns to the ModuleList structures defined in previous
patches.
One thing which may be interesting to call out explicitly is the
addition of "new" allocation functions to the helper BlobAllocator
class. The reason for this was an emerging pattern of needing to
allocate space for entities which do not have a suitable lifetime
for use with the existing allocation functions. A typical
example of that was the "size" of various lists, which is only available
as a temporary returned by the .size() method of some container. For
these cases, one can use the new set of allocation functions, which
will take a temporary object, and store it in an allocator-managed
buffer until it is written to disk.
...with the more general icmp lowering so it can pick up more cases through bitcasts - notably vXi8 cases which use vXi16 shifts+masks; this patch can remove the mask and use pcmpgtb(0,x) for the sra.
James Henderson [Thu, 18 Apr 2019 09:13:30 +0000 (09:13 +0000)]
[llvm-objcopy][llvm-strip] Add switch to allow removing referenced sections
llvm-objcopy currently emits an error if a section to be removed is
referenced by another section. This is a reasonable thing to do, but is
different to GNU objcopy. We should allow users who know what they are
doing to have a way to produce the invalid ELF. This change adds a new
switch --allow-broken-links to both llvm-strip and llvm-objcopy to do
precisely that. The corresponding sh_link field is then set to 0 instead
of an error being emitted.
I cannot use llvm-readelf/readobj to test the link fields because they
emit an error if any sections, like the .dynsym, cannot be properly
loaded.
Kang Zhang [Thu, 18 Apr 2019 07:24:15 +0000 (07:24 +0000)]
[PowerPC] Fix wrong ElemSize when calling isConsecutiveLS()
Summary:
This issue from the bugzilla: https://bugs.llvm.org/show_bug.cgi?id=41177
When the two operands of a BUILD_VECTOR are the same, we hit an assertion error:
llvm::SDValue combineBVOfConsecutiveLoads(llvm::SDNode*, llvm::SelectionDAG&):
Assertion `!(InputsAreConsecutiveLoads && InputsAreReverseConsecutive) &&
"The loads cannot be both consecutive and reverse consecutive."' failed.
This error is caused by a wrong ElemSize when calling isConsecutiveLS(). We
should use `getScalarType().getStoreSize()` to get the ElemSize instead of
`getScalarSizeInBits() / 8`.
Tim Renouf [Thu, 18 Apr 2019 05:27:01 +0000 (05:27 +0000)]
[AMDGPU] Avoid DAG combining assert with fneg(fadd(A,0))
fneg combining attempts to turn it into fadd(fneg(A), fneg(0)), but
creating the new fadd folds to just fneg(A). When A has multiple uses,
this confuses the combiner and triggers an assert. Fixed.
Richard Trieu [Thu, 18 Apr 2019 01:39:45 +0000 (01:39 +0000)]
Fix bad compare function over FusionCandidate.
Reverse the check of the dominance order so that when a self-compare happens,
it returns false. This gives the compare function a strict weak ordering.
Adrian Prantl [Thu, 18 Apr 2019 00:01:05 +0000 (00:01 +0000)]
Implement sys::fs::copy_file using the macOS copyfile(3) API
to support APFS clones.
This patch adds a Darwin-specific implementation of
llvm::sys::fs::copy_file() that uses the macOS copyfile(3) API to
support APFS copy-on-write clones, which should be faster and much
more space efficient.
...so if we can fold the shift op into LEA in the 1st pattern, then we
should be able to do the same in the 2nd pattern (unnecessary 'movzbl'
is a separate bug I think).
We don't want to do this any sooner though because that would conflict
with generic transforms that try to narrow the width of the shift.
Nick Desaulniers [Wed, 17 Apr 2019 22:21:10 +0000 (22:21 +0000)]
[AsmPrinter] hoist %a output template to base class for ARM+AArch64
Summary:
X86 is quite complicated, so I intend to leave it as is. ARM+AArch64 do
basically the same thing (AArch64 did not correctly handle immediates,
ARM has a test llvm/test/CodeGen/ARM/2009-04-06-AsmModifier.ll that uses
%a with an immediate) for a flag that should be target independent
anyway.
This patch adds a basic loop fusion pass. It will fuse loops that conform to the
following 4 conditions:
1. Adjacent (no code between them)
2. Control flow equivalent (if one loop executes, the other loop executes)
3. Identical bounds (both loops iterate the same number of iterations)
4. No negative distance dependencies between the loop bodies.
The pass does not make any changes to the IR to create opportunities for fusion.
Instead, it checks if the necessary conditions are met and if so it fuses two
loops together.
The pass has not been added to the pass pipeline yet, and thus is not enabled by
default. It can be run standalone using the -loop-fusion option.
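A hedged illustration (not taken from the patch) of two loops that satisfy all four conditions:
  void f(int N, const int *B, int *A, int *C) {
    for (int i = 0; i < N; ++i) // candidate 1
      A[i] = B[i] + 1;
    for (int i = 0; i < N; ++i) // candidate 2: reads A[i], distance 0, legal
      C[i] = A[i] * 2;
    // After fusion, both statements run in a single loop per iteration i.
  }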
Nick Desaulniers [Wed, 17 Apr 2019 18:22:48 +0000 (18:22 +0000)]
[AsmPrinter] defer %c to base class for ARM, PPC, and Hexagon. NFC
Summary:
None of these derived classes do anything that the base class cannot.
If we remove these case statements, then the base class can handle them
just fine.
Steven Wu [Wed, 17 Apr 2019 17:38:09 +0000 (17:38 +0000)]
[ThinLTO] Fix ThinLTOCodegenerator to export llvm.used symbols
Summary:
Reapply r357931 with fixes to ThinLTO testcases and llvm-lto tool.
ThinLTOCodeGenerator currently does not preserve llvm.used symbols and
it can internalize them. In order to pass the necessary information to the
legacy ThinLTOCodeGenerator, the input to the code generator is
rewritten to be based on lto::InputFile.
Now ThinLTO using the legacy LTO API will require a data layout in the
Module.
"internalize" thinlto action in llvm-lto is updated to run both
"promote" and "internalize" with the same configuration as
ThinLTOCodeGenerator. The old "promote" + "internalize" option does not
produce the same output as ThinLTOCodeGenerator.
Philip Reames [Wed, 17 Apr 2019 17:37:58 +0000 (17:37 +0000)]
[InstCombine] Factor out unreachable inst idiom creation [NFC]
In InstCombine, we use an idiom of "store i1 true, i1 undef" to indicate we've found a path which we've proven unreachable. We can't actually insert the unreachable instruction since that would require changing the CFG. We leave that to simplifycfg later.
This just factors out the idiom creation so we don't duplicate the same mostly undocumented idiom in multiple places.
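The factored-out helper is roughly (sketch):
  // Insert "store i1 true, i1* undef" before InsertAt to mark the path as
  // proven-unreachable; simplifycfg later turns it into real unreachable.
  void InstCombiner::CreateNonTerminatorUnreachable(Instruction *InsertAt) {
    auto &Ctx = InsertAt->getContext();
    new StoreInst(ConstantInt::getTrue(Ctx),
                  UndefValue::get(Type::getInt1PtrTy(Ctx)), InsertAt);
  }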
[LVI][CVP] Constrain values in with.overflow branches
If a branch is conditional on extractvalue(op.with.overflow(%x, C), 1)
then we can constrain the value of %x inside the branch based on
makeGuaranteedNoWrapRegion(). We do this by extending the edge-value
handling in LVI. This allows CVP to then fold comparisons against %x,
as illustrated in the tests.
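A hedged C-level illustration of the kind of fold this enables (example assumed, not from the tests):
  #include <limits.h>
  int known(int x) {
    int r;
    if (!__builtin_sadd_overflow(x, 100, &r))
      // No signed overflow implies x <= INT_MAX - 100, so CVP can now
      // fold this comparison to true.
      return x < INT_MAX;
    return 1;
  }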