MergeFunc: preserve COMDAT information when creating a thunk

We would previously drop the COMDAT on the thunk we generated when replacing a
function body with the forwarding thunk. This would result in a function that
may have been multiply emitted and multiply merged to be emitted with the same
name without the COMDAT. This is a hard error with PE/COFF where the COMDAT is
used for the deduplication of Value Witness functions for Swift.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358728 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopUnroll] Move list of params into a struct [NFCI].

Summary: Cleanup suggested in review of r358304.

Reviewers: sanjoy, efriedma

Subscribers: jlebar, zzheng, dmgreen, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60638

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358723 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] add tests for mul-by-element; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358718 91177308-0d34-0410-b5e6-96231b3b80d8

Implement sys::fs::copy_file using the macOS copyfile(3) API
to support APFS clones.

This patch adds a Darwin-specific implementation of
llvm::sys::fs::copy_file() that uses the macOS copyfile(3) API to
support APFS copy-on-write clones, which should be faster and much
more space efficient.

https://developer.apple.com/library/archive/documentation/FileManagement/Conceptual/APFS_Guide/ToolsandAPIs/ToolsandAPIs.html

Differential Revision: https://reviews.llvm.org/D60802

This reapplies 358628 with an additional bugfix handling the case
where the destination file already exists. (Caught by the clang testsuite).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358716 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalISel][AArch64] Legalize/select G_(S/Z/ANY)_EXT for v8s8s

This adds legalization for G_SEXT, G_ZEXT, and G_ANYEXT for v8s8s.

We were falling back on G_ZEXT in arm64-vabs.ll before, preventing us from
selecting the @llvm.aarch64.neon.sabd.v8i8 intrinsic.

This adds legalizer support for those 3, which gives us selection via the
importer. Update the relevant tests (legalize-ext.mir, select-int-ext.mir) and
add a GISel line to arm64-vabs.ll.

Differential Revision: https://reviews.llvm.org/D60881

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358715 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalISel][AArch64] Legalize v8s8 loads

Add legalizer support for loads of v8s8 and update legalize-load-store.mir.

Differential Revision: https://reviews.llvm.org/D60877

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358714 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-undname: Fix two more asserts-on-invalid, found by oss-fuzz

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358708 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-undname: Fix two asserts-on-invalid

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358707 91177308-0d34-0410-b5e6-96231b3b80d8

[GuardWidening] Wire up a NPM version of the LoopGuardWidening pass

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358704 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] FMF propagation for GlobalIsel

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358702 91177308-0d34-0410-b5e6-96231b3b80d8

[BlockExtractor] Extend the file format to support the grouping of basic blocks

Prior to this patch, each basic block listed in the extract-blocks-file
would be extracted to a different function.

This patch adds support for a comma-separated list of basic blocks
that form a group.

When the region formed by a group is not extractable, e.g., not single
entry, all the blocks of that group are left untouched.

Let us see this new format in action (comments are not part of the
file format):
;; funcName bbName[,bbName...]
   foo      bb1        ;; Extract bb1 in its own function
   foo      bb2,bb3    ;; Extract bb2,bb3 in their own function
   bar      bb1,bb4    ;; Extract bb1,bb4 in their own function
   bar      bb2        ;; Extract bb2 in its own function

Assuming all regions are extractable, this will create one function and
thus one call per region.
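
As a rough illustration of the grouping described above, here is a minimal
Python sketch that reads the listed format into per-function groups. The
function name parse_extract_blocks is hypothetical and this is not the
BlockExtractor implementation; comment stripping is done only so the
annotated listing can be pasted in, since comments are not part of the
file format.

```python
from collections import defaultdict


def parse_extract_blocks(text):
    """Parse 'funcName bbName[,bbName...]' lines into per-function groups."""
    groups = defaultdict(list)
    for raw in text.splitlines():
        # Comments are not part of the real format; they are stripped here
        # only so an annotated listing can be fed in unchanged.
        line = raw.split(";;", 1)[0].strip()
        if not line:
            continue
        func, blocks = line.split()
        # Each line contributes one group, i.e. one candidate region.
        groups[func].append(blocks.split(","))
    return dict(groups)


example = """
foo      bb1        ;; Extract bb1 in its own function
foo      bb2,bb3    ;; Extract bb2,bb3 in their own function
bar      bb1,bb4    ;; Extract bb1,bb4 in their own function
bar      bb2        ;; Extract bb2 in its own function
"""
groups = parse_extract_blocks(example)
region_count = sum(len(regions) for regions in groups.values())
```

With the four lines from the listing, the sketch yields four groups, which
matches the "one function and thus one call per region" reading.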

Differential Revision: https://reviews.llvm.org/D60746

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358701 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Add some PPC vec cost tests to prep for D60160 NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358699 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] combineVectorTruncationWithPACKUS - remove split/concatenation of mask

combineVectorTruncationWithPACKUS is currently splitting the upper bit masking into 128-bit subregs and then concatenating them back together.

This was originally done to avoid regressions that caused existing subregs to be concatenated to the larger type just for the AND masking before being extracted again. This was fixed by @spatel (notably rL303997 and rL347356).

This also lets SimplifyDemandedBits do some further improvements before it hits the recursive depth limit.

My only annoyance with this is that we were broadcasting some xmm masks but we seem to have lost them by moving to ymm - but that's a known issue as the logic in lowerBuildVectorAsBroadcast isn't great.

Differential Revision: https://reviews.llvm.org/D60375#inline-539623

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358692 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopPred] Fix a blatantly obvious bug in r358684

The bug is that I didn't check whether the operands of the invariant_loads were themselves invariant. I don't know how this got missed in the patch and review. I even had an unreduced test case locally, and I remember handling this case, but I must have lost it in one of the rebases. Oops.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358688 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] add tests for improved insertelement to index 0 (PR41512); NFC

Patch proposal in D60852.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358687 91177308-0d34-0410-b5e6-96231b3b80d8

[CMake] Allow custom extensions for externalized debug info

Summary:
Extra flexibility for emitting debug info to external files (remains Darwin only for now).
LLDB needs this functionality to emit a LLDB.framework.dSYM instead of LLDB.dSYM when building the framework, because the latter could conflict with the driver's lldb.dSYM when emitted in the same directory on case-insensitive file systems.

Reviewers: friss, bogner, beanz

Subscribers: mgorny, aprantl, llvm-commits, #lldb

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60862

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358685 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopPredication] Allow predication of loop invariant computations (within the loop)

The purpose of this patch is to eliminate a pass ordering dependence between LoopPredication and LICM. To understand the purpose, consider the following snippet of code inside some loop 'L' with IV 'i'
A = _a.length;
guard (i < A)
a = _a[i]
B = _b.length;
guard (i < B);
b = _b[i];
...
Z = _z.length;
guard (i < Z)
z = _z[i]
accum += a + b + ... + z;

Today, we need LICM to hoist the length loads, LoopPredication to make the guards loop invariant, and TrivialUnswitch to eliminate the loop invariant guard to establish must execute for the next length load. Today, if we can't prove speculation safety, we'd have to iterate these three passes 26 times to reduce this example down to the minimal form.

Using the fact that the array lengths are known to be invariant, we can short circuit this iteration. By forming the loop invariant form of all the guards at once, we remove the need for LoopPredication from the iterative cycle. At the moment, we'd still have to iterate LICM and TrivialUnswitch; we'll leave that part for later.

As a secondary benefit, this allows LoopPred to expose peeling opportunities in a much more obvious manner.  See the udiv test changes as an example.  If the udiv was not hoistable (i.e. we couldn't prove speculation safety) this would be an example where peeling becomes obviously profitable whereas it wasn't before.

A couple of subtleties in the implementation:
- SCEV's isSafeToExpand guarantees speculation safety (i.e. lets us expand at a new point).  It is not a precondition for expansion if we know the SCEV corresponds to a Value which dominates the requested expansion point.
- SCEV's isLoopInvariant returns true for expressions which compute the same value across all iterations executed, regardless of where the original Value is located.  (i.e. it can be in the loop)  This implies we have a speculation burden to prove before expanding them outside loops.
- invariant_loads and AA->pointsToConstantMemory are two cases that SCEV currently does not handle, but which meet the SCEV definition of invariance.  I plan to sink this part into SCEV once this has baked for a bit.

Differential Revision: https://reviews.llvm.org/D60093

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358684 91177308-0d34-0410-b5e6-96231b3b80d8

[SDA] Bug fix: Use IPD outside the loop as divergence bound

Summary:
The immediate post dominator of the loop header may be part of the divergent loop.
Since this /was/ the divergence propagation bound the SDA would not detect joins of divergent paths outside the loop.

Reviewers: nhaehnle

Reviewed By: nhaehnle

Subscribers: mmasten, arsenm, jvesely, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59042

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358681 91177308-0d34-0410-b5e6-96231b3b80d8

Fix a bug in SCEV's isSafeToExpand around speculation safety

isSafeToExpand was making a common, but dangerously wrong, mistake in assuming that if any instruction within a basic block executes, then all instructions within that block must execute.  This can be trivially shown to be false by considering the following small example:
bb:
  add x, y  <-- InsertionPoint
  call @throws()
  udiv x, y <-- SCEV* S
  br ...

It's clearly not legal to expand S above the throwing call, but the previous logic would do so since S dominates (but not properlyDominates) the block containing the InsertionPoint.

Since iterating instructions w/in a block is expensive, this change special cases two cases: 1) S is an operand of InsertionPoint, and 2) InsertionPoint is the terminator of its block.  These two together are enough to keep all current optimizations triggering while fixing the latent correctness issue.

As best I can tell, this is a silent bug in current ToT.  Given that, there are no tests with this change.  It was noticed in an upcoming optimization change (D60093), and was reviewed as part of that.  That change will include the test which caused me to notice the issue.  I'm submitting this separately so that anyone bisecting a problem gets a clear explanation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358680 91177308-0d34-0410-b5e6-96231b3b80d8

MinidumpYAML: Fix ambiguity between std::make_unique and llvm::make_unique

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358673 91177308-0d34-0410-b5e6-96231b3b80d8

MinidumpYAML: Add support for ModuleList stream

Summary:
This patch adds support for yaml (de)serialization of the minidump
ModuleList stream. It's a fairly straightforward application of the
existing patterns to the ModuleList structures defined in previous
patches.

One thing, which may be interesting to call out explicitly is the
addition of "new" allocation functions to the helper BlobAllocator
class. The reason for this was, that there was an emerging pattern of a
need to allocate space for entities, which do not have a suitable
lifetime for use with the existing allocation functions. A typical
example of that was the "size" of various lists, which is only available
as a temporary returned by the .size() method of some container. For
these cases, one can use the new set of allocation functions, which
will take a temporary object, and store it in an allocator-managed
buffer until it is written to disk.

Reviewers: amccarth, jhenderson, clayborg, zturner

Subscribers: lldb-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60405

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358672 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r358607

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358670 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r358633

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358669 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r358620

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358668 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] Add -B mips

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358667 91177308-0d34-0410-b5e6-96231b3b80d8

[yaml2elf/obj2yaml] - Allow normal parsing/dumping of the .rela.dyn section

.rela.dyn is a section that has sh_info normally
set to zero. And Info is an optional field in the description
of the relocation section in YAML.

But currently, yaml2obj would fail to produce the object when
Info is not explicitly listed.

The patch fixes the issue.

Differential revision: https://reviews.llvm.org/D60820

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358656 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Lower ICMP EQ(AND(X,C),C) -> SRA(SHL(X,LOG2(C)),BW-1) iff C is power-of-2.

This replaces the MOVMSK combine introduced at D52121/rL342326

(movmsk (setne (and X, (1 << C)), 0)) -> (movmsk (X << C))

with the more general icmp lowering so it can pick up more cases through bitcasts - notably vXi8 cases which use vXi16 shifts+masks, this patch can remove the mask and use pcmpgtb(0,x) for the sra.
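
The equivalence behind this lowering can be checked by brute force on a
single 8-bit lane. The Python sketch below is only a model, not the X86
lowering code. It assumes the shift amount is the distance that moves the
tested bit into the sign bit, i.e. BW-1-log2(C), which is an interpretation
of the shorthand in the title. The helper names are hypothetical.

```python
def icmp_eq_and(x, c):
    """Reference result: all-ones lane if (x & c) == c, else zero (8-bit lane)."""
    return 0xFF if (x & c) == c else 0x00


def shl_sra(x, c, bw=8):
    """Move the tested bit into the sign bit, then smear it across the lane."""
    mask = (1 << bw) - 1
    log2_c = c.bit_length() - 1
    shift = bw - 1 - log2_c            # assumed shift amount: BW-1-log2(C)
    shifted = (x << shift) & mask      # shl within a bw-bit lane
    # An arithmetic shift right by bw-1 gives all ones iff the sign bit is set.
    return mask if shifted >> (bw - 1) else 0


# Compare both forms for every power-of-2 constant and every 8-bit input.
mismatches = [
    (x, c)
    for c in (1 << k for k in range(8))
    for x in range(256)
    if icmp_eq_and(x, c) != shl_sra(x, c)
]
```

The mismatch list stays empty only because C has a single set bit; for a
constant with several set bits one shifted sign bit cannot represent the
comparison, which is why the title restricts the fold to powers of 2.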

Differential Revision: https://reviews.llvm.org/D60625

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358651 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy][llvm-strip] Add switch to allow removing referenced sections

llvm-objcopy currently emits an error if a section to be removed is
referenced by another section. This is a reasonable thing to do, but is
different to GNU objcopy. We should allow users who know what they are
doing to have a way to produce the invalid ELF. This change adds a new
switch --allow-broken-links to both llvm-strip and llvm-objcopy to do
precisely that. The corresponding sh_link field is then set to 0 instead
of an error being emitted.

I cannot use llvm-readelf/readobj to test the link fields because they
emit an error if any sections, like the .dynsym, cannot be properly
loaded.

Reviewed by: rupprecht, grimar

Differential Revision: https://reviews.llvm.org/D60324

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358649 91177308-0d34-0410-b5e6-96231b3b80d8

Test commit access [NFC]

Remove a trailing space

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358648 91177308-0d34-0410-b5e6-96231b3b80d8

[NewPM] Add Option handling for LoopVectorize

This patch enables passing options to LoopVectorizePass via the passes pipeline.

Reviewers: chandlerc, fedor.sergeev, leonardchan, philip.pfaffe
Reviewed By: fedor.sergeev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D60681

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358647 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Fix wrong ElemSize when calling isConsecutiveLS()

Summary:
This issue is from Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=41177

When the two operands of BUILD_VECTOR are the same, we get an assertion failure:
llvm::SDValue combineBVOfConsecutiveLoads(llvm::SDNode*, llvm::SelectionDAG&):
Assertion `!(InputsAreConsecutiveLoads && InputsAreReverseConsecutive) &&
"The loads cannot be both consecutive and reverse consecutive."' failed.

This error is caused by the wrong ElemSize when calling isConsecutiveLS(). We
should use `getScalarType().getStoreSize();` to get the ElemSize instead of
`getScalarSizeInBits() / 8`.
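
One plausible reading of the failure, sketched in Python under loud
assumptions: the element type is narrower than a byte (i1 is used here as
an assumed example), so the truncating division yields an element size of
0, and two loads from the same address then look both consecutive and
reverse consecutive. The helpers are toy stand-ins, not the PowerPC
backend code, and the commit does not state which element type triggered
the assert.

```python
def size_by_bit_division(bits):
    """Model of getScalarSizeInBits() / 8: truncating division."""
    return bits // 8


def store_size(bits):
    """Model of a store size in bytes: bit width rounded up to whole bytes."""
    return (bits + 7) // 8


def is_consecutive(base_addr, addr, elem_size, dist):
    """Toy stand-in for isConsecutiveLS(): addr sits dist elements after base_addr."""
    return addr - base_addr == dist * elem_size


def classify(addr0, addr1, elem_size):
    """Return (consecutive, reverse_consecutive) for a pair of loads."""
    forward = is_consecutive(addr0, addr1, elem_size, 1)
    reverse = is_consecutive(addr0, addr1, elem_size, -1)
    return forward, reverse


# Two identical operands load from the same (hypothetical) address.
with_division = classify(64, 64, size_by_bit_division(1))
with_store_size = classify(64, 64, store_size(1))
```

For byte-multiple widths such as 32 bits both size computations agree, so
the difference only shows up for element types that are not a whole number
of bytes.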

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D60811

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358644 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-profdata] Fix one bad format in llvm-profdata CommandGuide doc. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358643 91177308-0d34-0410-b5e6-96231b3b80d8

Elaborate why we have an option on by default for enabling chr.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358641 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Avoid DAG combining assert with fneg(fadd(A,0))

fneg combining attempts to turn it into fadd(fneg(A), fneg(0)), but
creating the new fadd folds to just fneg(A). When A has multiple uses,
this confuses it and you get an assert. Fixed.

Differential Revision: https://reviews.llvm.org/D60633

Change-Id: I0ddc9b7286abe78edc0cd8d734fdeb05ff09821c

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358640 91177308-0d34-0410-b5e6-96231b3b80d8

Fix a typo in comments. [NFC]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358639 91177308-0d34-0410-b5e6-96231b3b80d8

[GISel]:IRTranslator: Prefer a buildInstr form that allows CSE of cast instructions

https://reviews.llvm.org/D60844

Use the style of buildInstr that allows CSEing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358637 91177308-0d34-0410-b5e6-96231b3b80d8

Fix bad compare function over FusionCandidate.

Reverse the checking of the dominance order so that when a self compare happens,
it returns false. This gives the compare function a strict weak ordering.
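
A small Python model of why the check order matters. It assumes a toy
dominance relation on a straight-line CFG where block numbers stand for
positions, so this is a sketch of the idea rather than the FusionCandidate
comparator itself. Dominance is reflexive, so testing "lhs dominates rhs"
first makes a self compare return true, which breaks irreflexivity.

```python
def dominates(a, b):
    """Toy dominance on a straight-line CFG: block a dominates b when a <= b.

    Like real dominance, it is reflexive: every block dominates itself.
    """
    return a <= b


def bad_less(lhs, rhs):
    # Checks "lhs dominates rhs" first, so a self compare returns True.
    if dominates(lhs, rhs):
        return True
    return False


def good_less(lhs, rhs):
    # Checks the reverse direction first, so a self compare returns False.
    if dominates(rhs, lhs):
        return False
    return True


def is_irreflexive(less, items):
    """A strict weak ordering must never report an element as less than itself."""
    return not any(less(x, x) for x in items)


blocks = [0, 1, 2, 3]
bad_ok = is_irreflexive(bad_less, blocks)
good_ok = is_irreflexive(good_less, blocks)
```

The fall-through "return True" in good_less is only sound in this toy model
because every pair of blocks on the chain is ordered by dominance.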

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358636 91177308-0d34-0410-b5e6-96231b3b80d8

Revert Implement sys::fs::copy_file using the macOS copyfile(3) API to support APFS clones.

This reverts r358628 (git commit 91a06bee788262a294527b815354f380d99dfa9b)
while investigating a crash reproducer bot failure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358634 91177308-0d34-0410-b5e6-96231b3b80d8

Implement sys::fs::copy_file using the macOS copyfile(3) API
to support APFS clones.

This patch adds a Darwin-specific implementation of
llvm::sys::fs::copy_file() that uses the macOS copyfile(3) API to
support APFS copy-on-write clones, which should be faster and much
more space efficient.

https://developer.apple.com/library/archive/documentation/FileManagement/Conceptual/APFS_Guide/ToolsandAPIs/ToolsandAPIs.html

Differential Revision: https://reviews.llvm.org/D60802

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358628 91177308-0d34-0410-b5e6-96231b3b80d8

Fix formatting. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358623 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] try to widen 'shl' as part of LEA formation

The test file has pairs of tests that are logically equivalent:
https://rise4fun.com/Alive/2zQ

%t4 = and i8 %t1, 8
%t5 = zext i8 %t4 to i16
%sh = shl i16 %t5, 2
%t6 = add i16 %sh, %t0
=>
%t4 = and i8 %t1, 8
%sh2 = shl i8 %t4, 2
%z5 = zext i8 %sh2 to i16
%t6 = add i16 %z5, %t0

...so if we can fold the shift op into LEA in the 1st pattern, then we
should be able to do the same in the 2nd pattern (unnecessary 'movzbl'
is a separate bug I think).
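
The two patterns can also be compared with a quick Python model of the
integer arithmetic, as a sanity check alongside the Alive link. This is a
sketch with hypothetical function names. i8 and i16 wraparound is modelled
with explicit masks, and %t0 is sampled rather than enumerated to keep the
run short.

```python
def wide_shift(t0, t1):
    """First pattern: zext to i16, then shl i16."""
    t4 = t1 & 8                  # and i8 %t1, 8
    t5 = t4                      # zext i8 %t4 to i16
    sh = (t5 << 2) & 0xFFFF      # shl i16 %t5, 2
    return (sh + t0) & 0xFFFF    # add i16 %sh, %t0


def narrow_shift(t0, t1):
    """Second pattern: shl i8, then zext to i16."""
    t4 = t1 & 8                  # and i8 %t1, 8
    sh2 = (t4 << 2) & 0xFF       # shl i8 %t4, 2
    z5 = sh2                     # zext i8 %sh2 to i16
    return (z5 + t0) & 0xFFFF    # add i16 %z5, %t0


sample_t0 = (0, 1, 0x7FFF, 0xFFE0, 0xFFFF)
all_equal = all(
    wide_shift(t0, t1) == narrow_shift(t0, t1)
    for t1 in range(256)
    for t0 in sample_t0
)
```

The narrow shift cannot lose bits here because the mask limits %t4 to 0 or
8, so the shifted value is at most 32 and fits in i8.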

We don't want to do this any sooner though because that would conflict
with generic transforms that try to narrow the width of the shift.

Differential Revision: https://reviews.llvm.org/D60789

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358622 91177308-0d34-0410-b5e6-96231b3b80d8

Test commit by Denis Bakhvalov

Change-Id: I4d85123a157d957434902fb14ba50926b2d56212

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358619 91177308-0d34-0410-b5e6-96231b3b80d8

[AsmPrinter] hoist %a output template to base class for ARM+Aarch64

Summary:
X86 is quite complicated; so I intend to leave it as is. ARM+Aarch64 do
basically the same thing (Aarch64 did not correctly handle immediates,
ARM has a test llvm/test/CodeGen/ARM/2009-04-06-AsmModifier.ll that uses
%a with an immediate) for a flag that should be target independent
anyways.

Reviewers: echristo, peter.smith

Reviewed By: echristo

Subscribers: javed.absar, eraman, kristof.beyls, hiraditya, llvm-commits, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60841

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358618 91177308-0d34-0410-b5e6-96231b3b80d8

Add a getSizeInBits() accessor to MachineMemOperand. NFC.

Cleans up a bunch of places where we do getSize() * 8.

Differential Revision: https://reviews.llvm.org/D60799

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358617 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalISel] Add legalization support for non-power-2 loads and stores

Legalize things like i24 load/store by splitting them into smaller power of 2 operations.

This matches how SelectionDAG handles these operations.
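
As an illustration only, the splitting idea can be sketched in Python: pick
the largest power-of-2 piece, then repeat on the remainder. The helper
names are hypothetical, the byte order in load_i24 is assumed to be little
endian, and the real legalizer rules are likely more involved than this
greedy split.

```python
def split_into_pow2_pieces(bits):
    """Greedily split a bit width into power-of-2 pieces, largest first.

    Returns (bit_offset, piece_width) pairs.
    """
    pieces = []
    offset = 0
    while bits:
        piece = 1 << (bits.bit_length() - 1)   # largest power of 2 <= bits
        pieces.append((offset, piece))
        offset += piece
        bits -= piece
    return pieces


def load_i24(mem, addr):
    """Load a 24-bit value as a 16-bit load plus an 8-bit load (little endian assumed)."""
    low = int.from_bytes(mem[addr:addr + 2], "little")   # 16-bit piece
    high = mem[addr + 2]                                 # 8-bit piece
    return low | (high << 16)


i24_pieces = split_into_pow2_pieces(24)
value = load_i24(bytes([0x56, 0x34, 0x12]), 0)
```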

Differential Revision: https://reviews.llvm.org/D59971

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358613 91177308-0d34-0410-b5e6-96231b3b80d8

Add basic loop fusion pass.

This patch adds a basic loop fusion pass. It will fuse loops that conform to the
following 4 conditions:
  1. Adjacent (no code between them)
  2. Control flow equivalent (if one loop executes, the other loop executes)
  3. Identical bounds (both loops iterate the same number of iterations)
  4. No negative distance dependencies between the loop bodies.

The pass does not make any changes to the IR to create opportunities for fusion.
Instead, it checks if the necessary conditions are met and if so it fuses two
loops together.
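
To make the fourth condition concrete, here is a small Python sketch, not
the pass itself and using made-up loop bodies. The first pair of loops only
has a dependence within the same iteration, so fusing preserves the result.
In the second pair, iteration i of the second loop reads a value written by
iteration i+1 of the first loop (a negative distance), and a naive fusion
reads stale data.

```python
def two_loops(a, b):
    n = len(a)
    x, y = [0] * n, [0] * n
    for i in range(n):
        x[i] = a[i] + 1
    for i in range(n):
        y[i] = x[i] * b[i]       # reads x[i], written in the same iteration index
    return y


def fused(a, b):
    n = len(a)
    x, y = [0] * n, [0] * n
    for i in range(n):           # legal: the dependence distance is 0
        x[i] = a[i] + 1
        y[i] = x[i] * b[i]
    return y


def two_loops_negative(a):
    n = len(a)
    x, y = [0] * (n + 1), [0] * n
    for i in range(n):
        x[i] = a[i] + 1
    for i in range(n):
        y[i] = x[i + 1]          # reads what iteration i+1 of the first loop wrote
    return y


def fused_negative(a):
    n = len(a)
    x, y = [0] * (n + 1), [0] * n
    for i in range(n):           # illegal fusion: x[i + 1] is not written yet
        x[i] = a[i] + 1
        y[i] = x[i + 1]
    return y


a, b = [1, 2, 3], [4, 5, 6]
```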

The pass has not been added to the pass pipeline yet, and thus is not enabled by
default. It can be run stand alone using the -loop-fusion option.

Differential Revision: https://reviews.llvm.org/D55851

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358607 91177308-0d34-0410-b5e6-96231b3b80d8

[AsmPrinter] defer %c to base class for ARM, PPC, and Hexagon. NFC

Summary:
None of these derived classes do anything that the base class cannot.
If we remove these case statements, then the base class can handle them
just fine.

Reviewers: peter.smith, echristo

Reviewed By: echristo

Subscribers: nemanjai, javed.absar, eraman, kristof.beyls, hiraditya, kbarton, jsji, llvm-commits, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60803

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358603 91177308-0d34-0410-b5e6-96231b3b80d8

[ThinLTO] Fix ThinLTOCodegenerator to export llvm.used symbols

Summary:
Reapply r357931 with fixes to ThinLTO testcases and llvm-lto tool.

ThinLTOCodeGenerator currently does not preserve llvm.used symbols and
it can internalize them. In order to pass the necessary information to the
legacy ThinLTOCodeGenerator, the input to the code generator is
rewritten to be based on lto::InputFile.

Now ThinLTO using the legacy LTO API requires a data layout in the
Module.

"internalize" thinlto action in llvm-lto is updated to run both
"promote" and "internalize" with the same configuration as
ThinLTOCodeGenerator. The old "promote" + "internalize" option does not
produce the same output as ThinLTOCodeGenerator.

This fixes: PR41236
rdar://problem/49293439

Reviewers: tejohnson, pcc, kromanova, dexonsmith

Reviewed By: tejohnson

Subscribers: ormris, bd1976llvm, mehdi_amini, inglorion, eraman, hiraditya, jkorous, dexonsmith, arphaman, dang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60421

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358601 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Factor out unreachable inst idiom creation [NFC]

In InstCombine, we use an idiom of "store i1 true, i1 undef" to indicate we've found a path which we've proven unreachable. We can't actually insert the unreachable instruction since that would require changing the CFG. We leave that to simplifycfg later.

This just factors out that idiom creation so we don't duplicate the same mostly undocumented idiom creation in multiple places.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358600 91177308-0d34-0410-b5e6-96231b3b80d8

[LVI][CVP] Constrain values in with.overflow branches

If a branch is conditional on extractvalue(op.with.overflow(%x, C), 1)
then we can constrain the value of %x inside the branch based on
makeGuaranteedNoWrapRegion(). We do this by extending the edge-value
handling in LVI. This allows CVP to then fold comparisons against %x,
as illustrated in the tests.
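
A brute-force Python model of the kind of constraint this gives. It assumes
an 8-bit unsigned add-with-overflow and the constant C = 100, both chosen
only for illustration. The set of %x values that do not overflow is
enumerated directly rather than computed the way
makeGuaranteedNoWrapRegion() does.

```python
BW = 8
UMAX = (1 << BW) - 1


def uadd_with_overflow(x, c):
    """Model of an unsigned add-with-overflow on BW bits: (result, overflow bit)."""
    total = x + c
    return total & UMAX, total > UMAX


def no_overflow_values(c):
    """All x for which the overflow bit is false, i.e. the no-overflow branch."""
    return [x for x in range(UMAX + 1) if not uadd_with_overflow(x, c)[1]]


C = 100
region = no_overflow_values(C)
# On the no-overflow edge, a comparison such as "x > 200" can never be true.
can_exceed_200 = any(x > 200 for x in region)
```

Here the region is the contiguous range [0, UMAX - C], so a comparison of
%x against any constant above 155 is decided on that edge.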

Differential Revision: https://reviews.llvm.org/D60650

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358597 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU][MC] Corrected handling of "-" before expressions

See bug 41156: https://bugs.llvm.org/show_bug.cgi?id=41156

Reviewers: artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D60622

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358596 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] tighten test checks; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358594 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Force skip over SMRD, VMEM and s_waitcnt instructions

Summary: This fixes a large Dawn of War 3 performance regression with RADV from Mesa 19.0 to master which was caused by creating less code in some branches.

Reviewers: arsen, nhaehnle

Reviewed By: nhaehnle

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60824

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358592 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] make test checks more thorough; NFC

This will change with the proposal in D60214.
Unfortunately, the triple is not supported for auto-generation
via script, and the multiple RUN lines have diffs on this test,
but I can't tell exactly what is required by this test.
PR7162 was an assert/crash, so hopefully, this is good enough.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358587 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopUnroll] Allow unrolling if the unrolled size does not exceed loop size.

Summary:
In the following cases, unrolling can be beneficial, even when
optimizing for code size:
1) very low trip counts
2) potential to constant fold most instructions after fully unrolling.

We can unroll in those cases, by setting the unrolling threshold to the
loop size. This might highlight some cost modeling issues and fixing
them will have a positive impact in general.

Reviewers: vsk, efriedma, dmgreen, paquette

Reviewed By: paquette

Differential Revision: https://reviews.llvm.org/D60265

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358586 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombine] Add SimplifyDemandedBits helper that handles demanded elts mask as well

The other SimplifyDemandedBits helpers become wrappers to this new demanded elts variant.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358585 91177308-0d34-0410-b5e6-96231b3b80d8

[Support] Add LEB128 support to BinaryStreamReader/Writer.

Summary:
This patch adds support for ULEB128 and SLEB128 encoding and decoding to
BinaryStreamWriter and BinaryStreamReader respectively.

Support for ULEB128/SLEB128 will be used for eh-frame parsing in the JITLink
library currently under development (see https://reviews.llvm.org/D58704).
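
For readers unfamiliar with the encoding, here is a minimal Python sketch of
ULEB128 and SLEB128. It is not the BinaryStreamReader/Writer API, and the
function names are hypothetical. Each byte carries 7 payload bits, and the
high bit marks that more bytes follow.

```python
def encode_uleb128(value):
    if value < 0:
        raise ValueError("ULEB128 encodes non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)    # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_uleb128(data):
    """Return (value, number_of_bytes_consumed)."""
    result = 0
    for i, byte in enumerate(data):
        result |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            return result, i + 1
    raise ValueError("truncated ULEB128")


def encode_sleb128(value):
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7                    # Python's >> is an arithmetic shift
        sign_bit_set = bool(byte & 0x40)
        # Stop once the remaining bits are pure sign extension of this byte.
        done = (value == 0 and not sign_bit_set) or (value == -1 and sign_bit_set)
        out.append(byte if done else byte | 0x80)
        if done:
            return bytes(out)


def decode_sleb128(data):
    """Return (value, number_of_bytes_consumed)."""
    result = 0
    shift = 0
    for i, byte in enumerate(data):
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            if byte & 0x40:            # sign-extend from the last payload bit
                result -= 1 << shift
            return result, i + 1
    raise ValueError("truncated SLEB128")
```

Both decoders return the number of bytes consumed so that a stream reader
can advance its offset past the variable-length value.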

Reviewers: zturner, dblaikie

Subscribers: kristina, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60810

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358584 91177308-0d34-0410-b5e6-96231b3b80d8

[ScheduleDAGRRList] Recompute topological ordering on demand.

Currently there is a single point in ScheduleDAGRRList, where we
actually query the topological order (besides init code). Currently we
are recomputing the order after adding a node (which does not have
predecessors) and then we add predecessors edge-by-edge.

We can avoid adding edges one-by-one after we added a new node. In that case, we can
just rebuild the order from scratch after adding the edges to the DAG
and avoid all the updates to the ordering.

Also, we can delay updating the DAG until we query the DAG, if we keep a
list of added edges. Depending on the number of updates, we can either
apply them when needed or recompute the order from scratch.

This brings the geomean compile time of CTMark with -O1 down by 0.3% on X86,
with no regressions.

Reviewers: MatzeB, atrick, efriedma, niravd, paquette

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D60125

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358583 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU][MC] Corrected parsing of registers

See bug 41280: https://bugs.llvm.org/show_bug.cgi?id=41280

Reviewers: artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D60621

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358581 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Flag new raw/struct atomic ops as source of divergence

Differential Revision: https://reviews.llvm.org/D60731

Change-Id: I821d93dec8b9cdd247b8172d92fb5e15340a9e7d

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358579 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r358554

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358578 91177308-0d34-0410-b5e6-96231b3b80d8

[LLVM-C] Add DIFile Field Accessors

Summary:
Add accessors for the file, directory, and source file name (curiously, an `Optional` value?) of a DIFile.

This is intended to replace the LLVMValueRef-based accessors used in D52239

Reviewers: whitequark, jberdine, deadalnix

Reviewed By: whitequark, jberdine

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60489

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358577 91177308-0d34-0410-b5e6-96231b3b80d8

[CostModel][X86] Add bool anyof/allof reduction costs

On pre-AVX512 targets we can use MOVMSK to extract reduced boolean results. This is properly optimized; annoyingly, AVX512 isn't, and it produces code that is almost as bad as the (unchanged) costs suggest.

Differential Revision: https://reviews.llvm.org/D60403

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358574 91177308-0d34-0410-b5e6-96231b3b80d8

[DWARF] llvm::Error -> Error. NFC

The unqualified name is more common and is used in the file as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358567 91177308-0d34-0410-b5e6-96231b3b80d8

Change some llvm::{lower,upper}_bound to llvm::bsearch. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358564 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] Support full list of bfd targets that lld uses.

Summary:
This change takes the full list of bfd targets that lld supports (see `ScriptParser.cpp`), including generic handling for `*-freebsd` targets (which uses the same settings but with a FreeBSD OSABI). In particular this adds mips support for `--output-target` (but not yet via `--binary-architecture`).

lld and llvm-objcopy use their own different custom data structures, so I'd prefer to check this in as-is (add support directly in llvm-objcopy, including all the test coverage) and do a separate NFC patch(s) that consolidate the two by putting this mapping into libobject.

See [[ https://bugs.llvm.org/show_bug.cgi?id=41462 | PR41462 ]].

Reviewers: jhenderson, jakehehrlich, espindola, alexshap, arichardson

Reviewed By: arichardson

Subscribers: fedor.sergeev, emaste, sdardis, krytarowski, atanasyan, llvm-commits, MaskRay, arichardson

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60773

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358562 91177308-0d34-0410-b5e6-96231b3b80d8

[CVP] processOverflowIntrinsic(): don't crash if constant-folding happened

As reported by Mikael Holmén in post-commit review in
https://reviews.llvm.org/D60791#1469765

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358559 91177308-0d34-0410-b5e6-96231b3b80d8

[DWARF] Pass ReferenceToDIEOffsets elements by reference

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358558 91177308-0d34-0410-b5e6-96231b3b80d8

Fixed error message printing in write_cmake_config.py

Summary:
Previously, write_cmake_config.py would raise an error while printing
the error, because `leftovers` in "'\n'.join(leftovers)" is a tuple.

Subscribers: mgorny, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60766

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358557 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Autogenerate complete checks. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358556 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] In CopyToFromAsymmetricReg, use VR128 instead of FR32 instructions for GR32<->XMM register copies.

We have two versions of some instructions, VR128 versions and FR32 versions that
are marked as CodeGenOnly.

This change switches to using the VR128 versions for these copies. It's after
register allocation so the class size no longer matters. This matches how GR64
works.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358555 91177308-0d34-0410-b5e6-96231b3b80d8

[MCA] Moved the bottleneck analysis to its own file. NFCI

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358554 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "Add basic loop fusion pass." Per request.

This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358553 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "Temporarily Revert "Add basic loop fusion pass.""

The reversion apparently deleted the test/Transforms directory.

Will be re-reverting again.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358552 91177308-0d34-0410-b5e6-96231b3b80d8

Remove the run-slp-after-loop-vectorization option.

It's been on by default for 4 years, and removing it cleans up the pass
hierarchy.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358548 91177308-0d34-0410-b5e6-96231b3b80d8

Temporarily Revert "Add basic loop fusion pass."
As it's causing some bot failures (and per request from kbarton).

This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358546 91177308-0d34-0410-b5e6-96231b3b80d8

Add basic loop fusion pass.

This patch adds a basic loop fusion pass. It will fuse loops that conform to the
following 4 conditions:
  1. Adjacent (no code between them)
  2. Control flow equivalent (if one loop executes, the other loop executes)
  3. Identical bounds (both loops iterate the same number of iterations)
  4. No negative distance dependencies between the loop bodies.

The pass does not make any changes to the IR to create opportunities for fusion.
Instead, it checks if the necessary conditions are met and if so it fuses two
loops together.

The pass has not been added to the pass pipeline yet, and thus is not enabled by
default. It can be run standalone using the -loop-fusion option.
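
As a source-level illustration of the four conditions above, here is a minimal sketch of two loops that would qualify and the single loop they are equivalent to. The function names are hypothetical, and the pass itself works on LLVM IR rather than on C++ source.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Two adjacent loops with identical bounds. The second loop reads B[I], which
// the first loop wrote in the same iteration I, so there is no negative
// distance dependence between the two bodies.
inline std::vector<int> separateLoops(const std::vector<int> &A) {
  std::vector<int> B(A.size()), C(A.size());
  for (std::size_t I = 0; I < A.size(); ++I)
    B[I] = A[I] * 2;
  for (std::size_t I = 0; I < A.size(); ++I)
    C[I] = B[I] + 1;
  return C;
}

// The same computation with both bodies placed in one loop.
inline std::vector<int> fusedLoop(const std::vector<int> &A) {
  std::vector<int> B(A.size()), C(A.size());
  for (std::size_t I = 0; I < A.size(); ++I) {
    B[I] = A[I] * 2;
    C[I] = B[I] + 1;
  }
  return C;
}
```

Both functions return the same vector for any input, which is the property that fusion has to preserve.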

Phabricator: https://reviews.llvm.org/D55851

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358543 91177308-0d34-0410-b5e6-96231b3b80d8

[ADT] llvm::bsearch, binary search for mere mortals

Summary:
Add to STLExtras a binary search function with a simple mental model:
You provide a range and a predicate which is true above a certain point.
bsearch() tells you that point.
Overloads are provided for integers, iterators, and containers.

This is more suitable than std:: alternatives in many cases:
- std::binary_search only indicates presence/absence
- upper_bound/lower_bound give you the opportunity to pick the wrong one
- all of the options have confusing names and definitions when your predicate
doesn't have simple "less than" semantics
- all of the options require iterators
- we plumb around a useless `value` parameter that should be a lambda capture

The API is inspired by Go's standard library, but we add an extra parameter as
well as some overloads and templates to show how clever C++ is.
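
To make the mental model concrete, here is a minimal standalone sketch of a predicate-based binary search over an integer range. The names `firstTrue` and `firstAtLeast` and their signatures are hypothetical; this is not the API added to STLExtras by this patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not the LLVM API: returns the first index in [Lo, Hi)
// for which Pred is true, or Hi if there is none. Pred must be false for every
// index below some point and true from that point on.
template <typename Predicate>
std::size_t firstTrue(std::size_t Lo, std::size_t Hi, Predicate Pred) {
  assert(Lo <= Hi && "range must not be inverted");
  while (Lo != Hi) {
    std::size_t Mid = Lo + (Hi - Lo) / 2;
    if (Pred(Mid))
      Hi = Mid; // Mid satisfies Pred, so the answer is at or before Mid.
    else
      Lo = Mid + 1; // Mid fails Pred, so the answer is after Mid.
  }
  return Hi;
}

// Index of the first element >= Value in a sorted vector. The searched-for
// value is a lambda capture rather than an extra parameter.
inline std::size_t firstAtLeast(const std::vector<int> &Sorted, int Value) {
  return firstTrue(0, Sorted.size(),
                   [&](std::size_t I) { return Sorted[I] >= Value; });
}
```

The caller only has to state the predicate; there is no choice between a lower-bound and an upper-bound variant to get wrong.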

Reviewers: ilya-biryukov, gribozavr

Subscribers: dexonsmith, kristina, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60779

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358540 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] adjust LEA tests for better coverage; NFC

The scale can be 1, 2, or 3.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358539 91177308-0d34-0410-b5e6-96231b3b80d8

[LLVM-C] Add Accessors For Global Variable Metadata Properties

Summary: Metadata for a global variable is really a (GlobalVariable, Expression) tuple. Allow access to these, then allow retrieving the file, scope, and line for a DIVariable, whether global or local. This should be the last of the accessors required for uniform access to location and file information metadata.

Reviewers: jberdine, whitequark, deadalnix

Reviewed By: jberdine, whitequark

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60725

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358532 91177308-0d34-0410-b5e6-96231b3b80d8

Fix a typo in comments. [NFC]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358531 91177308-0d34-0410-b5e6-96231b3b80d8

[NVPTXAsmPrinter] clean up dead code. NFC

Summary:
The printOperand function takes a default parameter, for which there are
zero call sites that explicitly pass such a parameter. As such, there
is no case to support. This means that the method
printVecModifiedImmediate is purely dead code, and can be removed.

The eventual goal for some of these AsmPrinter refactorings is to have
printOperand be a virtual method, making it easier to print operands
from the base class for more generic Asm printing. It will help if all
printOperand methods have the same function signature (i.e., no Modifier
argument when not needed).

Reviewers: echristo, tra

Reviewed By: echristo

Subscribers: jholewinski, hiraditya, llvm-commits, craig.topper, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60727

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358527 91177308-0d34-0410-b5e6-96231b3b80d8

[TargetLowering] Rename preferShiftsToClearExtremeBits and shouldFoldShiftPairToMask (PR41359)

As discussed on PR41359, this patch renames the pair of shift-mask target feature functions to make their purposes more obvious.

shouldFoldShiftPairToMask -> shouldFoldConstantShiftPairToMask

preferShiftsToClearExtremeBits -> shouldFoldMaskToVariableShiftPair
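
For readers unfamiliar with the two forms the names refer to, here is a small sketch of the equivalences involved, assuming the hooks cover patterns of this general shape. The helper names and the fixed 32-bit width are illustrative only and are not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Constant shift pair and the equivalent mask: both clear the low 8 bits.
inline std::uint32_t clearLowByShifts(std::uint32_t X) { return (X >> 8) << 8; }
inline std::uint32_t clearLowByMask(std::uint32_t X) { return X & 0xFFFFFF00u; }

// Mask with a variable bit count and the equivalent variable shift pair:
// both clear the high Y bits. Y must be less than 32.
inline std::uint32_t clearHighByMask(std::uint32_t X, unsigned Y) {
  assert(Y < 32 && "shift amount out of range");
  return X & (0xFFFFFFFFu >> Y);
}
inline std::uint32_t clearHighByShifts(std::uint32_t X, unsigned Y) {
  assert(Y < 32 && "shift amount out of range");
  return (X << Y) >> Y;
}
```

Which form is cheaper depends on the target, which is why the decision is exposed as a pair of target hooks.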

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358526 91177308-0d34-0410-b5e6-96231b3b80d8

[EarlyCSE] detect equivalence of selects with inverse conditions and commuted operands (PR41101)

This is one of the problems discussed in the post-commit thread for:
rL355741 / http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190311/635516.html
and filed as:
https://bugs.llvm.org/show_bug.cgi?id=41101

Instcombine tries to canonicalize some of these cases (and there's room for improvement
there independently of this patch), but it can't always do that because of extra uses.
So we need to recognize these commuted operand patterns here in EarlyCSE. This is similar
to how we detect commuted compares and commuted min/max/abs.
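
The pattern can be sketched at the source level as follows. This only illustrates why the two forms are equivalent for integer compares; the function names are hypothetical and the sketch does not show how EarlyCSE hashes or compares instructions.

```cpp
#include <cassert>

// select (A < B), X, Y
inline int selectLess(int A, int B, int X, int Y) { return A < B ? X : Y; }

// select (A >= B), Y, X: the inverse predicate with the two arms swapped.
// For integers, A >= B is exactly !(A < B), so this always yields the same
// value as selectLess and one of the two can be replaced by the other.
inline int selectNotLess(int A, int B, int X, int Y) { return A >= B ? Y : X; }
```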

Differential Revision: https://reviews.llvm.org/D60723

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358523 91177308-0d34-0410-b5e6-96231b3b80d8

Time profiler: optimize json output time

Summary:
Use llvm::json::Array.reserve() to optimize json output time. Here is the motivation:
https://reviews.llvm.org/D60609#1468941. In short: for a json array
with ~32K entries, reserving memory up front saves ~4% of the total time
compared to pushing back each entry without reservation: (3995-3845)/3995 = 3.75%.
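
The effect can be sketched with std::vector as a stand-in, so that the example stays self-contained. The function name `buildEntries` is hypothetical, and the assumption is that the json array behaves like a vector with respect to reallocation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Builds N entries. With Reserve set, memory is reserved once up front so
// that push_back never has to reallocate and move the existing elements.
inline std::vector<std::string> buildEntries(std::size_t N, bool Reserve) {
  std::vector<std::string> Entries;
  if (Reserve)
    Entries.reserve(N);
  for (std::size_t I = 0; I < N; ++I)
    Entries.push_back("entry" + std::to_string(I));
  return Entries;
}
```

Both modes produce the same contents; only the number of reallocations along the way differs.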

Reviewers: lebedev.ri

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60792

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358522 91177308-0d34-0410-b5e6-96231b3b80d8

[CVP] Simplify umulo and smulo that cannot overflow

If a umul.with.overflow or smul.with.overflow operation cannot
overflow, simplify it to a simple mul nuw / mul nsw. After the
refactoring in D60668 this is just a matter of removing an
explicit check against multiplications.
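
A rough sketch of the reasoning, assuming unsigned 32-bit operands whose upper bounds are known. The helper names are hypothetical; the pass itself derives ranges from its own analysis and rewrites IR intrinsics, which is not shown here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if A * B fits in 32 bits for every unsigned
// A <= MaxA and B <= MaxB. The product of the two bounds is computed in
// 64 bits, where it cannot itself overflow.
inline bool umulCannotOverflow(std::uint32_t MaxA, std::uint32_t MaxB) {
  std::uint64_t Wide = static_cast<std::uint64_t>(MaxA) * MaxB;
  return Wide <= UINT32_MAX;
}

// Both operands are at most 0xFFFF, so the overflow bit of a checked
// multiply would always be false and a plain multiply is equivalent.
inline std::uint32_t mulSmall(std::uint16_t A, std::uint16_t B) {
  return static_cast<std::uint32_t>(A) * B;
}
```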

Differential Revision: https://reviews.llvm.org/D60791

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358521 91177308-0d34-0410-b5e6-96231b3b80d8

[Support][JSON] Add reserve() to json Array

Summary:
Reserving space improves json lib performance for arrays with a large number of entries.
Here is an example and discussion: https://reviews.llvm.org/D60609#1468941

Reviewers: lebedev.ri, sammccall

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60788

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358520 91177308-0d34-0410-b5e6-96231b3b80d8

[SLP] Refactoring of the operand reordering code.

This is a refactoring patch which should have all the functionality of the current code. Its goal is twofold:
i. Clean up and simplify the reordering code, and
ii. Generalize reordering so that it will work for an arbitrary number of operands, not just 2.

This is the second patch in a series of patches that will enable operand reordering across chains of operations. An example of this was presented at EuroLLVM'18: https://www.youtube.com/watch?v=gIEn34LvyNo

Committed on behalf of @vporpo (Vasileios Porpodas)

Differential Revision: https://reviews.llvm.org/D59973

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358519 91177308-0d34-0410-b5e6-96231b3b80d8

[CVP] Add tests for non-overflowing mulo; NFC

Should be simplified to simple mul.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358517 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX] X86ISD::PERMV/PERMV3 node types can never fold index ops

Improves codegen demonstrated by D60512 - instructions represented by X86ISD::PERMV/PERMV3 can never memory fold the operand used for their index register.

This patch updates the 'isUseOfShuffle' helper into the more capable 'isFoldableUseOfShuffle' that recognises that the op is used for a X86ISD::PERMV/PERMV3 index mask and can't be folded - allowing us to use broadcast/subvector-broadcast ops to reduce the size of the mask constant pool data.

Differential Revision: https://reviews.llvm.org/D60562

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358516 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Prune fshl/fshr with masked operands

If a constant shift amount is used, then only some of the LHS/RHS
operand bits are demanded and we may be able to simplify based on
that. InstCombineSimplifyDemanded already had the necessary support
for that, we just weren't calling it with fshl/fshr as root.

In particular, this allows us to relax some masked funnel shifts
into simple shifts, as shown in the tests.
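
The demanded-bits argument can be sketched for a 32-bit funnel shift left by a constant 8. The helper `fshl32` is a hypothetical stand-in for the intrinsic and only handles shift amounts strictly between 0 and 32.

```cpp
#include <cassert>
#include <cstdint>

// 32-bit funnel shift left: concatenate A:B, shift left by S, keep the high
// 32 bits. Only the low (32 - S) bits of A and the high S bits of B can
// reach the result.
inline std::uint32_t fshl32(std::uint32_t A, std::uint32_t B, unsigned S) {
  assert(S > 0 && S < 32 && "sketch only handles 0 < S < 32");
  return (A << S) | (B >> (32 - S));
}

// With S = 8, a mask on A that keeps its low 24 bits changes nothing.
inline std::uint32_t maskedFirstOperand(std::uint32_t A, std::uint32_t B) {
  return fshl32(A & 0x00FFFFFFu, B, 8);
}

// With S = 8, masking away the high 8 bits of B removes every bit of B that
// was demanded, so the funnel shift relaxes to a simple shift of A.
inline std::uint32_t maskedSecondOperand(std::uint32_t A, std::uint32_t B) {
  return fshl32(A, B & 0x00FFFFFFu, 8);
}
```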

Patch by Shawn Landden.

Differential Revision: https://reviews.llvm.org/D60660

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358515 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Add tests for fshl/fshr with masked operands; NFC

Baseline tests for D60660.

Patch by Shawn Landden.

Differential Revision: https://reviews.llvm.org/D60688

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358514 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] add more tests for LEA formation; NFC

Promoting the shift to the wider type should allow LEA.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358513 91177308-0d34-0410-b5e6-96231b3b80d8

[IR] Add WithOverflowInst class

This adds a WithOverflowInst class with a few helper methods to get
the underlying binop, signedness and nowrap type and makes use of it
where sensible. There will be two more uses in D60650/D60656.

The refactorings are all NFC, though I left some TODOs where things
could be improved. In particular we have two places where add/sub are
handled but mul isn't.

Differential Revision: https://reviews.llvm.org/D60668

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358512 91177308-0d34-0410-b5e6-96231b3b80d8

[Tests] Add branch_weights to latches so that test is not affected by future profitability patch to LoopPredication

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358506 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Remove nondeterministic traversal order

Patch by Sergei Larin.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358505 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objdump] Test tabs in disassemble-align.s with a more visible character

Summary: Apply rupprecht's suggestion in D60376

Reviewers: rupprecht

Reviewed By: rupprecht

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60777

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358504 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] Add missing flag to addressing mode check

The checks in `canFoldInAddressingMode` tested for addressing modes that have a
base register but didn't set the `HasBaseReg` flag to true (it's false by
default). This patch fixes that. Although the omission of the flag was
technically incorrect it had no known observable impact, so no tests were
changed by this patch.

Differential Revision: https://reviews.llvm.org/D60314

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358502 91177308-0d34-0410-b5e6-96231b3b80d8

[OCaml] Update api to account for FNeg and CallBr instructions

Summary:
This diff adds minimal support for the recent FNeg and CallBr
instructions to the OCaml bindings.

Reviewers: whitequark

Reviewed By: whitequark

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60680

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358501 91177308-0d34-0410-b5e6-96231b3b80d8

[RISCV] Custom lower SHL_PARTS, SRA_PARTS, SRL_PARTS

When not optimizing for minimum size (-Oz) we custom lower wide shifts
(SHL_PARTS, SRA_PARTS, SRL_PARTS) instead of expanding to a libcall.
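
For context, a wide left shift built from two half-width parts can be sketched as below. This shows the general technique on a 64-bit value held in two 32-bit halves; the function name is hypothetical and the code is not claimed to match the exact instruction sequence the backend emits.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Shifts the 64-bit value Hi:Lo left by Shamt (0..63) using only 32-bit
// operations. Returns the new halves as {Lo, Hi}.
inline std::pair<std::uint32_t, std::uint32_t>
shlParts(std::uint32_t Lo, std::uint32_t Hi, unsigned Shamt) {
  assert(Shamt < 64 && "shift amount out of range");
  if (Shamt < 32) {
    // (Lo >> 1) >> (31 - Shamt) equals Lo >> (32 - Shamt), but it stays well
    // defined when Shamt is 0, where a 32-bit shift by 32 would not be.
    std::uint32_t NewHi = (Hi << Shamt) | ((Lo >> 1) >> (31 - Shamt));
    return {Lo << Shamt, NewHi};
  }
  // Everything in the old low half moves into the high half.
  return {0u, Lo << (Shamt - 32)};
}
```

The right-shift variants follow the same two-case structure, with the arithmetic version filling the vacated high half with copies of the sign bit.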

Differential Revision: https://reviews.llvm.org/D59477

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358498 91177308-0d34-0410-b5e6-96231b3b80d8