David Blaikie [Sun, 30 Jul 2017 22:10:00 +0000 (22:10 +0000)]
DebugInfo: Use base address selection entries in debug_ranges to reduce relocations
(from comments in the test)
Group ranges in a range list that apply to the same section and use a base
address selection entry to reduce the number of relocations to one reloc per
section per range list. DWARF5 debug_rnglist will be more efficient than this
in terms of relocations, but it's still better than one reloc per entry in a
range list.
This is an object/executable size tradeoff - shrinking objects, but growing
the linked executable. In one large binary tested, total object size (not just
debug info) shrank by 16%, entirely relocation entries. Linked executable
grew by 4%. This was with compressed debug info in the objects, uncompressed
in the linked executable. Without compression in the objects, the win would be
smaller (the growth of debug_ranges itself would be more significant).
Use `llvm-objdump -dwarf=frames` to dump the .eh_frame to validate the
output textually rather than compare the binary output. This makes it
easier to see what is being checked. NFC.
David Blaikie [Sun, 30 Jul 2017 08:12:07 +0000 (08:12 +0000)]
DebugInfo: Use DWP cu_index to speed up symbolizing (as intended)
I was a bit lazy when I first implemented this & skipped the index
lookup - obviously for large files this becomes pretty crucial, so here
we go, do the index lookup. Speeds up large DWP symbolizing by... lots.
(20m -> 20s, actually, maybe more in a release build (that was a release
build without index lookup, compared to a debug/non-release build with
the index usage))
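For context, the cu_index lookup amounts to a hash-table probe keyed on the
split CU's 64-bit signature. A minimal sketch of that probe shape follows
(simplified model; the names are hypothetical and this is not the actual
DWARFUnitIndex code):
#include <cstdint>
// Probe an open-addressed table of unit signatures, as a DWP index does.
// NumSlots must be a power of two; a zero signature marks an empty slot.
int64_t lookupUnitSlot(const uint64_t *Signatures, uint64_t NumSlots,
                       uint64_t Sig) {
  uint64_t Mask = NumSlots - 1;
  uint64_t H = Sig & Mask;                  // initial slot
  uint64_t Step = ((Sig >> 32) & Mask) | 1; // odd step visits every slot
  while (Signatures[H] != Sig) {
    if (Signatures[H] == 0)
      return -1; // empty slot: signature not in the index
    H = (H + Step) & Mask;
  }
  return static_cast<int64_t>(H); // row index into the offset/size tables
}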
David Blaikie [Sun, 30 Jul 2017 01:34:08 +0000 (01:34 +0000)]
DebugInfo: Provide option for explicitly specifying the name of the DWP file
If you've archived the DWP file somewhere it's probably useful to be
able to just tell llvm-symbolizer where it is when you're symbolizing
stack traces from the binary.
This only provides a mechanism for specifying a single DWP file, which is
fine if you're symbolizing a program with a single DWP file. But if the
program is dynamically linked, you might well have a DWP for each dynamic
library - in which case this feature won't help (at least as it's surfaced
in llvm-symbolizer for now). In theory it could be extended to specify a
collection of DWP files that could all be consulted for split CU hash
resolution.
[AArch64] Tie source and destination operands for AESMC/AESIMC.
Summary:
Most CPUs implementing AES fusion require instruction pairs of the form
AESE Vn, _
AESMC Vn, Vn
and
AESD Vn, _
AESIMC Vn, Vn
The constraint is added to AES(I)MC instructions which use the result of
an AES(E|D) instruction by using AES(I)MCTrr pseudo instructions, which
constrain the source and destination registers to be the same.
A nice side effect of this change is that now all possible pairs are
scheduled back-to-back on the exynos-m1 for the misched-fusion-aes.ll
test case.
I had to update aes_load_store. The version I added initially was very
reduced and with the new constraint, AESE/AESMC could not be scheduled
back-to-back. I updated the test to be more realistic and still expose
the same scheduling problem as the initial test case.
[AArch64] Use 8 bytes as preferred function alignment on Cortex-A53.
Summary:
This change gives a 0.25% speedup on execution time, a 0.82% improvement
in benchmark scores and a 0.20% increase in binary size on a Cortex-A53.
These numbers are the geomean results on a wide range of benchmarks from
the test-suite and a range of proprietary suites.
Rather than passing along most of the parameters, pass a reference to
the MCDwarfFrameInfo instead. This makes it easier to pass additional
information about the frame to the checks. We need to keep the extra
constructor for the Key around to allow the construction of the null and
tombstone keys. NFC.
If the return column is different, we cannot coalesce the CIE across the
FDEs. Add that to the key calculation. This ensures that we emit a
separate CIE.
move test/Transforms/SimplifyCFG/disable-lookup-table.ll into test/Transforms/SimplifyCFG/X86/disable-lookup-table.ll to avoid test failure when the X86 backend is not enabled
Simon Pilgrim [Sat, 29 Jul 2017 14:50:25 +0000 (14:50 +0000)]
[SelectionDAG][X86] CombineBT - more aggressively determine demanded bits
This patch is in 2 parts:
1 - replace combineBT's use of SimplifyDemandedBits (hasOneUse only) with SelectionDAG::GetDemandedBits to more aggressively determine the lower bits used by BT.
2 - update SelectionDAG::GetDemandedBits to support ANY_EXTEND - if the demanded bits are only in the non-extended portion, then peek through, demand the bits from the source value, and ANY_EXTEND the result if we found a match.
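As a rough model of why this helps for BT (illustrative only, not the
SelectionDAG code): BT on a 64-bit operand only reads the low six bits of the
bit index, so when the index was ANY_EXTENDed from a narrower type, every
demanded bit lies in the non-extended portion and can be demanded directly
from the source:
#include <cstdint>
// bt64 models the BT semantics: only Idx & 63 matters, i.e. just the low
// six bits of the index are demanded. For an index ANY_EXTENDed from i8,
// those bits all sit in the original 8-bit portion, so the extension can
// be peeked through.
uint64_t bt64(uint64_t V, uint64_t Idx) {
  return (V >> (Idx & 63)) & 1;
}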
Michal Gorny [Sat, 29 Jul 2017 08:10:24 +0000 (08:10 +0000)]
[OCaml] Pass -D/-UNDEBUG through to ocamlc
Detect [/-][DU]NDEBUG in CMAKE_C_FLAGS* and pass them through to ocamlc.
This is necessary because their value might affect visibility of dump
functions in LLVM and ocamlc uses its own compiler and flags by default.
Michal Gorny [Sat, 29 Jul 2017 06:46:45 +0000 (06:46 +0000)]
[OCaml] Install dynamic libraries in 'stublibs' directory
Install the OCaml dynamic libraries in the 'stublibs' directory rather
than the llvm subdirectory in order to fix running executables created
by ocamlc. Otherwise, the executables fail to run, being unable to locate
the libraries (unless the LLVM directory is explicitly added to
LD_LIBRARY_PATH).
The staging directories are not altered since they work for our
development setup anyway, and installing into two directories would
unnecessarily make the code more complex.
Summary:
Now that SamplePGOSupport is part of PGOOpt, there are several places that need tweaking:
1. AddDiscriminator pass should *not* be invoked at ThinLTOBackend (as it's already invoked in the PreLink phase)
2. addPGOInstrPasses should only be invoked when either ProfileGenFile or ProfileUseFile is non-empty.
3. SampleProfileLoaderPass should only be invoked when SampleProfileFile is non-empty.
4. PGOIndirectCallPromotion should only be invoked in ProfileUse phase, or in ThinLTOBackend of SamplePGO.
[MachineOutliner] NFC: Change IsTailCall to a call class + frame class
This commit
- Removes IsTailCall and replaces it with a target-defined unsigned
- Refactors getOutliningCallOverhead and getOutliningFrameOverhead so that they don't use IsTailCall
- Adds a call class + frame class classification to OutlinedFunction and Candidate respectively
This accomplishes a couple things.
Firstly, we don't need the notion of *tail call* in the general outlining algorithm.
Secondly, we now can have different "outlining classes" for each candidate within a set of candidates.
This will make it easy to add new ways to outline sequences for certain targets and dynamically choose
an appropriate cost model for a sequence depending on the context in which that sequence lives.
Ultimately, this should get us closer to being able to do something like, say, avoiding the save of the
link register when outlining AArch64 instructions.
lit::shtest-format.py: Make write-bad-encoding.py py3-aware.
Traceback (most recent call last):
File "llvm/utils/lit/tests/Inputs/shtest-format/external_shell/write-bad-encoding.py", line 5, in <module>
sys.stdout.write(b"a line with bad encoding: \xc2.")
On Python 3, sys.stdout.write doesn't accept bytes, but sys.stdout.buffer.write does.
This diff removes the second argument of the method MachOObjectFile::exports.
In all in-tree uses this argument is equal to "this", and the
interface is cleaner without it.
Eli Friedman [Fri, 28 Jul 2017 23:58:24 +0000 (23:58 +0000)]
Fix update_llc_test_checks.py ARM parsing
When I tried running the script, the ARM regex parser could not parse
my code. It failed because the .Lfunc_end line has a comment at the
end of it, so this commit removes the newline at the end of the regex.
[Inliner] Do not apply any bonus for cold callsites.
Summary:
Inlining threshold is increased by application of bonuses when the
callee has a single reachable basic block or is rich in vector
instructions. Similarly, inlining cost is reduced by applying a large
bonus when the last call to a static function is considered for
inlining. This patch disables the application of these bonuses when the
callsite or the callee is cold. The intention here is to prevent a large
cold callsite from being inlined into a non-cold caller, which could prevent
the caller itself from being inlined. This is especially important when the
cold callsite is a last call to a static since the associated bonus is
very high.
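A toy model of the gating described above (the numbers and names are
hypothetical; the real logic lives in the inline cost analysis):
// Sketch: apply the large last-call-to-static bonus only when the callsite
// is not cold, so a cold callsite can't ride the bonus into a hot caller.
int computeInlineCost(int BaseCost, bool LastCallToStatic, bool IsColdSite) {
  int Cost = BaseCost;
  if (LastCallToStatic && !IsColdSite)
    Cost -= 15000; // hypothetical bonus value
  return Cost;
}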
Adrian Prantl [Fri, 28 Jul 2017 20:21:02 +0000 (20:21 +0000)]
Remove the obsolete offset parameter from @llvm.dbg.value
There is no situation where this rarely-used argument cannot be
substituted with a DIExpression and removing it allows us to simplify
the DWARF backend. Note that this patch does not yet remove any of
the newly dead code.
[SLP] Allow vectorization of instructions from the same basic block only, NFC.
Summary:
After some changes in the SLP vectorizer we were missing some additional
checks that limit which instructions are considered for vectorization. We
should not analyze an instruction if its parent is not the same as the
parent of the first instruction in the tree, or if it was already analyzed.
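A minimal sketch of the shape of that guard (hypothetical types, not the SLP
vectorizer's actual data structures):
struct Instr {
  const void *Parent; // owning basic block
  bool Analyzed;      // already visited by the tree builder
};
// Reject instructions outside the first instruction's block, or re-visits.
bool worthAnalyzing(const Instr &First, const Instr &I) {
  return I.Parent == First.Parent && !I.Analyzed;
}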
Fix conditional tail call branch folding when both edges are the same
The conditional tail call logic did the wrong thing when both
destinations of a conditional branch were the same:
BB#1: derived from LLVM BB %entry
Live Ins: %EFLAGS
Predecessors according to CFG: BB#0
JE_1 <BB#5>, %EFLAGS<imp-use,kill>
JMP_1 <BB#5>
BB#5: derived from LLVM BB %sw.epilog
Predecessors according to CFG: BB#1
TCRETURNdi64 <ga:@mergeable_conditional_tailcall>, 0, ...
We would fold the JE_1 to a TCRETURNdi64cc, and then remove our BB#5
successor. Then BB#5 would be deleted as it had no predecessors, leaving
a dangling "JMP_1 <BB#5>" reference behind to cause assertions later.
This patch checks that both conditional branch destinations are
different before doing the transform. The standard branch folding logic
is able to remove both the JMP_1 and the JE_1, and for my test case we
end up forming a better conditional tail call later.
Matt Arsenault [Fri, 28 Jul 2017 19:06:16 +0000 (19:06 +0000)]
AMDGPU: Look through a bitcast user of an out argument
This allows handling of a lot more of the interesting
cases in Blender. Most of the large functions unlikely
to be inlined have this pattern.
This is a special case for what clang emits for OpenCL 3
element vectors. Annoyingly, these are emitted as
<3 x elt>* pointers, but accessed as <4 x elt>* operations.
This also needs to handle cases where a struct containing
a single vector is used.
Matt Arsenault [Fri, 28 Jul 2017 18:40:05 +0000 (18:40 +0000)]
AMDGPU: Add pass to replace out arguments
It is better to return arguments directly in registers
if we are making a call rather than introducing expensive
stack usage. In a sample compile of one of Blender's
many kernel variants, this fires on about 20 different
functions. Future improvements may be to
recognize simple cases where the pointer is indexing a small
array. This also fails when the store to the out argument
is in a separate block from the return, which happens in
a few of the Blender functions. This should also probably
be using MemorySSA which might help with that.
I'm not sure this is correct as a FunctionPass, but
MemoryDependenceAnalysis seems to not work with
a ModulePass.
I'm also not sure where it should run. I think it should
run before DeadArgumentElimination, so maybe either
EP_CGSCCOptimizerLate or EP_ScalarOptimizerLate.
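At the source level, the rewrite the pass aims for looks roughly like this
(hand-written illustration, not actual pass output):
// Before: the result travels through an out pointer, forcing a stack slot
// at non-inlined call sites.
void computeBefore(float *Out) { *Out = 42.0f; }
// After: the value is returned directly in registers; the caller becomes
// "float X = computeAfter();" and the store/reload pair disappears.
float computeAfter() { return 42.0f; }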
[LVI] Constant-propagate a zero extension of the switch condition value through case edges
Summary:
LazyValueInfo currently computes the constant value of the switch condition through case edges, which allows the constant value to be propagated through the case edges.
But we have seen a case where a zero-extended value of the switch condition is used past case edges for which the constant propagation doesn't occur.
This patch adds a small logic to handle such a case in getEdgeValueLocal().
This is motivated by the Python 2.7 eval loop in PyEval_EvalFrameEx() where the lack of the constant propagation causes longer live ranges and more spill code than necessary.
With this patch, we see that the code size of PyEval_EvalFrameEx() decreases by ~5.4% and a performance test improves by ~4.6%.
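A small example of the covered pattern (illustrative source, not from the
patch): on the edge into a case, the zero-extended use of the condition is
now known to be constant as well.
#include <cstdio>
void dispatch(unsigned char Op) {
  switch (Op) {
  case 7: {
    unsigned long Wide = Op; // zext of the switch condition
    // With this patch, LVI knows Wide == 7 on this edge, so the use below
    // can be folded instead of keeping Op live across the case body.
    printf("%lu\n", Wide);
    break;
  }
  default:
    break;
  }
}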
Matt Arsenault [Fri, 28 Jul 2017 15:52:08 +0000 (15:52 +0000)]
AMDGPU: Annotate implicitarg.ptr usage
We need to pass something to functions for this to work.
It isn't derivable just from the kernarg segment pointer
because the implicit arguments are placed after the
kernel arguments.
Wei Mi [Fri, 28 Jul 2017 15:47:25 +0000 (15:47 +0000)]
[GVN] Recommit the patch "Add phi-translate support in scalarpre"
Recommit after working around bug PR31652.
Three bugs fixed in previous recommits: The first one is to use CurrentBlock
instead of PREInstr's Parent as param of performScalarPREInsertion, because
the Parent of a clone instruction may be uninitialized. The second one is to
stop PRE when the edge from a predecessor to CurrentBlock is a backedge and
an operand of CurInst is defined inside CurrentBlock: the same value defined
inside the loop in the last iteration cannot be regarded as available. The
third one is an out-of-bound array access in a flipped if guard.
Right now scalarpre doesn't have phi-translate support, so it will miss some
simple PRE opportunities. In the following testcase, current scalarpre cannot
recognize that the last "a * b" is fully redundant, because the a and b used
by the last "a * b" expr are both defined by phis.
long a[100], b[100], g1, g2, g3;
__attribute__((pure)) long goo();
void foo(long a, long b, long c, long d) {
g1 = a * b;
if (__builtin_expect(g2 > 3, 0)) {
a = c;
b = d;
g2 = a * b;
}
g3 = a * b; // fully redundant.
}
The patch adds phi-translate support in scalarpre. This is only a temporary
solution before the newpre based on newgvn is available.
Chris Bieneman [Fri, 28 Jul 2017 15:33:35 +0000 (15:33 +0000)]
[CMake] NFC. Add intrinsics_gen target to CMake Exports
By creating a dummy of this target in LLVMConfig.cmake, projects that can build against out-of-tree LLVM can freely depend on the target without needing conditionals for whether LLVM is in-tree or out-of-tree.
Peter Smith [Fri, 28 Jul 2017 09:21:00 +0000 (09:21 +0000)]
[ARM] Add test to check pcs of ARM ABI runtime floating point helpers
The ARM Runtime ABI document (IHI0043) defines the AEABI floating point
helper functions in section 4.1.2 The floating-point helper functions.
The functions listed in this section must always use the base AAPCS calling
convention.
This test generates calls to all the helper functions that llvm supports
and checks that the base AAPCS calling convention has been used. We test
the equivalent of -mfloat-abi=soft, -mfloat-abi=softfp, and -mfloat-abi=hard
with an FPU that supports single and double precision, and one that only
supports single precision.
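For example (illustrative; assume a target whose FPU lacks double-precision
support): the addition below lowers to a call to the AEABI helper
__aeabi_dadd, and that call must use the base AAPCS (arguments in core
registers) no matter which float ABI the surrounding code uses.
// Compiles to a __aeabi_dadd libcall when there is no double-precision
// hardware; the helper is always called with the base AAPCS.
double add_doubles(double A, double B) { return A + B; }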
Max Kazantsev [Fri, 28 Jul 2017 06:42:15 +0000 (06:42 +0000)]
[SCEV] Do not visit nodes twice in containsConstantSomewhere
This patch reworks the function that searches constants in Add and Mul SCEV expression
chains so that now it does not visit a node more than once, and also renames this function
for better correspondence between its implementation and semantics.
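The general shape of the fix is a worklist walk over the expression DAG with
a visited set (generic sketch; the SCEV node types are elided):
#include <set>
#include <vector>
struct Node {
  bool IsConstant;
  std::vector<const Node *> Operands;
};
// Return true if any node reachable from Root is a constant, visiting each
// shared subexpression exactly once.
bool containsConstantSomewhere(const Node *Root) {
  std::set<const Node *> Visited;
  std::vector<const Node *> Worklist = {Root};
  while (!Worklist.empty()) {
    const Node *N = Worklist.back();
    Worklist.pop_back();
    if (!Visited.insert(N).second)
      continue; // already seen: this is the "do not visit twice" part
    if (N->IsConstant)
      return true;
    Worklist.insert(Worklist.end(), N->Operands.begin(), N->Operands.end());
  }
  return false;
}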
[MachineOutliner] NFC: Split up getOutliningBenefit
This is some more cleanup in preparation for some actual
functional changes. This splits getOutliningBenefit into
two cost functions: getOutliningCallOverhead and
getOutliningFrameOverhead. These functions return the
number of instructions that would be required to call
a specific function and the number of instructions
that would be required to construct a frame for a
specific function. The actual outlining benefit logic
is moved into the outliner, which calls these functions.
The goal of refactoring getOutliningBenefit is to:
- Get us closer to getting rid of the IsTailCall flag
- Further split up "target-specific" things and
"general algorithm" things
JumpThreading claims to preserve LVI, but it doesn't preserve
the analyses which LVI holds a reference to (e.g. the Dominator).
In the current pass manager infrastructure, after JT runs, the
PM frees these analyses (including DominatorTree) but preserves
LVI.
CorrelatedValuePropagation runs immediately after and queries
a corrupted domtree, causing weird miscompiles.
This commit disables the preservation of LVI for the time being.
Eventually, we should either move LVI to a proper dependency
tracking mechanism (i.e. an analysis shouldn't hold references
to other analyses, but should compute them on demand if needed), or
we should teach all the passes preserving LVI to preserve the
analyses LVI depends on.
The new pass manager has a mechanism to invalidate LVI in case
one of the analyses it depends on becomes invalid, so this problem
shouldn't exist (at least not in this immediate form), but handling
of analyses holding references is still a very delicate subject.
David Blaikie [Fri, 28 Jul 2017 03:06:25 +0000 (03:06 +0000)]
DebugInfo: Consider a CU containing only local imported entities to be 'empty'
This can come up in ThinLTO & wastes space & makes degenerate IR.
As per the added FIXME, ultimately, local imported entities should hang
off the function and that way the imported entity list on the CU can be
tested for emptiness like all the other CU lists.
(function-attached local imported entities are probably also the best
path forward for fixing how imported entities are handled both in
cross-module use (currently, while ThinLTO preserves the imported
entities, they would not get used at the imported inlined location -
only in the abstract origin that appears in the partial CU created by
the import (which isn't emitted under Fission due to cross-CU
limitations there)) and to reduce the number of points where imported
entities are emitted (they're currently emitted into every inlined
instance, concrete instance, and abstract origin - they should only go
in the abstract origin if there is one, otherwise in the concrete
instance - but this requires lots of delayed handling and wiring up,
same as abstract variables & subprograms))
ARMFrameLowering: Only set ExtraCSSpill for actually unused registers.
The code assumed that unclobbered/unspilled callee saved registers are
unused in the function. This is not true for callee saved registers that are
also used to pass parameters such as swiftself.
Summary:
The technique of directly calling subprocess.Popen on a python script
doesn't work on Windows. The executable path of the command must refer
to a valid win32 executable.
Instead, rename all the python scripts masquerading as gtest executables
to have .py extensions, so we can easily detect them and call the python
executable for them. Do this on Linux as well as Windows for
consistency.
The test suite directory names also come out in lower-case on Windows.
We can consider removing that in a later patch. This change just updates
the FileCheck lines to match on Windows.
Changing the default MaxNumPromotions from 2 to 3.
Summary: In performance tuning, we see performance benefits when we enlarge the maximum number of promotion targets to 3. This is safe as long as we have the total percentage threshold properly set up (https://reviews.llvm.org/D35962).
Separate the ICP total threshold and remaining threshold.
Summary: In the current implementation, isPromotionProfitable only checks if the call count to a direct target is no less than a certain percentage threshold of the remaining call counts that have not been promoted. This causes code size problems when the target count is small but greater than a large portion of remaining counts. E.g. target1 takes 99.9%, while target2 takes 0.1%. Both targets will be promoted and inlined, making the function size too large, which potentially prevents it from being further inlined into its callers. This patch adds another percentage threshold against the total indirect call count: the target count needs to be no less than both thresholds in order to be promoted speculatively.
Summary: The original 3.0 hot multiplier is too small, and would prevent hot callsites from being inlined. This patch increases the hot multiplier to 10.0.
The X86 tail call eligibility logic was correct when it was written, but
the addition of inalloca and argument copy elision broke its
assumptions. It was assuming that fixed stack objects were immutable.
Currently, we aim to emit a tail call if no arguments have to be
re-arranged in memory. This code would trace the outgoing argument
values back to check if they are loads from an incoming stack object.
If the stack argument is immutable, then we won't need to store it back
to the stack when we tail call.
Fortunately, stack objects track their mutability, so we can just make
the obvious check to fix the bug.
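The fix boils down to consulting the mutability bit before trusting a stack
argument (toy shape; the real check queries MachineFrameInfo):
struct StackObject {
  bool IsImmutable; // false for inalloca / copy-elided argument slots
};
// Only forward an incoming stack argument in a tail call if its slot is
// guaranteed not to have been overwritten before the call.
bool canReuseIncomingStackSlot(const StackObject &SO) {
  return SO.IsImmutable;
}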
[sanitizer-coverage] add a feature sanitizer-coverage-create-pc-table=1 (works with trace-pc-guard and inline-8bit-counters) that adds a static table of instrumented PCs to be used at run-time
[MachineOutliner] Cleanup: move findCandidates out of suffix tree
Doing some cleanup in preparation for some functional changes.
This commit moves findCandidates out of the suffix tree and into the
MachineOutliner class. This is much easier to follow, and removes
the burden of candidate choice from the suffix tree.
It also adds a couple FIXMEs and simplifies building outlined function
names.