[mips] implement .set dspr2 directive

Implement .set dspr2 directive with appropriate feature bits. This
directive is a counterpart of -mattr=dspr2 command line option with the
exception that it does not influence elf header flags.

Patch by Milos Stojanovic.

Differential Revision: https://reviews.llvm.org/D38537

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314994 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Set v2i32 any_extend to expand

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314993 91177308-0d34-0410-b5e6-96231b3b80d8

[RDF] Simplify construction of maximal registers

The old algorithm was not correct, although it worked most of the time.
Avoid the complex reachability analysis and simply calculate the maximal
registers out of the set of all referenced registers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314991 91177308-0d34-0410-b5e6-96231b3b80d8

[ProfileData] Fix data racing in merging indexed profiles

There is a data race on the static variable RecordIndex in the indexed profile
reader when merging in multiple threads. Make it a member variable of
IndexedInstrProfReader to fix this.
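
The idea behind the fix can be sketched with a toy reader. This is only an
illustration under stated assumptions: ToyReader and readNext are hypothetical
names, not LLVM's IndexedInstrProfReader API. The point is that a per-object
cursor keeps readers independent, whereas a shared static cursor would be
advanced by every reader at once.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a profile reader; not LLVM's actual class.
class ToyReader {
public:
  explicit ToyReader(std::vector<int> Records) : Records(std::move(Records)) {}

  // Copies the next record into Out; returns false once all are consumed.
  bool readNext(int &Out) {
    if (RecordIndex >= Records.size())
      return false;
    Out = Records[RecordIndex++];
    return true;
  }

private:
  std::vector<int> Records;
  // Per-object cursor. If this were a `static` counter, every reader would
  // advance the same variable, and readers on different threads would race.
  std::size_t RecordIndex = 0;
};
```

With the cursor as a member, two readers can be advanced in any interleaving
(or from different threads, one reader per thread) without affecting each other.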

Differential Revision: https://reviews.llvm.org/D38431

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314990 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Fix chains update when lowering BUILD_VECTOR to a vector load

The code which lowers BUILD_VECTOR of consecutive loads into a single vector
load doesn't update chains properly. As a result the vector load can be
reordered with the store to the same location.

The current code in EltsFromConsecutiveLoads only updates the chain following
the first load. The fix is to update the chains following all the loads
comprising the vector.

This is a fix for PR10114.

Reviewed By: niravd

Differential Revision: https://reviews.llvm.org/D38547

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314988 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Add and set AMDGPU-specific e_flags

Differential Revision: https://reviews.llvm.org/D38556

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314987 91177308-0d34-0410-b5e6-96231b3b80d8

[LV] Fix PR34743 - handle casts that sink after interleaved loads

When ignoring a load that participates in an interleaved group, make sure to
move a cast that needs to sink after it.

Testcase derived from reproducer of PR34743.

Differential Revision: https://reviews.llvm.org/D38338

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314986 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "Re-land "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion."""

broken test on windows

This reverts commit c91479518344fd1fc071c5bd5848f6eb83e53dca.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314985 91177308-0d34-0410-b5e6-96231b3b80d8

revert r314698 - [InstCombine] remove one-use restriction for icmp (shr exact X, C1), C2 --> icmp X, (C2<<C1)

There is a bot failure that appears to be related to this change:
http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-selfhost-neon/builds/2117

...so reverting to confirm that and attempting to keep the bot green while investigating.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314984 91177308-0d34-0410-b5e6-96231b3b80d8

[TableGen] Tidy up CodeGenSchedule. NFC.

Reviewed by: @MatzeB
Differential Revision: https://reviews.llvm.org/D38534

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314982 91177308-0d34-0410-b5e6-96231b3b80d8

[LV] Fix PR34711 - widen instruction ranges when sinking casts

Instead of trying to keep LastWidenRecipe updated after creating each recipe,
have tryToWiden() retrieve the last recipe of the current VPBasicBlock and check
if it's a VPWidenRecipe when attempting to extend its range. This ensures that
such extensions, optimized to maintain the original instruction order, do so
only when the instructions are to maintain their relative order. The latter does
not always hold, e.g., when a cast needs to sink to unravel first order
recurrence (r306884).

Testcase derived from reproducer of PR34711.

Differential Revision: https://reviews.llvm.org/D38339

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314981 91177308-0d34-0410-b5e6-96231b3b80d8

Re-land "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion.""

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314980 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Place certain 64 bit FPU instructions in their own decoder namespace

Previously, instructions that were defined to use the FGR64 register class
were associated with the Mips64 table which was incorrect.

Reviewers: nitesh.jain, atanasyan

Differential Revision: https://reviews.llvm.org/D38454

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314976 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] Insert DEBUG_VALUEs after each register redefinition

Summary:
When reinserting debug values after register allocation, make sure to
insert debug values after each redefinition of the debug value register in
the slot index range. The reason for this is that DwarfDebug will end
the range of a debug variable when the physical reg is defined. For
instructions with e.g. tied operands this results in a prematurely ended
debug range.

This resolves pr34545

Patch by Karl-Johan Karlsson and Bjorn Pettersson

Reviewers: rnk, aprantl

Reviewed By: rnk

Subscribers: bjope, llvm-commits

Differential Revision: https://reviews.llvm.org/D38229

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314974 91177308-0d34-0410-b5e6-96231b3b80d8

[MC] - llvm-mc hangs on non-english characters.

Currently llvm-mc hangs in an infinite loop
while trying to parse a file that contains ".section .с",
where the section name is a non-English character.
This patch fixes the issue.

In this patch I also moved the content of non-english-characters.s
to the test/MC/AsmParser/Inputs folder so that non-english-characters.s
becomes a single testcase for all invalid inputs containing non-English
symbols. That is convenient because llvm-mc otherwise tries
to parse and tokenize the whole testcase file, including the tool
invocations, which makes it harder to isolate the issue.

Differential revision: https://reviews.llvm.org/D38545

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314973 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion."

Breaks
clang-stage1-cmake-RA-incremental/llvm/test/Transforms/MergeICmps/X86/tuple-four-int8.ll

This reverts commit 3038c459d67f8898ffa295d54a013b280690abfa.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314972 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Fix a vector splat handling bug in transformZExtICmp.

We were using an i1 type and then zero extending to a vector. Instead just create the 0/1 directly as a ConstantInt with the correct type. No need to ask ConstantExpr to zero extend for us.

This bug is a bit tricky to hit because it requires us to visit a zext of an icmp that would normally be simplified to true/false, but that icmp hasn't been visited yet. In the test case this zext and icmp were created by visiting a udiv and due to worklist ordering we got to the zext first.

Fixes PR34841.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314971 91177308-0d34-0410-b5e6-96231b3b80d8

[MergeICmps] Disable mergeicmps if the target does not want to handle memcmp expansion.

Summary: This is to avoid e.g. merging two cheap icmps if the target is not going to expand to something nice later.

Reviewers: dberlin, spatel

Subscribers: davide, nemanjai

Differential Revision: https://reviews.llvm.org/D38232

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314970 91177308-0d34-0410-b5e6-96231b3b80d8

Minor refactoring regarding Cast::isNoopCast(), NFC

Summary:
FastISel::hasTrivialKill() was the only user of the "IntPtrTy" version of
Cast::isNoopCast(). According to review comments in D37894 we could instead
use the "DataLayout" version of the method, and thus get rid of the
"IntPtrTy" versions of isNoopCast() completely.

With the above done, the remaining isNoopCast() could then be simplified
a bit more.

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D38497

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314969 91177308-0d34-0410-b5e6-96231b3b80d8

[XRay][tools] Support arg1 logging entries in the basic logging mode

Summary:
The arg1 logging handler changed in compiler-rt to start writing a
different type for entries encountered when logging the first argument
of XRay-instrumented functions. This change allows the trace loader to
support reading these record types as well as prepare for when the
basic (naive) mode implementation starts writing down the argument
payloads.

Without this change, binaries with arg1 logging support enabled start
writing unreadable logs for any of the XRay tracing tools.

Reviewers: pelikan

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38550

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314967 91177308-0d34-0410-b5e6-96231b3b80d8

Enabling new pass manager in LTO (and thinLTO) link step.

Adds the option 'new-pass-manager' to the gold plugin to enable using the
new pass manager during the lto/thinlto link step.

Patch by Graham Yiu.

Differential Revision: https://reviews.llvm.org/D38517

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314963 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r314928 to investigate thinLTO bootstrap failure

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314961 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314953 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Add comment about clamps

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314952 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Do not fold clamp instructions when sources are different

Patch by hakzsam (Samuel Pitoiset)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314951 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Improve support for ashr in foldICmpAndShift

We can support ashr similar to lshr, if we know that none of the shifted in bits are used. In that case SimplifyDemandedBits would normally convert it to lshr. But that conversion doesn't happen if the shift has additional users.

Differential Revision: https://reviews.llvm.org/D38521

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314945 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Fix not accounting for instruction size in bundles

These were counted as 0. Fixes branch limit exceeded errors
in some large programs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314944 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Correctly set EI_OSABI based on the os

Differential Revision: https://reviews.llvm.org/D38555

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314943 91177308-0d34-0410-b5e6-96231b3b80d8

clang-format file.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314942 91177308-0d34-0410-b5e6-96231b3b80d8

delete commented out code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314941 91177308-0d34-0410-b5e6-96231b3b80d8

Do not call Loop::getName on possibly dead loops

This fixes PR34832.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314938 91177308-0d34-0410-b5e6-96231b3b80d8

[MachineBlockPlacement] Make sure PreferredLoopExit is cleared every time a new loop is processed

Summary: Rotate on exit that actually exits the current loop.

Reviewers: davidxl, danielcdh, iteratee, chandlerc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38563

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314937 91177308-0d34-0410-b5e6-96231b3b80d8

Fix a -Wparentheses warning. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314936 91177308-0d34-0410-b5e6-96231b3b80d8

Convert an APInt to int64_t properly in TTI::getGEPCost().

Summary:
If the pointer width is 32 bits and the calculated GEP offset is
negative, we call APInt::getLimitedValue(), which does a
*zero*-extension of the offset. That's wrong -- we should do an sext.
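
The difference between the two extensions can be shown without APInt. The
helpers below are hypothetical and use plain fixed-width integers as a
stand-in for a 32-bit APInt. They are a sketch of the arithmetic, not the code
in TTI::getGEPCost().

```cpp
#include <cassert>
#include <cstdint>

// Zero-extension: the 32-bit pattern is read as an unsigned number, so a
// negative offset turns into a large positive one.
int64_t zeroExtend32(uint32_t Bits) {
  return static_cast<int64_t>(static_cast<uint64_t>(Bits));
}

// Sign-extension: the 32-bit pattern is read as two's complement first.
// (The uint32_t -> int32_t conversion is modular on mainstream compilers and
// guaranteed so since C++20.)
int64_t signExtend32(uint32_t Bits) {
  return static_cast<int64_t>(static_cast<int32_t>(Bits));
}
```

For the bit pattern 0xFFFFFFF8, which encodes an offset of -8 in 32 bits, the
zero-extended value is 4294967288 while the sign-extended value is -8.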

Fixes a bug introduced in rL314362 and found by Evgeny Astigeevich.

Reviewers: efriedma

Subscribers: sanjoy, javed.absar, llvm-commits, eastig

Differential Revision: https://reviews.llvm.org/D38557

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314935 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopDeletion] Move deleteDeadLoop to LoopUtils. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314934 91177308-0d34-0410-b5e6-96231b3b80d8

Bring r314809 back.

But now include a check for CPU_COUNT so we still build on 10 year old
versions of glibc.

Original message:

Use sched_getaffinity instead of std::thread::hardware_concurrency.

The issue with std::thread::hardware_concurrency is that it forwards
to libc and some implementations (like glibc) don't take thread
affinity into consideration.

With this change an LLVM program that can execute on only 2 cores will
use 2 threads, even if the machine has 32 cores.

This makes benchmarking a lot easier, but should also help if someone
doesn't want to use all cores for compilation for example.
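
A minimal sketch of the approach described above, assuming a Linux/glibc
system. The function name usableCores is hypothetical and this is not the code
from the patch. The preprocessor guard mirrors the CPU_COUNT check mentioned
in the message, and the fallback keeps the sketch building elsewhere.

```cpp
#include <cassert>
#include <thread>
#if defined(__linux__)
#include <sched.h>
#endif

// Returns the number of cores the calling thread may run on, falling back to
// std::thread::hardware_concurrency() when the affinity API is unavailable.
unsigned usableCores() {
#if defined(__linux__) && defined(CPU_COUNT)
  cpu_set_t Set;
  CPU_ZERO(&Set);
  // A pid of 0 refers to the calling thread.
  if (sched_getaffinity(0, sizeof(Set), &Set) == 0)
    return static_cast<unsigned>(CPU_COUNT(&Set));
#endif
  // hardware_concurrency() may return 0 if the value is not computable.
  unsigned N = std::thread::hardware_concurrency();
  return N != 0 ? N : 1;
}
```

Run under something like `taskset -c 0,1`, a program using this helper would
be expected to see 2 cores regardless of how many the machine has.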

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314931 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyCFG] put the optional assumption cache pointer in the options struct; NFCI

This is a follow-up to https://reviews.llvm.org/D38138.

I fixed the capitalization of some functions because we're changing those
lines anyway and that helped verify that we weren't accidentally dropping
any options by using default param values.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314930 91177308-0d34-0410-b5e6-96231b3b80d8

Recommit r314561 after fixing msan build failure

(trial 2) Incoming val defined by terminator instruction which
also requires bitcasts can not be handled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314928 91177308-0d34-0410-b5e6-96231b3b80d8

[TargetTransformInfo] Check if function pointer is valid before calling isLoweredToCall

Function isLoweredToCall can only accept non-null function pointer, but a function pointer can be null for indirect function call. So check it before calling isLoweredToCall from getInstructionLatency.

Differential Revision: https://reviews.llvm.org/D38204

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314927 91177308-0d34-0410-b5e6-96231b3b80d8

Recommit : Use the basic cost if a GEP is not used as addressing mode

Recommitting r314517 with the fix for handling ConstantExpr.

Original commit message:
  Currently, getGEPCost() returns TCC_FREE whenever a GEP is a legal addressing
  mode in the target. However, since it doesn't check its actual users, it will
  return FREE even in cases where the GEP cannot be folded away as a part of
  actual addressing mode. For example, if a user of the GEP is a call
  instruction taking the GEP as a parameter, then the GEP may not be folded in
  isel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314923 91177308-0d34-0410-b5e6-96231b3b80d8

Revert D38481 due to missing cmake check for CPU_COUNT

Summary:
This reverts D38481. The change breaks systems with older versions of glibc. It
injects a use of CPU_COUNT() from sched.h without checking to ensure that the
function exists first.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314922 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX] Improve (i8 bitcast (v8i1 x)) handling for v8i64/v8f64 512-bit vector compare results.

AVX1/AVX2 targets were missing a chance to use vmovmskps for v8f32/v8i32 results for bool vector bitcasts

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314921 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Add a member Subtarget to HexagonInstrInfo, NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314920 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r314886 "[X86] Improvement in CodeGen instruction selection for LEAs (re-applying post required revision changes.)"

It broke the Chromium / SQLite build; see PR34830.

> Summary:
>    1/  Operand folding during complex pattern matching for LEAs has been
>        extended, such that it promotes Scale to accommodate similar operand
>        appearing in the DAG.
>        e.g.
>          T1 = A + B
>          T2 = T1 + 10
>          T3 = T2 + A
>        For the above DAG rooted at T3, X86AddressMode will now look like
>          Base = B , Index = A , Scale = 2 , Disp = 10
>
>    2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
>        so that if there is an opportunity then complex LEAs (having 3 operands)
>        could be factored out.
>        e.g.
>          leal 1(%rax,%rcx,1), %rdx
>          leal 1(%rax,%rcx,2), %rcx
>        will be factored as following
>          leal 1(%rax,%rcx,1), %rdx
>          leal (%rdx,%rcx)   , %edx
>
>    3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
>       thus avoiding creation of any complex LEAs within a loop.
>
> Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy
>
> Reviewed By: lsaba
>
> Subscribers: jmolloy, spatel, igorb, llvm-commits
>
>     Differential Revision: https://reviews.llvm.org/D35014

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314919 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] Fix major layout bugs in llvm-objcopy

Somehow a few massive errors slipped through the cracks of testing.

1. The code in Segment::finalize was left over from the old layout
algorithm. In certain situations this would cause very strange issues
with segment layout. For instance in the shift-segments.test case it
would cause the second segment to have the same offset as the first.

2. In debugging this I discovered another issue. Namely section alignment
was not being computed based on Section->Align but instead
Section->Offset which is bizarre and makes no sense. I have no clue how
it worked in the first place. This issue is also fixed.

3. Fixing #2 exposed a bug where things were not being written past the end
of the file that technically should have been. This was because in
certain cases (like overlapping-segments) the end of the file wouldn't
always be bumped if the offset could be chosen relative to an existing
segment that already had its offset chosen. For fully nested segments
this is fine but for overlapping segments this leaves the end of the
file short. So I changed how the offset is bumped when looping through
segments.
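
Item 2 above is about rounding an offset up to a section's alignment. A
minimal sketch of that rounding follows. The helper name alignUp is
hypothetical and this is not the code from llvm-objcopy. The key point is that
the second argument is the alignment requirement, not another offset.

```cpp
#include <cassert>
#include <cstdint>

// Rounds Offset up to the next multiple of Align. In this sketch an Align of
// 0 or 1 means "no alignment requirement", so Offset is returned unchanged.
uint64_t alignUp(uint64_t Offset, uint64_t Align) {
  if (Align <= 1)
    return Offset;
  return (Offset + Align - 1) / Align * Align;
}
```

For example, alignUp(13, 8) yields 16, while an already aligned offset such as
16 is left as it is.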

Differential Revision: https://reviews.llvm.org/D38436

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314918 91177308-0d34-0410-b5e6-96231b3b80d8

[Dominators] Take fast path when applying <=1 updates

Summary:
This patch teaches `DT.applyUpdates` to take the fast path when applying zero or just one update and makes it not run the internal batch updater machinery.

With this patch, it should no longer make sense to have a special check in user's code that checks the update sequence size before applying them, e.g.
```
if (!MyUpdates.empty())
  DT.applyUpdates(MyUpdates);
```
or
```
if (MyUpdates.size() == 1)
  if (...)
    DT.insertEdge(...)
  else
    DT.deleteEdge(...)
```

Reviewers: dberlin, brzycki, davide, grosser, sanjoy

Reviewed By: dberlin, davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38541

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314917 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Add support for lowering v8i16 binary shuffles to PACKSS/PACKUS

Missed in D38472

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314916 91177308-0d34-0410-b5e6-96231b3b80d8

[test] Fix append_path in the empty case

Summary:
normpath() was being called on an empty string and appended to
the environment variable in the case where the environment variable
was unset. This led to ":." being appended to the path, since
normpath() of an empty string is '.', presumably to represent cwd.

Reviewers: zturner, sqlbyme, modocache

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38542

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314915 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Redefine MOVSS/MOVSD instructions to take VR128 regclass as input instead of FR32/FR64

This patch redefines the MOVSS/MOVSD instructions to take VR128 as its second input. This allows the MOVSS/SD->BLEND commute to work without requiring a COPY to be inserted.

This should fix PR33079

Overall this looks to be an improvement in the generated code. I haven't checked the EXPENSIVE_CHECKS build but I'll do that and update with results.

Differential Revision: https://reviews.llvm.org/D38449

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314914 91177308-0d34-0410-b5e6-96231b3b80d8

"[ARM] Mark flaky test MachineBranchProb.ll unsupported again for ARM/AArch64"

r314857 changed the CFG that resulted in the flaky test MachineBranchProb.ll to
fail the bots again. Marking it as unsupported for ARM/AArch64 again until we
find the cause.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314912 91177308-0d34-0410-b5e6-96231b3b80d8

bpf: fix an insn encoding issue for neg insn

Signed-off-by: Yonghong Song <yhs@fb.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314911 91177308-0d34-0410-b5e6-96231b3b80d8

[OptRemark] Move YAML writing to IR

Before the patch this was in Analysis.  Moving it to IR and making it implicit
part of LLVMContext::diagnose allows the full opt-remark facility to be used
outside passes e.g. the pass manager.  Jessica is planning to use this to
report function size after each pass.  The same could be used for time
reports.

Tested with BUILD_SHARED_LIBS=On.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314909 91177308-0d34-0410-b5e6-96231b3b80d8

Also update MachineORE after r314874.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314908 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add 'exact' variants of all tests; NFC

We can likely remove most of these as redundant in the near future,
but I'm trying to make sure I don't introduce any regressions with D38514.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314907 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] clang-format lib/Transforms/Scalar/MergeICmps.cpp

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314906 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Early out from ComputeNumSignBitsForTargetNode. NFCI.

Early out from vector shift by immediates that will exceed eltsize - don't bother making an unnecessary ComputeNumSignBits recursive call.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314903 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Add support for lowering unary shuffles to PACKSS/PACKUS

Extension to D38472

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314901 91177308-0d34-0410-b5e6-96231b3b80d8

[gold-plugin] - Fix compilation after LLVM update (r314883). NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314899 91177308-0d34-0410-b5e6-96231b3b80d8

[AVR] Implement LPMWRdZ pseudo-instruction's expansion.

FIXME: implementation is mostly copy-pasted from LDWRdPtr, so we should
refactor a bit and unify the two

Patch by Gergo Erdi.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314898 91177308-0d34-0410-b5e6-96231b3b80d8

[AVR] Factor out mayLoad in tablegen patterns

Patch by Gergo Erdi.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314897 91177308-0d34-0410-b5e6-96231b3b80d8

[AVR] Elaborate LDWRdPtr into `ld r, X++; ld r+1, X`

Patch by Gergo Erdi.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314896 91177308-0d34-0410-b5e6-96231b3b80d8

[AVR] Insert JMP for long branches

Previously, on long branches (relative jumps of >4 kB), an assertion
failure was hit, as AVRInstrInfo::insertIndirectBranch was not
implemented. Despite its name, it is called by the branch relaxator
for *all* unconditional jumps.

Patch by Thomas Backman.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314891 91177308-0d34-0410-b5e6-96231b3b80d8

[AVR] Fix displacement overflow for LDDW/STDW

In some cases, the code generator attempts to generate instructions such as:

lddw r24, Y+63

which expands to:

ldd r24, Y+63
ldd r25, Y+64 # Oops! This is actually ld r25, Y in the binary

This commit limits the first offset to 62, and thus the second to 63.
It also updates some asserts in AVRExpandPseudoInsts.cpp, including for
INW and OUTW, which appear to be unused.
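
The wrap-around can be reproduced with a little arithmetic, assuming the
displacement is stored in a 6-bit field (range 0..63), which matches the
behaviour described above. The helper names are hypothetical and this is not
the code in AVRExpandPseudoInsts.cpp.

```cpp
#include <cassert>
#include <cstdint>

// Assumption for this sketch: the displacement field is 6 bits wide.
constexpr unsigned DispBits = 6;
constexpr unsigned MaxDisp = (1u << DispBits) - 1; // 63

// Models what ends up in the binary: bits above the field are dropped.
uint8_t encodeDisp(unsigned Disp) {
  return static_cast<uint8_t>(Disp & MaxDisp);
}

// A 16-bit access at Disp expands to byte accesses at Disp and Disp + 1,
// so both of them have to fit in the field.
bool fitsWordAccess(unsigned Disp) {
  return Disp + 1 <= MaxDisp;
}
```

Here encodeDisp(64) gives 0, which is why the second load silently targets Y
itself, and fitsWordAccess accepts 62 but rejects 63.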

Patch by Thomas Backman.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314890 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Add diag string for movw/movt immediates in assembly

This adds diagnostics for invalid immediate operands to the MOVW and MOVT
instructions (ARM and Thumb).

Differential revision: https://reviews.llvm.org/D31879

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314888 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM, Asm] Change grammar of immediate operand diagnostics

Currently, our diagnostics for assembly operands are not consistent.
Some start with (for example) "immediate operand must be ...",
and some with "operand must be an immediate ...". I think the latter
form is preferable for a few reasons:
* It's unambiguous that it is referring to the expected type of operand, not
  the type the user provided. For example, the user could provide a register
  operand, and get a message talking about an operand as if it is already an
  immediate, just not in the accepted range.
* It allows us to have a consistent style once we add diagnostics for operands
  that could take two forms, for example a label or pc-relative memory operand.

Differential revision: https://reviews.llvm.org/D36689

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314887 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Improvement in CodeGen instruction selection for LEAs (re-applying post required revision changes.)

Summary:
   1/  Operand folding during complex pattern matching for LEAs has been
       extended, such that it promotes Scale to accommodate similar operand
       appearing in the DAG.
       e.g.
         T1 = A + B
         T2 = T1 + 10
         T3 = T2 + A
       For the above DAG rooted at T3, X86AddressMode will now look like
         Base = B , Index = A , Scale = 2 , Disp = 10

   2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs
       so that if there is an opportunity then complex LEAs (having 3 operands)
       could be factored out.
       e.g.
         leal 1(%rax,%rcx,1), %rdx
         leal 1(%rax,%rcx,2), %rcx
       will be factored as following
         leal 1(%rax,%rcx,1), %rdx
         leal (%rdx,%rcx)   , %edx

   3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops,
      thus avoiding creation of any complex LEAs within a loop.
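
The folding in item 1 can be checked with plain arithmetic. The struct below
is a hypothetical stand-in, not LLVM's X86AddressMode. It only evaluates
Base + Index * Scale + Disp so the folded form can be compared with the
original chain of adds.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of an x86 address: Base + Index * Scale + Disp.
struct ToyAddressMode {
  int64_t Base;
  int64_t Index;
  unsigned Scale;
  int64_t Disp;
};

// Computes the address the mode denotes.
int64_t evaluate(const ToyAddressMode &AM) {
  return AM.Base + AM.Index * static_cast<int64_t>(AM.Scale) + AM.Disp;
}
```

With A = 7 and B = 3, the chain T1 = A + B, T2 = T1 + 10, T3 = T2 + A gives 27,
and so does the folded form Base = B, Index = A, Scale = 2, Disp = 10.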

Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy

Reviewed By: lsaba

Subscribers: jmolloy, spatel, igorb, llvm-commits

    Differential Revision: https://reviews.llvm.org/D35014

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314886 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-cov] Fix showing title when filtering and not outputting to a directory

Differential Revision: https://reviews.llvm.org/D38507

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314885 91177308-0d34-0410-b5e6-96231b3b80d8

[MC] - Don't assert when non-english characters are used.

I found that llvm-mc does not like non-english characters even in comments,
which it tries to tokenize.

The problem is caused by functions like isdigit() and isalnum(), which take
an int argument and expect it to be non-negative.
At the same time, MCParser uses char* to store the input buffer pointer, and char
may be signed, so it is possible to pass a negative value to one of the functions
above, which triggers an assert.
A testcase for demonstration is provided.

To fix the issue, helper functions were introduced in StringExtras.h.
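
One common way to make such calls safe is to convert the character to
unsigned char before handing it to the <cctype> functions. The wrappers below
are a sketch of that idea. Their names are hypothetical, and they are not
necessarily how the helpers in StringExtras.h are written.

```cpp
#include <cassert>
#include <cctype>

// The <cctype> classification functions require an argument that is either
// EOF or representable as unsigned char. Casting first keeps bytes >= 0x80
// from being passed as negative values when char is signed.
bool safeIsDigit(char C) {
  return std::isdigit(static_cast<unsigned char>(C)) != 0;
}

bool safeIsAlnum(char C) {
  return std::isalnum(static_cast<unsigned char>(C)) != 0;
}
```

A byte such as 0xD1, the first byte of the UTF-8 encoding of a Cyrillic
letter, can then be classified without invoking undefined behaviour.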

Differential revision: https://reviews.llvm.org/D38461

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314883 91177308-0d34-0410-b5e6-96231b3b80d8

Recommit [UnreachableBlockElim] Use COPY if PHI input is undef

This time invoking llc with "-march=x86-64" in the testcase, so we don't assume
the default target is x86.

Summary:
If we have

%vreg0<def> = PHI %vreg2<undef>, <BB#0>, %vreg3, <BB#2>; GR32:%vreg0,%vreg2,%vreg3
%vreg3<def,tied1> = ADD32ri8 %vreg0<kill,tied0>, 1, %EFLAGS<imp-def>; GR32:%vreg3,%vreg0

then we can't just change %vreg0 into %vreg3, since %vreg2 is actually
undef. We would have to also copy the undef flag to be able to change the
register.

Instead we deal with this case like other cases where we can't just
replace the register: we insert a COPY. The code creating the COPY already
copied all flags from the PHI input, so the undef flag will be transferred
as it should.

Reviewers: kparzysz

Reviewed By: kparzysz

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38235

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314882 91177308-0d34-0410-b5e6-96231b3b80d8

[IRCE] Temporarily disable unsigned latch conditions by default

We have found some corner cases connected to range intersection where IRCE does
the wrong thing when the latch condition is unsigned. The fix for that will follow.
This patch temporarily disables IRCE for unsigned latch conditions until the issue is fixed.

The unsigned latch conditions were introduced to IRCE by rL310027.

Differential Revision: https://reviews.llvm.org/D38529

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314881 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r314879 "[UnreachableBlockElim] Use COPY if PHI input is undef"

Build-bots broke on the new testcase. I'll investigate and fix.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314880 91177308-0d34-0410-b5e6-96231b3b80d8

[UnreachableBlockElim] Use COPY if PHI input is undef

Summary:
If we have

%vreg0<def> = PHI %vreg2<undef>, <BB#0>, %vreg3, <BB#2>; GR32:%vreg0,%vreg2,%vreg3
%vreg3<def,tied1> = ADD32ri8 %vreg0<kill,tied0>, 1, %EFLAGS<imp-def>; GR32:%vreg3,%vreg0

then we can't just change %vreg0 into %vreg3, since %vreg2 is actually
undef. We would have to also copy the undef flag to be able to change the
register.

Instead we deal with this case like other cases where we can't just
replace the register: we insert a COPY. The code creating the COPY already
copied all flags from the PHI input, so the undef flag will be transferred
as it should.

Reviewers: kparzysz

Reviewed By: kparzysz

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38235

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314879 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Fix using the SJLJ jump table on x86_64

The previous version didn't work if the jump table base address didn't
fit in 32 bit, since it was encoded as an immediate offset. And in case
the jump table is encoded as 32 bit label differences, we need to
load and add them to the table base first.

This solves the first half of the issues mentioned in PR34720.

Also fix some of the errors pointed out by -verify-machineinstrs, by
using GR32_NOSPRegClass.

Differential Revision: https://reviews.llvm.org/D38333

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314876 91177308-0d34-0410-b5e6-96231b3b80d8

Move verbosity check for remarks to the diag handler

Test needs some slight adjustment because we no longer check the existence of
BFI but rather that the actual hotness is set on the remark. If entry_count
is not set getBlockProfileCount returns None.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314874 91177308-0d34-0410-b5e6-96231b3b80d8

[FuzzerUtil] Partially revert D38481 on FuzzerUtil

This is because lib/Fuzzer doesn't really depend on llvm infrastructure.
It's not easy to access the llvm hardware_concurrency here.

Differential Revision: https://reviews.llvm.org/D38481

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314870 91177308-0d34-0410-b5e6-96231b3b80d8

Add a manpage for llvm-dwarfdump.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314863 91177308-0d34-0410-b5e6-96231b3b80d8

Simplify multikey_qsort function.

This function implements the three-way radix quicksort algorithm.
This patch simplifies the implementation by using MutableArrayRef.
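
For readers unfamiliar with the algorithm, the following is a minimal sketch
of three-way radix quicksort over whole strings. It is not the code from this
patch: it uses std::vector and index ranges instead of MutableArrayRef, and
the helper names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Character of S at Pos, or -1 past the end so that a string sorts before
// any longer string that shares its prefix.
static int charAt(const std::string &S, std::size_t Pos) {
  return Pos < S.size() ? static_cast<unsigned char>(S[Pos]) : -1;
}

// Sorts Vec[Begin, End), assuming all strings agree on their first Pos chars.
static void multikeySort(std::vector<std::string> &Vec, std::size_t Begin,
                         std::size_t End, std::size_t Pos) {
  assert(Begin <= End && End <= Vec.size());
  if (End - Begin <= 1)
    return;

  // Three-way partition on the character at Pos: [<pivot][==pivot][>pivot].
  int Pivot = charAt(Vec[Begin + (End - Begin) / 2], Pos);
  std::size_t Lt = Begin, Gt = End, I = Begin;
  while (I < Gt) {
    int C = charAt(Vec[I], Pos);
    if (C < Pivot)
      std::swap(Vec[Lt++], Vec[I++]);
    else if (C > Pivot)
      std::swap(Vec[I], Vec[--Gt]);
    else
      ++I;
  }

  multikeySort(Vec, Begin, Lt, Pos);
  // Only the "equal" bucket moves on to the next character. If the pivot is
  // the end-of-string marker, those strings are identical and already done.
  if (Pivot != -1)
    multikeySort(Vec, Lt, Gt, Pos + 1);
  multikeySort(Vec, Gt, End, Pos);
}

// Convenience entry point that sorts the whole vector.
void multikeySort(std::vector<std::string> &Vec) {
  multikeySort(Vec, 0, Vec.size(), 0);
}
```

Passing a (Begin, End) pair around is what a view type such as
MutableArrayRef replaces with a single slice argument.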

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314858 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Use LateSimplifyCFG after expanding atomic operations.

Summary:
After r308422 we defer optimizations that can destroy loop canonical forms to
LateSimplifyCFG. Running LateSimplifyCFG after expanding atomic operations
can exploit more control-flow opportunities.

Reviewers: mcrosier, t.p.northover, efriedma

Reviewed By: efriedma

Subscribers: aemerson, rengolin, javed.absar, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D38262

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314857 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-dwarfdump: implement the --regex option in combination with --name.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314855 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Expand setcc for v2f32 and v4f32

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314853 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Expand setcc for v2i32 and v4i32

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314852 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/Docs: Follow up on review feedback in https://reviews.llvm.org/D38387

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314848 91177308-0d34-0410-b5e6-96231b3b80d8

[Dominators] Make eraseNode invalidate DFS numbers

This patch makes DT::eraseNode mark DFSInfo as invalid.
Not marking it as invalid leads to DFS numbers getting corrupted
and failing VerifyDFSNumbers check.
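
The idea can be sketched on a toy tree. This is not LLVM's DominatorTreeBase;
ToyDomTree and all of its members are hypothetical and only model the
invalidate-then-recompute pattern described above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy tree with cached DFS in/out numbers. Node 0 is the root.
struct ToyDomTree {
  std::vector<std::vector<int>> Children;
  std::vector<unsigned> In, Out;
  bool DFSInfoValid = false;

  explicit ToyDomTree(std::size_t N) : Children(N), In(N), Out(N) {}

  void addChild(int Parent, int Child) {
    Children[Parent].push_back(Child);
    DFSInfoValid = false;
  }

  void updateDFSNumbers() {
    unsigned Num = 0;
    number(0, Num);
    DFSInfoValid = true;
  }

  // Removes the leaf Node from Parent's child list.
  void eraseNode(int Parent, int Node) {
    assert(Children[Node].empty() && "only leaves can be erased");
    auto &Kids = Children[Parent];
    Kids.erase(std::remove(Kids.begin(), Kids.end(), Node), Kids.end());
    // The cached numbers no longer describe the tree, so mark them stale.
    DFSInfoValid = false;
  }

  // Ancestor query that recomputes the numbers lazily when they are stale.
  bool dominates(int A, int B) {
    if (!DFSInfoValid)
      updateDFSNumbers();
    return In[A] <= In[B] && Out[B] <= Out[A];
  }

private:
  void number(int N, unsigned &Num) {
    In[N] = Num++;
    for (int C : Children[N])
      number(C, Num);
    Out[N] = Num++;
  }
};
```

Without the invalidation in eraseNode, the numbers of the remaining nodes
would keep gaps left by the erased node, which a verifier that compares them
against a fresh numbering would reject.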

This patch also makes children iterator const (NFC).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314847 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Add ELFOSABI_AMDGPU_MESA3D

Differential Revision: https://reviews.llvm.org/D38387

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314846 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove dead declaration convertArgMovsToPushes, NFC

This was dead when it landed in r252578. We already have this functionality
for regular calls, though not for stack probe calls, in
X86CallFrameOptimization.cpp.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314845 91177308-0d34-0410-b5e6-96231b3b80d8

Pre-compute the tail of the archive

An archive looks like

<header>
<symbol table>
<tail>

The symbol table refers to offsets in the tail. A complication is that
we would like to support symbol tables that use 64 bit offsets if it
turns out that any of the offsets is too big.

This patch changes the archive writer to first compute the tail. We
cannot just compute one big StringRef since that would require reading
every member upfront, but we can represent it as a series of
StringRefs.

Having done that it is much easier to compute the symbol table and all
offsets are computed before it is written. With this if there is an
accounting problem it will show up with a regular symbol table, not
just when a 64 bit one is needed.
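
A minimal sketch of that ordering, under simplifying assumptions: the tail is
a list of std::string_view pieces standing in for StringRefs, and TailStart
(the size of everything before the tail) is supplied by the caller. The names
TailLayout, layoutTail and needs64BitSymtab are hypothetical and do not come
from the archive writer.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>
#include <vector>

// Offsets of each piece relative to the start of the tail, plus total size.
struct TailLayout {
  std::vector<std::uint64_t> MemberOffsets;
  std::uint64_t TailSize = 0;
};

// Walks the pieces once without concatenating them, so no member data has
// to be copied into one big buffer.
TailLayout layoutTail(const std::vector<std::string_view> &Pieces) {
  TailLayout L;
  for (std::string_view P : Pieces) {
    L.MemberOffsets.push_back(L.TailSize);
    L.TailSize += P.size();
  }
  return L;
}

// True when the largest absolute offset does not fit in 32 bits. Offsets
// are increasing, so checking the last one is enough.
bool needs64BitSymtab(const TailLayout &L, std::uint64_t TailStart) {
  assert(TailStart <= UINT64_MAX - L.TailSize && "offset overflow");
  return !L.MemberOffsets.empty() &&
         TailStart + L.MemberOffsets.back() > UINT32_MAX;
}
```

Because every offset is known before anything is written, the same
accounting runs for both symbol table widths.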

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314844 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Add ELFOSABI_AMDGPU_PAL

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314843 91177308-0d34-0410-b5e6-96231b3b80d8

Refactor DIBuilder dbg intrinsic insertion, NFC

Both dbg.declare and dbg.value insertion had duplicate code for the two
overloads with different insertion point conventions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314839 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add tests for icmp gt/lt (shr X, C1), C2; NFC

Surprisingly, we have zero coverage for these patterns.

Many of these are handled in InstSimplify, but it's not obvious
what the rule for folding each case should be, so I've just
stamped out everything.

It should be possible to fold every case, but currently, we
miss these:

int ashr_slt(int x) {
  return (x >> 1) < 1;
}

int ashr_sgt(int x) {
  return (x >> 1) > 0;
}

https://godbolt.org/g/aB2hLE
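
One plausible folded form for the two missed cases is a plain comparison
against a shifted constant: (x >> 1) < 1 behaves like x < 2, and (x >> 1) > 0
behaves like x > 1. The sketch below checks that claim over a range of
inputs. It assumes >> on a negative int is an arithmetic shift, which C++20
guarantees and earlier standards leave implementation-defined. The function
names are made up, and this says nothing about which fold InstCombine should
actually pick.

```cpp
#include <cassert>
#include <cstdint>

bool ashrSlt(std::int32_t X) { return (X >> 1) < 1; }
bool ashrSltFolded(std::int32_t X) { return X < 2; }

bool ashrSgt(std::int32_t X) { return (X >> 1) > 0; }
bool ashrSgtFolded(std::int32_t X) { return X > 1; }

// Returns true if each original/folded pair agrees for every X in [Lo, Hi].
bool foldsAgree(std::int32_t Lo, std::int32_t Hi) {
  assert(Lo <= Hi);
  // Iterate in 64 bits so that Hi == INT32_MAX does not overflow the counter.
  for (std::int64_t I = Lo; I <= Hi; ++I) {
    auto X = static_cast<std::int32_t>(I);
    if (ashrSlt(X) != ashrSltFolded(X) || ashrSgt(X) != ashrSgtFolded(X))
      return false;
  }
  return true;
}
```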

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314837 91177308-0d34-0410-b5e6-96231b3b80d8

[MachineOutliner] Fix off-by-one in cost model

This commit does two things. Firstly, it cleans up some of the benefit
calculation wrt outlined functions and candidates. Secondly, it fixes an
off-by-one bug in the cost model which was caused by the benefit value of
an OutlinedFunction and Candidate differing by 1. It updates the remarks test
to reflect this change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314836 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Revert P9 scheduling model to incomplete

Partially revert a previous change from commit: https://llvm.org/svn/llvm-project/llvm/trunk@314026
The previous change caused regressions on Power 9.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314835 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Use isSignBitCheck to simplify an if statement. Directly create new sign bit compares instead of manipulating the constant. NFCI

Since we no longer had the direct constant compares, manipulating the constant seemed less clear.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314830 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] implemented pal metadata

Summary:
For the amdpal OS type:

We write an AMDGPU_PAL_METADATA record in the .note section in the ELF
(or as an assembler directive). It contains key=value pairs of 32 bit
ints. It is a merge of metadata from codegen of the shaders, and
metadata provided by the frontend as _amdgpu_pal_metadata IR metadata.
Where both sources have a key=value with the same key, the two values
are ORed together.
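
The merge rule can be sketched as follows. PalMetadata and mergePalMetadata
are hypothetical names, not the ones used in the backend, and the map type is
an assumption made only for illustration.

```cpp
#include <cstdint>
#include <map>

// key=value pairs of 32 bit ints.
using PalMetadata = std::map<std::uint32_t, std::uint32_t>;

// Merges frontend-provided metadata into the codegen metadata. Values that
// share a key are ORed together.
PalMetadata mergePalMetadata(PalMetadata Codegen,
                             const PalMetadata &Frontend) {
  for (const auto &KV : Frontend) {
    // operator[] value-initializes a missing key to 0, so the OR simply
    // stores the frontend value when only one source has the key.
    Codegen[KV.first] |= KV.second;
  }
  return Codegen;
}
```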

This .note record is part of the amdpal ABI and will be documented in
docs/AMDGPUUsage.rst in a future commit.

Eventually the amdpal OS type will stop generating the .AMDGPU.config
section once the frontend has safely moved over to using the .note
records above instead of .AMDGPU.config.

Reviewers: arsenm, nhaehnle, dstuttard

Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D37753

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314829 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Avoid predicated execution of the basic blocks containing scalar instructions.

Differential revision: https://reviews.llvm.org/D38293

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314828 91177308-0d34-0410-b5e6-96231b3b80d8

Fix -Wcovered-switch-default warnings from r314821

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314826 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r314817 "[dwarfdump] Add -lookup option"

The test fails on Linux; see follow-up email on the llvm-commits list.

> Add the option to lookup an address in the debug information and print
> out the file, function, block and line table details.
>
> Differential revision: https://reviews.llvm.org/D38409

This also reverts the follow-up r314818:

> [test] Fix llvm-dwarfdump/cmdline.test
>
> Fixes test/tools/llvm-dwarfdump/cmdline.test

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314825 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r314806 "[SLP] Vectorize jumbled memory loads."

All the buildbots are red, e.g.
http://lab.llvm.org:8011/builders/clang-cmake-aarch64-lld/builds/2436/

> Summary:
> This patch tries to vectorize loads of consecutive memory accesses, accessed
> in non-consecutive or jumbled way. An earlier attempt was made with patch D26905
> which was reverted back due to some basic issue with representing the 'use mask' of
> jumbled accesses.
>
> This patch fixes the mask representation by recording the 'use mask' in the usertree entry.
>
> Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df
>
> Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh
>
> Reviewed By: Ayal
>
> Subscribers: hans, mzolotukhin
>
> Differential Revision: https://reviews.llvm.org/D36130

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314824 91177308-0d34-0410-b5e6-96231b3b80d8

Fix expectations in MC wasm init-fini-array test

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314823 91177308-0d34-0410-b5e6-96231b3b80d8

Implement David Blaikie's suggestion for comparison operators

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314822 91177308-0d34-0410-b5e6-96231b3b80d8

CodeView: Provide a .def file with the register ids

The list of register ids was previously written out in a couple of different
places. This puts it in a .def file and also adds a few more registers (e.g.
the x87 regs) which should lead to more readable dumps, but I didn't include
the whole list since that seems unnecessary.

X86_MC::initLLVMToSEHAndCVRegMapping is pretty ugly, but at least it's not
relying on magic constants anymore. The TODO of using tablegen still stands.

Differential revision: https://reviews.llvm.org/D38480

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314821 91177308-0d34-0410-b5e6-96231b3b80d8