granicus.if.org Git

[NFC] Rename variables to reflect the actual status of GuardWidening

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353036 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy][NFC] Fix trailing semicolon warning.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353035 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Remove redundant parameters for better readability

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353034 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Replace equivalent condition for better readability

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353032 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] Add a BaseIndexOffset::print() method for debugging.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353028 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis] Cut run time of analysis mode by another -35% (*sic*) (YamlContext::getRegNo())

Summary:

Together with the previous patch, it's an -90% improvement,
or roughly -96% improvement if you look starting with rL347204

```
$ perf stat -r 9 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file="" -analysis-inconsistencies-output-file=/tmp/clusters-bew.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-bew.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-bew.html'

Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file= -analysis-inconsistencies-output-file=/tmp/clusters-bew.html' (9 runs):

           1483.18 msec task-clock                #    0.999 CPUs utilized            ( +-  0.10% )
                68      context-switches          #   46.085 M/sec                    ( +- 22.62% )
                 0      cpu-migrations            #    0.000 K/sec
             11641      page-faults               # 7850.880 M/sec                    ( +-  0.62% )
        5943246799      cycles                    # 4008184.428 GHz                   ( +-  0.10% )  (83.28%)
         442869514      stalled-cycles-frontend   #    7.45% frontend cycles idle     ( +-  0.41% )  (83.29%)
        1443375663      stalled-cycles-backend    #   24.29% backend cycles idle      ( +-  0.47% )  (33.43%)
        7714006752      instructions              #    1.30  insn per cycle
                                                  #    0.19  stalled cycles per insn  ( +-  0.07% )  (50.17%)
        1977242936      branches                  # 1333472193.855 M/sec              ( +-  0.07% )  (66.79%)
          32624220      branch-misses             #    1.65% of all branches          ( +-  0.18% )  (83.34%)

           1.48438 +- 0.00143 seconds time elapsed  ( +-  0.10% )
```
```
$ perf stat -r 9 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file="" -analysis-inconsistencies-output-file=/tmp/clusters-newer.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-newer.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-newer.html'

Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file= -analysis-inconsistencies-output-file=/tmp/clusters-newer.html' (9 runs):

            963.28 msec task-clock                #    0.999 CPUs utilized            ( +-  0.37% )
                12      context-switches          #   12.695 M/sec                    ( +- 52.79% )
                 0      cpu-migrations            #    0.000 K/sec
             11599      page-faults               # 12046.971 M/sec                   ( +-  0.59% )
        3860122322      cycles                    # 4009359.596 GHz                   ( +-  0.37% )  (83.19%)
         380300669      stalled-cycles-frontend   #    9.85% frontend cycles idle     ( +-  0.34% )  (83.30%)
        1071910340      stalled-cycles-backend    #   27.77% backend cycles idle      ( +-  1.30% )  (33.51%)
        4773418224      instructions              #    1.24  insn per cycle
                                                  #    0.22  stalled cycles per insn  ( +-  0.15% )  (50.17%)
        1106990316      branches                  # 1149787979.919 M/sec              ( +-  0.11% )  (66.80%)
          23632231      branch-misses             #    2.13% of all branches          ( +-  0.18% )  (83.33%)

           0.96389 +- 0.00356 seconds time elapsed  ( +-  0.37% )
```
```
$ sha512sum /tmp/clusters-*
db4bbd904fe8840853b589b032c5041bc060b91bcd9c27b914b56581fbc473550eea74b852238c79963b5adf2419f379e9f5db76784048b48e3937f9f3e732bf  /tmp/clusters-bew.html
db4bbd904fe8840853b589b032c5041bc060b91bcd9c27b914b56581fbc473550eea74b852238c79963b5adf2419f379e9f5db76784048b48e3937f9f3e732bf  /tmp/clusters-newer.html
db4bbd904fe8840853b589b032c5041bc060b91bcd9c27b914b56581fbc473550eea74b852238c79963b5adf2419f379e9f5db76784048b48e3937f9f3e732bf  /tmp/clusters-old.html
```

Reviewers: courbet, gchatelet

Reviewed By: courbet

Subscribers: tschuett, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57658

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353025 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis] Cut run time of analysis mode by -84% (*sic*) (YamlContext::getInstrOpcode())

Summary:
```
$ perf stat -r 9 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file="" -analysis-inconsistencies-output-file=/tmp/clusters-old.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-old.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-old.html'

Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file= -analysis-inconsistencies-output-file=/tmp/clusters-old.html' (9 runs):

           9465.46 msec task-clock                #    1.000 CPUs utilized            ( +-  0.05% )
                60      context-switches          #    6.363 M/sec                    ( +- 79.45% )
                 0      cpu-migrations            #    0.000 K/sec
             11364      page-faults               # 1200.697 M/sec                    ( +-  0.60% )
       37935623543      cycles                    # 4008083.912 GHz                   ( +-  0.05% )  (83.32%)
        2371625356      stalled-cycles-frontend   #    6.25% frontend cycles idle     ( +-  0.37% )  (83.32%)
        8476077875      stalled-cycles-backend    #   22.34% backend cycles idle      ( +-  0.18% )  (33.36%)
       41822439158      instructions              #    1.10  insn per cycle
                                                  #    0.20  stalled cycles per insn  ( +-  0.02% )  (50.03%)
       11607658944      branches                  # 1226405861.486 M/sec              ( +-  0.01% )  (66.69%)
         210864633      branch-misses             #    1.82% of all branches          ( +-  0.06% )  (83.34%)

           9.46636 +- 0.00441 seconds time elapsed  ( +-  0.05% )
```
```
$ perf stat -r 9 ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file="" -analysis-inconsistencies-output-file=/tmp/clusters-bew.html
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-bew.html'
...
no exegesis target for x86_64-unknown-linux-gnu, using default
Parsed 14656 benchmark points
Printing sched class consistency analysis results to file '/tmp/clusters-bew.html'

Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput-onefull.yaml -analysis-clusters-output-file= -analysis-inconsistencies-output-file=/tmp/clusters-bew.html' (9 runs):

           1480.66 msec task-clock                #    1.000 CPUs utilized            ( +-  0.19% )
                13      context-switches          #    8.483 M/sec                    ( +- 83.10% )
                 0      cpu-migrations            #    0.075 M/sec                    ( +-100.00% )
             11596      page-faults               # 7834.247 M/sec                    ( +-  0.59% )
        5933732194      cycles                    # 4008977.535 GHz                   ( +-  0.19% )  (83.22%)
         438111928      stalled-cycles-frontend   #    7.38% frontend cycles idle     ( +-  0.37% )  (83.25%)
        1454969705      stalled-cycles-backend    #   24.52% backend cycles idle      ( +-  0.94% )  (33.53%)
        7724218604      instructions              #    1.30  insn per cycle
                                                  #    0.19  stalled cycles per insn  ( +-  0.07% )  (50.14%)
        1979796413      branches                  # 1337599858.945 M/sec              ( +-  0.06% )  (66.74%)
          32641638      branch-misses             #    1.65% of all branches          ( +-  0.18% )  (83.31%)

           1.48128 +- 0.00284 seconds time elapsed  ( +-  0.19% )

$ sha512sum /tmp/clusters-*
db4bbd904fe8840853b589b032c5041bc060b91bcd9c27b914b56581fbc473550eea74b852238c79963b5adf2419f379e9f5db76784048b48e3937f9f3e732bf  /tmp/clusters-bew.html
db4bbd904fe8840853b589b032c5041bc060b91bcd9c27b914b56581fbc473550eea74b852238c79963b5adf2419f379e9f5db76784048b48e3937f9f3e732bf  /tmp/clusters-old.html
```

Reviewers: courbet, gchatelet

Reviewed By: courbet

Subscribers: tschuett, llvm-commits, RKSimon

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57657

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353024 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis] Throughput support in analysis mode

Summary:
D57000 / [[ https://bugs.llvm.org/show_bug.cgi?id=37698 | PR37698 ]] added support for measuring of the inverse throughput.
But the support for the analysis was not added.
This attempts to fix that. (analysis done o bdver2 / piledriver)

First, small-scale experiment:
```
$ ./bin/llvm-exegesis -num-repetitions=10000 -mode=inverse_throughput -opcode-name=BSF64rr
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-d0acdd.o
---
mode:            inverse_throughput
key:
  instructions:
    - 'BSF64rr RAX RDX'
  config:          ''
  register_initial_values:
    - 'RDX=0x0'
cpu_name:        bdver2
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:
  - { key: inverse_throughput, value: 3.0278, per_snippet_value: 3.0278 }
error:           ''
info:            instruction has no tied variables picking Uses different from defs
assembled_snippet: 48BA0000000000000000480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2480FBCC2C3
...
```
If we plug `bsfq %r12, %r10` into llvm-mca:
https://godbolt.org/z/ZtOyhJ
```
Dispatch Width:    4
uOps Per Cycle:    3.00
IPC:               0.50
Block RThroughput: 2.0
```
So RThroughput mismatch exists.

Now, let's upscale and analyse:
{F8207148}
`$ ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=1.0 -benchmarks-file=/tmp/benchmarks-inverse_throughput.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html`:
{F8207172}
{F8207197}
And if we now look at https://www.agner.org/optimize/instruction_tables.pdf,
`Reciprocal throughput` for `BSF r,r` is listed as `3`.
Yay?

Reviewers: courbet, gchatelet

Reviewed By: courbet

Subscribers: tschuett, RKSimon, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57647

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353023 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis] deserializeMCInst(): bump SmallVector small size up to 16

Summary:
... from 8.
`VALIGNDZ128rmbik XMM0 XMM0 K1 XMM3 RDI i_0x1 i_0x0 i_0x1` instruction already has 9 components.
It does not matter much in terms of performance, but avoiding allocation seems to come with low cost here..

Reviewers: courbet, gchatelet

Reviewed By: courbet

Subscribers: tschuett, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57654

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353022 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis] Don't default to running&dumping all analyses to '-'

Summary:
Up until the point i have looked in the source, i didn't even understood that
i can disable 'cluster' output. I have always silenced it via ` &> /dev/null`.
(And hoped it wasn't contributing much of the run time.)

While i expect that it has it's use-cases i never once needed it so far.
If i forget to silence it, console is completely flooded with that output.

How about not expecting users to opt-out of analyses,
but to explicitly specify the analyses that should be performed?

Reviewers: courbet, gchatelet

Reviewed By: courbet

Subscribers: tschuett, RKSimon, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57648

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353021 91177308-0d34-0410-b5e6-96231b3b80d8

[SCEV] Do not bother creating separate SCEVUnknown for unreachable nodes

Currently, SCEV creates SCEVUnknown for every node of unreachable code. If we
have a huge amounts of such code, we will be littering SE with these nodes. We could
just state that they all are undef and save some memory.

Differential Revision: https://reviews.llvm.org/D57567
Reviewed By: sanjoy

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353017 91177308-0d34-0410-b5e6-96231b3b80d8

Recommit r352660 "[X86] Mark EMMS and FEMMS as clobbering MM0-7 and ST0-7."

We now print ST0 as 'st' when generating the clobber list for MS inline assembly in clang. This matches what the gcc reg name list expects.

Original commit message:

This fixes the test case in PR35982 by preventing MMX instructions that read MM0-7 from being moved below EMMS/FEMMS by the post RA scheduler.

Though as discussed in bugzilla, this is not a complete fix. There is still the possibility of reordering in IR or by the pre-RA scheduler.

Differential Revision: https://reviews.llvm.org/D57298

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353016 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Print %st(0) as %st when its implicit to the instruction. Continue printing it as %st(0) when its encoded in the instruction.

This is a step back from the change I made in r352985. This appears to be more consistent with gcc and objdump behavior.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353015 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Regenerate test to drop 'End function' comments some other other regex updates.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353014 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r352985 "[X86] Print %st(0) as %st to match what gcc inline asm uses as the clobber name to make MS inline asm work correctly"

Looking into gcc and objdump behavior more this was overly aggressive. If the register is encoded in the instruction we should print %st(0), if its implicit we should print %st.

I'll be making a more directed change in a future patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353013 91177308-0d34-0410-b5e6-96231b3b80d8

tests: loosen restriction

The MachO tests can run on any target, but require that the x86 backend
is available. Broaden the coverage of the test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353012 91177308-0d34-0410-b5e6-96231b3b80d8

Compute the correct symbol size in llvm-nm even without --print-size

In llvm-nm, the symbol size was being computed only with --print-size option,
even though it was being printed in other cases, such as with --format=posix.

This patch simply removes the guard, so that the size is computed
independently of the later decision to print it or not.

Fixes PR39997.

Differential Revision: https://reviews.llvm.org/D57599

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353011 91177308-0d34-0410-b5e6-96231b3b80d8

[docs] Recommend assertions when testing.

Pointed out by Shoaib Meenai.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353008 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopIdiomRecognize] @llvm.dbg values shouldn't affect the transformation.

Summary: PR40564

Reviewers: aprantl, rnk

Subscribers: llvm-commits, hiraditya

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57629

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353007 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Make vector types legal in UREM test

As discussed in D50222, this changes the vector types in tests required for that revision to ones legal for X86.

Patch by @hermord (Dmytro Shynkevych)

Differential Revision: https://reviews.llvm.org/D56372

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353004 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] adjust test for uaddo change in rL353001

We don't need a mtctr/bctr for this test now; a regular
conditional branch is fine.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353002 91177308-0d34-0410-b5e6-96231b3b80d8

[CGP] adjust target constraints for forming uaddo

There are 2 changes visible here:
1. There's no reason to limit this transform based on number
   of condition registers. That diff allows PPC to produce
   slightly better (dot-instructions should be generally good)
   code.
   Note: someone that cares about PPC codegen might want to
   look closer at that output because it seems like we could
   still improve this.

2. We (probably?) should not bother trying to form uaddo (or
   other overflow ops) when there's no target support for such
   an op. This goes beyond checking whether the op is expanded
   because both PPC and AArch64 show better codegen for standard
   types regardless of whether the op is legal/custom.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@353001 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX] Support shuffle combining for VBROADCAST with smaller vector sources

getTargetShuffleMask can only do this safely if we're extracting the lowest subvector from a vector of the same result type.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352999 91177308-0d34-0410-b5e6-96231b3b80d8

[PatternMatch] add special-case uaddo matching for increment-by-one (2nd try)

This is the most important uaddo problem mentioned in PR31754:
https://bugs.llvm.org/show_bug.cgi?id=31754
...but that was overcome in x86 codegen with D57637.

That patch also corrects the inc vs. add regressions seen with the previous attempt at this.

Still, we want to make this matcher complete, so we can potentially canonicalize the pattern
even if it's an 'add 1' operation.
Pattern matching, however, shouldn't assume that we have canonicalized IR, so we match 4
commuted variants of uaddo.

There's also a test with a crazy type to show that the existing CGP transform based on this
matcher is not limited by target legality checks.

I'm not sure if the Hexagon diff means the test is no longer testing what it intended to
test, but that should be solvable in a follow-up.

Differential Revision: https://reviews.llvm.org/D57516

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352998 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX] Support shuffle combining for VPMOVZX with smaller vector sources

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352997 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX] More aggressively simplify BROADCAST source operand

Aim to use scalar source or lowest 128-bit vector directly.

We're still missing some VZMOVL_LOAD combines.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352994 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] add CGP uaddo test with weird type; NFC

There's probably no reason to try this transform
for an obviously unsupported op.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352993 91177308-0d34-0410-b5e6-96231b3b80d8

[CGP] move test file to prevent bot failures

The test specifiies the triple, so it needs to be in the
x86 directory in case a bot has been configured without
the x86 target.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352992 91177308-0d34-0410-b5e6-96231b3b80d8

[CGP] refactor optimizeCmpExpression (NFCI)

This is not truly NFC because we are bailing out without
a TLI now. That should not be a real concern though because
there should be a TLI in any real-world scenario.

That seems better than passing around a pointer and then
checking it for null-ness all over the place.

The motivation is to fix what appears to be an unintended
restriction on the uaddo transform -
hasMultipleConditionRegisters() shouldn't be reason to limit
the transform.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352988 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] add tests for saturating add; NFC

This is copied from the existing test files for x86/AArch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352987 91177308-0d34-0410-b5e6-96231b3b80d8

[DA][NewPM] Handle transitive dependencies in the new-pm version of DA

Summary:
The analysis result of DA caches pointers to AA, SCEV, and LI, but it
never checks for their invalidation. Fix that.

Reviewers: chandlerc, dmgreen, bogner

Reviewed By: dmgreen

Subscribers: hiraditya, bollu, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D56381

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352986 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Print %st(0) as %st to match what gcc inline asm uses as the clobber name to make MS inline asm work correctly

Summary:
When calculating clobbers for MS style inline assembly we fail if the asm clobbers stack top because we print st(0) and try to pass it through the gcc register name check. This was found with when I attempted to make a emms/femms clobber all ST registers. If you use emms/femms in MS inline asm we would try to use st(0) as the clobber name but clang would think that wasn't a valid clobber name.

This also matches what objdump disassembly prints. It's also what is printed by gcc -S.

Reviewers: RKSimon, rnk, efriedma, spatel, andreadb, lebedev.ri

Reviewed By: rnk

Subscribers: eraman, gbedwell, lebedev.ri, llvm-commits

Differential Revision: https://reviews.llvm.org/D57621

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352985 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Lower ISD::UADDO to use the Z flag instead of C flag when the RHS is a constant 1 to encourage INC formation.

Summary:
Add an additional combine to combineCarryThroughADD to reverse it back to the C flag to avoid regressions.

I believe this catches the cases that D57547 got.

Reviewers: RKSimon, spatel

Reviewed By: spatel

Subscribers: javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57637

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352984 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Fix -Wunused-variable after rL352978

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352982 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] Missed optimization in math expression: log10(pow(10.0,x)) == x, log2(pow(2.0,x)) == x

Summary: This patch enables folding following instructions under -ffast-math flag: log10(pow(10.0,x)) -> x, log2(pow(2.0,x)) -> x

Reviewers: hfinkel, spatel, efriedma, craig.topper, zvi, majnemer, lebedev.ri

Reviewed By: spatel, lebedev.ri

Subscribers: lebedev.ri, llvm-commits

Differential Revision: https://reviews.llvm.org/D41940

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352981 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: Implement widenScalar for G_UNMERGE_VALUES

For the scalar case only.

Also move the similar G_MERGE_VALUES handling to a separate function
and cleanup to make them look more similar.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352979 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: Implement widenScalar for G_EXTRACT vector sources

Handle the basic element extract case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352978 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Avoid reporting illegal extloads as legal

This avoids breaking a test in a future commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352977 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Legalize icmp for pointer types

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352976 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Legalize constant for pointer types

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352975 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Legalize select for pointer types

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352974 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: Legalization for inttoptr/ptrtoint

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352973 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add another test case for PR40539. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352967 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX] Enable INSERT_SUBVECTOR(SRC0, SHUFFLE(SRC1)) shuffle combining

Push the insert_subvector up through the shuffle operands to help find more cross-lane shuffles.

The is exposes a couple of minor issues that will be fixed shortly:
Missed broadcast folds - we have a mixture of vzext_load lengths that need cleaning up
combine-sdiv.ll - AVX1 SimplifyDemandedVectorElts failure (hits max depth due to a couple of extra bitcasts).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352963 91177308-0d34-0410-b5e6-96231b3b80d8

[SDAG] Add SDNode/SDValue getConstantOperandAPInt helper. NFCI.

We already have the getConstantOperandVal helper which returns a uint64_t, but along comes the fuzzer and inserts a i128 -1 constant or something and the whole thing asserts.......

I've updated a few obvious cases, and tried to make use of the const reference where possible, but there's more to do. A number of existing oss-fuzz tickets should be fixed if we start using APInt and perform value clamping where necessary.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352961 91177308-0d34-0410-b5e6-96231b3b80d8

[LCSSA] Handle case with single new PHI faster.

If there is only a single available value, all uses must be dominated by
the single value and there is no need to search for a reaching
definition.

This drastically speeds up LCSSA in some cases. For the test case
from PR37202, it speeds up LCSSA construction by 4 times.

Time-passes without this patch for test case from PR37202:

    Total Execution Time: 29.9285 seconds (29.9276 wall clock)

    ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
    5.2786 ( 17.7%)   0.0021 (  1.2%)   5.2806 ( 17.6%)   5.2808 ( 17.6%)  Unswitch loops
    4.3739 ( 14.7%)   0.0303 ( 18.1%)   4.4042 ( 14.7%)   4.4042 ( 14.7%)  Loop-Closed SSA Form Pass
    4.2658 ( 14.3%)   0.0192 ( 11.5%)   4.2850 ( 14.3%)   4.2851 ( 14.3%)  Loop-Closed SSA Form Pass #2
    2.2307 (  7.5%)   0.0013 (  0.8%)   2.2320 (  7.5%)   2.2318 (  7.5%)  Loop Invariant Code Motion
    2.0888 (  7.0%)   0.0012 (  0.7%)   2.0900 (  7.0%)   2.0897 (  7.0%)  Unroll loops
    1.6761 (  5.6%)   0.0013 (  0.8%)   1.6774 (  5.6%)   1.6774 (  5.6%)  Value Propagation
    1.3686 (  4.6%)   0.0029 (  1.8%)   1.3716 (  4.6%)   1.3714 (  4.6%)  Induction Variable Simplification
    1.1457 (  3.8%)   0.0010 (  0.6%)   1.1468 (  3.8%)   1.1468 (  3.8%)  Loop-Closed SSA Form Pass #4
    1.1384 (  3.8%)   0.0005 (  0.3%)   1.1389 (  3.8%)   1.1389 (  3.8%)  Loop-Closed SSA Form Pass #6
    1.1360 (  3.8%)   0.0027 (  1.6%)   1.1387 (  3.8%)   1.1387 (  3.8%)  Loop-Closed SSA Form Pass #5
    1.1331 (  3.8%)   0.0010 (  0.6%)   1.1341 (  3.8%)   1.1340 (  3.8%)  Loop-Closed SSA Form Pass #3

Time passes with this patch

  Total Execution Time: 19.2802 seconds (19.2813 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Name ---
   4.4234 ( 23.2%)   0.0038 (  2.0%)   4.4272 ( 23.0%)   4.4273 ( 23.0%)  Unswitch loops
   2.3828 ( 12.5%)   0.0020 (  1.1%)   2.3848 ( 12.4%)   2.3847 ( 12.4%)  Unroll loops
   1.8714 (  9.8%)   0.0020 (  1.1%)   1.8734 (  9.7%)   1.8735 (  9.7%)  Loop Invariant Code Motion
   1.7973 (  9.4%)   0.0022 (  1.2%)   1.7995 (  9.3%)   1.8003 (  9.3%)  Value Propagation
   1.4010 (  7.3%)   0.0033 (  1.8%)   1.4043 (  7.3%)   1.4044 (  7.3%)  Induction Variable Simplification
   0.9978 (  5.2%)   0.0244 ( 13.1%)   1.0222 (  5.3%)   1.0224 (  5.3%)  Loop-Closed SSA Form Pass #2
   0.9611 (  5.0%)   0.0257 ( 13.8%)   0.9868 (  5.1%)   0.9868 (  5.1%)  Loop-Closed SSA Form Pass
   0.5856 (  3.1%)   0.0015 (  0.8%)   0.5871 (  3.0%)   0.5869 (  3.0%)  Unroll loops #2
   0.4132 (  2.2%)   0.0012 (  0.7%)   0.4145 (  2.1%)   0.4143 (  2.1%)  Loop Invariant Code Motion #3

Reviewers: efriedma, davide, mzolotukhin

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D57033

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352960 91177308-0d34-0410-b5e6-96231b3b80d8

[LCSSA] Add expensive verification of LCSSA form for sub-loops.

This assertion makes sure all sub-loops are in LCSSA form before
bringing their parent in LCSSA form. This precondition was added to
formLCSSA in D56848.

Reviewers: davide, efriedma, mzolotukhin

Reviewed By: davide

Differential Revision: https://reviews.llvm.org/D56921

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352958 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE]: Adding full coverage of MC encoding tests for the SSE isa sets.<NFC>

Summary:
NFC.
Adding MC regressions tests to cover all the SSE ISA sets as follows:
SSE, SSE2, SSE3, SSE4, SSE42, SSEMXCSR, SSE_PREFETCH, SSSE3

This patch is part of a larger task to cover MC encoding of all X86 ISA Sets.
See revision: https://reviews.llvm.org/D39952

Patch by Gadi Haber and Wang Tianqing

Reviewers: RKSimon, zvi, craig.topper, AndreiGrischenko, gadi.haber, LuoYuanke

Reviewed By: craig.topper

Subscribers: jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D40387

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352955 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "Bump minimum toolchain version"

Reverting D57264 again, it looks like we're down to two bots that need fixing:

polly-amd64-linux
polly-arm-linux

They both have old versions of libstdc++ and recent clang.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352954 91177308-0d34-0410-b5e6-96231b3b80d8

[BPF] [BTF] Process FileName with absolute path correctly

In IR, sometimes the following attributes for DIFile may be
generated:
filename: /home/yhs/test.c
directory: /tmp
The /tmp may represent the working directory of the compilation
process.

In such cases, since filename is with absolute path,
the directory should be ignored by BTF. The filename alone is
enough to get the source.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352952 91177308-0d34-0410-b5e6-96231b3b80d8

Bump minimum toolchain version

Summary:
The RFC on moving past C++11 got good traction:
http://lists.llvm.org/pipermail/llvm-dev/2019-January/129452.html

This patch therefore bumps the toolchain versions according to our policy:
llvm.org/docs/DeveloperPolicy.html#toolchain

Subscribers: mgorny, jkorous, dexonsmith, llvm-commits, mehdi_amini, jyknight, rsmith, chandlerc, smeenai, hans, reames, lattner, lhames, erichkeane

Differential Revision: https://reviews.llvm.org/D57264

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352951 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] Temporarily limit one test to darwin

Some triples in llvm-mc appear to be unavailable on some buildbots.
To please those buildbots we temporarily limit the test to darwin
(where the required triple is guranteed to be available)
until we find the right solution.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352950 91177308-0d34-0410-b5e6-96231b3b80d8

[ASan] Do not instrument other runtime functions with `__asan_handle_no_return`

Summary:
Currently, ASan inserts a call to `__asan_handle_no_return` before every
`noreturn` function call/invoke. This is unnecessary for calls to other
runtime funtions. This patch changes ASan to skip instrumentation for
functions calls marked with `!nosanitize` metadata.

Reviewers: TODO

Differential Revision: https://reviews.llvm.org/D57489

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352948 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] Fix triples in macho tests.

Update triples used by the macho tests to fix some buildbots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352947 91177308-0d34-0410-b5e6-96231b3b80d8

[AutoUpgrade] Fix AutoUpgrade for x86.seh.recoverfp

Summary: This fixes the bug in https://reviews.llvm.org/D56747#inline-502711.

Reviewers: efriedma

Reviewed By: efriedma

Subscribers: javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57614

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352945 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] Add ability to copy MachO object files

This diff implements first bits for copying (without modification) MachO object files.

Test plan: make check-all

Differential revision: https://reviews.llvm.org/D54674

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352944 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[BPF] [BTF] Process FileName with absolute path correctly"

This reverts commit r352939.

Some tests failed. Revert to unblock others.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352941 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Fix unused variable [NFC]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352940 91177308-0d34-0410-b5e6-96231b3b80d8

[BPF] [BTF] Process FileName with absolute path correctly

In IR, sometimes the following attributes for DIFile may be
generated:
filename: /home/yhs/test.c
directory: /tmp
The /tmp may represent the working directory of the compilation
process.

In such cases, since filename is with absolute path,
the directory should be ignored by BTF. The filename alone is
enough to get the source.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352939 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeGen] Be as conservative about atomic accesses as for volatile

Background: At the moment, we record the AtomicOrdering of an access in the MMO, but also mark any atomic access as volatile in SelectionDAG. I'm working towards separating that. See https://reviews.llvm.org/D57601 for context.

Update all usages of isVolatile in lib/CodeGen to preserve behaviour once atomic MMOs stop being also volatile. This is NFC in it's current form, but is essential for correctness once we make that final change.

It useful to keep in mind that AtomicSDNode is not a parent of LoadSDNode, StoreSDNode, or LSBaseSDNode. As a result, any call to isVolatile on one of those static types doesn't need a companion isAtomic check. We should probably adjust that class hierarchy long term, but for now, that seperation is useful.

I'm deliberately being conservative about handling. I want the change to stop adding volatile to be NFC itself, and then will work through places where we can be less conservative for atomics one by one in separate changes w/tests.

Differential Revision: https://reviews.llvm.org/D57596

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352937 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Refactor test checks (NFC)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352935 91177308-0d34-0410-b5e6-96231b3b80d8

[Test] Update file w/update_test_checks.py to make a follow on change obvious

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352932 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Add codegen support for the import_field attribute

This adds the LLVM side of https://reviews.llvm.org/D57602 -- the
import_field attribute. See that patch for details.

Differential Revision: https://reviews.llvm.org/D57603

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352931 91177308-0d34-0410-b5e6-96231b3b80d8

[COFF, ARM64] Fix localaddress to handle stack realignment and variable size objects

Summary: This fixes using the correct stack registers for SEH when stack realignment is needed or when variable size objects are present.

Reviewers: rnk, efriedma, ssijaric, TomTan

Reviewed By: rnk, efriedma

Subscribers: javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D57183

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352923 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX] Add VMOVDDUP-VPBROADCASTQ execution domain mapping

Noticed in D57514.

Differential Revision: https://reviews.llvm.org/D57519

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352922 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Mark test functions with hidden visibility

Prepare for future patch which affects codegen for calls to preemptible
functions.

Differential Revision: https://reviews.llvm.org/D57605

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352920 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] Fix mkdir use in test

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352918 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Expand Windows test (NFC)

Run checks for Win32 as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352917 91177308-0d34-0410-b5e6-96231b3b80d8

[DebugInfo] Don't use realpath when looking up debug binary locations.

Summary:
Using realpath makes assumptions about build systems that do not always hold true. The debug binary referred to from the .gnu_debuglink should exist in the same directory (or in a .debug directory, etc.), but the files may only exist as symlinks to a differently named files elsewhere, and using realpath causes that lookup to fail.

This was added in r189250, and this is basically a revert + regression test case.

Reviewers: dblaikie, samsonov, jhenderson

Reviewed By: dblaikie

Subscribers: llvm-commits, hiraditya

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57609

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352916 91177308-0d34-0410-b5e6-96231b3b80d8

[opaque pointer types] Pass function type for CallBase::setCalledFunction.

Differential Revision: https://reviews.llvm.org/D57174

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352914 91177308-0d34-0410-b5e6-96231b3b80d8

[opaque pointer types] Pass value type to GetElementPtr creation.

This cleans up all GetElementPtr creation in LLVM to explicitly pass a
value type rather than deriving it from the pointer's element-type.

Differential Revision: https://reviews.llvm.org/D57173

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352913 91177308-0d34-0410-b5e6-96231b3b80d8

[opaque pointer types] Pass value type to LoadInst creation.

This cleans up all LoadInst creation in LLVM to explicitly pass the
value type rather than deriving it from the pointer's element-type.

Differential Revision: https://reviews.llvm.org/D57172

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352911 91177308-0d34-0410-b5e6-96231b3b80d8

[opaque pointer types] Pass function types to InvokeInst creation.

This cleans up all InvokeInst creation in LLVM to explicitly pass a
function type rather than deriving it from the pointer's element-type.

Differential Revision: https://reviews.llvm.org/D57171

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352910 91177308-0d34-0410-b5e6-96231b3b80d8

[opaque pointer types] Pass function types to CallInst creation.

This cleans up all CallInst creation in LLVM to explicitly pass a
function type rather than deriving it from the pointer's element-type.

Differential Revision: https://reviews.llvm.org/D57170

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352909 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Expand Windows test (NFC)

Run checks for Win64 as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352908 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Create regular archives for the sanitizer runtimes.

We'll need to do this eventually if we create an installable package.
For now, this lets me use the archives to build Android, whose build
system wants to copy the archives to another location.

Differential Revision: https://reviews.llvm.org/D57607

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352907 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Extra null-checking on TFE/LWE support

- If that operand is not ConstantInt, skip enabling TFE/LWE.

Differential Revision: https://reviews.llvm.org/D57539

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352904 91177308-0d34-0410-b5e6-96231b3b80d8

Hopefully fix a couple more sphinx doc errors.

These seem to only appear on the buildbot runner, and it looks like we
tried to suppress them, but it's not working. Not sure why.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352903 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objdump] - llvm-objdump can skip bytes at the end of a section.

Differential Revision: https://reviews.llvm.org/D57549

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352900 91177308-0d34-0410-b5e6-96231b3b80d8

Fix a bug in the definition of isUnordered on MachineMemOperand

Background: At the moment, we record the AtomicOrdering of an access in the MMO, but also mark any atomic access as volatile in SelectionDAG. GlobalISEL keeps the two separate, but currently doesn't know how to lower an atomic G_LOAD at all. See https://reviews.llvm.org/D57601 for context.

The definition used for unordered was only checking volatility, not atomicity. As noted above, all atomic MMOs are currently also volatile, so this is a latent bug only. Copy the definition used in IR, after auditing the two (2) uses of the function to be sure the desired semantics are the same.

Differential Revision: https://reviews.llvm.org/D57593

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352898 91177308-0d34-0410-b5e6-96231b3b80d8

test commit (add blank line) NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352897 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-readobj] Add a flag to dump just the section-to-segment mapping.

Summary:
The following patch introduces a new function `printSectionMapping` which is responsible for dumping just the section-to-segment mapping.
This patch also introduces a n option `-section-mapping` that outputs that mapping without the program headers.

Previously, this functionality was controlled by `printProgramHeaders`, and the output from `-program-headers` has not been changed. I am happy to change the option name, I copied the name that was displayed when outputting the mapping table.

Reviewers: khemant, jhenderson, grimar, rupprecht

Reviewed By: jhenderson, grimar, rupprecht

Subscribers: rupprecht, jhenderson, llvm-commits

Differential Revision: https://reviews.llvm.org/D57365

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352896 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Refactor test checks (NFC)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352895 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Add a missing dependency from llvm/test to llvm-lit

check-llvm already listed llvm-lit as script which counts as a dep, so running
check-llvm worked fine, but `ninja -C out/gn llvm/test` didn't build llvm-lit
before if it wasn't already there.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352893 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Expand Windows test (NFC)

Add checks for Win64 to existing cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352892 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-nm] Report '.comment' ELF sections as 'n' instead of '?'

Summary:
The previous implementation reported `.comment` sections as '?'
GNU uses 'n' which means "The symbol is a debugging symbol." `.note` sections are represented as 'n' too.

The test related to this change was updated to CHECK-NEXT to ensure
order and that we did not miss any symbols in the dump.

Reviewers: jhenderson

Reviewed By: jhenderson

Subscribers: rupprecht, llvm-commits

Differential Revision: https://reviews.llvm.org/D57544

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352891 91177308-0d34-0410-b5e6-96231b3b80d8

[DWARF v5] Fix DWARF emitter and consumer to produce/expect a uleb for a location description's length.

Reviewer: davide, JDevliegere

Differential Revision: https://reviews.llvm.org/D57550

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352889 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy][NFC] More error propagation (executeObjcopyOnArchive)

Summary:
Replace some reportError() calls with error propagation that was missed from rL352625.

Note this also adds an error check during Archive iteration that was being hidden by a different error check before:

```
  for (const Archive::Child &Child : Ar.children(Err)) {
    Expected<std::unique_ptr<Binary>> ChildOrErr = Child.getAsBinary();
    if (!ChildOrErr)
      // This aborts, so Err is never checked
      reportError(Ar.getFileName(), ChildOrErr.takeError());
```

Err is being checked after the loop, so during happy runs, everything is fine. But when reportError is changed to return the error instead of aborting, the fact that Err is never checked is now noticed in tests that trigger an error during the loop.

Reviewers: jhenderson, dblaikie, alexshap

Reviewed By: dblaikie

Subscribers: llvm-commits, lhames, jakehehrlich

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57462

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352888 91177308-0d34-0410-b5e6-96231b3b80d8

Fix some sphinx doc errors.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352887 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Refactor test checks (NFC)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352886 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Fix for vector element insertion

Summary:
Incorrect code was generated when lowering insertelement operations
for vectors with 8 or 16 bit elements. The value being inserted was
not adjusted for the position of the element within the 32 bit word
and so only the low element within each 32 bit word could receive
the intended value.

Fixed by simply replicating the value to each element of a
congruent vector before the mask and or operation used to
update the intended element.

A number of affected LIT tests have been updated appropriately.

before the mask & or into the intended

Reviewers: arsenm, nhaehnle

Reviewed By: arsenm

Subscribers: llvm-commits, arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57588

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352885 91177308-0d34-0410-b5e6-96231b3b80d8

[SDAG] improve variable names; NFC

The version of FoldConstantArithmetic() that takes arbitrary nodes
was confusingly naming those nodes as constants when they might
not be; also "Cst" reads like "Cast".

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352884 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Use PSLLDQ/PSRLDQ to mask out zeroable ends of a shuffle

As suggested on PR40318, this patch uses PSLLDQ/PSRLDQ to lower shuffles to zero out the ends of a vector, leaving a sequential inner section.

For pre-SSSE3 we do this for shuffles with zeros at either end (requiring up to 3 shifts), but once PSHUFB is available I've limited this to shuffles with a single zeroable end (2 shifts).

Differential Revision: https://reviews.llvm.org/D56784

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352883 91177308-0d34-0410-b5e6-96231b3b80d8

[TargetLowering] try harder to determine undef elements of vector binops

This might be the start of tracking all vector element constants generally if we take it to its
logical conclusion, but let's stop here and make sure this is correct/beneficial so far.

The affected tests require a convoluted path before they get simplified currently because we
don't call SimplifyDemandedVectorElts() from binops directly and don't modify the binop operands
directly in SimplifyDemandedVectorElts().

That's why the tests all have a trailing shuffle to induce a chain reaction of transforms. So
something like this is happening:

1. Improve the knowledge of undefs in the binop via a SimplifyDemandedVectorElts() call that
originates from a shuffle.
2. Transfer that undef knowledge back to the shuffle mask user as more undef lanes.
3. Combine the modified shuffle by calling SimplifyDemandedVectorElts() again.
4. Translate the improved shuffle mask as undemanded lanes of build vector constants causing
those to become full undef constants.
5. Simplify the binop now that it has a full undef operand.

As we can see from the unchanged 'and' and 'or' tests, tracking undefs alone isn't a full solution.
We would need to track zero and all-ones constants to improve those opcodes. We'd probably need to
track NaN for FP ops too (assuming we don't have fast-math-flags set).

Differential Revision: https://reviews.llvm.org/D57066

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352880 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX] Combine INSERT_SUBVECTOR(SRC0, BITCAST(SHUFFLE(EXTRACT_SUBVECTOR(SRC1)))

Enable peeking through one use bitcasts to the subvector shuffle.

This still depends on the subvector being the same scalar-size but D57514 has already helped with the more tricky patterns

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352879 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-strip] Implement --keep-file-symbols

Differential revision: https://reviews.llvm.org/D57582

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352878 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy][NFC] Propagate errors in removeSymbols/removeSectionReferences

Reviewers: jhenderson, alexshap, jakehehrlich, espindola

Reviewed By: jhenderson

Subscribers: emaste, arichardson, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57543

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352877 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] reduce duplicate code; NFC

An unused variable problem was introduced with rL352870
and stubbed out with rL352871, but we can make a better
fix by actually using the local variable in code rather
than just the assert.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352873 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Fix -Wunused-variable when -DLLVM_ENABLE_ASSERTIONS=off

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352871 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] try to reduce x86 addcarry to generic uaddo intrinsic

If we can reduce the x86-specific intrinsic to the generic op, it allows existing
simplifications and value tracking folds. AFAICT, this always results in identical
x86 codegen in the non-reduced case...which should be true because we semi-generically
(too aggressively IMO) convert to llvm.uadd.with.overflow in CGP, so the DAG/isel must
already combine/lower this intrinsic as expected.

This isn't quite what was requested in:
https://bugs.llvm.org/show_bug.cgi?id=40486
...but we want to have these kinds of folds early for efficiency and to enable greater
simplifications. For the case in the bug report where we have:
_addcarry_u64(0, ahi, 0, &ahi)
...this gets completely simplified away in IR.

Differential Revision: https://reviews.llvm.org/D57453

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352870 91177308-0d34-0410-b5e6-96231b3b80d8