granicus.if.org Git - llvm/log
llvm
5 years ago[lit] Extend internal diff to support -U
Joel E. Denny [Mon, 14 Oct 2019 19:59:30 +0000 (19:59 +0000)]
[lit] Extend internal diff to support -U

When using lit's internal shell, RUN lines like the following
accidentally execute an external `diff` instead of lit's internal
`diff`:

```
 # RUN: program | diff -U1 file -
```

Such cases exist now, in `clang/test/Analysis` for example.  We are
preparing patches to ensure lit's internal `diff` is called in such
cases, which will then fail because lit's internal `diff` doesn't
recognize `-U` as a command-line option.  This patch adds `-U`
support.
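
As an illustration of what `-U<n>` requests, here is a minimal Python sketch built on the standard library's `difflib.unified_diff`, whose `n` parameter sets the number of context lines. The helper name `unified` and the sample inputs are hypothetical; this is not a copy of lit's internal `diff`.

```python
import difflib

def unified(a_lines, b_lines, context):
    """Return unified-diff lines with `context` lines of context (like diff -U<context>)."""
    return list(difflib.unified_diff(a_lines, b_lines,
                                     fromfile="a", tofile="b",
                                     n=context, lineterm=""))

# Hypothetical inputs: the two files differ only on their fourth line.
old = ["1", "2", "3", "4", "5", "6", "7"]
new = ["1", "2", "3", "X", "5", "6", "7"]

# Equivalent in spirit to `diff -U1 a b`: one context line on each side of the change.
for line in unified(old, new, 1):
    print(line)
```

Passing `0` instead of `1` drops the context lines and leaves only the changed pair in the hunk.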

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D68668

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374814 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[Tests] Add a test demonstrating a miscompile in the off-by-default loop-pred transform
Philip Reames [Mon, 14 Oct 2019 19:49:40 +0000 (19:49 +0000)]
[Tests] Add a test demonstrating a miscompile in the off-by-default loop-pred transform

Credit goes to Evgeny Brevnov for figuring out the problematic case.

Fuzzing probably also found it (lots of failures), but due to some silly infrastructure problems I hadn't gotten to the results before Evgeny hand-reduced it from a benchmark.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374812 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[LoopIdiom] BCmp: loop exit count must not be wider than size_t that `bcmp` takes
Roman Lebedev [Mon, 14 Oct 2019 19:46:34 +0000 (19:46 +0000)]
[LoopIdiom] BCmp: loop exit count must not be wider than size_t that `bcmp` takes

As reported by Joerg Sonnenberger on IRC, on 32-bit systems,
where pointers and size_t are 32-bit, using a 64-bit-wide variable
in the loop can produce a loop exit count whose type is
wider than size_t. Now, I'm not sure if we can produce `bcmp`
from that (just truncate?), but we certainly should not assert/miscompile.
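
The width problem can be sketched in a few lines of Python. The helper names `fits_in_size_t` and `truncate` are hypothetical and only model the arithmetic; they are not taken from the LoopIdiom pass.

```python
def fits_in_size_t(exit_count_bits: int, size_t_bits: int) -> bool:
    """True when an exit count of the given width can be passed as size_t unchanged."""
    return exit_count_bits <= size_t_bits

def truncate(value: int, bits: int) -> int:
    """Keep only the low `bits` bits, as an integer truncation would."""
    return value & ((1 << bits) - 1)

# 32-bit target with a 64-bit loop variable: the exit count is too wide.
print(fits_in_size_t(64, 32))

# Why "just truncate?" is not obviously safe: a count of 2**32 + 5
# would silently become 5 after truncation to 32 bits.
print(truncate(2**32 + 5, 32))
```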

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374811 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[ASan] Fix IRTests/InstructionsTest.UnaryOperator
Cameron McInally [Mon, 14 Oct 2019 19:17:31 +0000 (19:17 +0000)]
[ASan] Fix IRTests/InstructionsTest.UnaryOperator

Fix ASan regression from r374782.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374808 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[Tests] Add a few more tests for idioms with FP induction variables
Philip Reames [Mon, 14 Oct 2019 19:10:39 +0000 (19:10 +0000)]
[Tests] Add a few more tests for idioms with FP induction variables

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374807 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[ThinLTO] Fix printing of NoInline function summary flag
Teresa Johnson [Mon, 14 Oct 2019 18:37:31 +0000 (18:37 +0000)]
[ThinLTO] Fix printing of NoInline function summary flag

Summary:
The guard for printing function flags in the summary was not checking
the NoInline flag.

Reviewers: wmi

Subscribers: mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68948

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374802 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoAMDGPU: Fix redundant setting of m0 for atomic load/store
Matt Arsenault [Mon, 14 Oct 2019 18:30:31 +0000 (18:30 +0000)]
AMDGPU: Fix redundant setting of m0 for atomic load/store

Atomic load/store would have their setting of m0 handled twice, which
happened to be optimized out later.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374801 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoAMDGPU: Remove unnecessary IR from test
Matt Arsenault [Mon, 14 Oct 2019 18:30:29 +0000 (18:30 +0000)]
AMDGPU: Remove unnecessary IR from test

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374800 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoFix copy-pasto in r374759
Hans Wennborg [Mon, 14 Oct 2019 17:52:31 +0000 (17:52 +0000)]
Fix copy-pasto in r374759

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374796 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[llvm-objdump] Adjust spacing and field width for --section-headers
Jordan Rupprecht [Mon, 14 Oct 2019 17:47:17 +0000 (17:47 +0000)]
[llvm-objdump] Adjust spacing and field width for --section-headers

Summary:
- Expand the "Name" column past 13 characters when any of the section names are longer. Current behavior is a staggered output instead of a nice table if a single name is longer.
- Only print the required number of hex chars for addresses (i.e. 8 characters for 32-bit, 16 characters for 64-bit)
- Fix trailing spaces
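
The three points above can be sketched in Python as follows. The function name `format_section_headers`, the tuple layout, and the exact column order are assumptions made for illustration; this is not llvm-objdump's actual output format.

```python
def format_section_headers(sections, is_64bit):
    """sections: list of (name, size, address) tuples; returns formatted rows."""
    # The Name column is at least 13 wide and grows to fit the longest name.
    name_width = max([13] + [len(name) for name, _, _ in sections])
    # Print only as many hex digits as the address size needs.
    addr_width = 16 if is_64bit else 8
    rows = []
    for idx, (name, size, addr) in enumerate(sections):
        row = f"{idx:3d} {name:<{name_width}} {size:08x} {addr:0{addr_width}x}"
        rows.append(row.rstrip())  # never leave trailing spaces
    return rows

# Hypothetical section list with one name longer than 13 characters.
example = [(".text", 0x10, 0x1000), (".a_very_long_section_name", 0x20, 0x2000)]
for row in format_section_headers(example, is_64bit=False):
    print(row)
```

Computing the width once, before printing any row, is what keeps every row aligned even when only one name is long.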

Reviewers: grimar, jhenderson, espindola

Reviewed By: grimar

Subscribers: emaste, sbc100, arichardson, aheejin, seiya, llvm-commits, MaskRay

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68730

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374795 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoAdd FMF to vector ops for phi
Michael Berg [Mon, 14 Oct 2019 17:39:32 +0000 (17:39 +0000)]
Add FMF to vector ops for phi

Summary: Small amendment to handle vector cases for D67564.

Reviewers: spatel, eli.friedman, hfinkel, cameron.mcinally, arsenm, jmolloy, bogner

Reviewed By: cameron.mcinally, bogner

Subscribers: llvm-commits, efriedma, reames, bogner, wdng

Differential Revision: https://reviews.llvm.org/D68748

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374794 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoReapply: [llvm-size] Tidy up error messages (PR42970)
Jordan Rupprecht [Mon, 14 Oct 2019 17:29:15 +0000 (17:29 +0000)]
Reapply: [llvm-size] Tidy up error messages (PR42970)

Clean up some formatting inconsistencies in the error messages and correctly exit with non-zero in all error cases.

Originally submitted as r374771 and then reverted as r374780, this patch fixes the libObject test case in Object/macho-invalid.test.

Patch by Alex Cameron

Differential Revision: https://reviews.llvm.org/D68906

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374793 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[llvm-profdata] Weaken "malformed-ptr-to-counter-array.test" to appease arm bots
Vedant Kumar [Mon, 14 Oct 2019 17:20:22 +0000 (17:20 +0000)]
[llvm-profdata] Weaken "malformed-ptr-to-counter-array.test" to appease arm bots

There are a number of arm bots failing after r374617 landed, and I'm not
sure why. It looks a bit like the error message llvm-profdata is
expected to print to stderr isn't flushed.

Weaken the test in an attempt to appease the arm bots: if this doesn't
work, that means that llvm-profdata is actually *not failing*, and that
will be a clear indication that some logic error is actually happening.

http://lab.llvm.org:8011/builders/clang-cmake-armv7-global-isel/builds/5604/

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374792 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[NVPTX] Restructure shfl intrinsics and add variants that return a predicate.
Artem Belevich [Mon, 14 Oct 2019 16:53:34 +0000 (16:53 +0000)]
[NVPTX] Restructure shfl intrinsics and add variants that return a predicate.

Also, amend constraints for non-sync variants that are no longer
available on sm_70+ with PTX6.4+.

Differential Revision: https://reviews.llvm.org/D68892

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374790 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoBitsInit::resolveReferences - silence static analyzer null dereference warning. NFCI.
Simon Pilgrim [Mon, 14 Oct 2019 16:46:21 +0000 (16:46 +0000)]
BitsInit::resolveReferences - silence static analyzer null dereference warning. NFCI.

The static analyzer is warning about a potential null dereference; add an assert to check that the loop has set the cached pointer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374789 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoXCOFFObjectWriter - silence static analyzer dyn_cast<> null dereference warning....
Simon Pilgrim [Mon, 14 Oct 2019 16:46:11 +0000 (16:46 +0000)]
XCOFFObjectWriter - silence static analyzer dyn_cast<> null dereference warning. NFCI.

The static analyzer is warning about a potential null dereference, but we should be able to use cast<> directly and if not assert will fire for us.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374788 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[CostModel][X86] Add CTLZ scalar costs
Simon Pilgrim [Mon, 14 Oct 2019 16:30:17 +0000 (16:30 +0000)]
[CostModel][X86] Add CTLZ scalar costs

Add specific scalar costs for CTLZ instructions. We can't discriminate between CTLZ and CTLZ_ZERO_UNDEF, so we have to assume the worst. Given how BSR is often a microcoded nightmare on some older targets, we might still be underestimating it.

For targets supporting LZCNT (Intel Haswell+ or AMD Fam10+), we provide overrides that assume 1cy costs.
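
To make the CTLZ versus CTLZ_ZERO_UNDEF distinction concrete, here is a small Python model. The function names are hypothetical, and the BSR-based variant only models the arithmetic (BSR yields the index of the highest set bit and has no defined result for a zero input); it says nothing about the actual instruction sequences or their costs.

```python
def ctlz(value: int, bits: int = 32) -> int:
    """Count leading zeros; a zero input yields `bits` (plain CTLZ semantics)."""
    value &= (1 << bits) - 1
    return bits - value.bit_length()

def ctlz_via_bsr(value: int, bits: int = 32) -> int:
    """Model of a BSR-style lowering, where zero needs explicit handling."""
    value &= (1 << bits) - 1
    if value == 0:
        # CTLZ_ZERO_UNDEF could skip this check; plain CTLZ cannot.
        return bits
    bsr = value.bit_length() - 1   # index of the highest set bit
    return (bits - 1) - bsr

for v in (0, 1, 0x80000000):
    print(v, ctlz(v), ctlz_via_bsr(v))
```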

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374786 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoReapply r374743 with a fix for the ocaml binding
Joerg Sonnenberger [Mon, 14 Oct 2019 16:15:14 +0000 (16:15 +0000)]
Reapply r374743 with a fix for the ocaml binding

Add a pass to lower is.constant and objectsize intrinsics

This pass lowers is.constant and objectsize intrinsics not simplified by
earlier constant folding, i.e. if the object given is not constant or if
not using the optimized pass chain. The result is recursively simplified
and constant conditionals are pruned, so that dead blocks are removed
even for -O0. This allows inline asm blocks with operand constraints to
work all the time.

The new pass replaces the existing lowering in the codegen-prepare pass
and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert
on the intrinsics.

Differential Revision: https://reviews.llvm.org/D65280

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374784 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[x86] adjust select to sra tests; NFC
Sanjay Patel [Mon, 14 Oct 2019 15:53:55 +0000 (15:53 +0000)]
[x86] adjust select to sra tests; NFC

Avoid demanded-bits-based specializations (that may not be ideal,
but that's another problem).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374783 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[IRBuilder] Update IRBuilder::CreateFNeg(...) to return a UnaryOperator
Cameron McInally [Mon, 14 Oct 2019 15:35:01 +0000 (15:35 +0000)]
[IRBuilder] Update IRBuilder::CreateFNeg(...) to return a UnaryOperator

Reapply r374240 with fix for Ocaml test, namely Bindings/OCaml/core.ml.

Differential Revision: https://reviews.llvm.org/D61675

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374782 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[ARM] Selection for MVE VMOVN
David Green [Mon, 14 Oct 2019 15:19:33 +0000 (15:19 +0000)]
[ARM] Selection for MVE VMOVN

This adds both VMOVNt and VMOVNb instruction selection from the appropriate
shuffles. We detect shuffle masks of the form:
0, N, 2, N+2, 4, N+4, ...
or
0, N+1, 2, N+3, 4, N+5, ...
ISel will also try the opposite patterns, with inputs reversed. These are
selected to VMOVNt and VMOVNb respectively.
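
A hedged Python sketch of the mask check described above. It assumes `N` is the number of lanes in each input vector and that the mask has `N` entries; the function name `match_vmovn_mask` and the return values are hypothetical, and undef lanes and the reversed-input patterns are not modelled.

```python
def match_vmovn_mask(mask, n):
    """Return 1 or 2 for the two mask forms in the commit message, else None."""
    if len(mask) != n or n % 2 != 0:
        return None
    even_lanes_ok = all(mask[i] == i for i in range(0, n, 2))
    if not even_lanes_ok:
        return None
    # Form 1: 0, N, 2, N+2, 4, N+4, ...
    if all(mask[i + 1] == n + i for i in range(0, n, 2)):
        return 1
    # Form 2: 0, N+1, 2, N+3, 4, N+5, ...
    if all(mask[i + 1] == n + i + 1 for i in range(0, n, 2)):
        return 2
    return None

print(match_vmovn_mask([0, 8, 2, 10, 4, 12, 6, 14], 8))
print(match_vmovn_mask([0, 9, 2, 11, 4, 13, 6, 15], 8))
print(match_vmovn_mask([0, 1, 2, 3, 4, 5, 6, 7], 8))
```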

Differential Revision: https://reviews.llvm.org/D68283

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374781 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoRevert r374771 "[llvm-size] Tidy up error messages (PR42970)"
Nico Weber [Mon, 14 Oct 2019 14:44:26 +0000 (14:44 +0000)]
Revert r374771 "[llvm-size] Tidy up error messages (PR42970)"

This reverts commit 83e52f5e1150018329b8907bb014c77ac382d611.

Makes Object/macho-invalid.test fail everywhere, e.g. here:
http://lab.llvm.org:8011/builders/llvm-hexagon-elf/builds/23669/steps/test-llvm/logs/FAIL%3A%20LLVM%3A%3Amacho-invalid.test

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374780 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[x86] add tests for possible select to sra transforms; NFC
Sanjay Patel [Mon, 14 Oct 2019 14:43:06 +0000 (14:43 +0000)]
[x86] add tests for possible select to sra transforms; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374779 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[ARM] Add some VMOVN tests. NFC
David Green [Mon, 14 Oct 2019 14:29:26 +0000 (14:29 +0000)]
[ARM] Add some VMOVN tests. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374777 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[CostModel][X86] Add CTPOP scalar costs (PR43656)
Simon Pilgrim [Mon, 14 Oct 2019 14:07:43 +0000 (14:07 +0000)]
[CostModel][X86] Add CTPOP scalar costs (PR43656)

Add specific scalar costs for ctpop instructions. These are based on llvm-mca's SLM throughput numbers (the oldest model we have).

For targets supporting POPCNT, we provide overrides that assume 1cy costs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374775 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[Alignment][NFC] Move and type functions from MathExtras to Alignment
Guillaume Chatelet [Mon, 14 Oct 2019 13:14:34 +0000 (13:14 +0000)]
[Alignment][NFC] Move and type functions from MathExtras to Alignment

Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68942

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374773 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[AArch64] Stackframe accesses to SVE objects.
Sander de Smalen [Mon, 14 Oct 2019 13:11:34 +0000 (13:11 +0000)]
[AArch64] Stackframe accesses to SVE objects.

Materialize accesses to SVE frame objects from SP or FP, whichever is
available and beneficial.

This patch still assumes the objects are pre-allocated. The automatic
layout of SVE objects within the stackframe will be added in a separate
patch.

Reviewers: greened, cameron.mcinally, efriedma, rengolin, thegameg, rovka

Reviewed By: cameron.mcinally

Differential Revision: https://reviews.llvm.org/D67749

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374772 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[llvm-size] Tidy up error messages (PR42970)
Fangrui Song [Mon, 14 Oct 2019 12:51:47 +0000 (12:51 +0000)]
[llvm-size] Tidy up error messages (PR42970)

Clean up some formatting inconsistencies in the error messages and correctly exit with non-zero in all error cases.

Differential Revision: https://reviews.llvm.org/D68906
Patch by Alex Cameron

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374771 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[DebugInfo] Fix truncation of call site immediates
David Stenberg [Mon, 14 Oct 2019 12:49:58 +0000 (12:49 +0000)]
[DebugInfo] Fix truncation of call site immediates

Summary:
This addresses a bug in collectCallSiteParameters() where call site
immediates would be truncated from int64_t to unsigned.

This fixes PR43525.
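
The effect of such a truncation can be shown with a short Python model. It assumes `unsigned` is 32 bits wide, and the helper name `as_unsigned32` is hypothetical; it is not code from collectCallSiteParameters().

```python
def as_unsigned32(value: int) -> int:
    """Model converting a 64-bit integer to a 32-bit unsigned: keep the low 32 bits."""
    return value & 0xFFFFFFFF

# An immediate that needs more than 32 bits loses its high bits.
print(hex(as_unsigned32(0x1_0000_0001)))

# A negative immediate loses its sign and reads back as a large positive value.
print(as_unsigned32(-1))
```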

Reviewers: djtodoro, NikolaPrica, aprantl, vsk

Reviewed By: aprantl

Subscribers: hiraditya, llvm-commits

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D68869

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374770 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoRevert "Add a pass to lower is.constant and objectsize intrinsics"
Dmitri Gribenko [Mon, 14 Oct 2019 12:22:48 +0000 (12:22 +0000)]
Revert "Add a pass to lower is.constant and objectsize intrinsics"

This reverts commit r374743. It broke the build with Ocaml enabled:
http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/19218

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374768 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[AMDGPU] Come back patch for the 'Assign register class for cross block values accord...
Alexander Timofeev [Mon, 14 Oct 2019 12:01:10 +0000 (12:01 +0000)]
[AMDGPU] Come back patch for the 'Assign register class for cross block values according to the divergence.'

  Detailed description:

    After https://reviews.llvm.org/D59990 was submitted, several issues were discovered.
    Changes in common code were preserved, but the AMDGPU-specific part was reverted to keep the backend working correctly.

    Discovered issues were addressed in the following commits:

    https://reviews.llvm.org/D67662
    https://reviews.llvm.org/D67101
    https://reviews.llvm.org/D63953
    https://reviews.llvm.org/D63731

    This change brings back AMDGPU specific changes.

  Reviewed by: rampitec, arsenm

  Differential Revision: https://reviews.llvm.org/D68635

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374767 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoFixing typo in llvm/IR/Intrinsics.td
Victor Campos [Mon, 14 Oct 2019 11:12:23 +0000 (11:12 +0000)]
Fixing typo in llvm/IR/Intrinsics.td

Fixing typo in comment line.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374766 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[X86][BtVer2] Improved latency and throughput of float/vector loads and stores.
Andrea Di Biagio [Mon, 14 Oct 2019 11:12:18 +0000 (11:12 +0000)]
[X86][BtVer2] Improved latency and throughput of float/vector loads and stores.

This patch introduces the following changes to the btver2 scheduling model:

- The number of micro opcodes for YMM loads and stores is now 2 (it was
  incorrectly set to 1 for both aligned and misaligned loads/stores).

- Increased the number of AGU resource cycles for YMM loads and stores
  to 2cy (instead of 1cy).

- Removed JFPU01 and JFPX from the list of resources consumed by pure
  float/vector loads (no MMX).

I verified with llvm-exegesis that pure XMM/YMM loads are no-pipe. Those
are dispatched to the FPU but not really issued on JFPU01.

Differential Revision: https://reviews.llvm.org/D68871

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374765 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[NFC][TTI] Add Alignment for isLegalMasked[Load/Store]
Sam Parker [Mon, 14 Oct 2019 10:00:21 +0000 (10:00 +0000)]
[NFC][TTI] Add Alignment for isLegalMasked[Load/Store]

Add an extra parameter so the backend can take the alignment into
consideration.

Differential Revision: https://reviews.llvm.org/D68400

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374763 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoFix D68936
Guillaume Chatelet [Mon, 14 Oct 2019 09:31:00 +0000 (09:31 +0000)]
Fix D68936

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374761 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agobuild_llvm_package.bat: Run check-clang-tools and check-clangd tests.
Hans Wennborg [Mon, 14 Oct 2019 09:08:57 +0000 (09:08 +0000)]
build_llvm_package.bat: Run check-clang-tools and check-clangd tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374759 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[Alignment][NFC] Support compile time constants
Guillaume Chatelet [Mon, 14 Oct 2019 09:04:15 +0000 (09:04 +0000)]
[Alignment][NFC] Support compile time constants

Summary:
This patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68936

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374758 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[X86] Teach EmitTest to handle ISD::SSUBO/USUBO in order to use the Z flag from the...
Craig Topper [Mon, 14 Oct 2019 06:47:56 +0000 (06:47 +0000)]
[X86] Teach EmitTest to handle ISD::SSUBO/USUBO in order to use the Z flag from the subtract directly during isel.

This prevents isel from emitting a TEST instruction that
optimizeCompareInstr will need to remove later.

In some of the modified tests, the SUB gets duplicated due to
the flags being needed in two places and being clobbered in
between. optimizeCompareInstr was able to optimize away the TEST
that was using the result of one of them, but optimizeCompareInstr
doesn't know to turn SUB into CMP after removing the TEST. It
only knows how to turn SUB into CMP if the result was already
dead.

With this change the TEST never exists, so optimizeCompareInstr
doesn't have to remove it. Then it can just turn the SUB into
CMP immediately.
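
The reasoning relies on the subtract already producing the Z flag that a later TEST of its result would compute. A minimal Python model of that equivalence follows; the function names are hypothetical and the flags are simplified to zero and unsigned borrow only.

```python
def sub_flags(a: int, b: int, bits: int = 32):
    """Model a SUB: return (result, zero flag, unsigned borrow)."""
    mask = (1 << bits) - 1
    result = (a - b) & mask
    zero = result == 0
    borrow = (a & mask) < (b & mask)   # the overflow bit of an unsigned subtract
    return result, zero, borrow

def z_from_test(value: int, bits: int = 32) -> bool:
    """Z flag that a separate TEST of `value` against itself would produce."""
    return (value & ((1 << bits) - 1)) == 0

# The Z flag of the SUB always matches a TEST of its result,
# so the extra TEST carries no new information.
for a, b in [(5, 5), (3, 5), (0, 0), (7, 2)]:
    result, zero, _ = sub_flags(a, b)
    print(a, b, zero, z_from_test(result))
```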

Fixes PR43649.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374755 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[X86] Autogenerate complete checks. NFC
Craig Topper [Mon, 14 Oct 2019 01:41:04 +0000 (01:41 +0000)]
[X86] Autogenerate complete checks. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374748 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[NewGVN] Use m_Br to simplify code a bit. (NFC)
Florian Hahn [Sun, 13 Oct 2019 23:34:13 +0000 (23:34 +0000)]
[NewGVN] Use m_Br to simplify code a bit. (NFC)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374744 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoAdd a pass to lower is.constant and objectsize intrinsics
Joerg Sonnenberger [Sun, 13 Oct 2019 23:00:15 +0000 (23:00 +0000)]
Add a pass to lower is.constant and objectsize intrinsics

This pass lowers is.constant and objectsize intrinsics not simplified by
earlier constant folding, i.e. if the object given is not constant or if
not using the optimized pass chain. The result is recursively simplified
and constant conditionals are pruned, so that dead blocks are removed
even for -O0. This allows inline asm blocks with operand constraints to
work all the time.

The new pass replaces the existing lowering in the codegen-prepare pass
and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert
on the intrinsics.

Differential Revision: https://reviews.llvm.org/D65280

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374743 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agomerge-request.sh: Update 9.0 metabug for 9.0.1
Simon Atanasyan [Sun, 13 Oct 2019 22:10:06 +0000 (22:10 +0000)]
merge-request.sh: Update 9.0 metabug for 9.0.1

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374741 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[Attributor] Shortcut no-return through will-return
Johannes Doerfert [Sun, 13 Oct 2019 21:25:53 +0000 (21:25 +0000)]
[Attributor] Shortcut no-return through will-return

No-return and will-return are exclusive. Assuming the latter is more
prominent, we can avoid updates of the former unless will-return is not
known for sure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374739 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[Attributor][FIX] NullPointerIsDefined needs the pointer AS (AANonNull)
Johannes Doerfert [Sun, 13 Oct 2019 20:48:26 +0000 (20:48 +0000)]
[Attributor][FIX] NullPointerIsDefined needs the pointer AS (AANonNull)

Also includes a shortcut via AADereferenceable if possible.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374737 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[Attributor][MemBehavior] Fallback to the function state for arguments
Johannes Doerfert [Sun, 13 Oct 2019 20:47:16 +0000 (20:47 +0000)]
[Attributor][MemBehavior] Fallback to the function state for arguments

Even if an argument is captured, we cannot have an effect the function
does not have. This is fine except for the special case of `inalloca` as
it does not behave by the rules.

TODO: Maybe the special rule for `inalloca` is wrong after all.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374736 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[Attributor][FIX] Use check prefix that is actually tested
Johannes Doerfert [Sun, 13 Oct 2019 20:40:10 +0000 (20:40 +0000)]
[Attributor][FIX] Use check prefix that is actually tested

Summary:
This changes "CHECK" check lines to "ATTRIBUTOR" check lines where
necessary and also fixes the now exposed, mostly minor, problems.

Reviewers: sstefan1, uenoku

Subscribers: hiraditya, bollu, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68929

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374735 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[NFC][InstCombine] Some preparatory cleanup in dropRedundantMaskingOfLeftShiftInput()
Roman Lebedev [Sun, 13 Oct 2019 20:15:00 +0000 (20:15 +0000)]
[NFC][InstCombine] Some preparatory cleanup in dropRedundantMaskingOfLeftShiftInput()

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374734 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[Docs] Moves Control Flow Document to User Guides
DeForest Richards [Sun, 13 Oct 2019 20:05:22 +0000 (20:05 +0000)]
[Docs] Moves Control Flow Document to User Guides

Moves Control Flow document from Reference docs page to User guides page.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374733 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[X86] getTargetShuffleInputs - Control KnownUndef mask element resolution as well...
Simon Pilgrim [Sun, 13 Oct 2019 19:35:35 +0000 (19:35 +0000)]
[X86] getTargetShuffleInputs - Control KnownUndef mask element resolution as well as KnownZero.

We were already controlling whether the KnownZero elements were being written to the target mask; this extends it to the KnownUndef elements as well, so we can prevent the target shuffle mask from being manipulated at all.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374732 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[X86] Enable use of avx512 saturating truncate instructions in more cases.
Craig Topper [Sun, 13 Oct 2019 19:07:28 +0000 (19:07 +0000)]
[X86] Enable use of avx512 saturating truncate instructions in more cases.

This enables use of the saturating truncate instructions when the
result type is less than 128 bits. It also enables the use of
saturating truncate instructions on KNL when the input is less
than 512 bits. We can do this by widening the input and then
extracting the result.
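
The "widen, then extract" idea can be sketched lane by lane in Python. The function names and the zero padding are assumptions for illustration; only the signed variant is shown, and nothing here reflects the actual instruction selection code.

```python
def sat_trunc_signed(value: int, bits: int) -> int:
    """Clamp `value` to the signed range representable in `bits` bits."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))

def sat_trunc_vector(lanes, bits, native_lanes):
    """Saturating-truncate a short vector by using a wider native operation."""
    padded = lanes + [0] * (native_lanes - len(lanes))    # widen the input
    result = [sat_trunc_signed(x, bits) for x in padded]  # full-width operation
    return result[:len(lanes)]                            # extract the result

# Hypothetical case: 4 lanes handled by a 16-lane operation, truncating to 8 bits.
print(sat_trunc_vector([300, -300, 5, -128], bits=8, native_lanes=16))
```

The padding lanes never influence the extracted lanes, which is why widening is safe for a lane-wise operation like this one.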

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374731 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[ConstantFold] fix inconsistent handling of extractelement with undef index (PR42689)
Sanjay Patel [Sun, 13 Oct 2019 17:34:08 +0000 (17:34 +0000)]
[ConstantFold] fix inconsistent handling of extractelement with undef index (PR42689)

Any constant other than zero was already folded to undef if the index is undef.
https://bugs.llvm.org/show_bug.cgi?id=42689

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374729 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[InstCombine] don't assume 'inbounds' for bitcast deref or null pointer in non-defaul...
Sanjay Patel [Sun, 13 Oct 2019 17:19:08 +0000 (17:19 +0000)]
[InstCombine] don't assume 'inbounds' for bitcast deref or null pointer in non-default address space

Follow-up to D68244 to account for a corner case discussed in:
https://bugs.llvm.org/show_bug.cgi?id=43501

Add one more restriction: if the pointer is deref-or-null and in a non-default
(non-zero) address space, we can't assume inbounds.

Differential Revision: https://reviews.llvm.org/D68706

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374728 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[NFC][InstCombine] More test for "sign bit test via shifts" pattern (PR43595)
Roman Lebedev [Sun, 13 Oct 2019 17:11:16 +0000 (17:11 +0000)]
[NFC][InstCombine] More test for "sign bit test via shifts" pattern (PR43595)

While that pattern is indirectly handled via
reassociateShiftAmtsOfTwoSameDirectionShifts(),
that incurs a one-use restriction on truncation,
which is pointless since we know that we'll produce a single instruction.

Additionally, *if* we are only looking for sign bit,
we don't need shifts to be identical,
which isn't the case in general,
and is the blocker for me in the bug in question:

https://bugs.llvm.org/show_bug.cgi?id=43595

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374726 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[X86] SimplifyMultipleUseDemandedBitsForTargetNode - use getTargetShuffleInputs with...
Simon Pilgrim [Sun, 13 Oct 2019 17:03:11 +0000 (17:03 +0000)]
[X86] SimplifyMultipleUseDemandedBitsForTargetNode - use getTargetShuffleInputs with KnownUndef/Zero results.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374725 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[X86] getTargetShuffleInputs - add KnownUndef/Zero output support
Simon Pilgrim [Sun, 13 Oct 2019 17:03:02 +0000 (17:03 +0000)]
[X86] getTargetShuffleInputs - add KnownUndef/Zero output support

Adjust SimplifyDemandedVectorEltsForTargetNode to use the known elts masks instead of recomputing it locally.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374724 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agogn build: (manually) merge r374720
Nico Weber [Sun, 13 Oct 2019 15:25:13 +0000 (15:25 +0000)]
gn build: (manually) merge r374720

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374721 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[X86][AVX] Add i686 avx splat tests
Simon Pilgrim [Sun, 13 Oct 2019 13:18:07 +0000 (13:18 +0000)]
[X86][AVX] Add i686 avx splat tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374719 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoIRTranslator - silence static analyzer null dereference warnings. NFCI.
Simon Pilgrim [Sun, 13 Oct 2019 11:29:35 +0000 (11:29 +0000)]
IRTranslator - silence static analyzer null dereference warnings. NFCI.

The CmpInst::getType() calls can be replaced by just using User::getType() that it was dyn_cast from, and we then need to assert that any default predicate cases came from the CmpInst.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374716 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agogn build: Merge r374707
GN Sync Bot [Sun, 13 Oct 2019 08:33:14 +0000 (08:33 +0000)]
gn build: Merge r374707

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374708 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[X86] Add a one use check on the setcc to the min/max canonicalization code in combin...
Craig Topper [Sun, 13 Oct 2019 06:48:05 +0000 (06:48 +0000)]
[X86] Add a one use check on the setcc to the min/max canonicalization code in combineSelect.

This seems to improve std::midpoint code where we have a min and
a max with the same condition. If we split the setcc we can end
up with two compares if one of the operands is a constant,
since we aggressively canonicalize compares with constants.
For non-constants it can interfere with our ability to share
control flow if we need to expand cmovs into control flow.

I'm also not sure I understand this min/max canonicalization code.
The motivating case talks about comparing with 0. But we don't
check for 0 explicitly.

Removes one instruction from the codegen for PR43658.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374706 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[X86] Enable v4i32->v4i16 and v8i16->v8i8 saturating truncates to use pack instructio...
Craig Topper [Sun, 13 Oct 2019 05:47:47 +0000 (05:47 +0000)]
[X86] Enable v4i32->v4i16 and v8i16->v8i8 saturating truncates to use pack instructions with avx512.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374705 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[X86] Add v2i64->v2i32/v2i16/v2i8 test cases to the trunc packus/ssat/usat tests...
Craig Topper [Sun, 13 Oct 2019 05:47:42 +0000 (05:47 +0000)]
[X86] Add v2i64->v2i32/v2i16/v2i8 test cases to the trunc packus/ssat/usat tests. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374704 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[Attributor][FIX] Avoid splitting blocks if possible
Johannes Doerfert [Sun, 13 Oct 2019 05:27:09 +0000 (05:27 +0000)]
[Attributor][FIX] Avoid splitting blocks if possible

Before, we eagerly split blocks even if it was not necessary, e.g., they
had a single unreachable instruction and only a single predecessor.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374703 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[Attributor][FIX] Remove leftover, now unused, variable
Johannes Doerfert [Sun, 13 Oct 2019 05:19:17 +0000 (05:19 +0000)]
[Attributor][FIX] Remove leftover, now unused, variable

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374702 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[Attributor] Remove unused verification flag
Johannes Doerfert [Sun, 13 Oct 2019 05:07:00 +0000 (05:07 +0000)]
[Attributor] Remove unused verification flag

We use the verify max iteration now which is more reliable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374701 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[Attributor][NFC] Expose call site traversal without QueryingAA
Johannes Doerfert [Sun, 13 Oct 2019 04:16:02 +0000 (04:16 +0000)]
[Attributor][NFC] Expose call site traversal without QueryingAA

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374700 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[Attributor][FIX] Ensure h2s doesn't trigger on escaped pointers
Johannes Doerfert [Sun, 13 Oct 2019 04:14:15 +0000 (04:14 +0000)]
[Attributor][FIX] Ensure h2s doesn't trigger on escaped pointers

We do not yet perform h2s because we know something is freed; we do
it because we know the pointer does not escape. Storing the pointer
allows it to escape, so we have to prevent that.
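
To make the escape condition concrete, here is a small C++ sketch. The names `g_saved`, `local_only`, and `stored_away` are invented for illustration; this is not a test from the patch. In the first function the allocation never leaves the function, so a stack slot would be safe in principle. In the second, the store lets the pointer outlive the call, so converting it to a stack slot would leave a dangling pointer:

```cpp
#include <cstdlib>

int *g_saved = nullptr;  // hypothetical global used to let the pointer escape

// The pointer is only used locally and freed before returning: no escape.
int local_only() {
  int *p = static_cast<int *>(std::malloc(sizeof(int)));
  if (!p) return -1;
  *p = 42;
  const int v = *p;
  std::free(p);
  return v;
}

// The pointer is stored to a global, so it escapes and must stay on the heap.
void stored_away() {
  int *p = static_cast<int *>(std::malloc(sizeof(int)));
  if (!p) return;
  *p = 7;
  g_saved = p;  // the store lets the pointer outlive this call
}
```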

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374699 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[Attributor][FIX] Do not apply h2s for arbitrary mallocs
Johannes Doerfert [Sun, 13 Oct 2019 03:54:08 +0000 (03:54 +0000)]
[Attributor][FIX] Do not apply h2s for arbitrary mallocs

H2S previously applied to mallocs of non-constant sizes if the uses were OK.
This is now forbidden through reordering of the "good" and "bad" cases in the
conditional.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374698 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[Attributor][FIX] Add missing function declaration in test case
Johannes Doerfert [Sun, 13 Oct 2019 02:42:09 +0000 (02:42 +0000)]
[Attributor][FIX] Add missing function declaration in test case

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374696 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[Attributor][FIX] Avoid modifying naked/optnone functions
Johannes Doerfert [Sun, 13 Oct 2019 02:24:02 +0000 (02:24 +0000)]
[Attributor][FIX] Avoid modifying naked/optnone functions

The check for naked/optnone was insufficient for different reasons. We
now check before we initialize an abstract attribute and we do it for
all abstract attributes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374694 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[SROA] Reuse existing lifetime markers if possible
Johannes Doerfert [Sun, 13 Oct 2019 02:21:23 +0000 (02:21 +0000)]
[SROA] Reuse existing lifetime markers if possible

Summary:
If the underlying alloca did not change, we do not necessarily need new
lifetime markers. This patch adds a check and reuses the old ones if
possible.

Reviewers: reames, ssarda, t.p.northover, hfinkel

Subscribers: hiraditya, bollu, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68900

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374692 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoRevert r374663 "[clang-format] Proposal for clang-format to give compiler style warnings"
Nico Weber [Sat, 12 Oct 2019 22:58:34 +0000 (22:58 +0000)]
Revert r374663 "[clang-format] Proposal for clang-format to give compiler style warnings"

The test fails on macOS and looks a bit wrong, see comments on the review.

Also revert follow-up r374686.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374688 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agogn build: (manually) merge r374663
Nico Weber [Sat, 12 Oct 2019 22:24:56 +0000 (22:24 +0000)]
gn build: (manually) merge r374663

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374686 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoRevert r374648: "Reland r374388: [lit] Make internal diff work in pipelines"
Joel E. Denny [Sat, 12 Oct 2019 18:52:46 +0000 (18:52 +0000)]
Revert r374648: "Reland r374388: [lit] Make internal diff work in pipelines"

This series of patches still breaks a Windows bot.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374683 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoRevert r374649: "Reland r374389: [lit] Clean up internal diff's encoding handling"
Joel E. Denny [Sat, 12 Oct 2019 18:52:31 +0000 (18:52 +0000)]
Revert r374649: "Reland r374389: [lit] Clean up internal diff's encoding handling"

This series of patches still breaks a Windows bot.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374682 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoRevert r374650: "Reland r374390: [lit] Extend internal diff to support `-` argument"
Joel E. Denny [Sat, 12 Oct 2019 18:52:18 +0000 (18:52 +0000)]
Revert r374650: "Reland r374390: [lit] Extend internal diff to support `-` argument"

This series of patches still breaks a Windows bot.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374681 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoRevert 374651: "Reland r374392: [lit] Extend internal diff to support -U"
Joel E. Denny [Sat, 12 Oct 2019 18:52:05 +0000 (18:52 +0000)]
Revert 374651: "Reland r374392: [lit] Extend internal diff to support -U"

This series of patches still breaks a Windows bot.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374680 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoRevert r374652: "[lit] Fix internal diff's --strip-trailing-cr and use it"
Joel E. Denny [Sat, 12 Oct 2019 18:51:51 +0000 (18:51 +0000)]
Revert r374652: "[lit] Fix internal diff's --strip-trailing-cr and use it"

This series of patches still breaks a Windows bot.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374679 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoRevert r374653: "[lit] Fix a few oversights in r374651 that broke some bots"
Joel E. Denny [Sat, 12 Oct 2019 18:51:34 +0000 (18:51 +0000)]
Revert r374653: "[lit] Fix a few oversights in r374651 that broke some bots"

This series of patches still breaks a Windows bot.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374678 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoRevert r374665: "[lit] Try yet again to fix new tests that fail on Windows bots"
Joel E. Denny [Sat, 12 Oct 2019 18:51:18 +0000 (18:51 +0000)]
Revert r374665: "[lit] Try yet again to fix new tests that fail on Windows bots"

This series of patches still breaks a Windows bot.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374677 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoRevert r374666: "[lit] Adjust error handling for decode introduced by r374665"
Joel E. Denny [Sat, 12 Oct 2019 18:51:08 +0000 (18:51 +0000)]
Revert r374666: "[lit] Adjust error handling for decode introduced by r374665"

This series of patches still breaks a Windows bot.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374676 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoRevert r374671: "[lit] Try errors="ignore" for decode introduced by r374665"
Joel E. Denny [Sat, 12 Oct 2019 18:50:57 +0000 (18:50 +0000)]
Revert r374671: "[lit] Try errors="ignore" for decode introduced by r374665"

This series of patches still breaks a Windows bot.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374675 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[X86] scaleShuffleMask - use size_t Scale to avoid overflow warnings
Simon Pilgrim [Sat, 12 Oct 2019 18:33:47 +0000 (18:33 +0000)]
[X86] scaleShuffleMask - use size_t Scale to avoid overflow warnings

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374674 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoSymbolRecord - consistently use explicit for single operand constructors
Simon Pilgrim [Sat, 12 Oct 2019 17:55:09 +0000 (17:55 +0000)]
SymbolRecord - consistently use explicit for single operand constructors

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374673 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoSymbolRecord - fix uninitialized variable warnings. NFCI.
Simon Pilgrim [Sat, 12 Oct 2019 17:55:01 +0000 (17:55 +0000)]
SymbolRecord - fix uninitialized variable warnings. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374672 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[lit] Try errors="ignore" for decode introduced by r374665
Joel E. Denny [Sat, 12 Oct 2019 17:23:25 +0000 (17:23 +0000)]
[lit] Try errors="ignore" for decode introduced by r374665

Still trying to fix the same error as in r374666.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374671 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[NFC][LoopIdiom] Adjust FIXME to be self-explanatory
Roman Lebedev [Sat, 12 Oct 2019 16:48:16 +0000 (16:48 +0000)]
[NFC][LoopIdiom] Adjust FIXME to be self-explanatory

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374670 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoReplace for-loop of SmallVector::push_back with SmallVector::append. NFCI.
Simon Pilgrim [Sat, 12 Oct 2019 16:37:02 +0000 (16:37 +0000)]
Replace for-loop of SmallVector::push_back with SmallVector::append. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374669 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoFix cppcheck shadow variable name warnings. NFCI.
Simon Pilgrim [Sat, 12 Oct 2019 16:36:52 +0000 (16:36 +0000)]
Fix cppcheck shadow variable name warnings. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374668 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[X86] Use any_of/all_of patterns in shuffle mask pattern recognisers. NFCI.
Simon Pilgrim [Sat, 12 Oct 2019 16:36:44 +0000 (16:36 +0000)]
[X86] Use any_of/all_of patterns in shuffle mask pattern recognisers. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374667 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[lit] Adjust error handling for decode introduced by r374665
Joel E. Denny [Sat, 12 Oct 2019 16:25:46 +0000 (16:25 +0000)]
[lit] Adjust error handling for decode introduced by r374665

On that decode, Windows bots fail with:

```
UnicodeEncodeError: 'ascii' codec can't encode characters in position 7-8: ordinal not in range(128)
```

That's the same error as before r374665 except it's now at the decode
before the write to stdout.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374666 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[lit] Try yet again to fix new tests that fail on Windows bots
Joel E. Denny [Sat, 12 Oct 2019 16:00:35 +0000 (16:00 +0000)]
[lit] Try yet again to fix new tests that fail on Windows bots

I seem to have misread the bot logs on my last attempt.  When lit's
internal diff runs on Windows under Python 2.7, it's text diffs not
binary diffs that need decoding to avoid this error when writing the
diff to stdout:

```
UnicodeEncodeError: 'ascii' codec can't encode characters in position 7-8: ordinal not in range(128)
```

There is no `decode` attribute in this case under Python 3.6.8 under
Ubuntu, so this patch checks for the `decode` attribute before using
it here.  Hopefully nothing else is needed when `decode` isn't
available.

It might take a couple more attempts to figure out what error
handling, if any, is needed for this decoding.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374665 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoRevert r374657: "[lit] Try again to fix new tests that fail on Windows bots"
Joel E. Denny [Sat, 12 Oct 2019 16:00:25 +0000 (16:00 +0000)]
Revert r374657: "[lit] Try again to fix new tests that fail on Windows bots"

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374664 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[LoopIdiomRecognize] Recommit: BCmp loop idiom recognition
Roman Lebedev [Sat, 12 Oct 2019 15:35:32 +0000 (15:35 +0000)]
[LoopIdiomRecognize] Recommit: BCmp loop idiom recognition

Summary:
This is a recommit; this originally landed in rL370454 but was
subsequently reverted in rL370788 due to
https://bugs.llvm.org/show_bug.cgi?id=43206
The reduced testcase was added to bcmp-negative-tests.ll
as @pr43206_different_loops: we must ensure that the SCEVs
we got are both for the same loop we are currently investigating.

Original commit message:

@mclow.lists brought this issue up on IRC.
It is a reasonably common problem to compare two values for equality.
Those may be just some integers, strings or arrays of integers.

In C, there are the `memcmp()` and `bcmp()` functions.
In C++, there exists `std::equal()` algorithm.
One can also write that function manually.

libstdc++'s `std::equal()` is specialized to directly call `memcmp()` for
various types, but not `std::byte` from C++2a. https://godbolt.org/z/mx2ejJ

libc++ does not do anything like that; it simply relies on plain C++
`operator==()`. https://godbolt.org/z/er0Zwf (GOOD!)

So there is likely a performance opportunity here.
Let's compare the performance of a naive `std::equal()` (no `memcmp()`) with one that
uses `memcmp()` (in this case, compiled with a modified compiler). {F8768213}

```
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <iterator>
#include <limits>
#include <random>
#include <type_traits>
#include <utility>
#include <vector>

#include "benchmark/benchmark.h"

template <class T>
bool equal(T* a, T* a_end, T* b) noexcept {
  for (; a != a_end; ++a, ++b) {
    if (*a != *b) return false;
  }
  return true;
}

template <typename T>
std::vector<T> getVectorOfRandomNumbers(size_t count) {
  std::random_device rd;
  std::mt19937 gen(rd());
  std::uniform_int_distribution<T> dis(std::numeric_limits<T>::min(),
                                       std::numeric_limits<T>::max());
  std::vector<T> v;
  v.reserve(count);
  std::generate_n(std::back_inserter(v), count,
                  [&dis, &gen]() { return dis(gen); });
  assert(v.size() == count);
  return v;
}

struct Identical {
  template <typename T>
  static std::pair<std::vector<T>, std::vector<T>> Gen(size_t count) {
    auto Tmp = getVectorOfRandomNumbers<T>(count);
    return std::make_pair(Tmp, std::move(Tmp));
  }
};

struct InequalHalfway {
  template <typename T>
  static std::pair<std::vector<T>, std::vector<T>> Gen(size_t count) {
    auto V0 = getVectorOfRandomNumbers<T>(count);
    auto V1 = V0;
    V1[V1.size() / size_t(2)]++;  // just change the value.
    return std::make_pair(std::move(V0), std::move(V1));
  }
};

template <class T, class Gen>
void BM_bcmp(benchmark::State& state) {
  const size_t Length = state.range(0);

  const std::pair<std::vector<T>, std::vector<T>> Data =
      Gen::template Gen<T>(Length);
  const std::vector<T>& a = Data.first;
  const std::vector<T>& b = Data.second;
  assert(a.size() == Length && b.size() == a.size());

  benchmark::ClobberMemory();
  benchmark::DoNotOptimize(a);
  benchmark::DoNotOptimize(a.data());
  benchmark::DoNotOptimize(b);
  benchmark::DoNotOptimize(b.data());

  for (auto _ : state) {
    const bool is_equal = equal(a.data(), a.data() + a.size(), b.data());
    benchmark::DoNotOptimize(is_equal);
  }
  state.SetComplexityN(Length);
  state.counters["eltcnt"] =
      benchmark::Counter(Length, benchmark::Counter::kIsIterationInvariant);
  state.counters["eltcnt/sec"] =
      benchmark::Counter(Length, benchmark::Counter::kIsIterationInvariantRate);
  const size_t BytesRead = 2 * sizeof(T) * Length;
  state.counters["bytes_read/iteration"] =
      benchmark::Counter(BytesRead, benchmark::Counter::kDefaults,
                         benchmark::Counter::OneK::kIs1024);
  state.counters["bytes_read/sec"] = benchmark::Counter(
      BytesRead, benchmark::Counter::kIsIterationInvariantRate,
      benchmark::Counter::OneK::kIs1024);
}

template <typename T>
static void CustomArguments(benchmark::internal::Benchmark* b) {
  const size_t L2SizeBytes = []() {
    for (const benchmark::CPUInfo::CacheInfo& I :
         benchmark::CPUInfo::Get().caches) {
      if (I.level == 2) return I.size;
    }
    return 0;
  }();
  // What is the largest range we can check to always fit within given L2 cache?
  const size_t MaxLen = L2SizeBytes / /*total bufs*/ 2 /
                        /*maximal elt size*/ sizeof(T) / /*safety margin*/ 2;
  b->RangeMultiplier(2)->Range(1, MaxLen)->Complexity(benchmark::oN);
}

BENCHMARK_TEMPLATE(BM_bcmp, uint8_t, Identical)
    ->Apply(CustomArguments<uint8_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint16_t, Identical)
    ->Apply(CustomArguments<uint16_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint32_t, Identical)
    ->Apply(CustomArguments<uint32_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint64_t, Identical)
    ->Apply(CustomArguments<uint64_t>);

BENCHMARK_TEMPLATE(BM_bcmp, uint8_t, InequalHalfway)
    ->Apply(CustomArguments<uint8_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint16_t, InequalHalfway)
    ->Apply(CustomArguments<uint16_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint32_t, InequalHalfway)
    ->Apply(CustomArguments<uint32_t>);
BENCHMARK_TEMPLATE(BM_bcmp, uint64_t, InequalHalfway)
    ->Apply(CustomArguments<uint64_t>);
```
{F8768210}
```
$ ~/src/googlebenchmark/tools/compare.py --no-utest benchmarks build-{old,new}/test/llvm-bcmp-bench
RUNNING: build-old/test/llvm-bcmp-bench --benchmark_out=/tmp/tmpb6PEUx
2019-04-25 21:17:11
Running build-old/test/llvm-bcmp-bench
Run on (8 X 4000 MHz CPU s)
CPU Caches:
  L1 Data 16K (x8)
  L1 Instruction 64K (x4)
  L2 Unified 2048K (x4)
  L3 Unified 8192K (x1)
Load Average: 0.65, 3.90, 4.14
---------------------------------------------------------------------------------------------------
Benchmark                                         Time             CPU   Iterations UserCounters...
---------------------------------------------------------------------------------------------------
<...>
BM_bcmp<uint8_t, Identical>/512000           432131 ns       432101 ns         1613 bytes_read/iteration=1000k bytes_read/sec=2.20706G/s eltcnt=825.856M eltcnt/sec=1.18491G/s
BM_bcmp<uint8_t, Identical>_BigO               0.86 N          0.86 N
BM_bcmp<uint8_t, Identical>_RMS                   8 %             8 %
<...>
BM_bcmp<uint16_t, Identical>/256000          161408 ns       161409 ns         4027 bytes_read/iteration=1000k bytes_read/sec=5.90843G/s eltcnt=1030.91M eltcnt/sec=1.58603G/s
BM_bcmp<uint16_t, Identical>_BigO              0.67 N          0.67 N
BM_bcmp<uint16_t, Identical>_RMS                 25 %            25 %
<...>
BM_bcmp<uint32_t, Identical>/128000           81497 ns        81488 ns         8415 bytes_read/iteration=1000k bytes_read/sec=11.7032G/s eltcnt=1077.12M eltcnt/sec=1.57078G/s
BM_bcmp<uint32_t, Identical>_BigO              0.71 N          0.71 N
BM_bcmp<uint32_t, Identical>_RMS                 42 %            42 %
<...>
BM_bcmp<uint64_t, Identical>/64000            50138 ns        50138 ns        10909 bytes_read/iteration=1000k bytes_read/sec=19.0209G/s eltcnt=698.176M eltcnt/sec=1.27647G/s
BM_bcmp<uint64_t, Identical>_BigO              0.84 N          0.84 N
BM_bcmp<uint64_t, Identical>_RMS                 27 %            27 %
<...>
BM_bcmp<uint8_t, InequalHalfway>/512000      192405 ns       192392 ns         3638 bytes_read/iteration=1000k bytes_read/sec=4.95694G/s eltcnt=1.86266G eltcnt/sec=2.66124G/s
BM_bcmp<uint8_t, InequalHalfway>_BigO          0.38 N          0.38 N
BM_bcmp<uint8_t, InequalHalfway>_RMS              3 %             3 %
<...>
BM_bcmp<uint16_t, InequalHalfway>/256000     127858 ns       127860 ns         5477 bytes_read/iteration=1000k bytes_read/sec=7.45873G/s eltcnt=1.40211G eltcnt/sec=2.00219G/s
BM_bcmp<uint16_t, InequalHalfway>_BigO         0.50 N          0.50 N
BM_bcmp<uint16_t, InequalHalfway>_RMS             0 %             0 %
<...>
BM_bcmp<uint32_t, InequalHalfway>/128000      49140 ns        49140 ns        14281 bytes_read/iteration=1000k bytes_read/sec=19.4072G/s eltcnt=1.82797G eltcnt/sec=2.60478G/s
BM_bcmp<uint32_t, InequalHalfway>_BigO         0.40 N          0.40 N
BM_bcmp<uint32_t, InequalHalfway>_RMS            18 %            18 %
<...>
BM_bcmp<uint64_t, InequalHalfway>/64000       32101 ns        32099 ns        21786 bytes_read/iteration=1000k bytes_read/sec=29.7101G/s eltcnt=1.3943G eltcnt/sec=1.99381G/s
BM_bcmp<uint64_t, InequalHalfway>_BigO         0.50 N          0.50 N
BM_bcmp<uint64_t, InequalHalfway>_RMS             1 %             1 %
RUNNING: build-new/test/llvm-bcmp-bench --benchmark_out=/tmp/tmpQ46PP0
2019-04-25 21:19:29
Running build-new/test/llvm-bcmp-bench
Run on (8 X 4000 MHz CPU s)
CPU Caches:
  L1 Data 16K (x8)
  L1 Instruction 64K (x4)
  L2 Unified 2048K (x4)
  L3 Unified 8192K (x1)
Load Average: 1.01, 2.85, 3.71
---------------------------------------------------------------------------------------------------
Benchmark                                         Time             CPU   Iterations UserCounters...
---------------------------------------------------------------------------------------------------
<...>
BM_bcmp<uint8_t, Identical>/512000            18593 ns        18590 ns        37565 bytes_read/iteration=1000k bytes_read/sec=51.2991G/s eltcnt=19.2333G eltcnt/sec=27.541G/s
BM_bcmp<uint8_t, Identical>_BigO               0.04 N          0.04 N
BM_bcmp<uint8_t, Identical>_RMS                  37 %            37 %
<...>
BM_bcmp<uint16_t, Identical>/256000           18950 ns        18948 ns        37223 bytes_read/iteration=1000k bytes_read/sec=50.3324G/s eltcnt=9.52909G eltcnt/sec=13.511G/s
BM_bcmp<uint16_t, Identical>_BigO              0.08 N          0.08 N
BM_bcmp<uint16_t, Identical>_RMS                 34 %            34 %
<...>
BM_bcmp<uint32_t, Identical>/128000           18627 ns        18627 ns        37895 bytes_read/iteration=1000k bytes_read/sec=51.198G/s eltcnt=4.85056G eltcnt/sec=6.87168G/s
BM_bcmp<uint32_t, Identical>_BigO              0.16 N          0.16 N
BM_bcmp<uint32_t, Identical>_RMS                 35 %            35 %
<...>
BM_bcmp<uint64_t, Identical>/64000            18855 ns        18855 ns        37458 bytes_read/iteration=1000k bytes_read/sec=50.5791G/s eltcnt=2.39731G eltcnt/sec=3.3943G/s
BM_bcmp<uint64_t, Identical>_BigO              0.32 N          0.32 N
BM_bcmp<uint64_t, Identical>_RMS                 33 %            33 %
<...>
BM_bcmp<uint8_t, InequalHalfway>/512000        9570 ns         9569 ns        73500 bytes_read/iteration=1000k bytes_read/sec=99.6601G/s eltcnt=37.632G eltcnt/sec=53.5046G/s
BM_bcmp<uint8_t, InequalHalfway>_BigO          0.02 N          0.02 N
BM_bcmp<uint8_t, InequalHalfway>_RMS             29 %            29 %
<...>
BM_bcmp<uint16_t, InequalHalfway>/256000       9547 ns         9547 ns        74343 bytes_read/iteration=1000k bytes_read/sec=99.8971G/s eltcnt=19.0318G eltcnt/sec=26.8159G/s
BM_bcmp<uint16_t, InequalHalfway>_BigO         0.04 N          0.04 N
BM_bcmp<uint16_t, InequalHalfway>_RMS            29 %            29 %
<...>
BM_bcmp<uint32_t, InequalHalfway>/128000       9396 ns         9394 ns        73521 bytes_read/iteration=1000k bytes_read/sec=101.518G/s eltcnt=9.41069G eltcnt/sec=13.6255G/s
BM_bcmp<uint32_t, InequalHalfway>_BigO         0.08 N          0.08 N
BM_bcmp<uint32_t, InequalHalfway>_RMS            30 %            30 %
<...>
BM_bcmp<uint64_t, InequalHalfway>/64000        9499 ns         9498 ns        73802 bytes_read/iteration=1000k bytes_read/sec=100.405G/s eltcnt=4.72333G eltcnt/sec=6.73808G/s
BM_bcmp<uint64_t, InequalHalfway>_BigO         0.16 N          0.16 N
BM_bcmp<uint64_t, InequalHalfway>_RMS            28 %            28 %
Comparing build-old/test/llvm-bcmp-bench to build-new/test/llvm-bcmp-bench
Benchmark                                                  Time             CPU      Time Old      Time New       CPU Old       CPU New
---------------------------------------------------------------------------------------------------------------------------------------
<...>
BM_bcmp<uint8_t, Identical>/512000                      -0.9570         -0.9570        432131         18593        432101         18590
<...>
BM_bcmp<uint16_t, Identical>/256000                     -0.8826         -0.8826        161408         18950        161409         18948
<...>
BM_bcmp<uint32_t, Identical>/128000                     -0.7714         -0.7714         81497         18627         81488         18627
<...>
BM_bcmp<uint64_t, Identical>/64000                      -0.6239         -0.6239         50138         18855         50138         18855
<...>
BM_bcmp<uint8_t, InequalHalfway>/512000                 -0.9503         -0.9503        192405          9570        192392          9569
<...>
BM_bcmp<uint16_t, InequalHalfway>/256000                -0.9253         -0.9253        127858          9547        127860          9547
<...>
BM_bcmp<uint32_t, InequalHalfway>/128000                -0.8088         -0.8088         49140          9396         49140          9394
<...>
BM_bcmp<uint64_t, InequalHalfway>/64000                 -0.7041         -0.7041         32101          9499         32099          9498
```

What can we tell from the benchmark?
* Performance of the naive equality check improves somewhat with element size,
  maxing out at eltcnt/sec=1.58603G/s for uint16_t, or bytes_read/sec=19.0209G/s
  for uint64_t. I think that instability implies performance problems.
* Performance of the `memcmp()`-aware benchmark always maxes out at around
  bytes_read/sec=51.2991G/s for every type. That is 2.6x the throughput of the
  naive variant!
* The eltcnt/sec metric for the `memcmp()`-aware benchmark maxes out at
  eltcnt/sec=27.541G/s for uint8_t (was: eltcnt/sec=1.18491G/s, so 24x) and
  linearly decreases with element size.
  For uint64_t, it's ~4x+ the elements/second.
* The call is obviously more expensive than the loop for small element counts.
  As can be seen from the full output {F8768210}, `memcmp()` is almost
  universally worse, independent of the element size (and thus buffer size), when
  the element count is less than 8.

So, all in all, the bcmp idiom does indeed offer untapped performance headroom.
This diff implements said idiom recognition. I think reasonable test
coverage is present, but do tell if anything obvious is missing.
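
Conceptually, the transform replaces an element-wise equality loop with a single byte-wise comparison of the two buffers. The following is a source-level sketch only: the real transform works on LLVM IR and emits `bcmp`, whereas this sketch uses the standard `std::memcmp`, and the function names are invented for illustration. It assumes `T` is an integer type without padding bits, so that equal values have equal bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// The naive loop, in the same shape as the benchmark's `equal()`.
template <class T>
bool equal_loop(const T *a, const T *a_end, const T *b) noexcept {
  for (; a != a_end; ++a, ++b) {
    if (*a != *b) return false;
  }
  return true;
}

// The equivalent single call for integer element types: compare
// (a_end - a) * sizeof(T) bytes and test the result against zero.
template <class T>
bool equal_memcmp(const T *a, const T *a_end, const T *b) noexcept {
  const std::size_t bytes = static_cast<std::size_t>(a_end - a) * sizeof(T);
  return std::memcmp(a, b, bytes) == 0;
}
```

Only equality is needed here, not ordering, which is why a `bcmp`-style "equal or not" answer is enough.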

Now, quality. This builds and passes the test-suite, at least
without any non-bundled elements. {F8768216} {F8768217}
This transform fires 91 times:
```
$ /build/test-suite/utils/compare.py -m loop-idiom.NumBCmp result-new.json
Tests: 1149
Metric: loop-idiom.NumBCmp

Program                                         result-new

MultiSourc...Benchmarks/7zip/7zip-benchmark    79.00
MultiSource/Applications/d/make_dparser         3.00
SingleSource/UnitTests/vla                      2.00
MultiSource/Applications/Burg/burg              1.00
MultiSourc.../Applications/JM/lencod/lencod     1.00
MultiSource/Applications/lemon/lemon            1.00
MultiSource/Benchmarks/Bullet/bullet            1.00
MultiSourc...e/Benchmarks/MallocBench/gs/gs     1.00
MultiSourc...gs-C/TimberWolfMC/timberwolfmc     1.00
MultiSourc...Prolangs-C/simulator/simulator     1.00
```
I have not yet looked into what is going on with SingleSource/UnitTests/vla.test.
The size changes are:
```
$ /build/test-suite/utils/compare.py -m size..text result-{old,new}.json --filter-hash
Tests: 1149
Same hash: 907 (filtered out)
Remaining: 242
Metric: size..text

Program                                        result-old result-new diff
test-suite...ingleSource/UnitTests/vla.test   753.00     833.00     10.6%
test-suite...marks/7zip/7zip-benchmark.test   1001697.00 966657.00  -3.5%
test-suite...ngs-C/simulator/simulator.test   32369.00   32321.00   -0.1%
test-suite...plications/d/make_dparser.test   89585.00   89505.00   -0.1%
test-suite...ce/Applications/Burg/burg.test   40817.00   40785.00   -0.1%
test-suite.../Applications/lemon/lemon.test   47281.00   47249.00   -0.1%
test-suite...TimberWolfMC/timberwolfmc.test   250065.00  250113.00   0.0%
test-suite...chmarks/MallocBench/gs/gs.test   149889.00  149873.00  -0.0%
test-suite...ications/JM/lencod/lencod.test   769585.00  769569.00  -0.0%
test-suite.../Benchmarks/Bullet/bullet.test   770049.00  770049.00   0.0%
test-suite...HMARK_ANISTROPIC_DIFFUSION/128    NaN        NaN        nan%
test-suite...HMARK_ANISTROPIC_DIFFUSION/256    NaN        NaN        nan%
test-suite...CHMARK_ANISTROPIC_DIFFUSION/64    NaN        NaN        nan%
test-suite...CHMARK_ANISTROPIC_DIFFUSION/32    NaN        NaN        nan%
test-suite...ENCHMARK_BILATERAL_FILTER/64/4    NaN        NaN        nan%
Geomean difference                                                   nan%
         result-old    result-new       diff
count  1.000000e+01  10.00000      10.000000
mean   3.152090e+05  311695.40000  0.006749
std    3.790398e+05  372091.42232  0.036605
min    7.530000e+02  833.00000    -0.034981
25%    4.243300e+04  42401.00000  -0.000866
50%    1.197370e+05  119689.00000 -0.000392
75%    6.397050e+05  639705.00000 -0.000005
max    1.001697e+06  966657.00000  0.106242
```

I don't have timings though.

And now to the code. The basic idea is to completely replace the whole loop.
If we can't fully kill it, don't transform.
I have left one or two comments in the code, so hopefully it can be understood.

Also, there are a few TODOs that I have left for follow-ups:
* widening of `memcmp()`/`bcmp()`
* step smaller than the comparison size
* Metadata propagation
* more than two blocks as long as there is still a single backedge?
* ???

Reviewers: reames, fhahn, mkazantsev, chandlerc, craig.topper, courbet

Reviewed By: courbet

Subscribers: miyuki, hiraditya, xbolva00, nikic, jfb, gchatelet, courbet, llvm-commits, mclow.lists

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61144

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374662 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[NFC][LoopIdiom] Add bcmp loop idiom miscompile test from PR43206.
Roman Lebedev [Sat, 12 Oct 2019 15:35:16 +0000 (15:35 +0000)]
[NFC][LoopIdiom] Add bcmp loop idiom miscompile test from PR43206.

The transform forgot to check SCEV loop scopes.

https://bugs.llvm.org/show_bug.cgi?id=43206

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374661 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[NFC][LoopIdiom] Move one bcmp test into the proper place
Roman Lebedev [Sat, 12 Oct 2019 15:35:09 +0000 (15:35 +0000)]
[NFC][LoopIdiom] Move one bcmp test into the proper place

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374660 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[X86][SSE] Avoid unnecessary PMOVZX in v4i8 sum reduction
Simon Pilgrim [Sat, 12 Oct 2019 15:19:13 +0000 (15:19 +0000)]
[X86][SSE] Avoid unnecessary PMOVZX in v4i8 sum reduction

This should go away once D66004 has landed and we can simplify shuffle chains using demanded elts.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374658 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[lit] Try again to fix new tests that fail on Windows bots
Joel E. Denny [Sat, 12 Oct 2019 14:58:43 +0000 (14:58 +0000)]
[lit] Try again to fix new tests that fail on Windows bots

Based on the bot logs, when lit's internal diff runs on Windows, it
looks like binary diffs must be decoded also for Python 2.7.
Otherwise, writing the diff to stdout fails with:

```
UnicodeEncodeError: 'ascii' codec can't encode characters in position 7-8: ordinal not in range(128)
```

I did not need to decode using Python 2.7.15 under Ubuntu.  When I do
it anyway in that case, `errors="backslashreplace"` fails for me:

```
TypeError: don't know how to handle UnicodeDecodeError in error callback
```

However, `errors="ignore"` works, so this patch uses that, hoping
it'll work on Windows as well.

This patch leaves `errors="backslashreplace"` for Python >= 3.5 as
there's no evidence yet that it doesn't work and it produces more
informative binary diffs.  This patch also adjusts some lit tests to
succeed for either error handler.

This patch adjusts changes introduced by D68664.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374657 91177308-0d34-0410-b5e6-96231b3b80d8

5 years agoRevert r374654: "[lit] Try to fix new tests that fail on Windows bots"
Joel E. Denny [Sat, 12 Oct 2019 14:58:30 +0000 (14:58 +0000)]
Revert r374654: "[lit] Try to fix new tests that fail on Windows bots"

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374656 91177308-0d34-0410-b5e6-96231b3b80d8

5 years ago[CostModel][X86] Improve sum reduction costs.
Simon Pilgrim [Sat, 12 Oct 2019 13:21:50 +0000 (13:21 +0000)]
[CostModel][X86] Improve sum reduction costs.

I can't see any notable differences in costs between SSE2 and SSE42 arches for FADD/ADD reduction, so I've lowered the target to just SSE2.

I've also added vXi8 sum reduction costs in line with the PSADBW codegen and discussions on PR42674.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374655 91177308-0d34-0410-b5e6-96231b3b80d8