granicus.if.org Git

[ConstantFold] Fix misfolding of icmp with a bitcast FP second operand.

In the process of trying to eliminate the bitcast, this was producing a
malformed icmp with FP operands.

Differential revision: https://reviews.llvm.org/D51215

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354380 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-cov] Add support for gcov --hash-filenames option

The patch adds support for --hash-filenames to llvm-cov. This option adds md5
hash of the source path to the name of the generated .gcov file. The option is
crucial for cases where you have multiple files with the same name but can't
use --preserve-paths as resulting filenames exceed the limit.

from gcov(1):

```
-x
--hash-filenames
    By default, gcov uses the full pathname of the source files to to
    create an output filename.  This can lead to long filenames that
    can overflow filesystem limits.  This option creates names of the
    form source-file##md5.gcov, where the source-file component is
    the final filename part and the md5 component is calculated from
    the full mangled name that would have been used otherwise.
```

Patch by Igor Ignatev!

Differential Revision: https://reviews.llvm.org/D58370

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354379 91177308-0d34-0410-b5e6-96231b3b80d8

Testing commit access

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354378 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Don't consider functions ABI compatible for ArgumentPromotion pass if they view 512-bit vectors differently.

The use of the -mprefer-vector-width=256 command line option mixed with functions
using vector intrinsics can create situations where one function thinks 512 vectors
are legal, but another fucntion does not.

If a 512 bit vector is passed between them via a pointer, its possible ArgumentPromotion
might try to pass by value instead. This will result in type legalization for the two
functions handling the 512 bit vector differently leading to runtime failures.

Had the 512 bit vector been passed by value from clang codegen, both functions would
have been tagged with a min-legal-vector-width=512 function attribute. That would
make them be legalized the same way.

I observed this issue in 32-bit mode where a union containing a 512 bit vector was
being passed by a function that used intrinsics to one that did not. The caller
ended up passing in zmm0 and the callee tried to read it from ymm0 and ymm1.

The fix implemented here is just to consider it a mismatch if two functions
would handle 512 bit differently without looking at the types that are being
considered. This is the easist and safest fix, but it can be improved in the future.

Differential Revision: https://reviews.llvm.org/D58390

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354376 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "Revert "[llvm-objdump] Allow short options without arguments to be grouped""

- Tests that use multiple short switches now test them grouped and ungrouped.

- Ensure the output of ungrouped and grouped variants is identical

Differential Revision: https://reviews.llvm.org/D57904

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354375 91177308-0d34-0410-b5e6-96231b3b80d8

Fix builds for older macOS deployment targets after r354365

Surprisingly, check_symbol_exists is not sufficient. The macOS linker checks the
called functions against a compatibility list for the given deployment target
and check_symbol_exists doesn't trigger this check as it never calls the
function.

This fixes the GreenDragon bots where the deployment target is 10.9

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354374 91177308-0d34-0410-b5e6-96231b3b80d8

Annotate timeline in Instruments with passes and other timed regions.

Summary:
Instruments is a useful tool for finding performance issues in LLVM but it can
be difficult to identify regions of interest on the timeline that we can use
to filter the profiler or allocations instrument. Xcode 10 and the latest
macOS/iOS/etc. added support for the os_signpost() API which allows us to
annotate the timeline with information that's meaningful to LLVM.

This patch causes timer start and end events to emit signposts. When used with
-time-passes, this causes the passes to be annotated on the Instruments timeline.
In addition to visually showing the duration of passes on the timeline, it also
allows us to filter the profile and allocations instrument down to an individual
pass allowing us to find the issues within that pass without being drowned out
by the noise from other parts of the compiler.

Using this in conjunction with the Time Profiler (in high frequency mode) and
the Allocations instrument is how I found the SparseBitVector that should have
been a BitVector and the DenseMap that could be replaced by a sorted vector a
couple months ago. I added NamedRegionTimers to TableGen and used the resulting
annotations to identify the slow portions of the Register Info Emitter. Some of
these were placed according to educated guesses while others were placed
according to hot functions from a previous profile. From there I filtered the
profile to a slow portion and the aforementioned issues stood out in the
profile.

To use this feature enable LLVM_SUPPORT_XCODE_SIGNPOSTS in CMake and run the
compiler under Instruments with -time-passes like so:
instruments -t 'Time Profiler' bin/llc -time-passes -o - input.ll'
Then open the resulting trace in Instruments.

There was a talk at WWDC 2018 that explained the feature which can be found at
https://developer.apple.com/videos/play/wwdc2018/405/ if you'd like to know
more about it.

Reviewers: bogner

Reviewed By: bogner

Subscribers: jdoerfert, mgorny, kristina, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D52954

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354365 91177308-0d34-0410-b5e6-96231b3b80d8

[libObject][NFC] Use sys::path::convert_to_slash.

Summary: As suggested in rL353995

Reviewers: compnerd

Reviewed By: compnerd

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58298

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354364 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Generalize X86ISD::BLENDI support to more value types

D42042 introduced the ability for the ExecutionDomainFixPass to more easily change between BLENDPD/BLENDPS/PBLENDW as the domains required.

With this ability, we can avoid most bitcasts/scaling in the DAG that was occurring with X86ISD::BLENDI lowering/combining, blend with the vXi32/vXi64 vectors directly and use isel patterns to lower to the float vector equivalent vectors.

This helps the shuffle combining and SimplifyDemandedVectorElts be more aggressive as we lose track of fewer UNDEF elements than when we go up/down through bitcasts.

I've introduced a basic blend(bitcast(x),bitcast(y)) -> bitcast(blend(x,y)) fold, there are more generalizations I can do there (e.g. widening/scaling and handling the tricky v16i16 repeated mask case).

The vector-reduce-smin/smax regressions will be fixed in a future improvement to SimplifyDemandedBits to peek through bitcasts and support X86ISD::BLENDV.

Reapplied after reversion at rL353699 - AVX2 isel fix was applied at rL354358, additional test at rL354360/rL354361

Differential Revision: https://reviews.llvm.org/D57888

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354363 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Remove unused headers in Optional.h

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354362 91177308-0d34-0410-b5e6-96231b3b80d8

Fix stupid assembly comment typo

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354361 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Add pblendw commuted load test case

Reduced test case for the regression caused in D57888/rL353610

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354360 91177308-0d34-0410-b5e6-96231b3b80d8

[SDAG] Use shift amount type in MULO promotion; NFC

Directly use the correct shift amount type if it is possible, and
future-proof the code against vectors. The added test makes sure that
bitwidths that do not fit into the shift amount type do not assert.

Split out from D57997.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354359 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX2] Hide VPBLENDD instructions behind AVX2 predicate

This was the cause of the regression in D57888 - the commuted load pattern wasn't hidden by the predicate so once we enabled v4i32 blends on SSE41+ targets then isel was incorrectly matched against AVX2+ instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354358 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Bugfix for nullptr check by klocwork

klocwork critical issues in CG files:

Patch by Xiang Zhang (xiangzhangllvm)

Differential Revision: https://reviews.llvm.org/D58363

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354357 91177308-0d34-0410-b5e6-96231b3b80d8

X86AsmParser AVX-512: Return error instead of hitting assert

When parsing a sequence of tokens beginning with {, it will hit an assert and crash if the token afterwards is not an identifier. Instead of this, return a more verbose error as seen elsewhere in the function.

Patch by Brandon Jones (BrandonTJones)

Differential Revision: https://reviews.llvm.org/D57375

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354356 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Filter out tuning feature flags and a few ISA feature flags when checking for function inline compatibility.

Tuning flags don't have any effect on the available instructions so aren't a good reason to prevent inlining.

There are also some ISA flags that don't have any intrinsics our ABI requirements that we can exclude. I've put only the most basic ones like cmpxchg16b and lahfsahf. These are interesting because they aren't present in all 64-bit CPUs, but we have codegen workarounds when they aren't present.

Loosening these checks can help with scenarios where a caller has a more specific CPU than a callee. The default tuning flags on our generic 'x86-64' CPU can currently make it inline compatible with other CPUs. I've also added an example test for 'nocona' and 'prescott' where 'nocona' is just a 64-bit capable version of 'prescott' but in 32-bit mode they should be completely compatible.

I've based the implementation here of the similar code in AMDGPU.

Differential Revision: https://reviews.llvm.org/D58371

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354355 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: Implement moreElementsVector for select

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354354 91177308-0d34-0410-b5e6-96231b3b80d8

index.rst: Remove bb-chapuni from list of IRC bots

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354353 91177308-0d34-0410-b5e6-96231b3b80d8

index.rst: Remove Dragonegg link

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354352 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: Implement moreElementsVector for G_EXTRACT source

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354348 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX] Update VBROADCAST folds to always use v2i64 X86vzload

The VBROADCAST combines and SimplifyDemandedVectorElts improvements mean that we now more consistently use shorter (128-bit) X86vzload input operands.

Follow up to D58053

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354346 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: Implement moreElementsVector for bit ops

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354345 91177308-0d34-0410-b5e6-96231b3b80d8

[yaml2obj][obj2yaml] Remove section type range markers from allowed mappings and support hex values

yaml2obj/obj2yaml previously supported SHT_LOOS, SHT_HIOS, and
SHT_LOPROC for section types. These are simply values that delineate a
range and don't really make sense as valid values. For example if a
section has type value 0x70000000, obj2yaml shouldn't print this value
as SHT_LOPROC. Additionally, this was missing the three other range
markers (SHT_HIPROC, SHT_LOUSER and SHT_HIUSER).

This change removes these three range markers. It also adds support for
specifying the type as an integer, to allow section types that LLVM
doesn't know about.

Reviewed by: grimar

Differential Revision: https://reviews.llvm.org/D58383

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354344 91177308-0d34-0410-b5e6-96231b3b80d8

Cast from SDValue directly instead of superfluous getNode(). NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354343 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: Verify g_insert

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354342 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][AVX] EltsFromConsecutiveLoads - Add BROADCAST lowering support

This patch adds scalar/subvector BROADCAST handling to EltsFromConsecutiveLoads.

It mainly shows codegen changes to 32-bit code which failed to handle i64 loads, although 64-bit code is also using this new path to more efficiently combine to a broadcast load.

Differential Revision: https://reviews.llvm.org/D58053

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354340 91177308-0d34-0410-b5e6-96231b3b80d8

[yaml2obj][obj2yaml] - Support SHT_GNU_versym (.gnu.version) section.

This patch adds support for parsing dumping the .gnu.version section.
Description of the section is: https://refspecs.linuxfoundation.org/LSB_1.3.0/gLSB/gLSB/symversion.html#SYMVERTBL

Differential revision: https://reviews.llvm.org/D58280

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354338 91177308-0d34-0410-b5e6-96231b3b80d8

Recommit r354328, r354329 "[obj2yaml][yaml2obj] - Add support of parsing/dumping of the .gnu.version_r section."

Fix:
Replace
assert(!IO.getContext() && "The IO context is initialized already");
with
assert(IO.getContext() && "The IO context is not initialized");
(this was introduced in r354329, where I tried to quickfix the darwin BB
and seems copypasted the assert from the wrong place).

Original commit message:

The section is described here:
https://refspecs.linuxfoundation.org/LSB_1.3.0/gLSB/gLSB/symverrqmts.html

Patch just teaches obj2yaml/yaml2obj to dump and parse such sections.

We did the finalization of string tables very late,
and I had to move the logic to make it a bit earlier.
That was needed in this patch since .gnu.version_r adds strings to .dynstr.
This might also be useful for implementing other special sections.

Everything else changed in this patch seems to be straightforward.

Differential revision: https://reviews.llvm.org/D58119

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354335 91177308-0d34-0410-b5e6-96231b3b80d8

[RISCV][NFC] Move some std::string to StringRef

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354333 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r354328, r354329 "[obj2yaml][yaml2obj] - Add support of parsing/dumping of the .gnu.version_r section."

Something went wrong. Bots are unhappy:
http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/44113/steps/test/logs/stdio

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354332 91177308-0d34-0410-b5e6-96231b3b80d8

Fix BB after r354328.

Bot:
http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/30188/steps/build_Lld/logs/stdio

Error:
/Users/buildslave/as-bldslv9_new/lld-x86_64-darwin13/llvm.src/lib/ObjectYAML/ELFYAML.cpp:1013:15: error: unused variable 'Object' [-Werror,-Wunused-variable]
  const auto *Object = static_cast<ELFYAML::Object *>(IO.getContext());
              ^
/Users/buildslave/as-bldslv9_new/lld-x86_64-darwin13/llvm.src/lib/ObjectYAML/ELFYAML.cpp:1023:15: error: unused variable 'Object' [-Werror,-Wunused-variable]
  const auto *Object = static_cast<ELFYAML::Object *>(IO.getContext());

Fix:
change
  const auto *Object = static_cast<ELFYAML::Object *>(IO.getContext());
  assert(Object && "The IO context is not initialized");
to
  assert(!IO.getContext() && "The IO context is initialized already");

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354329 91177308-0d34-0410-b5e6-96231b3b80d8

[obj2yaml][yaml2obj] - Add support of parsing/dumping of the .gnu.version_r section.

The section is described here:
https://refspecs.linuxfoundation.org/LSB_1.3.0/gLSB/gLSB/symverrqmts.html

Patch just teaches obj2yaml/yaml2obj to dump and parse such sections.

We did the finalization of string tables very late,
and I had to move the logic to make it a bit earlier.
That was needed in this patch since .gnu.version_r adds strings to .dynstr.
This might also be useful for implementing other special sections.

Everything else changed in this patch seems to be straightforward.

Differential revision: https://reviews.llvm.org/D58119

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354328 91177308-0d34-0410-b5e6-96231b3b80d8

[RISCV] Re-organise calling convention tests

Re-organise calling convention tests to prepare for ilp32f and ilp32d hard
float ABI tests. It's also clear that we need to introduce similar tests for
lp64.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354323 91177308-0d34-0410-b5e6-96231b3b80d8

Fix BB after r354319 "[yaml2obj] - Do not skip zeroes blocks if there are relocations against them."

Fix: move the test to x86 folder.
Seems it is needed, because llvm-objdump invocation used in test has -D (disasm) flag.

BB: http://lab.llvm.org:8011/builders/clang-hexagon-elf/builds/23016

/local/buildbot/slaves/hexagon-build-02/clang-hexagon-elf/stage1/bin/llvm-objdump:
error: '/local/buildbot/slaves/hexagon-build-02/clang-hexagon-elf/stage1/test/tools/llvm-objdump/Output/disasm-zeroes-relocations.test.tmp':
can't find target: : error: unable to get target for 'x86_64--', see --version and --triple.
.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354322 91177308-0d34-0410-b5e6-96231b3b80d8

[yaml2obj] - Do not skip zeroes blocks if there are relocations against them.

This is for -D -reloc combination.

With this patch, we do not skip the zero bytes that have a relocation against
them when -reloc is used. If -reloc is not used, then the behavior will be the same.

Differential revision: https://reviews.llvm.org/D58174

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354319 91177308-0d34-0410-b5e6-96231b3b80d8

[yaml2obj] - Do not ignore explicit addresses for .dynsym and .dynstr

This fixes https://bugs.llvm.org/show_bug.cgi?id=40339

Previously if the addresses were set in YAML they were ignored for
.dynsym and .dynstr sections. The patch fixes that.

Differential revision: https://reviews.llvm.org/D58168

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354318 91177308-0d34-0410-b5e6-96231b3b80d8

Fix obsolete comment. NFC

Both files mentioned in the comment now include TargetOpcodes.def. Just
mention that directly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354316 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] API for signaling that the current loop is being deleted

We are planning to be able to delete the current loop in LoopSimplifyCFG
in the future. Add API to notify the loop pass manager that it happened.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354314 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Store loop header in a local to keep it available after the loop is deleted

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354313 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM GlobalISel] Support G_PHI for Thumb2

Same as arm mode.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354310 91177308-0d34-0410-b5e6-96231b3b80d8

[Dominators] Fix and optimize edge insertion of depth-based search

Summary:
After (x,y) is inserted, depth-based search finds all affected v that satisfies:

depth(nca(x,y))+1 < depth(v) && there exists a path P from y to v where every w on P satisfies depth(v) <= depth(w)

This reduces to a widest path problem (maximizing the depth of the
minimum vertex in the path) which can be solved by a modified version of
Dijkstra with a bucket queue (named depth-based search in the paper).

The algorithm visits vertices in decreasing order of bucket number.
However, the current code misused priority_queue to extract them in
increasing order. I cannot think of a failing scenario but it surely may
process vertices more than once due to the local usage of Processed.

This patch fixes this bug and simplifies/optimizes the code a bit. Also
add more comments.

Reviewers: kuhar

Reviewed By: kuhar

Subscribers: kristina, jdoerfert, llvm-commits, NutshellySima, brzycki

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58349

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354306 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Remove command line strings from the ProcIntel* features.

These should always follow the CPU string. There's no reason to control them independently.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354304 91177308-0d34-0410-b5e6-96231b3b80d8

[GlobalISel][AArch64] Legalize + select some llvm.ctlz.* intrinsics

Legalize/select llvm.ctlz.*

Add select-ctlz to show that we actually select them. Update arm64-clrsb.ll and
arm64-vclz.ll to show that we perform valid transformations in optimized builds,
and document where GISel can improve.

Differential Revision: https://reviews.llvm.org/D58155

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354299 91177308-0d34-0410-b5e6-96231b3b80d8

[CGP] form usub with overflow from sub+icmp

The motivating x86 cases for forming the intrinsic are shown in PR31754 and PR40487:
https://bugs.llvm.org/show_bug.cgi?id=31754
https://bugs.llvm.org/show_bug.cgi?id=40487
..and those are shown in the IR test file and x86 codegen file.

Matching the usubo pattern is harder than uaddo because we have 2 independent values rather than a def-use.

This adds a TLI hook that should preserve the existing behavior for uaddo formation, but disables usubo
formation by default. Only x86 overrides that setting for now although other targets will likely benefit
by forming usbuo too.

Differential Revision: https://reviews.llvm.org/D57789

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354298 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Use MachineInstr::mayAlias to replace areMemAccessesTriviallyDisjoint in LoadStoreOptimizer pass.

Summary:
  This is to fix a memory dependence bug in LoadStoreOptimizer.

Reviewers:
  arsenm, rampitec

Differential Revision:
  https://reviews.llvm.org/D58295

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354295 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: Implement widenScalar for g_extract scalar results

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354293 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: Make buildExtract use DstOp/SrcOp

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354292 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: Fix double count of offset for irregular vector breakdowns

Fixes cases with odd vectors that break into multiple requested size
pieces.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354280 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] split more v8f32/v8i32 shuffles in lowering

Similar to D57867 - this is a small patch with lots of test diffs.
With half-vector-width narrowing potential, using an extract + 128-bit vshufps
is a win because it replaces a 256-bit shuffle with a 128-bit shufle.

This seems like it should be a win even for targets with 'fast-variable-shuffle',
but we are intentionally deferring that to an independent change to make sure
that is true.

Differential Revision: https://reviews.llvm.org/D58181

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354279 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "[InstCombine] reduce even more unsigned saturated add with 'not' op"

This reverts commit 079b610c29b4a428b3ae7b64dbac0378facf6632.
Bots are failing after this change on a stage 2 compile of clang.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354277 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] reduce even more unsigned saturated add with 'not' op

We want to use the sum in the icmp to allow matching with
m_UAddWithOverflow and eliminate the 'not'. This is discussed
in D51929 and is another step towards solving PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613

  Name: uaddsat, -1 fval
  %notx = xor i32 %x, -1
  %a = add i32 %x, %y
  %c = icmp ugt i32 %notx, %y
  %r = select i1 %c, i32 %a, i32 -1
  =>
  %a = add i32 %x, %y
  %c2 = icmp ugt i32 %y, %a
  %r = select i1 %c2, i32 -1, i32 %a

  Name: uaddsat, -1 fval + ult
  %notx = xor i32 %x, -1
  %a = add i32 %x, %y
  %c = icmp ult i32 %y, %notx
  %r = select i1 %c, i32 %a, i32 -1
  =>
  %a = add i32 %x, %y
  %c2 = icmp ugt i32 %y, %a
  %r = select i1 %c2, i32 -1, i32 %a

https://rise4fun.com/Alive/nTp

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354276 91177308-0d34-0410-b5e6-96231b3b80d8

[MCA] Correctly update register definitions in the PRF after move elimination.

This patch fixes a bug where register writes performed by optimizable register
moves were sometimes wrongly treated like partial register updates. Before this
patch, llvm-mca wrongly predicted a 1.50 IPC for test reg-move-elimination-6.s
(added by this patch). With this patch, llvm-mca correctly updates the register
defintions in the PRF, and the IPC for that test is now correctly reported as 2.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354271 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-readobj] - Simplify .gnu.version_d dumping.

This is similar to D58048.

Instead of scanning the dynamic table to read the
DT_VERDEFNUM, we could take it from the sh_info field.
(https://docs.oracle.com/cd/E19683-01/816-1386/chapter6-94076/index.html)

The patch does this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354270 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Make Optional<T> trivially copyable when T is trivially copyable

This is a follow-up to r354246 and a reimplementation of https://reviews.llvm.org/D57097?id=186600
that should not trigger any UB thanks to the use of an union.

This may still be subject to the problem solved by std::launder, but I'm unsure how it interacts whith union.
/me plans to revert if this triggers any relevant bot failure. At least this validates in Release mode with
clang 6.0.1 and gcc 4.8.5.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354264 91177308-0d34-0410-b5e6-96231b3b80d8

[MCA] Slightly refactor method writeStartEvent in WriteState and ReadState. NFCI

This is another change in preparation for PR37494.
No functional change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354261 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis] [NFC] Fixing typo.

Reviewers: courbet, gchatelet

Reviewed By: courbet, gchatelet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D54895

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354250 91177308-0d34-0410-b5e6-96231b3b80d8

Recommit [NFC] Better encapsulation of llvm::Optional Storage

Second attempt, trying to navigate out of the UB zone using
union for storage instead of raw bytes.

I'm prepared to revert that commit as soon as validation breaks,
which is likely to happen.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354246 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r354244 "[DAGCombiner] Eliminate dead stores to stack."

Breaks some bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354245 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] Eliminate dead stores to stack.

Summary:
A store to an object whose lifetime is about to end can be removed.

See PR40550 for motivation.

Reviewers: niravd

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D57541

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354244 91177308-0d34-0410-b5e6-96231b3b80d8

[MC] Make SubtargetFeatureKV only store one FeatureBitset and use an 'unsigned' to hold the value.

This class is used for two difference tablegen generated tables. For one of the tables the Value FeatureBitset only has one bit set. For the other usage the Implies field was unused.

This patch changes the Value field to just be an unsigned. For the usage that put a real vector in bitset, we now use the previously unused Implies field and leave the Value field unused instead.

This is good for a 16K reduction in the size of llc on my local build with all targets enabled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354243 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r354156

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354242 91177308-0d34-0410-b5e6-96231b3b80d8

Revert [NFC] Better encapsulation of llvm::Optional Storage

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354240 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Better encapsulation of llvm::Optional Storage, part II

Fix for better Windows support.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354239 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Better encapsulation of llvm::Optional Storage

Second attempt, trying to navigate out of the UB zone using
union for storage instead of raw bytes.

I'm prepared to revert that commit as soon as validation breaks,
which is likely to happen.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354238 91177308-0d34-0410-b5e6-96231b3b80d8

[LLVM-C] Add bindings to create enumerators

Summary: The C API don't have the bindings to create enumerators, needed to create an enumeration.

Reviewers: whitequark, CodaFi, harlanhaskins, deadalnix

Reviewed By: whitequark, CodaFi, harlanhaskins

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58323

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354237 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add even more tests for unsigned saturated add; NFC

The pattern-matching from rL354221 / rL354224 doesn't cover
these, so we're up to 8 different commuted possibilities.

There may still be 1 more variant of this pattern.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354236 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] In FP_TO_INTHelper, when moving data from SSE register to X87 register file via the stack, use the same stack slot we use for the integer conversion.

No need for a separate stack slot. The lifetimes don't overlap.

Also fix the MachinePointerInfo for the final load after the integer conversion to indicate it came from the stack slot.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354234 91177308-0d34-0410-b5e6-96231b3b80d8

[TEST] Remove 2>&1 from tests

Avoid confusing CHECKS with debug dumps of sets that can be printed differently.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354229 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Teach getInnermostLoopFor walk up the loop trees

This should be NFC in current use case of this method, but it will
help to use it for solving more compex tasks in follow-up patches.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354227 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] Extract [US]MULO expansion into TL method; NFC

In preparation for supporting vector expansion.

Add an isPostTypeLegalization flag to makeLibCall(), because this
expansion relies on the legalized form using MERGE_VALUES. Drop
the corresponding variant of ExpandLibCall, which is no longer used.

Differential Revision: https://reviews.llvm.org/D58006

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354226 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] reduce more unsigned saturated add with 'not' op

We want to use the sum in the icmp to allow matching with
m_UAddWithOverflow and eliminate the 'not'. This is discussed
in D51929 and is another step towards solving PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613

  Name: not op
  %notx = xor i32 %x, -1
  %a = add i32 %x, %y
  %c = icmp ult i32 %notx, %y
  %r = select i1 %c, i32 -1, i32 %a
  =>
  %a = add i32 %x, %y
  %c2 = icmp ult i32 %a, %y
  %r = select i1 %c2, i32 -1, i32 %a

  Name: not op ugt
  %notx = xor i32 %x, -1
  %a = add i32 %x, %y
  %c = icmp ugt i32 %y, %notx
  %r = select i1 %c, i32 -1, i32 %a
  =>
  %a = add i32 %x, %y
  %c2 = icmp ult i32 %a, %y
  %r = select i1 %c2, i32 -1, i32 %a

https://rise4fun.com/Alive/niom

(The matching here is still incomplete.)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354224 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add more tests for unsigned saturated add; NFC

Extend the pattern-matching from rL354219 / rL354221.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354223 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] reduce unsigned saturated add with 'not' op

We want to use the sum in the icmp to allow matching with
m_UAddWithOverflow and eliminate the 'not'. This is discussed
in D51929 and is another step towards solving PR14613:
https://bugs.llvm.org/show_bug.cgi?id=14613

(The matching here is incomplete. Trying to take minimal steps
to make sure we don't induce infinite looping from existing
canonicalizations of the 'select'.)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354221 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Fix name and clarifying comment for factored-out function

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354220 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] add tests for unsigned saturated add; NFC

We're missing IR canonicalizations for this op as shown in D51929.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354219 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Factor out a function for future reuse

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354218 91177308-0d34-0410-b5e6-96231b3b80d8

Revert [NFC] Better encapsulation of llvm::Optional Storage

I'm getting the feealing that current Optional implementation is full of UB :-/

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354217 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Better encapsulation of llvm::Optional Storage

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354214 91177308-0d34-0410-b5e6-96231b3b80d8

[LLVMSupport]: Remove a severely outdated README.

The LLVM Support library implementation has resided in
//llvm/lib/Support for a significant amount of time now,
with documentation having been updated with all references
to the "System library" being replaced with "Support library".

Since this file mirrors already existing documentation available
for Support library, includes dead links to documentation and
still refers to it as "System library", having it there is
confusing and updating it has very little point as it duplicates
information in documentation, except documentation is a lot more
up to date while this file has not been maintained.

Up to date documentation concerning this can be found here:
http://llvm.org/docs/SupportLibrary.html

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354209 91177308-0d34-0410-b5e6-96231b3b80d8

[bindings/go] Fix building on 32-bit systems (ARM etc.)

Summary:
The patch in https://reviews.llvm.org/D53883 (by me) fails to build on 32-bit systems like ARM. Fix the array size to be less ridiculously large. 2<<20 should still be enough for all practical purposes.

Bug: https://bugs.llvm.org/show_bug.cgi?id=40426

Reviewers: whitequark, pcc

Reviewed By: whitequark

Subscribers: javed.absar, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58030

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354207 91177308-0d34-0410-b5e6-96231b3b80d8

Fixed code snippet in Kaleidoscope tutorial to reflect final full code listing

Patch by Frank He.

Differential Revision: https://reviews.llvm.org/D52166

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354205 91177308-0d34-0410-b5e6-96231b3b80d8

Fix typo in docs

Patch by Alex Yursha.

Differential Revision: https://reviews.llvm.org/D45903

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354203 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r354199: Make Optional<T> Trivially Copyable when T is trivially copyable

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354200 91177308-0d34-0410-b5e6-96231b3b80d8

Make Optional<T> Trivially Copyable when T is trivially copyable

This is another attempt in the process, works nicely on my setup,
let's check how it behaves on other targets.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354199 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] When type legalizing the result of a i64 fp_to_uint on 32-bit targets. Generate all of the ops as i64 and let them be legalized.

No need to manually split everything. We can let the type legalizer work for us.

The test change seems to be caused by some DAG ordering issue that was previously circumventing a one use check in LowerSELECT where FP selects are turned into blends if the setcc has one use. But it was running after an integer select and the same setcc had been legalized to cmov and X86SISD::CMP. This dropped the use count of the setcc, but wasn't what was intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354197 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-nm: Observe -no-llvm-bc for archive members

Summary:
This change fixes the `-no-llvm-bc` flag to work with object files within
archives. Currently the `-no-llvm-bc` flag works for regular object files, but
not static libraries, where it continues to show bitcode symbol info.

Original support was added in D4371.

Reviewers: compnerd, smeenai, pcc

Reviewed By: compnerd

Subscribers: rupprecht, keith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D48798

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354196 91177308-0d34-0410-b5e6-96231b3b80d8

[CMake] Use variables rather than ":" delimiters

This is a follow up to D37644, this block was missed in that change.

Differential Revision: https://reviews.llvm.org/D58093

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354194 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Don't prevent load folding for cvtsi2ss/cvtsi2sd based on hasPartialRegUpdate.

Preventing the load fold won't fix the partial register update since the
input we can fold is a GPR. So it will do nothing to prevent a false dependency
on an XMM register.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354193 91177308-0d34-0410-b5e6-96231b3b80d8

[lit] Remove LitTestCase

From the docs: `class LitTestCase(unittest.TestCase)`
LitTestCase is an adaptor for providing a 'unittest' compatible
interface to 'lit' tests so that we can run lit tests with standard
python test runners.

It does not seem to be used anywhere.

Differential Revision: https://reviews.llvm.org/D58264

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354188 91177308-0d34-0410-b5e6-96231b3b80d8

[lit][NFC] Cleanup lit worker process handling

Move code that is executed on worker process to separate file. This
makes the use of the pickled arguments stored in global variables in the
worker a bit clearer. (Still not pretty though.)

Extract handling of parallelism groups to it's own function.

Use BoundedSemaphore instead of Semaphore. BoundedSemaphore raises for
unmatched release() calls.

Cleanup imports.

Differential Revision: https://reviews.llvm.org/D58196

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354187 91177308-0d34-0410-b5e6-96231b3b80d8

[EarlyCSE & MSSA] Cap the clobbering calls in EarlyCSE.

Summary:
Unlimitted number of calls to getClobberingAccess can lead to high
compile times in pathological cases.
Limitting getClobberingAccess to a fairly high number. Can be adjusted
based on users/need.
Note: this is the only user of MemorySSA currently enabled by default.
The same handling exists in LICM (disabled atm). As MemorySSA gains more
users, this logic of capping will need to move inside MemorySSA.

Reviewers: george.burgess.iv

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D58248

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354182 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Don't set exception mask bits when modifying FPCW to change rounding mode for fp->int conversion

When we need to do an fp->int conversion using x87 instructions, we need to temporarily change the rounding mode to 0b11 and perform a store. To do this we save the old value of the fpcw to the stack, then set the fpcw to 0xc7f, do the store, then restore fpcw. But the 0xc7f value forces the exception mask bits 1. While this is what they would be in the default FP environment, as we move to support changing the FP environments, we shouldn't make this assumption.

This patch changes the code to explicitly OR 0xc00 with the old value so that only the rounding mode is changed. Unfortunately, this requires two stack temporaries instead of one. One to hold the old value and one to hold the new value. Without two stack temporaries we would need an additional GPR. We already need one to do the OR operation in. This is similar to what gcc and icc do for this operation. Though they are both better at reusing the stack temporaries when there are multiple truncates in a function(or at least in a basic block)

Differential Revision: https://reviews.llvm.org/D57788

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354178 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Address a couple stylistic issues pointed out by reviewer [NFC]

Better addressing comments from https://reviews.llvm.org/D58290.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354171 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Convert atomicrmws to xchg or store where legal

Implement two more transforms of atomicrmw:
1) We can convert an atomicrmw which produces a known value in memory into an xchg instead.
2) We can convert an atomicrmw xchg w/o users into a store for some orderings.

Differential Revision: https://reviews.llvm.org/D58290

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354170 91177308-0d34-0410-b5e6-96231b3b80d8

[docs] Document LLVM_ENABLE_IDE

Use some of the wording and the motivating example from r344555. The
lack of documentation was pointed out by Roman Lebedev.

Differential Revision: https://reviews.llvm.org/D58286

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354167 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Fix LowerAsmOutputForConstraint.

Summary:
Update Flag when generating cc output.

Fixes PR40737.

Reviewers: rnk, nickdesaulniers, craig.topper, spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58283

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354163 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Move all the SSE legality checks out of FP_TO_INTHelper and up to LowerFP_TO_INT. NFCI

These checks aren't needed on the call to FP_TO_INTHelper from the type legalizer for splitting i64. We always want to use X87 FIST/FISTT to memory there.

Moving up the SSE checks will allow this routine to focus on what it cares about and makes its return semantics cleaner.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354161 91177308-0d34-0410-b5e6-96231b3b80d8

Recommit "[SystemZ] Do not emit VEXTEND or VROUND nodes without vector support."

It seems there were some problem with using a .mir test. For some reason
doing '-stop-before=codegenprepare' and then '-start-before=codegenprepare'
on the output .mir file results in the NoVRegs Property after instruction
selection.

Recommitting the same test as an .ll file instead.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354160 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeExtractor] Do not lift lifetime.end markers for region inputs

If a lifetime.end marker occurs along one path through the extraction
region, but not another, then it's still incorrect to lift the marker,
because there is some path through the extracted function which would
ordinarily not reach the marker. If the call to the extracted function
is in a loop, unrolling can cause inputs to the function to become
optimized out as undef after the first iteration.

To prevent incorrect stack slot merging in the calling function, it
should be sufficient to lift lifetime.start markers for region inputs.
I've tested this theory out by doing a stage2 check-all with randomized
splitting enabled.

This is a follow-up to r353973, and there's additional context for this
change in https://reviews.llvm.org/D57834.

rdar://47896986

Differential Revision: https://reviews.llvm.org/D58253

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@354159 91177308-0d34-0410-b5e6-96231b3b80d8