granicus.if.org Git

[Tests] Add masked.gather tests for non-constant masks + speculation possibilities

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356782 91177308-0d34-0410-b5e6-96231b3b80d8

[ConstantFolding] Fix GetConstantFoldFPValue to avoid cast overflow.

Summary:
In C++, the behavior of casting a double value that is beyond the range
of a single precision floating-point to a float value is undefined. This
change replaces such a cast with APFloat::convert to convert the value,
which is consistent with how we convert a double value to a half value.
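
The hazard described above can be illustrated outside of LLVM. The following is a minimal sketch, not the code from this change: `narrowToFloat` is a hypothetical helper that checks the range before casting, and it saturates out-of-range values to infinity rather than applying the IEEE rounding that APFloat::convert performs.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper (not part of LLVM): narrow a double to float without
// ever executing an out-of-range cast, which would be undefined behavior.
// Simplification: out-of-range magnitudes saturate to +/-infinity.
float narrowToFloat(double V) {
  if (std::isnan(V))
    return std::numeric_limits<float>::quiet_NaN();
  const double Max = std::numeric_limits<float>::max();
  if (V > Max)
    return std::numeric_limits<float>::infinity();
  if (V < -Max)
    return -std::numeric_limits<float>::infinity();
  // Here |V| <= FLT_MAX, so the cast is well defined (it may still round).
  float F = static_cast<float>(V);
  assert(std::isfinite(F));
  return F;
}
```

Every branch either returns a known float constant or casts a value that is already within the float range, so the undefined case is never reached.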

Reviewers: sanjoy

Subscribers: lebedev.ri, sanjoy, jlebar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59500

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356781 91177308-0d34-0410-b5e6-96231b3b80d8

Make clang-move use same file naming convention as other tools

In all the other clang-foo tools, the main library file is called
Foo.cpp and the file in the tool/ folder is called ClangFoo.cpp.
Do this for clang-move too.

No intended behavior change.

Differential Revision: https://reviews.llvm.org/D59700

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356780 91177308-0d34-0410-b5e6-96231b3b80d8

[tests] Add a generic masked.gather test to show sometimes we can't transform

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356779 91177308-0d34-0410-b5e6-96231b3b80d8

[tests] Add tests for converting masked.load to load speculatively

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356778 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-readobj] Revert bad changes

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356777 91177308-0d34-0410-b5e6-96231b3b80d8

[Tests] Use valid alignment in masked.gather tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356775 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r356750

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356772 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r356570

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356771 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r356662

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356770 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r356692

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356769 91177308-0d34-0410-b5e6-96231b3b80d8

InstCombineSimplifyDemanded: Allow v3 results for AMDGCN buffer and image intrinsics

This helps to avoid the situation where RA spots that only 3 of the
v4f32 result of a load are used, and immediately reallocates the 4th
register for something else, requiring a stall waiting for the load.

Differential Revision: https://reviews.llvm.org/D58906

Change-Id: I947661edfd5715f62361a02b100f14aeeada29aa

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356768 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r356753

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356767 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r356652 (and follow-up r356655)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356766 91177308-0d34-0410-b5e6-96231b3b80d8

gn build: Merge r356729

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356765 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-readobj] Separate `Symbol Version` dumpers into `LLVM style` and `GNU style`

Summary:
Currently, llvm-readobj can dump symbol version sections only in LLVM style. In this patch, I would like to separate these dumpers into GNU style and
LLVM style for future implementation.

Reviewers: grimar, jhenderson, mattd, rupprecht

Reviewed By: rupprecht

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59186

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356764 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] auto-generate complete test checks; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356763 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] auto-generate complete test checks; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356762 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] add 'nounwind' to tests to reduce noise; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356761 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] auto-generate complete checks for test; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356760 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Use three- and five-dword result type in image ops

Some image ops return three or five dwords. Previously, we modeled that
with a 4 or 8 dword register class. The register allocator could
cleverly spot that some subregs were dead and allocate something else
there, but that caused the de-optimization that waitcnt insertion would
think that the result was used immediately.

This commit allows such an image op to have a three or five dword
result, avoiding the above de-optimization.

Differential Revision: https://reviews.llvm.org/D58905

Change-Id: I3651211bbd7ed22721ee7b9fefd7bcc60a809d8b

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356757 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Implemented dwordx3 variants of buffer/tbuffer load/store intrinsics

Now we have vec3 MVTs, this commit implements dwordx3 variants of the
buffer intrinsics.

On gfx6, a dwordx3 buffer load intrinsic is implemented as a dwordx4
instruction, and a dwordx3 buffer store intrinsic is not supported.
We need to support the dwordx3 load intrinsic because it is generated by
subtarget-unaware code in InstCombine.

Differential Revision: https://reviews.llvm.org/D58904

Change-Id: I016729d8557b98a52f529638ae97c340a5922a4e

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356755 91177308-0d34-0410-b5e6-96231b3b80d8

[SLPVectorizer] Add test related to SLP Throttling support, NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356754 91177308-0d34-0410-b5e6-96231b3b80d8

[ObjectYAML] Add basic minidump generation support

Summary:
This patch adds the ability to read a yaml form of a minidump file and
write it out as binary. Apart from the minidump header and the stream
directory, only three basic stream kinds are supported:
- Text: This kind is used for streams which contain textual data. This
  is typically the contents of a /proc file on linux (e.g.
  /proc/PID/maps). In this case, we just put the raw stream contents
  into the yaml.
- SystemInfo: This stream contains various bits of information about the
  host system in binary form. We expose the data in a structured form.
- Raw: This kind is used as a fallback when we don't have any special
  knowledge about the stream. In this case, we just print the stream
  contents in hex.

For this code to be really useful, more stream kinds will need to be
added (particularly for things like lists of memory regions and loaded
modules). However, these can be added incrementally.

Reviewers: jhenderson, zturner, clayborg, aprantl

Subscribers: mgorny, lemo, llvm-commits, lldb-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59482

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356753 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis] Fix compilation before c++17.

ClusteringTest.cpp:25:23: error: constexpr variable cannot have non-literal type 'const llvm::exegesis::(anonymous namespace)::(lambda at /home/buildslave/ps4-buildslave4/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/llvm.src/unittests/tools/llvm-exegesis/ClusteringTest.cpp:25:35)'
static constexpr auto HasPoints = [](const std::vector<int> &Indices) {
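
A minimal sketch of one way to avoid this error, assuming the lambda only needs to be a constant and not a compile-time constant. The lambda body here is hypothetical, since the original is elided above, and this is not necessarily the fix that was committed.

```cpp
#include <vector>

// Before C++17 a lambda's closure type is not a literal type, so
//   static constexpr auto HasPoints = [](...) { ... };
// is rejected. Dropping constexpr keeps the declaration valid in C++11/14.
static const auto HasPoints = [](const std::vector<int> &Indices) {
  return !Indices.empty(); // hypothetical body, for illustration only
};
```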

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356748 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-exegesis] Add clustering test.

Summary: To show that dbscan is insensitive to the order of the points.

Subscribers: tschuett, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59693

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356747 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] Add coverage for --split-dwo and --output-format

Also fix up a couple of minor issues in the test being updated, where
FileCheck could match on incorrect output and fix the test case order to
match the struct order.

Reviewed by: grimar

Differential Revision: https://reviews.llvm.org/D59691

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356746 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r356738 "[llvm-objcopy] - Implement replaceSectionReferences for GroupSection class."

Seems this broke ubsan bot:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap-ubsan/builds/11760

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356745 91177308-0d34-0410-b5e6-96231b3b80d8

[RISCV] Add basic RV32E definitions and MC layer support

The RISC-V ISA defines RV32E as an alternative "base" instruction set
encoding, that differs from RV32I by having only 16 rather than 32 registers.
This patch adds basic definitions for RV32E as well as MC layer support
(assembling, disassembling) and tests. The only supported ABI on RV32E is
ILP32E.

Add a new RISCVFeatures::validate() helper to RISCVUtils which can be called
from codegen or MC layer libraries to validate the combination of TargetTriple
and FeatureBitSet. Other targets have similar checks (e.g. erroring if SPE is
enabled on PPC64 or oddspreg + o32 ABI on Mips), but they either duplicate the
checks (Mips), or fail to check for both codegen and MC codepaths (PPC).

ILP32E ABI support and RV32E codegen are left for a future patch or
patches.

Differential Revision: https://reviews.llvm.org/D59470

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356744 91177308-0d34-0410-b5e6-96231b3b80d8

[RISCV] Optimize emission of SELECT sequences

This patch optimizes the emission of a sequence of SELECTs with the same
condition, avoiding the insertion of unnecessary control flow. Such a sequence
often occurs when a SELECT of values wider than XLEN is legalized into two
SELECTs with legal types. We have identified several use cases where the
SELECTs could be interleaved with other instructions. Therefore, we extend the
sequence to include non-SELECT instructions if we are able to detect that the
non-SELECT instructions do not impact the optimization.

This patch supersedes https://reviews.llvm.org/D59096, which attempted to
address this issue by introducing a new SelectionDAG node. Hat tip to Eli
Friedman for his feedback on how to best handle this issue.

Differential Revision: https://reviews.llvm.org/D59355
Patch by Luís Marques.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356741 91177308-0d34-0410-b5e6-96231b3b80d8

[RISCV] Allow conversion of CC logic to bitwise logic

Indicates in the TargetLowering interface that conversions from CC logic to
bitwise logic are allowed. Adds tests that show the benefit when optimization
opportunities are detected. Also adds tests that show that when the optimization
is not applied correct code is generated (but opportunities for other
optimizations remain).

Differential Revision: https://reviews.llvm.org/D59596
Patch by Luís Marques.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356740 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] - Fix a st_name of the first symbol table entry.

The spec says that index 0 of the symbol table both designates the first entry in the table
and serves as the undefined symbol index, and that this entry holds zero values.
Hence the first symbol table entry has no name, so it must have st_name == 0.
(http://refspecs.linuxbase.org/elf/gabi4+/ch4.symtab.html)

Currently, we do not always emit a zero st_name for the first symbol table entry.
That happens because we add empty strings to the string table builder, which
adds a zero byte for each of them:
(https://github.com/llvm-mirror/llvm/blob/master/lib/MC/StringTableBuilder.cpp#L185)
After the string optimization is performed, the builder might return non-zero indexes for the
requested empty string.

The patch fixes this issue for the case above and other sections with no names.
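
The invariant described above can be sketched with a toy string table. `TinyStringTable` is a hypothetical class written for illustration, not LLVM's StringTableBuilder and not the code in this patch; it only shows the idea that an empty name always maps to offset 0.

```cpp
#include <cstdint>
#include <string>

// Hypothetical miniature string table. Offset 0 always holds a single zero
// byte, so an empty name resolves to offset 0 (st_name == 0).
class TinyStringTable {
  std::string Data = std::string(1, '\0');

public:
  // Returns the offset of Name inside the table.
  uint32_t add(const std::string &Name) {
    if (Name.empty())
      return 0; // never append a new zero byte for an empty name
    uint32_t Offset = static_cast<uint32_t>(Data.size());
    Data += Name;
    Data.push_back('\0');
    return Offset;
  }

  const std::string &data() const { return Data; }
};
```

With this shape, adding the empty string any number of times neither grows the table nor yields a non-zero index.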

Differential revision: https://reviews.llvm.org/D59496

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356739 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] - Implement replaceSectionReferences for GroupSection class.

Currently, llvm-objcopy incorrectly handles compression and decompression of
sections in COMDAT groups, because replaceSectionReferences is not
implemented for this section type.

This patch implements it.

Differential revision: https://reviews.llvm.org/D59638

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356738 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objcopy] Add support for *-freebsd output formats

GNU objcopy can support output formats like elf32-i386-freebsd and
elf64-x86-64-freebsd. The only difference from their regular non-freebsd
counterparts that I have observed is that the freebsd versions set the
OS/ABI field to ELFOSABI_FREEBSD. This patch sets the OS/ABI field
based on the format whenever --output-format is specified.

Reviewed by: rupprecht, grimar

Differential Revision: https://reviews.llvm.org/D59645

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356737 91177308-0d34-0410-b5e6-96231b3b80d8

[RISCV][NFC] Add test case to MC/RISCV/linker-relaxation.s showing incorrect relocations being emitted

A follow-up patch will fix this case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356736 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Added v5i32 and v5f32 register classes

They are not used by anything yet, but a subsequent commit will start
using them for image ops that return 5 dwords.

Differential Revision: https://reviews.llvm.org/D58903

Change-Id: I63e1904081e39a6d66e4eb96d51df25ad399d271

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356735 91177308-0d34-0410-b5e6-96231b3b80d8

[RISCV][NFC] Expand test/MC/RISCV/linker-relaxation.s tests

Add more complete CHECK lines for the relocations generated when relaxation is
enabled, and add cases where a locally defined symbol is referenced.

Two instances of pcrel_lo(defined_symbol) are commented out, as they will
produce an error. A follow-up patch will fix this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356734 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add 32-bit command lines with and without SSE2 to atomic-non-integer.ll. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356733 91177308-0d34-0410-b5e6-96231b3b80d8

[BPF] fix flaky btf unit test static-var-derived-type.ll

DataSecEntries is defined as an unordered_map since
order does not really matter.
std::unordered_map<std::string, std::unique_ptr<BTFKindDataSec>>
DataSecEntries;
This seems to make the test static-var-derived-type.ll flaky,
as the two sections ".bss" and ".readonly" have nondeterministic
ordering during map iteration, which determines the
order of the BTF_KIND_DATASEC entries in the output assembly.

Fix the test to have only one data section to remove
flakiness.
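
For reference, a common alternative when output order must not depend on hash-map iteration is to sort the keys before emitting. This is only a sketch of that technique with a hypothetical `sortedKeys` helper; it is not what this commit does, since the commit changes the test instead.

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: collect the keys of an unordered_map and sort them so
// that anything emitted per key comes out in a stable order.
std::vector<std::string>
sortedKeys(const std::unordered_map<std::string, int> &Map) {
  std::vector<std::string> Keys;
  Keys.reserve(Map.size());
  for (const auto &KV : Map)
    Keys.push_back(KV.first);
  std::sort(Keys.begin(), Keys.end());
  return Keys;
}
```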

Signed-off-by: Yonghong Song <yhs@fb.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356731 91177308-0d34-0410-b5e6-96231b3b80d8

[DWARF] Refactor RelocVisitor and fix computation of SHT_RELA-typed relocation entries

Summary:
getRelocatedValue may compute incorrect value for SHT_RELA-typed relocation entries.

// DWARFDataExtractor.cpp
uint64_t DWARFDataExtractor::getRelocatedValue(uint32_t Size, uint32_t *Off,
...
  // This formula is correct for REL, but may be incorrect for RELA if the value
  // stored in the location (getUnsigned(Off, Size)) is not zero.
  return getUnsigned(Off, Size) + Rel->Value;

In this patch, we

* refactor these visit* functions to include a new parameter `uint64_t A`.
  Since these visit* functions are no longer used as visitors, rename them to resolve*.
  + REL: A is used as the addend. A is the value stored in the location where the
    relocation applies: getUnsigned(Off, Size)
  + RELA: The addend encoded in RelocationRef is used, e.g. getELFAddend(R)
* and add another set of supports* functions to check if a given relocation type is handled.
  DWARFObjInMemory uses them to fail early.
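
The REL/RELA distinction above can be sketched as follows. `Reloc`, `resolveAbs64` and `relocatedValue` are hypothetical names for illustration, modelling only an absolute 64-bit relocation (S + A); they are not the functions in this patch.

```cpp
#include <cstdint>

// Hypothetical stand-in for one resolve* function: an absolute 64-bit
// relocation computes S + A, where the caller supplies the addend A.
uint64_t resolveAbs64(uint64_t S, uint64_t A) { return S + A; }

struct Reloc {
  bool IsRela;          // true for SHT_RELA, false for SHT_REL
  uint64_t SymbolValue; // S
  int64_t Addend;       // explicit addend, meaningful for RELA only
};

// REL: the addend is the value stored at the relocated location.
// RELA: the addend comes from the relocation entry; the stored value is
// ignored even when it is not zero.
uint64_t relocatedValue(const Reloc &R, uint64_t StoredValue) {
  uint64_t A = R.IsRela ? static_cast<uint64_t>(R.Addend) : StoredValue;
  return resolveAbs64(R.SymbolValue, A);
}
```

Choosing the addend at the call site is what keeps a non-zero stored value from being counted twice in the RELA case.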

Reviewers: echristo, dblaikie

Reviewed By: echristo

Subscribers: mgorny, aprantl, aheejin, fedor.sergeev, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57939

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356729 91177308-0d34-0410-b5e6-96231b3b80d8

[BPF] handle derived type properly for computing type id

Currently, the type id for a derived type is computed incorrectly.
For example,
type #1: int
type #2: ptr to #1

For a global variable "int *a", type #1 will be attributed to variable "a".
This is due to a bug which assigns the type id of the basetype of
that derived type as the derived type's type id. This happens
to "const", "volatile", "restrict", "typedef" and "pointer" types.

This patch fixed this bug, fixed existing test cases and added
a new one focusing on pointers plus other derived types.
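
The intended numbering from the example above can be sketched with a toy type table. `TypeTable` and its methods are hypothetical and unrelated to the real BTF emitter; the point is only that a derived type receives its own id and refers to its base type by id.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct TypeEntry {
  bool IsDerived;
  uint32_t BaseId; // 0 when the entry is not a derived type
};

// Hypothetical type table: ids start at 1 and each added type, derived or
// not, gets a fresh id.
class TypeTable {
  std::vector<TypeEntry> Entries; // id N lives at Entries[N - 1]

public:
  uint32_t addBase() {
    Entries.push_back({false, 0});
    return static_cast<uint32_t>(Entries.size());
  }
  uint32_t addDerived(uint32_t BaseId) {
    Entries.push_back({true, BaseId});
    return static_cast<uint32_t>(Entries.size()); // not BaseId
  }
  uint32_t baseOf(uint32_t Id) const {
    assert(Id >= 1 && Id <= Entries.size());
    return Entries[Id - 1].BaseId;
  }
};
```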

Signed-off-by: Yonghong Song <yhs@fb.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356727 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Autogenerate complete checks. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356723 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Split the neon.addp intrinsic into integer and fp variants.

This is the result of discussions on the list about how to deal with intrinsics
which require codegen to disambiguate them via only the integer/fp overloads.
It causes problems for GlobalISel as some of that information is lost during
translation, while with other operations like IR instructions the information is
encoded into the instruction opcode.

This patch changes clang to emit the new faddp intrinsic if the vector operands
to the builtin have FP element types. LLVM IR AutoUpgrade has been taught to
upgrade existing calls to aarch64.neon.addp with fp vector arguments, and
we remove the workarounds introduced for GlobalISel in r355865.

This is a more permanent solution to PR40968.

Differential Revision: https://reviews.llvm.org/D59655

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356722 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Use LoadInst->getType() instead of LoadInst->getPointerOperandType()->getElementType(). NFCI

For the future day when pointers don't have element types, we should just use the type of the load result instead.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356721 91177308-0d34-0410-b5e6-96231b3b80d8

[InstSimplify] Add tests for signed icmp of and/or; NFC

Even if a signed predicate is used, the ranges computed for and/or
are unsigned, resulting in missed simplifications.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356720 91177308-0d34-0410-b5e6-96231b3b80d8

[Object] Fix reading objects created with -fembed-bitcode-marker

Currently, this fails with many tools, e.g.

$ clang -fembed-bitcode-marker -c -o test.o test.c
$ nm test.o
nm: test.o The file was not recognized as a valid object file

-fembed-bitcode-marker creates a LLVM,bitcode section consisting of a single
byte. When reading the object file, IRObjectFile::findBitcodeInObject succeeds,
causing SymbolicFile::createSymbolicFile to try to read the "bitcode" rather
than using the outer Mach-O data, which then fails.

Fix this by making findBitcodeInObject return an error if the section size <= 1.
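
The size check can be sketched in isolation. `classifyBitcodeSection` and its enum are hypothetical names; the real function returns an error through LLVM's error-handling types, which are not reproduced here.

```cpp
#include <cstddef>

enum class BitcodeSection { Usable, MarkerOnly };

// Hypothetical classification: a section of at most one byte is treated as a
// marker left by -fembed-bitcode-marker, not as real bitcode.
BitcodeSection classifyBitcodeSection(std::size_t SectionSize) {
  return SectionSize <= 1 ? BitcodeSection::MarkerOnly
                          : BitcodeSection::Usable;
}
```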

Patched by: Nicholas Allegra

Differential Revision: https://reviews.llvm.org/D44373

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356718 91177308-0d34-0410-b5e6-96231b3b80d8

Mips: Fix typo in assert message

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356717 91177308-0d34-0410-b5e6-96231b3b80d8

Mips: Don't create copy of nothing

This was creating a copy of the register the pseudo itself was
def'ing, leaving a copy of an undefined register. I'm not sure how
the verifier is not catching this, but this avoids asserting in a
future change to RegAllocFast

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356716 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalISel: Fix RegBankSelect for REG_SEQUENCE

The AArch64 test was broken since the result register already had a
set register class, so this test was a no-op. The mapping verify call
would fail because the result size is not the same as the inputs like
in a copy or phi.

The AMDGPU testcases are half broken and introduce illegal VGPR->SGPR
copies which need much more work to handle correctly (same for phis),
but add them as a baseline.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356713 91177308-0d34-0410-b5e6-96231b3b80d8

Don't add a tail keyword to calls to ObjC runtime functions if the calls
are annotated with notail.

r356705 annotated calls to objc_retainAutoreleasedReturnValue with
notail on x86-64. This commit teaches ARC optimizer to check the notail
marker on the call before turning it into a tail call.

rdar://problem/38675807

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356707 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Update for Exynos

Fix the feature set for Exynos M4 by removing support for `+fp16fml` and fix test case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356698 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-objdump] Support arg grouping for -j and -M (e.g. llvm-objdump -sj.foo -dMreg-names-raw)

Summary:
r354375 added support for most objdump groupings, but didn't add support for -j|--sections, because that wasn't possible.
r354870 added --disassembler options, but grouping still wasn't available.
r355185 supported values for grouped options.

This just puts the three of them together. This supports -j in modes like `-s -j .foo`, `-sj .foo`, `-sj=.foo`, or `-sj.foo`, and similar for `-M`.
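
The accepted spellings can be illustrated with a small standalone expander. This is a sketch under assumptions, not how llvm-objdump parses its command line: `expandGrouped` is a hypothetical function, and normalizing to `-j=value` is a choice made only for this example.

```cpp
#include <string>
#include <vector>

// Hypothetical expander for grouped short options. Options listed in
// ValueOpts take a value, given as the rest of the group (with an optional
// leading '=') or as the next argument.
std::vector<std::string> expandGrouped(const std::vector<std::string> &Args) {
  const std::string ValueOpts = "jM";
  std::vector<std::string> Out;
  for (std::size_t I = 0; I < Args.size(); ++I) {
    const std::string &Arg = Args[I];
    // Pass through anything that is not a short-option group.
    if (Arg.size() < 2 || Arg[0] != '-' || Arg[1] == '-') {
      Out.push_back(Arg);
      continue;
    }
    for (std::size_t J = 1; J < Arg.size(); ++J) {
      char C = Arg[J];
      if (ValueOpts.find(C) == std::string::npos) {
        Out.push_back(std::string("-") + C); // plain flag
        continue;
      }
      std::string Value = Arg.substr(J + 1); // rest of the group
      if (!Value.empty() && Value[0] == '=')
        Value.erase(0, 1);
      if (Value.empty() && I + 1 < Args.size())
        Value = Args[++I]; // value supplied as the next argument
      Out.push_back(std::string("-") + C + "=" + Value);
      break; // a value option ends the group
    }
  }
  return Out;
}
```

Under these assumptions, `-sj.foo`, `-sj=.foo`, `-sj .foo` and `-s -j .foo` all expand to the same pair of options.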

Reviewers: ormris, jhenderson, ikudrin

Reviewed By: jhenderson, ikudrin

Subscribers: javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59618

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356697 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] canonicalizeBitSelect - don't attempt to canonicalize mask registers

We don't use X86ISD::ANDNP for mask registers.

Test case from @craig.topper (Craig Topper)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356696 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-pdbutil] Add -type-ref-stats to help find unused type info

Summary:
This considers module symbol streams and the global symbol stream to be
roots. Most types that this considers "unreferenced" are referenced by
LF_UDT_MOD_SRC_LINE id records, which VC seems to always include.
Essentially, they are types that the user can only find in the debugger
if they call them by name; they cannot be found by traversing a symbol.

In practice, around 80% of type information in a PDB is referenced by a
symbol. That seems like a reasonable number.

I don't really plan to do anything with this tool. It mostly just exists
for informational purposes, and to confirm that we probably don't need
to implement type reference tracking in LLD. We can continue to merge
all types as we do today without wasting space.

Reviewers: zturner, aganea

Subscribers: mgorny, hiraditya, arphaman, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59620

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356692 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] add tests with movmsk potential (PR39665); NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356691 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Don't transform ((C1 OP zext(X)) & C2) -> zext((C1 OP X) & C2) if either zext or OP has another use.

If they have other users we'll just end up increasing the instruction count.

We might be able to weaken this to only one of them having a single use if we can prove that the and will be removed.
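
The equivalence behind the transform in the title can be checked by brute force for one concrete case. This sketch assumes OP is an add, an 8-bit X zero-extended to 32 bits, and a mask C2 that fits in 8 bits; the exact preconditions InstCombine uses are not stated here, and `narrowedAddAndMatches` is a hypothetical name.

```cpp
#include <cstdint>

// Checks ((C1 + zext(X)) & C2) == zext((trunc(C1) + X) & trunc(C2)) for every
// 8-bit X. Returns false if C2 does not fit the assumed narrow width.
bool narrowedAddAndMatches(uint32_t C1, uint32_t C2) {
  if (C2 > 0xFF)
    return false; // outside the assumption of this sketch
  for (uint32_t I = 0; I <= 0xFF; ++I) {
    uint8_t X = static_cast<uint8_t>(I);
    // Wide form: the add and the mask are done in 32 bits.
    uint32_t Wide = (C1 + static_cast<uint32_t>(X)) & C2;
    // Narrow form: add and mask in 8 bits, then zero-extend.
    uint8_t Sum = static_cast<uint8_t>(static_cast<uint8_t>(C1) + X);
    uint8_t Narrow = static_cast<uint8_t>(Sum & static_cast<uint8_t>(C2));
    if (Wide != static_cast<uint32_t>(Narrow))
      return false;
  }
  return true;
}
```

Both forms compute the same value, so the rewrite pays off only when the wide zext and OP become dead afterwards, which is why other uses block it.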

Fixes PR41164.

Differential Revision: https://reviews.llvm.org/D59630

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356690 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Don't avoid folding multiple use sign extended 8-bit immediate into instructions under optsize.

Under optsize we try to avoid folding immediates with multiple uses into instructions. But if the immediate is 16 bits or 32 bits wide and can be encoded as a sign-extended 8-bit immediate, we don't save enough by disabling the folding unless the immediate has enough uses to make up for the size of the move, which is either 3 bytes or 5 bytes since there are no sign-extended 8-bit moves. We would also save something if the immediate were live out of the basic block and thus a move were unavoidable, but that would require a more advanced heuristic than just counting uses.

Note we only avoid folding multiple use immediates into the patterns that use X86ISD::ADD/SUB/XOR/OR/AND/CMP/ADC/SBB nodes and not the more common ISD::ADD/SUB/XOR/OR/AND nodes.

Differential Revision: https://reviews.llvm.org/D59522

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356688 91177308-0d34-0410-b5e6-96231b3b80d8

[ScalarizeMaskedMemIntrin] Add support for scalarizing expandload and compressstore intrinsics.

This adds support for scalarizing these intrinsics as well as the X86TargetTransformInfo support to avoid scalarizing them in the cases X86 can handle.

I've omitted handling special cases for constant masks for this first pass, though CodeGenPrepare can constant fold the branch conditions and remove some of the control flow anyway.

Fixes PR40994 and covers most of PR3666. Might want to implement constant masks to close that.

Differential Revision: https://reviews.llvm.org/D59180

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356687 91177308-0d34-0410-b5e6-96231b3b80d8

[ValueTracking] Use ConstantRange based overflow check for signed sub

This is D59450, but for signed sub. This case is not NFC, because
the overflow logic in ConstantRange is more powerful than the existing
check. This resolves the TODO in the function.

I've added two tests to show that this indeed catches more cases than
the previous logic, but the main correctness test coverage here is in
the existing ConstantRange unit tests.
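
The idea of a range-based overflow check can be sketched for 32-bit values. This is a simplification under assumptions: `SignedRange` is a hypothetical inclusive, non-wrapping range, whereas LLVM's ConstantRange also handles wrapped ranges and arbitrary bit widths.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical inclusive signed range with Lo <= Hi (no wrapping).
struct SignedRange {
  int32_t Lo, Hi;
};

enum class Overflow { Never, Always, Maybe };

// Classifies A - B by computing the extreme differences in 64 bits.
Overflow signedSubOverflow(SignedRange A, SignedRange B) {
  assert(A.Lo <= A.Hi && B.Lo <= B.Hi);
  const int64_t Min = std::numeric_limits<int32_t>::min();
  const int64_t Max = std::numeric_limits<int32_t>::max();
  int64_t Smallest = static_cast<int64_t>(A.Lo) - B.Hi;
  int64_t Largest = static_cast<int64_t>(A.Hi) - B.Lo;
  if (Smallest >= Min && Largest <= Max)
    return Overflow::Never; // every difference is representable
  if (Smallest > Max || Largest < Min)
    return Overflow::Always; // every difference is out of range
  return Overflow::Maybe;
}
```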

Differential Revision: https://reviews.llvm.org/D59617

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356685 91177308-0d34-0410-b5e6-96231b3b80d8

Add more rotate tests, including ORs of rotates

This is a part of https://reviews.llvm.org/D47735.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356683 91177308-0d34-0410-b5e6-96231b3b80d8

Fixup opt-remarks.ll gold plugin test. NFC

Now that rL356594 has added a TailCallElim pass to LTO, the call gets marked as
tail.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356669 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombiner] Use getTokenFactor in a few more cases.

SDNodes can only have 64k operands and for some inputs (e.g. large
number of stores), we can reach this limit when creating TokenFactor
nodes. This patch is a follow up to D56740 and updates a few more places
that potentially can create TokenFactors with too many operands.

Reviewers: efriedma, craig.topper, aemerson, RKSimon

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D59156

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356668 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombine] SimplifySelectCC - call FoldSetCC with the setcc result type

We were calling FoldSetCC with the compare operand type instead of the result type.

Found by OSS-Fuzz #13838 (https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13838)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356667 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeGenPrepare] limit formation of overflow intrinsics (PR41129)

This is probably a bigger limitation than necessary, but since we don't have any evidence yet
that this transform led to real-world perf improvements rather than regressions, I'm making a
quick, blunt fix.

In the motivating x86 example from:
https://bugs.llvm.org/show_bug.cgi?id=41129
...and shown in the regression test, we want to avoid an extra instruction in the dominating
block because that could be costly.

The x86 LSR test diff is reversing the changes from D57789. There's no evidence that 1 version
is any better than the other yet.

Differential Revision: https://reviews.llvm.org/D59602

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356665 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-readobj] Format codes. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356664 91177308-0d34-0410-b5e6-96231b3b80d8

[Thumb] Fix infinite loop in ABS expansion (PR41160)

Don't expand an ISD::ABS node if it's legal.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356661 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Support for v3i32/v3f32

Added support for dwordx3 for most load/store types, but not DS, and not
intrinsics yet.

SI (gfx6) does not have dwordx3 instructions, so they are not enabled
there.

Some of this patch is from Matt Arsenault, also of AMD.

Differential Revision: https://reviews.llvm.org/D58902

Change-Id: I913ef54f1433a7149da8d72f4af54dbb13436bd9

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356659 91177308-0d34-0410-b5e6-96231b3b80d8

Fix -Wmisleading-indentation gcc7 warning. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356658 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Allow -mattr=tpidr-el[1|2|3]

Added subtarget features for AArch64 to use TPIDR_EL[1|2|3] as the TLS base
register, rather than the default TPIDR_EL0.

Patch by Philip Derrin!

Differential revision: https://reviews.llvm.org/D54685

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356657 91177308-0d34-0410-b5e6-96231b3b80d8

[SelectionDAG] Add scalarization of ABS node (PR41149)

Patch by: @ikulagin (Ivan Kulagin)

Differential Revision: https://reviews.llvm.org/D59577

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356656 91177308-0d34-0410-b5e6-96231b3b80d8

Fix two more issues with r356652

The first problem was a use-after-free in the tests (detected by asan
bots). The temporary array created for the "create" call is guaranteed
to live only until the end of the statement. The fix there is to store
the test data in a local variable to ensure it has the right lifetime.

The second issue is broken BUILD_SHARED_LIBS build, which I fix by
adding the appropriate BinaryFormat dependency to the Object unit tests.
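
The lifetime problem in the first issue can be sketched with a non-owning view. `ByteView`, `Parsed` and `create` are hypothetical stand-ins written for this example; they are not the types used in the unit tests.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical non-owning view, similar in spirit to llvm::ArrayRef.
struct ByteView {
  const uint8_t *Data;
  std::size_t Size;
};

// Hypothetical object that keeps the view it was created from.
struct Parsed {
  ByteView Bytes;
  unsigned sum() const {
    unsigned S = 0;
    for (std::size_t I = 0; I < Bytes.Size; ++I)
      S += Bytes.Data[I];
    return S;
  }
};

Parsed create(ByteView V) { return Parsed{V}; }

// Dangling pattern (do not do this): a view built from a temporary array
// passed directly to create() refers to storage that is destroyed at the end
// of that statement, so any later use of the Parsed object reads freed memory.
unsigned safeUse() {
  // The data lives in a named local, so it outlives the view held by P.
  const uint8_t Data[] = {1, 2, 3};
  Parsed P = create(ByteView{Data, sizeof(Data)});
  return P.sum();
}
```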

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356655 91177308-0d34-0410-b5e6-96231b3b80d8

[RISCV][NFC] Remove old CHECK lines from linker-relaxation.s test

The RELOC: check lines are no longer used.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356654 91177308-0d34-0410-b5e6-96231b3b80d8

Attempt to fix modules build for r356652

The commit added a new .def file. This adds it to the list of textual
headers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356653 91177308-0d34-0410-b5e6-96231b3b80d8

[Object] Add basic minidump support

Summary:
This patch adds basic support for reading minidump files. It contains
the definitions of various important minidump data structures (header,
stream directory), and of one minidump stream (SystemInfo). The ability
to read other streams will be added in follow-up patches. However, all
streams can be read even now as raw data, which means lldb's minidump
support (where this code is taken from) can be immediately rebased on
top of this patch as soon as it lands.

As we don't have any support for generating minidump files (yet), this
tests the code via unit tests with some small handcrafted binaries in
the form of c char arrays.

Reviewers: Bigcheese, jhenderson, zturner

Subscribers: srhines, dschuff, mgorny, fedor.sergeev, lemo, clayborg, JDevlieghere, aprantl, lldb-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59291

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356652 91177308-0d34-0410-b5e6-96231b3b80d8

[BasicAA] Use DenseMap::try_emplace after D59151. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356651 91177308-0d34-0410-b5e6-96231b3b80d8

Silence warning about unused variable in builds without asserts [NFC]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356648 91177308-0d34-0410-b5e6-96231b3b80d8

[ScalarizeMaskedMemIntrinsics] Reverse some if conditions to reduce indentation and remove curly braces.

Pre-commit for D59180

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356646 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Add test case for PR41164. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356645 91177308-0d34-0410-b5e6-96231b3b80d8

[BasicAA] Reduce number of map searches [NFCI].

Summary:
This is a refactoring patch.
- Reduce the number of map searches by reusing the iterator.
- Add asserts to check that the entry is in the cache, as this is something BasicAA relies on to avoid infinite recursion.
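
A minimal sketch of the iterator-reuse pattern, assuming a plain
std::unordered_map as a stand-in for the BasicAA cache. The names
cachedLength and expensiveLength are hypothetical. The insertion returns
an iterator whether or not the key was already present, so the hit path
and the fill path both avoid a second search.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

// Counts how often the expensive computation actually runs.
static int ComputeCount = 0;

static int expensiveLength(const std::string &Key) {
  ++ComputeCount;
  return static_cast<int>(Key.size());
}

// One search per call: insert() either finds the existing entry or adds a
// placeholder, and hands back an iterator in both cases.
int cachedLength(std::unordered_map<std::string, int> &Cache,
                 const std::string &Key) {
  auto Result = Cache.insert(std::make_pair(Key, 0));
  auto It = Result.first;
  if (!Result.second)
    return It->second; // Cache hit: no separate find() needed.

  // Fill the placeholder through the same iterator.
  It->second = expensiveLength(Key);
  // Assert-only check that the entry really is in the cache.
  assert(Cache.count(Key) == 1 && "entry must be in the cache");
  return It->second;
}
```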

Reviewers: chandlerc, aschwaighofer

Subscribers: sanjoy, jlebar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59151

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356644 91177308-0d34-0410-b5e6-96231b3b80d8

[instcombine] Add some todos, and arrange code for readability

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356642 91177308-0d34-0410-b5e6-96231b3b80d8

[MSSA] Delete move ctor; remove dynamic never-moved verification

Code archaeology in D59315 revealed that MSSA should never be moved.
Rather than trying to check dynamically that this hasn't happened in the
verify() functions of Walkers, it's likely best to just delete its move
constructor.

Since all these verify() functions did is check that MSSA hasn't moved,
this allows us to remove these verify functions.
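
A minimal sketch of the idea, using hypothetical Analysis and Walker
classes rather than the real MemorySSA types. Once the move constructor is
deleted, the "never moved" property is enforced at compile time, so a
walker that holds a pointer to the analysis needs no runtime check.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical analysis object; stands in for a class such as MemorySSA.
class Analysis {
public:
  Analysis() = default;
  // Any attempt to move the object is now a compile error.
  Analysis(Analysis &&) = delete;
  int answer() const { return 42; }
};

// Hypothetical walker that keeps a pointer back to its analysis.
class Walker {
  const Analysis *A;

public:
  explicit Walker(const Analysis &An) : A(&An) {}
  int query() const { return A->answer(); }
};

static_assert(!std::is_move_constructible<Analysis>::value,
              "Analysis must stay at a fixed address");
// Declaring a move constructor also suppresses the implicit copy
// constructor, so the object cannot be relocated by copying either.
static_assert(!std::is_copy_constructible<Analysis>::value,
              "Analysis must not be copyable");
```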

I can re-add the verification checks if someone's super concerned about
us trying to `memcpy` MemorySSA or something somewhere, but I imagine we
have other problems if we're trying anything like that...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356641 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add CMPXCHG8B feature flag. Set it for all CPUs except i386/i486 including 'generic'. Disable use of CMPXCHG8B when this flag isn't set.

CMPXCHG8B was introduced in the i586/Pentium generation.

If it's not enabled, limit the atomic width to 32 bits so the AtomicExpandPass will expand to lib calls. It is unclear whether we should be using a different limit for other configs. The default is 1024, and experimentation shows that using an i256 atomic will cause a crash in SelectionDAG.

Differential Revision: https://reviews.llvm.org/D59576

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356631 91177308-0d34-0410-b5e6-96231b3b80d8

Fix Mach-O bind and rebase validation errors in libObject

Summary:
llvm-objdump (via libObject) validates DYLD_INFO rebase and bind
entries against the basic structure found in the Mach-O file before
evaluating the contents of those entries. Certain malformed Mach-Os can
defeat the validation check and force llvm-objdump (libObject) to crash.

The previous logic verified that a rebase or bind started in a valid
Mach-O section, but did not verify that the section wholly contained the
fixup. It also generally allowed rebases or binds to start immediately
after a valid section even if that range was not itself part of a valid
section. Finally, bind and rebase opcodes that indicate more than one
fixup (apply N times...) were not completely validated: only the first
and final fixups were checked.

The previous logic also rejected certain binaries as false positives.
Some bind and rebase opcodes can modify the state machine such that the
next bind or rebase will fail. libObject will reject these opcodes as
invalid in order to be helpful and print an error message associated
with the instruction that caused the problem, even though the binary is
not actually illegal until it consumes the invalid state in the state
machine. In other words, libObject may reject a Mach-O binary that
Apple's dynamic linker may consider legal. The original version of
macho-rebase-add-addr-uleb-too-big is an example of such a binary.

I have replaced the existing checkSegAndOffset and checkCountAndSkip
functions with a single function, checkSegAndOffsets, which validates
all of the fixups realized by a DYLD_INFO opcode. checkSegAndOffsets
verifies that a Mach-O section fully contains each fixup. Every fixup
realized by an opcode is validated, and some (but not all!)
inconsistencies in the state machine are allowed until a fixup is
realized. This means that libObject may fail on an opcode that realizes
a fixup, not on the opcode that introduced the arithmetic error.
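
The containment rule can be sketched roughly as follows. This is not the
libObject code: SectionRange, sectionContains and checkAllFixups are
hypothetical names, and the stride of PointerSize + Skip between repeated
fixups is an assumption made for the illustration. It shows every fixup of
a repeated opcode being checked, with each one required to fit entirely
inside a single section.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical section description covering [Start, Start + Size).
struct SectionRange {
  std::uint64_t Start;
  std::uint64_t Size;
};

// True when [Addr, Addr + Width) lies entirely inside one section.
static bool sectionContains(const std::vector<SectionRange> &Sections,
                            std::uint64_t Addr, std::uint64_t Width) {
  for (const SectionRange &S : Sections) {
    if (Addr < S.Start)
      continue;
    std::uint64_t Off = Addr - S.Start;
    // Overflow-safe form of Off + Width <= S.Size.
    if (Off <= S.Size && Width <= S.Size - Off)
      return true;
  }
  return false;
}

// Checks each of the Count fixups, not just the first and the last.
bool checkAllFixups(const std::vector<SectionRange> &Sections,
                    std::uint64_t Addr, std::uint64_t Count,
                    std::uint64_t Skip, std::uint64_t PointerSize) {
  const std::uint64_t Max = std::numeric_limits<std::uint64_t>::max();
  for (std::uint64_t I = 0; I < Count; ++I) {
    if (!sectionContains(Sections, Addr, PointerSize))
      return false;
    if (I + 1 == Count)
      break; // No further fixup, so no need to advance.
    if (Skip > Max - PointerSize)
      return false; // Stride would overflow.
    std::uint64_t Step = PointerSize + Skip;
    if (Addr > Max - Step)
      return false; // Next address would overflow.
    Addr += Step;
  }
  return true;
}
```

With sections at 0x1000 (size 0x20) and 0x2000 (size 0x10), a run of three
fixups starting at 0x1018 with a skip of 0x7EC has a valid first and last
fixup but a middle one at 0x180C that falls between sections, so the
sketch rejects it.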

Existing test cases have been modified to reflect the changes in error
messages returned by libObject. What's more, the test case for
macho-rebase-add-addr-uleb-too-big has been modified so that it actually
triggers the error condition; the new code in libObject considers the
original test binary "legal".

rdar://47797757

Reviewers: lhames, pete, ab

Reviewed By: pete

Subscribers: rupprecht, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59574

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356629 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly][NFC] Fix formatting error from rL356610

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356622 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Do not generate spurious PAL metadata

My previous fix rL356591 "[AMDGPU] Added MsgPack format PAL metadata"
accidentally caused a spurious PAL metadata .note record to be emitted
for any AMDGPU output. That caused failures in the lld test
amdgpu-relocs.s. Fixed.

Differential Revision: https://reviews.llvm.org/D59613

Change-Id: Ie04a2aaae890dcd490f22c89edf9913a77ce070e

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356621 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Add additional sub nsw inference tests; NFC

nsw can be determined based on known bits here, but currently
isn't.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356620 91177308-0d34-0410-b5e6-96231b3b80d8

Allow machine dce to remove uses in the same instruction

Machine DCE cannot remove a dead definition if there are non-dbg uses.
A use however can be in the same instruction:

dead %0 = INST %0

Such instructions are sometimes created by the Detect dead lanes pass.

Allow this instruction to be deleted despite the use if the only use
belongs to the same instruction.
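
The relaxed condition can be sketched as follows. The Use record and
isDeadDefRemovable are hypothetical stand-ins, not the MachineRegisterInfo
API. The sketch only shows the rule: a non-debug use blocks removal unless
it sits in the defining instruction itself.

```cpp
#include <cassert>
#include <vector>

// Hypothetical record of one use of a register.
struct Use {
  int InstrId;  // Instruction that contains the use.
  bool IsDebug; // True for debug-only uses, which never block removal.
};

// A dead definition in instruction DefInstrId may be removed when every
// non-debug use belongs to that same instruction.
bool isDeadDefRemovable(int DefInstrId, const std::vector<Use> &Uses) {
  for (const Use &U : Uses) {
    if (U.IsDebug)
      continue;
    if (U.InstrId != DefInstrId)
      return false;
  }
  return true;
}
```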

Differential Revision: https://reviews.llvm.org/D59565

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356619 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Call lowerShuffleAsBitMask for 512-bit vectors in lowerShuffleAsBlend.

This patch enables the use of lowerShuffleAsBitMask for 512-bit blends before
falling back to move immediate, GPR to k-register, and masked op.

I had to make some changes to support v8i64 when i64 is not a legal type, and
to support floating point types.

This trades a load for the move immediate and GPR move which is higher latency.
But it's probably better for register pressure not having to hop through other
register classes. The load+and should play better with LICM and
rematerialization I think.

Differential Revision: https://reviews.llvm.org/D59479

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356618 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Fix dependency on `BinaryFormat`

Summary: - The linking is broken when this library is built as a shared one.

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59610

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356617 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Don't look for constant in insert/extract_vector_elt regbankselect

The constantness shouldn't change the register bank choice. We also
don't need to restrict this to only indexing VGPRs, since it's
possible to index SGPRs (but SelectionDAG made using this
difficult). Allow directly indexing SGPRs when appropriate.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356611 91177308-0d34-0410-b5e6-96231b3b80d8

[WebAssembly] Target features section

Summary:
Implements a new target features section in assembly and object files
that records what features are used, required, and disallowed in
WebAssembly objects. The linker uses this information to ensure that
all objects participating in a link are feature-compatible and records
the set of used features in the output binary for use by optimizers
and other tools later in the toolchain.

The "atomics" feature is always required or disallowed to prevent
linking code with stripped atomics into multithreaded binaries. Other
features are marked used if they are enabled globally or on any
function in a module.

Future CLs will add linker flags for ignoring feature compatibility
checks and for specifying the set of allowed features, implement using
the presence of the "atomics" feature to control the type of memory
and segments in the linked binary, and add front-end flags for
relaxing the linkage policy for atomics.

Reviewers: aheejin, sbc100, dschuff

Subscribers: jgravelle-google, hiraditya, sunfish, mgrang, jfb, jdoerfert, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59173

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356610 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Fix clamp bit DAG operand

Summary:
- Should use `targetconstant` instead of `constant` operand for clamp
  bit, which is expected as an immediate operand. Under certain
  conditions, such as when a common `i1 false` constant is used
  elsewhere and selected before the instruction with the clamp bit, a
  register operand may be added instead of an immediate one. Use
  `targetconstant` to enforce that.

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59608

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356608 91177308-0d34-0410-b5e6-96231b3b80d8

[ARC] Add ARCOptAddrMode pass to generate postincrement loads/stores.

Build on newly introduced ARC postincrement loads/stores from r356200.

Patch By Denis Antrushin! <denis@synopsys.com>

Differential Revision: https://reviews.llvm.org/D59409

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356606 91177308-0d34-0410-b5e6-96231b3b80d8

[AArch64] Fix formatting (NFC)

Indent macro instances properly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356604 91177308-0d34-0410-b5e6-96231b3b80d8

AMDHSA: Fix COMPUTE_PGM_RSRC2.USER_SGPR calculation when parsing ISA assembly

It must match https://llvm.org/docs/AMDGPUUsage.html#initial-kernel-execution-state

Differential Revision: https://reviews.llvm.org/D59570

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356603 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Eliminate redundant "mov rN, sp" instructions in Thumb1.

This takes sequences like "mov r4, sp; str r0, [r4]", and optimizes them
to something like "str r0, [sp]".

For regular stack variables, this optimization was already implemented:
we lower loads and stores using frame indexes, which are expanded later.
However, when constructing a call frame for a call with more than four
arguments, the existing optimization doesn't apply. We need to use
stores which are actually relative to the current value of sp, and don't
have an associated frame index.

This patch adds a special case to handle that construct. At the DAG
level, this is an ISD::STORE where the address is a CopyFromReg from SP
(plus a small constant offset).

This applies only to Thumb1: in Thumb2 or ARM mode, a regular store
instruction can access SP directly, so the COPY gets eliminated by
existing code.

The change to ARMDAGToDAGISel::SelectThumbAddrModeSP is a related
cleanup: we shouldn't pretend that it can select anything other than
frame indexes.

Differential Revision: https://reviews.llvm.org/D59568

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356601 91177308-0d34-0410-b5e6-96231b3b80d8

[Linker] Fix crash handling appending linkage

Summary:
When linking two llvm.used arrays, if the resulting merged
array ends up with duplicated elements (with the same name) but with
different types, the IRLinker was crashing. This was supposed to be
legal, as the IRLinker bitcasts elements to match types in these
situations.

This bug was exposed by D56928 in clang to support attribute used
in member functions of class templates. Crash happened when self-hosting
with LTO. Since LLVM depends on attribute used to generate code
for the dump() method, ubiquitous in the code base, many input bc
had a definition of this method referenced in their llvm.used array.
Some of these classes got optimized, changing the type of the first
parameter (this) in the dump method, leading to a scenario with a
pool of valid definitions but some with a different type, triggering
this bug.

This is a memory bug: ValueMapper depends on (calls) the materializer
provided by IRLinker, and this materializer was freely calling RAUW
methods whenever a global definition was updated in the temporary merged
output file. However, replaceAllUsesWith may or may not destroy
constants that use this global. If the linked definition has a type
mismatch regarding the new def and the old def, the materializer would
bitcast the old type to the new type and the elements of the llvm.used
array, which already uses bitcast to i8*, would end up with elements
cascading two bitcasts. RAUW would then indirectly call the
constantfolder to update the constant to the new ref, which would,
instead of updating the constant, destroy it to be able to create
a new constant that folds the two bitcasts into one. The problem is that
ValueMapper works with pointers to the same constants that may be
getting destroyed by RAUW. Obviously, RAUW can update references in the
Module so they no longer use the old destroyed constant, but it can't update
ValueMapper's internal pointers to these constants, which are now
invalid.

The approach here is to move the task of RAUWing old definitions
outside of the materializer.

Test Plan:
Added LIT test case, tested clang self-hosting with D56928 and
verified it works.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D59552

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356597 91177308-0d34-0410-b5e6-96231b3b80d8

[NFC] Fix brace indentation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356596 91177308-0d34-0410-b5e6-96231b3b80d8

Resubmit r356511 "[TailCallElim] Add tailcall elimination pass to LTO pipelines"

Failing LLD tests have been fixed in r356593.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356594 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Added MsgPack format PAL metadata

Summary:
PAL metadata now supports both the old linear reg=val pairs format and
the new MsgPack format.

The MsgPack format uses YAML as its textual representation. On output to
YAML, a mnemonic name is provided for some hardware registers.

Differential Revision: https://reviews.llvm.org/D57028

Change-Id: I2bbaabaaca4b3574f7e03b80fbef7c7a69d06a94

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356591 91177308-0d34-0410-b5e6-96231b3b80d8