granicus.if.org Git

[AMDGPU] Fix for issue in alloca to vector promotion pass

Summary:
Alloca promotion pass not dealing with non-canonical input

Added some additional checks so the pass simply backs-off forms it can't deal with (non-canonical)

Also added some test cases in non-canonical form to check that it no longer crashes

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D31710

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305079 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Custom machine-scheduler. NFCI.

This patch creates a customised machine-scheduler for ARM targets,
so that subsequently DAG mutations etc can be added.
Reviewed by: hahn, rengolin, rovka.
Differential Revision: https://reviews.llvm.org/D34039

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305078 91177308-0d34-0410-b5e6-96231b3b80d8

[MC] Fix compiler crash in AsmParser::Lex

When an empty comment is present in an assembly file, the compiler will crash because it checks the first character for '\n' or '\r'.
The fix consists of also checking if the string is empty before accessing the *front* method of the StringRef.
A test is included for the x86 target, but this issue is reproducible with other targets as well.

Patch by Alexandru Guduleasa!

Reviewers: niravd, grosbach, llvm-commits

Reviewed By: niravd

Differential Revision: https://reviews.llvm.org/D33993

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305077 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Add LLVM header to HexagonPatterns.td

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305074 91177308-0d34-0410-b5e6-96231b3b80d8

[XRay] Fix computation of function size subject to XRay threshold

Summary:
Currently XRay compares its threshold against `Function::size()` . However, `Function::size()` returns the number of basic blocks (as I understand, such as cycle bodies, if/else bodies, switch-case bodies, etc.), rather than the number of instructions.

The name of the parameter `-fxray-instruction-threshold=N`, as well as XRay documentation at http://llvm.org/docs/XRay.html , suggests that instructions should be counted, rather than the number of basic blocks.

I see two options:
1. Count the number of MachineInstr`s in MachineFunction : this gives better estimate for the number of assembly instructions on the target. So a user can check in disassembly that the threshold works more or less correctly.
2. Count the number of Instruction`s in a Function : AFAIK, this gives correct number of IR instructions, which the user can check in IR listing. However, this number may be far (several times for small functions) from the number of assembly instructions finally emitted.

Option 1 is implemented in this patch because I think that having the closer estimate for the number of assembly instructions emitted is more important than to have a clear definition of the metric.

Reviewers: dberris, rengolin

Reviewed By: dberris

Subscribers: llvm-commits, iid_iunknown

Differential Revision: https://reviews.llvm.org/D34027

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305072 91177308-0d34-0410-b5e6-96231b3b80d8

Prevent RemoveDeadNodes from deleted already deleted node.

This prevents against assertion errors like PR32659 which occur from a
replacement deleting a node after it's been added to the list argument
of RemoveDeadNodes. The specific failure from PR32659 does not
currently happen, but it is still potentially possible. The underlying
cause is that the callers of the change dfunction builds up a list of
nodes to delete after having moved their uses and it possible that a
move of a later node will cause a previously deleted nodes to be
deleted.

Reviewers: bkramer, spatel, davide

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33731

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305070 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Add scheduling info for VFMS

The scalar VFMS instructions did not have scheduling information attached (but
VFMA did), which was causing assertion failures with the Cortex-A57 scheduling
model and -fp-contract=fast.

Differential Revision: https://reviews.llvm.org/D34040

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305064 91177308-0d34-0410-b5e6-96231b3b80d8

llvm/test/DebugInfo/PDB/pdbdump-debug-subsections.test: Try to unbreak r305043.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305063 91177308-0d34-0410-b5e6-96231b3b80d8

Test commit: remove whitespace

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305059 91177308-0d34-0410-b5e6-96231b3b80d8

bugpoint: disabling symbolication of bugpoint-executed programs

Initial implementation - needs similar work/testing for other tools
bugpoint invokes (llc, lli I think, maybe more).

Alternatively (as suggested by chandlerc@) an environment variable could
be used. This would allow the option to pass transparently through user
scripts, pass to compilers if they happened to be LLVM-ish, etc.

I worry a bit about using cl::opt in the crash handling code - LLVM
might crash early, perhaps before the cl::opt is properly initialized?
Or at least before arguments have been parsed?

- should be OK since it defaults to "pretty", so if the crash is very
early in opt parsing, etc, then crash reports will still be symbolized.

I shyed away from doing this with an environment variable when I
realized that would require copying the existing environment and
appending the env variable of interest. But it seems there's no existing
LLVM API for accessing the environment (even the Support tests for
process launching have their own ifdefs for getting the environment). It
could be added, but seemed like a higher bar/untested codepath to
actually add environment variables.

Most importantly, this reduces the runtime of test/BugPoint/metadata.ll
in a split-dwarf Debug build from 1m34s to 6.5s by avoiding a lot of
symbolication. (this wasn't a problem for non-split-dwarf builds only
because the executable was too large to map into memory (due to bugpoint
setting a 400MB memory (including address space - not sure why? Going to
remove that) limit on the child process) so symbolication would fail
fast & wouldn't spend all that time parsing DWARF, etc)

Reviewers: chandlerc, dannyb

Differential Revision: https://reviews.llvm.org/D33804

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305056 91177308-0d34-0410-b5e6-96231b3b80d8

[IndVars] Add an option to be able to disable LFTR

This change adds an option disable-lftr to be able to disable Linear Function Test Replace optimization.
By default option is off so current behavior is not changed.

Reviewers: reames, sanjoy, wmi, andreadb, apilipenko
Reviewed By: sanjoy
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33979

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305055 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopVectorize] Don't preserve nsw/nuw flags on shrunken ops.

If we're shrinking a binary operation, it may be the case that the new
operations wraps where the old didn't. If this happens, the behavior
should be well-defined. So, we can't always carry wrapping flags with us
when we shrink operations.

If we do, we get incorrect optimizations in cases like:

void foo(const unsigned char *from, unsigned char *to, int n) {
  for (int i = 0; i < n; i++)
    to[i] = from[i] - 128;
}

which gets optimized to:

void foo(const unsigned char *from, unsigned char *to, int n) {
  for (int i = 0; i < n; i++)
    to[i] = from[i] | 128;
}

Because:
- InstCombine turned `sub i32 %from.i, 128` into
  `add nuw nsw i32 %from.i, 128`.
- LoopVectorize vectorized the add to be `add nuw nsw <16 x i8>` with a
  vector full of `i8 128`s
- InstCombine took advantage of the fact that the newly-shrunken add
  "couldn't wrap", and changed the `add` to an `or`.

InstCombine seems happy to figure out whether we can add nuw/nsw on its
own, so I just decided to drop the flags. There are already a number of
places in LoopVectorize where we rely on InstCombine to clean up.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305053 91177308-0d34-0410-b5e6-96231b3b80d8

Inliner: Don't touch indirect calls

Other comments/implications are that this isn't intended behavior (nor
perserved/reimplemented in the new inliner) & complicates fixing the
'inlining' of trivially dead calls without consulting the cost function
first.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305052 91177308-0d34-0410-b5e6-96231b3b80d8

Fix -Wunused-variable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305051 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-pdbdump] Fix -Wpessimizing-move warnings.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305050 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] Pass a proper context instruction to all of the calls into InstSimplify

Summary: This matches the behavior we already had for compares and makes us consistent everywhere.

Reviewers: dberlin, hfinkel, spatel

Reviewed By: dberlin

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33604

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305049 91177308-0d34-0410-b5e6-96231b3b80d8

[codeview] use 32-bit integer for RelocOffset in DebugLinesSubsection

Summary:
RelocOffset is a 32-bit value, but we previously truncated it to 16 bits.

Fixes PR33335.

Reviewers: zturner, hiraditya!

Reviewed By: zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D33968

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305043 91177308-0d34-0410-b5e6-96231b3b80d8

[pdb] Don't crash on unknown debug subsections.

More and more unknown debug subsection kinds are being discovered
so we should make it possible to dump these and display the
bytes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305041 91177308-0d34-0410-b5e6-96231b3b80d8

sink DebugCompressionType into MC for exposing to clang

This is a preparatory change to expose the debug compression style to
clang. It requires exposing the enumeration and passing the actual
value through to the backend from the frontend in actual value form
rather than a boolean that selects the GNU style of debug info
compression.

Minor tweak to the ELF Object Writer to use a variable for re-used
values. Add an assertion that debug information format is one of the
two currently known types if debug information is being compressed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305038 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeView] Support remaining debug subsection types

This adds support for Symbols, StringTable, and FrameData subsection
types.  Even though these subsections rarely if ever appear in a PDB
file (they are usually in object files), there's no theoretical reason
why they *couldn't* appear in a PDB.  The real issue though is that in
order to add support for dumping and writing them (which will be useful
for object files), we need a way to test them.  And since there is no
support for reading and writing them to / from object files yet, making
PDB support them is the best way to both add support for the underlying
format and add support for tests at the same time.  Later, when we go
to add support for reading / writing them from object files, we'll need
only minimal changes in the underlying read/write code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305037 91177308-0d34-0410-b5e6-96231b3b80d8

Fix build by adding includes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305036 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-pdbdump] Support native ordering of subsections in raw mode.

This is the same change for the YAML Output style applied to the
raw output style.  Previously we would queue up all subsections
until every one had been read, and then output them in a pre-
determined order.  This was because some subsections need to be
read first in order to properly dump later subsections.  This
patch allows them to be dumped in the order they appear.

Differential Revision: https://reviews.llvm.org/D34015

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305034 91177308-0d34-0410-b5e6-96231b3b80d8

[llvm-pdbdump] Improve consistency among subcommands.

The pdb2yaml and raw subcommands did something very
similar but with a different output format, and they
used a lot of the same command line options, but each
one re-implemented the command line option with slightly
different spellings / options. This patch merges them
together into a single definition which is shared by
both subcommands. This new syntax also allows for more
flexibility in the way debug subsections are dumped.

Differential Revision: https://reviews.llvm.org/D33996

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305032 91177308-0d34-0410-b5e6-96231b3b80d8

[CFI] Remove LinkerSubsectionsViaSymbols.

Since D17854 LinkerSubsectionsViaSymbols is unnecessary.

It is interfering with ThinLTO implementation of CFI-ICall, where
the aliases used on the !LinkerSubsectionsViaSymbols branch are
needed to export jump tables to ThinLTO backends.

This is the second attempt to land this change after fixing PR33316.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305031 91177308-0d34-0410-b5e6-96231b3b80d8

[ExtractGV] Fix the doxygen comment on the constructor and the class to refer to global values instead of functions. While there fix an 80 column violation. NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305030 91177308-0d34-0410-b5e6-96231b3b80d8

Fixed warning: dereferencing type-punned pointer will break strict-aliasing rules.

No need in reinterpret_cast<StringTableOffset &> here, as struct coff_symbol Name is a unin
with the member StringTableOffset Offset. This union member could be accessed directly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305029 91177308-0d34-0410-b5e6-96231b3b80d8

[IR] Remove getNumSuccessorsV/getSuccessorV/setSuccessorV from the TerminatorInst subclasses as much as possible now that Value has been de-virtualized

These used to be virtual methods that would enable doing the right thing with only a TerminatorInst pointer. I believe they were also acting as vtable anchors in my cases. I think the fact that they had a separate name ending in V was to allow a version without V to be called without a virtual call in a pre-C++11 final keyword world.

Where possible the base methods in TerminatorInst dispatch directly to the public methods in the classes that have the same signature. For some classes this wasn't possible so I've left private method versions that match the name and signature of the version in TerminatorInst. All versions have been moved into the class definitions since we no longer need vtable anchors here.

Differential Revision: https://reviews.llvm.org/D34011

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305028 91177308-0d34-0410-b5e6-96231b3b80d8

Write summaries for merged modules when splitting modules for ThinLTO.

This is to prepare to allow for dead stripping of globals in the
merged modules.

Differential Revision: https://reviews.llvm.org/D33921

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305027 91177308-0d34-0410-b5e6-96231b3b80d8

[sanitizer-coverage] one more flavor of coverage: -fsanitize-coverage=inline-8bit-counters. Experimental so far, not documenting yet. Reapplying revisions 304630, 304631, 304632, 304673, see PR33308

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305026 91177308-0d34-0410-b5e6-96231b3b80d8

Object: Move datalayout check into irsymtab::build. NFCI.

This check is a requirement of the irsymtab builder, not of any
particular caller.

Differential Revision: https://reviews.llvm.org/D33970

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305023 91177308-0d34-0410-b5e6-96231b3b80d8

Bitcode: Introduce a BitcodeFileContents data type. NFCI.

This data type includes the contents of a bitcode file.
Right now a bitcode file can only contain modules, but
a later change will add a symbol table.

Differential Revision: https://reviews.llvm.org/D33969

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305019 91177308-0d34-0410-b5e6-96231b3b80d8

test-release.sh: Remove workaround for test-suite build

Summary: We aren't actually building the test suite, so this isn't needed.

Reviewers: rengolin, hansw

Reviewed By: rengolin

Subscribers: rengolin, llvm-commits

Differential Revision: https://reviews.llvm.org/D29840

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305017 91177308-0d34-0410-b5e6-96231b3b80d8

RegAllocPBQP: Do not assign reserved physical register

(0) RegAllocPBQP: Since getRawAllocationOrder() may return a collection that includes reserved physical registers, iterate to find an un-reserved physical register.

(1) VirtRegMap: Enforce the invariant: "no reserved physical registers" in assignVirt2Phys(). Previously, this was checked only after the fact in VirtRegRewriter::rewrite.

(2) MachineVerifier: updated the test per MatzeB's review.

(3) +testcase

Patch by Nick Johnson<Nicholas.Paul.Johnson@deshawresearch.com>!

Differential Revision: https://reviews.llvm.org/D33947

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305016 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Re-enable machine verifier after codegen passes

Remove "false" from the arguments to "addPass" in Hexagon's target pass
config.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305015 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Skip mux generation when predicate register is undefined

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305014 91177308-0d34-0410-b5e6-96231b3b80d8

[MachO] Fix codegen of alias of alias.

Fixes PR33316.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305012 91177308-0d34-0410-b5e6-96231b3b80d8

[CGP, x86] add tests for potential memcmp expansion; NFC

No IR tests were added with rL304313 ( https://reviews.llvm.org/D28637 ),
so I want these for extra coverage if we enable memcmp expansion for x86.
As shown, nothing is expanded for x86 in CGP yet.

Also fundamentally, we're doing an IR transform, so we should have IR tests
for just that part. If something goes wrong, we need to know if the bug is
in CGP or later lowering.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305011 91177308-0d34-0410-b5e6-96231b3b80d8

Do not early-inline recursive calls in sample profile loader.

Summary: Early-inlining of recursive call makes the code size bloat exponentially. We should not disable it.

Reviewers: davidxl, dnovillo, iteratee

Reviewed By: iteratee

Subscribers: iteratee, llvm-commits, sanjoy

Differential Revision: https://reviews.llvm.org/D34017

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305009 91177308-0d34-0410-b5e6-96231b3b80d8

fix formatting; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305008 91177308-0d34-0410-b5e6-96231b3b80d8

[CGP] don't expand a memcmp with nobuiltin attribute

This matches the behavior used in the SDAG when expanding memcmp.

For reference, we're intentionally treating the earlier fortified call transforms differently after:
https://bugs.llvm.org/show_bug.cgi?id=23093
https://reviews.llvm.org/rL233776

One motivation for not transforming nobuiltin calls is that it can interfere with sanitizers:
https://reviews.llvm.org/D19781
https://reviews.llvm.org/D19801

Differential Revision: https://reviews.llvm.org/D34043

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305007 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Work around build special casing .inc files

It complains because it assumes these were autogenerated files
in the source directory.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305005 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU: Use correct register names in inline assembly

Fixes using physical registers in inline asm from clang.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305004 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Speedup NumNodesBlocking calculation. NFCI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305003 91177308-0d34-0410-b5e6-96231b3b80d8

[PPC] In PPCBoolRetToInt change the bool value to i64 if the target is ppc64

In PPCBoolRetToInt bool value is changed to i32 type. On ppc64 it may introduce an extra zero extension for the return value. This patch changes the integer type to i64 to avoid the zero extension on ppc64.

This patch fixed PR32442.

Differential Revision: https://reviews.llvm.org/D31407

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305001 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU] Force qsads instrs to use different dest register than source registers

The V_MQSAD_PK_U16_U8, V_QSAD_PK_U16_U8, and V_MQSAD_U32_U8 take more than 1 pass in hardware. For these three instructions, the destination registers must be different than all sources, so that the first pass does not overwrite sources for the following passes.

Differential Revision: https://reviews.llvm.org/D33783

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304998 91177308-0d34-0410-b5e6-96231b3b80d8

Update release notes for BinaryFormat library.

Differential Revision: https://reviews.llvm.org/D34001

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304995 91177308-0d34-0410-b5e6-96231b3b80d8

Changed a comparison operator for std::stable_sort to implement strict weak ordering.

This is a temporarily fix which needs additional work, as it triggers a test3 failure.
test3 is commented out till then.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304993 91177308-0d34-0410-b5e6-96231b3b80d8

[Power9] Exploit vector integer extend instructions

This patch adds build vector patterns to exploit the vector integer
extend instructions:
vextsb2w - Vector Extend Sign Byte To Word
vextsb2d - Vector Extend Sign Byte To Doubleword
vextsh2w - Vector Extend Sign Halfword To Word
vextsh2d - Vector Extend Sign Halfword To Doubleword
vextsw2d - Vector Extend Sign Word To Doubleword

Differential Revision: https://reviews.llvm.org/D33510

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304992 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] add memcmp test with nobuiltin attr; NFC

In SDAG, we don't expand libcalls with a nobuiltin attribute.
It's not clear if that's correct from the existing code comment:
"Don't do the check if marked as nobuiltin for some reason."

...adding a test here either way to show that there is currently
a different behavior implemented in the CGP-based expansion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304991 91177308-0d34-0410-b5e6-96231b3b80d8

[LazyValueInfo] Make LVILatticeVal intersect method take arguments by reference so we don't copy ConstantRanges unless we need to.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304990 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] remove unused param from tests; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304989 91177308-0d34-0410-b5e6-96231b3b80d8

[CGP / PowerPC] avoid multi-block overhead for simple memcmp expansion

The test diff for PowerPC shows we can better optimize if this case is one block.

For x86, there's would be a substantial difference if CGP expansion was enabled because branches are assumed
cheap and SDAG can't optimize across blocks.

Instead of this:

_cmp_eq8:
  movq  (%rdi), %rax
  cmpq  (%rsi), %rax
  je  LBB23_1
## BB#2:                                ## %res_block
  movl  $1, %ecx
  jmp LBB23_3
LBB23_1:
  xorl  %ecx, %ecx
LBB23_3:                                ## %endblock
  xorl  %eax, %eax
  testl %ecx, %ecx
  sete  %al
  retq

We get this:

cmp_eq8:
  movq  (%rdi), %rcx
  xorl  %eax, %eax
  cmpq  (%rsi), %rcx
  sete  %al
  retq

And that matches the optimal codegen that we get from the current expansion in SelectionDAGBuilder::visitMemCmpCall().
If this looks right, then I just need to confirm that vector-sized expansion will work from here, and we can enable
CGP memcmp() expansion for x86. Ie, we'll bypass the power-of-2 special cases currently optimized in SDAG because we
can lower the IR produced here optimally.

Differential Revision: https://reviews.llvm.org/D34005

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304987 91177308-0d34-0410-b5e6-96231b3b80d8

Add scheduler classes to integer/float horizontal operations.
This patch will close PR32801.
Differential Revision: https://reviews.llvm.org/D33203

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304986 91177308-0d34-0410-b5e6-96231b3b80d8

[SLP] More comments fix, NFC.

Fixed spelling errors on function description.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304985 91177308-0d34-0410-b5e6-96231b3b80d8

[PDB] Don't crash on /debug:fastlink PDBs.

Apparently support for /debug:fastlink PDBs isn't part of the
DIA SDK (!), and it was causing llvm-pdbdump to crash because
we weren't checking for a null pointer return value. This
manifests when calling findChildren on the IDiaSymbol, and
it returns E_NOTIMPL.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304982 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] add tests for memcmp expansion; NFC

We already had a test to demonstrate PR33325:
https://bugs.llvm.org/show_bug.cgi?id=33325

I'm adding tests for general memcmp expansion (see D34005 / D33963) and:
https://bugs.llvm.org/show_bug.cgi?id=33329

...plus non-power-of-2 sizes, so we can see what that looks like currently or if expanded.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304979 91177308-0d34-0410-b5e6-96231b3b80d8

InferAddressSpaces: Avoid assertion failure with replacing identical
cloned constexpr

Have cloneConstantExprWithNewAddressSpaces return nullptr when
returning initial ConstantExpr.

Reviewers: arsenm

Subscribers: jholewinski, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D33995

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304975 91177308-0d34-0410-b5e6-96231b3b80d8

Regenerate test

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304973 91177308-0d34-0410-b5e6-96231b3b80d8

This patch closes PR28513: an optimization of multiplication by different constants.
The initial patch was rejected: I fixed the issue and re-apply it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304972 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] GlobalISel: Add more tests. NFC

Add a couple of tests to increase coverage for the TableGen'erated code,
in particular for rules where 2 generic instructions may be combined
into a single machine instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304971 91177308-0d34-0410-b5e6-96231b3b80d8

[BPI] Don't assume that strcmp returning >0 is more likely than <0

The zero heuristic assumes that integers are more likely positive than negative,
but this also has the effect of assuming that strcmp return values are more
likely positive than negative. Given that for nonzero strcmp return values it's
the ordering of arguments that determines the sign of the result there's no
reason to assume that's true.

Fix this by inspecting the LHS of the compare and using TargetLibraryInfo to
decide if it's strcmp-like, and if so only assume that nonzero is more likely
than zero i.e. strings are more often different than the same. This causes a
slight code generation change in the spec2006 benchmark 403.gcc, but with no
noticeable performance impact. The intent of this patch is to allow better
optimisation of dhrystone on Cortex-M cpus, but currently it won't as there are
also some changes that need to be made to if-conversion.

Differential Revision: https://reviews.llvm.org/D33934

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304970 91177308-0d34-0410-b5e6-96231b3b80d8

[Go] Subtypes function

This patch adds LLVMGetSubtypes to Go API (as Type.Subtypes), tests included.

Patch by Ekaterina Vaartis!

Differential Revision: https://reviews.llvm.org/D33901

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304968 91177308-0d34-0410-b5e6-96231b3b80d8

Correct AMDGPU Hawaii and Kabini target names

The FirePro and Radeon versions of Hawaii have different 64 bit floating point configurations so use distinct target names for them. Rename the target name for Kabini to accommodate.

Differential Revision: https://reviews.llvm.org/D34016

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304959 91177308-0d34-0410-b5e6-96231b3b80d8

Object: Factor out the code for creating the irsymtab for an arbitrary bitcode file.

This code now lives in lib/Object. The idea is that it can now be reused by
IRObjectFile among other things.

Differential Revision: https://reviews.llvm.org/D31921

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304958 91177308-0d34-0410-b5e6-96231b3b80d8

[CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304954 91177308-0d34-0410-b5e6-96231b3b80d8

GlobalsModRef: Ensure optnone+readonly/readnone attributes are respected

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304945 91177308-0d34-0410-b5e6-96231b3b80d8

[SLP] Comment fix, NFC.

Fixed comment in function description.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304940 91177308-0d34-0410-b5e6-96231b3b80d8

[InstCombine] fold lshr (sext X), C1 --> zext (lshr X, C2)

This was discussed in D33338. We have larger pattern-matching ending in a truncate that
we can reduce or remove by handling these smaller patterns first. Further motivation is
that narrower shift ops are easier for value tracking and zext is better than sext.

http://rise4fun.com/Alive/rhh

Name: boolshift
%sext = sext i1 %x to i8
%r = lshr i8 %sext, 7

=>

%r = zext i1 %x to i8

Name: noboolshift
%sext = sext i3 %x to i8
%r = lshr i8 %sext, 7

=>

%sh = lshr i3 %x, 2
%r = zext i3 %sh to i8

Differential Revision: https://reviews.llvm.org/D33879

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304939 91177308-0d34-0410-b5e6-96231b3b80d8

[SLP] Comment fix, NFC.

Added a description of getReductionCost() function.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304938 91177308-0d34-0410-b5e6-96231b3b80d8

[Hexagon] Generate 'inbounds' GEPs in HexagonCommonGEP

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304937 91177308-0d34-0410-b5e6-96231b3b80d8

[DAG] Improve Store Merge candidate pruning. NFC.

When considering merging stores values are the results of loads only
consider stores whose values come from loads from the same base.

This fixes much of the longer compile times in PR33330.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304934 91177308-0d34-0410-b5e6-96231b3b80d8

Fix builin_expect lowering bug
PR33346

Skip cases when expected value is not constant int.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304933 91177308-0d34-0410-b5e6-96231b3b80d8

Add BinaryFormat module definition

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304928 91177308-0d34-0410-b5e6-96231b3b80d8

[mssa] Fix case when there is no definition in a block prior to an inserted use.

Summary:
Check that the first access before one being tested is valid.
Before this patch, if there was no definition prior to the Use being tested,
the first time Iter was deferenced, it hit the sentinel.

Reviewers: dberlin, gbiv

Subscribers: sanjoy, Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D33950

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304926 91177308-0d34-0410-b5e6-96231b3b80d8

[CGP] avoid zext/trunc of a memcmp expansion compare

This could be viewed as another shortcoming of the DAGCombiner:
when both operands of a compare are zexted from the same source
type, we should be able to compare the original types.

The effect on PowerPC perf is likely unnoticeable, but there's a
visible regression for x86 if we feed the suboptimal IR for memcmp
expansion to the DAG:

_cmp_eq4_zexted_to_i64:
  movl  (%rdi), %ecx
  movl  (%rsi), %edx
  xorl  %eax, %eax
  cmpq  %rdx, %rcx
  sete  %al

_cmp_eq4_better:
  movl  (%rdi), %ecx
  xorl  %eax, %eax
  cmpl  (%rsi), %ecx
  sete  %al

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304923 91177308-0d34-0410-b5e6-96231b3b80d8

[AMDGPU][MC] Corrected error message for s_waitcnt helpers

See Bug 32711: https://bugs.llvm.org//show_bug.cgi?id=32711

Reviewers: artem.tamazov

Differential Revision: https://reviews.llvm.org/D33781

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304922 91177308-0d34-0410-b5e6-96231b3b80d8

LowerTypeTests: Generate simpler IR for br(llvm.type.test, then, else).

This makes it so that the code quality for CFI checks when compiling
with -O2 and linking with --lto-O0 is similar to that of the rest of
the code.

Reduces the size of a chrome binary built with -O2/--lto-O0 by
about 750KB.

Differential Revision: https://reviews.llvm.org/D33925

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304921 91177308-0d34-0410-b5e6-96231b3b80d8

[CGP] pass size as param in MemCmpExpansion; NFCI

Avoid extracting the constant int twice.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304920 91177308-0d34-0410-b5e6-96231b3b80d8

PR33331 - opt-viewer.py produces broken output for directories with spaces

Fix: Properly quote href attributes.

Patch by Simon Whittaker!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304919 91177308-0d34-0410-b5e6-96231b3b80d8

[mips][dsp] Modify repl.ph to accept signed immediate values

Changed immediate type for repl.ph from uimm10 to simm10 as per the specs.
Repl.qb still accepts uimm8. Both instructions now mimic the behaviour of
GNU as.

Patch by Stefan Maksimovic.

Differential Revision: https://reviews.llvm.org/D33594

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304918 91177308-0d34-0410-b5e6-96231b3b80d8

[CGP] pass size as param in MemCmpExpansion; NFCI

Avoid extracting the constant int twice.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304917 91177308-0d34-0410-b5e6-96231b3b80d8

[CGP] getParent()->getParent() --> getFunction(); NFCI

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304916 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Add test to demonstrate inefficient lowering of v48i8 shuffle.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304915 91177308-0d34-0410-b5e6-96231b3b80d8

[SystemZ] Propagate MachineMemOperands

In emitCondStore() and emitMemMemWrapper().

Review: Ulrich Weigand

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304913 91177308-0d34-0410-b5e6-96231b3b80d8

[DAG] Move SelectionDAG::isCommutativeBinOp to TargetLowering.

This will allow commutation of target-specific DAG nodes in future patches

Differential Revision: https://reviews.llvm.org/D33882

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304911 91177308-0d34-0410-b5e6-96231b3b80d8

AMDGPU/GlobalISel: Mark 32-bit G_SELECT as legal

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D33949

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304910 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] avoid flipping sign bits for vector icmp by using known bits

If we know that both operands of an unsigned integer vector comparison are non-negative,
then it's safe to directly use a signed-compare-greater-than instruction (the only non-equality
integer vector compare predicate provided by SSE/AVX).

We're intentionally not changing the condition code to signed in order to preserve the
existing transforms that use min/max/psubus below here.

This should solve PR33276:
https://bugs.llvm.org/show_bug.cgi?id=33276

Differential Revision: https://reviews.llvm.org/D33862

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304909 91177308-0d34-0410-b5e6-96231b3b80d8

[CGP] add helper function for generating compare of load pairs; NFCI

In the special (but also the likely common) case, we can avoid
the multi-block complexity of the general algorithm, so moving
this part off on its own will make it re-usable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304908 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Eliminate integer compare instructions - vol. 5

Adds handling for i64 SETNE comparison (both sign and zero extended).

Differential Revision: https://reviews.llvm.org/D33720

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304907 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] do not use FastISel when -mxgot is present

The clang compiler by default uses FastISel when invoked with -O0, which
is also the default. In that case, passing of -mxgot does not get honored,
i.e. the code path that is to deal with large got is not taken.
Clang produces same output regardless of -mxgot being present or not.
This change checks whether -mxgot is passed as an option, and turns off
FastISel if it is.

Patch by Stefan Maksimovic.

Differential Revision: https://reviews.llvm.org/D33593

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304906 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] Use FixupKind variable in processFixupValue (cleanup, NFC).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304905 91177308-0d34-0410-b5e6-96231b3b80d8

[CGP] fix formatting in MemCmpExpansion; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304903 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] GlobalISel: Purge G_SEQUENCE

According to the commit message from r296921, G_MERGE_VALUES and
G_INSERT are to be preferred over G_SEQUENCE. Therefore, stop generating
G_SEQUENCE in the ARM backend and remove the code dealing with it.

This boils down to the code breaking up double values for the soft float
calling convention. Use G_MERGE_VALUES + G_UNMERGE_VALUES instead of
G_SEQUENCE + G_EXTRACT for it. This maps very nicely to VMOVDRR +
VMOVRRD and simplifies the code in the instruction selector.

There's one occurence of G_SEQUENCE left in arm-irtranslator.ll, but
that is part of the target-independent code for translating constant
structs. Therefore, it is beyond the scope of this commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304902 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Eliminate integer compare instructions - vol. 3

Adds handling for i32 SETNE comparison (both sign and zero extended).

Differential Revision: https://reviews.llvm.org/D33718

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304901 91177308-0d34-0410-b5e6-96231b3b80d8

[FileCheck] Don't scan past the closing CHECK-DAG for CHECK-NOT inside CHECK-DAG

If there's enough data in fron of it the skipped region would just
become arbitrarily large, and we scan for the CHECK-NOT everywhere.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304900 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] GlobalISel: Support G_XOR

Same as the other binary operators:
- legalize to 32 bits
- map to GPRs
- select to EORrr via TableGen'erated code

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304898 91177308-0d34-0410-b5e6-96231b3b80d8

evert "[mips] Fix test mips64fpldst.ll with machine verifier enabled"

This reverts commit r301394. It broke some internal buildbots, reverting
while the issue is being investigated.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304896 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Fix an issue with PEXTRW/PEXTRB indices during shuffle combining

We were checking that the index was in range of the destination vector type, not the (larger) source vector type

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304894 91177308-0d34-0410-b5e6-96231b3b80d8

[ARM] GlobalISel: Support G_OR

Same as the other binary operators:
- legalize to 32 bits
- map to GPRs
- select ORRrr thanks to TableGen'erated code

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304890 91177308-0d34-0410-b5e6-96231b3b80d8

[Linker] Remove llc usage from link-arm-and-thumb.ll test case.

This fixes a buildbot failure when the ARM target is not built.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304888 91177308-0d34-0410-b5e6-96231b3b80d8