granicus.if.org Git

IR: Make MDNodeFwdDecl destructor public

Now that the leak detector is gone, anyone can call this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225689 91177308-0d34-0410-b5e6-96231b3b80d8

[X86] Make SSE min/max testcases more explicit. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225687 91177308-0d34-0410-b5e6-96231b3b80d8

IR: Move creation logic down to MDTuple, NFC

Move creation logic for `MDTuple`s down where it belongs. Once there
are a few more subclasses, these functions really won't make much sense
here (the `friend` relationship was already awkward). For now, leave
the `MDNode` versions around, but have it forward down.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225685 91177308-0d34-0410-b5e6-96231b3b80d8

IR: Push storeDistinctInContext() down to UniquableMDNode, NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225683 91177308-0d34-0410-b5e6-96231b3b80d8

IR: Split GenericMDNode into MDTuple and UniquableMDNode

Split `GenericMDNode` into two classes (with more descriptive names).

  - `UniquableMDNode` will be a common subclass for `MDNode`s that are
    sometimes uniqued like constants, and sometimes 'distinct'.

    This class gets the (short-lived) RAUW support and related API.

  - `MDTuple` is the basic tuple that has always been returned by
    `MDNode::get()`.  This is as opposed to more specific nodes to be
    added soon, which have additional fields, custom assembly syntax,
    and extra semantics.

    This class gets the hash-related logic, since other sublcasses of
    `UniquableMDNode` may need to hash based on other fields.

To keep this diff from getting too big, I've added casts to `MDTuple`
that won't really scale as new subclasses of `UniquableMDNode` are
added, but I'll clean those up incrementally.

(No functionality change intended.)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225682 91177308-0d34-0410-b5e6-96231b3b80d8

[LIT] Decode string result in lit.util.capture

Summary: I think this is probably a bug, but I'm putting this up for review just to be sure. I think that `lit.util.capture` should decode the resulting string in the same way `lit.util.executeCommand` does.

Reviewers: ddunbar, EricWF

Reviewed By: EricWF

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6769

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225681 91177308-0d34-0410-b5e6-96231b3b80d8

IR: Invert logic to simplify control flow, NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225670 91177308-0d34-0410-b5e6-96231b3b80d8

IR: Separate out decrementUnresolvedOperandCount(), NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225667 91177308-0d34-0410-b5e6-96231b3b80d8

IR: Prevent handleChangedOperand() recursion

Instead of returning early on `handleChangedOperand()` recursion
(finally identified (and test added) in r225657), prevent it upfront by
releasing operands before RAUW.

Aside from massively different program flow, there should be no
functionality change ;).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225665 91177308-0d34-0410-b5e6-96231b3b80d8

R600/SI: Use RegisterOperands to specify which operands can accept immediates

There are some operands which can take either immediates or registers
and we were previously using different register class to distinguish
between operands that could take immediates and those that could not.

This patch switches to using RegisterOperands which should simplify the
backend by reducing the number of register classes and also make it
easier to implement the assembler.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225662 91177308-0d34-0410-b5e6-96231b3b80d8

Target: Allow target specific operand types

This adds two new fields to the RegisterOperand TableGen class:

string OperandNamespace = "MCOI";
string OperandType = "OPERAND_REGISTER";

These fields can be used to specify a target specific operand type,
which will be stored in the OperandType member of the MCOperandInfo
object.

This can be useful for targets that need to store some extra information
about operands that cannot be expressed using the target independent
types. For example, in the R600 backend, there are operands which
can take either registers or immediates and it is convenient to be able
to specify this in the TableGen definitions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225661 91177308-0d34-0410-b5e6-96231b3b80d8

GVN: propagate equalities for floating point compares

Allow optimizations based on FP comparison values in the same way
as integers.

This resolves PR17713:
http://llvm.org/bugs/show_bug.cgi?id=17713

Differential Revision: http://reviews.llvm.org/D6911

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225660 91177308-0d34-0410-b5e6-96231b3b80d8

IR: Add test for handleChangedOperand() recursion

Turns out this can happen. Remove the `FIXME` and add a testcase that
crashes without the extra logic.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225657 91177308-0d34-0410-b5e6-96231b3b80d8

IR: Separate out recalculateHash(), NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225655 91177308-0d34-0410-b5e6-96231b3b80d8

IR: Separate out helper: resolveAfterOperandChange(), NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225654 91177308-0d34-0410-b5e6-96231b3b80d8

IR: Use SubclassData32 directly, NFC

Simplify some logic by accessing `SubclassData32` directly instead of
relying on API.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225653 91177308-0d34-0410-b5e6-96231b3b80d8

RegisterCoalescer: Turn some impossible conditions into asserts

This is a fixed version of reverted r225500. It fixes the too early
if() continue; of the last patch and adds a comment to the unorthodox
loop.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225652 91177308-0d34-0410-b5e6-96231b3b80d8

IR: Don't allow operands to become unresolved

Operands shouldn't change from being resolved to unresolved during graph
construction. Simplify the logic based on that assumption.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225649 91177308-0d34-0410-b5e6-96231b3b80d8

IR: Remove redundant comment, NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225648 91177308-0d34-0410-b5e6-96231b3b80d8

IR: Simplify code, NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225647 91177308-0d34-0410-b5e6-96231b3b80d8

IR: Make temporary nodes distinct

Change the return of `MDNode::isDistinct()` for `MDNode::getTemporary()`
to `true`. They aren't uniqued.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225646 91177308-0d34-0410-b5e6-96231b3b80d8

Add r224985 back with two fixes.

One is that AArch64 has additional restrictions on when local relocations can
be used. We have to take those into consideration when deciding to put a L
symbol in the symbol table or not.

The other is that ld64 requires the relocations to cstring to use linker
visible symbols on AArch64.

Thanks to Michael Zolotukhin for testing this!

Remove doesSectionRequireSymbols.

In an assembly expression like

bar:
.long L0 + 1

the intended semantics is that bar will contain a pointer one byte past L0.

In sections that are merged by content (strings, 4 byte constants, etc), a
single position in the section doesn't give the linker enough information.
For example, it would not be able to tell a relocation must point to the
end of a string, since that would look just like the start of the next.

The solution used in ELF to use relocation with symbols if there is a non-zero
addend.

In MachO before this patch we would just keep all symbols in some sections.

This would miss some cases (only cstrings on x86_64 were implemented) and was
inefficient since most relocations have an addend of 0 and can be represented
without the symbol.

This patch implements the non-zero addend logic for MachO too.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225644 91177308-0d34-0410-b5e6-96231b3b80d8

IR: Simplify replaceOperandWith(), NFC

This will call `handleChangedOperand()` less frequently, but in that
case (i.e., `isStoredDistinctInContext()`) it has identical logic to
here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225643 91177308-0d34-0410-b5e6-96231b3b80d8

IR: Remove redundant calls to MDNode::setHash(), NFC

`storeDistinctInContext()` already calls `setHash(0)`.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225642 91177308-0d34-0410-b5e6-96231b3b80d8

[ASan] Move the shadow on Windows 32-bit from 0x20000000 to 0x40000000

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225641 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyLibCalls] Factor out fortified libcall handling.

This lets us remove CGP duplicate.

Differential Revision: http://reviews.llvm.org/D6541

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225640 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyLibCalls] Factor out str/mem libcall optimizations.

Put them in a separate function, so we can reuse them to further
simplify fortified libcalls as well.

Differential Revision: http://reviews.llvm.org/D6540

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225639 91177308-0d34-0410-b5e6-96231b3b80d8

[SimplifyLibCalls] Factor out signature checks for fortifiable libcalls.

The checks are the same for fortified counterparts to the libcalls, so
we might as well do them in a single place.

Differential Revision: http://reviews.llvm.org/D6539

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225638 91177308-0d34-0410-b5e6-96231b3b80d8

[mips][microMIPS] Implement BEQZ16 and BNEZ16 instructions

Differential Revision: http://reviews.llvm.org/D5271

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225627 91177308-0d34-0410-b5e6-96231b3b80d8

Put this test's input in the Inputs directory where it belongs, rather than
reusing a file from a different test directory.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225621 91177308-0d34-0410-b5e6-96231b3b80d8

Add a new utility script that helps update very simple regression tests.

This script is currently specific to x86 and limited to use with very
small regression or feature tests using 'llc' and 'FileCheck' in
a reasonably canonical way. It is in no way general purpose or robust at
this point. However, it works quite well for simple examples. Here is
the intended workflow:

- Make a change that requires updating N test files and M functions'
  assertions within those files.
- Stash the change.
- Update those N test files' RUN-lines to look "canonical"[1].
- Refresh the FileCheck lines for either the entire file or select
  functions by running this script.
  - The script will parse the RUN lines and run the 'llc' binary you
    give it according to each line, collecting the asm.
  - It will then annotate each function with the appropriate FileCheck
    comments to check every instruction from the start of the first
    basic block to the last return.
  - There will be numerous cases where the script either fails to remove
    the old lines, or inserts checks which need to be manually editted,
    but the manual edits tend to be deletions or replacements of
    registers with FileCheck variables which are fast manual edits.
  - A common pattern is to have the script insert complete checking of
    every instruction, and then edit it down to only check the relevant
    ones.
  - Be careful to do all of these cleanups though! The script is
    designed to make transferring and formatting the asm output of llc
    into a test case fast, it is *not* designed to be authoratitive
    about what constitutes a good test!
- Commit the nice fresh baseline of checks.
- Unstash your change and rebuild llc.
- Re-run script to regenerate the FileCheck annotations
  - Remember to re-cleanup these annotations!!!
- Check the diff to make sure this is sane, checking the things you
  expected it to, and check that the newly updated tests actually pass.
- Profit!

Also, I'm *terrible* at writing Python, and frankly I didn't spend a lot
of time making this script beautiful or well engineered. But it's useful
to me and may be useful to others so I thought I'd send it out.

http://reviews.llvm.org/D5546

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225618 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Fix calls to non-function objects

Looking at r225438 inspired me to see how the PowerPC backend handled the
situation (calling a bitcasted TLS global), and it turns out we also produced
an error (cannot select ...). What it means to "call" something that is not a
function is implementation and platform specific, but in the name of doing
something (besides crashing), this makes sure we do what GCC does (treat all
such calls as calls through a function pointer -- meaning that the pointer is
assumed, as is the convention on PPC, to point to a function descriptor
structure holding the actual code address along with the function's TOC pointer
and environment pointer). As GCC does, we now do the same for calling regular
(non-TLS) non-function globals too.

I'm not sure whether this is the most useful way to define the behavior, but at
least we won't be alone.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225617 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Minor fix to VPBLENDW AVX2 commutation.

D6015 / rL221313 enabled commutation for SSE immediate blend instructions, but due to a typo the AVX2 VPBLENDW ymm instructions weren't flagged as commutative along with the others in the tables, but were still being commuted in code and tested for.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225612 91177308-0d34-0410-b5e6-96231b3b80d8

Fix silly mistake in release notes for Mips.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225608 91177308-0d34-0410-b5e6-96231b3b80d8

Added release notes for the Mips target.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225607 91177308-0d34-0410-b5e6-96231b3b80d8

Revert most of r225597

We can't rely on a DataLayout enlightened constant folder.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225599 91177308-0d34-0410-b5e6-96231b3b80d8

X86: Properly decode shuffle masks when the constant pool type is weird

It's possible for the constant pool entry for the shuffle mask to come
from a completely different operation. This occurs when Constants have
the same bit pattern but have different types.

Make DecodePSHUFBMask tolerant of types which, after a bitcast, are
appropriately sized vector types.

This fixes PR22188.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225597 91177308-0d34-0410-b5e6-96231b3b80d8

X86: teach X86TargetLowering about L,M,O constraints

Teach the ISelLowering for X86 about the L,M,O target specific constraints.
Although, for the moment, clang performs constraint validation and prevents
passing along inline asm which may have immediate constant constraints violated,
the backend should be able to cope with the invalid inline asm a bit better.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225596 91177308-0d34-0410-b5e6-96231b3b80d8

ARM: add support for segment base relocations (SBREL)

This adds support for parsing and emitting the SBREL relocation variant for the
ARM target. Handling this relocation variant is necessary for supporting the
full ARM ELF specification. Addresses PR22128.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225595 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] Remove some windows line endings that snuck into the tests here.

Folks on Windows, remember to set up your subversion to strip these when
submitting...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225593 91177308-0d34-0410-b5e6-96231b3b80d8

[ADT] Remove the unused default constructor for iterator_range.

This default constructor is a bit weird. It left the range in an invalid
state. That might be reasonable so that you can construct a local
iterator range and assign to it based on some logic to compute the range
you want. If folks would like to support that use case, I can add it
back, but in 238-odd usages none have actually wanted to do this. ;]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225592 91177308-0d34-0410-b5e6-96231b3b80d8

Fix PR22179.

We were incorrectly inferring nsw for certain SCEVs. We can be more
aggressive here (see Richard Smith's comment on
http://llvm.org/bugs/show_bug.cgi?id=22179) but this change just
focuses on correctness.

Differential Revision: http://reviews.llvm.org/D6914

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225591 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r225500, it leads to infinite loops.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225590 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Improved (v)insertps shuffle matching

In the current code we only attempt to match against insertps if we have exactly one element from the second input vector, irrespective of how much of the shuffle result is zeroable.

This patch checks to see if there is a single non-zeroable element from either input that requires insertion. It also supports matching of cases where only one of the inputs need to be referenced.

We also split insertps shuffle matching off into a new lowerVectorShuffleAsInsertPS function.

Differential Revision: http://reviews.llvm.org/D6879

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225589 91177308-0d34-0410-b5e6-96231b3b80d8

.gitignore: add some rules for tagging programs

Often, we miss committing new files, and 'arc diff' is supposed to warn
us about this. Unfortunately, because of the spurious output of the
command (due to unignored untracked files), we tend to ignore it and
lose information.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225588 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Mark zext of a small scalar load as free

This initial implementation of PPCTargetLowering::isZExtFree marks as free
zexts of small scalar loads (that are not sign-extending). This callback is
used by SelectionDAGBuilder's RegsForValue::getCopyToRegs, and thus to
determine whether a zext or an anyext is used to lower illegally-typed PHIs.
Because later truncates of zero-extended values are nops, this allows for the
elimination of later unnecessary truncations.

Fixes the initial complaint associated with PR22120.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225584 91177308-0d34-0410-b5e6-96231b3b80d8

Remove some whitespace.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225583 91177308-0d34-0410-b5e6-96231b3b80d8

ConvertUTFTest: fix misleading empty line

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225580 91177308-0d34-0410-b5e6-96231b3b80d8

tests: fix previous commit

The previous commit accidentally missed changes to the test output checking,
resulting in an errant failure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225577 91177308-0d34-0410-b5e6-96231b3b80d8

test: merge ARM relocations test

There is a fair number of relocations that are part of the AAELF specification.
Simply merge the tests into a single test file, otherwise, we will end up with
far too many test files to test each relocation type. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225576 91177308-0d34-0410-b5e6-96231b3b80d8

tests: convert a couple of ARM relocation tests to readobj

These tests are checking the relocation generation. Use the readobj output as
it is much easier to follow when glancing over the tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225575 91177308-0d34-0410-b5e6-96231b3b80d8

Fully fix Bug #22115.

Summary:
In the previous commit, the register was saved, but space was not allocated.
This resulted in the parameter save area potentially clobbering r30, leading to
nasty results.

Test Plan: Tests updated

Reviewers: hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6906

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225573 91177308-0d34-0410-b5e6-96231b3b80d8

Fix undefined behavior (shift of negative value) in RuntimeDyldMachOAArch64::encodeAddend.

Test Plan: regression test suite with/without UBSan.

Reviewers: lhames, ributzka

Subscribers: aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D6908

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225568 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Readjust the loop unrolling threshold

Now that the way that the partial unrolling threshold for small loops is used
to compute the unrolling factor as been corrected, a slightly smaller threshold
is preferable. This is expected; other targets may need to re-tune as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225566 91177308-0d34-0410-b5e6-96231b3b80d8

[LoopUnroll] Fix the partial unrolling threshold for small loop sizes

When we compute the size of a loop, we include the branch on the backedge and
the comparison feeding the conditional branch. Under normal circumstances,
these don't get replicated with the rest of the loop body when we unroll. This
led to the somewhat surprising behavior that really small loops would not get
unrolled enough -- they could be unrolled more and the resulting loop would be
below the threshold, because we were assuming they'd take
(LoopSize * UnrollingFactor) instructions after unrolling, instead of
(((LoopSize-2) * UnrollingFactor)+2) instructions. This fixes that computation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225565 91177308-0d34-0410-b5e6-96231b3b80d8

Use the DiagnosticHandler to print diagnostics when reading bitcode.

The bitcode reading interface used std::error_code to report an error to the
callers and it is the callers job to print diagnostics.

This is not ideal for error handling or diagnostic reporting:

* For error handling, all that the callers care about is 3 possibilities:
  * It worked
  * The bitcode file is corrupted/invalid.
  * The file is not bitcode at all.

* For diagnostic, it is user friendly to include far more information
  about the invalid case so the user can find out what is wrong with the
  bitcode file. This comes up, for example, when a developer introduces a
  bug while extending the format.

The compromise we had was to have a lot of error codes.

With this patch we use the DiagnosticHandler to communicate with the
human and std::error_code to communicate with the caller.

This allows us to have far fewer error codes and adds the infrastructure to
print better diagnostics. This is so because the diagnostics are printed when
he issue is found. The code that detected the problem in alive in the stack and
can pass down as much context as needed. As an example the patch updates
test/Bitcode/invalid.ll.

Using a DiagnosticHandler also moves the fatal/non-fatal error decision to the
caller. A simple one like llvm-dis can just use fatal errors. The gold plugin
needs a bit more complex treatment because of being passed non-bitcode files. An
hypothetical interactive tool would make all bitcode errors non-fatal.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225562 91177308-0d34-0410-b5e6-96231b3b80d8

Fix UBSan error reports in ValueMapCallbackVH and AssertingVH<T> empty/tombstone keys generation.

Summary:
One more attempt to fix UBSan reports: make sure DenseMapInfo::getEmptyKey()
and DenseMapInfo::getTombstoneKey() doesn't do any upcasts/downcasts to/from Value*.

Test Plan: check-llvm test suite with/without UBSan bootstrap

Reviewers: chandlerc, dexonsmith

Subscribers: llvm-commits, majnemer

Differential Revision: http://reviews.llvm.org/D6903

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225558 91177308-0d34-0410-b5e6-96231b3b80d8

Disable Go bindings test under UBSan.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225557 91177308-0d34-0410-b5e6-96231b3b80d8

Fix the JIT event listeners and replace the associated tests.

The changes to EventListenerCommon.h were contributed by Arch Robison.

This fixes bug 22095.

http://reviews.llvm.org/D6905

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225554 91177308-0d34-0410-b5e6-96231b3b80d8

Update comment.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225553 91177308-0d34-0410-b5e6-96231b3b80d8

SimplifyCFG: check uses of constant-foldable instrs in switch destinations (PR20210)

The previous code assumed that such instructions could not have any uses
outside CaseDest, with the motivation that the instruction could not
dominate CommonDest because CommonDest has phi nodes in it. That simply
isn't true; e.g., CommonDest could have an edge back to itself.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225552 91177308-0d34-0410-b5e6-96231b3b80d8

[X86][SSE] Avoid vector byte shuffles with zero by using pshufb to create zeros

pshufb can shuffle in zero bytes as well as bytes from a source vector - we can use this to avoid having to shuffle 2 vectors and ORing the result when the used inputs from a vector are all zeroable.

Differential Revision: http://reviews.llvm.org/D6878

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225551 91177308-0d34-0410-b5e6-96231b3b80d8

Fix an ASAN failure introduced with r225537 (adding the -universal-headers to llvm-obdump).
And a fly by fix to some formatting issues with the same commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225550 91177308-0d34-0410-b5e6-96231b3b80d8

Add a testcase of llvm-lto error handling.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225545 91177308-0d34-0410-b5e6-96231b3b80d8

Remove duplicating code. NFC.

The removed condition is checked in the previous loop.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225542 91177308-0d34-0410-b5e6-96231b3b80d8

Add the option, -universal-headers, used with -macho to print the Mach-O universal headers to llvm-objdump.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225537 91177308-0d34-0410-b5e6-96231b3b80d8

Re-reapply r221924: "[GVN] Perform Scalar PRE on gep indices that feed loads before
doing Load PRE"

It's not really expected to stick around, last time it provoked a weird LTO
build failure that I can't reproduce now, and the bot logs are long gone. I'll
re-revert it if the failures recur.

Original description: Perform Scalar PRE on gep indices that feed loads before
doing Load PRE.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225536 91177308-0d34-0410-b5e6-96231b3b80d8

Recommit r224935 with a fix for the ObjC++/AArch64 bug that that revision
introduced.

A test case for the bug was already committed in r225385.

Patch by Rafael Espindola.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225534 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "Bitcode: Move the DEBUG_LOC record to DEBUG_LOC_OLD"

This reverts commit r225498 (but leaves r225499, which was a worthy
cleanup).

My plan was to change `DEBUG_LOC` to store the `MDNode` directly rather
than its operands (patch was to go out this morning), but on reflection
it's not clear that it's strictly better. (I had missed that the
current code is unlikely to emit the `MDNode` at all.)

Conflicts:
lib/Bitcode/Reader/BitcodeReader.cpp (due to r225499)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225531 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Add support for accessing $gp as a named register.

Summary:
Mips Linux uses $gp to hold a pointer to thread info structure and accesses it
with a named register. This makes this work for LLVM.

The N32 ABI doesn't quite work yet since the frontend generates incorrect IR
for this case. It neglects to truncate the 64-bit GPR to a 32-bit value before
converting to a pointer. Given correct IR (as in the testcase in this patch),
it works correctly.

Reviewers: sstankovic, vmedic, atanasyan

Reviewed By: atanasyan

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6893

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225529 91177308-0d34-0410-b5e6-96231b3b80d8

fix typos; remove names from comments; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225528 91177308-0d34-0410-b5e6-96231b3b80d8

remove names from comments; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225526 91177308-0d34-0410-b5e6-96231b3b80d8

fix typos; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225525 91177308-0d34-0410-b5e6-96231b3b80d8

fix typo; NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225524 91177308-0d34-0410-b5e6-96231b3b80d8

more efficient use of a dyn_cast; no functional change intended

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225523 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Enable late partial unrolling on the POWER7

The P7 benefits from not have really-small loops so that we either have
multiple dispatch groups in the loop and/or the ability to form more-full
dispatch groups during scheduling. Setting the partial unrolling threshold to
44 seems good, empirically, for the P7. Compared to using no late partial
unrolling, this yields the following test-suite speedups:

SingleSource/Benchmarks/Adobe-C++/simple_types_constant_folding
-66.3253% +/- 24.1975%
SingleSource/Benchmarks/Misc-C++/oopack_v1p8
-44.0169% +/- 29.4881%
SingleSource/Benchmarks/Misc/pi
-27.8351% +/- 12.2712%
SingleSource/Benchmarks/Stanford/Bubblesort
-30.9898% +/- 22.4647%

I've speculatively added a similar setting for the P8. Also, I've noticed that
the unroller does not quite calculate the unrolling factor correctly for really
tiny loops because it neglects to account for the fact that not every loop body
replicant contains an ending branch and counter increment. I'll fix that later.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225522 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Add comment which explains why we need to change the assembler options before and after inline asm blocks. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225521 91177308-0d34-0410-b5e6-96231b3b80d8

Assumption that "VectorizedValue" will always be an Instruction is not correct.
It can be a constant or a vector argument.

ex :

define i32 @hadd(<4 x i32> %a) #0 {
entry:
  %vecext = extractelement <4 x i32> %a, i32 0
  %vecext1 = extractelement <4 x i32> %a, i32 1
  %add = add i32 %vecext, %vecext1
  %vecext2 = extractelement <4 x i32> %a, i32 2
  %add3 = add i32 %add, %vecext2
  %vecext4 = extractelement <4 x i32> %a, i32 3
  %add5 = add i32 %add3, %vecext4
  ret i32 %add5
}

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225517 91177308-0d34-0410-b5e6-96231b3b80d8

ARM: add support for R_ARM_ABS16

Add support for R_ARM_ABS16 relocation mapping. Addresses PR22156.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225510 91177308-0d34-0410-b5e6-96231b3b80d8

test: add additional test for SVN r225507

Add an additional test case to ensure that we generate the relocation even if
the thumb target is used.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225509 91177308-0d34-0410-b5e6-96231b3b80d8

ARM: add support for R_ARM_ABS8 relocations

Add support for R_ARM_ABS8 relocation. Addresses PR22126.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225507 91177308-0d34-0410-b5e6-96231b3b80d8

RegisterCoalescer: Fix removeCopyByCommutingDef with subreg liveness

The code that eliminated additional coalescable copies in
removeCopyByCommutingDef() used MergeValueNumberInto() which internally
may merge A into B or B into A. In this case A and B had different Def
points, so we have to reset ValNo.Def to the intended one after merging.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225503 91177308-0d34-0410-b5e6-96231b3b80d8

RegisterCoalescer: Some cleanup in removeCopyByCommutingDef(), NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225502 91177308-0d34-0410-b5e6-96231b3b80d8

RegisterCoalescer: No need to set kill flags, they are recompute later anyway

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225501 91177308-0d34-0410-b5e6-96231b3b80d8

RegisterCoalescer: Turn some impossible conditions into asserts

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225500 91177308-0d34-0410-b5e6-96231b3b80d8

Bitcode: Share logic for last instruction, NFC

Share logic for getting the last instruction emitted.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225499 91177308-0d34-0410-b5e6-96231b3b80d8

Bitcode: Move the DEBUG_LOC record to DEBUG_LOC_OLD

Prepare to simplify the `DebugLoc` record.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225498 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Add a flag for experimenting with subreg liveness tracking

This cannot yet be enabled by default, it causes ~50 miscompiles in the test
suite.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225497 91177308-0d34-0410-b5e6-96231b3b80d8

[PowerPC] Fold [sz]ext with fp_to_int lowering where possible

On modern cores with lfiw[az]x, we can fold a sign or zero extension from i32
to i64 into the load necessary for an i64 -> fp conversion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225493 91177308-0d34-0410-b5e6-96231b3b80d8

[DAGCombine] Remainder of fix to r225380 (More FMA folding opportunities)

As pointed out by Aditya (and Owen), when we elide an FP extend to form an FMA,
we need to extend the incoming operands so that the resulting node will really
be legal. This is currently enabled only for PowerPC, and it happens to work
there regardless, but this should fix the functionality for everyone else
should anyone else wish to use it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225492 91177308-0d34-0410-b5e6-96231b3b80d8

[x86] Add a flag to control the vector shuffle legality predicates that
complements the new vector shuffle lowering code path. This flag,
naturally, is *off* because we've not tested or evaluated the results of
this at all. However, the flag will make it much easier to evaluate
whether we can be this aggressive and whether there are missing vector
shuffle lowering optimizations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225491 91177308-0d34-0410-b5e6-96231b3b80d8

Cleaup ValueHandle to no longer keep a PointerIntPair for the Value*.
This was used previously for metadata but is no longer needed there. Not
doing this simplifies ValueHandle and will make it easier to fix things
like AssertingVH's DenseMapInfo.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225487 91177308-0d34-0410-b5e6-96231b3b80d8

Partial fix to r225380 (More FMA folding opportunities)

As pointed out by Aditya (and Owen), there are two things wrong with this code.
First, it adds patterns which elide FP extends when forming FMAs, and that might
not be profitable on all targets (it belongs behind the pre-existing
aggressive-FMA-formation flag). This is fixed by this change.

Second, the resulting nodes might have operands of different types (the
extensions need to be re-added). That will be fixed in the follow-up commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225485 91177308-0d34-0410-b5e6-96231b3b80d8

[REFACTOR] Push logic from MemDepPrinter into getNonLocalPointerDependency

Previously, MemDepPrinter handled volatile and unordered accesses without involving MemoryDependencyAnalysis. By making a slight tweak to the documented interface - which is respected by both callers - we can move this responsibility to MDA for the benefit of any future callers. This is basically just cleanup.

In the future, we may decide to extend MDA's non local dependency analysis to return useful results for ordered or volatile loads. I believe (but have not really checked in detail) that local dependency analyis does get useful results for ordered, but not volatile, loads.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225483 91177308-0d34-0410-b5e6-96231b3b80d8

ReleaseNotes.rst: these are for 3.6

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225482 91177308-0d34-0410-b5e6-96231b3b80d8

[Refactor] Have getNonLocalPointerDependency take the query instruction

Previously, MemoryDependenceAnalysis::getNonLocalPointerDependency was taking a list of properties about the instruction being queried. Since I'm about to need one more property to be passed down through the infrastructure - I need to know a query instruction is non-volatile in an inner helper - fix the interface once and for all.

I also added some assertions and behaviour clarifications around volatile and ordered field accesses. At the moment, this is mostly to document expected behaviour. The only non-standard instructions which can currently reach this are atomic, but unordered, loads and stores. Neither ordered or volatile accesses can reach here.

The call in GVN is protected by an isSimple check when it first considers the load. The calls in MemDepPrinter are protected by isUnordered checks. Both utilities also check isVolatile for loads and stores.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225481 91177308-0d34-0410-b5e6-96231b3b80d8

LangRef: Add usage points for distinct MDNodes

Omission pointed out by Sean Silva!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225479 91177308-0d34-0410-b5e6-96231b3b80d8

IR: Drop TODO now that PR22111 is finished

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225477 91177308-0d34-0410-b5e6-96231b3b80d8

Utils: Keep distinct MDNodes distinct in MapMetadata()

Create new copies of distinct `MDNode`s instead of following the
uniquing `MDNode` logic.

Just like self-references (or other cycles), `MapMetadata()` creates a
new node. In practice most calls use `RF_NoModuleLevelChanges`, in
which case nothing is duplicated anyway.

Part of PR22111.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225476 91177308-0d34-0410-b5e6-96231b3b80d8

IR: Add 'distinct' MDNodes to bitcode and assembly

Propagate whether `MDNode`s are 'distinct' through the other types of IR
(assembly and bitcode).  This adds the `distinct` keyword to assembly.

Currently, no one actually calls `MDNode::getDistinct()`, so these nodes
only get created for:

  - self-references, which are never uniqued, and
  - nodes whose operands are replaced that hit a uniquing collision.

The concept of distinct nodes is still not quite first-class, since
distinct-ness doesn't yet survive across `MapMetadata()`.

Part of PR22111.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225474 91177308-0d34-0410-b5e6-96231b3b80d8