granicus.if.org Git

[release_36] Cherry-pick r231219.

Original message:
[DAGCombine] Fix a bug in a BUILD_VECTOR combine

When trying to convert a BUILD_VECTOR into a shuffle, we try to split a single source vector that is twice as wide as the destination vector.
We can not do this when we also need the zero vector to create a blend.
This fixes PR22774.

Differential Revision: http://reviews.llvm.org/D8040

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@232807 91177308-0d34-0410-b5e6-96231b3b80d8

[release_36] Cherry-pick r232046.

Original message:
[X86] Fix wrong target specific combine on SETCC nodes.

Part of the folding logic implemented by function 'PerformISDSETCCCombine'
only worked under the assumption that the condition code in input could have
been either SETNE or SETEQ.
Unfortunately that assumption was incorrect, and in some cases the algorithm
ended up incorrectly folding SETCC nodes.

The incorrect folding only affected SETCC dag nodes where:
- one of the operands was a build_vector of all zeroes;
- the other operand was a SIGN_EXTEND from a vector of MVT:i1 elements;
- the condition code was neither SETNE nor SETEQ.

Example:
(setcc (v4i32 (sign_extend v4i1:%A)), (v4i32 VectorOfAllZeroes), setge)

Before this patch, the entire dag node sequence from the example was
incorrectly folded to node %A.

With this patch, the dag node sequence is folded to a
(xor %A, (v4i1 VectorOfAllOnes)).

Added test setcc-combine.ll.

Thanks to Greg Bedwell for spotting this issue.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@232804 91177308-0d34-0410-b5e6-96231b3b80d8

[release_36] Cherry-pick r231601.

Original commit message:
[X86][AVX] Fix wrong lowering of VPERM2X128 nodes

There were cases where the backend computed a wrong permute mask for a VPERM2X128 node.

Example:
\code
define <8 x float> @foo(<8 x float> %a, <8 x float> %b) {
  %shuffle = shufflevector <8 x float> %a, <8 x float> %b, <8 x i32> <i32 undef, i32 undef, i32 6, i32 7, i32 undef, i32 undef, i32 6, i32 7>
  ret <8 x float> %shuffle
}
\code end

Before this patch, llc (with -mattr=+avx) emitted the following vperm2f128:
  vperm2f128 $0, %ymm0, %ymm0, %ymm0  # ymm0 = ymm0[0,1,0,1]

With this patch, llc emits a vperm2f128 with a correct permute mask:
  vperm2f128 $17, %ymm0, %ymm0, %ymm0  # ymm0 = ymm0[2,3,2,3]

Differential Revision: http://reviews.llvm.org/D8119

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@232803 91177308-0d34-0410-b5e6-96231b3b80d8

[release_36] Cherry-pick r232179.

Original commit message:
[X86][AVX] Fix wrong lowering of v4x64 shuffles into concat_vector plus extract_subvector nodes.

This patch fixes a bug in the shuffle lowering logic implemented by function
'lowerV2X128VectorShuffle'.

The are few cases where function 'lowerV2X128VectorShuffle' wrongly expands a
shuffle of two v4X64 vectors into a CONCAT_VECTORS of two EXTRACT_SUBVECTOR
nodes. The problematic expansion only occurs when the shuffle mask M has an
'undef' element at position 2, and M is equivalent to mask <0,1,4,5>.
In that case, the algorithm propagates the wrong vector to one of the two
new EXTRACT_SUBVECTOR nodes.

Example:
;;
define <4 x double> @test(<4 x double> %A, <4 x double> %B) {
entry:
  %0 = shufflevector <4 x double> %A, <4 x double> %B, <4 x i32><i32 undef, i32 1, i32 undef, i32 5>
  ret <4 x double> %0
}
;;

Before this patch, llc (-mattr=+avx) generated:
  vinsertf128 $1, %xmm0, %ymm0, %ymm0

With this patch, llc correctly generates:
  vinsertf128 $1, %xmm1, %ymm0, %ymm0

Added test lower-vec-shuffle-bug.ll

Differential Revision: http://reviews.llvm.org/D8259

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@232753 91177308-0d34-0410-b5e6-96231b3b80d8

Merge r232189 from trunk to 3.6

Original commit message:

Summary:
ScalarEvolutionExpander assumes that the header block of a loop is a
legal place to have a use for a phi node. This is true only for phis
that are either in the header or dominate the header block, but it is
not true for phi nodes that are strictly internal to the loop body.

This change teaches ScalarEvolutionExpander to place uses of PHI nodes
in the basic block the PHI nodes belong to. This is always legal, and
`hoistIVInc` ensures that the said position dominates `IsomorphicInc`.

Reviewers: atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8311

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@232416 91177308-0d34-0410-b5e6-96231b3b80d8

[release_36] Cherry-pick r232085.

Original commit message:
[X86] Fix a regression introduced by r223641.
The permps and permd instructions have their operands swapped compared to the
intrinsic definition. Therefore, they do not fall into the INTR_TYPE_2OP
category.

I did not create a new category for those two, as they are the only one AFAICT
in that case.

<rdar://problem/20108262>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@232118 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227089:
------------------------------------------------------------------------
r227089 | vkalintiris | 2015-01-26 13:33:22 +0100 (Mon, 26 Jan 2015) | 15 lines

[mips] Enable arithmetic and binary operations for the i128 data type.

Summary:
This patch adds support for some operations that were missing from
128-bit integer types (add/sub/mul/sdiv/udiv... etc.). With these
changes we can support the __int128_t and __uint128_t data types
from C/C++.

Depends on D7125

Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7143

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@231860 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227087:
------------------------------------------------------------------------
r227087 | vkalintiris | 2015-01-26 13:04:40 +0100 (Mon, 26 Jan 2015) | 7 lines

[mips] Add tests for bitwise binary and integer arithmetic operators.

Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7125

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@231857 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r231563 from adibiagio:

[DAGCombiner] Fix wrong folding of AND dag nodes.

This patch fixes the logic in the DAGCombiner that folds an AND node according
to rule: (and (X (load V)), C) -> (X (load V))

An AND between a vector load 'X' and a constant build_vector 'C' can be folded
into the load itself only if we can prove that the AND operation is redundant.
The algorithm implemented by 'visitAND' firstly computes the splat value 'S'
from C, and then checks if S has the lower 'B' bits set (where B is the size in
bits of the vector element type). The algorithm takes into account also the
'undef' bits in the splat mask.

Unfortunately, the algorithm only worked under the assumption that the size of S
is a multiple of the vector element type. With this patch, we conservatively
avoid folding the AND if the splat bits are not compatible with the vector
element type.

Added X86 test and-load-fold.ll

Differential Revision: http://reviews.llvm.org/D8085

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@231702 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r230058 by d0k:
LoopRotate: When reconstructing loop simplify form don't split edges
from indirectbrs.

Yet another chapter in the endless story. While this looks like we leave
the loop in a non-canonical state this replicates the logic in
LoopSimplify so it doesn't diverge from the canonical form in any way.

PR21968

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@231698 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r229911 by d0k:
MC: Allow multiple comma-separated expressions on the .uleb128
directive.

For compatiblity with GNU as. Binutils documents this as
'.uleb128 expressions'. Subtle, isn't it?

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@231675 91177308-0d34-0410-b5e6-96231b3b80d8

Bump version to 3.6.1

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@231258 91177308-0d34-0410-b5e6-96231b3b80d8

Revert 224782: "Finish removing DestroySource."

Filip Pizlo pointed out that this changes the C API.

It's too late in the release process to figure out how we want to
handle this. Reverting the patch is essentially a way of buying time:
we don't change the API at the source level for now, we're not
trying to fix it with a last-minute patch with a risk of unintended
effects, and we preserve our options for fixing this in 3.6.1.

This is not ideal, but I think it's the best compromise at this stage.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@230431 91177308-0d34-0410-b5e6-96231b3b80d8

ReleaseNotes: final touch-ups

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@230346 91177308-0d34-0410-b5e6-96231b3b80d8

ReleaseNotes: add LLVMSharp & ClangSharp, by Mukul Sabharwal

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@230287 91177308-0d34-0410-b5e6-96231b3b80d8

Fix typo in the release notes: mechanism -> mechanisms

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@230284 91177308-0d34-0410-b5e6-96231b3b80d8

Fix typo in release notes (owns -> own)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@230282 91177308-0d34-0410-b5e6-96231b3b80d8

Note stackmap/patchpoint support for PowerPC in the release notes

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@230281 91177308-0d34-0410-b5e6-96231b3b80d8

Release Notes: add text about garbage collection, from Philip Reames

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@230277 91177308-0d34-0410-b5e6-96231b3b80d8

Add release notes about vectorcall support and Win64 unwind info

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@229874 91177308-0d34-0410-b5e6-96231b3b80d8

Add Go bindings to release notes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@229800 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r229731:
------------------------------------------------------------------------
r229731 | sanjoy | 2015-02-18 11:32:25 -0800 (Wed, 18 Feb 2015) | 10 lines

Partial fix for bug 22589

Don't spend the entire iteration space in the scalar loop prologue if
computing the trip count overflows. This change also gets rid of the
backedge check in the prologue loop and the extra check for
overflowing trip-count.

Differential Revision: http://reviews.llvm.org/D7715

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@229757 91177308-0d34-0410-b5e6-96231b3b80d8

Remove another in-progress warning

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@229712 91177308-0d34-0410-b5e6-96231b3b80d8

Remove warnings about the release notes being for the 'next' release

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@229711 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r229352:
------------------------------------------------------------------------
r229352 | majnemer | 2015-02-15 20:02:09 -0800 (Sun, 15 Feb 2015) | 8 lines

IR: Properly return nullptr when getAggregateElement is out-of-bounds

We didn't properly handle the out-of-bounds case for
ConstantAggregateZero and UndefValue. This would manifest as a crash
when the constant folder was asked to fold a load of a constant global
whose struct type has no operands.

This fixes PR22595.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@229588 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r229529:
------------------------------------------------------------------------
r229529 | rnk | 2015-02-17 12:02:34 -0800 (Tue, 17 Feb 2015) | 8 lines

Expose LLVM_VERSION_PATCH in llvm-config.h

There was no reason to keep this private in config.h, and users
requested that it be available in PR22615.

Also fix a bug where patch versions of '0' would cause the macro to
remain undefined. The "#cmakedefine" command only creates a macro if the
named variable would be considered true in the context of an if().
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@229567 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r229495:
------------------------------------------------------------------------
r229495 | delena | 2015-02-17 05:10:05 -0800 (Tue, 17 Feb 2015) | 8 lines

Fixed a bug in store sinking.
The problem was in store-sink barrier check.

Store sink barrier should be checked for ModRef (read-write) mode.

http://llvm.org/bugs/show_bug.cgi?id=22613

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@229564 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226808:
------------------------------------------------------------------------
r226808 | delena | 2015-01-22 04:07:59 -0800 (Thu, 22 Jan 2015) | 10 lines

Fixed a bug in type legalizer for masked load/store intrinsics.
The problem occurs when after vectorization we have type
<2 x i32>. This type is promoted to <2 x i64> and then requires
additional efforts for expanding loads and truncating stores.
I added EXPAND / TRUNCATE attributes to the masked load/store
SDNodes. The code now contains additional shuffles.
I've prepared changes in the cost estimation for masked memory
operations, it will be submitted separately.

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@229561 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226791:
------------------------------------------------------------------------
r226791 | delena | 2015-01-22 00:20:06 -0800 (Thu, 22 Jan 2015) | 7 lines

Fixed a bug in masked load/store in reversed loop.
Added a test.

The bug was submitted to bugzilla:
http://llvm.org/bugs/show_bug.cgi?id=22225

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@229555 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r229343 and r229351:

------------------------------------------------------------------------
r229343 | lhames | 2015-02-15 15:22:43 -0800 (Sun, 15 Feb 2015) | 6 lines

[ExecutionEngine] Fix dependence issue by moving RTDyldMemoryManager into
RuntimeDyld.

This should fix http://llvm.org/PR22593.

------------------------------------------------------------------------
r229351 | chapuni | 2015-02-15 18:13:30 -0800 (Sun, 15 Feb 2015) | 1 line

[CMake] Add RuntimeDyld to libdeps corresponding to r229343.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@229553 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r229421:
------------------------------------------------------------------------
r229421 | dexonsmith | 2015-02-16 11:18:01 -0800 (Mon, 16 Feb 2015) | 33 lines

Bitcode: Fix major regression: large files w/ debug info

The metadata/value split introduced a major regression reading large
bitcode files that contain debug info (or other cyclic (non-self
reference) metadata graphs).  For the first time in a while, I dropped
from libLTO.dylib down to `llvm-lto` with a non-trivial bitcode file
(~350MB), and I hit this when reading the result of ld64's `-save-temps`
in `llvm-lto`.

Here's pseudo-code for what was going on:

    read-main-metadata-block:
      for each md:
        if has-fwd-ref: // Only true for cyclic graphs.
          any-fwd-refs <- true
      if any-fwd-refs:
        foreach md:
          resolve-cycles(md) // Handle cycles.

    foreach function:
      read-function-metadata-block: // Such as !alias, !loop
        if any-fwd-refs:
          foreach md: // (all metadata, not just this block)
            resolve-cycles(md) // A no-op, but the loop is expensive!!

This commit resets the `AnyFwdRefs` flag to `false`.  This on its own
was enough to change my Release+Asserts `llvm-lto` time for reading this
bitcode from over 20 minutes (I gave up on it) to 20 seconds.  I've gone
further by tracking the min/max metadata forward-references in a
metadata block.  This protects against a schema that has lots of
functions that each reference their own metadata cycle.

Unfortunately, this regression is in the 3.6 branch as well.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@229545 91177308-0d34-0410-b5e6-96231b3b80d8

Add LDC compiler as external project to the 3.6 release notes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@229418 91177308-0d34-0410-b5e6-96231b3b80d8

added Likely to ReleaseNotes - Open Source External Projects

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@229259 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r229029, minus the test which didn't work on the branch:

------------------------------------------------------------------------
r229029 | chandlerc | 2015-02-12 18:30:01 -0800 (Thu, 12 Feb 2015) | 16 lines

[IC] Fix a bug with the instcombine canonicalizing of loads and
propagating of metadata.

We were propagating !nonnull metadata even when the newly formed load is
no longer of a pointer type. This is clearly broken and results in LLVM
failing the verifier and aborting. This patch just restricts the
propagation of !nonnull metadata to when we actually have a pointer
type.

This bug report and the initial version of this patch was provided by
Charles Davis! Many thanks for finding this!

We still need to add logic to round-trip the metadata correctly if we
combine from pointer types to integer types and then back by using range
metadata for the integer type loads. But this is the minimal and safe
version of the patch, which is important so we can backport it into 3.6.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@229036 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r228969:
------------------------------------------------------------------------
r228969 | hfinkel | 2015-02-12 14:43:52 -0800 (Thu, 12 Feb 2015) | 7 lines

[SDAG] Don't try to use FP_EXTEND/FP_ROUND for int<->fp promotions

The PowerPC backend has long promoted some floating-point vector operations
(such as select) to integer vector operations. Unfortunately, this behavior was
broken by r216555. When using FP_EXTEND/FP_ROUND for promotions, we must check
that both the old and new types are floating-point types. Otherwise, we must
use BITCAST as we did prior to r216555 for everything.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228986 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226616:
------------------------------------------------------------------------
r226616 | adrian | 2015-01-20 14:37:25 -0800 (Tue, 20 Jan 2015) | 2 lines

DebugLocs without a scope should fail the verification.
Follow-up to r226588.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228985 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226588:
------------------------------------------------------------------------
r226588 | adrian | 2015-01-20 10:03:37 -0800 (Tue, 20 Jan 2015) | 1 line

Add an assertion and prefer a crash over an infinite loop.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228984 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r228979:
------------------------------------------------------------------------
r228979 | majnemer | 2015-02-12 15:26:26 -0800 (Thu, 12 Feb 2015) | 8 lines

X86: Don't crash if we can't decode the pshufb mask

Constant pool entries are uniqued by their contents regardless of their
type. This means that a pshufb can have a shuffle mask which isn't a
simple array of bytes.

The code path which attempts to decode the mask didn't check for
failure, causing PR22559.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228983 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r228957:
------------------------------------------------------------------------
r228957 | bsteinbr | 2015-02-12 13:04:22 -0800 (Thu, 12 Feb 2015) | 14 lines

Fix a crash in the assumption cache when inlining indirect function calls

Summary:
Instances of the AssumptionCache are per function, so we can't re-use
the same AssumptionCache instance when recursing in the CallAnalyzer to
analyze a different function. Instead we have to pass the
AssumptionCacheTracker to the CallAnalyzer so it can get the right
AssumptionCache on demand.

Reviewers: hfinkel

Subscribers: llvm-commits, hans

Differential Revision: http://reviews.llvm.org/D7533
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228965 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r228842:
------------------------------------------------------------------------
r228842 | jvoung | 2015-02-11 08:12:50 -0800 (Wed, 11 Feb 2015) | 17 lines

Gold-plugin: Broaden scope of get/release_input_file to scope of Module.

Summary:
Move calls to get_input_file and release_input_file out of
getModuleForFile(). Otherwise release_input_file may end up
unmapping a view of the file while the view is still being
used by the Module (on 32-bit hosts).

Fix for PR22482.

Test Plan: Add test using --no-map-whole-files.

Reviewers: rafael, nlewycky

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7539
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228942 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r228899:
------------------------------------------------------------------------
r228899 | chandlerc | 2015-02-11 18:30:56 -0800 (Wed, 11 Feb 2015) | 28 lines

[slp] Fix a nasty bug in the SLP vectorizer that Joerg pointed out.
Apparently some code finally started to tickle this after my
canonicalization changes to instcombine.

The bug stems from trying to form a vector type out of scalars that
aren't compatible at all. In this example, from x86_mmx values. The code
in the vectorizer that checks for reasonable types whas checking for
aggregates or vectors, but there are lots of other types that should
just never reach the vectorizer.

Debugging this was made more confusing by the lie in an assert in
VectorType::get() -- it isn't that the types are *primitive*. The types
must be integer, pointer, or floating point types. No other types are
allowed.

I've improved the assert and added a helper to the vectorizer to handle
the element type validity checks. It now re-uses the VectorType static
function and then further excludes weird target-specific types that we
probably shouldn't be touching here (x86_fp80 and ppc_fp128). Neither of
these are really reachable anyways (neither 80-bit nor 128-bit things
will get vectorized) but it seems better to just eagerly exclude such
nonesense.

I've added a test case, but while it definitely covers two of the paths
through this code there may be more paths that would benefit from test
coverage. I'm not familiar enough with the SLP vectorizer to synthesize
test cases for all of these, but was able to update the code itself by
inspection.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228940 91177308-0d34-0410-b5e6-96231b3b80d8

Reverting r225904 to fix PR22505:

------------------------------------------------------------------------
r225904 | rnk | 2015-01-13 17:05:27 -0800 (Tue, 13 Jan 2015) | 27 lines

CodeGen support for x86_64 SEH catch handlers in LLVM

This adds handling for ExceptionHandling::MSVC, used by the
x86_64-pc-windows-msvc triple. It assumes that filter functions have
already been outlined in either the frontend or the backend. Filter
functions are used in place of the landingpad catch clause type info
operands. In catch clause order, the first filter to return true will
catch the exception.

The C specific handler table expects the landing pad to be split into
one block per handler, but LLVM IR uses a single landing pad for all
possible unwind actions. This patch papers over the mismatch by
synthesizing single instruction BBs for every catch clause to fill in
the EH selector that the landing pad block expects.

Missing functionality:
- Accessing data in the parent frame from outlined filters
- Cleanups (from __finally) are unsupported, as they will require
outlining and parent frame access
- Filter clauses are unsupported, as there's no clear analogue in SEH

In other words, this is the minimal set of changes needed to write IR to
catch arbitrary exceptions and resume normal execution.

Reviewers: majnemer

Differential Revision: http://reviews.llvm.org/D6300
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228891 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r228793:
------------------------------------------------------------------------
r228793 | bogner | 2015-02-10 18:52:44 -0800 (Tue, 10 Feb 2015) | 12 lines

InstrProf: Lower coverage mappings by setting their sections appropriately

Add handling for __llvm_coverage_mapping to the InstrProfiling
pass. We need to make sure the constant and any profile names it
refers to are in the correct sections, which is easier and cleaner to
do here where we have to know about profiling sections anyway.

This is really tricky to test without a frontend, so I'm committing
the test for the fix in clang. If anyone knows a good way to test this
within LLVM, please let me know.

Fixes PR22531.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228800 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r228760 and r228761:

------------------------------------------------------------------------
r228760 | majnemer | 2015-02-10 15:09:43 -0800 (Tue, 10 Feb 2015) | 3 lines

EarlyCSE: It isn't safe to CSE across synchronization boundaries

This fixes PR22514.
------------------------------------------------------------------------

------------------------------------------------------------------------
r228761 | majnemer | 2015-02-10 15:11:02 -0800 (Tue, 10 Feb 2015) | 1 line

EarlyCSE: Add check lines for test added in r228760
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228790 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r228656:
------------------------------------------------------------------------
r228656 | chandlerc | 2015-02-09 18:25:56 -0800 (Mon, 09 Feb 2015) | 16 lines

[x86] Fix PR22524: the DAG combiner was incorrectly handling illegal
nodes when folding bitcasts of constants.

We can't fold things and then check after-the-fact whether it was legal.
Once we have formed the DAG node, arbitrary other nodes may have been
collapsed to it. There is no easy way to go back. Instead, we need to
test for the specific folding cases we're interested in and ensure those
are legal first.

This could in theory make this less powerful for bitcasting from an
integer to some vector type, but AFAICT, that can't actually happen in
the SDAG so its fine. Now, we *only* whitelist specific int->fp and
fp->int bitcasts for post-legalization folding. I've added the test case
from the PR.

(Also as a note, this does not appear to be in 3.6, no backport needed)
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228787 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r228565:
------------------------------------------------------------------------
r228565 | majnemer | 2015-02-08 22:31:31 -0800 (Sun, 08 Feb 2015) | 3 lines

MC: Calculate intra-section symbol differences correctly for COFF

This fixes PR22060.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228667 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r228525:
------------------------------------------------------------------------
r228525 | bsteinbr | 2015-02-08 09:07:14 -0800 (Sun, 08 Feb 2015) | 14 lines

Correctly combine alias.scope metadata by a union instead of intersecting

Summary:
The alias.scope metadata represents sets of things an instruction might
alias with. When generically combining the metadata from two
instructions the result must be the union of the original sets, because
the new instruction might alias with anything any of the original
instructions aliased with.

Reviewers: hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7490
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228561 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r228518:
------------------------------------------------------------------------
r228518 | tnorthover | 2015-02-07 16:50:47 -0800 (Sat, 07 Feb 2015) | 15 lines

ARM & AArch64: teach LowerVSETCC that output type size may differ from input.

While various DAG combines try to guarantee that a vector SETCC
operation will have the same output size as input, there's nothing
intrinsic to either creation or LegalizeTypes that actually guarantees
it, so the function needs to be ready to handle a mismatch.

Fortunately this is easy enough, just extend or truncate the naturally
compared result.

I couldn't reproduce the failure in other backends that I know have
SIMD, so it's probably only an issue for these two due to shared
heritage.

Should fix PR21645.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228560 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r228507:
------------------------------------------------------------------------
r228507 | joerg | 2015-02-07 13:24:06 -0800 (Sat, 07 Feb 2015) | 4 lines

Avoid integer overflows around realloc calls resulting in potential
heap. Problem identified by Guido Vranken. Changes differ from original
OpenBSD sources by not depending on non-portable reallocarray.

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228511 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r228500:
------------------------------------------------------------------------
r228500 | bsteinbr | 2015-02-07 09:54:36 -0800 (Sat, 07 Feb 2015) | 5 lines

Properly update AA metadata when performing call slot optimization

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7482
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228504 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r228490:
------------------------------------------------------------------------
r228490 | majnemer | 2015-02-07 00:26:40 -0800 (Sat, 07 Feb 2015) | 5 lines

MC: Emit COFF section flags in the "proper" order

COFF section flags are not idempotent:
'rd' will make a read-write section because 'd' implies write
'dr' will make a read-only section because 'r' disables write
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228502 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227628:
------------------------------------------------------------------------
r227628 | lhames | 2015-01-30 14:28:49 -0800 (Fri, 30 Jan 2015) | 6 lines

[PBQP] Fix transposed worst row/column check in handleAdd/RemoveNode in the PBQP
allocator.

Patch by Jonas Paulsson. Thanks Jonas!

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228501 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r228411:
------------------------------------------------------------------------
r228411 | rnk | 2015-02-06 09:59:49 -0800 (Fri, 06 Feb 2015) | 3 lines

Don't dllexport declarations

Fixes PR22488
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228480 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r228444:
------------------------------------------------------------------------
r228444 | eugenis | 2015-02-06 13:47:39 -0800 (Fri, 06 Feb 2015) | 8 lines

[msan] Fix "missing origin" in atomic store.

An atomic store always make the target location fully initialized (in the
current implementation). It should not store origin. Initialized memory can't
have meaningful origin, and, due to origin granularity (4 bytes) there is a
chance that this extra store would overwrite meaningfull origin for an adjacent
location.

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228445 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r228331:
------------------------------------------------------------------------
r228331 | sylvestre | 2015-02-05 10:57:02 -0800 (Thu, 05 Feb 2015) | 30 lines

Fix an incorrect identifier

Summary:
EIEIO is not a correct declaration and breaks the build under Debian HURD.
Instead, E_IEIO is used.

//
http://www.gnu.org/software/libc/manual/html_node/Reserved-Names.html
Some additional classes of identifier names are reserved for future
extensions to the C language or the POSIX.1 environment. While using
these names for your own purposes right now might not cause a problem,
they do raise the possibility of conflict with future versions of the C
or POSIX standards, so you should avoid these names.
...
Names beginning with a capital ?\226?\128?\152E?\226?\128?\153 followed a digit or uppercase letter
may be used for additional error code names. See Error Reporting.//

Reported here:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=776965
And patch wrote by Svante Signell
With this patch, LLVM, Clang & LLDB build under Debian HURD:
https://buildd.debian.org/status/fetch.php?pkg=llvm-toolchain-3.6&arch=hurd-i386&ver=1%3A3.6~%2Brc2-2&stamp=1423040039

Reviewers: hfinkel

Reviewed By: hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7437
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228335 91177308-0d34-0410-b5e6-96231b3b80d8

Merge r227983 (along with r227815 and r227972 to make it apply cleanly)

------------------------------------------------------------------------
r227815 | spatel | 2015-02-02 09:47:30 -0800 (Mon, 02 Feb 2015) | 2 lines

fix typo

------------------------------------------------------------------------

------------------------------------------------------------------------
r227972 | spatel | 2015-02-03 07:37:18 -0800 (Tue, 03 Feb 2015) | 10 lines

Improve test to actually check for a folded load.

This test was checking for lack of a "movaps" (an aligned load)
rather than a "movups" (an unaligned load). It also included
a store which complicated the checking.

Add specific CPU runs to prevent subtarget feature flag overrides
from inhibiting this optimization.
------------------------------------------------------------------------

------------------------------------------------------------------------
r227983 | spatel | 2015-02-03 09:13:04 -0800 (Tue, 03 Feb 2015) | 22 lines

Fix program crashes due to alignment exceptions generated for SSE memop instructions (PR22371).

r224330 introduced a bug by misinterpreting the "FeatureVectorUAMem" bit.
The commit log says that change did not affect anything, but that's not correct.
That change allowed SSE instructions to have unaligned mem operands folded into
math ops, and that's not allowed in the default specification for any SSE variant.

The bug is exposed when compiling for an AVX-capable CPU that had this feature
flag but without enabling AVX codegen. Another mistake in r224330 was not adding
the feature flag to all AVX CPUs; the AMD chips were excluded.

This is part of the fix for PR22371 ( http://llvm.org/bugs/show_bug.cgi?id=22371 ).

This feature bit is SSE-specific, so I've renamed it to "FeatureSSEUnalignedMem".
Changed the existing test case for the feature bit to reflect the new name and
renamed the test file itself to better reflect the feature.
Added runs to fold-vex.ll to check for the failing codegen.

Note that the feature bit is not set by default on any CPU because it may require a
configuration register setting to enable the enhanced unaligned behavior.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228323 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r228303:

------------------------------------------------------------------------
r228303 | thomas.stellard | 2015-02-05 10:32:18 -0500 (Thu, 05 Feb 2015) | 11 lines

R600/SI: Fix bug in TTI loop unrolling preferences

We should be setting UnrollingPreferences::MaxCount to MAX_UINT instead
of UnrollingPreferences::Count.

Count is a 'forced unrolling factor', while MaxCount sets an upper
limit to the unrolling factor.

Setting Count to MAX_UINT was causing the loop in the testcase to be
unrolled 15 times, when it only had a maximum of 4 iterations.

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228320 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r228302:

------------------------------------------------------------------------
r228302 | thomas.stellard | 2015-02-05 10:32:15 -0500 (Thu, 05 Feb 2015) | 34 lines

R600/SI: Fix bug from insertion of llvm.SI.end.cf into loop headers

The llvm.SI.end.cf intrinsic is used to mark the end of if-then blocks,
if-then-else blocks, and loops.  It is responsible for updating the
exec mask to re-enable threads that had been masked during the preceding
control flow block.  For example:

s_mov_b64 exec, 0x3                 ; Initial exec mask
s_mov_b64 s[0:1], exec              ; Saved exec mask
v_cmpx_gt_u32 exec, s[2:3], v0, 0   ; llvm.SI.if
do_stuff()
s_or_b64 exec, exec, s[0:1]         ; llvm.SI.end.cf

The bug fixed by this patch was one where the llvm.SI.end.cf intrinsic
was being inserted into the header of loops.  This would happen when
an if block terminated in a loop header and we would end up with
code like this:

s_mov_b64 exec, 0x3                 ; Initial exec mask
s_mov_b64 s[0:1], exec              ; Saved exec mask
v_cmpx_gt_u32 exec, s[2:3], v0, 0   ; llvm.SI.if
do_stuff()

LOOP:                       ; Start of loop header
s_or_b64 exec, exec, s[0:1] ; llvm.SI.end.cf <-BUG: The exec mask has the
                              same value at the beginning of each loop
      iteration.
do_stuff();
s_cbranch_execnz LOOP

The fix is to create a new basic block before the loop and insert the
llvm.SI.end.cf there.  This way the exec mask is restored before the
start of the loop instead of at the beginning of each iteration.

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228319 91177308-0d34-0410-b5e6-96231b3b80d8

Add microMIPS codegen to the release notes.

It only fails bullet, and smallpt now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228287 91177308-0d34-0410-b5e6-96231b3b80d8

Utils: Resolve cycles under distinct MDNodes

Track unresolved nodes under distinct `MDNode`s during `MapMetadata()`,
and resolve them at the end. Previously, these cycles wouldn't get
resolved.

Conflicts:
lib/Transforms/Utils/ValueMapper.cpp

(This is a reimplementation of r228180 for the 3.6 release branch.)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228199 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r228168:
------------------------------------------------------------------------
r228168 | mkuper | 2015-02-04 10:54:01 -0800 (Wed, 04 Feb 2015) | 3 lines

Fixes a bug in vector load legalization that confused bits and bytes.

Differential Revision: http://reviews.llvm.org/D7400
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228197 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r228049:
------------------------------------------------------------------------
r228049 | hans | 2015-02-03 14:08:20 -0800 (Tue, 03 Feb 2015) | 12 lines

[CMake] add_llvm_library: don't use .imp suffix for import libraries on Windows (PR22334)

This was added in r188351 to fix a naming conflict between the
profile_rt-static and profile_rt-shared who both ended up in
lib/profile_rt.lib.

The change also affected other libraries (like libclang), and
users are reporting that they find it surprising that there's
no longer a libclang.lib. Since the profile_rt naming conflict
doesn't seem to exist any more, I think we can remove this.

Differential Revision: http://reviews.llvm.org/D7391
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228165 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r228129:
------------------------------------------------------------------------
r228129 | rengolin | 2015-02-04 02:11:59 -0800 (Wed, 04 Feb 2015) | 8 lines

Reverting VLD1/VST1 base-updating/post-incrementing combining

This reverts patches 223862, 224198, 224203, and 224754, which were all
related to the vector load/store combining and were reverted/reaplied
a few times due to the same alignment problems we're seeing now.

Further tests, mainly self-hosting Clang, will be needed to reapply this
patch in the future.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228153 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226809:
------------------------------------------------------------------------
r226809 | timurrrr | 2015-01-22 04:24:21 -0800 (Thu, 22 Jan 2015) | 1 line

[ASan/Win] Move the shadow to 0x30000000
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228008 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227934:
------------------------------------------------------------------------
r227934 | rafael | 2015-02-02 17:53:03 -0800 (Mon, 02 Feb 2015) | 1 line

Propagate a better error message to the C api.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227985 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227903:
------------------------------------------------------------------------
r227903 | rafael | 2015-02-02 16:49:57 -0800 (Mon, 02 Feb 2015) | 1 line

Use a non-fatal diag handler in the C API. FIxes PR22368.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227984 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227809:
------------------------------------------------------------------------
r227809 | jvoung | 2015-02-02 08:56:50 -0800 (Mon, 02 Feb 2015) | 16 lines

Fix ARM peephole optimizeCompare to avoid optimizing unsigned cmp to 0.

Summary:
Previously it only avoided optimizing signed comparisons to 0.
Sometimes the DAGCombiner will optimize the unsigned comparisons
to 0 before it gets to the peephole pass, but sometimes it doesn't.

Fix for PR22373.

Test Plan: test/CodeGen/ARM/sub-cmp-peephole.ll

Reviewers: jfb, manmanren

Subscribers: aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D7274
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227864 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227670:
------------------------------------------------------------------------
r227670 | compnerd | 2015-01-30 20:12:06 -0800 (Fri, 30 Jan 2015) | 5 lines

ARM: make a table more readable (NFC)

This adds some comments and splits the flag calculation on type boundaries to
make the table more readable. Addresses some post-commit review comments to SVN
r227603. NFC.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227856 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227603:
------------------------------------------------------------------------
r227603 | compnerd | 2015-01-30 11:35:18 -0800 (Fri, 30 Jan 2015) | 7 lines

ARM: further correct .fpu directive handling

If the original FPU specification involved a restricted VFP unit (d16), ensure
that we reset the functionality when we encounter a new FPU type. In
particular, if the user specified vfpv3-d16, but switched to a VFPv3 (which has
32 double precision registers), we would fail to reset the D16 feature, and
treat it as being equivalent to vfpv3-d16.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227854 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227618:

------------------------------------------------------------------------
r227618 | thomas.stellard | 2015-01-30 16:51:51 -0500 (Fri, 30 Jan 2015) | 4 lines

R600/SI: Handle SI_SPILL_V96_RESTORE in SIRegisterInfo::eliminateFrameIndex()

This fixes a crash in Unigine Heaven.

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227823 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r227332, which merged in r227300: "[LPM] Rip all of ManagedStatic
and ThreadLocal out of the pretty stack tracing code."

The patch has been having trouble on trunk and doesn't seem ready for 3.6.
Reverting to get it out of the branch before tagging rc2.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227646 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227584:
------------------------------------------------------------------------
r227584 | compnerd | 2015-01-30 09:58:25 -0800 (Fri, 30 Jan 2015) | 10 lines

ARM: correct handling of .fpu directive

The FPU directive permits the user to switch the target FPU, enabling
instructions that would be otherwise unavailable. However, when configuring the
new subtarget features, we would not enable the implied functions for newer
FPUs. This would result in invalid rejection of valid input. Ensure that we
inherit the implied FPU functionality when enabling newer versions of the FPU.
Fortunately, these are mostly hierarchical, unlike the CPUs.

Addresses PR22395.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227637 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227462:

------------------------------------------------------------------------
r227462 | thomas.stellard | 2015-01-29 11:55:28 -0500 (Thu, 29 Jan 2015) | 2 lines

R600/SI: Remove stray debug statements

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227597 91177308-0d34-0410-b5e6-96231b3b80d8

R600/SI: Define a schedule model and enable the generic machine scheduler

The schedule model is not complete yet, and could be improved.

This is a partial merge of r227461. The difference is that it
does not enable the machine scheduler by default.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227596 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227491:
------------------------------------------------------------------------
r227491 | spatel | 2015-01-29 12:51:49 -0800 (Thu, 29 Jan 2015) | 13 lines

[GVN] don't propagate equality comparisons of FP zero (PR22376)

In http://reviews.llvm.org/D6911, we allowed GVN to propagate FP equalities
to allow some simple value range optimizations. But that introduced a bug
when comparing to -0.0 or 0.0: these compare equal even though they are not
bitwise identical.

This patch disallows propagating zero constants in equality comparisons.
Fixes: http://llvm.org/bugs/show_bug.cgi?id=22376
Differential Revision: http://reviews.llvm.org/D7257

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227537 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227290:
------------------------------------------------------------------------
r227290 | dblaikie | 2015-01-27 18:34:53 -0800 (Tue, 27 Jan 2015) | 1 line

PR22356: DebugInfo: Handle the size of a member where the type of that member is a typedef (or other sugar) of a declaration.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227492 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227339:
------------------------------------------------------------------------
r227339 | bsteinbr | 2015-01-28 10:32:31 -0800 (Wed, 28 Jan 2015) | 3 lines

Fix build breakage caused by memory leaks in llvm-c-test

I accidently introduced those in r227319.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227477 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227319:
------------------------------------------------------------------------
r227319 | bsteinbr | 2015-01-28 08:35:59 -0800 (Wed, 28 Jan 2015) | 10 lines

Fix LLVMSetMetadata and LLVMAddNamedMetadataOperand for single value MDNodes

Summary:
MetadataAsValue uses a canonical format that strips the MDNode if it
contains only a single constant value. This triggers an assertion when
trying to cast the value to a MDNode.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7165
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227475 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227085:
------------------------------------------------------------------------
r227085 | joerg | 2015-01-26 03:41:48 -0800 (Mon, 26 Jan 2015) | 13 lines

The canonical CPU variant for ARM according to config.guess uses a
suffix it seems:

# ./config.guess
earmv7hfeb-unknown-netbsd7.99.4

Extend the triple parsing to support this. Avoid running the ARM parser
multiple times because StringSwitch is not lazy.

Reviewers: Renato Golin, Tim Northover

Differential Revision: http://reviews.llvm.org/D7166

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227394 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226970:

------------------------------------------------------------------------
r226970 | thomas.stellard | 2015-01-23 18:59:08 -0500 (Fri, 23 Jan 2015) | 2 lines

R600/SI: Emit .hsa.version section for amdhsa OS

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227365 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226945:

------------------------------------------------------------------------
r226945 | thomas.stellard | 2015-01-23 17:05:45 -0500 (Fri, 23 Jan 2015) | 9 lines

R600/SI: Move i64 -> v2i32 load promotion into AMDGPUDAGToDAGISel::Select()

We used to do this promotion during DAG legalization, but this
caused an infinite loop in ExpandUnalignedLoad() because it assumed
that i64 loads were legal if i64 was a legal type.

It also seems better to report i64 loads as legal, since they actually
are and we were just promoting them to simplify our tablegen files.

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227364 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227250:
------------------------------------------------------------------------
r227250 | ab | 2015-01-27 13:52:16 -0800 (Tue, 27 Jan 2015) | 31 lines

[SimplifyLibCalls] Don't confuse strcpy_chk for stpcpy_chk.

This was introduced in a faulty refactoring (r225640, mea culpa):
the tests weren't testing the return values, so, for both
__strcpy_chk and __stpcpy_chk, we would return the end of the
buffer (matching stpcpy) instead of the beginning (for strcpy).

The root cause was the prefix "__" being ignored when comparing,
which made us always pick LibFunc::stpcpy_chk.
Pass the LibFunc::Func directly to avoid this kind of error.
Also, make the testcases as explicit as possible to prevent this.

The now-useful testcases expose another, entangled, stpcpy problem,
with the further simplification.  This was introduced in a
refactoring (r225640) to match the original behavior.

However, this leads to problems when successive simplifications
generate several similar instructions, none of which are removed
by the custom replaceAllUsesWith.

For instance, InstCombine (the main user) doesn't erase the
instruction in its custom RAUW.  When trying to simplify say
__stpcpy_chk:
- first, an stpcpy is created (fortified simplifier),
- second, a memcpy is created (normal simplifier), but the
  stpcpy call isn't removed.
- third, InstCombine later revisits the instructions,
  and simplifies the first stpcpy to a memcpy.  We now have
  two memcpys.

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227346 91177308-0d34-0410-b5e6-96231b3b80d8

Backport r226026 and r226031 to 3.6.

This fixes pr22351.

Original messages:

r226026:
Fix handling of extern_weak. This was broken by r225983

r226031:
Fix linking of shared libraries.

In shared libraries the plugin can see non-weak declarations that are still
undefined.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227344 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227300:
------------------------------------------------------------------------
r227300 | chandlerc | 2015-01-28 01:52:14 -0800 (Wed, 28 Jan 2015) | 34 lines

[LPM] Rip all of ManagedStatic and ThreadLocal out of the pretty stack
tracing code.

Managed static was just insane overhead for this. We took memory fences
and external function calls in every path that pushed a pretty stack
frame. This includes a multitude of layers setting up and tearing down
passes, the parser in Clang, everywhere. For the regression test suite
or low-overhead JITs, this was contributing to really significant
overhead.

Even the LLVM ThreadLocal is really overkill here because it uses
pthread_{set,get}_specific logic, and has careful code to both allocate
and delete the thread local data. We don't actually want any of that,
and this code in particular has problems coping with deallocation. What
we want is a single TLS pointer that is valid to use during global
construction and during global destruction, any time we want. That is
exactly what every host compiler and OS we use has implemented for
a long time, and what was standardized in C++11. Even though not all of
our host compilers support the thread_local keyword, we can directly use
the platform-specific keywords to get the minimal functionality needed.
Provided this limited trial survives the build bots, I will move this to
Compiler.h so it is more widely available as a light weight if limited
alternative to the ThreadLocal class. Many thanks to David Majnemer for
helping me think through the implications across platforms and craft the
MSVC-compatible syntax.

The end result is *substantially* faster. When running llc in a tight
loop over a small IR file targeting the aarch64 backend, this improves
its performance by over 10% for me. It also seems likely to fix the
remaining regressions seen by JIT users with threading enabled.

This may actually have more impact on real-world compile times due to
the use of the pretty stack tracing utility throughout the rest of Clang
or LLVM, but I've not collected any detailed measurements.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227332 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227299:
------------------------------------------------------------------------
r227299 | chandlerc | 2015-01-28 01:47:21 -0800 (Wed, 28 Jan 2015) | 48 lines

[LPM] A targeted but somewhat horrible fix to the legacy pass manager's
querying of the pass registry.

The pass manager relies on the static registry of PassInfo objects to
perform all manner of its functionality. I don't understand why it does
much of this. My very vague understanding is that this registry is
touched both during static initialization *and* while each pass is being
constructed. As a consequence it is hard to make accessing it not
require a acquiring some lock. This lock ends up in the hot path of
setting up, tearing down, and invaliditing analyses in the legacy pass
manager.

On most systems you can observe this as a non-trivial % of the time
spent in 'ninja check-llvm'. However, I haven't really seen it be more
than 1% in extreme cases of compiling more real-world software,
including LTO.

Unfortunately, some of the GPU JITs are seeing this taking essentially
all of their time because they have very small IR running through
a small pass pipeline very many times (at least, this is the vague
understanding I have of it).

This patch tries to minimize the cost of looking up PassInfo objects by
leveraging the fact that the objects themselves are immutable and they
are allocated separately on the heap and so don't have their address
change. It also requires a change I made the last time I tried to debug
this problem which removed the ability to de-register a pass from the
registry. This patch creates a single access path to these objects
inside the PMTopLevelManager which memoizes the result of querying the
registry. This is somewhat gross as I don't really know if
PMTopLevelManager is the *right* place to put it, and I dislike using
a mutable member to memoize things, but it seems to work.

For long-lived pass managers this should completely eliminate
the cost of acquiring locks to look into the pass registry once the
memoized cache is warm. For 'ninja check' I measured about 1.5%
reduction in CPU time and in total time on a machine with 32 hardware
threads. For normal compilation, I don't know how much this will help,
sadly. We will still pay the cost while we populate the memoized cache.
I don't think it will hurt though, and for LTO or compiles with many
small functions it should still be a win. However, for tight loops
around a pass manager with many passes and small modules, this will help
tremendously. On the AArch64 backend I saw nearly 50% reductions in time
to complete 2000 cycles of spinning up and tearing down the pipeline.
Measurements from Owen of an actual long-lived pass manager show more
along the lines of 10% improvements.

Differential Revision: http://reviews.llvm.org/D7213
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227331 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227294:
------------------------------------------------------------------------
r227294 | chandlerc | 2015-01-27 20:57:56 -0800 (Tue, 27 Jan 2015) | 23 lines

[LPM] Stop using the string based preservation API. It is an
abomination.

For starters, this API is incredibly slow. In order to lookup the name
of a pass it must take a memory fence to acquire a pointer to the
managed static pass registry, and then potentially acquire locks while
it consults this registry for information about what passes exist by
that name. This stops the world of LLVMs in your process no matter
how little they cared about the result.

To make this more joyful, you'll note that we are preserving many passes
which *do not exist* any more, or are not even analyses which one might
wish to have be preserved. This means we do all the work only to say
"nope" with no error to the user.

String-based APIs are a *bad idea*. String-based APIs that cannot
produce any meaningful error are an even worse idea. =/

I have a patch that simply removes this API completely, but I'm hesitant
to commit it as I don't really want to perniciously break out-of-tree
users of the old pass manager. I'd rather they just have to migrate to
the new one at some point. If others disagree and would like me to kill
it with fire, just say the word. =]
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227328 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227261:
------------------------------------------------------------------------
r227261 | compnerd | 2015-01-27 14:57:39 -0800 (Tue, 27 Jan 2015) | 6 lines

SymbolRewriter: allow rewriting with comdats

COMDATs must be identically named to the symbol. When support for COMDATs was
introduced, the symbol rewriter was not updated, resulting in rewriting failing
for symbols which were placed into COMDATs. This corrects the behaviour and
adds test cases for this.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227324 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227260:
------------------------------------------------------------------------
r227260 | compnerd | 2015-01-27 14:57:35 -0800 (Tue, 27 Jan 2015) | 4 lines

SymbolRewriter: prevent unnecessary rewrite

The rewrite for the pattern based rewrite is unnecessary if the existing name
matches the pattern.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227323 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227005:
------------------------------------------------------------------------
r227005 | dsanders | 2015-01-24 14:35:11 +0000 (Sat, 24 Jan 2015) | 38 lines

[mips] Fix 'jumpy' debug line info around calls.

Summary:
At the moment, address calculation is taking the debug line info from the
address node (e.g. TargetGlobalAddress). When a function is called multiple
times, this results in output of the form:

  .loc $first_call_location
  .. address calculation ..
  .. function call ..
  .. address calculation ..
  .loc $second_call_location
  .. function call ..
  .loc $first_call_location
  .. address calculation ..
  .loc $third_call_location
  .. function call ..

This patch makes address calculations for function calls take the debug line
info for the call node and results in output of the form:
  .loc $first_call_location
  .. address calculation ..
  .. function call ..
  .loc $second_call_location
  .. address calculation ..
  .. function call ..
  .loc $third_call_location
  .. address calculation ..
  .. function call ..

All other address calculations continue to use the address node.

Test Plan: Fixes test/DebugInfo/multiline.ll on a mips host.

Subscribers: dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D7050

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227193 91177308-0d34-0410-b5e6-96231b3b80d8

pocl and TCE work with LLVM 3.6 now. Added them to the ReleaseNotes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227188 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226664:
------------------------------------------------------------------------
r226664 | tnorthover | 2015-01-21 07:43:31 -0800 (Wed, 21 Jan 2015) | 7 lines

AArch64: add backend option to reserve x18 (platform register)

AAPCS64 says that it's up to the platform to specify whether x18 is
reserved, and a first step on that way is to add a flag controlling
it.

From: Andrew Turner <andrew@fubar.geek.nz>
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227150 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226711:
------------------------------------------------------------------------
r226711 | jroelofs | 2015-01-21 14:39:43 -0800 (Wed, 21 Jan 2015) | 8 lines

Fix load-store optimizer on thumbv4t

Thumbv4t does not have lo->lo copies other than MOVS,
and that can't be predicated. So emit MOVS when needed
and bail if there's a predicate.

http://reviews.llvm.org/D6592

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226918 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226708:
------------------------------------------------------------------------
r226708 | majnemer | 2015-01-21 14:32:04 -0800 (Wed, 21 Jan 2015) | 4 lines

InstCombine: Don't strip bitcasts off of callsites marked 'thunk'

The return type of a thunk is meaningless, we just want the arguments
and return value to be forwarded.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226854 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226755:
------------------------------------------------------------------------
r226755 | sanjoy | 2015-01-21 16:48:47 -0800 (Wed, 21 Jan 2015) | 11 lines

Make ScalarEvolution less aggressive with respect to no-wrap flags.

ScalarEvolution currently lowers a subtraction recurrence to an add
recurrence with the same no-wrap flags as the subtraction. This is
incorrect because `sub nsw X, Y` is not the same as `add nsw X, -Y`
and `sub nuw X, Y` is not the same as `add nuw X, -Y`. This patch
fixes the issue, and adds two test cases demonstrating the bug.

Differential Revision: http://reviews.llvm.org/D7081

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226839 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226597:

------------------------------------------------------------------------
r226597 | thomas.stellard | 2015-01-20 14:33:04 -0500 (Tue, 20 Jan 2015) | 5 lines

R600/SI: Add subtarget feature to enable VGPR spilling for all shader types

This is disabled by default, but can be enabled with the subtarget
feature: 'vgpr-spilling'

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226728 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226596:

------------------------------------------------------------------------
r226596 | thomas.stellard | 2015-01-20 14:33:02 -0500 (Tue, 20 Jan 2015) | 2 lines

R600/SI: Fix simple-loop.ll test

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226727 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226591:

------------------------------------------------------------------------
r226591 | thomas.stellard | 2015-01-20 14:24:31 -0500 (Tue, 20 Jan 2015) | 2 lines

R600/SI: Remove stray debugging code from r226586

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226726 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226586:

------------------------------------------------------------------------
r226586 | thomas.stellard | 2015-01-20 12:49:47 -0500 (Tue, 20 Jan 2015) | 6 lines

R600/SI: Use external symbols for scratch buffer

We were passing the scratch buffer address to the shaders via user sgprs,
but now we use external symbols and have the driver patch the shader
using reloc information.

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226725 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226585:

------------------------------------------------------------------------
r226585 | thomas.stellard | 2015-01-20 12:49:45 -0500 (Tue, 20 Jan 2015) | 5 lines

R600/SI: Add kill flag when copying scratch offset to a register

This allows us to re-use the same register for the scratch offset
when accessing large private arrays.

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226724 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226584:

------------------------------------------------------------------------
r226584 | thomas.stellard | 2015-01-20 12:49:43 -0500 (Tue, 20 Jan 2015) | 6 lines

R600/SI: Don't store scratch buffer frame index in MUBUF offset field

We don't have a good way of legalizing this if the frame index offset
is more than the 12-bits, which is size of MUBUF's offset field, so
now we store the frame index in the vaddr field.

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226723 91177308-0d34-0410-b5e6-96231b3b80d8