granicus.if.org Git

Merging r228500:
------------------------------------------------------------------------
r228500 | bsteinbr | 2015-02-07 09:54:36 -0800 (Sat, 07 Feb 2015) | 5 lines

Properly update AA metadata when performing call slot optimization

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7482
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228504 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r228490:
------------------------------------------------------------------------
r228490 | majnemer | 2015-02-07 00:26:40 -0800 (Sat, 07 Feb 2015) | 5 lines

MC: Emit COFF section flags in the "proper" order

COFF section flags are not idempotent:
'rd' will make a read-write section because 'd' implies write
'dr' will make a read-only section because 'r' disables write
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228502 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227628:
------------------------------------------------------------------------
r227628 | lhames | 2015-01-30 14:28:49 -0800 (Fri, 30 Jan 2015) | 6 lines

[PBQP] Fix transposed worst row/column check in handleAdd/RemoveNode in the PBQP
allocator.

Patch by Jonas Paulsson. Thanks Jonas!

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228501 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r228411:
------------------------------------------------------------------------
r228411 | rnk | 2015-02-06 09:59:49 -0800 (Fri, 06 Feb 2015) | 3 lines

Don't dllexport declarations

Fixes PR22488
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228480 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r228444:
------------------------------------------------------------------------
r228444 | eugenis | 2015-02-06 13:47:39 -0800 (Fri, 06 Feb 2015) | 8 lines

[msan] Fix "missing origin" in atomic store.

An atomic store always make the target location fully initialized (in the
current implementation). It should not store origin. Initialized memory can't
have meaningful origin, and, due to origin granularity (4 bytes) there is a
chance that this extra store would overwrite meaningfull origin for an adjacent
location.

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228445 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r228331:
------------------------------------------------------------------------
r228331 | sylvestre | 2015-02-05 10:57:02 -0800 (Thu, 05 Feb 2015) | 30 lines

Fix an incorrect identifier

Summary:
EIEIO is not a correct declaration and breaks the build under Debian HURD.
Instead, E_IEIO is used.

//
http://www.gnu.org/software/libc/manual/html_node/Reserved-Names.html
Some additional classes of identifier names are reserved for future
extensions to the C language or the POSIX.1 environment. While using
these names for your own purposes right now might not cause a problem,
they do raise the possibility of conflict with future versions of the C
or POSIX standards, so you should avoid these names.
...
Names beginning with a capital ?\226?\128?\152E?\226?\128?\153 followed a digit or uppercase letter
may be used for additional error code names. See Error Reporting.//

Reported here:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=776965
And patch wrote by Svante Signell
With this patch, LLVM, Clang & LLDB build under Debian HURD:
https://buildd.debian.org/status/fetch.php?pkg=llvm-toolchain-3.6&arch=hurd-i386&ver=1%3A3.6~%2Brc2-2&stamp=1423040039

Reviewers: hfinkel

Reviewed By: hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7437
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228335 91177308-0d34-0410-b5e6-96231b3b80d8

Merge r227983 (along with r227815 and r227972 to make it apply cleanly)

------------------------------------------------------------------------
r227815 | spatel | 2015-02-02 09:47:30 -0800 (Mon, 02 Feb 2015) | 2 lines

fix typo

------------------------------------------------------------------------

------------------------------------------------------------------------
r227972 | spatel | 2015-02-03 07:37:18 -0800 (Tue, 03 Feb 2015) | 10 lines

Improve test to actually check for a folded load.

This test was checking for lack of a "movaps" (an aligned load)
rather than a "movups" (an unaligned load). It also included
a store which complicated the checking.

Add specific CPU runs to prevent subtarget feature flag overrides
from inhibiting this optimization.
------------------------------------------------------------------------

------------------------------------------------------------------------
r227983 | spatel | 2015-02-03 09:13:04 -0800 (Tue, 03 Feb 2015) | 22 lines

Fix program crashes due to alignment exceptions generated for SSE memop instructions (PR22371).

r224330 introduced a bug by misinterpreting the "FeatureVectorUAMem" bit.
The commit log says that change did not affect anything, but that's not correct.
That change allowed SSE instructions to have unaligned mem operands folded into
math ops, and that's not allowed in the default specification for any SSE variant.

The bug is exposed when compiling for an AVX-capable CPU that had this feature
flag but without enabling AVX codegen. Another mistake in r224330 was not adding
the feature flag to all AVX CPUs; the AMD chips were excluded.

This is part of the fix for PR22371 ( http://llvm.org/bugs/show_bug.cgi?id=22371 ).

This feature bit is SSE-specific, so I've renamed it to "FeatureSSEUnalignedMem".
Changed the existing test case for the feature bit to reflect the new name and
renamed the test file itself to better reflect the feature.
Added runs to fold-vex.ll to check for the failing codegen.

Note that the feature bit is not set by default on any CPU because it may require a
configuration register setting to enable the enhanced unaligned behavior.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228323 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r228303:

------------------------------------------------------------------------
r228303 | thomas.stellard | 2015-02-05 10:32:18 -0500 (Thu, 05 Feb 2015) | 11 lines

R600/SI: Fix bug in TTI loop unrolling preferences

We should be setting UnrollingPreferences::MaxCount to MAX_UINT instead
of UnrollingPreferences::Count.

Count is a 'forced unrolling factor', while MaxCount sets an upper
limit to the unrolling factor.

Setting Count to MAX_UINT was causing the loop in the testcase to be
unrolled 15 times, when it only had a maximum of 4 iterations.

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228320 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r228302:

------------------------------------------------------------------------
r228302 | thomas.stellard | 2015-02-05 10:32:15 -0500 (Thu, 05 Feb 2015) | 34 lines

R600/SI: Fix bug from insertion of llvm.SI.end.cf into loop headers

The llvm.SI.end.cf intrinsic is used to mark the end of if-then blocks,
if-then-else blocks, and loops.  It is responsible for updating the
exec mask to re-enable threads that had been masked during the preceding
control flow block.  For example:

s_mov_b64 exec, 0x3                 ; Initial exec mask
s_mov_b64 s[0:1], exec              ; Saved exec mask
v_cmpx_gt_u32 exec, s[2:3], v0, 0   ; llvm.SI.if
do_stuff()
s_or_b64 exec, exec, s[0:1]         ; llvm.SI.end.cf

The bug fixed by this patch was one where the llvm.SI.end.cf intrinsic
was being inserted into the header of loops.  This would happen when
an if block terminated in a loop header and we would end up with
code like this:

s_mov_b64 exec, 0x3                 ; Initial exec mask
s_mov_b64 s[0:1], exec              ; Saved exec mask
v_cmpx_gt_u32 exec, s[2:3], v0, 0   ; llvm.SI.if
do_stuff()

LOOP:                       ; Start of loop header
s_or_b64 exec, exec, s[0:1] ; llvm.SI.end.cf <-BUG: The exec mask has the
                              same value at the beginning of each loop
      iteration.
do_stuff();
s_cbranch_execnz LOOP

The fix is to create a new basic block before the loop and insert the
llvm.SI.end.cf there.  This way the exec mask is restored before the
start of the loop instead of at the beginning of each iteration.

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228319 91177308-0d34-0410-b5e6-96231b3b80d8

Add microMIPS codegen to the release notes.

It only fails bullet, and smallpt now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228287 91177308-0d34-0410-b5e6-96231b3b80d8

Utils: Resolve cycles under distinct MDNodes

Track unresolved nodes under distinct `MDNode`s during `MapMetadata()`,
and resolve them at the end. Previously, these cycles wouldn't get
resolved.

Conflicts:
lib/Transforms/Utils/ValueMapper.cpp

(This is a reimplementation of r228180 for the 3.6 release branch.)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228199 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r228168:
------------------------------------------------------------------------
r228168 | mkuper | 2015-02-04 10:54:01 -0800 (Wed, 04 Feb 2015) | 3 lines

Fixes a bug in vector load legalization that confused bits and bytes.

Differential Revision: http://reviews.llvm.org/D7400
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228197 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r228049:
------------------------------------------------------------------------
r228049 | hans | 2015-02-03 14:08:20 -0800 (Tue, 03 Feb 2015) | 12 lines

[CMake] add_llvm_library: don't use .imp suffix for import libraries on Windows (PR22334)

This was added in r188351 to fix a naming conflict between the
profile_rt-static and profile_rt-shared who both ended up in
lib/profile_rt.lib.

The change also affected other libraries (like libclang), and
users are reporting that they find it surprising that there's
no longer a libclang.lib. Since the profile_rt naming conflict
doesn't seem to exist any more, I think we can remove this.

Differential Revision: http://reviews.llvm.org/D7391
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228165 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r228129:
------------------------------------------------------------------------
r228129 | rengolin | 2015-02-04 02:11:59 -0800 (Wed, 04 Feb 2015) | 8 lines

Reverting VLD1/VST1 base-updating/post-incrementing combining

This reverts patches 223862, 224198, 224203, and 224754, which were all
related to the vector load/store combining and were reverted/reaplied
a few times due to the same alignment problems we're seeing now.

Further tests, mainly self-hosting Clang, will be needed to reapply this
patch in the future.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228153 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226809:
------------------------------------------------------------------------
r226809 | timurrrr | 2015-01-22 04:24:21 -0800 (Thu, 22 Jan 2015) | 1 line

[ASan/Win] Move the shadow to 0x30000000
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@228008 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227934:
------------------------------------------------------------------------
r227934 | rafael | 2015-02-02 17:53:03 -0800 (Mon, 02 Feb 2015) | 1 line

Propagate a better error message to the C api.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227985 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227903:
------------------------------------------------------------------------
r227903 | rafael | 2015-02-02 16:49:57 -0800 (Mon, 02 Feb 2015) | 1 line

Use a non-fatal diag handler in the C API. FIxes PR22368.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227984 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227809:
------------------------------------------------------------------------
r227809 | jvoung | 2015-02-02 08:56:50 -0800 (Mon, 02 Feb 2015) | 16 lines

Fix ARM peephole optimizeCompare to avoid optimizing unsigned cmp to 0.

Summary:
Previously it only avoided optimizing signed comparisons to 0.
Sometimes the DAGCombiner will optimize the unsigned comparisons
to 0 before it gets to the peephole pass, but sometimes it doesn't.

Fix for PR22373.

Test Plan: test/CodeGen/ARM/sub-cmp-peephole.ll

Reviewers: jfb, manmanren

Subscribers: aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D7274
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227864 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227670:
------------------------------------------------------------------------
r227670 | compnerd | 2015-01-30 20:12:06 -0800 (Fri, 30 Jan 2015) | 5 lines

ARM: make a table more readable (NFC)

This adds some comments and splits the flag calculation on type boundaries to
make the table more readable. Addresses some post-commit review comments to SVN
r227603. NFC.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227856 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227603:
------------------------------------------------------------------------
r227603 | compnerd | 2015-01-30 11:35:18 -0800 (Fri, 30 Jan 2015) | 7 lines

ARM: further correct .fpu directive handling

If the original FPU specification involved a restricted VFP unit (d16), ensure
that we reset the functionality when we encounter a new FPU type. In
particular, if the user specified vfpv3-d16, but switched to a VFPv3 (which has
32 double precision registers), we would fail to reset the D16 feature, and
treat it as being equivalent to vfpv3-d16.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227854 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227618:

------------------------------------------------------------------------
r227618 | thomas.stellard | 2015-01-30 16:51:51 -0500 (Fri, 30 Jan 2015) | 4 lines

R600/SI: Handle SI_SPILL_V96_RESTORE in SIRegisterInfo::eliminateFrameIndex()

This fixes a crash in Unigine Heaven.

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227823 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r227332, which merged in r227300: "[LPM] Rip all of ManagedStatic
and ThreadLocal out of the pretty stack tracing code."

The patch has been having trouble on trunk and doesn't seem ready for 3.6.
Reverting to get it out of the branch before tagging rc2.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227646 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227584:
------------------------------------------------------------------------
r227584 | compnerd | 2015-01-30 09:58:25 -0800 (Fri, 30 Jan 2015) | 10 lines

ARM: correct handling of .fpu directive

The FPU directive permits the user to switch the target FPU, enabling
instructions that would be otherwise unavailable. However, when configuring the
new subtarget features, we would not enable the implied functions for newer
FPUs. This would result in invalid rejection of valid input. Ensure that we
inherit the implied FPU functionality when enabling newer versions of the FPU.
Fortunately, these are mostly hierarchical, unlike the CPUs.

Addresses PR22395.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227637 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227462:

------------------------------------------------------------------------
r227462 | thomas.stellard | 2015-01-29 11:55:28 -0500 (Thu, 29 Jan 2015) | 2 lines

R600/SI: Remove stray debug statements

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227597 91177308-0d34-0410-b5e6-96231b3b80d8

R600/SI: Define a schedule model and enable the generic machine scheduler

The schedule model is not complete yet, and could be improved.

This is a partial merge of r227461. The difference is that it
does not enable the machine scheduler by default.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227596 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227491:
------------------------------------------------------------------------
r227491 | spatel | 2015-01-29 12:51:49 -0800 (Thu, 29 Jan 2015) | 13 lines

[GVN] don't propagate equality comparisons of FP zero (PR22376)

In http://reviews.llvm.org/D6911, we allowed GVN to propagate FP equalities
to allow some simple value range optimizations. But that introduced a bug
when comparing to -0.0 or 0.0: these compare equal even though they are not
bitwise identical.

This patch disallows propagating zero constants in equality comparisons.
Fixes: http://llvm.org/bugs/show_bug.cgi?id=22376
Differential Revision: http://reviews.llvm.org/D7257

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227537 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227290:
------------------------------------------------------------------------
r227290 | dblaikie | 2015-01-27 18:34:53 -0800 (Tue, 27 Jan 2015) | 1 line

PR22356: DebugInfo: Handle the size of a member where the type of that member is a typedef (or other sugar) of a declaration.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227492 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227339:
------------------------------------------------------------------------
r227339 | bsteinbr | 2015-01-28 10:32:31 -0800 (Wed, 28 Jan 2015) | 3 lines

Fix build breakage caused by memory leaks in llvm-c-test

I accidently introduced those in r227319.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227477 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227319:
------------------------------------------------------------------------
r227319 | bsteinbr | 2015-01-28 08:35:59 -0800 (Wed, 28 Jan 2015) | 10 lines

Fix LLVMSetMetadata and LLVMAddNamedMetadataOperand for single value MDNodes

Summary:
MetadataAsValue uses a canonical format that strips the MDNode if it
contains only a single constant value. This triggers an assertion when
trying to cast the value to a MDNode.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7165
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227475 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227085:
------------------------------------------------------------------------
r227085 | joerg | 2015-01-26 03:41:48 -0800 (Mon, 26 Jan 2015) | 13 lines

The canonical CPU variant for ARM according to config.guess uses a
suffix it seems:

# ./config.guess
earmv7hfeb-unknown-netbsd7.99.4

Extend the triple parsing to support this. Avoid running the ARM parser
multiple times because StringSwitch is not lazy.

Reviewers: Renato Golin, Tim Northover

Differential Revision: http://reviews.llvm.org/D7166

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227394 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226970:

------------------------------------------------------------------------
r226970 | thomas.stellard | 2015-01-23 18:59:08 -0500 (Fri, 23 Jan 2015) | 2 lines

R600/SI: Emit .hsa.version section for amdhsa OS

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227365 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226945:

------------------------------------------------------------------------
r226945 | thomas.stellard | 2015-01-23 17:05:45 -0500 (Fri, 23 Jan 2015) | 9 lines

R600/SI: Move i64 -> v2i32 load promotion into AMDGPUDAGToDAGISel::Select()

We used to do this promotion during DAG legalization, but this
caused an infinite loop in ExpandUnalignedLoad() because it assumed
that i64 loads were legal if i64 was a legal type.

It also seems better to report i64 loads as legal, since they actually
are and we were just promoting them to simplify our tablegen files.

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227364 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227250:
------------------------------------------------------------------------
r227250 | ab | 2015-01-27 13:52:16 -0800 (Tue, 27 Jan 2015) | 31 lines

[SimplifyLibCalls] Don't confuse strcpy_chk for stpcpy_chk.

This was introduced in a faulty refactoring (r225640, mea culpa):
the tests weren't testing the return values, so, for both
__strcpy_chk and __stpcpy_chk, we would return the end of the
buffer (matching stpcpy) instead of the beginning (for strcpy).

The root cause was the prefix "__" being ignored when comparing,
which made us always pick LibFunc::stpcpy_chk.
Pass the LibFunc::Func directly to avoid this kind of error.
Also, make the testcases as explicit as possible to prevent this.

The now-useful testcases expose another, entangled, stpcpy problem,
with the further simplification.  This was introduced in a
refactoring (r225640) to match the original behavior.

However, this leads to problems when successive simplifications
generate several similar instructions, none of which are removed
by the custom replaceAllUsesWith.

For instance, InstCombine (the main user) doesn't erase the
instruction in its custom RAUW.  When trying to simplify say
__stpcpy_chk:
- first, an stpcpy is created (fortified simplifier),
- second, a memcpy is created (normal simplifier), but the
  stpcpy call isn't removed.
- third, InstCombine later revisits the instructions,
  and simplifies the first stpcpy to a memcpy.  We now have
  two memcpys.

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227346 91177308-0d34-0410-b5e6-96231b3b80d8

Backport r226026 and r226031 to 3.6.

This fixes pr22351.

Original messages:

r226026:
Fix handling of extern_weak. This was broken by r225983

r226031:
Fix linking of shared libraries.

In shared libraries the plugin can see non-weak declarations that are still
undefined.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227344 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227300:
------------------------------------------------------------------------
r227300 | chandlerc | 2015-01-28 01:52:14 -0800 (Wed, 28 Jan 2015) | 34 lines

[LPM] Rip all of ManagedStatic and ThreadLocal out of the pretty stack
tracing code.

Managed static was just insane overhead for this. We took memory fences
and external function calls in every path that pushed a pretty stack
frame. This includes a multitude of layers setting up and tearing down
passes, the parser in Clang, everywhere. For the regression test suite
or low-overhead JITs, this was contributing to really significant
overhead.

Even the LLVM ThreadLocal is really overkill here because it uses
pthread_{set,get}_specific logic, and has careful code to both allocate
and delete the thread local data. We don't actually want any of that,
and this code in particular has problems coping with deallocation. What
we want is a single TLS pointer that is valid to use during global
construction and during global destruction, any time we want. That is
exactly what every host compiler and OS we use has implemented for
a long time, and what was standardized in C++11. Even though not all of
our host compilers support the thread_local keyword, we can directly use
the platform-specific keywords to get the minimal functionality needed.
Provided this limited trial survives the build bots, I will move this to
Compiler.h so it is more widely available as a light weight if limited
alternative to the ThreadLocal class. Many thanks to David Majnemer for
helping me think through the implications across platforms and craft the
MSVC-compatible syntax.

The end result is *substantially* faster. When running llc in a tight
loop over a small IR file targeting the aarch64 backend, this improves
its performance by over 10% for me. It also seems likely to fix the
remaining regressions seen by JIT users with threading enabled.

This may actually have more impact on real-world compile times due to
the use of the pretty stack tracing utility throughout the rest of Clang
or LLVM, but I've not collected any detailed measurements.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227332 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227299:
------------------------------------------------------------------------
r227299 | chandlerc | 2015-01-28 01:47:21 -0800 (Wed, 28 Jan 2015) | 48 lines

[LPM] A targeted but somewhat horrible fix to the legacy pass manager's
querying of the pass registry.

The pass manager relies on the static registry of PassInfo objects to
perform all manner of its functionality. I don't understand why it does
much of this. My very vague understanding is that this registry is
touched both during static initialization *and* while each pass is being
constructed. As a consequence it is hard to make accessing it not
require a acquiring some lock. This lock ends up in the hot path of
setting up, tearing down, and invaliditing analyses in the legacy pass
manager.

On most systems you can observe this as a non-trivial % of the time
spent in 'ninja check-llvm'. However, I haven't really seen it be more
than 1% in extreme cases of compiling more real-world software,
including LTO.

Unfortunately, some of the GPU JITs are seeing this taking essentially
all of their time because they have very small IR running through
a small pass pipeline very many times (at least, this is the vague
understanding I have of it).

This patch tries to minimize the cost of looking up PassInfo objects by
leveraging the fact that the objects themselves are immutable and they
are allocated separately on the heap and so don't have their address
change. It also requires a change I made the last time I tried to debug
this problem which removed the ability to de-register a pass from the
registry. This patch creates a single access path to these objects
inside the PMTopLevelManager which memoizes the result of querying the
registry. This is somewhat gross as I don't really know if
PMTopLevelManager is the *right* place to put it, and I dislike using
a mutable member to memoize things, but it seems to work.

For long-lived pass managers this should completely eliminate
the cost of acquiring locks to look into the pass registry once the
memoized cache is warm. For 'ninja check' I measured about 1.5%
reduction in CPU time and in total time on a machine with 32 hardware
threads. For normal compilation, I don't know how much this will help,
sadly. We will still pay the cost while we populate the memoized cache.
I don't think it will hurt though, and for LTO or compiles with many
small functions it should still be a win. However, for tight loops
around a pass manager with many passes and small modules, this will help
tremendously. On the AArch64 backend I saw nearly 50% reductions in time
to complete 2000 cycles of spinning up and tearing down the pipeline.
Measurements from Owen of an actual long-lived pass manager show more
along the lines of 10% improvements.

Differential Revision: http://reviews.llvm.org/D7213
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227331 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227294:
------------------------------------------------------------------------
r227294 | chandlerc | 2015-01-27 20:57:56 -0800 (Tue, 27 Jan 2015) | 23 lines

[LPM] Stop using the string based preservation API. It is an
abomination.

For starters, this API is incredibly slow. In order to lookup the name
of a pass it must take a memory fence to acquire a pointer to the
managed static pass registry, and then potentially acquire locks while
it consults this registry for information about what passes exist by
that name. This stops the world of LLVMs in your process no matter
how little they cared about the result.

To make this more joyful, you'll note that we are preserving many passes
which *do not exist* any more, or are not even analyses which one might
wish to have be preserved. This means we do all the work only to say
"nope" with no error to the user.

String-based APIs are a *bad idea*. String-based APIs that cannot
produce any meaningful error are an even worse idea. =/

I have a patch that simply removes this API completely, but I'm hesitant
to commit it as I don't really want to perniciously break out-of-tree
users of the old pass manager. I'd rather they just have to migrate to
the new one at some point. If others disagree and would like me to kill
it with fire, just say the word. =]
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227328 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227261:
------------------------------------------------------------------------
r227261 | compnerd | 2015-01-27 14:57:39 -0800 (Tue, 27 Jan 2015) | 6 lines

SymbolRewriter: allow rewriting with comdats

COMDATs must be identically named to the symbol. When support for COMDATs was
introduced, the symbol rewriter was not updated, resulting in rewriting failing
for symbols which were placed into COMDATs. This corrects the behaviour and
adds test cases for this.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227324 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227260:
------------------------------------------------------------------------
r227260 | compnerd | 2015-01-27 14:57:35 -0800 (Tue, 27 Jan 2015) | 4 lines

SymbolRewriter: prevent unnecessary rewrite

The rewrite for the pattern based rewrite is unnecessary if the existing name
matches the pattern.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227323 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r227005:
------------------------------------------------------------------------
r227005 | dsanders | 2015-01-24 14:35:11 +0000 (Sat, 24 Jan 2015) | 38 lines

[mips] Fix 'jumpy' debug line info around calls.

Summary:
At the moment, address calculation is taking the debug line info from the
address node (e.g. TargetGlobalAddress). When a function is called multiple
times, this results in output of the form:

  .loc $first_call_location
  .. address calculation ..
  .. function call ..
  .. address calculation ..
  .loc $second_call_location
  .. function call ..
  .loc $first_call_location
  .. address calculation ..
  .loc $third_call_location
  .. function call ..

This patch makes address calculations for function calls take the debug line
info for the call node and results in output of the form:
  .loc $first_call_location
  .. address calculation ..
  .. function call ..
  .loc $second_call_location
  .. address calculation ..
  .. function call ..
  .loc $third_call_location
  .. address calculation ..
  .. function call ..

All other address calculations continue to use the address node.

Test Plan: Fixes test/DebugInfo/multiline.ll on a mips host.

Subscribers: dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D7050

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227193 91177308-0d34-0410-b5e6-96231b3b80d8

pocl and TCE work with LLVM 3.6 now. Added them to the ReleaseNotes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227188 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226664:
------------------------------------------------------------------------
r226664 | tnorthover | 2015-01-21 07:43:31 -0800 (Wed, 21 Jan 2015) | 7 lines

AArch64: add backend option to reserve x18 (platform register)

AAPCS64 says that it's up to the platform to specify whether x18 is
reserved, and a first step on that way is to add a flag controlling
it.

From: Andrew Turner <andrew@fubar.geek.nz>
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@227150 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226711:
------------------------------------------------------------------------
r226711 | jroelofs | 2015-01-21 14:39:43 -0800 (Wed, 21 Jan 2015) | 8 lines

Fix load-store optimizer on thumbv4t

Thumbv4t does not have lo->lo copies other than MOVS,
and that can't be predicated. So emit MOVS when needed
and bail if there's a predicate.

http://reviews.llvm.org/D6592

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226918 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226708:
------------------------------------------------------------------------
r226708 | majnemer | 2015-01-21 14:32:04 -0800 (Wed, 21 Jan 2015) | 4 lines

InstCombine: Don't strip bitcasts off of callsites marked 'thunk'

The return type of a thunk is meaningless, we just want the arguments
and return value to be forwarded.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226854 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226755:
------------------------------------------------------------------------
r226755 | sanjoy | 2015-01-21 16:48:47 -0800 (Wed, 21 Jan 2015) | 11 lines

Make ScalarEvolution less aggressive with respect to no-wrap flags.

ScalarEvolution currently lowers a subtraction recurrence to an add
recurrence with the same no-wrap flags as the subtraction. This is
incorrect because `sub nsw X, Y` is not the same as `add nsw X, -Y`
and `sub nuw X, Y` is not the same as `add nuw X, -Y`. This patch
fixes the issue, and adds two test cases demonstrating the bug.

Differential Revision: http://reviews.llvm.org/D7081

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226839 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226597:

------------------------------------------------------------------------
r226597 | thomas.stellard | 2015-01-20 14:33:04 -0500 (Tue, 20 Jan 2015) | 5 lines

R600/SI: Add subtarget feature to enable VGPR spilling for all shader types

This is disabled by default, but can be enabled with the subtarget
feature: 'vgpr-spilling'

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226728 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226596:

------------------------------------------------------------------------
r226596 | thomas.stellard | 2015-01-20 14:33:02 -0500 (Tue, 20 Jan 2015) | 2 lines

R600/SI: Fix simple-loop.ll test

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226727 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226591:

------------------------------------------------------------------------
r226591 | thomas.stellard | 2015-01-20 14:24:31 -0500 (Tue, 20 Jan 2015) | 2 lines

R600/SI: Remove stray debugging code from r226586

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226726 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226586:

------------------------------------------------------------------------
r226586 | thomas.stellard | 2015-01-20 12:49:47 -0500 (Tue, 20 Jan 2015) | 6 lines

R600/SI: Use external symbols for scratch buffer

We were passing the scratch buffer address to the shaders via user sgprs,
but now we use external symbols and have the driver patch the shader
using reloc information.

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226725 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226585:

------------------------------------------------------------------------
r226585 | thomas.stellard | 2015-01-20 12:49:45 -0500 (Tue, 20 Jan 2015) | 5 lines

R600/SI: Add kill flag when copying scratch offset to a register

This allows us to re-use the same register for the scratch offset
when accessing large private arrays.

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226724 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226584:

------------------------------------------------------------------------
r226584 | thomas.stellard | 2015-01-20 12:49:43 -0500 (Tue, 20 Jan 2015) | 6 lines

R600/SI: Don't store scratch buffer frame index in MUBUF offset field

We don't have a good way of legalizing this if the frame index offset
is more than the 12-bits, which is size of MUBUF's offset field, so
now we store the frame index in the vaddr field.

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226723 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226583:

------------------------------------------------------------------------
r226583 | thomas.stellard | 2015-01-20 12:49:41 -0500 (Tue, 20 Jan 2015) | 5 lines

R600/SI: Update SIInstrInfo:verifyInstruction() after r225662

Now that we have our own custom register operand types, we need
to handle them in the verifiier.

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226722 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226226:

------------------------------------------------------------------------
r226226 | Matthew.Arsenault | 2015-01-15 18:17:03 -0500 (Thu, 15 Jan 2015) | 5 lines

R600/SI: Fix trailing comma with modifiers

Instructions with 1 operand can still use source modifiers,
so make sure we don't print an extra comma afterwards.

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226721 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226191:

------------------------------------------------------------------------
r226191 | marek.olsak | 2015-01-15 13:43:06 -0500 (Thu, 15 Jan 2015) | 11 lines

R600/SI: Unify VOP2 instructions which are VOP3-only on VI

This removes some duplicated classes and definitions.

These instructions are defined:
  _e32 // pseudo
  _e32_si
  _e64 // pseudo
  _e64_si
  _e64_vi

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226720 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226190:

------------------------------------------------------------------------
r226190 | marek.olsak | 2015-01-15 13:43:01 -0500 (Thu, 15 Jan 2015) | 2 lines

R600/SI: Use 64-bit encoding by default for opcodes that are VOP3-only on VI

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226719 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226189:

------------------------------------------------------------------------
r226189 | marek.olsak | 2015-01-15 13:42:55 -0500 (Thu, 15 Jan 2015) | 6 lines

R600/SI: Add V_READLANE_B32 and V_WRITELANE_B32 for VI

These are VOP3-only on VI.

The new multiclass doesn't define VOP3 versions of VOP2 instructions.

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226718 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226188:

------------------------------------------------------------------------
r226188 | marek.olsak | 2015-01-15 13:42:51 -0500 (Thu, 15 Jan 2015) | 7 lines

R600/SI: Don't shrink instructions whose e32 encoding doesn't exist

v2: modify hasVALU32BitEncoding instead
v3: - add pseudoToMCOpcode helper to AMDGPUInstInfo, which is used by both
hasVALU32BitEncoding and AMDGPUMCInstLower::lower
- report an error if a pseudo can't be lowered

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226717 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226187:

------------------------------------------------------------------------
r226187 | marek.olsak | 2015-01-15 13:42:44 -0500 (Thu, 15 Jan 2015) | 2 lines

R600/SI: Add common class VOPAnyCommon

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226715 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226186:

------------------------------------------------------------------------
r226186 | marek.olsak | 2015-01-15 13:42:40 -0500 (Thu, 15 Jan 2015) | 2 lines

R600/SI: Don't select SI-only VOP3 opcodes on VI

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226714 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226473:
------------------------------------------------------------------------
r226473 | garious | 2015-01-19 09:40:05 -0800 (Mon, 19 Jan 2015) | 8 lines

[AArch64] Implement GHC calling convention

Original patch by Luke Iannini. Minor improvements and test added by
Erik de Castro Lopo.

Differential Revision: http://reviews.llvm.org/D6877

From: Erik de Castro Lopo <erikd@mega-nerd.com>
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226604 91177308-0d34-0410-b5e6-96231b3b80d8

Add a few items to the 3.6 release notes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226582 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226171:
------------------------------------------------------------------------
r226171 | dsanders | 2015-01-15 15:41:03 +0000 (Thu, 15 Jan 2015) | 11 lines

[mips] Fix a typo in the compare patterns for MIPS32r6/MIPS64r6.

Summary: The patterns intended for the SETLE node were actually matching the SETLT node.

Reviewers: atanasyan, sstankovic, vmedic

Reviewed By: vmedic

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6997
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226379 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226170:
------------------------------------------------------------------------
r226170 | vkalintiris | 2015-01-15 15:36:04 +0000 (Thu, 15 Jan 2015) | 7 lines

Fix the C-API MCJIT test for 32-bit big endian machines.

Avoid using unions for storing the return value from
LLVMGetGlobalValueAddress() and LLVMGetFunctionAddress() and accessing it as
a pointer through another pointer member. This causes problems on 32-bit big
endian machines since the pointer gets the higher part of the return value of
the aforementioned functions.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226378 91177308-0d34-0410-b5e6-96231b3b80d8

Revert r225957 from the 3.6 branch.

It should get more testing on trunk before going into a release.

Original message:

Use the integrated assembler by default on SPARC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226355 91177308-0d34-0410-b5e6-96231b3b80d8

Port r226022 to the 3.6 branch.

r225644 is a revert of r225644. A fixed version for trunk is being reviewed, but there is
no need to rush this into 3.6.

Original message:

Revert "Add r224985 back with two fixes."

This reverts commit r225644 while I debug a regression.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226354 91177308-0d34-0410-b5e6-96231b3b80d8

Fix nit from review of metadata release notes.

Somehow I missed this review comment until now...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226213 91177308-0d34-0410-b5e6-96231b3b80d8

Write 3.6 metadata release notes

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226212 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226182:
------------------------------------------------------------------------
r226182 | joerg | 2015-01-15 09:59:02 -0800 (Thu, 15 Jan 2015) | 2 lines

Support @PLT loads on 32bit x86.

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226203 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226075:
------------------------------------------------------------------------
r226075 | sanjoy | 2015-01-14 17:46:09 -0800 (Wed, 14 Jan 2015) | 10 lines

Fix PR22222

The bug was introduced in r225282. r225282 assumed that sub X, Y is
the same as add X, -Y. This is not correct if we are going to upgrade
the sub to sub nuw. This change fixes the issue by making the
optimization ignore sub instructions.

Differential Revision: http://reviews.llvm.org/D6979

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226193 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226058:
------------------------------------------------------------------------
r226058 | dexonsmith | 2015-01-14 15:11:51 -0800 (Wed, 14 Jan 2015) | 1 line

IR: Fix comment spelling, NFC
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226095 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226048:
------------------------------------------------------------------------
r226048 | dexonsmith | 2015-01-14 14:27:36 -0800 (Wed, 14 Jan 2015) | 17 lines

IR: Move MDLocation into place

This commit moves `MDLocation`, finishing off PR21433.  There's an
accompanying clang commit for frontend testcases.  I'll attach the
testcase upgrade script I used to PR21433 to help out-of-tree
frontends/backends.

This changes the schema for `DebugLoc` and `DILocation` from:

    !{i32 3, i32 7, !7, !8}

to:

    !MDLocation(line: 3, column: 7, scope: !7, inlinedAt: !8)

Note that empty fields (line/column: 0 and inlinedAt: null) don't get
printed by the assembly writer.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226094 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226046:
------------------------------------------------------------------------
r226046 | dexonsmith | 2015-01-14 14:14:26 -0800 (Wed, 14 Jan 2015) | 3 lines

IR: Always print MDLocation line

Print `MDLocation`'s `line` field even when it's 0.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226092 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226044:
------------------------------------------------------------------------
r226044 | dexonsmith | 2015-01-14 13:58:17 -0800 (Wed, 14 Jan 2015) | 15 lines

IR: Drop metadata references more aggressively during teardown

Sometimes teardown happens before the debug info graph is complete
(e.g., when clang throws an error). In that case, `MDNode`s will still
have RAUW, so deleting constants that the `MDNode`s point at will be
relatively expensive -- it'll cause re-uniquing all up the chain (what
I've been referring to as "teardown madness").

So, drop references *before* deleting constants. We need to drop a few
more references now: the metadata side of the metadata/value bridges
needs to be dropped off the cliff along with the rest of it (previously,
the bridges were cleaned before we did anything with the `MDNode`s).

There's no real functionality change here -- state before and after
`LLVMContextImpl::~LLVMContextImpl()` is unchanged -- so no testcase.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226091 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226029:
------------------------------------------------------------------------
r226029 | dexonsmith | 2015-01-14 11:56:10 -0800 (Wed, 14 Jan 2015) | 7 lines

IR: Fix a use-after-free in RAUW

Happened pretty commonly during `LLVMContext` teardown when `clang -g`
hit an error. This fixes the use-after-free. Next I'll clean up
teardown so that it's not RAUW'ing when metadata-tracked values are
deleted (only really causes a problem if the graph is mid-construction
when teardown starts, but it's still unnecessary work).
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226090 91177308-0d34-0410-b5e6-96231b3b80d8

Merging r226023:
------------------------------------------------------------------------
r226023 | majnemer | 2015-01-14 11:26:56 -0800 (Wed, 14 Jan 2015) | 3 lines

InstCombine: Don't take A-B<0 into A<B if A-B has other uses

This fixes PR22226.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226059 91177308-0d34-0410-b5e6-96231b3b80d8

Change version from 3.6.0svn to 3.6.0

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@226020 91177308-0d34-0410-b5e6-96231b3b80d8

Creating release_36 branch off revision 225991

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_36@225992 91177308-0d34-0410-b5e6-96231b3b80d8

fix typos

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225991 91177308-0d34-0410-b5e6-96231b3b80d8

R600/SI: Use IMPLICIT_DEF and KILL when failing to spill VGPRs

This helps us avoid 'invalid register class for operand' verifier
errors.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225989 91177308-0d34-0410-b5e6-96231b3b80d8

R600/SI: Spill VGPRs to scratch space for compute shaders

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225988 91177308-0d34-0410-b5e6-96231b3b80d8

Check that the TLI callback enableAggressiveFMAFusion has the desired effect on FMA folding.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225987 91177308-0d34-0410-b5e6-96231b3b80d8

Override the TLI callback enableAggressiveFMAFusion and return true. Indeed, fmul, fmadd and fadd nodes cost the same number of cycles, so we can enable more combining heuristics to produce more fmadd nodes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225984 91177308-0d34-0410-b5e6-96231b3b80d8

Handle a symbol being undefined.

This can happen if:
* It is present in a comdat in one file.
* It is not present in the comdat of the file that is kept.
* Is is not used.

This should fix the LTO boostrap.

Thanks to Takumi NAKAMURA for setting up the bot!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225983 91177308-0d34-0410-b5e6-96231b3b80d8

Add disassembler tests for mips32r2 platform. There are no functional changes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225980 91177308-0d34-0410-b5e6-96231b3b80d8

reapply: SLPVectorizer: Cache results from memory alias checking.

This speeds up the dependency calculations for blocks with many load/store/call instructions.
Beside the improved runtime, there is no functional change.

Compared to the original commit, this re-applied commit contains a bug fix which ensures that there are
no incorrect collisions in the alias cache.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225977 91177308-0d34-0410-b5e6-96231b3b80d8

[cleanup] Re-sort all the #include lines in LLVM using
utils/sort_includes.py.

I clearly haven't done this in a while, so more changed than usual. This
even uncovered a missing include from the InstrProf library that I've
added. No functionality changed here, just mechanical cleanup of the
include order.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225974 91177308-0d34-0410-b5e6-96231b3b80d8

Correct POP handling for v7m

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225972 91177308-0d34-0410-b5e6-96231b3b80d8

[dom] Make the DominatorTreeBase not a dynamic class!

Now that the passes are wrappers around this, we no longer need
a vtable, virtual destructor, and other associated mess. This is
particularly nice to me as this is a class template. =]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225970 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Port domtree to the new pass manager (at last).

This adds the domtree analysis to the new pass manager. The analysis
returns the same DominatorTree result entity used by the old pass
manager and essentially all of the code is shared. We just have
different boilerplate for running and printing the analysis.

I've converted one test to run in both modes just to make sure this is
exercised while both are live in the tree.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225969 91177308-0d34-0410-b5e6-96231b3b80d8

[mips] Refine octeon instructions seq/seqi/sne/snei

This commit refines the pattern for the octeon seq/seqi/sne/snei instructions.
The target register is set to 0 or 1 according to the result of the comparison.
In C, this is something like

rd = (unsigned long)(rs == rt)

This commit adds a zext to bring the result to i64. With this change the
instruction is selected for this type of code. (gcc produces the same code for
the above C code.)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225968 91177308-0d34-0410-b5e6-96231b3b80d8

Add disassembler tests for mips32r2 platform. There are no functional changes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225967 91177308-0d34-0410-b5e6-96231b3b80d8

[PM] Make DominatorTrees (corectly) movable so that we can move them
into the new pass manager's analysis cache which stores results
by-value.

Technically speaking, the dom trees were originally not movable but
copyable! This, unsurprisingly, didn't work at all -- the copy was
shallow and just resulted in rampant memory corruption. This change
explicitly forbids copying (as it would need to be a deep copy) and
makes them explicitly movable with the unsurprising boiler plate to
member-wise move them because we can't rely on MSVC to generate this
code for us. =/

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225966 91177308-0d34-0410-b5e6-96231b3b80d8

Use the integrated assembler by default on SPARC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225957 91177308-0d34-0410-b5e6-96231b3b80d8

Use the operand vector instead so inline assembly can be validated too

The buildbots got upset after r225941, this should hopefully fix things.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225954 91177308-0d34-0410-b5e6-96231b3b80d8

SelectionDAG: add a -filter-view-dags option to llc

This option takes the name of the basic block you want to visualize
with -view-*-dags

Differential Revision: http://reviews.llvm.org/D6948

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225953 91177308-0d34-0410-b5e6-96231b3b80d8

DAG Combiner: Fold SelectCC When Cond is UNDEF

In case folding a node end up with a NaN as operand for the select,
the folding of the condition of the selectcc node returns "UNDEF".

Differential Revision: http://reviews.llvm.org/D6889

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225952 91177308-0d34-0410-b5e6-96231b3b80d8

Add assertions for out of bound index in ComputeLinearIndex

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225951 91177308-0d34-0410-b5e6-96231b3b80d8

X86: only access operands if they are present

If there is no associated immediate (MS style inline asm), do not try to access
the operand, assume that it is valid. This should fix the buildbots after SVN
r225941.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225950 91177308-0d34-0410-b5e6-96231b3b80d8

Fold a loop for array processing in ComputeLinearIndex

When processing an array, every Elt has the same layout, it is
useless to recursively call each ComputeLinearIndex on each element.
Just do it once and multiply by the number of elements.

Differential Revision: http://reviews.llvm.org/D6832

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225949 91177308-0d34-0410-b5e6-96231b3b80d8

Revert "Insert random noops to increase security against ROP attacks (llvm)"

This reverts commit:
http://reviews.llvm.org/D3392

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225948 91177308-0d34-0410-b5e6-96231b3b80d8