If the original FPU specification involved a restricted VFP unit (d16), ensure
that we reset the functionality when we encounter a new FPU type. In
particular, if the user specified vfpv3-d16 but then switched to vfpv3 (which
has 32 double-precision registers), we would fail to reset the D16 feature and
would continue to treat the unit as equivalent to vfpv3-d16.
------------------------------------------------------------------------
The FPU directive permits the user to switch the target FPU, enabling
instructions that would be otherwise unavailable. However, when configuring the
new subtarget features, we would not enable the implied functionality for newer
FPUs. This would result in invalid rejection of valid input. Ensure that we
inherit the implied FPU functionality when enabling newer versions of the FPU.
Fortunately, these are mostly hierarchical, unlike the CPUs.
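As a rough illustration of that hierarchy, here is a minimal C++ sketch under
the assumption of a purely ordered FPU progression (the enum and function are
hypothetical, not ARM's actual feature tables):

    #include <set>

    enum class FPU { VFPv2, VFPv3, VFPv4, FPARMv8 };

    // Enabling a newer FPU also enables everything the older FPUs implied,
    // because the versions form a (mostly) linear hierarchy.
    void enableImpliedFPUs(FPU Requested, std::set<FPU> &Enabled) {
      for (int F = int(FPU::VFPv2); F <= int(Requested); ++F)
        Enabled.insert(FPU(F));
    }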
[GVN] don't propagate equality comparisons of FP zero (PR22376)
In http://reviews.llvm.org/D6911, we allowed GVN to propagate FP equalities
to enable some simple value range optimizations. But that introduced a bug
when comparing to -0.0 or 0.0: these compare equal even though they are not
bitwise identical.
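A minimal C++ illustration of the hazard (this is not the GVN code, just the
IEEE semantics at issue):

    #include <cstdio>

    int main() {
      double x = -0.0;
      if (x == 0.0) {   // true: -0.0 and +0.0 compare equal
        // If GVN substituted the constant 0.0 for x in this block, the
        // division below would change from -inf to +inf.
        std::printf("%f\n", 1.0 / x);
      }
    }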
This patch disallows propagating zero constants in equality comparisons.
Fixes: http://llvm.org/bugs/show_bug.cgi?id=22376
Differential Revision: http://reviews.llvm.org/D7257
Hans Wennborg [Thu, 29 Jan 2015 20:57:17 +0000 (20:57 +0000)]
Merging r227290:
------------------------------------------------------------------------
r227290 | dblaikie | 2015-01-27 18:34:53 -0800 (Tue, 27 Jan 2015) | 1 line
PR22356: DebugInfo: Handle the size of a member where the type of that member is a typedef (or other sugar) of a declaration.
------------------------------------------------------------------------
Fix LLVMSetMetadata and LLVMAddNamedMetadataOperand for single value MDNodes
Summary:
MetadataAsValue uses a canonical format that strips the MDNode if it
contains only a single constant value. This triggers an assertion when
trying to cast the value to an MDNode.
R600/SI: Move i64 -> v2i32 load promotion into AMDGPUDAGToDAGISel::Select()
We used to do this promotion during DAG legalization, but this
caused an infinite loop in ExpandUnalignedLoad() because it assumed
that i64 loads were legal if i64 was a legal type.
It also seems better to report i64 loads as legal, since they actually
are and we were just promoting them to simplify our tablegen files.
Hans Wennborg [Wed, 28 Jan 2015 19:15:34 +0000 (19:15 +0000)]
Merging r227250:
------------------------------------------------------------------------
r227250 | ab | 2015-01-27 13:52:16 -0800 (Tue, 27 Jan 2015) | 31 lines
[SimplifyLibCalls] Don't confuse strcpy_chk for stpcpy_chk.
This was introduced in a faulty refactoring (r225640, mea culpa):
the tests weren't testing the return values, so, for both
__strcpy_chk and __stpcpy_chk, we would return the end of the
buffer (matching stpcpy) instead of the beginning (for strcpy).
The root cause was the prefix "__" being ignored when comparing,
which made us always pick LibFunc::stpcpy_chk.
Pass the LibFunc::Func directly to avoid this kind of error.
Also, make the testcases as explicit as possible to prevent this.
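For reference, the return-value contract the simplifier has to preserve
(plain C library semantics; stpcpy is POSIX):

    #include <cassert>
    #include <cstring>

    int main() {
      char buf[16];
      // strcpy returns the *beginning* of the destination buffer...
      assert(std::strcpy(buf, "hi") == buf);
      // ...while stpcpy returns a pointer to the terminating NUL (the *end*).
      assert(stpcpy(buf, "hi") == buf + 2);
    }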
The now-useful testcases expose another, entangled stpcpy problem
with the further simplification. That simplification was introduced in
the same refactoring (r225640) to match the original behavior.
However, it leads to problems when successive simplifications
generate several similar instructions, none of which are removed
by the custom replaceAllUsesWith.
For instance, InstCombine (the main user) doesn't erase the
instruction in its custom RAUW. When trying to simplify say
__stpcpy_chk:
- first, an stpcpy is created (fortified simplifier);
- second, a memcpy is created (normal simplifier), but the
stpcpy call isn't removed;
- third, InstCombine later revisits the instructions,
and simplifies the first stpcpy to a memcpy. We now have
two memcpys.
[LPM] Rip all of ManagedStatic and ThreadLocal out of the pretty stack
tracing code.
Managed static was just insane overhead for this. We took memory fences
and external function calls in every path that pushed a pretty stack
frame. This includes a multitude of layers setting up and tearing down
passes, the parser in Clang, everywhere. For the regression test suite
or low-overhead JITs, this was contributing to really significant
overhead.
Even the LLVM ThreadLocal is really overkill here because it uses
pthread_{set,get}_specific logic, and has careful code to both allocate
and delete the thread local data. We don't actually want any of that,
and this code in particular has problems coping with deallocation. What
we want is a single TLS pointer that is valid to use during global
construction and during global destruction, any time we want. That is
exactly what every host compiler and OS we use has implemented for
a long time, and what was standardized in C++11. Even though not all of
our host compilers support the thread_local keyword, we can directly use
the platform-specific keywords to get the minimal functionality needed.
Provided this limited trial survives the build bots, I will move this to
Compiler.h so it is more widely available as a light weight if limited
alternative to the ThreadLocal class. Many thanks to David Majnemer for
helping me think through the implications across platforms and craft the
MSVC-compatible syntax.
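A sketch of the platform-keyword selection described above (the macro and
variable names are hypothetical, not the actual LLVM definitions):

    // Pick a native TLS keyword: __declspec(thread) on MSVC, __thread on
    // GCC/Clang. Both are usable during global construction and
    // destruction and avoid pthread_{set,get}specific entirely.
    #if defined(_MSC_VER)
    #define SKETCH_THREAD_LOCAL __declspec(thread)
    #else
    #define SKETCH_THREAD_LOCAL __thread
    #endif

    // The entire per-thread state: one raw TLS pointer, no fences, no locks.
    static SKETCH_THREAD_LOCAL const void *PrettyStackHead = nullptr;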
The end result is *substantially* faster. When running llc in a tight
loop over a small IR file targeting the aarch64 backend, this improves
its performance by over 10% for me. It also seems likely to fix the
remaining regressions seen by JIT users with threading enabled.
This may actually have more impact on real-world compile times due to
the use of the pretty stack tracing utility throughout the rest of Clang
or LLVM, but I've not collected any detailed measurements.
------------------------------------------------------------------------
[LPM] A targeted but somewhat horrible fix to the legacy pass manager's
querying of the pass registry.
The pass manager relies on the static registry of PassInfo objects to
perform all manner of its functionality. I don't understand why it does
much of this. My very vague understanding is that this registry is
touched both during static initialization *and* while each pass is being
constructed. As a consequence it is hard to make accessing it not
require acquiring some lock. This lock ends up in the hot path of
setting up, tearing down, and invalidating analyses in the legacy pass
manager.
On most systems you can observe this as a non-trivial % of the time
spent in 'ninja check-llvm'. However, I haven't really seen it be more
than 1% in extreme cases of compiling more real-world software,
including LTO.
Unfortunately, some of the GPU JITs are seeing this taking essentially
all of their time because they have very small IR running through
a small pass pipeline very many times (at least, this is the vague
understanding I have of it).
This patch tries to minimize the cost of looking up PassInfo objects by
leveraging the fact that the objects themselves are immutable and they
are allocated separately on the heap and so don't have their address
change. It also requires a change I made the last time I tried to debug
this problem, which removed the ability to de-register a pass from the
registry. This patch creates a single access path to these objects
inside the PMTopLevelManager which memoizes the result of querying the
registry. This is somewhat gross as I don't really know if
PMTopLevelManager is the *right* place to put it, and I dislike using
a mutable member to memoize things, but it seems to work.
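A rough sketch of the memoized access path (illustrative names and a
stubbed-out registry, not the actual PMTopLevelManager code):

    #include <map>

    struct PassInfo { const char *PassName; }; // immutable once registered

    // Stand-in for the lock-protected lookup in the global pass registry.
    static const PassInfo *slowRegistryLookup(const void *) {
      static PassInfo Dummy{"dummy"};
      return &Dummy;
    }

    class TopLevelManagerSketch {
      // mutable so const query paths can still warm the cache. Caching raw
      // pointers is safe only because PassInfo objects are heap-allocated,
      // never move, and can no longer be de-registered.
      mutable std::map<const void *, const PassInfo *> Cache;

    public:
      const PassInfo *findPassInfo(const void *PassID) const {
        auto It = Cache.find(PassID);
        if (It != Cache.end())
          return It->second;                   // warm cache: no lock taken
        const PassInfo *PI = slowRegistryLookup(PassID);
        Cache.emplace(PassID, PI);
        return PI;
      }
    };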
For long-lived pass managers this should completely eliminate
the cost of acquiring locks to look into the pass registry once the
memoized cache is warm. For 'ninja check' I measured about 1.5%
reduction in CPU time and in total time on a machine with 32 hardware
threads. For normal compilation, I don't know how much this will help,
sadly. We will still pay the cost while we populate the memoized cache.
I don't think it will hurt though, and for LTO or compiles with many
small functions it should still be a win. However, for tight loops
around a pass manager with many passes and small modules, this will help
tremendously. On the AArch64 backend I saw nearly 50% reductions in time
to complete 2000 cycles of spinning up and tearing down the pipeline.
Measurements from Owen of an actual long-lived pass manager show more
along the lines of 10% improvements.
[LPM] Stop using the string based preservation API. It is an
abomination.
For starters, this API is incredibly slow. In order to lookup the name
of a pass it must take a memory fence to acquire a pointer to the
managed static pass registry, and then potentially acquire locks while
it consults this registry for information about what passes exist by
that name. This stops the world of LLVMs in your process no matter
how little they cared about the result.
To make this more joyful, you'll note that we are preserving many passes
which *do not exist* any more, or are not even analyses that one might
wish to have preserved. This means we do all the work only to say
"nope" with no error to the user.
String-based APIs are a *bad idea*. String-based APIs that cannot
produce any meaningful error are an even worse idea. =/
I have a patch that simply removes this API completely, but I'm hesitant
to commit it as I don't really want to perniciously break out-of-tree
users of the old pass manager. I'd rather they just have to migrate to
the new one at some point. If others disagree and would like me to kill
it with fire, just say the word. =]
------------------------------------------------------------------------
A COMDAT must have the same name as the symbol it holds. When support for
COMDATs was introduced, the symbol rewriter was not updated, so rewriting
failed for symbols which were placed into COMDATs. This corrects the
behaviour and adds test cases for it.
------------------------------------------------------------------------
The pattern-based rewrite is unnecessary if the existing name already
matches the pattern.
------------------------------------------------------------------------
Summary:
At the moment, address calculation is taking the debug line info from the
address node (e.g. TargetGlobalAddress). When a function is called multiple
times, this results in output of the form:
.loc $first_call_location
.. address calculation ..
.. function call ..
.. address calculation ..
.loc $second_call_location
.. function call ..
.. address calculation ..
.loc $third_call_location
.. function call ..
This patch makes address calculations for function calls take the debug line
info for the call node and results in output of the form:
.loc $first_call_location
.. address calculation ..
.. function call ..
.loc $second_call_location
.. address calculation ..
.. function call ..
.loc $third_call_location
.. address calculation ..
.. function call ..
All other address calculations continue to use the address node.
Test Plan: Fixes test/DebugInfo/multiline.ll on a mips host.
InstCombine: Don't strip bitcasts off of callsites marked 'thunk'
The return type of a thunk is meaningless, we just want the arguments
and return value to be forwarded.
------------------------------------------------------------------------
Make ScalarEvolution less aggressive with respect to no-wrap flags.
ScalarEvolution currently lowers a subtraction recurrence to an add
recurrence with the same no-wrap flags as the subtraction. This is
incorrect because `sub nsw X, Y` is not the same as `add nsw X, -Y`
and `sub nuw X, Y` is not the same as `add nuw X, -Y`. This patch
fixes the issue, and adds two test cases demonstrating the bug.
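A concrete C++ illustration, using unsigned wrap-around as a stand-in for
`nuw`:

    #include <cstdint>
    #include <cstdio>

    int main() {
      uint32_t X = 5, Y = 3;
      // X - Y does not wrap, so `sub nuw X, Y` would be justified...
      std::printf("%u\n", X - Y);            // 2
      // ...but -Y wraps through 2^32, so carrying the same `nuw` flag
      // over to `add X, -Y` would be wrong.
      std::printf("%u\n", uint32_t(-Y));     // 4294967293
      std::printf("%u\n", X + uint32_t(-Y)); // 2, but only via wrap-around
    }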
We were passing the scratch buffer address to the shaders via user sgprs,
but now we use external symbols and have the driver patch the shader
using reloc information.
R600/SI: Don't store scratch buffer frame index in MUBUF offset field
We don't have a good way of legalizing this if the frame index offset
is more than 12 bits, which is the size of MUBUF's offset field, so
now we store the frame index in the vaddr field.
v2: modify hasVALU32BitEncoding instead
v3: - add pseudoToMCOpcode helper to AMDGPUInstrInfo, which is used by both
hasVALU32BitEncoding and AMDGPUMCInstLower::lower
- report an error if a pseudo can't be lowered
Fix the C-API MCJIT test for 32-bit big endian machines.
Avoid using unions for storing the return value from
LLVMGetGlobalValueAddress() and LLVMGetFunctionAddress() and accessing it as
a pointer through another pointer member. This causes problems on 32-bit big
endian machines since the pointer gets the higher part of the return value of
the aforementioned functions.
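A small C++ model of the failure mode (a hypothetical sketch that reads
through union type punning, which common compilers tolerate; this is not the
actual test code):

    #include <cstdint>
    #include <cstdio>

    int main() {
      union {
        uint64_t Addr; // the 64-bit value returned by the C API
        void *Ptr;     // only 4 bytes on an ILP32 target
      } U;
      U.Addr = UINT64_C(0x12345678);
      // On a 32-bit little-endian machine Ptr overlays the low word and
      // reads 0x12345678; on a 32-bit big-endian machine it overlays the
      // high word and reads 0 -- the bug described above.
      std::printf("%p\n", U.Ptr);
    }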
------------------------------------------------------------------------
The bug was introduced in r225282. r225282 assumed that sub X, Y is
the same as add X, -Y. This is not correct if we are going to upgrade
the sub to sub nuw. This change fixes the issue by making the
optimization ignore sub instructions.
This commit moves `MDLocation`, finishing off PR21433. There's an
accompanying clang commit for frontend testcases. I'll attach the
testcase upgrade script I used to PR21433 to help out-of-tree
frontends/backends.
This changes the schema for `DebugLoc` and `DILocation` from:

    !{i32 3, i32 7, !7, !8}

to:

    !MDLocation(line: 3, column: 7, scope: !7, inlinedAt: !8)
Note that empty fields (line/column: 0 and inlinedAt: null) don't get
printed by the assembly writer.
------------------------------------------------------------------------
IR: Drop metadata references more aggressively during teardown
Sometimes teardown happens before the debug info graph is complete
(e.g., when clang throws an error). In that case, `MDNode`s will still
have RAUW, so deleting constants that the `MDNode`s point at will be
relatively expensive -- it'll cause re-uniquing all up the chain (what
I've been referring to as "teardown madness").
So, drop references *before* deleting constants. We need to drop a few
more references now: the metadata side of the metadata/value bridges
needs to be dropped off the cliff along with the rest of it (previously,
the bridges were cleaned before we did anything with the `MDNode`s).
There's no real functionality change here -- state before and after
`LLVMContextImpl::~LLVMContextImpl()` is unchanged -- so no testcase.
------------------------------------------------------------------------
This happened pretty commonly during `LLVMContext` teardown when `clang -g`
hit an error. This fixes the use-after-free. Next I'll clean up
teardown so that it's not RAUW'ing when metadata-tracked values are
deleted (only really causes a problem if the graph is mid-construction
when teardown starts, but it's still unnecessary work).
------------------------------------------------------------------------
Override the TLI callback enableAggressiveFMAFusion and return true. On this target, fmul, fmadd and fadd nodes cost the same number of cycles, so we can enable more combining heuristics to produce more fmadd nodes.
Erik Eckstein [Wed, 14 Jan 2015 11:24:47 +0000 (11:24 +0000)]
reapply: SLPVectorizer: Cache results from memory alias checking.
This speeds up the dependency calculations for blocks with many load/store/call instructions.
Besides the improved runtime, there is no functional change.
Compared to the original commit, this re-applied commit contains a bug fix which ensures that there are
no incorrect collisions in the alias cache.
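A rough sketch of the caching shape (illustrative types and a stubbed query,
not the SLPVectorizer code; the point is that the key must identify the exact
instruction pair so that distinct queries cannot collide):

    #include <map>
    #include <utility>

    struct Instruction;
    enum class AliasKind { No, May, Must };

    // Stand-in for the expensive alias/dependency query.
    static AliasKind computeAlias(Instruction *, Instruction *) {
      return AliasKind::May;
    }

    static std::map<std::pair<Instruction *, Instruction *>, AliasKind>
        AliasCache;

    AliasKind cachedAlias(Instruction *A, Instruction *B) {
      auto Key = std::make_pair(A, B);
      auto It = AliasCache.find(Key);
      if (It != AliasCache.end())
        return It->second;        // hit: skip the slow query entirely
      AliasKind R = computeAlias(A, B);
      AliasCache.emplace(Key, R);
      return R;
    }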
Chandler Carruth [Wed, 14 Jan 2015 11:23:27 +0000 (11:23 +0000)]
[cleanup] Re-sort all the #include lines in LLVM using
utils/sort_includes.py.
I clearly haven't done this in a while, so more changed than usual. This
even uncovered a missing include from the InstrProf library that I've
added. No functionality changed here, just mechanical cleanup of the
include order.
Chandler Carruth [Wed, 14 Jan 2015 10:33:21 +0000 (10:33 +0000)]
[dom] Make the DominatorTreeBase not a dynamic class!
Now that the passes are wrappers around this, we no longer need
a vtable, virtual destructor, and other associated mess. This is
particularly nice to me as this is a class template. =]
Chandler Carruth [Wed, 14 Jan 2015 10:19:28 +0000 (10:19 +0000)]
[PM] Port domtree to the new pass manager (at last).
This adds the domtree analysis to the new pass manager. The analysis
returns the same DominatorTree result entity used by the old pass
manager and essentially all of the code is shared. We just have
different boilerplate for running and printing the analysis.
I've converted one test to run in both modes just to make sure this is
exercised while both are live in the tree.
This commit refines the pattern for the octeon seq/seqi/sne/snei instructions.
The target register is set to 0 or 1 according to the result of the comparison.
In C, this is something like
rd = (unsigned long)(rs == rt)
This commit adds a zext to bring the result to i64. With this change the
instruction is selected for this type of code. (gcc produces the same code for
the above C code.)
Chandler Carruth [Wed, 14 Jan 2015 10:07:19 +0000 (10:07 +0000)]
[PM] Make DominatorTrees (correctly) movable so that we can move them
into the new pass manager's analysis cache which stores results
by-value.
Technically speaking, the dom trees were originally not movable but
copyable! This, unsurprisingly, didn't work at all -- the copy was
shallow and just resulted in rampant memory corruption. This change
explicitly forbids copying (as it would need to be a deep copy) and
makes them explicitly movable with the unsurprising boilerplate to
member-wise move them because we can't rely on MSVC to generate this
code for us. =/
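A generic sketch of the pattern (not the actual DominatorTree members):
forbid the dangerous shallow copy and write out the member-wise moves by
hand.

    #include <map>
    #include <utility>

    struct Node;

    class TreeSketch {
      Node *Root = nullptr;
      std::map<unsigned, Node *> Nodes;

    public:
      TreeSketch() = default;
      // A shallow copy would leave two owners of the same nodes -- the
      // rampant corruption mentioned above -- so forbid copying outright.
      TreeSketch(const TreeSketch &) = delete;
      TreeSketch &operator=(const TreeSketch &) = delete;
      // Member-wise moves, spelled out because MSVC at the time would not
      // generate implicit move special members.
      TreeSketch(TreeSketch &&O) : Root(O.Root), Nodes(std::move(O.Nodes)) {
        O.Root = nullptr;
      }
      TreeSketch &operator=(TreeSketch &&O) {
        Root = O.Root;
        O.Root = nullptr;
        Nodes = std::move(O.Nodes);
        return *this;
      }
    };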
If there is no associated immediate (MS style inline asm), do not try to access
the operand; just assume that it is valid. This should fix the buildbots after
SVN r225941.
Mehdi Amini [Wed, 14 Jan 2015 05:33:01 +0000 (05:33 +0000)]
Fold a loop for array processing in ComputeLinearIndex
When processing an array, every Elt has the same layout, so it is
useless to call ComputeLinearIndex recursively on each element.
Just do it once and multiply by the number of elements.
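A toy C++ sketch of the shape of the optimization (an illustrative type
model, not the SelectionDAG code):

    // Toy model: a type is either a scalar or an array of one element type.
    struct Ty {
      unsigned NumElts = 0;      // 0 => scalar
      const Ty *EltTy = nullptr; // element type when NumElts > 0
    };

    unsigned countFlattenedFields(const Ty &T) {
      if (T.NumElts == 0)
        return 1;                // a scalar is a single flattened field
      // Every element shares one layout: walk it once and multiply,
      // instead of recursing NumElts times.
      return T.NumElts * countFlattenedFields(*T.EltTy);
    }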
NVPTX: Use MapMetadata() instead of custom/stale/untested logic
Copy the `GVMap` over to a standard `ValueToValueMapTy` so that we can
reuse the `MapMetadata()` logic. Unfortunately the `GVMap` can't just
be replaced, since `MapMetadata()` likes to modify the map, but at least
this will prevent NVPTX from bitrotting.
Chandler Carruth [Wed, 14 Jan 2015 03:34:55 +0000 (03:34 +0000)]
[dom] Add a basic dominator tree test.
Correct, we have *zero* basic testing of the dominator tree in the
regression test suite. There is a single test that even prints it out,
and that test only checks a single line of the output. There are
a handful of tests that check post dominators, but all of those are
looking for bugs rather than just exercising the basic machinery.
This test is super boring and unexciting. But hey, it's something.
I needed there to be something so I could switch the basic test to run
with both the old and new pass manager.
Hao Liu [Wed, 14 Jan 2015 03:02:16 +0000 (03:02 +0000)]
Fix a wrong comment in LoopVectorize.
I.E. more than two -> exactly two
Fix a typo function name in LoopVectorize.
I.E. collectStrideAcccess() -> collectStrideAccess()
Hal Finkel [Wed, 14 Jan 2015 01:37:21 +0000 (01:37 +0000)]
[PowerPC] Fix the noop-insert test
The form of nops used is CPU-specific (some CPUs, such as the POWER7, have
special group-terminating nops). We probably want a different callback for this
kind of nop insertion (something more like MCAsmBackend::writeNopData), or for
PPC to use a different mechanism for scheduling nops, but this will stop the
test from failing for now.
Matt Arsenault [Wed, 14 Jan 2015 01:35:22 +0000 (01:35 +0000)]
R600/SI: Fix bad code with unaligned byte vector loads
Don't do the v4i8 -> v4f32 combine if the load will need to
be expanded due to alignment. This stops adding instructions
to repack into a single register that the v_cvt_ubyteN_f32
instructions read.
Matt Arsenault [Wed, 14 Jan 2015 01:35:17 +0000 (01:35 +0000)]
Implement a new way of expanding extloads.
Now that the source and destination types can be specified,
allow doing an expansion that doesn't use an EXTLOAD of the
result type. Try to do a legal extload to an intermediate type
and extend that if possible.
This generalizes the special case custom lowering of extloads
R600 has been using to work around this problem.
This also happens to fix a bug that would incorrectly use more
aligned loads than should be used.