AArch64: Safely handle the incoming sret call argument.
This adds a safe interface to the machine independent InputArg struct
for accessing the index of the original (IR-level) argument. When a
non-native return type is lowered, we generate the hidden
machine-level sret argument on-the-fly. Before this fix, we were
representing this argument as OrigArgIndex == 0, which is an outright
lie. In particular this crashed in the AArch64 backend where we
actually try to access the type of the original argument.
Now we use a sentinel value for machine arguments that have no
original argument index. AArch64, ARM, Mips, and PPC now check for this
case before accessing the original argument.
Fixes <rdar://19792160> Null pointer assertion in AArch64TargetLowering
only propagate equality comparisons of FP values that we are certain are non-zero
This is a follow-on to r227491 which tightens the check for propagating FP
values. If a non-constant value happens to be a zero, we would hit the same
bug as before.
R600/SI: Fix verifier errors from the SIAnnotateControlFlow pass
This pass was generating 'Instruction does not dominate all uses!'
errors for programs which had loops with a condition variable that
depended on the result of a phi instruction from outside of the loop.
The pass was inserting new phi nodes outside of the loop which used values
defined inside the loop.
R600/SI: Fix indirect addressing with a negative constant offset
When the base register index of the vector plus the constant offset
was less than zero, we were passing the wrong base register to the indirect
addressing instruction.
In this case, we need to set the base register to v0 and then add
the computed (negative) index to m0.
x86: Fix large model calls to __chkstk for dynamic allocas
In the large code model, we now put __chkstk in %r11 before calling it.
Refactor the code so that we only do this once. Simplify things by using
__chkstk_ms instead of __chkstk on cygming. We already use that symbol
in the prolog emission, and it simplifies our logic.
Tom Stellard [Fri, 1 May 2015 02:43:13 +0000 (02:43 +0000)]
Merging r232438:
------------------------------------------------------------------------
r232438 | mail | 2015-03-16 19:52:03 -0400 (Mon, 16 Mar 2015) | 6 lines
GCOV: Make the exit block placement from r223193 optional
By default we want our gcov emission to stay 4.2 compatible, which
means we need to continue emit the exit block last by default. We add
an option to emit it before the body for users that need it.
Daniel Sanders [Wed, 29 Apr 2015 15:29:14 +0000 (15:29 +0000)]
Attempt #2: Fix ABI compatibility with 3.6.0 by moving new virtual function to end of subclass. NFC
The previous attempt at fixing this only moved the problem to the subclass
vtable. We can safely move the function into the subclass so attempt to fix it
that way.
Merge r232176: ConstantFold: Fix big shift constant folding
Constant folding for shift IR instructions ignores all bits above 32 of
second argument (shift amount).
Because of that, some undef results are not recognized and APInt can
raise an assert failure if second argument has more than 64 bits.
R600/SI: Fix verifier error caused by SIAnnotateControlFlow
This pass will always try to insert llvm.SI.ifbreak intrinsics
in the same block that its conditional value is computed in. This is
a problem when conditions for breaks or continue are computed outside
of the loop, because the llvm.SI.ifbreak intrinsic ends up being inserted
outside of the loop.
This patch fixes this problem by inserting the llvm.SI.ifbreak
intrinsics in the loop header when the condition is computed outside
the loop.
[CodeGen] Don't attempt a tail-call with a non-forwarded explicit sret.
Tailcalls are only OK with forwarded sret pointers. With explicit sret,
one approximation is to check that the pointer isn't an Instruction, as
in that case it might point into some local memory (alloca). That's not
OK with tailcalls.
Explicit sret counterpart to r233409.
Differential Revison: http://reviews.llvm.org/D8510
[MachineCopyPropagation] Fix a bug causing incorrect removal for the instruction sequences as follows
%Q5_Q6<def> = COPY %Q2_Q3
%D5<def> =
%D3<def> =
%D3<def> = COPY %D6 // Incorrectly removed in MachineCopyPropagation
Using of %D3 results in incorrect result ...
Fix sign extension for MIPS64 in makeLibCall function
Fixing sign extension in makeLibCall for MIPS64. In MIPS64 architecture all
32 bit arguments (int, unsigned int, float 32 (soft float)) must be sign
extended. This fixes test "MultiSource/Applications/oggenc/".
Fix makeLibCall argument (signed) in SoftenFloatRes_XINT_TO_FP function
The isSigned argument of makeLibCall function was hard-coded to false
(unsigned). This caused zero extension on MIPS64 soft float.
As the result SingleSource/Benchmarks/Stanford/FloatMM test and
SingleSource/UnitTests/2005-07-17-INT-To-FP test failed.
The solution was to use the proper argument.
[mips] Specify the correct value type when combining a CMovFP node.
This commit fixes a bug introduced in r230956 where we were creating
CMovFP_{T,F} nodes with multiple return value types (one for each operand).
With this change the return value type of the new node is the same as the
value type of the True/False operands of the original node.
------------------------------------------------------------------------
[mips] Optimize conditional moves where RHS is zero.
Summary:
When the RHS of a conditional move node is zero, we can utilize the $zero
register by inverting the conditional move instruction and by swapping the
order of its True/False operands.
Daniel Sanders [Mon, 27 Apr 2015 14:50:09 +0000 (14:50 +0000)]
Merging r230500:
------------------------------------------------------------------------
r230500 | vmedic | 2015-02-25 15:24:37 +0000 (Wed, 25 Feb 2015) | 1 line
[MIPS]Multiple and add instructions for Mips are currently available in mips32r2/mips64r2 and later but should also be available in mips4, mips5, and mips64. This patch fixes the requested features and updates the corresponding test files.
------------------------------------------------------------------------
Fix justify error for small structures bigger than 32 bits in fixed
arguments for MIPS64 big endian. There was a problem when small structures
are passed as fixed arguments. The structures that are bigger than 32 bits
but smaller than 64 bits were not left justified properly on MIPS64 big
endian. This is fixed by shifting the value to make it left justified when
appropriate.
[mips] Account for constant-zero operands in ADDE nodes.
Summary:
We identify the cases where the operand to an ADDE node is a constant
zero. In such cases, we can avoid generating an extra ADDu instruction
disguised as an identity move alias (ie. addu $r, $r, 0 --> move $r, $r).
Fix justify error for small structures in varargs for MIPS64BE
There was a problem when passing structures as variable arguments.
The structures smaller than 64 bit were not left justified on MIPS64
big endian. This is now fixed by shifting the value to make it left-
justified when appropriate.
This fixes the bug http://llvm.org/bugs/show_bug.cgi?id=21608
[mips] Honour -mno-odd-spreg for vector insert/extract when MSA is enabled.
Summary:
-mno-odd-spreg prohibits the use of odd-numbered single-precision floating
point registers. However, vector insert/extract was still using them when
manipulating the subregisters of an MSA register. Fixed this by ensuring
that insertion/extraction is only performed on even-numbered vector
registers when -mno-odd-spreg is given.
Daniel Sanders [Mon, 27 Apr 2015 11:59:49 +0000 (11:59 +0000)]
Merging r227430:
------------------------------------------------------------------------
r227430 | vmedic | 2015-01-29 11:33:41 +0000 (Thu, 29 Jan 2015) | 1 line
[Mips][Disassembler] When disassembler meets cache/pref instructions for r6 it crashes as the access to operands array is out of range. This patch adds dedicated decoder method for R6 CACHE_HINT_DESC class that properly handles decoding of these instructions.
------------------------------------------------------------------------
Daniel Sanders [Mon, 27 Apr 2015 10:29:59 +0000 (10:29 +0000)]
Merging r227084:
------------------------------------------------------------------------
r227084 | vmedic | 2015-01-26 10:33:43 +0000 (Mon, 26 Jan 2015) | 1 line
When disassembler meets compact jump instructions for r6 it crashes as the access to operands array is out of range. This patch removes dedicated decoder method that wrongly handles decoding of these instructions.
------------------------------------------------------------------------
[mips] Add new error message and improve testing for parsing the .module directive.
Summary:
We used to silently ignore any empty .module's and we used to give an error saying that we found
an "unexpected token at start of statement" when the value of the option wasn't an identifier (e.g. if it was a number).
We now give an error saying that we "expected .module option identifier" in both of those cases.
I also fixed the other tests in mips-abi-bad.s, which all seemed to be broken.
Daniel Sanders [Mon, 27 Apr 2015 09:42:44 +0000 (09:42 +0000)]
Merging r226652:
------------------------------------------------------------------------
r226652 | vmedic | 2015-01-21 10:47:36 +0000 (Wed, 21 Jan 2015) | 1 line
[Mips][Disassembler]When disassembler meets load/store from coprocessor 2 instructions for mips r6 it crashes as the access to operands array is out of range. This patch adds dedicated decoder method that properly handles decoding of these instructions.
------------------------------------------------------------------------
[mips] Make whitespace in disassembler tests more consistent. NFC.
The tests for the ISA's should now be approximately diffable. That is, the
output of 'diff valid-mips1.txt valid-mips2.txt' should be emit the lines
for instructions that were added/removed to/from MIPS-I by MIPS-II. This
doesn't work perfectly at the moment due to ordering differences but it
should be close.
Daniel Sanders [Mon, 27 Apr 2015 08:51:28 +0000 (08:51 +0000)]
Merging r226166:
------------------------------------------------------------------------
r226166 | vmedic | 2015-01-15 14:18:12 +0000 (Thu, 15 Jan 2015) | 1 line
Add disassembler tests for mips64r6 platform. There are no functional changes.
------------------------------------------------------------------------
Daniel Sanders [Mon, 27 Apr 2015 08:50:58 +0000 (08:50 +0000)]
Merging r226165:
------------------------------------------------------------------------
r226165 | vmedic | 2015-01-15 14:11:38 +0000 (Thu, 15 Jan 2015) | 1 line
Add disassembler tests for mips32r6 platform. There are no functional changes.
------------------------------------------------------------------------
Daniel Sanders [Mon, 27 Apr 2015 08:50:30 +0000 (08:50 +0000)]
Merging r226164:
------------------------------------------------------------------------
r226164 | vmedic | 2015-01-15 14:06:34 +0000 (Thu, 15 Jan 2015) | 1 line
Add disassembler tests for mips64r2 platform. There are no functional changes.
------------------------------------------------------------------------
v2i32, i32, trunc i32 to i16, and truc i32 to i8 stores are legal for
all address spaces. We had marked them as custom in order to lower
them for the private address space, but this is no longer necessary.
This enables lowering of misaligned stores of these types in the
DAGLegalizer.
R600/SI: Rewrite VOP1InstSI to contain a pseudo and _si opcode
What this does is that if you accidentally select these instructions on VI,
the code generation will fail, because the pseudo -> _vi mapping will be
undefined.
The idea is to be able to catch possible future bugs easily.
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
------------------------------------------------------------------------
R600/SI: Don't generate non-existent LSHL, LSHR, ASHR B32 variants on VI
This can happen when a REV instruction is commuted.
The trick is not to define the _vi versions of instructions, which has these
consequences:
- code generation will always fail if a pseudo cannot be lowered
(very useful to catch bugs where an unsupported instruction somehow makes
it to the printer)
- ability to query if a pseudo can be lowered, which is done in commuteOpcode
to prevent REV from commuting to non-REV on VI
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
------------------------------------------------------------------------
R600/SI: Determine target-specific encoding of READLANE and WRITELANE early v2
These are VOP2 on SI and VOP3 on VI, and their pseudos are neither, which can
be a problem. In order to make isVOP2 and isVOP3 queries behave as expected,
the encoding must be determined first.
This doesn't fix any known issue, but better safe than sorry.
v2: add and use getMCOpcodeFromPseudo
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
------------------------------------------------------------------------
R600/SI: 64-bit and larger memory access must be at least 4-byte aligned
This is true for SI only. CI+ supports unaligned memory accesses,
but this requires driver support, so for now we disallow unaligned
accesses for all GCN targets.
Original message from r231227:
Fix PR22408 - LLVM producing AArch64 TLS relocations that GNU linkers cannot handle yet.
As is described at http://llvm.org/bugs/show_bug.cgi?id=22408, the GNU linkers
ld.bfd and ld.gold currently only support a subset of the whole range of AArch64
ELF TLS relocations. Furthermore, they assume that some of the code sequences to
access thread-local variables are produced in a very specific sequence.
When the sequence is not as the linker expects, it can silently mis-relaxe/mis-optimize
the instructions.
Even if that wouldn't be the case, it's good to produce the exact sequence,
as that ensures that linkers can perform optimizing relaxations.
This patch:
* implements support for 16MiB TLS area size instead of 4GiB TLS area size. Ideally clang
would grow an -mtls-size option to allow support for both, but that's not part of this patch.
* by default doesn't produce local dynamic access patterns, as even modern ld.bfd and ld.gold
linkers do not support the associated relocations. An option (-aarch64-elf-ldtls-generation)
is added to enable generation of local dynamic code sequence, but is off by default.
* makes sure that the exact expected code sequence for local dynamic and general dynamic
accesses is produced, by making use of a new pseudo instruction. The patch also removes
two (AArch64ISD::TLSDESC_BLR, AArch64ISD::TLSDESC_CALL) pre-existing AArch64-specific pseudo
SDNode instructions that are superseded by the new one (TLSDESC_CALLSEQ).
Original message from r233351:
Fix a bug in SelectionDAG scheduling backtracking code: PR22304.
It can happen (by line CurSU->isPending = true; // This SU is not in
AvailableQueue right now.) that a SUnit is mark as available but is
not in the AvailableQueue. For SUnit being selected for scheduling
both conditions must be met.
This patch mainly defensively protects from invalid removing a node
from a queue. Sometimes nodes are marked isAvailable but are not in
the queue because they have been defered due to some hazard.
The other two commits move a test from CodeGen/Generic to Codegen/X86.
Paul Robinson [Thu, 2 Apr 2015 00:15:35 +0000 (00:15 +0000)]
Merging r233153 and r233584:
------------------------------------------------------------------------
r233153 | probinson | 2015-03-24 17:10:24 -0700 (Tue, 24 Mar 2015) | 7 lines
'optnone' should not disable DAG combiner.
Reverts the code change from r221168 and the relevant test.
It was a mistake to disable the combiner, and based on the ultimate
definition of 'optnone' we shouldn't have considered the test case
as failing in the first place.
Verify 'optnone' can run DAG combiner when appropriate.
Adds a test to verify the behavior that r233153 restored: 'optnone'
does not spuriously disable the DAG combiner, and in fact there are
cases where the DAG combiner must run (even at -O0 or 'optnone') in
order for codegen to succeed.