This implements the FastLowerCall hook, which is based on the DoSelectCall
function. The implementation is very similar, but the target-independent call
lowering part has been factored out.
This should also enable patchpoint intrinsic lowering for FastISel on X86.
David Majnemer [Tue, 15 Jul 2014 02:34:12 +0000 (02:34 +0000)]
CodeGen: Handle ConstantVector and undef in WinCOFF constant pools
The constant pool entry code for WinCOFF assumed that vector constants
would be formed using ConstantDataVector, it did not expect to see a
ConstantVector. Furthermore, it did not expect undef as one of the
elements of the vector.
ConstantVectors should be handled like ConstantDataVectors, treat Undef
as zero.
This implements the FastLowerCall hook, which is based on the DoSelectCall
function. The implementation is very similar, but the target-independent call
lowering part has been factored out.
This should also enable patchpoint intrinsic lowering for FastISel on X86.
[FastISel] Insert patchpoint instruction before the target generated call instruction.
The patchpoint instruction should have been inserted before the target
generated call instruction to be inside the ADJSTACKDOWN/ADJSTACKUP call
sequence window.
[FastISel] Fix patchpoint lowering to set the result register.
Always update the value map with the result register (if there is one), for the
patchpoint instruction we created to replace the target-specific call
instruction.
Matt Arsenault [Tue, 15 Jul 2014 02:06:31 +0000 (02:06 +0000)]
R600: Add dag combine for copy of an illegal type.
This helps avoid redundant instructions to unpack, and repack
the vectors. Ideally we could recognize that pattern and eliminate
it. Currently v4i8 and other small element type vectors are scalarized,
so this has the added bonus of avoiding that.
Andrea Di Biagio [Tue, 15 Jul 2014 01:29:27 +0000 (01:29 +0000)]
Improve test 'CodeGen/X86/combine-vec-shuffle-3.ll'.
Now functions 'test4', 'test9', 'test14' and 'test19' correctly perform
a move of two packed values from the high quadword of vector %b to the low
quadword of vector %a (movhlps idiom).
Andrea Di Biagio [Tue, 15 Jul 2014 00:02:32 +0000 (00:02 +0000)]
[DAGCombiner] Avoid calling method 'isShuffleMaskLegal' on illegal vector types.
This patch fixes a crasher in method 'DAGCombiner::visitOR' due to an invalid
call to method 'isShuffleMaskLegal'. On x86, method 'isShuffleMaskLegal'
always expects a legal vector value type in input.
With this patch, we immediately check if the input OR dag node has a legal
vector type; we only try to fold a OR dag node into a single shufflevector
if we know that the resulting shuffle will have a legal type.
This is to avoid calling method 'isShuffleMaskLegal' on a potentially
illegal vector value type.
Added a new test-case to file 'CodeGen/X86/combine-or.ll' to verify that
DAGCombiner doesn't crash in the attempt to check/combine an OR between shuffles
with illegal types.
Lang Hames [Mon, 14 Jul 2014 23:19:50 +0000 (23:19 +0000)]
[RuntimeDyld] Handle endiannes differences between the host and target while
reading MachO files magic numbers in RuntimeDyld.
This is required now that we're testing cross-platform JITing (via
RuntimeDyldChecker), and should fix some issues that David Fang has seen on PPC
builds.
Adam Nemet [Mon, 14 Jul 2014 23:18:39 +0000 (23:18 +0000)]
[X86] Specify all TSFlags bit-offsets symbolically
No functional change.
The offsets for the other bitfields are specified symbolically. I need to
increase the size for one of the earlier fields which is easier after this
cleanup.
Why these bits are relative to VEXShift is a bit strange but that is for
another cleanup.
I made sure that the values for the enums are unchanged after this change.
David Majnemer [Mon, 14 Jul 2014 22:57:27 +0000 (22:57 +0000)]
CodeGen: Stick constant pool entries in COMDAT sections for WinCOFF
COFF lacks a feature that other object file formats support: mergeable
sections.
To work around this, MSVC sticks constant pool entries in special COMDAT
sections so that each constant is in it's own section. This permits
unused constants to be dropped and it also allows duplicate constants in
different translation units to get merged together.
Andrea Di Biagio [Mon, 14 Jul 2014 22:46:26 +0000 (22:46 +0000)]
[DAGCombiner] Add more rules to combine shuffle vector dag nodes.
This patch teaches the DAGCombiner how to fold a pair of shuffles
according to rules:
1. shuffle(shuffle A, B, M0), B, M1) -> shuffle(A, B, M2)
2. shuffle(shuffle A, B, M0), A, M1) -> shuffle(A, B, M3)
The new rules would only trigger if the resulting shuffle has legal type and
legal mask.
Added test 'combine-vec-shuffle-3.ll' to verify that DAGCombiner correctly
folds shuffles on x86 when the resulting mask is legal. Also added some negative
cases to verify that we avoid introducing illegal shuffles.
David Majnemer [Mon, 14 Jul 2014 21:56:54 +0000 (21:56 +0000)]
ADT: Surface LowerCase argument for utohexstr
The underlying function. utohex_buffer, already supports an argument for
deciding if the hex characters should be upper or lower case. Expose an
identical argument for utohexstr.
Support: Fix option handling when using cl::Required with aliasopt
Until now, attempting to create an alias of a required option would
complain if the user supplied the alias, because the required option
didn't have a value. Similarly, if you said the alias was required,
then using the base option would complain that the alias wasn't
supplied. Lastly, if you put required on both, *neither* option would
work.
By changning alias to overload addOccurrence and setting cl::Required
on the original option, we can get this to behave in a more useful
way. I've also added a test and updated a user that was getting this
wrong.
David Majnemer [Mon, 14 Jul 2014 20:38:45 +0000 (20:38 +0000)]
InstSimplify: Correct sdiv x / -1
Determining the bounds of x/ -1 would start off with us dividing it by
INT_MIN. Suffice to say, this would not work very well.
Instead, handle it upfront by checking for -1 and mapping it to the
range: [INT_MIN + 1, INT_MAX. This means that the result of our
division can be any value other than INT_MIN.
David Majnemer [Mon, 14 Jul 2014 19:49:57 +0000 (19:49 +0000)]
InstSimplify: The upper bound of X / C was missing a rounding step
Summary:
When calculating the upper bound of X / -8589934592, we would perform
the following calculation: Floor[INT_MAX / 8589934592]
However, flooring the result would make us wrongly come to the
conclusion that 1073741824 was not in the set of possible values.
Instead, use the ceiling of the result.
We would emit a libcall for a 64-bit atomic on x86 after SVN r212119. This was
due to the misuse of hasCmpxchg16 to indicate if cmpxchg8b was supported on a
32-bit target. They were added at different times and would result in the
border condition being mishandled.
This fixes the border case to emit the cmpxchg8b instruction for 64-bit atomic
operations on x86 at the cost of restoring a long-standing bug in the codegen.
We emit a cmpxchg8b on all x86 targets even where the CPU does not support this
instruction (pre-Pentium CPUs). Although this bug should be fixed, this was
present prior to SVN r212119 and this change, so this is not really introducing
a regression.
Found during windows unwinding work. This header is indirectly included through
a chain leading through Support/Win64EH.h. Explicitly include the header. NFC.
Tim Northover [Mon, 14 Jul 2014 15:31:13 +0000 (15:31 +0000)]
X86: remove temporary atomicrmw used during lowering.
We construct a temporary "atomicrmw xchg" instruction when lowering atomic
stores for widths that aren't supported natively. This isn't on the top-level
worklist though, so it won't be removed automatically and we have to do it
ourselves once that itself has been lowered.
Daniel Sanders [Mon, 14 Jul 2014 14:02:14 +0000 (14:02 +0000)]
[mips] Correct section alignments and EntrySizes for .bss, .text, .data, .reginfo, .MIPS.options, and .MIPS.abiflags
Summary:
.bss, .text, and .data are at least 16-byte aligned.
.reginfo is 4-byte aligned and has a 24-byte EntrySize.
.MIPS.abiflags has an 24-byte EntrySize.
.MIPS.options is 8-byte aligned and has 1-byte EntrySize.
Using a 1-byte EntrySize for .MIPS.options seems strange because the
records are neither 1-byte long nor fixed-length but this matches the value
that GAS emits.
Daniel Sanders [Mon, 14 Jul 2014 13:08:14 +0000 (13:08 +0000)]
[mips] For the FP64A ABI, odd-numbered double-precision moves must not use mtc1/mfc1.
Summary:
This is because the FP64A the hardware will redirect 32-bit reads/writes
from/to odd-numbered registers to the upper 32-bits of the corresponding
even register. In effect, simulating FR=0 mode when FR=0 mode is not
available.
Unfortunately, we have to make the decision to avoid mfc1/mtc1 before
register allocation so we currently do this for even registers too.
FPXX has a similar requirement on 32-bit architectures that lack
mfhc1/mthc1 so this patch also handles the affected moves from the FPU for
FPXX too. Moves to the FPU were supported by an earlier commit.
[CMake][Win32.DLL] Let llvm_add_library(SHARED) link dependent libraries as PRIVATE.
For example, c-index-test.exe requires just libclang.dll (its import library).
When libraries in libclang were not PRIVATE but PUBLIC, c-index-test required libraries transitive by libclang.
Note, on mingw with BUILD_SHARED_LIBS, library dependencies would become more strict.
In principle, required libraries should be "required in its source file".
Sasa Stankovic [Mon, 14 Jul 2014 09:40:29 +0000 (09:40 +0000)]
[mips] Expand BuildPairF64 to a spill and reload when the O32 FPXX ABI is
enabled and mthc1 and dmtc1 are not available (e.g. on MIPS32r1)
This prevents the upper 32-bits of a double precision value from being moved to
the FPU with mtc1 to an odd-numbered FPU register. This is necessary to ensure
that the code generated executes correctly regardless of the current FPU mode.
MIPS32r2 and above continues to use mtc1/mthc1, while MIPS-IV and above continue
to use dmtc1.
Bill Wendling [Mon, 14 Jul 2014 06:22:36 +0000 (06:22 +0000)]
Support lowering of empty aggregates.
This crash was pretty common while compiling Rust for iOS (armv7). Reason -
SjLj preparation step was lowering aggregate arguments as ExtractValue +
InsertValue. ExtractValue has assertion which checks that there is some data in
value, which is not true in case of empty (no fields) structures. Rust uses
them quite extensively so this patch uses a 'select true, %val, undef'
instruction to lower the argument.
Andrea Di Biagio [Sun, 13 Jul 2014 21:02:14 +0000 (21:02 +0000)]
[DAGCombiner] Fix a crash caused by a missing check for legal type when trying to fold shuffles.
Verify that DAGCombiner does not crash when trying to fold a pair of shuffles
according to rule (added at r212539):
(shuffle (shuffle A, Undef, M0), Undef, M1) -> (shuffle A, Undef, M2)
The DAGCombiner avoids folding shuffles if the resulting shuffle dag node
is not legal for the target. That means, the resulting shuffle must have
legal type and legal mask.
Before, the DAGCombiner only called method
'TargetLowering::isShuffleMaskLegal' to check if it was "safe" to fold according
to the above-mentioned rule. However, this caused a crash in the x86 backend
since method 'isShuffleMaskLegal' always expects to be called on a
legal vector type.
This is the first of a number of changes designed to generalise
MCWin64EHInstruction to support different target architectures. An ordered set
(vector) of these instructions is saved per frame to permit the emission of
information for Windows NT style unwinding. The only bit of information which
is actually target specific here is the Opcode for the unwinding bytecode. The
remainder of the information is simply generic information that is relevant to
the Windows NT unwinding model.
Remove the accessors for the fields, making them const and public instead. Sink
the knowledge of the alias'ed name into the single source and sink a single-use
check method into the use.
MC: make DWARF and Windows unwinding handling more similar
Rename member variables and functions for the MCStreamer for DWARF-like
unwinding management. Rename the Windows ones as well and make the naming and
handling similar across the two. No functional change intended.
AArch64: add support for llvm.aarch64.hint intrinsic
This adds a llvm.aarch64.hint intrinsic to mirror the llvm.arm.hint in order to
support the various hint intrinsic functions in the ACLE.
Add an optional pattern field that permits the subclass to specify the pattern
that matches the selection. The intrinsic pattern is set as mayLoad, mayStore,
so overload the value for the definition of the hint instruction.
Due to the fact that the windows unwinding has the concept of chained frames, we
maintain a current frame info pointer that is adjusted on any push and pop of a
unwinding context. This just removes an unnecessary variable that was used to
mirror the DWARF unwinding code.
This structure contains information related to the call frame used to generate
unwinding information. Rename this to reflect the future use to represent the
shared state between various architectures for WinCFI information.
Owen Anderson [Sat, 12 Jul 2014 07:12:47 +0000 (07:12 +0000)]
Fix an issue with the MergeBasicBlockIntoOnlyPred() helper function where it did
not properly handle the case where the predecessor block was the entry block to
the function. The only in-tree client of this is JumpThreading, which worked
around the issue in its own code. This patch moves the solution into the helper
so that JumpThreading (and other clients) do not have to replicate the same fix
everywhere.
[ASan] Collect unmangled names of global variables in Clang to print them in error reports.
Currently ASan instrumentation pass creates a string with global name
for each instrumented global (to include global names in the error report). Global
name is already mangled at this point, and we may not be able to demangle it
at runtime (e.g. there is no __cxa_demangle on Android).
Instead, create a string with fully qualified global name in Clang, and pass it
to ASan instrumentation pass in llvm.asan.globals metadata. If there is no metadata
for some global, ASan will use the original algorithm.
This fixes https://code.google.com/p/address-sanitizer/issues/detail?id=264.