Chris Bieneman [Thu, 3 Dec 2015 18:45:39 +0000 (18:45 +0000)]
[CMake] Add option LLVM_EXTERNALIZE_DEBUGINFO
Summary: This adds support for generating dSYM files and stripping debug info from executables and dylibs. It also supports passing -object_path_lto to the linker to generate dSYMs for LTO builds.
Teresa Johnson [Thu, 3 Dec 2015 18:20:05 +0000 (18:20 +0000)]
[ThinLTO] Appending linkage fixes
Summary:
Fix import from module with appending var, which cannot be imported. The
first fix is to remove an overly-aggressive error check.
The second fix is to deal with restructuring introduced to the module
linker yesterday in r254418 (actually, this fix was included already
in r254559, just added some additional cleanup).
Colin LeMahieu [Thu, 3 Dec 2015 16:37:21 +0000 (16:37 +0000)]
[Hexagon] NFC Using canonicalizePacket to compound/duplex/pad packets rather than doing it separately. This also ensures the integrated assembler path matches the assembly parser path.
Marina Yatsina [Thu, 3 Dec 2015 12:17:03 +0000 (12:17 +0000)]
[X86] MS inline asm: produce error when encountering "<type> ptr <reg name>"
Currently "<type> ptr <reg name>" treated as <reg name> in MS inline asm, ignoring the "<type> ptr" completely and possibly ignoring the intention of the user.
Fixed llvm to produce an error when encountering "<type> ptr <reg name>" operands.
For example: andpd xmm1,xmmword ptr xmm1 --> andpd xmm1, xmm1
though andpd has 2 possible matching formats - andpd xmm, xmm/m128
Andy Gibbs [Thu, 3 Dec 2015 08:20:20 +0000 (08:20 +0000)]
Fix class SCEVPredicate has virtual functions and accessible non-virtual destructor.
It is not enough to simply make the destructor virtual since there is a g++ 4.7
issue (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53613) that throws the
error "looser throw specifier for ... overridding ~SCEVPredicate() noexcept".
Craig Topper [Thu, 3 Dec 2015 05:57:37 +0000 (05:57 +0000)]
[TableGen] Remove an assumption about the order of encodings in the MVT::SimpleValueType enum. Instead of assuming the types are sorted by size, scan the typeset arrays to find the smallest/largest type. NFC
This works mostly fine but breaks some stage 1 builders when compiling
compiler-rt on i386. Revert for further investigation as I can't see an
obvious cause/fix.
Mehdi Amini [Thu, 3 Dec 2015 02:37:23 +0000 (02:37 +0000)]
Remove "ExportingModule" from ThinLTO Index (NFC)
There is no real reason the index has to have the concept of an
exporting Module. We should be able to have one single unique
instance of the Index, and it should be read-only after creation
for the whole ThinLTO processing.
The linker plugin should be able to process multiple modules (in
parallel or in sequence) with the same index.
The only reason the ExportingModule was present seems to be to
implement hasExportedFunctions() that is used by the Module linker
to decide what to do with the current Module.
For now I replaced it with a query to the map of Modules path to
see if this module was declared in the Index and consider that if
it is the case then it is probably exporting function.
On the long term the Linker interface needs to evolve and this
call should not be needed anymore.
Matthias Braun [Thu, 3 Dec 2015 02:05:27 +0000 (02:05 +0000)]
ScheduleDAGInstrs: Rework schedule graph builder.
The new algorithm remembers the uses encountered while walking backwards
until a matching def is found. Contrary to the previous version this:
- Works without LiveIntervals being available
- Allows to increase the precision to subregisters/lanemasks
(not used for now)
The changes in the AMDGPU tests are necessary because the R600 scheduler
is not stable with respect to the order of nodes in the ready queues.
Alexey Samsonov [Wed, 2 Dec 2015 21:25:28 +0000 (21:25 +0000)]
[PowerPC] Remove wild call to RegScavenger::initRegState().
This call should in fact be made by RegScavenger::enterBasicBlock()
called below. The first call does nothing except for triggering UB,
indicated by UBSan (passing nullptr to memset()).
Alexey Samsonov [Wed, 2 Dec 2015 21:13:43 +0000 (21:13 +0000)]
[Hexagon] Remove std::hex in favor of format().
std::hex is not used anywhere in LLVM code base except for this place,
and it has a known undefined behavior (at least in libstdc++ 4.9.3):
https://llvm.org/bugs/show_bug.cgi?id=18156, which fires in UBSan
bootstrap of LLVM.
Kyle Butt [Wed, 2 Dec 2015 18:58:51 +0000 (18:58 +0000)]
[CodeGen]: Fix bad interaction with AntiDep breaking and inline asm.
AggressiveAntiDepBreaker was renaming registers specified by the user
for inline assembly. While this will work for compiler-specified
registers, it won't work for user-specified registers, and at the time
this runs, I don't currently see a way to distinguish them.
Fiona Glaser [Wed, 2 Dec 2015 18:32:59 +0000 (18:32 +0000)]
Scheduler / Regalloc: use unique_ptr[] instead of std::vector
vector.resize() is significantly slower than memset in many STLs
and the cost of initializing these vectors is significant on targets
with many registers. Since we don't need the overhead of a vector,
use a simple unique_ptr instead.
[llvm-profdata] Change instr prof counter overflow to saturate rather than discard
Summary: This changes overflow handling during instrumentation profile merge. Rathar than throwing away records that would result in counter overflow, merged counts are instead clamped to the maximum representable value. A warning about counter overflow is still surfaced to the user as before.
Tim Northover [Wed, 2 Dec 2015 18:12:57 +0000 (18:12 +0000)]
AArch64: use ldxp/stxp pair to implement 128-bit atomic loads.
The ARM ARM is clear that 128-bit loads are only guaranteed to have been atomic
if there has been a corresponding successful stxp. It's less clear for AArch32, so
I'm leaving that alone for now.
|9B DD /7| FSTSW m2byte| Valid Valid Store FPU status word at m2byteafter checking for pending unmasked floating-point exceptions.|
|9B DF E0| FSTSW AX| Valid Valid Store FPU status word in AX register after checking for pending unmasked floating-point exceptions.|
|DD /7 |FNSTSW *m2byte| Valid Valid Store FPU status word at m2bytewithout checking for pending unmasked floating-point exceptions.|
|DF E0 |FNSTSW *AX| Valid Valid Store FPU status word in AX register without checking for pending unmasked floating-point exceptions|
m2byte is word register, and therefor instruction operand need to be change from f32mem to i16mem.
Andy Gibbs [Wed, 2 Dec 2015 14:22:18 +0000 (14:22 +0000)]
Fix buildbots broken by r254508
g++ 4.7 does not allow an inline defaulted virtual destructor to be overridden,
giving the error "looser throw specifier for ... overridding ~SCEVPredicate()
noexcept (true)" (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53613).
The work-around given in the bug report above has been utilised here.
AVX-512: Updated cost of FP/SINT/UINT conversion operations
I checked and updated the cost of AVX-512 conversion operations. Added cost of conversion operations in DQ mode.
Conversion of illegal types that requires vector split is not calculated right now (like for other X86 targets).
Akira Hatanaka [Wed, 2 Dec 2015 06:58:49 +0000 (06:58 +0000)]
[AttributeSet] Overload AttributeSet::addAttribute to reduce compile
time.
The new overloaded function is used when an attribute is added to a
large number of slots of an AttributeSet (for example, to function
parameters). This is much faster than calling AttributeSet::addAttribute
once per slot, because AttributeSet::getImpl (which calls
FoldingSet::FIndNodeOrInsertPos) is called only once per function
instead of once per slot.
With this commit, clang compiles a file which used to take over 22
minutes in just 13 seconds.
Craig Topper [Wed, 2 Dec 2015 06:39:19 +0000 (06:39 +0000)]
[X86] Change getZeroVector to take an MVT instead of EVT. One minor change needed to only try to perform 256-it shuffle combines on legal vector types.
David Blaikie [Wed, 2 Dec 2015 06:21:34 +0000 (06:21 +0000)]
[llvm-dwp] Emit a rather fictional debug_cu_index
This is very rudimentary support for debug_cu_index, but it is enough to
allow llvm-dwarfdump to find the offsets for contributions and
correctly dump debug_info.
It will need to actually find the real signature of the unit and build
the real hash table with the right number of buckets, as per the DWP
specification.
It will also need to be expanded to cover the tu_index as well.
[sanitizer coverage] when adding a bb trace instrumentation, do it instead, not in addition to, regular coverage. Do the regular coverage in the run-time instead
Mehdi Amini [Wed, 2 Dec 2015 02:00:29 +0000 (02:00 +0000)]
Modify FunctionImport to take a callback to load modules
When linking static archive, there is no individual module files to
load. Instead they can be mmap'ed and could be initialized from a
buffer directly. The callback provide flexibility to override the
scheme for loading module from the summary.
Tim Northover [Wed, 2 Dec 2015 00:33:54 +0000 (00:33 +0000)]
AArch64: fix 128-bit shifts
We mustn't introduce a shift of exactly 64-bits for any inputs, since that's an
UNDEF value (and worse, it's not what you want with the natural Arch64
implementation).
The generated code is pretty horrific, but I couldn't come up with an obviously
better alternative (if the amount is constant EXTR could help). Turns out
128-bit shifts are just nasty.
For the struct with trailing objects, define
a member operator delete. Without this, the program
will fail when -fsized-deallocation option is used
where the wrong size will be passed to the global
delete operator.
Cong Hou [Tue, 1 Dec 2015 21:50:20 +0000 (21:50 +0000)]
Fix a bug in IfConversion.cpp.
The bug is introduced in r254377 which failed some tests on ARM, where a new
probability is assigned to a successor but the provided BB may not be a
successor.