Owen Anderson [Fri, 30 Jan 2015 09:05:49 +0000 (09:05 +0000)]
Change a very hot piece of code in TableGen's register unit computations to use bit vectors rather than arrays.
For target descriptions with very large and very dense register files, TableGen
can take an extremely long time to run. This change makes a dent in that (~15%
in my measurements) by accelerating the single hottest operation with better data
structures.
I believe there's still a lot of room to make this even faster with more global
changes that require replacing some of the existing datastructures in this area
with bit vectors, but that's a more involved change and I wanted to get this
simpler improvement in first.
Hao Liu [Fri, 30 Jan 2015 05:02:21 +0000 (05:02 +0000)]
[LoopVectorize] Induction variables: support arbitrary constant step.
Previously, only -1 and +1 step values are supported for induction variables. This patch extends LV to support
arbitrary constant steps.
Initial patch by Alexey Volkov. Some bug fixes are added in the following version.
Differential Revision: http://reviews.llvm.org/D6051 and http://reviews.llvm.org/D7193
Hao Liu [Fri, 30 Jan 2015 02:13:53 +0000 (02:13 +0000)]
[AArch64]Fix PR21675, a bug about lowering llvm.ctpop.i32. We should noot use "DAG.getUNDEF(MVT::v8i8)" to get all zero vector.
Patch by Wei-cheng Wang.
Akira Hatanaka [Fri, 30 Jan 2015 01:16:24 +0000 (01:16 +0000)]
[LTO] Scan all per-function subtargets when collecting runtime library names.
accumulateAndSortLibcalls in LTOCodeGenerator.cpp collects names of runtime
library functions which are used to identify user-defined functions that should
be protected. Previously, this function would only scan the TargetLowering
object belonging to the "main" subtarget for the library function names. This
commit changes it to scan all per-function subtargets.
Reid Kleckner [Thu, 29 Jan 2015 23:58:04 +0000 (23:58 +0000)]
x86: Fix large model calls to __chkstk for dynamic allocas
In the large code model, we now put __chkstk in %r11 before calling it.
Refactor the code so that we only do this once. Simplify things by using
__chkstk_ms instead of __chkstk on cygming. We already use that symbol
in the prolog emission, and it simplifies our logic.
Eric Christopher [Thu, 29 Jan 2015 23:27:36 +0000 (23:27 +0000)]
Remove most of the TargetMachine::getSubtarget/getSubtargetImpl
calls that don't take a Function argument from Mips. Notable
exceptions: the AsmPrinter and MipsTargetObjectFile. The
latter needs to be fixed, and the former will be fixed when the
general AsmPrinter changes happen.
Chandler Carruth [Thu, 29 Jan 2015 23:26:37 +0000 (23:26 +0000)]
[LPM] Remove a PPC64 hack to try to work around a bad interaction
between the linker's TLS optimizations and Clang's TLS code generation.
For now, Clang has been changed to disable linker TLS optimizations
until it (and LLVM more generally) are emitting TLS code sequences
compatible with the old bugs found in the linkers. That's a better fix
to handle bootstrapping on that platform.
James Molloy [Thu, 29 Jan 2015 21:52:03 +0000 (21:52 +0000)]
[LoopReroll] Alter the data structures used during reroll validation.
The validation algorithm used an incremental approach, building each
iteration's data structures temporarily, validating them, then
adding them to a global set.
This does not scale well to having multiple sets of Root nodes, as the
set of instructions used in each iteration is the union over all
the root nodes. Therefore, refactor the logic to create a single, simple
container to which later logic then refers. This makes it simpler
control-flow wise to make the creation of the container more complex with
the addition of multiple root sets.
Sanjay Patel [Thu, 29 Jan 2015 20:51:49 +0000 (20:51 +0000)]
[GVN] don't propagate equality comparisons of FP zero (PR22376)
In http://reviews.llvm.org/D6911, we allowed GVN to propagate FP equalities
to allow some simple value range optimizations. But that introduced a bug
when comparing to -0.0 or 0.0: these compare equal even though they are not
bitwise identical.
This patch disallows propagating zero constants in equality comparisons. Fixes: http://llvm.org/bugs/show_bug.cgi?id=22376
Differential Revision: http://reviews.llvm.org/D7257
Matt Arsenault [Thu, 29 Jan 2015 19:34:32 +0000 (19:34 +0000)]
R600/SI: Implement enableAggressiveFMAFusion
Add tests for the various combines. This should
always be at least cycle neutral on all subtargets for f64,
and faster on some. For f32 we should prefer selecting
v_mad_f32 over v_fma_f32.
David Blaikie [Thu, 29 Jan 2015 19:09:18 +0000 (19:09 +0000)]
DebugInfo: Teach Fast ISel to respect the debug location of comparisons in jumps
The use of the DbgLoc in FastISel is probably something we should fix.
It's prone to leaking the wrong location into instructions - we should
have a clear chain of custody from the debug location of an IR
Instruction to that of a MachineInstr to avoid such leakage.
Zachary Turner [Thu, 29 Jan 2015 18:44:14 +0000 (18:44 +0000)]
Disable compilation of llvm-pdbdump for versions of MSVC < 2013.
Certain aspects of llvm-pdbdump require language support only present in
MSVC 2013 and higher. Since this is strictly a utility, and since we hope
to drop support for MSVC 2012 soon, don't build this unless MSVC 2013 or
higher.
Robert Lougher [Thu, 29 Jan 2015 16:18:29 +0000 (16:18 +0000)]
[X86] Use single add/sub for large stack offsets
For large stack offsets the compiler generates multiple immediate mode
sub/add instructions in the prologue/epilogue. This patch makes the
compiler place the final amount to be added/subtracted into a register,
which is then added/substracted with a single operation.
Bill Schmidt [Thu, 29 Jan 2015 15:59:09 +0000 (15:59 +0000)]
[PowerPC] Complete setting the baseline for ppc64le
Patch by Nemanja Ivanovic.
As was uncovered by the failing test case (when run on non-PPC
platforms), the feature set when compiling with -march=ppc64le was not
being picked up. This change ensures that if the -mcpu option is not
specified, the correct feature set is picked up regardless of whether
we are on PPC or not.
Aaron Ballman [Thu, 29 Jan 2015 15:49:22 +0000 (15:49 +0000)]
Temporarily reverting the fuzzer library as it causes too many build issues for MSVC users. This reverts: 227445, 227395, 227389, 227357, 227254, 227252
James Molloy [Thu, 29 Jan 2015 13:48:05 +0000 (13:48 +0000)]
[LoopReroll] Refactor most of reroll() into a helper class
reroll() was slightly monolithic and a pain to modify. Refactor
a bunch of its state from local variables to member variables
of a helper class, and do some trivial simplification while we're
there.
Vladimir Medic [Thu, 29 Jan 2015 11:33:41 +0000 (11:33 +0000)]
[Mips][Disassembler] When disassembler meets cache/pref instructions for r6 it crashes as the access to operands array is out of range. This patch adds dedicated decoder method for R6 CACHE_HINT_DESC class that properly handles decoding of these instructions.
Owen Anderson [Thu, 29 Jan 2015 07:35:31 +0000 (07:35 +0000)]
Use the existing build configuration parameter ENABLE_BACKTRACE to compile out all pretty stack trace support when backtraces are disabled.
This has the nice secondary effect of allowing LLVM to continue to build
for targets without __thread or thread_local support to continue to work
so long as they build without support for backtraces.
Chandler Carruth [Thu, 29 Jan 2015 02:34:17 +0000 (02:34 +0000)]
[LPM] Try again to appease powerpc64 in its self host. I've been unable
to get a powerpc64 host so that I can reproduce and test this, but it
only impacts that platform so trying the only other realistic option.
According to Ulrich, who debugged this initially, initial-exec is likely
to be sufficient for our needs and not subject to this bug. Will watch
the build bots to see.
If this doesn't work, I'll be forced to cut a really ugly pthread-based
approach into the primary user (our stack trace printing) as that user
cannot use the ThreadLocal implementation due to lifetime issues.
Chandler Carruth [Thu, 29 Jan 2015 01:23:04 +0000 (01:23 +0000)]
[LPM] Clean up the use of TLS in pretty stack trace and disable it
entirely when threads are not enabled. This should allow anyone who
needs to bootstrap or cope with a host loader without TLS support to
limp along without threading support.
There is still some bug in the PPC TLS stuff that is not worked around.
I'm getting access to a machine to reproduce and debug this further.
There is some chance that I'll have to add a terrible workaround for
PPC.
There is also some problem with iOS, but I have no ability to really
evaluate what the issue is there. I'm leaving it to folks maintaining
that platform to suggest a path forward -- personally I don't see any
useful path forward that supports threading in LLVM but does so without
support for *very basic* TLS. Note that we don't need more than some
pointers, and we don't need constructors, destructors, or any of the
other fanciness which remains widely unimplemented.
Reid Kleckner [Thu, 29 Jan 2015 00:41:44 +0000 (00:41 +0000)]
Add a Windows EH preparation pass that zaps resumes
If the personality is not a recognized MSVC personality function, this
pass delegates to the dwarf EH preparation pass. This chaining supports
people on *-windows-itanium or *-windows-gnu targets.
Currently this recognizes some personalities used by MSVC and turns
resume instructions into traps to avoid link errors. Even if cleanups
are not used in the source program, LLVM requires the frontend to emit a
code path that resumes unwinding after an exception. Clang does this,
and we get unreachable resume instructions. PR20300 covers cleaning up
these unreachable calls to resume.
Eric Christopher [Thu, 29 Jan 2015 00:19:42 +0000 (00:19 +0000)]
Remove getSubtargetImpl from AArch64ISelLowering and cache the
correct subtarget by passing it in during the constructor as
TargetLowering is Subtarget specific.
Eric Christopher [Thu, 29 Jan 2015 00:19:39 +0000 (00:19 +0000)]
Remove getSubtargetImpl from ARMISelLowering and cache the
correct subtarget by passing it in during the constructor as
TargetLowering is Subtarget specific.