Che-Liang Chiou [Mon, 28 Feb 2011 06:34:09 +0000 (06:34 +0000)]
Add preliminary support for .f32 in the PTX backend.
- Add appropriate TableGen patterns for fadd, fsub, fmul.
- Add .f32 as the PTX type for the LLVM float type.
- Allow parameters, return values, and global variable declarations
to accept the float type.
- Add appropriate test cases.
Nick Lewycky [Mon, 28 Feb 2011 06:20:05 +0000 (06:20 +0000)]
The sign of an srem instruction is the sign of its dividend (the first
argument), regardless of the divisor. Teach instcombine about this and fix
test7 in PR9343!
Bill Wendling [Mon, 28 Feb 2011 01:10:44 +0000 (01:10 +0000)]
Update the documentation on "How to Release LLVM". It lays out a new way of
tagging and branching for the release. I will update this more throughout the
2.9 release process.
Bill Wendling [Sat, 26 Feb 2011 03:09:12 +0000 (03:09 +0000)]
A new TableGen feature! (Not turned on just yet.)
InstAlias<{alias}, {aliasee}>;
The InstAlias instruction should be able to go from the MCInst to the
{alias}. All of the information is there to match the MCInst with the
{aliasee}. From there, it's a simple matter to emit the {alias}, with the
correct operands from the {aliasee}.
The code this patch generates can be used by the InstPrinter to automatically
print out the alias without having to write special C++ code to handle the
situation.
This is a WIP, and therefore are several limitations. For instance, it cannot
handle AsmOperands at the moment. It also doesn't know what to do when two
{alias}es match the same {aliasee}. (Currently, it just ignores those two cases
and allows the printInstruction method to handle them.)
Cameron Zwarich [Fri, 25 Feb 2011 01:11:01 +0000 (01:11 +0000)]
Set NumSignBits to 1 if KnownZero/KnownOne are being zero extended. In theory it
is possible to do better if the high bit is set in either KnownZero/KnownOne, but
in practice NumSignBits is always 1 when we are zero extending because nothing
is known about that register.
Evan Cheng [Fri, 25 Feb 2011 00:24:46 +0000 (00:24 +0000)]
Each prologue may have multiple vpush instructions to store callee-saved
D registers since the vpush list may not have gaps. Make sure the stack
adjustment instruction isn't moved between them. Ditto for vpop in
epilogues.
Sorry, can't reduce a small test case.
rdar://9043312
Tweak the register allocator priority queue some more.
New live ranges are assigned in long -> short order, but live ranges that have
been evicted at least once are deferred and assigned in short -> long order.
Also disable splitting and spilling for live ranges seen for the first time.
The intention is to create a realistic interference pattern from the heavy live
ranges before starting splitting and spilling around it.
Restore r125595 (reverted in r126336) with modifications:
Introduce a variable in the AsmParserExtension whether [] is valid in an
expression. If it is true, parse them like (). Enable this for ELF only.
Nadav Rotem [Thu, 24 Feb 2011 21:01:34 +0000 (21:01 +0000)]
Enable support for vector sext and trunc:
Limit the folding of any_ext and sext into the load operation to scalars.
Limit the active-bits trunc optimization to scalars.
Document vector trunc and vector sext in LangRef.
NAKAMURA Takumi [Thu, 24 Feb 2011 12:34:34 +0000 (12:34 +0000)]
test/lit.cfg: Add PATHEXT to 'substitution', to recognize tools on Windows hosts. Thanks to Danil Malyshev!
Some tests on Windows use the "not" utility and fail with an error "program not executable". The reason for this error is that the name of the executable file sended to the "not" without the extension.
Duncan Sands [Thu, 24 Feb 2011 11:54:18 +0000 (11:54 +0000)]
Rewrite the vector part of getExtendedTypeAction to make it more
understandable (at least I find it easier to understand like this).
No intended functionality change.
Cameron Zwarich [Thu, 24 Feb 2011 10:00:25 +0000 (10:00 +0000)]
Merge information about the number of zero, one, and sign bits of live-out
registers at phis. This enables us to eliminate a lot of pointless zexts during
the DAGCombine phase. This fixes <rdar://problem/8760114>.
Cameron Zwarich [Thu, 24 Feb 2011 10:00:04 +0000 (10:00 +0000)]
Have isel visit blocks in reverse postorder rather than an undefined order. This
allows for the information propagated across basic blocks to be merged at phis.
Chris Lattner [Thu, 24 Feb 2011 05:10:56 +0000 (05:10 +0000)]
change instcombine to not turn a call to non-varargs bitcast of
function prototype into a call to a varargs prototype. We do
allow the xform if we have a definition, but otherwise we don't
want to risk that we're changing the abi in a subtle way. On
X86-64, for example, varargs require passing stuff in %al.
Jim Grosbach [Wed, 23 Feb 2011 21:26:51 +0000 (21:26 +0000)]
Revert r125595, which is an X86-only undocumented assembly syntax extension
enabled for all targets. Non-X86 targets should not have this behavior
enabled by default.
Joerg, if you would like to resubmit with the behavior conditionalized to be
X86-ELF only, that's fine.
Richard Osborne [Wed, 23 Feb 2011 18:35:59 +0000 (18:35 +0000)]
Add llvm.xcore.waitevent intrinsic. The effect of this intrinsic is to enable
events on the thread and wait until a resource is ready to event. The vector
of the resource that is ready is returned.
Sean Callanan [Wed, 23 Feb 2011 03:31:28 +0000 (03:31 +0000)]
Fixed a bug in the enhanced disassembler that caused
it to ignore valid uses of FS and GS as additional
base registers in address computations. Added a test
case for this.
Evan Cheng [Wed, 23 Feb 2011 02:24:55 +0000 (02:24 +0000)]
More fcopysign correctness and performance fix.
The previous codegen for the slow path (when values are in VFP / NEON
registers) was incorrect if the source is NaN.
The new codegen uses NEON vbsl instruction to copy the sign bit. e.g.
vmov.i32 d1, #0x80000000
vbsl d1, d2, d0
If NEON is not available, it uses integer instructions to copy the sign bit.
rdar://9034702
Keep track of how many times a live range has been dequeued, and prioritize new ranges.
When a large live range is evicted, it will usually be split when it comes
around again. By deferring evicted live ranges, the splitting happens at a time
when the interference pattern is more realistic. This prevents repeated
splitting and evictions.
Use interval sizes instead of spill weights to determine if it is legal to evict
interference. A smaller interval can evict interference if all interfering live
ranges are larger.
Allow multiple interferences to be evicted as along as they are all larger than
the live range being allocated.
Spill weights are still used to select the preferred eviction candidate.
Change the RAGreedy register assignment order so large live ranges are allocated first.
This is based on the observation that long live ranges are more difficult to
allocate, so there is a better chance of solving the puzzle by handling the big
pieces first. The allocator will evict and split long alive ranges when they get
in the way.
RABasic is still using spill weights for its priority queue, so the interface to
the queue has been virtualized.
Nick Lewycky [Tue, 22 Feb 2011 22:48:47 +0000 (22:48 +0000)]
Fix C++0x incompatibility. The signature of std::make_pair<> changes from:
template <class T1, class T2> pair<T1,T2> make_pair(const T1&, const T2&);
to
template <class T1, class T2> pair<V1, V2> make_pair(T1&&, T2&&);
so explicitly specifying the template arguments to make_pair<> is going to break
when C++0x rolls through. Replace them with equivalent std::pair<>. Patch by
James Dennett!