Cameron Zwarich [Tue, 22 Feb 2011 00:46:27 +0000 (00:46 +0000)]
Merge information about the number of zero, one, and sign bits of live-out registers
at phis. This enables us to eliminate a lot of pointless zexts during the DAGCombine
phase. This fixes <rdar://problem/8760114>.
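
A minimal sketch of the merge described above, assuming a toy record per live-out register. The names LiveOutInfo and merge_at_phi are hypothetical and are not taken from the LLVM sources. A fact survives the phi only if it holds for every incoming value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LiveOutInfo:
    """Hypothetical record of what is known about a 32-bit register value."""
    known_zero: int     # mask of bits known to be 0
    known_one: int      # mask of bits known to be 1
    num_sign_bits: int  # leading bits known to equal the sign bit


def merge_at_phi(incoming):
    """Keep only the facts that hold on every incoming edge of the phi."""
    incoming = list(incoming)
    if not incoming:
        raise ValueError("a phi needs at least one incoming value")
    zero = incoming[0].known_zero
    one = incoming[0].known_one
    sign = incoming[0].num_sign_bits
    for info in incoming[1:]:
        zero &= info.known_zero                 # zero on all paths
        one &= info.known_one                   # one on all paths
        sign = min(sign, info.num_sign_bits)    # weakest guarantee wins
    return LiveOutInfo(zero, one, sign)


# One incoming value was zero-extended from i8, the other from i16.
from_i8 = LiveOutInfo(known_zero=0xFFFFFF00, known_one=0, num_sign_bits=24)
from_i16 = LiveOutInfo(known_zero=0xFFFF0000, known_one=0, num_sign_bits=16)
merged = merge_at_phi([from_i8, from_i16])
```

In this toy case the top 16 bits of the phi are still known to be zero, so a later zext of the phi from i16 would be redundant.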
Cameron Zwarich [Tue, 22 Feb 2011 00:46:22 +0000 (00:46 +0000)]
Have isel visit blocks in reverse postorder rather than an undefined order. This
allows for the information propagated across basic blocks to be merged at phis.
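
A small sketch of a reverse postorder walk over a toy CFG, to show why this order helps: every block is visited after all of its predecessors, except along loop back edges. The function name and the dictionary-based CFG are assumptions made for illustration only.

```python
def reverse_postorder(successors, entry):
    """Return the blocks reachable from entry in reverse postorder."""
    visited = set()
    postorder = []

    def visit(block):
        visited.add(block)
        for succ in successors.get(block, ()):
            if succ not in visited:
                visit(succ)
        # A block is appended only after everything reachable from it.
        postorder.append(block)

    visit(entry)
    return postorder[::-1]


# Diamond-shaped CFG: entry branches to then/else, which meet at join.
cfg = {
    "entry": ["then", "else"],
    "then": ["join"],
    "else": ["join"],
    "join": [],
}
order = reverse_postorder(cfg, "entry")
```

Here "join" comes last, so information from both "then" and "else" is already available when the phis in "join" are processed.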
Devang Patel [Mon, 21 Feb 2011 23:21:26 +0000 (23:21 +0000)]
Revert r124611 - "Keep track of incoming argument's location while emitting LiveIns."
In other words, do not keep track of the argument's location. The debugger (gdb) is not prepared to see line table entries for arguments. For the debugger, the "second" line table entry marks the beginning of the function body.
Getting this to work requires some coordination with the debugger:
- The debugger needs to be aware of the prolog_end attribute attached to line table entries.
- The compiler needs to accurately mark prolog_end in line table entries (at -O0 and at -O1+).
Add SplitKit::isOriginalEndpoint and use it to force live range splitting to terminate.
An original endpoint is an instruction that killed or defined the original live
range before any live ranges were split.
When splitting global live ranges, avoid creating local live ranges without any
original endpoints. We may still create global live ranges without original
endpoints, but such a range won't be split again, and live range splitting still
terminates.
Sean Callanan [Mon, 21 Feb 2011 21:55:05 +0000 (21:55 +0000)]
Fixed a bug in the X86 disassembler where a member of the
X86 instruction decode structure was being interpreted as
being in units of bits, although it is actually stored in
units of bytes.
David Greene [Mon, 21 Feb 2011 19:23:22 +0000 (19:23 +0000)]
Add a convenience tool for doing comparison builds of the LLVM
ecosystem. This is a handy utility for checking changes before
committing them to the repository.
Duncan Sands [Mon, 21 Feb 2011 17:32:05 +0000 (17:32 +0000)]
If the phi node was used by an unreachable instruction that ends up using
itself without going via a phi node then we could return false here in
spite of making a change. Also, tweak the comment because this method
can (and always could) return true without deleting the original phi node.
For example, if the phi node was used by a read-only invoke instruction
which is used by another phi node phi2 which is only used by and only uses
the invoke, then phi2 would be deleted but not the invoke instruction and
not the original phi node.
Duncan Sands [Mon, 21 Feb 2011 16:27:36 +0000 (16:27 +0000)]
Simplify RecursivelyDeleteDeadPHINode. The only functionality change
should be that if the phi is used by a side-effect free instruction with
no uses then the phi and the instruction now get zapped (checked by the
unittest).
NAKAMURA Takumi [Mon, 21 Feb 2011 04:50:06 +0000 (04:50 +0000)]
Target/X86/X86FastISel: [PR6275] Fix Win32's dllimport function with fastisel.
A "dllimport" function is a Function, not a GlobalVariable, so it is enough to check for GlobalValue.
test/CodeGen/X86/dll-linkage.ll is updated to check llc -O0.
Cameron Zwarich [Mon, 21 Feb 2011 01:29:32 +0000 (01:29 +0000)]
A lo/hi mul has higher latency than an imul r,ri, e.g. 5 cycles compared to 3
on Core 2 and Nehalem, so the code we generate is better than GCC's here.
Cameron Zwarich [Mon, 21 Feb 2011 00:22:02 +0000 (00:22 +0000)]
The signed version of our "magic number" computation for the integer approximation
of a constant had a minor typo introduced when copying it from the book, which
caused it to favor negative approximations over positive approximations in many
cases. Positive approximations require fewer operations beyond the multiplication.
In the case of division by 3, we still generate code that is a single instruction
larger than GCC's code.
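
A sketch of the signed "magic number" computation. It follows the widely published textbook algorithm (the one in Hacker's Delight, which is assumed to be the book meant above) and is not a copy of the LLVM code. The function names are hypothetical, and Python integers stand in for 32-bit registers.

```python
def signed_magic(d, bits=32):
    """Return (multiplier, shift) for truncating signed division by d."""
    if d in (-1, 0, 1):
        raise ValueError("divisor must have magnitude of at least 2")
    ad = abs(d)
    half = 1 << (bits - 1)
    t = half + (1 if d < 0 else 0)
    anc = t - 1 - t % ad  # largest dividend magnitude that must be covered
    p = bits - 1
    while True:
        p += 1
        q1, r1 = divmod(1 << p, anc)
        q2, r2 = divmod(1 << p, ad)
        delta = ad - r2
        if not (q1 < delta or (q1 == delta and r1 == 0)):
            break
    m = q2 + 1
    if m >= half:          # reinterpret as a signed value of width `bits`
        m -= 1 << bits
    if d < 0:
        m = -m
    return m, p - bits


def divide_by_magic(n, d, bits=32):
    """Divide n by the constant d using only multiply, add, and shift."""
    m, s = signed_magic(d, bits)
    q = (m * n) >> bits    # high half of the signed product
    if d > 0 and m < 0:
        q += n             # extra operation needed for a negative multiplier
    if d < 0 and m > 0:
        q -= n
    q >>= s                # arithmetic shift
    return q + (1 if q < 0 else 0)


def trunc_div(n, d):
    """Reference result: division that rounds toward zero."""
    q = abs(n) // abs(d)
    return q if (n < 0) == (d < 0) else -q


magic_for_3 = signed_magic(3)   # (0x55555556, 0)
```

For d = 3 the multiplier is positive and the shift is zero, so no correction step is needed after the multiplication. For d = 7 the multiplier is negative as a signed value, which costs the extra add shown in divide_by_magic.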
Nick Lewycky [Sun, 20 Feb 2011 18:05:56 +0000 (18:05 +0000)]
Make RecursivelyDeleteDeadPHINode delete a phi node that has no users and add a
test for that. With this change, test/CodeGen/X86/codegen-dce.ll no longer finds
any instructions to DCE, so delete the test.
Also renamed J and JP to I and IP in RecursivelyDeleteDeadPHINode.
Nadav Rotem [Sun, 20 Feb 2011 12:37:50 +0000 (12:37 +0000)]
Fix PR9267; add vector zext support.
The DAGCombiner folds the zext into complex load instructions. This patch
prevents this optimization on vectors since none of the supported targets
knows how to perform load+vector_zext in one instruction.
Nick Lewycky [Sun, 20 Feb 2011 08:11:03 +0000 (08:11 +0000)]
Instead of keeping two Value*->id# mappings, keep one Value->Value mapping and
one Value set. This is faster because we only need to use the set when there
isn't already an entry in the map. No functionality change!
Eric Christopher [Sun, 20 Feb 2011 05:04:42 +0000 (05:04 +0000)]
If both operands are loads from stores in memory we can't use movlpd/movlps
since one needs to be a register operand. Just use movss instead of forcing
an operand into a register.
Stephen Wilson [Sun, 20 Feb 2011 04:17:15 +0000 (04:17 +0000)]
This patch lets LLDB build as an LLVM subproject. LLDB is not built in
parallel with the rest of the tools directory as it depends on Clang.
This patch was first applied in r125956 and subsequently reverted in
r125964 as it broke in-tree builds. Makefile.rules was fixed up in
r126070 to handle missing optional directories for the in-tree case,
so it should be safe now to bring this patch back in.
Stephen Wilson [Sun, 20 Feb 2011 03:51:07 +0000 (03:51 +0000)]
Do not try to descend into optional build directories if they do not
exist. This makes the build logic symmetric for both the in tree and
out of tree cases.
Eli Friedman [Sat, 19 Feb 2011 22:42:40 +0000 (22:42 +0000)]
PR9218: SimplifyDemandedVectorElts can return a non-null value that is not
the instruction passed in. Make sure to account for this correctly, instead
of looping infinitely.
Cameron Zwarich [Sat, 19 Feb 2011 21:44:35 +0000 (21:44 +0000)]
Try to fix the MC/AsmParser/section.s failure on the llvm-x86_64-linux-vg_leak
bot. I am not sure if this is valid Valgrind exclusion file syntax, but the
Internet seems to think so.
Chris Lattner [Sat, 19 Feb 2011 19:56:44 +0000 (19:56 +0000)]
rewrite the memset_pattern pattern generation stuff to accept any 2/4/8/16-byte
constant, including globals. This makes us generate much more "pretty" pattern
globals as well because it doesn't break it down to an array of bytes all the
time.
This enables us to handle stores of relocatable globals. This kicks in about
48 times in 254.gap, giving us stuff like this:
Chris Lattner [Sat, 19 Feb 2011 19:31:39 +0000 (19:31 +0000)]
Implement rdar://9009151, transforming strided loop stores of
unsplatable values into memset_pattern16 when it is available
(recent darwins). This transforms lots of strided loop stores
of ints for example, like 5 in vpr:
Formed memset: call void @memset_pattern16(i8* %4, i8* getelementptr inbounds ([16 x i8]* @.memset_pattern9, i32 0, i32 0), i64 %tmp25)
from store to: {%3,+,4}<%11> at: store i32 3, i32* %scevgep, align 4, !tbaa !4
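
A rough model of the transformation, assuming a little-endian target and the documented behaviour of memset_pattern16 (the 16-byte pattern is repeated, with a partial copy at the end if the length is not a multiple of 16). The helper names are hypothetical. It checks that filling with the pattern produces the same bytes as the strided stores it replaces.

```python
def make_pattern16(value, size_bytes):
    """Repeat a 2/4/8/16-byte constant until it fills 16 bytes."""
    if size_bytes not in (2, 4, 8, 16):
        raise ValueError("only 2/4/8/16-byte constants form a pattern")
    unit = value.to_bytes(size_bytes, "little")  # little-endian is an assumption
    return unit * (16 // size_bytes)


def fill_with_pattern16(pattern, length):
    """Model of memset_pattern16(dst, pattern, length) on a fresh buffer."""
    reps, rem = divmod(length, 16)
    return pattern * reps + pattern[:rem]


def strided_stores(value, size_bytes, count):
    """Model of a loop that stores the same constant count times."""
    return value.to_bytes(size_bytes, "little") * count


# Ten consecutive stores of the i32 constant 3, as in the example above.
pattern = make_pattern16(3, 4)
filled = fill_with_pattern16(pattern, 10 * 4)
stored = strided_stores(3, 4, 10)
```

The length passed to the call is the total byte count of the loop (trip count times store size), which is why a trip count that is not a multiple of four still works for 4-byte stores.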
Ted Kremenek [Sat, 19 Feb 2011 01:59:21 +0000 (01:59 +0000)]
Add ImmutableMap methods 'manualRetain()', 'manualRelease()', and 'getRootWithoutRetain()' to help more aggressively reclaim memory in the static analyzer.
Devang Patel [Fri, 18 Feb 2011 22:43:42 +0000 (22:43 +0000)]
Do not lose debug info of an inlined function argument even if the argument is only used through GEPs.
This time with a fix that avoids using an invalidated DenseMap iterator.
Chris Lattner [Fri, 18 Feb 2011 22:36:36 +0000 (22:36 +0000)]
Now that -loop-idiom uses TargetLibraryInfo properly, it doesn't
need to be pulled out of the pass manager when the user specifies
-fno-builtin. It can intelligently determine which libcalls to
optimize based on what is enabled in TargetLibraryInfo. This
allows -fno-builtin-foo to work someday.
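
A toy model of the idea, not LLVM's actual API: the pass asks an availability table which library functions it may emit, instead of being removed from the pipeline wholesale. The class ToyLibraryInfo and the function choose_idiom are hypothetical.

```python
class ToyLibraryInfo:
    """Toy stand-in for a library-availability query."""

    def __init__(self, available, disabled=()):
        # Functions the target provides, minus those the user turned off.
        self._available = set(available) - set(disabled)

    def has(self, name):
        return name in self._available


def choose_idiom(info, value_is_byte_splat):
    """Pick the libcall a strided store loop may become, or None."""
    if value_is_byte_splat:
        return "memset" if info.has("memset") else None
    return "memset_pattern16" if info.has("memset_pattern16") else None


darwin_libs = {"memset", "memset_pattern16"}
everything = ToyLibraryInfo(darwin_libs)
no_builtins = ToyLibraryInfo(darwin_libs, disabled=darwin_libs)
no_memset_only = ToyLibraryInfo(darwin_libs, disabled={"memset"})
```

With everything disabled the pass still runs but finds nothing to form. Disabling a single function leaves the other idioms available, which is the per-function behaviour the message hopes for.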
Owen Anderson [Fri, 18 Feb 2011 21:51:29 +0000 (21:51 +0000)]
Add FixedLenDecoderEmitter, the skeleton of a new disassembler emitter for fixed-length instruction encodings.
A major part of its (eventual) goal is to support a much cleaner separation between disassembly callbacks
provided by the target and the disassembler emitter itself, i.e. not requiring hardcoding of knowledge in tblgen
like the existing disassembly emitters do.
The hope is that some day this will allow us to replace the existing non-Thumb ARM disassembler and remove
some of the hacks the old one introduced to tblgen.
Chris Lattner [Fri, 18 Feb 2011 21:50:34 +0000 (21:50 +0000)]
introduce a new TargetLibraryInfo pass, which transformations can use to
query which library functions are available. For now this just has
memset_pattern16, which exists on darwin, but it can be extended for a
bunch of other things in the future.
Chris Lattner [Fri, 18 Feb 2011 04:43:06 +0000 (04:43 +0000)]
prevent jump threading from merging blocks when their address is
taken (and used!). This prevents merging the blocks (invalidating
the block addresses) in a case like this: