Chris Lattner [Fri, 11 Feb 2011 21:37:43 +0000 (21:37 +0000)]
When lowering an inbounds gep, the intermediate adds can have
unsigned overflow (e.g. due to a negative array index), but
the scales on array size multiplications are known to not
sign wrap.
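
The distinction above can be sketched with plain integers. This is a minimal model, not LLVM code: it assumes a 64-bit address space, and the names `LoweredGEP` and `lowerGEP` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of lowering an inbounds gep on a 64-bit target.
struct LoweredGEP {
  uint64_t Address;       // base + idx * elemSize, computed modulo 2^64
  bool AddWrapsUnsigned;  // true if the final add carried out of 64 bits
};

inline LoweredGEP lowerGEP(uint64_t Base, int64_t Idx, int64_t ElemSize) {
  // The scale multiply: assumed here not to overflow as a signed product,
  // which is the property the commit message says is known.
  int64_t Offset = Idx * ElemSize;

  // The intermediate add: a negative offset is a huge unsigned value, so
  // the add carries out even though the resulting address is in bounds.
  uint64_t Address = Base + static_cast<uint64_t>(Offset);
  bool Wraps = Address < Base;
  return {Address, Wraps};
}
```

With `Base = 0x1000`, `Idx = -1` and `ElemSize = 4`, the address is `0xFFC` and the add wraps as an unsigned operation, while the multiply `-1 * 4` stays well inside the signed range.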
Nate Begeman [Fri, 11 Feb 2011 20:53:29 +0000 (20:53 +0000)]
Implement sdiv & udiv for <4 x i16> and <8 x i8> NEON vector types.
This avoids moving each element to the integer register file and calling __divsi3 etc. on it.
Nadav Rotem [Fri, 11 Feb 2011 19:20:37 +0000 (19:20 +0000)]
Fix #9190
The bug happens when the DAGCombiner attempts to optimize one of the patterns
of the SUB opcode. It tries to create a zero of type v2i64. This type is legal
on 32-bit machines, but the initializer of this vector (i64) is target dependent.
Currently, the initializer attempts to create an i64 zero constant, which fails.
Added a flag to tell the DAGCombiner to create a legal zero when the pass is
required to generate only legal types.
Douglas Gregor [Fri, 11 Feb 2011 18:13:20 +0000 (18:13 +0000)]
Poison the relational operators ==, !=, <, <=, >=, > on llvm::Optional
objects, since they'll end up using the implicit conversion to "bool"
and causing some very "fun" surprises.
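
A minimal sketch of the surprise, using a hypothetical `MiniOptional` rather than llvm::Optional itself. It assumes only that the class has an implicit conversion to bool and no comparison operators of its own.

```cpp
#include <cassert>

// Hypothetical stand-in for an optional type with an implicit bool conversion.
template <typename T>
class MiniOptional {
  T Value{};
  bool HasValue = false;

public:
  MiniOptional() = default;
  MiniOptional(const T &V) : Value(V), HasValue(true) {}
  operator bool() const { return HasValue; }
  const T &get() const { return Value; }
};

// Looks like a value comparison, but both operands are converted to bool
// first, so this only compares "has a value" against "has a value".
inline bool naiveEquals(const MiniOptional<int> &A, const MiniOptional<int> &B) {
  return A == B;
}
```

Here `naiveEquals(MiniOptional<int>(1), MiniOptional<int>(2))` is true because both sides hold a value. Poisoning the operators turns this kind of comparison into a build failure instead.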
Cameron Zwarich [Fri, 11 Feb 2011 06:08:28 +0000 (06:08 +0000)]
Make LoopUnswitch preserve ScalarEvolution by just forgetting everything about
a loop when unswitching it. It only does this in the complex case, because
everything should be fine already in the simple case.
Was compiled to:
vmov s0, r1
bic r0, r0, #-2147483648
vmov s1, r0
vcmpe.f32 s0, #0
vmrs apsr_nzcv, fpscr
it lt
vneglt.f32 s1, s1
vmov r0, s1
bx lr
This fails to copy the sign of -0.0f because it's lost during the float to int
conversion. Also, it's sub-optimal when the inputs are in GPR registers.
Now it uses integer and + or operations when it's profitable. And it's correct!
lsrs r1, r1, #31
bfi r0, r1, #31, #1
bx lr
rdar://8984306
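
The integer sequence above can be sketched in C++ as follows. This assumes `float` is IEEE-754 binary32; the helper names are hypothetical, and the comments map each step to the instructions shown.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret a float as its 32-bit pattern and back without undefined behavior.
inline uint32_t bitsOf(float F) {
  uint32_t B;
  std::memcpy(&B, &F, sizeof B);
  return B;
}

inline float floatOf(uint32_t B) {
  float F;
  std::memcpy(&F, &B, sizeof F);
  return F;
}

// Copy the sign of Sgn onto the magnitude of Mag using only integer operations.
inline float copySignViaInts(float Mag, float Sgn) {
  uint32_t M = bitsOf(Mag);
  uint32_t SignBit = bitsOf(Sgn) >> 31;                // lsrs r1, r1, #31
  uint32_t R = (M & 0x7FFFFFFFu) | (SignBit << 31);    // bfi  r0, r1, #31, #1
  return floatOf(R);
}
```

Because the sign is read straight from the bit pattern, a sign source of -0.0f yields a negative result, which a floating-point comparison against zero cannot detect.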
David Greene [Thu, 10 Feb 2011 23:11:29 +0000 (23:11 +0000)]
[AVX] Implement 256-bit vector lowering for SCALAR_TO_VECTOR. This
largely completes support for 128-bit fallback lowering for code that
is not 256-bit ready.
Chris Lattner [Thu, 10 Feb 2011 07:01:55 +0000 (07:01 +0000)]
switch the constantexpr, target folder, and IRBuilder interfaces
for NSW/NUW binops to follow the pattern of exact binops. This
allows someone to use Builder.CreateAdd(x, y, "tmp", MaybeNUW);
Chris Lattner [Thu, 10 Feb 2011 05:36:31 +0000 (05:36 +0000)]
Enhance a bunch of transformations in instcombine to start generating
exact/nsw/nuw shifts and have instcombine infer them when it can prove
that the relevant properties are true for a given shift without them.
Also, a variety of refactoring to use the new patternmatch logic thrown
in for good luck. I believe that this takes care of a bunch of related
code quality issues attached to PR8862.
Chris Lattner [Thu, 10 Feb 2011 05:23:05 +0000 (05:23 +0000)]
Enhance the "compare with shift" and "compare with div"
optimizations to be much more aggressive in the face of
exact/nsw/nuw div and shifts. For example, these (which
are the same except the first is 'exact' sdiv:
Chris Lattner [Thu, 10 Feb 2011 05:14:58 +0000 (05:14 +0000)]
A bunch of cleanups and simplifications using the new PatternMatch predicates
and generally tidying things up. Only very trivial functionality changes
like now doing (-1 - A) -> (~A) for vectors too.
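
The (-1 - A) -> (~A) identity can be checked on a single 32-bit lane. This is a sketch with a hypothetical helper name; the vector case applies the same identity to each element.

```cpp
#include <cassert>
#include <cstdint>

// In two's complement, -1 is the all-ones pattern, so subtracting A from it
// flips every bit of A without any borrow between bits.
inline bool minusOneMinusAIsNot(uint32_t A) {
  const uint32_t AllOnes = 0xFFFFFFFFu;  // bit pattern of -1
  return static_cast<uint32_t>(AllOnes - A) == static_cast<uint32_t>(~A);
}
```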
Chris Lattner [Thu, 10 Feb 2011 05:09:34 +0000 (05:09 +0000)]
teach SimplifyDemandedBits that exact shifts demand the bits they
are shifting out, since they require those bits to be zero. Similarly
for the NUW/NSW bits of shl.
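
A rough sketch of the rule for a logical right shift on 32-bit values. The function name is hypothetical and this is not the SimplifyDemandedBits code; it only models which input bits matter for a given set of demanded output bits.

```cpp
#include <cassert>
#include <cstdint>

// Which bits of X are demanded by `X >> ShAmt` (logical), given the bits of
// the result that the user demands. Requires ShAmt < 32.
inline uint32_t demandedInputBitsLShr(uint32_t DemandedOut, unsigned ShAmt,
                                      bool IsExact) {
  // Each demanded output bit comes from the input bit ShAmt positions higher.
  uint32_t In = DemandedOut << ShAmt;
  // An exact shift is only well defined when the bits shifted out are zero,
  // so those low bits are demanded as well.
  if (IsExact && ShAmt != 0)
    In |= 0xFFFFFFFFu >> (32 - ShAmt);
  return In;
}
```

For example, demanding only bit 0 of a shift by 4 needs input bit 4 for a plain shift, but input bits 0 through 4 when the shift is exact.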
Evan Cheng [Thu, 10 Feb 2011 02:20:55 +0000 (02:20 +0000)]
After 3-addressifying a two-address instruction, update the register maps; add a missing check when considering whether it's profitable to commute. rdar://8977508.
This fixes <rdar://problem/8869639>. I also filed PR9184 on doing this sort of
thing automatically, but it seems easier to just change the ordering of the
passes if this is the only case.
Jim Grosbach [Thu, 10 Feb 2011 00:08:28 +0000 (00:08 +0000)]
Do AsmMatcher operand classification per-opcode.
When matching operands for a candidate opcode match in the auto-generated
AsmMatcher, check each operand against the expected operand match class.
Previously, operands were classified independently of the opcode being
handled, which led to difficulties when operand match classes were
more complicated than simple subclass relationships.
Douglas Gregor [Wed, 9 Feb 2011 22:11:23 +0000 (22:11 +0000)]
Add llvm::sys::path::canonical(), which provides the canonicalized
name of a path, after resolving symbolic links and eliminating excess
path elements such as "foo/../" and "./".
This routine still needs a Windows implementation, but I don't have a
Windows machine available. Help? Please?
Shantonu Sen [Wed, 9 Feb 2011 21:03:19 +0000 (21:03 +0000)]
Fix comparator used for looking up previously instantiated EDDisassemblers.
Now, Syntax is only used as a tie-breaker if the Arch
matches. Previously, a request for x86_64 disassembler followed by the
i386 disassembler in a single process would return the cached x86_64
disassembler. Fixes <rdar://problem/8958982>
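
A minimal sketch of a cache key ordered this way. The types `Arch`, `Syntax` and `DisasmKey` and the string-valued cache are hypothetical stand-ins, not the EDDisassembler code.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>

enum class Arch { X86, X86_64 };
enum class Syntax { ATT, Intel };

// Hypothetical cache key: order by Arch first, and let Syntax break ties
// only when the Arch values are equal.
struct DisasmKey {
  Arch A;
  Syntax S;
  bool operator<(const DisasmKey &O) const {
    return std::tie(A, S) < std::tie(O.A, O.S);
  }
};

// Return the cached entry for the key, creating it under Name if absent.
inline std::string lookupOrCreate(std::map<DisasmKey, std::string> &Cache,
                                  Arch A, Syntax S, const std::string &Name) {
  auto It = Cache.find(DisasmKey{A, S});
  if (It != Cache.end())
    return It->second;
  Cache.emplace(DisasmKey{A, S}, Name);
  return Name;
}

// The scenario from the commit message: x86_64 first, then i386, one process.
inline std::string secondRequestResult() {
  std::map<DisasmKey, std::string> Cache;
  lookupOrCreate(Cache, Arch::X86_64, Syntax::ATT, "x86_64");
  return lookupOrCreate(Cache, Arch::X86, Syntax::ATT, "i386");
}
```

With a strict weak ordering over both fields, the second request gets its own entry instead of the cached x86_64 one.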
Chris Lattner [Wed, 9 Feb 2011 17:00:45 +0000 (17:00 +0000)]
Rework InstrTypes.h to reduce the repetition around the NSW/NUW/Exact
versions of creation functions. Eventually, the "insertion point" versions
of these should just be removed; we do have IRBuilder, after all.
Do a massive rewrite of much of pattern match. It is now shorter and less
redundant and has several other widgets I will be using in other patches.
Among other changes, m_Div is renamed to m_IDiv (since it only matches
integer divides) and m_Shift is gone (it used to match all binops!!) and
we now have m_LogicalShift for the one client to use.
Enhance IRBuilder to have "isExact" arguments to things like CreateUDiv
and reduce redundancy within IRBuilder by having these methods chain to
each other more instead of duplicating code.
Chris Lattner [Wed, 9 Feb 2011 16:46:02 +0000 (16:46 +0000)]
emit a specific error when the input file is empty. This fixes
an annoyance of mine when working on tests: if the input .ll file
is broken, opt outputs an error and generates an empty file. FileCheck
then emits its "ooh I couldn't find the first CHECK line, scanning
from ..." which obfuscates the actual problem.
Nick Lewycky [Wed, 9 Feb 2011 06:32:02 +0000 (06:32 +0000)]
When removing a function from the function set and adding it to deferred, we
could end up removing a different function than we intended because it was
functionally equivalent, then end up with a comparison of a function against
itself in the next round of comparisons (the one in the function set and the
one on the deferred list). To fix this, I introduce a choice in the form of
comparison for ComparableFunctions, either normal or "pointer only" used to
find exact Function*'s in lookups.
NAKAMURA Takumi [Wed, 9 Feb 2011 04:18:48 +0000 (04:18 +0000)]
lib/Support/Errno.cpp: Check strerror_s() with HAVE_DECL_STRERROR_S in config.h.*.
AC_CHECK_FUNCS seeks a symbol only in libs. We should check the declaration in string.h.
FIXME: I have never seen mingw(s) have strerror_s() (not _strerror_s()).
FIXME: Autoconf/CMake may seek strerror_s() with the definition MINGW_HAS_SECURE_API in future.
Evict a lighter single interference before attempting to split a live range.
Registers are not allocated strictly in spill weight order when live range
splitting and spilling has created new shorter intervals with higher spill
weights.
When one of the new heavy intervals conflicts with a single lighter interval,
simply evict the old interval instead of trying to split the heavy one.
The lighter interval is a better candidate for splitting because it has a
smaller use density.
Owen Anderson [Tue, 8 Feb 2011 22:39:40 +0000 (22:39 +0000)]
Revert both r121082 (which broke a bunch of constant pool stuff) and r125074 (which worked around it). This should get us back to the old, correct behavior, though it will make the integrated assembler unhappy for the time being.
David Greene [Tue, 8 Feb 2011 19:04:41 +0000 (19:04 +0000)]
[AVX] Implement BUILD_VECTOR lowering for 256-bit vectors. For
anything but the simplest of cases, lower a 256-bit BUILD_VECTOR by
splitting it into 128-bit parts and recombining.