David Greene [Thu, 10 Feb 2011 23:11:29 +0000 (23:11 +0000)]
[AVX] Implement 256-bit vector lowering for SCALAR_TO_VECTOR. This
largely completes support for 128-bit fallback lowering for code that
is not 256-bit ready.
Chris Lattner [Thu, 10 Feb 2011 07:01:55 +0000 (07:01 +0000)]
switch the constantexpr, target folder, and IRBuilder interfaces
for NSW/NUW binops to follow the pattern of exact binops. This
allows someone to use Builder.CreateAdd(x, y, "tmp", MaybeNUW);
Chris Lattner [Thu, 10 Feb 2011 05:36:31 +0000 (05:36 +0000)]
Enhance a bunch of transformations in instcombine to start generating
exact/nsw/nuw shifts and have instcombine infer them when it can prove
that the relevant properties are true for a given shift without them.
Also, a variety of refactoring to use the new patternmatch logic thrown
in for good luck. I believe that this takes care of a bunch of related
code quality issues attached to PR8862.
Chris Lattner [Thu, 10 Feb 2011 05:23:05 +0000 (05:23 +0000)]
Enhance the "compare with shift" and "compare with div"
optimizations to be much more aggressive in the face of
exact/nsw/nuw div and shifts. For example, these (which
are the same except the first is 'exact' sdiv:
Chris Lattner [Thu, 10 Feb 2011 05:14:58 +0000 (05:14 +0000)]
A bunch of cleanups and simplifications using the new PatternMatch predicates
and generally tidying things up. Only very trivial functionality changes
like now doing (-1 - A) -> (~A) for vectors too.
Chris Lattner [Thu, 10 Feb 2011 05:09:34 +0000 (05:09 +0000)]
teach SimplifyDemandedBits that exact shifts demand the bits they
are shifting out since they do require them to be zeros. Similarly
for NUW/NSW bits of shl
Evan Cheng [Thu, 10 Feb 2011 02:20:55 +0000 (02:20 +0000)]
After 3-addressifying a two-address instruction, update the register maps; add a missing check when considering whether it's profitable to commute. rdar://8977508.
This fixes <rdar://problem/8869639>. I also filed PR9184 on doing this sort of
thing automatically, but it seems easier to just change the ordering of the
passes if this is the only case.
Jim Grosbach [Thu, 10 Feb 2011 00:08:28 +0000 (00:08 +0000)]
Do AsmMatcher operand classification per-opcode.
When matching operands for a candidate opcode match in the auto-generated
AsmMatcher, check each operand against the expected operand match class.
Previously, operands were classified independently of the opcode being
handled, which led to difficulties when operand match classes were
more complicated than simple subclass relationships.
Douglas Gregor [Wed, 9 Feb 2011 22:11:23 +0000 (22:11 +0000)]
Add llvm::sys::path::canonical(), which provides the canonicalized
name of a path, after resolving symbolic links and eliminating excess
path elements such as "foo/../" and "./".
This routine still needs a Windows implementation, but I don't have a
Windows machine available. Help? Please?
Shantonu Sen [Wed, 9 Feb 2011 21:03:19 +0000 (21:03 +0000)]
Fix comparator used for looking up previously instantiated EDDisassemblers.
Now, Syntax is only used as a tie-breaker if the Arch
matches. Previously, a request for x86_64 disassembler followed by the
i386 disassembler in a single process would return the cached x86_64
disassembler. Fixes <rdar://problem/8958982>
Chris Lattner [Wed, 9 Feb 2011 17:00:45 +0000 (17:00 +0000)]
Rework InstrTypes.h so to reduce the repetition around the NSW/NUW/Exact
versions of creation functions. Eventually, the "insertion point" versions
of these should just be removed, we do have IRBuilder afterall.
Do a massive rewrite of much of pattern match. It is now shorter and less
redundant and has several other widgets I will be using in other patches.
Among other changes, m_Div is renamed to m_IDiv (since it only matches
integer divides) and m_Shift is gone (it used to match all binops!!) and
we now have m_LogicalShift for the one client to use.
Enhance IRBuilder to have "isExact" arguments to things like CreateUDiv
and reduce redundancy within IRbuilder by having these methods chain to
each other more instead of duplicating code.
Chris Lattner [Wed, 9 Feb 2011 16:46:02 +0000 (16:46 +0000)]
emit a specific error when the input file is empty. This fixes
an annoyance of mine when working on tests: if the input .ll file
is broken, opt outputs an error and generates an empty file. FileCheck
then emits its "ooh I couldn't find the first CHECK line, scanning
from ..." which obfuscates the actual problem.
Nick Lewycky [Wed, 9 Feb 2011 06:32:02 +0000 (06:32 +0000)]
When removing a function from the function set and adding it to deferred, we
could end up removing a different function than we intended because it was
functionally equivalent, then end up with a comparison of a function against
itself in the next round of comparisons (the one in the function set and the
one on the deferred list). To fix this, I introduce a choice in the form of
comparison for ComparableFunctions, either normal or "pointer only" used to
find exact Function*'s in lookups.
NAKAMURA Takumi [Wed, 9 Feb 2011 04:18:48 +0000 (04:18 +0000)]
lib/Support/Errno.cpp: Check strerror_s() with HAVE_DECL_STRERROR_S in config.h.*.
AC_CHECK_FUNCS seeks a symbol only in libs. We should check the declaration in string.h.
FIXME: I have never seen mingw(s) have strerror_s() (not _strerror_s()).
FIXME: Autoconf/CMake may seek strerror_s() with the definition MINGW_HAS_SECURE_API in future.
Evict a lighter single interference before attempting to split a live range.
Registers are not allocated strictly in spill weight order when live range
splitting and spilling has created new shorter intervals with higher spill
weights.
When one of the new heavy intervals conflicts with a single lighter interval,
simply evict the old interval instead of trying to split the heavy one.
The lighter interval is a better candidate for splitting, it has a smaller use
density.
Owen Anderson [Tue, 8 Feb 2011 22:39:40 +0000 (22:39 +0000)]
Revert both r121082 (which broke a bunch of constant pool stuff) and r125074 (which worked around it). This should get us back to the old, correct behavior, though it will make the integrated assembler unhappy for the time being.
David Greene [Tue, 8 Feb 2011 19:04:41 +0000 (19:04 +0000)]
[AVX] Implement BUILD_VECTOR lowering for 256-bit vectors. For
anything but the simplest of cases, lower a 256-bit BUILD_VECTOR by
splitting it into 128-bit parts and recombining.
Add SplitEditor::overlapIntv() to create small ranges where both registers are live.
If a live range is used by a terminator instruction, and that live range needs
to leave the block on the stack or in a different register, it can be necessary
to have both sides of the split live at the terminator instruction.
Example:
%vreg2 = COPY %vreg1
JMP %vreg1
Becomes after spilling %vreg2:
SPILL %vreg1
JMP %vreg1
The spill doesn't kill the register as is normally the case.
Andrew Trick [Tue, 8 Feb 2011 18:20:48 +0000 (18:20 +0000)]
Added bugpoint options: -compile-custom and -compile-command=...
I've been using this mode to narrow down llc unit tests. Example
custom compile script:
llc "$@"
not pygrep.py 'mul\s+r([0-9]), r\1,' < bugpoint-test-program.s
Andrew Trick [Tue, 8 Feb 2011 17:39:46 +0000 (17:39 +0000)]
Fix PostRA antidependence breaker.
Avoid using the same register for two def operands or and earlyclobber
def and use operand. This fixes PR8986 and improves on the prior fix
for rdar://problem/8959122.
Evan Cheng [Tue, 8 Feb 2011 03:07:03 +0000 (03:07 +0000)]
Temporary workaround for a bad bug introduced by r121082 which replaced
t2LDRpci with t2LDRi12.
There are a couple of problems with this.
1. The encoding for the literal and immediate constant are different.
Note bit 7 of the literal case is 'U' so it can be negative.
2. t2LDRi12 is now narrowed to tLDRpci before constant island pass is run.
So we end up never using the Thumb2 instruction, which ends up creating a
lot more constant islands.
Dan Gohman [Tue, 8 Feb 2011 00:55:13 +0000 (00:55 +0000)]
Don't split any loop backedges, including backedges of loops other than
the active loop. This is generally desirable, and it avoids trouble
in situations such as the testcase in PR9123, though the failure
mode depends on use-list order, so it is infeasible to test.
After uses of a live range are removed, recompute the live range to only cover
the remaining uses. This is necessary after rematerializing the value before
some (but not all) uses.
Remove the MCR asm parser hack and start using the custom target specific asm
parsing of operands introduced in r125030. As a small note, besides using a more
generic approach we can also have more descriptive output when debugging
llvm-mc, example:
Implement support for custom target specific asm parsing of operands.
Motivation: Improve the parsing of not usual (different from registers or
immediates) operand forms.
This commit implements only the generic support. The ARM specific modifications
will come next.
A table like the one below is autogenerated for every instruction
containing a 'ParserMethod' in its AsmOperandClass
A matcher function very similar (but lot more naive) to
MatchInstructionImpl scans the table. After the mnemonic match, the
features are checked and if the "to be parsed" operand index is
present in the mask, there's a real match. Then, a switch like the one
below dispatch the parsing to the custom method provided in
'ParseMethod':
case MCK_Coproc:
return TryParseCoprocessorOperandName(Operands);
Bob Wilson [Mon, 7 Feb 2011 17:43:21 +0000 (17:43 +0000)]
Add codegen support for using post-increment NEON load/store instructions.
The vld1-lane, vld1-dup and vst1-lane instructions do not yet support using
post-increment versions, but all the rest of the NEON load/store instructions
should be handled now.