Andrew Trick [Wed, 6 Nov 2013 02:08:26 +0000 (02:08 +0000)]
Rewrite SCEV's backedge taken count computation.
Patch by Michele Scandale!
Rewrite of the functions used to compute the backedge taken count of a
loop on LT and GT comparisons.
I decided to split the handling of LT and GT cases becasue the trick
"a > b == -a < -b" in some cases prevents the trip count computation
due to the multiplication by -1 on the two operands of the
comparison. This issue comes from the conservative computation of
value range of SCEVs: taking the negative SCEV of an expression that
have a small positive range (e.g. [0,31]), we would have a SCEV with a
fullset as value range.
Indeed, in the new rewritten function I tried to better handle the
maximum backedge taken count computation when MAX/MIN expression are
used to handle the cases where no entry guard is found.
Some test have been modified in order to check the new value correctly
(I manually check them and reasoning on possible overflow the new
values seem correct).
I finally added a new test case related to the multiplication by -1
issue on GT comparisons.
Remove another unused, and IMHO, not very desirable feature of ErrorOr.
One of the uses of the IsValid flag is to support default constructing
a ErrorOr that is not a Error or a Value. There is not much value in
doing that IMHO. If ErrorOr was to have a default constructor, it
should be implemented by default constructing the value, but even that
looks unnecessary.
The other use is to avoid calling destructors on moved objects. This
looks wrong. If the data being moved has non trivial treatment of
moves (an std::vector for example), it is its destructor that should
handle it, not ~ErrorOr.
With this change ErrorOr becomes a fairly simple wrapper and should
always be better than using an error_code + value in an API.
Andrew Trick [Tue, 5 Nov 2013 22:44:04 +0000 (22:44 +0000)]
Slightly change the way stackmap and patchpoint intrinsics are lowered.
MorphNodeTo is not safe to call during DAG building. It eagerly
deletes dependent DAG nodes which invalidates the NodeMap. We could
expose a safe interface for morphing nodes, but I don't think it's
worth it. Just create a new MachineNode and replaceAllUsesWith.
My understaning of the SD design has been that we want to support
early target opcode selection. That isn't very well supported, but
generally works. It seems reasonable to rely on this feature even if
it isn't widely used.
Reed Kotler [Tue, 5 Nov 2013 22:34:29 +0000 (22:34 +0000)]
Get rid of all references to soimm in MipsConstantIslands pass because
we don't have such an operand.
Suprisingly enough, this is never actually accounted for in the
ARM version when determining offset ranges. In both places there is the
comment:
- // FIXME: Make use full range of soimm values.
(soimm = shift operand immediate).
Tim Northover [Tue, 5 Nov 2013 21:36:02 +0000 (21:36 +0000)]
ARM: permit bare dmb/dsb/isb aliases on Cortex-M0
Cortex-M0 supports these 32-bit instructions despite being Thumb1 only
(mostly). We knew about that but not that the aliases without the default "sy"
operand were also permitted.
[objc-arc] Convert the one directional retain/release relation assert to a conditional check + fail.
Due to the previously added overflow checks, we can have a retain/release
relation that is one directional. This occurs specifically when we run into an
additive overflow causing us to drop state in only one direction. If that
occurs, we should bail and not optimize that retain/release instead of
asserting.
Apologies for the size of the testcase. It is necessary to cause the additive
cfg overflow to trigger.
Reed Kotler [Tue, 5 Nov 2013 08:14:14 +0000 (08:14 +0000)]
Fix r194019 as requested by Eric Christopher.
Submit the basic port of the rest of ARM constant islands code to Mips.
Two test cases are added which reflect the next level of functionality:
constants getting moved to water areas that are out of range from the
initial placement at the end of the function and basic blocks being split to
create water when none exists that can be used. There is a bunch of this
code that is not complete and has been marked with IN_PROGRESS. I will
finish cleaning this all up during the next week or two and submit the
rest of the test cases. I have elminated some code for dealing with
inline assembly because to me it unecessarily complicates things and
some of the newer features of llvm like function attributies and builtin
assembler give me better tools to solve the alignment issues created
there. Also, for Mips16 I even have the option of not doing constant
islands in the present of inline assembler if I chose. When everything
has been completed I will summarize the port and notify people that
are knowledgable regarding the ARM Constant Islands code so they can
review it in it's entirety if they wish.
Yuchen Wu [Tue, 5 Nov 2013 01:11:58 +0000 (01:11 +0000)]
Support for reading run counts in llvm-cov.
This patch enables llvm-cov to correctly output the run count stored in
the GCDA file. GCOVProfiling currently does not generate this
information, so the GCDA run data had to be hacked on from a GCDA file
generated by gcc. This is corrected by a subsequent patch.
With the run and program data included, both llvm-cov and gcov produced
the same output.
ErrorOr had quiet a bit of complexity and indirection to be able to hold a user
type with the error.
That feature is not used anymore. This patch removes it, it will live in svn
history if we ever need it again.
If we do need it again, IMHO there is one thing that should be done
differently: Holding extra info in the error is not a property a function also
returning a value or not. The ability to hold extra info should be in the error
type and ErrorOr templated over it so that we don't need the funny looking
ErrorOr<void>.
Hal Finkel [Tue, 5 Nov 2013 00:08:03 +0000 (00:08 +0000)]
Add a runtime unrolling parameter to the LoopUnroll pass constructor
As with the other loop unrolling parameters (the unrolling threshold, partial
unrolling, etc.) runtime unrolling can now also be controlled via the
constructor. This will be necessary for moving non-trivial unrolling late in
the pass manager (after loop vectorization).
Tim Northover [Mon, 4 Nov 2013 23:04:15 +0000 (23:04 +0000)]
ARM: remove unnecessary state-tracking during frame lowering.
ResolveFrameIndex had what appeared to be a very nasty hack for when the
frame-index referred to a callee-saved register. In this case it "adjusted" the
offset so that the address was correct if (and only if) the MachineInstr
immediately followed the respective push.
This "worked" for all forms of GPR & DPR but was only ever used to set the
frame pointer itself, and once this was put in a more sensible location the
entire state-tracking machinery it relied on became redundant. So I stripped
it.
The only wrinkle is that "add r7, sp, #0" might theoretically be slower (need
an actual ALU slot) compared to "mov r7, sp" so I added a micro-optimisation
that also makes emitARMRegUpdate and emitT2RegUpdate also work when NumBytes ==
0.
No test changes since there shouldn't be any functionality change.
Tim Northover [Mon, 4 Nov 2013 23:04:07 +0000 (23:04 +0000)]
AArch64: use default asm operand printing when modifier inapplicable
If an inline assembly operand has multiple constraints (e.g. "Ir" for immediate
or register) and an operand modifier (E.g. "w" for "print register as wN") then
we need to decide behaviour when the modifier doesn't apply to the constraint.
Previousely produced some combination of an assertion failure and a fatal
error. GCC's behaviour appears to be to ignore the modifier and print the
operand in the default way. This patch should implement that.
Reed Kotler [Mon, 4 Nov 2013 22:11:25 +0000 (22:11 +0000)]
Submit the basic port of the rest of ARM constant islands code to Mips.
Two test cases are added which reflect the next level of functionality:
constants getting moved to water areas that are out of range from the
initial placement at the end of the function and basic blocks being split to
create water when none exists that can be used. There is a bunch of this
code that is not complete and has been marked with IN_PROGRESS. I will
finish cleaning this all up during the next week or two and submit the
rest of the test cases. I have elminated some code for dealing with
inline assembly because to me it unecessarily complicates things and
some of the newer features of llvm like function attributies and builtin
assembler give me better tools to solve the alignment issues created
there. Also, for Mips16 I even have the option of not doing constant
islands in the present of inline assembler if I chose.
Matt Arsenault [Mon, 4 Nov 2013 20:36:06 +0000 (20:36 +0000)]
Scalarize select vector arguments when extracted.
When the elements are extracted from a select on vectors
or a vector select, do the select on the extracted scalars
from the input if there is only one use.
Change BitcodeReader to use error_code instead of bool + string.
In order to create an ObjectFile implementation that uses bitcode files, we
need to propagate the bitcode errors to the ObjectFile interface, so we need
to convert it to use the same error handling as ObjectFile: error_code.
Filip Pizlo [Mon, 4 Nov 2013 02:22:25 +0000 (02:22 +0000)]
Make the pretty stack trace be an opt-in, rather than opt-out, facility. Enable pretty
stack traces by default if you use PrettyStackTraceProgram, so that existing LLVM-based
tools will continue to get it without any changes.
Benjamin Kramer [Sun, 3 Nov 2013 12:27:52 +0000 (12:27 +0000)]
SLPVectorizer: When CSEing generated gathers only scan blocks containing them.
Instead of doing a RPO traversal of the whole function remember the blocks
containing gathers (typically <= 2) and scan them in dominator-first order.
The actual CSE is still quadratic, but I'm not confident that adding a
scoped hash table here is worth it as we're only looking at the generated
instructions and not arbitrary code.
Bob Wilson [Sun, 3 Nov 2013 06:48:38 +0000 (06:48 +0000)]
Convert calls to __sinpi and __cospi into __sincospi_stret
This adds an SimplifyLibCalls case which converts the special __sinpi and
__cospi (float & double variants) into a __sincospi_stret where appropriate to
remove duplicated work.
Filip Pizlo [Sun, 3 Nov 2013 00:29:47 +0000 (00:29 +0000)]
When LLVM is embedded in a larger application, it's not OK for LLVM to intercept crashes. LLVM already has
the ability to disable this functionality. This patch exposes it via the C API.
Benjamin Kramer [Sat, 2 Nov 2013 13:39:00 +0000 (13:39 +0000)]
LoopVectorize: Remove quadratic behavior the local CSE.
Doing this with a hash map doesn't change behavior and avoids calling
isIdenticalTo O(n^2) times. This should probably eventually move into a utility
class shared with EarlyCSE and the limited CSE in the SLPVectorizer.