This is along the same lines as r234832, but for `DILocation`. Clean
out all accessors from `DILocation`. Any callers should be using
`MDLocation` directly (e.g., via `operator->()`).
Completely gut `DIExpression`, turning it into a simple wrapper around
`MDExpression *`. There are two bits of magic left:
- It's constructed from `const MDExpression*` but convertible to
`MDExpression*`.
- It's default-constructed to `nullptr`.
Otherwise, it should behave quite like a raw pointer. Once I've done
the same to the rest of the `DIDescriptor` subclasses, I'll come back to
delete them entirely (and update call sites as necessary to deal with
the missing magic).
Philip Reames [Tue, 14 Apr 2015 00:41:34 +0000 (00:41 +0000)]
[RewriteStatepointsForGC] Delete dead code [NFC]
Before we had real liveness, we needed to track every value that base pointer
insertion code created because these now might be live. We now just rerun
the data flow liveness algorithm (which is actually faster!) and no longer
need the associated code.
As documented in PR23200 (and the FIXMEs I've added to the code here),
this logic is fairly broken: it modifies the `LLVMContext` in a way that
affects other modules and cannot be serialized to assembly/bitcode. For
now, move it over to `MDLocation::computeNewDiscriminators()` anyway.
I don't see a reason to add the `copyWithNewScope()` API over to
`MDLocation` -- it seems to be a holdover from when creating locations
required knowing details of operand layout -- so change
`AddDiscriminators` to call `MDLocation::get()` directly. Should be no
functionality change here.
There's only one user of the various `DIObjCProperty::is*Property()`
accessors -- `DwarfUnit::constructTypeDIE()` -- and it's just using the
reverse logic to reconstruct the bitfield. Drop this API and simplify
the only caller.
DebugInfo: Make DIDerivedType accessors more strict
These accessors in `DIDerivedType` should only be called when `DbgNode`
really is a `MDDerivedType`, not just a `MDDerivedTypeBase`. Assume
that it is.
Lang Hames [Mon, 13 Apr 2015 22:12:54 +0000 (22:12 +0000)]
[Orc] Add an Orc layer for applying arbitrary transforms to IR, use it to add
debugging output to the LLI orc-lazy JIT, and update the orc-lazy "hello.ll"
test to actually test for lazy compilation.
This is almost NFC, but I'm removing some assertions against `nullptr`.
The assertions aren't worth all that much since we'll typically get
segfaults at the same site (and I imagine ASan catches this sort of
thing), and besides: the whole idea is to replace the `DIDescriptor`
hierarchy with raw pointers to the new one.
SelectionDAG: Stop using DIVariable::isInlinedFnArgument()
Instead of calling the somewhat confusingly-named
`DIVariable::isInlinedFnArgument()`, do the check directly here.
There's possibly a small functionality change here: instead of
`dyn_cast<>`'ing `DV->getScope()` to `MDSubprogram`, I'm looking up the
scope chain for the actual subprogram. I suspect that this is a no-op
for function arguments so in practise there isn't a real difference.
I've also added a `FIXME` to check the `inlinedAt:` chain instead, since
I wonder if that would be more reliable than the
`MDSubprogram::describes()` function.
Since this was the only user of `DIVariable::isInlinedFnArgument()`,
delete it.
`DIGlobalVariable::getGlobal()` isn't really helpful, it just does a
`dyn_cast_or_null<>`. Simplify its only user by doing the cast directly
and delete the code.
StripSymbols: Use DIGlobalVariable::getConstant() instead of getGlobal()
The only difference between the two is a `dyn_cast<>` to
`GlobalVariable`. If optimizations have left anything behind when a
global gets replaced, then it doesn't seem like the debug info is dead.
I can't seem to find an optimization that would leave behind a
non-`GlobalVariable` without nulling the reference entirely, so I
haven't added a testcase (but I'll be deleting `getGlobal()` in a future
commit).
We use dummy calls to adjust the liveness of values over statepoints in the midst of the insertion. If there are no values which need held live, there's no point in actually inserting the holder.
DebugInfo: Migrate DISubprogram::describes() to new hierarchy, NFC
I don't really like this function at all -- I think it should be as
simple as `return getFunction() == F` -- but for now this seems like the
best we can do.
Reapply "Verifier: Check for incompatible bit piece expressions"
This reverts commit r234717, reapplying r234698 (in spirit).
As described in r234717, the original `Verifier` check had a
use-after-free. Instead of storing pointers to "interesting" debug info
intrinsics whose bit piece expressions should be verified once we have
typerefs, do a second traversal. I've added a testcase to catch the
`llc` crasher.
Original commit message:
Verifier: Check for incompatible bit piece expressions
Convert an assertion into a `Verifier` check. Bit piece expressions
must fit inside the variable, and mustn't be the entire variable.
Catching this in the verifier will help us find bugs sooner, and makes
`DIVariable::getSizeInBits()` dead code.
Philip Reames [Mon, 13 Apr 2015 18:07:21 +0000 (18:07 +0000)]
[RewriteStatepointsForGC] Fix a latent bug in normalization for invoke statepoint [NFC]
Since we're restructuring the CFG, we also need to make sure to update the analsis passes. While I'm touching the code, I dedicided to restructure it a bit. The code involved here was very confusing. This change moves the normalization to essentially being a pre-pass before the main insertion work and updates a few comments to actually say what is happening and *why*.
The restructuring should be covered by existing tests. I couldn't easily see how to create a test for the invalidation bug. Suggestions welcome.
Jan Vesely [Mon, 13 Apr 2015 17:47:15 +0000 (17:47 +0000)]
Revert revisions r234755, r234759, r234760
Revert "Remove default in fully-covered switch (to fix Clang -Werror -Wcovered-switch-default)"
Revert "R600: Add carry and borrow instructions. Use them to implement UADDO/USUBO"
Revert "LegalizeDAG: Try to use Overflow operations when expanding ADD/SUB"
Using overflow operations fails CodeGen/Generic/2011-07-07-ScheduleDAGCrash.ll
on hexagon, nvptx, and r600. Revert while I investigate.
Philip Reames [Mon, 13 Apr 2015 17:35:55 +0000 (17:35 +0000)]
[RewriteStatepointsForGC] Strengthen assertions around liveness
This is related to the issues addressed in 234651. These assertions check the properties ensured by that change at the place of use. Note that a similiar property is checked in checkBasicSSA, but without the reachability constraint. Technically, the liveness would be correct to include unreachable values, but this would be problematic for actual relocation.
Philip Reames [Mon, 13 Apr 2015 16:41:32 +0000 (16:41 +0000)]
[RewriteStatepointsForGC] Move an expensive debugging check to XDEBUG
The check in question is attempting to help find cases where we haven't relocated a pointer at a safepoint we should have. It does this by coercing the value to null at any safepoint which doesn't relocate it.
Unfortunately, this turns out to be rather expensive in terms of memory usage and time. The number of stores inserted can grow with O(number of values x number of statepoints). On at least one example I looked at, over half of peak memory usage was coming from this check.
With this change, the check is no longer enabled by default in Asserts builds. It is enabled for expensive asserts builds and has a command line option to enable it in both Asserts and non-Asserts builds.
Jan Vesely [Mon, 13 Apr 2015 16:26:00 +0000 (16:26 +0000)]
R600: Add carry and borrow instructions. Use them to implement UADDO/USUBO
v2: tighten the sub64 tests
v3: rename to CARRY/BORROW
v4: fixup test cmdline
add known bits computation
use sign extend instead of sub 0,x
better add test
v5: remove redundant break
move lowering to separate functions
fix comments
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewers: arsenm
David Blaikie [Mon, 13 Apr 2015 16:04:17 +0000 (16:04 +0000)]
[docs] Update outdated ExtendingLLVM.rst
Summary:
The document is still incomplete in some degrees, but updated to reflect the
latest changes. Anyway we can detail it if any one think it is not enough. For
the sake of it, some useful examples are listed below:
Refer to r113618 "Add X86 MMX type to bitcode and Type" for how to add a new
type.
> One notable change from then is only one thing that ``lib/VMCore`` is renamed
to ``lib/IR``.
Refer to r194760 "Add addrspacecast instruction" for how to add a new
instruction.
John Brawn [Mon, 13 Apr 2015 10:47:39 +0000 (10:47 +0000)]
[ARM] Align global variables passed to memory intrinsics
Fill in the TODO in CodeGenPrepare::OptimizeCallInst so that global
variables that are passed to memory intrinsics are aligned in the same
way that allocas are.
Revert "Verifier: Check for incompatible bit piece expressions"
This reverts commit r234698.
This caused a use-after-free: `QueuedBitPieceExpressions` holds onto
references to `DbgInfoIntrinsic`s and references them past where they're
deleted (this is because the verifier is run as a function pass, and
then `verifyTypeRefs()` is called during `doFinalization()`).
I'll include a reduced crasher for `llc` when I recommit the check.
Petr Hosek [Sun, 12 Apr 2015 23:42:25 +0000 (23:42 +0000)]
[MC] Write padding into fragments when -mc-relax-all flag is used
Summary:
When instruction bundling is enabled and the -mc-relax-all flag is
set, we can write bundle padding directly into fragments and avoid
creating large number of fragments significantly reducing LLVM MC
memory usage.
Hal Finkel [Sun, 12 Apr 2015 17:18:56 +0000 (17:18 +0000)]
[PowerPC] Really iterate over all loops in PPCLoopDataPrefetch/PPCLoopPreIncPrep
When I fixed these a couple of days ago to iterate over all loops, not just
depth == 1 loops, I inadvertently made it such that we'd only look at the first
top-level loop. Make sure that we really look at all of them.
Sanjoy Das [Sun, 12 Apr 2015 01:24:01 +0000 (01:24 +0000)]
[LoopUnrollRuntime] Clean up a predicate.
Clean up a predicate I added in r229731, fix the relevant comment and
add a test case. The earlier version is confusing to read and was also
buggy (probably not a coincidence) till Alexey fixed it in r233881.
Verifier: Check for incompatible bit piece expressions
Convert an assertion into a `Verifier` check. Bit piece expressions
must fit inside the variable, and mustn't be the entire variable.
Catching this in the verifier will help us find bugs sooner, and makes
`DIVariable::getSizeInBits()` dead code.
Add `DIBuilder::replaceTemporary()` as a replacement for
`DIDescriptor::replaceAllUsesWith()`. I'll update clang to use the new
method, and then come back to delete the original.
This method dispatches to `replaceAllUsesWith()` or
`replaceWithUniqued()`, depending on whether the replacement is actually
a different node from the original.
DebugInfo: Move DIScope::getName() and getContext() to MDScope
Continue gutting the `DIDescriptor` hierarchy. In this case, move the
guts of `DIScope::getName()` and `DIScope::getContext()` to
`MDScope::getName()` and `MDScope::getScope()`.
DebugInfo: Rewrite atSameLineAs() as MDLocation::canDiscriminate()
Rewrite `DILocation::atSameLineAs()` as `MDLocation::canDiscriminate()`
with a doxygen comment explaining its purpose. I've added a few FIXMEs
where I think this check is too weak; fixing that is tracked by PR23199.
DebugInfo: Add forwarding getFilename() accessor to new hierarchy
Add forwarding `getFilename()` and `getDirectory()` accessors to nodes
in the new hierarchy that define a `getFile()`. Use that to
re-implement existing functionality in the `DIDescriptor` hierarchy.
Hal Finkel [Sat, 11 Apr 2015 00:33:08 +0000 (00:33 +0000)]
[PowerPC] Fix PPCLoopPreIncPrep for depth > 1 loops
This pass had the same problem as the data-prefetching pass: it was only
checking for depth == 1 loops in practice. Fix that, add some debugging
statements, and make sure that, when we grab an AddRec, it is for the loop we
expect.
Ahmed Bougacha [Sat, 11 Apr 2015 00:06:36 +0000 (00:06 +0000)]
[CodeGen] Split -enable-global-merge into ARM and AArch64 options.
Currently, there's a single flag, checked by the pass itself.
It can't force-enable the pass (and is on by default), because it
might not even have been created, as that's the targets decision.
Instead, have separate explicit flags, so that the decision is
consistently made in the target.
Keep the flag as a last-resort "force-disable GlobalMerge" for now,
for backwards compatibility.
[AArch64] Strengthen the code for the prologue insertion.
The spilled registers are pristine and thus, correctly handled by
the register scavenger and so on, but the liveness information is
strictly speaking wrong at this point.
Fix that.
[WinEH] Recognize SEH finally block inserted by the frontend
This allows winehprepare to build sensible llvm.eh.actions calls for SEH
finally blocks. The pattern matching in this change is brittle and
should be replaced with something more robust soon. In the meantime,
this will let us write the code that produces __C_specific_handler xdata
tables, which we need regardless of how we decide to get finally blocks
through EH preparation.
Tim Northover [Fri, 10 Apr 2015 22:58:48 +0000 (22:58 +0000)]
Generic: Make isMask_N and isShiftedMask_N consistent over 0
Previously, isMask_N returned false for 0 but isShiftedMask_N returned true.
Almost all uses are for pattern matching bitfield operations in the backends,
and expect false (this was discovered because of AArch64's copy of this logic).
Unfortunately, I couldn't put together a small non-fragile test for this. The
nature of the bitfield operations means that this edge case is only really
triggered for nodes like "(and N, 0)", which the DAG combiner is usually very
good at folding away before they get to this stage.
Philip Reames [Fri, 10 Apr 2015 22:53:14 +0000 (22:53 +0000)]
[RewriteStatepointsForGC] Use an actual liveness algorithm
When rewriting statepoints to make relocations explicit, we need to have a conservative but consistent notion of where a particular pointer is live at a particular site. The old code just used dominance, which is correct, but decidedly more conservative then it needed to be. This patch implements a simple dataflow algorithm that's run one per function (well, twice counting fixup after base pointer insertion). There's still lots of room to make this faster, but it's fast enough for all practical purposes today.