Justin Bogner [Tue, 18 Nov 2014 05:00:52 +0000 (05:00 +0000)]
docs: Fix a couple of typo-ish errors in WritingAnLLVMPass
- Make CallGraphSCCPass's paragraph about doFinalization refer to
runOnSCC instead of runOnFunction, since that's what it's about.
- Fix a reference in the FunctionPass paragraph.
Frederic Riss [Tue, 18 Nov 2014 02:46:11 +0000 (02:46 +0000)]
Allow DwarfCompileUnit::constructImportedEntityDIE to instanciate a GlobalVariable DIE.
Usually global variables are in a retain list and instanciated before
any call to constructImportedEntityDIE is made. This isn't true for
forward declarations though.
The testcase for this change is generated by a clang patched to emit
such forward declarations (patch at http://reviews.llvm.org/D6173
which will land soon). The updated testcase tests more than just
global variables, it now tests every type of 'using' clause we
support.
IR: Move MDNode operands from the back to the front
Having the operands at the back prevents subclasses from safely adding
fields. Move them to the front.
Instead of replicating the custom `malloc()`, `free()` and `DestroyFlag`
logic that was there before, overload `new` and `delete`.
I added calls to a new `GenericMDNode::dropAllReferences()` in
`LLVMContextImpl::~LLVMContextImpl()`. There's a maze of callbacks
happening during teardown, and this resolves them before we enter
the destructors.
IR: Split MDNode into GenericMDNode and MDNodeFwdDecl
Split `MDNode` into two classes:
- `GenericMDNode`, which is uniquable (and for now, always starts
uniqued). Once `Metadata` is split from the `Value` hierarchy, this
class will lose the ability to RAUW itself.
- `MDNodeFwdDecl`, which is used for the "temporary" interface, is
never uniqued, and isn't managed by `LLVMContext` at all.
I've left most of the guts in `MDNode` for now, but I'll incrementally
move things to the right places (or delete the functionality, as
appropriate).
Change uniquing from a `FoldingSet` to a `DenseSet` with custom
`DenseMapInfo`. Unfortunately, this doesn't save any memory, since
`DenseSet<T>` is a simple wrapper for `DenseMap<T, char>`, but I'll come
back to fix that later.
I used the name `GenericDenseMapInfo` to the custom `DenseMapInfo` since
I'll be splitting `MDNode` into two classes soon: `MDNodeFwdDecl` for
temporaries, and `GenericMDNode` for everything else.
I also added a non-debug-info reduced version of a type-uniquing test
that started failing on an earlier draft of this patch.
David Blaikie [Mon, 17 Nov 2014 22:55:41 +0000 (22:55 +0000)]
Revert "Improve memory ownership/management in TableGen by unique_ptrifying TreePattern's Tree member."
This reverts commit r222183.
Broke on the MSVC buildbots due to MSVC not producing default move
operations - I'd fix it immediately but just broke my build system a
bit, so backing out until I have a chance to get everything going again.
David Blaikie [Mon, 17 Nov 2014 22:16:55 +0000 (22:16 +0000)]
Improve memory ownership/management in TableGen by unique_ptrifying TreePattern's Tree member.
The next step is to actually use unique_ptr in TreePatternNode's
Children vector. That will be more intrusive, and may not work,
depending on exactly how these things are handled (I have a bad
suspicion things are shared more than they should be, making this more
DAG than tree - but if it's really a tree, unique_ptr should suffice)
Matt Arsenault [Mon, 17 Nov 2014 21:11:37 +0000 (21:11 +0000)]
R600/SI: Don't copy flags when extracting subreg
This was resulting in use of a register after a kill.
For some reason this showed up as a problem in many tests
when moving the SIFixSGPRCopies pass closer to instruction
selection.
David Majnemer [Mon, 17 Nov 2014 11:27:45 +0000 (11:27 +0000)]
ScalarEvolution: Construct SCEVDivision's Derived type instead of itself
SCEVDivision::divide constructed an object of SCEVDivision<Derived>
instead of Derived. divide would call visit which would cast the
SCEVDivision<Derived> to type Derived. As it happens,
SCEVDivision<Derived> and Derived currently have the same layout but
this is fragile and grounds for UB.
Instead, just construct Derived. No functional change intended.
Oliver Stannard [Mon, 17 Nov 2014 11:18:10 +0000 (11:18 +0000)]
[Thumb1] Re-write emitThumbRegPlusImmediate
This was motivated by a bug which caused code like this to be
miscompiled:
declare void @take_ptr(i8*)
define void @test() {
%addr1.32 = alloca i8
%addr2.32 = alloca i32, i32 1028
call void @take_ptr(i8* %addr1)
ret void
}
This was emitting the following assembly to get the value of %addr1:
add r0, sp, #1020
add r0, r0, #8
However, "add r0, r0, #8" is not a valid Thumb1 instruction, and this
could not be assembled. The generated object file contained this,
resulting in r0 holding SP+8 rather tha SP+1028:
add r0, sp, #1020
add r0, sp, #8
This function looked like it could have caused miscompilations for
other combinations of registers and offsets (though I don't think it is
currently called with these), and the heuristic it used did not match
the emitted code in all cases.
David Majnemer [Mon, 17 Nov 2014 11:17:17 +0000 (11:17 +0000)]
Object, COFF: Tighten the object file parser
We were a little lax in a few areas:
- We pretended that import libraries were like any old COFF file, they
are not. In fact, they aren't really COFF files at all, we should
probably grow some specialized functionality to handle them smarter.
- Our symbol iterators were more than happy to attempt to go past the
end of the symbol table if you had a symbol with a bad list of
auxiliary symbols.
Oliver Stannard [Mon, 17 Nov 2014 10:49:31 +0000 (10:49 +0000)]
Fix optimisations of SELECT_CC which assumed result is boolean
Some optimisations in DAGCombiner cause miscompilations for targets that use
TargetLowering::UndefinedBooleanContent, because they assume that the results
of a SELECT_CC node are boolean values, and can be safely ANDed, ORed and
XORed. These optimisations are only valid for targets that use
ZeroOrOneBooleanContent or ZeroOrNegativeOneBooleanContent.
Yaron Keren [Mon, 17 Nov 2014 09:29:33 +0000 (09:29 +0000)]
silence gcc 4.9.1 warning in /llvm/lib/Support/Windows/Path.inc:564:39:
warning: suggest parentheses around assignment used as truth value [-Wparentheses]
if (ec = widenPath(path, path_utf16))
Erik Eckstein [Mon, 17 Nov 2014 09:13:57 +0000 (09:13 +0000)]
Optimize switch lookup tables with linear mapping.
This is a simple optimization for switch table lookup:
It computes the output value directly with an (optional) mul and add if there is a linear mapping between index and output.
Example:
int f1(int x) {
switch (x) {
case 0: return 10;
case 1: return 11;
case 2: return 12;
case 3: return 13;
}
return 0;
}
Craig Topper [Mon, 17 Nov 2014 05:50:14 +0000 (05:50 +0000)]
Move register class name strings to a single array in MCRegisterInfo to reduce static table size and number of relocation entries.
Indices into the table are stored in each MCRegisterClass instead of a pointer. A new method, getRegClassName, is added to MCRegisterInfo and TargetRegisterInfo to lookup the string in the table.
Rafael Espindola [Mon, 17 Nov 2014 02:28:27 +0000 (02:28 +0000)]
Add back r222061 with a fix.
This adds back r222061, but now calls initializePAEvalPass from the correct
library to avoid link problems.
Original message:
Don't make assumptions about the name of private global variables.
Private variables are can be renamed, so it is not reliable to make
decisions on the name.
The name is also dropped by the assembler before getting to the
linker, so using the name causes a disconnect between how llvm makes a
decision (var name) and how the linker makes a decision (section it is
in).
This patch changes one case where we were looking at the variable name to use
the section instead.
David Majnemer [Sun, 16 Nov 2014 20:35:19 +0000 (20:35 +0000)]
ScalarEvolution: Introduce SCEVSDivision and SCEVUDivision
It turns out that not all users of SCEVDivision want the same
signedness. Let the users determine which operation they'd like by
explicitly choosing SCEVUDivision or SCEVSDivision.
findArrayDimensions and computeAccessFunctions will use SCEVSDivision
while HowFarToZero will use SCEVUDivision.
Jingyue Wu [Sun, 16 Nov 2014 16:52:44 +0000 (16:52 +0000)]
[DependenceAnalysis] Allow subscripts of different types
Summary:
Several places in DependenceAnalysis assumes both SCEVs in a subscript pair
share the same integer type. For instance, isKnownPredicate calls
SE->getMinusSCEV(X, Y) which asserts X and Y share the same type. However,
DependenceAnalysis fails to ensure this assumption when producing a subscript
pair, causing tests such as NonCanonicalizedSubscript to crash. With this
patch, DependenceAnalysis runs unifySubscriptType before producing any
subscript pair, ensuring the assumption.
Test Plan:
Added NonCanonicalizedSubscript.ll on which DependenceAnalysis before the fix
crashed because subscripts have different types.
David Majnemer [Sun, 16 Nov 2014 07:30:35 +0000 (07:30 +0000)]
ScalarEvolution: HowFarToZero was wrongly using signed division
HowFarToZero was supposed to use unsigned division in order to calculate
the backedge taken count. However, SCEVDivision::divide performs signed
division. Unless I am mistaken, no users of SCEVDivision actually want
signed arithmetic: switch to udiv and urem.
David Majnemer [Sun, 16 Nov 2014 02:20:08 +0000 (02:20 +0000)]
InstSimplify: Optimize ICmpInst xform that uses computeKnownBits
A few things:
- computeKnownBits is relatively expensive, let's delay its use as long
as we can.
- Don't create two APInt values just to run computeKnownBits on a
ConstantInt, we already know the exact value!
- Avoid creating a temporary APInt value in order to calculate unary
negation.
This patch teaches the DAGCombiner how to combine shuffles according to rules:
shuffle(shuffle(A, Undef, M0), B, M1) -> shuffle(B, A, M2)
shuffle(shuffle(A, B, M0), B, M1) -> shuffle(B, A, M2)
shuffle(shuffle(A, B, M0), A, M1) -> shuffle(B, A, M2)
David Majnemer [Sat, 15 Nov 2014 02:03:59 +0000 (02:03 +0000)]
yaml2obj, COFF: Consider the DOS stub when laying out section headers
While this program worked correctly with small example programs, larger
ones tickled this bug. I'm working on a reduction because my program is
quite large.
Tom Stellard [Sat, 15 Nov 2014 01:07:53 +0000 (01:07 +0000)]
R600: Factor i64 UDIVREM lowering into its own fuction
This is so it could potentially be used by SI. However, the current
implementation does not always produce correct results, so the
IntegerDivisionPass is being used instead.
Now that `MDString` and `MDNode` have a common base class, use it. Note
that it's not useful to assume subclasses of `Metadata` must be one or
the other since we'll be adding more subclasses soon enough.
Reid Kleckner [Fri, 14 Nov 2014 23:31:07 +0000 (23:31 +0000)]
Rename EH related stuff to be more precise
Summary:
The current "WinEH" exception handling type is more about Itanium-style
LSDA tables layered on top of the Windows native unwind info format
instead of .eh_frame tables or EHABI unwind info. Use the name
"ItaniumWinEH" to better reflect the hybrid nature of the design.
Also rename isExceptionHandlingDWARF to usesItaniumLSDAForExceptions,
since the LSDA is part of the Itanium C++ ABI document, and not the
DWARF standard.
Rafael Espindola [Fri, 14 Nov 2014 23:17:47 +0000 (23:17 +0000)]
Don't make assumptions about the name of private global variables.
Private variables are can be renamed, so it is not reliable to make
decisions on the name.
The name is also dropped by the assembler before getting to the
linker, so using the name causes a disconnect between how llvm makes a
decision (var name) and how the linker makes a decision (section it is
in).
This patch changes one case where we were looking at the variable name to use
the section instead.
Tim Northover [Fri, 14 Nov 2014 22:45:33 +0000 (22:45 +0000)]
ARM: refactor .cfi_def_cfa_offset emission.
We use to track quite a few "adjusted" offsets through the FrameLowering code
to account for changes in the prologue instructions as we went and allow the
emission of correct CFA annotations. However, we were missing a couple of cases
and the code was almost impenetrable.
It's easier to just add any stack-adjusting instruction to a list and emit them
together.
Tim Northover [Fri, 14 Nov 2014 22:45:31 +0000 (22:45 +0000)]
ARM: correctly calculate the offset of FP in its push.
When we folded the DPR alignment gap into a push, we weren't noting the extra
distance from the beginning of the push to the FP, and so FP ended up pointing
at an incorrect offset.
The .cfi_def_cfa_offset directives are still wrong in this case, but I think
that can be improved by refactoring.
Tim Northover [Fri, 14 Nov 2014 22:45:23 +0000 (22:45 +0000)]
ARM: simplify test.
The test's DWARF stubs were there just to trigger the emission of .cfi
directives. Fortunately, the NetBSD ABI already demands proper DWARF unwind
info, so it's easier to just use that triple.
David Majnemer [Fri, 14 Nov 2014 21:21:15 +0000 (21:21 +0000)]
InstCombine: Fix infinite loop caused by visitFPTrunc
We would attempt to replace a fptrunc of an frem with an identical
fptrunc. This would cause the new fptrunc to be added to the worklist.
Of course, this results in an infinite loop because we will keep
visiting the newly created fptruncs.
Chad Rosier [Fri, 14 Nov 2014 21:09:13 +0000 (21:09 +0000)]
Reapply r221924: "[GVN] Perform Scalar PRE on gep indices that feed loads before
doing Load PRE"
This commit updates the failing test in
Analysis/TypeBasedAliasAnalysis/gvn-nonlocal-type-mismatch.ll
The failing test is sensitive to the order in which we process loads. This
version turns on the RPO traversal instead of the while DT traversal in GVN.
The new test code is functionally same just the order of loads that are
eliminated is swapped.
This new version also fixes an issue where GVN splits a critical edge and
potentially invalidate the RPO/DT iterator.
Bill Schmidt [Fri, 14 Nov 2014 21:05:45 +0000 (21:05 +0000)]
Change order of tablegen generated fast-isel instruction code to be
based on instruction complexity
The order that tablegen fast-isel instruction code is generated is
currently based on the text of the predicate (using string
less-than). This patch changes this to instead use the instruction
complexity. Because the complexities are not unique a C++ multimap is
used instead of a map.
This fixes the problem where code with no predicate always comes out
first (the empty string always compares as less than all other
strings) thus making the code with predicates dead code. See the FMUL
code in PPCFastISel.cpp for an example. It also more closely matches
the normal codegen ordering. Some error checking in the tablegen
fast-isel code is fixed as well.
Tom Stellard [Fri, 14 Nov 2014 20:43:26 +0000 (20:43 +0000)]
R600/SI: Fix spilling of m0 register
If we have spilled the value of the m0 register, then we need to restore
it with v_readlane_b32 to a regular sgpr, because v_readlane_b32 can't
write to m0.
Frederic Riss [Fri, 14 Nov 2014 20:33:40 +0000 (20:33 +0000)]
COFF: Add support for Dwarf accelerator tables.
This allows COFF targets to emit accelerator tables
when requested by -dwarf-accel-tables=Enable instead
of aborting. The test DebugInfo/cross-cu-inlining.ll
covers this on COFF platforms.
Frederic Riss [Fri, 14 Nov 2014 19:30:08 +0000 (19:30 +0000)]
[dwarfdump] Handle relocations in Dwarf accelerator tables
ELF targets (and maybe COFF) use relocations when referring
to strings in the .debug_str section. Handle that in the
accelerator table dumper. This commit restores the
test/DebugInfo/cross-cu-inlining.ll test to its expected
platform independant form, validating that the fix works
(this test failed on linux boxes).
Frederic Riss [Fri, 14 Nov 2014 17:08:18 +0000 (17:08 +0000)]
Tentatively appease the bots.
If this workaround gets the bots green, then we have to find out
why the -dwarf-accel-tables=Enable option doesn't work as
expected on non-darwin platforms.
Chad Rosier [Fri, 14 Nov 2014 17:08:15 +0000 (17:08 +0000)]
[Reassociate] Canonicalize operands of vector binary operators.
Prior to this commit fmul and fadd binary operators were being canonicalized for
both scalar and vector versions. We now canonicalize add, mul, and, or, and xor
vector instructions.
Frederic Riss [Fri, 14 Nov 2014 16:15:53 +0000 (16:15 +0000)]
Reapply "[dwarfdump] Add support for dumping accelerator tables."
This reverts commit r221842 which was a revert of r221836 and of the
test parts of r221837.
This new version fixes an UB bug pointed out by David (along with
addressing some other review comments), makes some dumping more
resilient to broken input data and forces the accelerator tables
to be dumped in the tests where we use them (this decision is
platform specific otherwise).