The issue was that linkAppendingVarProto does the full linking job, including
deleting the old dst variable. The fix is just to call it and return early
if we have a GV with appending linkage.
original message:
Refactor duplicated code in liking GlobalValues.
There is quiet a bit of logic that is common to any GlobalValue but was
duplicated for Functions, GlobalVariables and GlobalAliases.
While at it, merge visibility even when comdats are used, fixing pr21415.
Revert r221014: "Refactor duplicated code in liking GlobalValues."
This commit introduces heap-use-after-free detected by ASan. Here is the output
for one of several tests that detect it:
******************** TEST 'LLVM :: Linker/AppendingLinkage.ll' FAILED ********************
Command Output (stderr):
--
=================================================================
==2122==ERROR: AddressSanitizer: heap-use-after-free on address 0x60c00000b9c8 at pc 0x0000005d05d1 bp 0x7fff64ed27c0 sp 0x7fff64ed27b8
READ of size 4 at 0x60c00000b9c8 thread T0
#0 0x5d05d0 in llvm::GlobalValue::setUnnamedAddr(bool) /usr/local/google/home/chandlerc/src/llvm/build/../include/llvm/IR/GlobalValue.h:115:35
#1 0x69fff1 in (anonymous namespace)::ModuleLinker::linkGlobalValueProto(llvm::GlobalValue*) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1041:5
#2 0x697229 in (anonymous namespace)::ModuleLinker::run() /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1485:9
#3 0x696542 in llvm::Linker::linkInModule(llvm::Module*) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1621:10
#4 0x4a2db7 in main /usr/local/google/home/chandlerc/src/llvm/build/../tools/llvm-link/llvm-link.cpp:116:9
#5 0x7f4ae61e5ec4 in __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287
#6 0x41eb71 in _start (/usr/local/google/home/chandlerc/src/llvm/build/bin/llvm-link+0x41eb71)
0x60c00000b9c8 is located 72 bytes inside of 128-byte region [0x60c00000b980,0x60c00000ba00)
freed by thread T0 here:
#0 0x4a1e6b in operator delete(void*) /usr/local/google/home/chandlerc/src/llvm/opt-build/../projects/compiler-rt/lib/asan/asan_new_delete.cc:94:3
#1 0x5d1a7a in llvm::iplist<llvm::GlobalVariable, llvm::ilist_traits<llvm::GlobalVariable> >::erase(llvm::ilist_iterator<llvm::GlobalVariable>) /usr/local/google/home/chandlerc/src/llvm/build/../inclu
de/llvm/ADT/ilist.h:466:5
#2 0x5d1980 in llvm::GlobalVariable::eraseFromParent() /usr/local/google/home/chandlerc/src/llvm/build/../lib/IR/Globals.cpp:204:3
#3 0x6a8a4d in (anonymous namespace)::ModuleLinker::linkAppendingVarProto(llvm::GlobalVariable*, llvm::GlobalVariable const*) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.
cpp:980:3
#4 0x6a7403 in (anonymous namespace)::ModuleLinker::linkGlobalVariableProto(llvm::GlobalVariable const*, llvm::GlobalValue*, bool) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkMod
ules.cpp:1074:11
#5 0x69ff4e in (anonymous namespace)::ModuleLinker::linkGlobalValueProto(llvm::GlobalValue*) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1028:13
#6 0x697229 in (anonymous namespace)::ModuleLinker::run() /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1485:9
#7 0x696542 in llvm::Linker::linkInModule(llvm::Module*) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1621:10
#8 0x4a2db7 in main /usr/local/google/home/chandlerc/src/llvm/build/../tools/llvm-link/llvm-link.cpp:116:9
#9 0x7f4ae61e5ec4 in __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287
previously allocated by thread T0 here:
#0 0x4a192b in operator new(unsigned long) /usr/local/google/home/chandlerc/src/llvm/opt-build/../projects/compiler-rt/lib/asan/asan_new_delete.cc:62:35
#1 0x61d85c in llvm::User::operator new(unsigned long, unsigned int) /usr/local/google/home/chandlerc/src/llvm/build/../lib/IR/User.cpp:57:19
#2 0x6a7525 in (anonymous namespace)::ModuleLinker::linkGlobalVariableProto(llvm::GlobalVariable const*, llvm::GlobalValue*, bool) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkMod
ules.cpp:1100:3
#3 0x69ff4e in (anonymous namespace)::ModuleLinker::linkGlobalValueProto(llvm::GlobalValue*) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1028:13
#4 0x697229 in (anonymous namespace)::ModuleLinker::run() /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1485:9
#5 0x696542 in llvm::Linker::linkInModule(llvm::Module*) /usr/local/google/home/chandlerc/src/llvm/build/../lib/Linker/LinkModules.cpp:1621:10
#6 0x4a2db7 in main /usr/local/google/home/chandlerc/src/llvm/build/../tools/llvm-link/llvm-link.cpp:116:9
#7 0x7f4ae61e5ec4 in __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287
SUMMARY: AddressSanitizer: heap-use-after-free /usr/local/google/home/chandlerc/src/llvm/build/../include/llvm/IR/GlobalValue.h:115 llvm::GlobalValue::setUnnamedAddr(bool)
Shadow bytes around the buggy address:
0x0c187fff96e0: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
0x0c187fff96f0: 00 00 00 00 00 00 00 fa fa fa fa fa fa fa fa fa
0x0c187fff9700: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa
0x0c187fff9710: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
0x0c187fff9720: 00 00 00 00 00 00 00 00 fa fa fa fa fa fa fa fa
=>0x0c187fff9730: fd fd fd fd fd fd fd fd fd[fd]fd fd fd fd fd fd
0x0c187fff9740: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
0x0c187fff9750: fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa fa
0x0c187fff9760: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x0c187fff9770: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
0x0c187fff9780: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Heap right redzone: fb
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack partial redzone: f4
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
ASan internal: fe
==2122==ABORTING
David Blaikie [Sun, 2 Nov 2014 08:51:37 +0000 (08:51 +0000)]
Add DwarfUnit::isDwoUnit and use it to generalize string creation
Currently we only need to emit skeleton strings into the CU header and
we do this by explicitly calling "addLocalString". With gmlt-in-fission,
we'll be emitting a bunch of other strings from other codepaths where
it's not statically known that these strings will be local or not.
Introduce a virtual function to indicate whether this unit is a DWO unit
or not (I'm not sure if we have a good term for this, the
opposite/alternative to 'skeleton' unit) and use that to generalize the
string emission logic so that strings can be correctly emitted in both
the skeleton and dwo unit when in split dwarf mode.
And to demonstrate that this works, switch the existing special callers
of addLocalString in the skeleton builder to addString - and they still
work. Yay.
David Blaikie [Sun, 2 Nov 2014 08:18:06 +0000 (08:18 +0000)]
Remove the last mention of LineTablesOnly from DwarfUnit, sinking it into DwarfCompileUnit
This is a useful distinction/invariant/delination to make because
LineTablesOnly mode is never relevant to type units, so it's clear that
we're not doing weird line-tables-only-with-types by making this API
choice.
It also lays the foundations nicely for adding gmlt-like data to fission
skeleton CUs while limiting the effects to CUs and not TUs.
Use Alias Analysis to hoist 2 loads from diamond to the common predecessor basic block.
Alias Analysis allows to detect real barriers for load hoisting.
David Blaikie [Sun, 2 Nov 2014 06:06:14 +0000 (06:06 +0000)]
Add DwarfUnit::addGlobalType to match DwarfUnit::addGlobalName
(these will shortly become virtual, with a null implementation in
DwarfUnit (since type units don't have accelerator tables in the current
schema) and the current implementation down in DwarfCompileUnit, moving
the actual maps there too)
NAKAMURA Takumi [Sun, 2 Nov 2014 04:43:54 +0000 (04:43 +0000)]
Revert r221056 and others, "[mips] Move F128 argument handling into MipsCCState as we did for returns. NFC."
r221056 "[mips] Move F128 argument handling into MipsCCState as we did for returns. NFC."
r221058 "[mips] Fix unused variable warning introduced in r221056"
r221059 "[mips] Move all ByVal handling into CCState and tablegen-erated code. NFC."
r221061 "Renamed CCState members that appear to misspell 'Processed' as 'Proceed'. NFC."
It cuased an undefined behavior in LLVM :: CodeGen/Mips/return-vector.ll.
David Blaikie [Sun, 2 Nov 2014 02:26:24 +0000 (02:26 +0000)]
Don't bother creating LabelBegin for .dwo units
This would help catch cases where we might otherwise try to reference a
dwo CU label, which would be weird - because without relocations in the
dwo file it's not generally meaningful to talk about the CU offsets
there (or, if it is, we can do so in absolute terms without using a
relocation to compute it).
David Blaikie [Sat, 1 Nov 2014 23:07:14 +0000 (23:07 +0000)]
Remove DwarfUnit::LabelEnd in favor of computing the length of the section directly
This was a compile-unit specific label (unused in type units) and seems
unnecessary anyway when we can more easily directly compute the size of
the compile unit.
Daniel Sanders [Sat, 1 Nov 2014 19:17:10 +0000 (19:17 +0000)]
[mips] Move all ByVal handling into CCState and tablegen-erated code. NFC.
Summary:
CCState already contains a byval implementation that is very similar to the
Mips custom code. This patch merges the custom code into the existing
common code and tablegen-erated code.
Daniel Sanders [Sat, 1 Nov 2014 18:38:03 +0000 (18:38 +0000)]
[mips] Move F128 argument handling into MipsCCState as we did for returns. NFC.
Summary:
There are a couple more changes to make before analyzeFormalArguments can
be merged into the standard AnalyzeFormalArguments. I've had to temporarily
poke a couple holes in MipsCCState's encapsulation to save having to make
all the required changes for this merge all at once*. These will be removed
shortly.
* We must merge our ByVal argument handling with the implementation in CCState.
This will be done over the next three patches, then the fourth will merge
analyzeFormalArguments with AnalyzeFormalArguments.
David Blaikie [Sat, 1 Nov 2014 18:18:07 +0000 (18:18 +0000)]
Sink DwarfUnit::Skeleton down into DwarfCompileUnit
Type units no longer have skeletons and it's misleading to be able to
query for a type unit's skeleton (it might incorrectly lead one to
conclude that if a unit doesn't have a skeleton it's not in a .dwo
file... ).
Daniel Sanders [Sat, 1 Nov 2014 18:13:52 +0000 (18:13 +0000)]
[mips] Remove MipsCC::CCInfo. NFC.
Summary:
It's now passed in as an argument to functions that need it. Eventually
this argument will be replaced by the 'this' pointer for a MipsCCState
object.
Daniel Sanders [Sat, 1 Nov 2014 17:44:51 +0000 (17:44 +0000)]
[mips] Removed MipsCC::fixedArgFn(). NFC
Summary:
There is one remaining trace of it in MipsCC::analyzeCallOperands() where
Mips16 might override the calling convention. This will moved into
tablegen-erated code later.
Daniel Sanders [Sat, 1 Nov 2014 17:38:22 +0000 (17:38 +0000)]
[tablegen] Add CustomCallingConv and use it to tablegen-erate the outermost parts of the Mips O32 implementation
Summary:
CustomCallingConv is simply a CallingConv that tablegen should not generate the
implementation for. It allows regular CallingConv's to delegate to these custom
functions. This is (currently) necessary for Mips and we cannot use CCCustom
without having to adapt to the different API that CCCustom uses.
This brings us a bit closer to being able to remove
MipsCC::analyzeCallOperands and MipsCC::analyzeFormalArguments in favour of
the common implementation.
David Blaikie [Sat, 1 Nov 2014 17:21:26 +0000 (17:21 +0000)]
Sink DwarfDebug::AbstractSPDies down into DwarfFile
This is the first big step to allowing gmlt-like inline scope
information in the skeleton CU. While this commit doesn't change the
functionality, it's only a small step to call
"constructAbstractSubprogramDIE" on both the InfoHolder and the
SkeletonHolder (when in use) and that will at least create the abstract
SP dies in that case, though still not creating the other subprograms.
This removes calls to isMaterializable in the following cases:
* It was redundant with a call to isDeclaration now that isDeclaration returns
the correct answer for materializable functions.
* It was followed by a call to Materialize. Just call Materialize and check EC.
NAKAMURA Takumi [Sat, 1 Nov 2014 01:36:14 +0000 (01:36 +0000)]
Revert r220779, "[AVX512] Removed special case for cmp instructions in getVectorMaskingNode. Now cmp intrinsics lower as other intrinsics through VSELECT, and then VSELECT tranforms to AND in PerformSELECTCombine."
Since r221028 (reverting r220777), this caused failures.
Diego Novillo [Sat, 1 Nov 2014 00:56:55 +0000 (00:56 +0000)]
Add show and merge tools for sample PGO profiles.
Summary:
This patch extends the 'show' and 'merge' commands in llvm-profdata to handle
sample PGO formats. Using the 'merge' command it is now possible to convert
one sample PGO format to another.
The only format that is currently not working is 'gcc'. I still need to
implement support for it in lib/ProfileData.
The changes in the sample profile support classes are needed for the
merge operation.
Adrian Prantl [Sat, 1 Nov 2014 00:26:59 +0000 (00:26 +0000)]
Temporarily revert r220777 to sort out build bot breakage.
"[x86] Simplify vector selection if condition value type matches vselect value type and true value is all ones or false value is all zeros."
Add `Instruction::getMDNode()` that casts to `MDNode` before changing
`Instruction::getMetadata()` to return `Value`. This avoids adding
`cast_or_null<MDNode>` boiler-plate throughout the code.
Reid Kleckner [Fri, 31 Oct 2014 23:19:46 +0000 (23:19 +0000)]
Work around bugs in MSVC "14" CTP 3's conversion logic
It appears to ignore or find ambiguous MachineInstrBuilder's conversion
operators that allow conversion to MachineInstr* and
MachineBasicBlock::bundle_iterator.
As a workaround, add an explicit way to get the MachineInstr.
Reid Kleckner [Fri, 31 Oct 2014 22:55:57 +0000 (22:55 +0000)]
Suppress MSVC's equivalent of -Wshadow warnings
IMO we need to clean up some of these, but the member variable one
(C4458) has false positives on static methods. It is currently firing
on Twine, which has a static method like:
struct Twine {
uintptr_t LHS, RHS;
static void staticMethod() {
// warning C4458: declaration of 'LHS' hides class member
uintptr_t LHS;
...
}
};
We should fix up clang's -Wshadow and clean it up, and then we can
re-enable some of these MSVC warnings.
Lang Hames [Fri, 31 Oct 2014 21:37:49 +0000 (21:37 +0000)]
[Object] Modify OwningBinary's interface to separate inspection from ownership.
The getBinary and getBuffer method now return ordinary pointers of appropriate
const-ness. Ownership is transferred by calling takeBinary(), which returns a
pair of the Binary and a MemoryBuffer.
Tom Stellard [Fri, 31 Oct 2014 20:52:04 +0000 (20:52 +0000)]
R600: Don't promote allocas when one of the users is a ptrtoint instruction
We need to figure out how to track ptrtoint values all the
way until result is converted back to a pointer in order
to correctly rewrite the pointer type.
Bill Schmidt [Fri, 31 Oct 2014 19:19:07 +0000 (19:19 +0000)]
[PowerPC] Initial VSX intrinsic support, with min/max for vector double
Now that we have initial support for VSX, we can begin adding
intrinsics for programmer access to VSX instructions. This patch adds
basic support for VSX intrinsics in general, and tests it by
implementing intrinsics for minimum and maximum for the vector double
data type.
The LLVM portion of this is quite straightforward. There is a
companion patch for Clang.
Quentin Colombet [Fri, 31 Oct 2014 17:52:53 +0000 (17:52 +0000)]
[CodeGenPrepare] Move extractelement close to store if they can be combined.
This patch adds an optimization in CodeGenPrepare to move an extractelement
right before a store when the target can combine them.
The optimization may promote any scalar operations to vector operations in the
way to make that possible.
** Context **
Some targets use different register files for both vector and scalar operations.
This means that transitioning from one domain to another may incur copy from one
register file to another. These copies are not coalescable and may be expensive.
For example, according to the scheduling model, on cortex-A8 a vector to GPR
move is 20 cycles.
** Motivating Example **
Let us consider an example:
define void @foo(<2 x i32>* %addr1, i32* %dest) {
%in1 = load <2 x i32>* %addr1, align 8
%extract = extractelement <2 x i32> %in1, i32 1
%out = or i32 %extract, 1
store i32 %out, i32* %dest, align 4
ret void
}
As it is, this IR generates the following assembly on armv7:
vldr d16, [r0] @vector load
vmov.32 r0, d16[1] @ cross-register-file copy: 20 cycles
orr r0, r0, #1 @ scalar bitwise or
str r0, [r1] @ scalar store
bx lr
Whereas we could generate much faster code:
vldr d16, [r0] @ vector load
vorr.i32 d16, #0x1 @ vector bitwise or
vst1.32 {d16[1]}, [r1:32] @ vector extract + store
bx lr
Half of the computation made in the vector is useless, but this allows to get
rid of the expensive cross-register-file copy.
** Proposed Solution **
To avoid this cross-register-copy penalty, we promote the scalar operations to
vector operations. The penalty will be removed if we manage to promote the whole
chain of computation in the vector domain.
Currently, we do that only when the chain of computation ends by a store and the
target is able to combine an extract with a store.
Stores are the most likely candidates, because other instructions produce values
that would need to be promoted and so, extracted as some point[1]. Moreover,
this is customary that targets feature stores that perform a vector extract (see
AArch64 and X86 for instance).
The proposed implementation relies on the TargetTransformInfo to decide whether
or not it is beneficial to promote a chain of computation in the vector domain.
Unfortunately, this interface is rather inaccurate for this level of details and
although this optimization may be beneficial for X86 and AArch64, the inaccuracy
will lead to the optimization being too aggressive.
Basically in TargetTransformInfo, everything that is legal has a cost of 1,
whereas, even if a vector type is legal, usually a vector operation is slightly
more expensive than its scalar counterpart. That will lead to too many
promotions that may not be counter balanced by the saving of the
cross-register-file copy. For instance, on AArch64 this penalty is just 4
cycles.
For now, the optimization is just enabled for ARM prior than v8, since those
processors have a larger penalty on cross-register-file copies, and the scope is
limited to basic blocks. Because of these two factors, we limit the effects of
the inaccuracy. Indeed, I did not want to build up a fancy cost model with block
frequency and everything on top of that.
[1] We can imagine targets that can combine an extractelement with other
instructions than just stores. If we want to go into that direction, the current
interfaces must be augmented and, moreover, I think this becomes a global isel
problem.
David Blaikie [Fri, 31 Oct 2014 17:02:30 +0000 (17:02 +0000)]
Update the non-pthreads fallback for RWMutex on Unix
Tested this by #if 0'ing out the pthreads implementation, which
indicated that this fallback was not currently compiling successfully
and applying this patch resolves that.
Bradley Smith [Fri, 31 Oct 2014 11:40:32 +0000 (11:40 +0000)]
[SCEV] Improve Scalar Evolution's use of no {un,}signed wrap flags
In a case where we have a no {un,}signed wrap flag on the increment, if
RHS - Start is constant then we can avoid inserting a max operation bewteen
the two, since we can statically determine which is greater.
Ulrich Weigand [Fri, 31 Oct 2014 10:33:14 +0000 (10:33 +0000)]
[PowerPC] Load BlockAddress values from the TOC in 64-bit SVR4 code
Since block address values can be larger than 2GB in 64-bit code, they
cannot be loaded simply using an @l / @ha pair, but instead must be
loaded from the TOC, just like GlobalAddress, ConstantPool, and
JumpTable values are.
The commit also fixes a bug in PPCLinuxAsmPrinter::doFinalization where
temporary labels could not be used as TOC values, since code would
attempt (and fail) to use GetOrCreateSymbol to create a symbol of the
same name as the temporary label.
Peter Zotov [Fri, 31 Oct 2014 09:05:36 +0000 (09:05 +0000)]
[OCaml] Rework Llvm_executionengine using ctypes.
Since JIT->MCJIT migration, most of the ExecutionEngine interface
became deprecated and/or broken. This especially affected the OCaml
bindings, as runFunction is no longer available, and unlike in C,
it is not possible to coerce a pointer to a function and call it
in OCaml.
In practice, LLVM 3.5 shipped completely unusable
Llvm_executionengine.
The GenericValue interface and runFunction were essentially
a poor man's FFI. As such, this interface was removed and instead
a dependency on ctypes >=0.3 added, which handled platform-specific
aspects of accessing data and calling functions.
The new interface does not expose JIT (which is a shim around MCJIT),
as well as the interpreter (which can't handle a lot of valid IR).
Llvm_executionengine.add_global_mapping is currently unusable
due to PR20656.