Lang Hames [Wed, 28 Jan 2015 01:30:37 +0000 (01:30 +0000)]
Revert r227247 and r227228: "Add weak symbol support to RuntimeDyld".
This has wider implications than I expected when I reviewed the patch: It can
cause JIT crashes where clients have used the default value for AbortOnFailure
during symbol lookup. I'm currently investigating alternative approaches and I
hope to have this back in tree soon.
Zachary Turner [Wed, 28 Jan 2015 01:22:33 +0000 (01:22 +0000)]
[llvm-pdbdump] Add basic symbol dumping.
This adds two command line options:
--symbols dumps a list of all symbols found in the PDB.
--symbol-details dumps the same list, but with detailed information
for every symbol such as type, attributes, etc.
Zachary Turner [Wed, 28 Jan 2015 00:33:00 +0000 (00:33 +0000)]
[llvm-pdbdump] Add support for printing source files and compilands.
This adds two command line options to llvm-pdbdump.
--source-files prints a flat list of all source files in the PDB.
--compilands prints a list of all compilands (e.g. object files)
that the PDB knows about, and for each one, a list of
source files that the compiland is composed of as well
as a hash of the original source file.
This commit creates infinite loop in DAG combine for in the LLVM test-suite
for aarch64 with mcpu=cylcone (just having neon may be enough to expose this).
COMDATs must be identically named to the symbol. When support for COMDATs was
introduced, the symbol rewriter was not updated, resulting in rewriting failing
for symbols which were placed into COMDATs. This corrects the behaviour and
adds test cases for this.
Zachary Turner [Tue, 27 Jan 2015 22:40:14 +0000 (22:40 +0000)]
Add support for dumping debug tables to llvm-pdbdump.
PDB stores some of its data in streams and some in tables.
This patch teaches llvm-pdbdump to dump basic summary data
for the debug tables.
In support of this, this patch also adds some DIA helper
classes, such as a wrapper around an IDiaSymbol interface,
as well as helpers for outputting various enumerations to
a raw_ostream.
Summary:
A simple genetic in-process coverage-guided fuzz testing library.
I've used this fuzzer to test clang-format
(it found 12+ bugs, thanks djasper@ for the fixes!)
and it may also help us test other parts of LLVM.
So why not keep it in the LLVM repository?
I plan to add the cmake build rules later (in a separate patch, if that's ok)
and also add a clang-format-fuzzer target.
Ahmed Bougacha [Tue, 27 Jan 2015 21:52:16 +0000 (21:52 +0000)]
[SimplifyLibCalls] Don't confuse strcpy_chk for stpcpy_chk.
This was introduced in a faulty refactoring (r225640, mea culpa):
the tests weren't testing the return values, so, for both
__strcpy_chk and __stpcpy_chk, we would return the end of the
buffer (matching stpcpy) instead of the beginning (for strcpy).
The root cause was the prefix "__" being ignored when comparing,
which made us always pick LibFunc::stpcpy_chk.
Pass the LibFunc::Func directly to avoid this kind of error.
Also, make the testcases as explicit as possible to prevent this.
The now-useful testcases expose another, entangled, stpcpy problem,
with the further simplification. This was introduced in a
refactoring (r225640) to match the original behavior.
However, this leads to problems when successive simplifications
generate several similar instructions, none of which are removed
by the custom replaceAllUsesWith.
For instance, InstCombine (the main user) doesn't erase the
instruction in its custom RAUW. When trying to simplify say
__stpcpy_chk:
- first, an stpcpy is created (fortified simplifier),
- second, a memcpy is created (normal simplifier), but the
stpcpy call isn't removed.
- third, InstCombine later revisits the instructions,
and simplifies the first stpcpy to a memcpy. We now have
two memcpys.
Sanjoy Das [Tue, 27 Jan 2015 21:38:12 +0000 (21:38 +0000)]
Teach IRCE to look at branch weights when recognizing range checks
Splitting a loop to make range checks redundant is profitable only if
the range check "never" fails. Make this fact a part of recognizing a
range check -- a branch is a range check only if it is expected to
pass (via branch_weights metadata).
Zachary Turner [Tue, 27 Jan 2015 20:46:21 +0000 (20:46 +0000)]
Add llvm-pdbdump to tools.
llvm-pdbdump is a tool which can be used to dump the contents
of Microsoft-generated PDB files. It makes use of the Microsoft
DIA SDK, which is a COM based library designed specifically for
this purpose.
The initial commit of this tool dumps the raw bytes from PDB data
streams. Future commits will dump more semantic information such
as types, symbols, source files, etc similar to the types of
information accessible via llvm-dwarfdump.
Dmitry Vyukov [Tue, 27 Jan 2015 20:19:17 +0000 (20:19 +0000)]
tsan: properly instrument unaligned accesses
If a memory access is unaligned, emit __tsan_unaligned_read/write
callbacks instead of __tsan_read/write.
Required to change semantics of __tsan_unaligned_read/write to not do the user memory.
But since they were unused (other than through __sanitizer_unaligned_load/store) this is fine.
Fixes long standing issue 17:
https://code.google.com/p/thread-sanitizer/issues/detail?id=17
Keno Fischer [Tue, 27 Jan 2015 20:02:31 +0000 (20:02 +0000)]
[ExecutionEngine] Add weak symbol support to RuntimeDyld
Support weak symbols by first looking up if there is an externally visible symbol we can find,
and only if that fails using the one in the object file we're loading.
Kai Nacke [Tue, 27 Jan 2015 19:11:28 +0000 (19:11 +0000)]
[mips] Add range checks and transformation to octeon instructions in AsmParser.
This patch adds range checks to the immediate operands of octeon
instructions in the AsmParser. Like gas, it applies the following
transformations if the immediate is to large:
Summary:
When LLVM_USE_SANITIZE_COVERAGE=YES
and one of the sanitizers is used -fsanitize-coverage=3 will be added
to build flag. This will be used to run a coverage-guided fuzzer on various
llvm libraries.
Andrea Di Biagio [Tue, 27 Jan 2015 15:58:14 +0000 (15:58 +0000)]
[InstCombine] Teach how to fold a select into a cttz/ctlz with the 'is_zero_undef' flag.
This patch teaches the Instruction Combiner how to fold a cttz/ctlz followed by
a icmp plus select into a single cttz/ctlz with flag 'is_zero_undef' cleared.
Manuel Jacob [Tue, 27 Jan 2015 13:14:35 +0000 (13:14 +0000)]
Add a FIXME in SelectionDAGBuilder before an assert that is valid only on X86.
When lowering memcpy, memset or memmove, this assert checks whether the pointer
operands are in an address space < 256 which means "user defined address space"
on X86. However, this notion of "user defined address space" does not exist
for other targets.
Eric Christopher [Tue, 27 Jan 2015 08:48:42 +0000 (08:48 +0000)]
Replace some uses of getSubtargetImpl with the cached version
off of the MachineFunction or with the version that takes a
Function reference as an argument.
Eric Christopher [Tue, 27 Jan 2015 07:54:39 +0000 (07:54 +0000)]
Update a few calls to getSubtarget<> to either be getSubtargetImpl
when we didn't need the cast to the base class or the cached version
off of the subtarget.
David Majnemer [Tue, 27 Jan 2015 06:21:43 +0000 (06:21 +0000)]
LoopRotate: Don't walk the uses of a Constant
LoopRotate wanted to avoid live range interference by looking at the
uses of a Value in the loop latch and seeing if any lied outside of the
loop. We would wrongly perform this operation on Constants.
Richard Trieu [Tue, 27 Jan 2015 03:03:47 +0000 (03:03 +0000)]
Revert r227148 & r227154 which added a test which infinitely loops.
r227148 added test CommandLineTest.HideUnrelatedOptionsMulti which repeatedly
outputs two following lines:
-tool: CommandLine Error: Option 'test-option-1' registered more than once!
-tool: CommandLine Error: Option 'test-option-2' registered more than once!
Chandler Carruth [Tue, 27 Jan 2015 02:20:43 +0000 (02:20 +0000)]
[PM] Run clang-format over this header to clean up the very few)
divergent formatting issues. This should prevent any format-only diffs
from sneaking into subsequent changes to port TTI to the new pass
manager.
Chandler Carruth [Tue, 27 Jan 2015 01:34:14 +0000 (01:34 +0000)]
[PM] Refactor the core logic to run EarlyCSE over a function into an
object that manages a single run of this pass.
This was already essentially how it worked. Within the run function, it
would point members at *stack local* allocations that were only live for
a single run. Instead, it seems much cleaner to have a utility object
whose lifetime is clearly bounded by the run of the pass over the
function and can use member variables in a more direct way.
This also makes it easy to plumb the analyses used into it from the pass
and will make it re-usable with the new pass manager.
No functionality changed here, its just a refactoring.
Eric Christopher [Tue, 27 Jan 2015 01:15:16 +0000 (01:15 +0000)]
MachineRegisterInfo can access TII off of the MachineFunction's
subtarget and so doesn't need the TargetMachine or to access via
getSubtargetImpl. Update all callers.
Philip Reames [Mon, 26 Jan 2015 22:40:44 +0000 (22:40 +0000)]
Add test cases for PRE w/volatile loads
These tests check that the combination of 227110 (cross block query inst) and 227112 (volatile load semantics) work together properly to allow PRE in cases where a loop contains a volatile access.
Simon Pilgrim [Mon, 26 Jan 2015 22:29:24 +0000 (22:29 +0000)]
[X86][SSE] Float comparisons can sometimes be safely commuted
For ordered, unordered, equal and not-equal tests, packed float and double comparison instructions can be safely commuted without affecting the results. This patch checks the comparison mode of the (v)cmpps + (v)cmppd instructions and commutes the result if it can.
Zachary Turner [Mon, 26 Jan 2015 22:05:50 +0000 (22:05 +0000)]
Have the UTF conversion wrappers append a null terminator.
This is especially useful for the UTF8 -> UTF16 direction, since
there is no equivalent of llvm::SmallString<> for wide characters.
This means that anyone who wants a null terminated string is forced
to manually push and pop their own null terminator.
Chris Bieneman [Mon, 26 Jan 2015 21:57:29 +0000 (21:57 +0000)]
Add new HideUnrelatedOptions API that takes a SmallVectorImpl.
Need a new API for clang-modernize that allows specifying a list of option categories to remain visible. This will allow clang-modernize to move off getRegisteredOptions.
[x86][MMX] Rename and cleanup tests: arith, intrinsics and shuffle
- Rename mmx-builtins to mmx-intrinsics to match other intrinsic test naming.
- Remove tests that duplicate functionality from mmx-intrinsics.ll.
- Move arith related tests to mmx-arith.ll.
- MMX related shuffle goes to vector-shuffle-mmx.ll.
An unreachable default destination can be exploited by other optimizations and
allows for more efficient lowering. Both the SDag switch lowering and
LowerSwitch can exploit unreachable defaults.
Also make TurnSwitchRangeICmp handle switches with unreachable default.
This is kind of separate change, but it cannot be tested without the change
above, and I don't want to land the change above without this since that would
regress other tests.
Reid Kleckner [Mon, 26 Jan 2015 19:51:00 +0000 (19:51 +0000)]
Add a UTF8 to UTF16 conversion wrapper for use in the pdb dumper
This can also be used instead of the WindowsSupport.h ConvertUTF8ToUTF16
helpers, but that will require massaging some character types. The
Windows support routines want wchar_t output, but wchar_t is often 32
bits on non-Windows OSs.
Eric Christopher [Mon, 26 Jan 2015 19:19:04 +0000 (19:19 +0000)]
Add a FIXME about preferred alignment to DataLayout.
Essentially DataLayout is global and affects the layout of ABI
level objects. Preferred alignment could change on a per function
basis as we change CPU features.
Eric Christopher [Mon, 26 Jan 2015 19:03:15 +0000 (19:03 +0000)]
Move DataLayout back to the TargetMachine from TargetSubtargetInfo
derived classes.
Since global data alignment, layout, and mangling is often based on the
DataLayout, move it to the TargetMachine. This ensures that global
data is going to be layed out and mangled consistently if the subtarget
changes on a per function basis. Prior to this all targets(*) have
had subtarget dependent code moved out and onto the TargetMachine.
*One target hasn't been migrated as part of this change: R600. The
R600 port has, as a subtarget feature, the size of pointers and
this affects global data layout. I've currently hacked in a FIXME
to enable progress, but the port needs to be updated to either pass
the 64-bitness to the TargetMachine, or fix the DataLayout to
avoid subtarget dependent features.
Philip Reames [Mon, 26 Jan 2015 18:54:27 +0000 (18:54 +0000)]
Refine memory dependence's notion of volatile semantics
According to my reading of the LangRef, volatiles are only ordered with respect to other volatiles. It is entirely legal and profitable to forward unrelated loads over the volatile load. This patch implements this for GVN by refining the transition rules MemoryDependenceAnalysis uses when encountering a volatile.
The added test cases show where the extra flexibility is profitable for local dependence optimizations. I have a related change (227110) which will extend this to non-local dependence (i.e. PRE), but that's essentially orthogonal to the semantic change in this patch. I have tested the two together and can confirm that PRE works over a volatile load with both changes. I will be submitting a PRE w/volatiles test case seperately in the near future.
Sanjay Patel [Mon, 26 Jan 2015 18:42:16 +0000 (18:42 +0000)]
Model sqrtsd as a binary operation with one source operand tied to the destination (PR14221)
This patch fixes the following miscompile:
define void @sqrtsd(<2 x double> %a) nounwind uwtable ssp {
%0 = tail call <2 x double> @llvm.x86.sse2.sqrt.sd(<2 x double> %a) nounwind
%a0 = extractelement <2 x double> %0, i32 0
%conv = fptrunc double %a0 to float
%a1 = extractelement <2 x double> %0, i32 1
%conv3 = fptrunc double %a1 to float
tail call void @callee2(float %conv, float %conv3) nounwind
ret void
}
Current codegen:
sqrtsd %xmm0, %xmm1 ## high element of %xmm1 is undef here
xorps %xmm0, %xmm0
cvtsd2ss %xmm1, %xmm0
shufpd $1, %xmm1, %xmm1
cvtsd2ss %xmm1, %xmm1 ## operating on undef value
jmp _callee
This is a continuation of http://llvm.org/viewvc/llvm-project?view=revision&revision=224624 ( http://reviews.llvm.org/D6330 )
which was itself a continuation of r167064 ( http://llvm.org/viewvc/llvm-project?view=revision&revision=167064 ).
All of these patches are partial fixes for PR14221 ( http://llvm.org/bugs/show_bug.cgi?id=14221 );
this should be the final patch needed to resolve that bug.
Philip Reames [Mon, 26 Jan 2015 18:39:52 +0000 (18:39 +0000)]
Pass QueryInst down through non-local dependency calculation
This change is mostly motivated by exposing information about the original query instruction to the actual scanning work in getPointerDependencyFrom when used by GVN PRE. In a follow up change, I will use this to be more precise with regards to the semantics of volatile instructions encountered in the scan of a basic block.
Worth noting, is that this change (despite appearing quite simple) is not semantically preserving. By providing more information to the helper routine, we allow some optimizations to kick in that weren't previously able to (when called from this code path.) In particular, we see that treatment of !invariant.load becomes more precise. In theory, we might see a difference with an ordered/atomic instruction as well, but I'm having a hard time actually finding a test case which shows that.
Test wise, I've included new tests for !invariant.load which illustrate this difference. I've also included some updated TBAA tests which highlight that this change isn't needed for that optimization to kick in - it's handled inside alias analysis itself.
Eventually, it would be nice to factor the !invariant.load handling inside alias analysis as well.
Philip Reames [Mon, 26 Jan 2015 18:26:35 +0000 (18:26 +0000)]
Revert GCStrategy ownership changes
This change reverts the interesting parts of 226311 (and 227046). This change introduced two problems, and I've been convinced that an alternate approach is preferrable anyways.
The bugs were:
- Registery appears to require all users be within the same linkage unit. After this change, asking for "statepoint-example" in Transform/ would sometimes get you nullptr, whereas asking the same question in CodeGen would return the right GCStrategy. The correct long term fix is to get rid of the utter hack which is Registry, but I don't have time for that right now. 227046 appears to have been an attempt to fix this, but I don't believe it does so completely.
- GCMetadataPrinter::finishAssembly was being called more than once per GCStrategy. Each Strategy was being added to the GCModuleInfo multiple times.
Once I get time again, I'm going to split GCModuleInfo into the gc.root specific part and a GCStrategy owning Analysis pass. I'm probably also going to kill off the Registry. Once that's done, I'll move the new GCStrategyAnalysis and all built in GCStrategies into Analysis. (As original suggested by Chandler.) This will accomplish my original goal of being able to access GCStrategy from Transform/ without adding all of the builtin GCs to IR/.
Zachary Turner [Mon, 26 Jan 2015 18:21:33 +0000 (18:21 +0000)]
Teach raw_ostream to support hex formatting without a prefix '0x'.
Previously using format_hex() would always print a 0x prior to the
hex characters. This allows this to be optional, so that one can
choose to print (e.g.) 255 as either 0xFF or just FF.
Eric Christopher [Mon, 26 Jan 2015 17:33:46 +0000 (17:33 +0000)]
Move the Mips target to storing the ABI in the TargetMachine rather
than on MipsSubtargetInfo.
This required a bit of massaging in the MC level to handle this since
MC is a) largely a collection of disparate classes with no hierarchy,
and b) there's no overarching equivalent to the TargetMachine, instead
only the subtarget via MCSubtargetInfo (which is the base class of
TargetSubtargetInfo).
We're now storing the ABI in both the TargetMachine level and in the
MC level because the AsmParser and the TargetStreamer both need to
know what ABI we have to parse assembly and emit objects. The target
streamer has a pointer to the one in the asm parser and is updated
when the asm parser is created. This is fragile as the FIXME comment
notes, but shouldn't be a problem in practice since we always
create an asm parser before attempting to emit object code via the
assembler. The TargetMachine now contains the ABI so that the DataLayout
can be constructed dependent upon ABI.
All testcases have been updated to use the -target-abi command line
flag so that we can set the ABI without using a subtarget feature.