Silviu Baranga [Mon, 10 Aug 2015 14:50:54 +0000 (14:50 +0000)]
[TTI] Add a hook for specifying per-target defaults for Interleaved Accesses
Summary:
This adds a hook to TTI which enables us to selectively turn on by default
interleaved access vectorization for targets on which we have have performed
the required benchmarking.
Fraser Cormack [Mon, 10 Aug 2015 14:48:47 +0000 (14:48 +0000)]
Prevent the scalarizer from caching incorrect entries
The scalarizer can cache incorrect entries when walking up a chain of
insertelement instructions. This occurs when it encounters more than one
instruction that it is not actively searching for, as it unconditionally caches
every element it finds. The fix is to only cache the first element that it
isn't searching for so we don't overwrite correct entries.
Michael Kruse [Mon, 10 Aug 2015 13:21:59 +0000 (13:21 +0000)]
[RegionInfo] Add debug-time region viewer functions
Summary:
Analogously to Function::viewCFG(), RegionInfo::view() and RegionInfo::viewOnly() are meant to be called in debugging sessions. They open a viewer to show how RegionInfo currently understands the region hierarchy.
The functions viewRegion(Function*) and viewRegionOnly(Function*) invoke a fresh region analysis of the function in contrast to viewRegion(RegionInfo*) and viewRegionOnly(RegionInfo*) which show the current analysis result.
Robert Lougher [Mon, 10 Aug 2015 11:59:44 +0000 (11:59 +0000)]
Trace copies when checking for rematerializability in spill weight calculation
PR24139 contains an analysis of poor register allocation. One of the findings
was that when calculating the spill weight, a rematerializable interval once
split is no longer rematerializable. This is because the isRematerializable
check in CalcSpillWeights.cpp does not follow the copies introduced by live
range splitting (after splitting, the live interval register definition is a
copy which is not rematerializable).
The SP was always unconditionally assigned to later, but initialised early.
This delays the initialisation, and avoids the dead store. Identified by
clang static analysis. No functional change intended.
David Majnemer [Sun, 9 Aug 2015 15:43:02 +0000 (15:43 +0000)]
[PHITransAddr] Don't assume that instruction operands are translatable
We can only PHI translate instructions. In our attempt to PHI translate
a bitcast, we attempt to translate its operand; however, the operand
might be an argument or a global instead of an instruction. Benignly
bail out when this happens.
Justin Bogner [Sat, 8 Aug 2015 21:04:45 +0000 (21:04 +0000)]
cmake: Error on invalid CMAKE_BUILD_TYPE
Apparently if you make a typo in the argument to CMAKE_BUILD_TYPE,
cmake silently accepts this but doesn't apply any particular build
type to your build. This means you get a build that doesn't really
make any sense - it's sort of a debug build with asserts disabled.
Yaron Keren [Sat, 8 Aug 2015 21:03:19 +0000 (21:03 +0000)]
Fix dangling reference in DwarfLinker.cpp. The original code
Seq.emplace_back(Seq.back());
does not work as planned, since Seq.back() may become a dangling reference
when emplace_back is called and possibly reallocates vector. To avoid this,
the vector allocation should be reserved first and only then used.
This broke test/tools/dsymutil/X86/custom-line-table.test with Visual C++ 2013.
Craig Topper [Sat, 8 Aug 2015 07:20:04 +0000 (07:20 +0000)]
Add SlowBTMem to Sandy Bridge and newer Intel CPUs. Reading through Agner Fog's table suggests there have been no improvements to these processors relative to Westmere for bit test instructions.
Adam Nemet [Fri, 7 Aug 2015 22:44:15 +0000 (22:44 +0000)]
[LAA] Make the set of runtime checks part of the state of LAA, NFC
This is the full set of checks that clients can further filter. IOW,
it's client-agnostic. This makes LAA complete in the sense that it now
provides the two main results of its analysis precomputed:
1. memory dependences via getDepChecker().getInsterestingDependences()
2. run-time checks via getRuntimePointerCheck().getChecks()
However, as a consequence we now compute this information pro-actively.
Thus if the client decides to skip the loop based on the dependences
we've computed the checks unnecessarily. In order to see whether this
was a significant overhead I checked compile time on SPEC2k6 LTO bitcode
files. The change was in the noise.
The checks are generated in canCheckPtrAtRT, at the same place where we
used to call groupChecks to merge checks.
Tom Stellard [Fri, 7 Aug 2015 22:00:56 +0000 (22:00 +0000)]
AMDGPU/SI: Use InstAlias instead of MnemonicAlias for VOPC instructions
Summary:
With InstAlias, we don't need to print the _e32 portion of the mnemonic
when we print the $dst operand. This change makes it possible to
include vcc in the asm string when we switch VOPC over to having
implicit vcc defs.
Alex Lorenz [Fri, 7 Aug 2015 20:21:00 +0000 (20:21 +0000)]
MIR Parser: Extract the parsing of the operand's offset into a new method. NFC.
This commit extract the code that parses the 64-bit offset from the method
'parseOperandsOffset' to a new method 'parseOffset' so that we can reuse it
when parsing the offset for the machine memory operands.
Chen Li [Fri, 7 Aug 2015 19:30:12 +0000 (19:30 +0000)]
[ConstantFoldTerminator] Preserve make.implicit metadata when converting SwitchInst to BranchInst
Summary: llvm::ConstantFoldTerminator function can convert SwitchInst with single case (and default) to a conditional BranchInst. This patch adds support to preserve make.implicit metadata on this conversion.
Simon Pilgrim [Fri, 7 Aug 2015 18:22:50 +0000 (18:22 +0000)]
[InstCombine] Fix SSE2/AVX2 vector logical shift by constant
This patch fixes the sse2/avx2 vector shift by constant instcombine call to correctly deal with the fact that the shift amount is formed from the entire lower 64-bit and not just the lowest element as it currently assumes.
e.g.
%1 = tail call <4 x i32> @llvm.x86.sse2.psrl.d(<4 x i32> %v, <4 x i32> <i32 15, i32 15, i32 15, i32 15>)
In this case, (V)PSRLD doesn't perform a lshr by 15 but in fact attempts to shift by 64424509455 ((15 << 32) | 15) - giving a zero result.
In addition, this review also recognizes shift-by-zero from a ConstantAggregateZero type (PR23821).
Nico Weber [Fri, 7 Aug 2015 17:47:03 +0000 (17:47 +0000)]
Add functions to save and restore the PrettyStackTrace state.
PrettyStackTraceHead is a LLVM_THREAD_LOCAL, which means it's just a global
in LLVM_ENABLE_THREADS=NO builds. If a CrashRecoveryContext is used with
code that uses PrettyStackEntries, and a crash happens, PrettyStackTraceHead is
currently not reset to its pre-crash value. These functions make it possible
to add a cleanup to such code that does this.
(Not reseting the value then causes the assert in ~PrettyStackTraceEntry() to
fire if the code outside of the CrashRecoveryContext also uses
PrettyStackEntries -- for example, clang when building a module.)
We're actually -Wmissing-field-initializers clean thanks to the cmake
build so check and turn on -Wmissing-field-initializers. While there,
reorganize the conditional warning code based on compiler to be a bit
more obvious and inside a switch statement.
Frederic Riss [Fri, 7 Aug 2015 15:14:13 +0000 (15:14 +0000)]
[dsymutil] Use the new MCDwarfLineTableParams customization to emit linetables
llvm-dsymutil has to be able to process debug info produced by other compilers
which use different line table settings. The testcase wasn't generated by
another compiler, but by a modified clang.
Summary: WebAssembly's tablegen instructions have the names WebAssembly expects, but by LLVM convention they're uppercase and suffixed with their type after an underscore. Leave the C++ code that way, but print outt he names WebAssembly expects (lowercase, no type). We could teach tablegen to do this later, maybe by using `!cast<string>(node)` in the .td files.
ValueMapper: Resolve uniquing cycles more aggressively
As a follow-up to r244181, resolve uniquing cycles underneath distinct
nodes on the fly. This prevents uniquing cycles in early operands from
affecting later operands. It also removes an iteration through distinct
nodes' operands.
No real functional change here, just more prompt resolution of temporary
nodes.
Alex Lorenz [Thu, 6 Aug 2015 23:57:04 +0000 (23:57 +0000)]
MIR Serialization: Fix serialization of unnamed IR block references.
The block address machine operands can reference IR blocks in other functions.
This commit fixes a bug where the references to unnamed IR blocks in other
functions weren't serialized correctly.
Alex Lorenz [Thu, 6 Aug 2015 23:17:42 +0000 (23:17 +0000)]
MIR Parser: Simplify the token's string value handling.
This commit removes the 'StringOffset' and 'HasStringValue' fields from the
MIToken struct and simplifies the 'stringValue' method which now returns
the new 'StringValue' field.
This commit also adopts a different way of initializing the lexed tokens -
instead of constructing a new MIToken instance, the lexer resets the old token
using the new 'reset' method and sets its attributes using the new
'setStringValue', 'setOwnedStringValue', and 'setIntegerValue' methods.