Vedant Kumar [Sun, 26 Jun 2016 02:45:13 +0000 (02:45 +0000)]
[llvm-cov] Simplify the way expansion views are rendered (NFC)
If a sub-view has already been rendered, it's helpful to re-render the
expansion site before rendering the next expansion view. Make this fact
explicit in the rendering interface, instead of hiding it behind an
awkward Optional<LineRef> parameter.
Craig Topper [Sat, 25 Jun 2016 19:05:29 +0000 (19:05 +0000)]
[X86] Convert ==/!= comparisons with -1 for checking undef in shuffle lowering to comparisons of <0 or >=0. While there do the same for other kinds of index checks that can just check for greater than 0. No functional change intended.
Jan Vesely [Sat, 25 Jun 2016 18:24:16 +0000 (18:24 +0000)]
AMDGPU/R600: Fix GlobalValue regressions.
Don't cast GV expression to MCSymbolRefExpr. r272705 changed GV to binary
expressions by including offset even if the offset it 0
(we haven't hit this sooner since tested workloads don't include static offsets)
We don't really care about the type of expression, so set it directly. Fixes: r272705
Consider section relative relocations. Since all const as data is in one boffer section relative is equivalent to abs32. Fixes: r273166
Differential Revision: http://reviews.llvm.org/D21633
David Majnemer [Sat, 25 Jun 2016 08:04:19 +0000 (08:04 +0000)]
[SimplifyCFG] Stop inserting calls to llvm.trap for UB
SimplifyCFG had logic to insert calls to llvm.trap for two very
particular IR patterns: stores and invokes of undef/null.
While InstCombine canonicalizes certain undefined behavior IR patterns
to stores of undef, phase ordering means that this cannot be relied upon
in general.
There are much better tools than llvm.trap: UBSan and ASan.
N.B. I could be argued into reverting this change if a clear argument as
to why it is important that we synthesize llvm.trap for stores, I'd be
hard pressed to see why it'd be useful for invokes...
[AMDGPU] Emit debugger prologue and emit the rest of the debugger fields in the kernel code header
Debugger prologue is emitted if -mattr=+amdgpu-debugger-emit-prologue.
Debugger prologue writes work group IDs and work item IDs to scratch memory at fixed location in the following format:
- offset 0: work group ID x
- offset 4: work group ID y
- offset 8: work group ID z
- offset 16: work item ID x
- offset 20: work item ID y
- offset 24: work item ID z
Set
- amd_kernel_code_t::debug_wavefront_private_segment_offset_sgpr to scratch wave offset reg
- amd_kernel_code_t::debug_private_segment_buffer_sgpr to scratch rsrc reg
- amd_kernel_code_t::is_debug_supported to true if all debugger features are enabled
Vedant Kumar [Sat, 25 Jun 2016 02:58:30 +0000 (02:58 +0000)]
[llvm-cov] Separate presentation logic from formatting logic, NFC
This makes it easier to add renderers for new kinds of output formats.
- Define and document a pure-virtual coverage rendering interface.
- Move the text-based rendering logic into its a new file.
- Re-work the API to better reflect the presentation/formatting split.
Matthias Braun [Sat, 25 Jun 2016 02:03:36 +0000 (02:03 +0000)]
MachineScheduler: Remember top/bottom choice in bidirectional scheduling
Remember the last choice for the top/bottom scheduling boundary in
bidirectional scheduling mode. The top choice should not change if we
schedule at the bottom and vice versa.
This allows us to improve compiletime: We only recalculate the best pick
for one border and re-use the cached top-pick from the other border.
David Majnemer [Sat, 25 Jun 2016 00:55:12 +0000 (00:55 +0000)]
The absence of noreturn doesn't ensure mayReturn
There are two separate issues:
- LLVM doesn't consider infinite loops to be side effects: we happily
hoist/sink above/below loops whose bounds are unknown.
- The absence of the noreturn attribute is insufficient for us to know
if a function will definitely return. Relying on noreturn in the
middle-end for any property is an accident waiting to happen.
This intrinsic safely loads a function pointer from a virtual table pointer
using type metadata. This intrinsic is used to implement control flow integrity
in conjunction with virtual call optimization. The virtual call optimization
pass will optimize away llvm.type.checked.load intrinsics associated with
devirtualized calls, thereby removing the type check in cases where it is
not needed to enforce the control flow integrity constraint.
This patch also introduces the capability to copy type metadata between
global variables, and teaches the virtual call optimization pass to do so.
In bidirectional scheduling this gives more stable results than just
comparing the "reason" fields of the top/bottom node because the reason
field may be higher depending on what other nodes are in the queue.
David Majnemer [Sat, 25 Jun 2016 00:04:10 +0000 (00:04 +0000)]
Reinstate r273711
r273711 was reverted by r273743. The inliner needs to know about any
call sites in the inlined function. These were obscured if we replaced
a call to undef with an undef but kept the call around.
Matthias Braun [Fri, 24 Jun 2016 23:52:11 +0000 (23:52 +0000)]
AMDGPU: Define a schedule class for COPY.
COPY was lacking a scheduling class, define it to avoid regressions in
the upcoming change to the bidirectional MachineScheduler. Approved by
tstellar on IRC.
Adrian Prantl [Fri, 24 Jun 2016 21:35:09 +0000 (21:35 +0000)]
Fix the type signature of DwarfExpression::Add.*Constant to support values >32 bits.
This fixes an embarrassing bug when emitting .debug_loc entries for 64-bit+ constants,
which were previously silently truncated to 32 bits.
IR: New representation for CFI and virtual call optimization pass metadata.
The bitset metadata currently used in LLVM has a few problems:
1. It has the wrong name. The name "bitset" refers to an implementation
detail of one use of the metadata (i.e. its original use case, CFI).
This makes it harder to understand, as the name makes no sense in the
context of virtual call optimization.
2. It is represented using a global named metadata node, rather than
being directly associated with a global. This makes it harder to
manipulate the metadata when rebuilding global variables, summarise it
as part of ThinLTO and drop unused metadata when associated globals are
dropped. For this reason, CFI does not currently work correctly when
both CFI and vcall opt are enabled, as vcall opt needs to rebuild vtable
globals, and fails to associate metadata with the rebuilt globals. As I
understand it, the same problem could also affect ASan, which rebuilds
globals with a red zone.
This patch solves both of those problems in the following way:
1. Rename the metadata to "type metadata". This new name reflects how
the metadata is currently being used (i.e. to represent type information
for CFI and vtable opt). The new name is reflected in the name for the
associated intrinsic (llvm.type.test) and pass (LowerTypeTests).
2. Attach metadata directly to the globals that it pertains to, rather
than using the "llvm.bitsets" global metadata node as we are doing now.
This is done using the newly introduced capability to attach
metadata to global variables (r271348 and r271358).
See also: http://lists.llvm.org/pipermail/llvm-dev/2016-June/100462.html
This patch moves MSSA's caching walker into MemorySSA, and moves the
actual definition of MSSA's caching walker out of MemorySSA.h. This is
done in preparation for the new walker, which should be out for review
soonish.
Also, this patch removes a field from UpwardsMemoryQuery and has a few
lines of diff from clang-format'ing MemorySSA.cpp.
Chris Bieneman [Fri, 24 Jun 2016 20:42:28 +0000 (20:42 +0000)]
[obj2yaml] [yaml2obj] Support for MachO Universal binaries
This patch adds round-trip support for MachO Universal binaries to obj2yaml and yaml2obj. Universal binaries have a header and list of architecture structures, followed by a the individual object files at specified offsets.
Sanjay Patel [Fri, 24 Jun 2016 18:26:02 +0000 (18:26 +0000)]
[InstCombine] consolidate commutation variants of matchSelectFromAndOr() in one place; NFCI
By putting all the possible commutations together, we simplify the code.
Note that this is NFCI, but I'm adding tests that actually exercise each
commutation pattern because we don't have this anywhere else.
Kevin Enderby [Fri, 24 Jun 2016 18:24:42 +0000 (18:24 +0000)]
Thread Expected<...> up from libObject’s getSymbolAddress() for symbols to allow
a good error message to be produced.
This is nearly the last libObject interface that used ErrorOr and the last one
that appears in llvm/include/llvm/Object/MachO.h . For Mach-O objects this is
just a clean up because it’s version of getSymbolAddress() can’t return an
error.
I will leave it to the experts on COFF and ELF to actually add meaning full
error messages in their tests if they wish. And also leave it to these experts
to change the last two ErrorOr interfaces in llvm/include/llvm/Object/ObjectFile.h
for createCOFFObjectFile() and createELFObjectFile() if they wish.
Since there are no test cases for COFF and ELF error cases with respect to
getSymbolAddress() in the test suite this is no functional change (NFC).
Kyle Butt [Fri, 24 Jun 2016 18:16:36 +0000 (18:16 +0000)]
Codegen: Fix broken assumption in Tail Merge.
Tail merge was making the assumption that a layout successor or
predecessor was always a cfg successor/predecessor. Remove that
assumption. Changes to tests are necessary because the errant cfg edges
were preventing optimizations.
Reid Kleckner [Fri, 24 Jun 2016 17:55:40 +0000 (17:55 +0000)]
[codeview] Emit parameter variables in the right order
Clang emits them in reverse order to conform to the ABI, which requires
left-to-right destruction. As a result, the order doesn't fall out
naturally, and we have to sort things out in the backend.
Reid Kleckner [Fri, 24 Jun 2016 17:23:49 +0000 (17:23 +0000)]
[codeview] Use one byte for S_FRAMECOOKIE CookieKind and add flags byte
We bailed out while printing codeview for an MSVC compiled
SemaExprCXX.cpp that used this record. The MS reference headers look
incorrect here, which is probably why we had this bug. They use a 32-bit
enum as the field type, but the actual record appears to use one byte
for the cookie kind followed by a flags byte.
Reid Kleckner [Fri, 24 Jun 2016 16:24:24 +0000 (16:24 +0000)]
[codeview] Emit base class information from DW_TAG_inheritance nodes
There are two remaining issues here:
1. No vbptr information
2. Need to mention indirect virtual bases
Getting indirect virtual bases is just a matter of adding an "indirect"
flag, emitting them in the frontend, and ignoring them when appropriate
for DWARF.
All virtual bases use the same artificial vbptr field, so I think the
vbptr offset will be best represented by an implicit __vbptr$ClassName
member similar to our existing __vptr$ member.
Matthew Simpson [Fri, 24 Jun 2016 15:33:25 +0000 (15:33 +0000)]
[LV] Preserve order of dependences in interleaved accesses analysis
The interleaved access analysis currently assumes that the inserted run-time
pointer aliasing checks ensure the absence of dependences that would prevent
its instruction reordering. However, this is not the case.
Issues can arise from how code generation is performed for interleaved groups.
For a load group, all loads in the group are essentially moved to the location
of the first load in program order, and for a store group, all stores in the
group are moved to the location of the last store. For groups having members
involved in a dependence relation with any other instruction in the loop, this
reordering can violate the dependence.
This patch teaches the interleaved access analysis how to avoid breaking such
dependences, and should fix PR27626.
An assumption of the original analysis was that the accesses had been collected
in "program order". The analysis was then simplified by visiting the accesses
bottom-up. However, this ordering was never guaranteed for anything other than
single basic block loops. Thus, this patch also enforces the desired ordering.
Artur Pilipenko [Fri, 24 Jun 2016 15:10:29 +0000 (15:10 +0000)]
Remangle intrinsics names when types are renamed
This is a resubmittion of previously reverted rL273568.
This is a fix for the problem mentioned in "LTO and intrinsics mangling" llvm-dev mail thread:
http://lists.llvm.org/pipermail/llvm-dev/2016-April/098387.html
ExecutionEngine: add preliminary support for COFF ARM
This adds rudimentary support for COFF ARM to the dynamic loader for the
exeuction engine. This can be used by lldb to JIT code into a COFF ARM
environment. This lays the foundation for the loader, though a few of the
relocation types are yet unhandled.
Anna Thomas [Fri, 24 Jun 2016 12:38:45 +0000 (12:38 +0000)]
[LICM] Avoid repeating expensive call while promoting loads. NFC
Summary:
We can avoid repeating the check `isGuaranteedToExecute` when it's already called once while checking if the alignment can be widened for the load/store being hoisted.
The function is invariant for the same instruction `UI` in `isGuaranteedToExecute(*UI, DT, CurLoop, SafetyInfo);`
Hubert Tong [Fri, 24 Jun 2016 11:34:16 +0000 (11:34 +0000)]
Add FixedSizeStorage to TrailingObjects; NFC
Summary: This change introduces two types, `FixedSizeStorage` and `FixedSizeStorageOwner`, which can be used to provide stack-allocated objects with trailing objects.
Matt Arsenault [Fri, 24 Jun 2016 06:30:11 +0000 (06:30 +0000)]
AMDGPU: Cleanup subtarget handling.
Split AMDGPUSubtarget into amdgcn/r600 specific subclasses.
This removes most of the static_casting of the basic codegen
classes everywhere, and tries to restrict the features
visible on the wrong target.