Richard Trieu [Sat, 11 May 2019 02:59:02 +0000 (02:59 +0000)]
[Sparc] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
Richard Trieu [Sat, 11 May 2019 02:43:58 +0000 (02:43 +0000)]
[RISCV] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
Richard Trieu [Sat, 11 May 2019 02:33:18 +0000 (02:33 +0000)]
[PowerPC] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
Richard Trieu [Sat, 11 May 2019 02:09:13 +0000 (02:09 +0000)]
[NVPTX] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
Richard Trieu [Sat, 11 May 2019 01:58:52 +0000 (01:58 +0000)]
[MSP430] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
Richard Trieu [Sat, 11 May 2019 01:38:56 +0000 (01:38 +0000)]
[Mips] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
Richard Trieu [Sat, 11 May 2019 01:25:58 +0000 (01:25 +0000)]
[Lanai] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
Richard Trieu [Sat, 11 May 2019 01:13:21 +0000 (01:13 +0000)]
[BPF] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
Richard Trieu [Sat, 11 May 2019 01:03:03 +0000 (01:03 +0000)]
[AVR] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
Richard Trieu [Sat, 11 May 2019 00:34:07 +0000 (00:34 +0000)]
[ARM] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
Richard Trieu [Sat, 11 May 2019 00:13:01 +0000 (00:13 +0000)]
[ARC] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
Richard Trieu [Sat, 11 May 2019 00:03:35 +0000 (00:03 +0000)]
[AMDGPU] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
Richard Trieu [Fri, 10 May 2019 23:50:01 +0000 (23:50 +0000)]
[AArch64] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
Richard Trieu [Fri, 10 May 2019 23:36:49 +0000 (23:36 +0000)]
[XCore] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
Richard Trieu [Fri, 10 May 2019 23:24:38 +0000 (23:24 +0000)]
[X86] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
Philip Reames [Fri, 10 May 2019 22:55:42 +0000 (22:55 +0000)]
Factor out redzone ABI checks [NFCI]
As requested in D58632, clean up our red zone detection logic in the X86 backend. The existing X86MachineFunctionInfo flag is used to track whether we *use* the redzone (via a particular optimization?), but there's no common way to check whether the function *has* a red zone.
I'd appreciate careful review of the uses being updated. I think they are NFC, but a careful eye from someone else would be appreciated.
Lang Hames [Fri, 10 May 2019 22:24:37 +0000 (22:24 +0000)]
[JITLink][MachO] Mark atoms in sections 'no-dead-strip' set live by default.
If a MachO section has the no-dead-strip attribute set then its atoms should
be preserved, regardless of whether they're public or referenced elsewhere in
the object.
Craig Topper [Fri, 10 May 2019 22:03:33 +0000 (22:03 +0000)]
[X86] Disable speculative load hardening for operations with an explicit RSP base.
After D58632, we can create idempotent atomic operations to the top of stack.
This confused speculative load hardening because it thinks accesses should have
virtual register base except for the cases it already excluded.
This commit adds a new exclusion for this case. I'll try to reduce a test case
for this, but this fix was verified to work by the reporter. This should avoid
needing to revert D58632.
Craig Topper [Fri, 10 May 2019 21:42:27 +0000 (21:42 +0000)]
[LegalizeVectorOps] Remove calls to LegalizeOp on the return value from ExpandLoad/ExpandStore.
We already updated the LegalizedNodes map at the end of the Expand call. This
would have marked the new node as being mapped to itself. So the LegalizeOp
call will find that and immediately return.
Mircea Trofin [Fri, 10 May 2019 21:27:55 +0000 (21:27 +0000)]
Skip over prefetches
Summary: Skip over prefetches when assigning debug info to instructions with memory operands. This way, the debug info is stable after instrumenting a binary with prefetches, allowing for iterative profiling and instrumentation.
Nikita Popov [Fri, 10 May 2019 20:42:48 +0000 (20:42 +0000)]
[SDAG] Recursively legalize both vector mulo results
Split out from D61692 per RKSimon's suggestion. Vector op
legalization will automatically recursively legalize the returned
SDValue, but we need to take care of the other results ourselves.
Otherwise it will end up getting legalized only during op
legalization, by which point it might be too late (though I'm not
aware of any specific cases right now).
There are codegen differences because expansion occurs earlier now
and we don't get a DAGCombiner run in between.
Teresa Johnson [Fri, 10 May 2019 20:08:24 +0000 (20:08 +0000)]
[ThinLTO] Auto-hide prevailing linkonce_odr only when all copies eligible
Summary:
We hit undefined references building with ThinLTO when one source file
contained explicit instantiations of a template method (weak_odr) but
there were also implicit instantiations in another file (linkonce_odr),
and the latter was the prevailing copy. In this case the symbol was
marked hidden when the prevailing linkonce_odr copy was promoted to
weak_odr. It led to unsats when the resulting shared library was linked
with other code that contained a reference (expecting to be resolved due
to the explicit instantiation).
Add a CanAutoHide flag to the GV summary to allow the thin link to
identify when all copies are eligible for auto-hiding (because they were
all originally linkonce_odr global unnamed addr), and only do the
auto-hide in that case.
Most of the changes here are due to plumbing the new flag through the
bitcode and llvm assembly, and resulting test changes. I augmented the
existing auto-hide test to check for this situation.
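The rule described above can be sketched as follows. This is a toy model, not the summary format: the `Copy` record and the helper names are hypothetical, and only the "all copies must be eligible" condition is taken from the message.

```python
from dataclasses import dataclass

@dataclass
class Copy:
    module: str                # hypothetical: object file holding this copy
    linkage: str               # original linkage of the copy
    global_unnamed_addr: bool  # whether the copy was global unnamed_addr

def eligible(copy):
    # A single copy is eligible only if it was originally linkonce_odr
    # and global unnamed_addr.
    return copy.linkage == "linkonce_odr" and copy.global_unnamed_addr

def can_auto_hide(copies):
    # Hide the prevailing copy only when every copy is eligible.
    return all(eligible(c) for c in copies)

# Explicit instantiation (weak_odr) in one file, implicit in another.
mixed = [Copy("a.o", "weak_odr", False), Copy("b.o", "linkonce_odr", True)]
# Implicit instantiations only.
implicit_only = [Copy("b.o", "linkonce_odr", True), Copy("c.o", "linkonce_odr", True)]

print(can_auto_hide(mixed))
print(can_auto_hide(implicit_only))
```

In the mixed case the symbol stays visible, so a reference from other code that relies on the explicit instantiation can still resolve.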
David Blaikie [Fri, 10 May 2019 19:15:29 +0000 (19:15 +0000)]
DebugInfo: Only move types out of type units if they're named or type united
Follow up to r359122, after a bug was reported in it - the original
change too aggressively tried to move related types out of type units,
which included unnamed types (like array types) which can't reasonably
be declared-but-not-defined.
A step beyond that is that some types in type units can be anonymous, if
they are types with a name for linkage purposes (eg: "typedef struct { }
x;"). So ensure those don't get turned into plain declarations (without
signatures) because, lacking names, they can't be resolved to the
definition.
[Also include a fix for llvm-dwarfdump/libDebugInfoDWARF to pretty print
types in type units]
Amara Emerson [Fri, 10 May 2019 17:29:35 +0000 (17:29 +0000)]
[LSR] Tweak setup cost depth threshold to 10.
The original change introduced a depth limit of 7 which caused a 22% regression
in the Swift MapReduceLazyCollection & Ackermann benchmarks. This new threshold
still ensures that the original test case doesn't hang.
Fangrui Song [Fri, 10 May 2019 17:09:25 +0000 (17:09 +0000)]
[MC][ELF] Copy top 3 bits of st_other to .symver aliases
On PowerPC64 ELFv2 ABI, the top 3 bits of st_other encode the local
entry offset. A versioned symbol alias created by .symver should copy
the bits from the source symbol.
This partly fixes PR41048. A full fix needs tracking of .set assignments
and updating st_other fields when finish() is called, see D56586.
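A minimal sketch of the bit copy, assuming an 8-bit st_other value whose top 3 bits are selected by the mask 0xE0. The function name is hypothetical and the sample values are illustrative only.

```python
# Mask selecting the top 3 bits of the 8-bit st_other field.
LOCAL_ENTRY_MASK = 0xE0

def copy_local_entry_bits(alias_other, source_other):
    # Keep the alias's low 5 bits (visibility etc.) and take the
    # top 3 bits from the source symbol.
    kept = alias_other & ~LOCAL_ENTRY_MASK & 0xFF
    return kept | (source_other & LOCAL_ENTRY_MASK)

source_other = 0x60  # illustrative: top bits set on the source symbol
alias_other = 0x02   # illustrative: only low bits set on the alias

print(hex(copy_local_entry_bits(alias_other, source_other)))
```

The alias keeps its own low bits, so only the local entry offset encoding follows the source symbol.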
Momchil Velikov [Fri, 10 May 2019 16:54:32 +0000 (16:54 +0000)]
Adjust MachineScheduler to use ProcResource counts
This fix allows the scheduler to take into account the number of instances of
each ProcResource specified. Previously a declaration in a scheduler of
ProcResource<1> would be treated identically to a declaration of
ProcResource<2>. Now the hazard recognizer would report a hazard only after all
of the resource instances are busy.
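The difference can be illustrated with a toy tracker. This is a sketch of the idea only, not the MachineScheduler code; the class and method names are hypothetical.

```python
class ResourceTracker:
    """Toy model: each resource kind has a number of interchangeable instances."""

    def __init__(self, counts):
        # For every resource, the cycle at which each instance becomes free.
        self.free_at = {name: [0] * n for name, n in counts.items()}

    def has_hazard(self, name, cycle):
        # A hazard exists only if every instance is still busy at `cycle`.
        return all(t > cycle for t in self.free_at[name])

    def reserve(self, name, cycle, duration):
        # Occupy the instance that becomes free earliest.
        units = self.free_at[name]
        idx = min(range(len(units)), key=units.__getitem__)
        units[idx] = max(units[idx], cycle) + duration

one = ResourceTracker({"ALU": 1})
two = ResourceTracker({"ALU": 2})
one.reserve("ALU", cycle=0, duration=3)
two.reserve("ALU", cycle=0, duration=3)

print(one.has_hazard("ALU", 1))  # single instance is busy
print(two.has_hazard("ALU", 1))  # second instance is still free
```

With two instances, a hazard is reported at cycle 1 only after a second reservation has occupied the remaining instance.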
Robert Lougher [Fri, 10 May 2019 15:55:06 +0000 (15:55 +0000)]
[X86] Avoid SFB - Fix inconsistent codegen with/without debug info
Fixes https://bugs.llvm.org/show_bug.cgi?id=40969
The functions findPotentiallyBlockedCopies and buildCopy are currently not
accounting for the presence of debug instructions. In the former this results
in the optimization not being trigerred, and in the latter results in
inconsistent codegen.
This patch enables the optimization to be performed in a debug build and
ensures the codegen is consistent with non-debug builds.
Nemanja Ivanovic [Fri, 10 May 2019 15:44:56 +0000 (15:44 +0000)]
Another attempt to fix the build bot breaks after r360426
The test case checks were produced by the update_test_checks.py
scripts and I assumed that is sufficient. However, the behaviour
is different with different default target triples. Specify the
triple explicitly in the test case.
If this doesn't clean up the build bot breaks, I'll remove the test
case until I can get to the bottom of why the behaviour on build bots
is different from my machine.
Michael Liao [Fri, 10 May 2019 14:57:42 +0000 (14:57 +0000)]
[InferAddressSpaces] Enhance the handling of constexpr.
Summary:
- Constant expressions may not be added in strict postorder as the
forward instruction scan order. Thus, for a constant expression (CE0), if
its operand (CE1) is used in a previous instruction, they are not in
postorder. However, different from
`cloneInstructionWithNewAddressSpace`,
`cloneConstantExprWithNewAddressSpace` doesn't bookkeep uninferred
instructions for later resolving. That results in a failure to infer the
constant address.
- This patch adds support for inferring a constant expression operand
recursively if that operand is another constant expression. This is
safe because constant expressions cannot form a loop.
James Henderson [Fri, 10 May 2019 12:58:52 +0000 (12:58 +0000)]
[llvm-objcopy] Add additional testing for various cases
This patch adds a number of tests to test various cases not covered by
existing tests. All of them work correctly, with no need to change
llvm-objcopy itself, although some do indicate possible areas for
improvement.
Jeremy Morse [Fri, 10 May 2019 10:03:41 +0000 (10:03 +0000)]
[DebugInfo] Use zero linenos for debug intrinsics when promoting dbg.declare
In certain circumstances, optimizations pick line numbers from debug
intrinsic instructions as the new location for altered instructions. This
is problematic because the line number of a debugging intrinsic is
meaningless (it doesn't produce any machine instruction), only the scope
information is valid. The result can be the line number of a variable
declaration "leaking" into real code from debugging intrinsics, making the
line table un-necessarily jumpy, and potentially different with / without
variable locations.
Fix this by using zero line numbers when promoting dbg.declare intrinsics
into dbg.values: this is safe for debug intrinsics as their line numbers
are meaningless, and reduces the scope for damage / misleading stepping
when optimizations pick locations from the wrong place.
Philip Reames [Thu, 9 May 2019 23:23:42 +0000 (23:23 +0000)]
[X86] Improve lowering of idempotent RMW operations
The current lowering uses an mfence. mfences have substantially higher latency than the locked operations originally requested, but we do want to avoid contention on the original cache line. As such, use a locked instruction on a cache line assumed to be thread local.
Lang Hames [Thu, 9 May 2019 23:17:41 +0000 (23:17 +0000)]
[JITLink] Fixed a signedness bug when processing X86_64_RELOC_SUBTRACTOR.
Subtractor relocation addends are signed, so we need to read them via signed
int pointers. Accidentally treating 32-bit addends as unsigned leads to
out-of-range errors when we try to add very large (>INT32_MAX) bogus addends.
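The effect of the misread can be reproduced in a few lines. This is an illustration using Python's `struct` module, not the JITLink code; the target address is a made-up value.

```python
import struct

# Four bytes as they might sit at a fixup location: little-endian -8.
raw = struct.pack("<i", -8)

as_unsigned = struct.unpack("<I", raw)[0]  # misread as an unsigned 32-bit value
as_signed = struct.unpack("<i", raw)[0]    # read as a signed 32-bit value

INT32_MAX = 2**31 - 1
target = 0x1000  # hypothetical address the addend is applied to

print(as_unsigned)                 # 4294967288
print(as_unsigned > INT32_MAX)     # True: the bogus addend is out of range
print(hex(target + as_signed))     # 0xff8
```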
Philip Reames [Thu, 9 May 2019 23:13:09 +0000 (23:13 +0000)]
Compile time tweak for libcall lookup
If we have a large module which is mostly intrinsics, we hammer the lib call lookup path from CodeGenPrepare. Adding a fastpath reduces compile time by 15% for one such example.
The problem is really more general than intrinsics - a module with lots of non-intrinsics non-libcall calls has the same problem - but we might as well avoid an easy case quickly.
Lang Hames [Thu, 9 May 2019 22:03:57 +0000 (22:03 +0000)]
[JITLink] Improve/fix some JITLink debugging output.
Adds full edge details (rather than just edge targets) when out-of-range errors
are generated. Also fixes a bug where debugging output accessed an invalidated
DenseMap iterator by moving the debugging output above the invalidation point.
Use UNSUPPORTED: windows in shtest-timeout.py. Apparently system-windows does not cover all cases either and the case it doesn't cover affects one of the buildbots.
Use UNSUPPORTED: system-windows instead of REQUIRES: nowindows or UNSUPPORTED: windows. nowindows is not currently defined and windows does not cover all cases. system-windows is also consistent with how other platforms are used.
Simon Pilgrim [Thu, 9 May 2019 17:45:01 +0000 (17:45 +0000)]
[X86][SSE] Fold add(shuffle(),shuffle()) to hadd on 'slow' targets (PR39920)
As reported on PR39920, "slow horizontal ops" targets tend to internally expand to 2*shuffle+add/sub - so if we can reduce 2*shuffle+add/sub to a hadd/sub then we should do it - similar port usage but reduced instruction count.
This works out in most cases, although the "PR22377" regression in vector-shuffle-combining.ll is annoying - going from 2*shuffle+add+shuffle to hadd+2*shuffle - I've opened PR41813 to cover this.
Florian Hahn [Thu, 9 May 2019 17:05:52 +0000 (17:05 +0000)]
[DAGCombiner] Limit number of nodes explored as store candidates.
To find the candidates to merge stores we iterate over all nodes in a chain
for each store, which leads to quadratic compile times for large basic blocks
with a large number of stores.
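A capped scan of this kind might look like the sketch below. The function, its parameters, and the cap value used in the example are hypothetical; the message does not state the actual limit.

```python
def find_candidates(chain_users, is_candidate, max_explored):
    """Collect candidates, but stop after looking at `max_explored` nodes."""
    found = []
    for explored, node in enumerate(chain_users):
        if explored >= max_explored:
            break  # give up on the rest of the chain to bound the work
        if is_candidate(node):
            found.append(node)
    return found

# 1000 nodes on the chain, but only the first 10 are examined.
print(find_candidates(range(1000), lambda n: n % 2 == 0, max_explored=10))
```

Bounding the work per store keeps the total cost linear in the number of stores, at the price of possibly missing candidates beyond the cap.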
[MCA] Add support for nested and overlapping region markers
This patch fixes PR41523
https://bugs.llvm.org/show_bug.cgi?id=41523
Regions can now nest/overlap provided that they have different names.
Anonymous regions cannot overlap.
Region end markers must specify the region name. The only exception is for when
there is only one user-defined region; in that particular case, the region end
marker doesn't need to specify a name.
Incorrect region end markers are no longer ignored. Instead, the tool reports an
error and we exit with an error code.
Added test cases to verify the new diagnostic error messages.
Updated the llvm-mca docs to reflect this feature change.
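The naming rules above can be modelled with a small tracker. This is a sketch under assumptions, not the llvm-mca implementation: the class is hypothetical, the empty string stands for an anonymous region, and "only one user-defined region" is read here as "exactly one region currently active".

```python
class RegionError(Exception):
    pass

class RegionTracker:
    def __init__(self):
        self.active = {}  # region name -> start index ("" means anonymous)
        self.closed = []  # (name, start, end) for finished regions

    def begin(self, name, index):
        # Regions may overlap only if their names differ.
        if name in self.active:
            raise RegionError(f"region {name!r} is already active")
        self.active[name] = index

    def end(self, name, index):
        # An unnamed end marker is accepted only when it is unambiguous.
        if name == "" and len(self.active) == 1:
            name = next(iter(self.active))
        if name not in self.active:
            raise RegionError(f"no active region matches end marker {name!r}")
        self.closed.append((name, self.active.pop(name), index))

tracker = RegionTracker()
tracker.begin("outer", 0)
tracker.begin("inner", 2)   # nested inside "outer"
tracker.end("inner", 5)
tracker.end("", 9)          # only "outer" is left, so no name is needed

try:
    tracker.end("missing", 10)  # incorrect end marker: reported, not ignored
    bad_end_rejected = False
except RegionError:
    bad_end_rejected = True

print(tracker.closed)
print(bad_end_rejected)
```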
Pavel Labath [Thu, 9 May 2019 15:13:53 +0000 (15:13 +0000)]
MinidumpYAML: add support for the ThreadList stream
Summary:
The implementation is a pretty straightforward extension of the pattern
used for (de)serializing the ModuleList stream. Since there are other
streams which use the same format (MemoryList and MemoryList64, at
least), I tried to generalize the code a bit so that adding future
streams of this type can be done with less code.
David Stuttard [Thu, 9 May 2019 15:02:10 +0000 (15:02 +0000)]
[CodeGenPrepare] Limit recursion depth for collectBitParts
Summary:
Seeing some issues for windows debug pathological cases with collectBitParts
recursion (1525 levels of recursion!)
Setting the limit to 64 as this should be sufficient - passes all lit cases
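The general shape of such a guard is shown below on a toy expression tree. Only the limit of 64 comes from the message; the tree encoding and function names are hypothetical and unrelated to the real collectBitParts.

```python
MAX_DEPTH = 64  # the limit named in the message

def collect(node, depth=0):
    """Return the leaf names of a nested (op, lhs, rhs) tree, or None if too deep."""
    if depth > MAX_DEPTH:
        return None  # bail out instead of recursing further
    if isinstance(node, str):
        return [node]
    _, lhs, rhs = node
    left = collect(lhs, depth + 1)
    right = collect(rhs, depth + 1)
    if left is None or right is None:
        return None
    return left + right

def chain(n):
    # Build a left-leaning chain of n "or" operations.
    node = "x"
    for _ in range(n):
        node = ("or", node, "y")
    return node

print(collect(chain(3)))      # ['x', 'y', 'y', 'y']
print(collect(chain(1525)))   # None: the pathological depth is rejected
```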
Sanjay Patel [Thu, 9 May 2019 13:43:22 +0000 (13:43 +0000)]
[LoopVectorizer] fix test file to not run the entire -O3 pipeline
This test file has a long history of edits from changes outside
of vectorization, and it would happen again with the proposal in
D61726.
End-to-end testing shouldn't be happening in a test file that is
specifically checking for vector masked load/store ops.
Larger-scale testing goes in PhaseOrdering or the test-suite.
I've hopefully preserved the intent by taking what was completely
unoptimized IR in some tests and passing that through the -O1
pipeline. That becomes the input IR, and now we just run the loop
vectorizer and verify that the vector masked ops are produced as
expected.
Fangrui Song [Thu, 9 May 2019 12:43:37 +0000 (12:43 +0000)]
[llvm-nm] Fix handling of symbol types 't' 'd' 'r'
This restores part of r359311 that was reverted by r359830.
Rewrite the symbol types to fix several issues.
Notable difference is that the type of __init_array_start changes from
't' to 'd'.
GNU nm used to mark ELF symbols relative to .init_array as 't'
https://sourceware.org/bugzilla/show_bug.cgi?id=24505 (before 2.33)
because ".init" is the prefix. The bug was copied by r287803.
Sam Parker [Thu, 9 May 2019 11:56:16 +0000 (11:56 +0000)]
[ARM][CGP] Guard against signext args and sitofp
Add an Argument that has the SExtAttr attached, as well as SIToFP
instructions, as values that generate sign bits. SIToFP doesn't
strictly do this and could be treated as a sink to be sign-extended.
Commit r360221 ("[Support] Add error handling to
sys::Process::getPageSize().", 2019-05-08) seems to have missed these
uses of getPageSize(). Update them to getPageSizeEstimate().
Hans Wennborg [Thu, 9 May 2019 09:22:56 +0000 (09:22 +0000)]
X86WinAllocaExpander: Drop code looking through register copies (PR41786)
This code was never covered by tests, in PR41786 it was pointed out that
the deletion part doesn't work, and in a full Chrome build I was never
able to hit the code path that looks through copies. It seems the
situation it's supposed to handle doesn't actually come up in practice.
Markus Lavin [Thu, 9 May 2019 08:29:04 +0000 (08:29 +0000)]
Make sub-registers index names case sensitive in the MIRParser
Prior to this change sub-register index names are assumed to be lower
case (but they are printed with original casing). This means that if a
target has some upper case characters in its sub-register names then
mir-export directly followed by mir-import is not possible. This also
means that sub-register indices currently are (and will continue to be)
slightly inconsistent with register names which are printed and assumed
to be lower case.
As the current textual representation of mir has a few inconsistencies
in this area it is a bit arbitrary how to address the matter. This
change is towards the direction that we feel is most correct (i.e. case
sensitivity).
Pengfei Wang [Thu, 9 May 2019 08:09:21 +0000 (08:09 +0000)]
Bugfix for nullptr check by klocwork
Klocwork static check:
Pointer from call to function `DebugLoc::operator DILocation *() const`
may be NULL and will be dereferenced in function `printExtendedName`
Patch by Shengchen Kan (skan)
Differential Revision: https://reviews.llvm.org/D61715