Sanjay Patel [Sun, 12 May 2019 14:43:20 +0000 (14:43 +0000)]
[DAGCombiner] try to move bitcast after extract_subvector
I noticed that we were failing to narrow an x86 ymm math op in a case similar
to the 'madd' test diff. That is because a bitcast is sitting between the math
and the extract subvector and thwarting our pattern matching for narrowing:
Don Hinton [Sat, 11 May 2019 20:27:01 +0000 (20:27 +0000)]
[CommandLine] Add long option flag for cl::ParseCommandLineOptions. Part 5 of 5
Summary:
If passed, the long option flag makes the CommandLine parser
mimic the behavior of GNU getopt_long. Short options are a single
character prefixed by a single dash, and long options are multiple
characters prefixed by a double dash.
This patch was motivated by the discussion in the following thread:
http://lists.llvm.org/pipermail/llvm-dev/2019-April/131786.html
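As a rough illustration of the short/long distinction described above, the following C++ sketch classifies a single argument string. `ArgKind` and `classifyArg` are hypothetical names, not part of the CommandLine library, and the sketch deliberately ignores details such as option grouping, a bare `--` terminator, and `=value` suffixes.

```cpp
#include <string>

// Hypothetical helper for illustration only; not LLVM's CommandLine API.
enum class ArgKind { ShortOption, LongOption, Other };

ArgKind classifyArg(const std::string &Arg) {
  // Long option: a double dash followed by at least one character.
  if (Arg.size() > 2 && Arg[0] == '-' && Arg[1] == '-')
    return ArgKind::LongOption;
  // Short option: a single dash followed by exactly one non-dash character.
  if (Arg.size() == 2 && Arg[0] == '-' && Arg[1] != '-')
    return ArgKind::ShortOption;
  // Everything else (positional arguments, a bare "--", etc.).
  return ArgKind::Other;
}
```

Under these assumptions, `-o` is a short option, `--output` is a long option, and `input.ll` is neither.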
Simon Pilgrim [Sat, 11 May 2019 17:12:52 +0000 (17:12 +0000)]
[CostModel][X86] Add min/max reduction costs for all SSE targets
The original costs stopped at SSE42; I've added conservative estimates for everything down to SSE1/SSE2 and moved some of the SSE42 costs to SSE41 (really only the addition of PCMPGT makes any difference).
I've also added missing vXi8 costs (we use PHMINPOSUW for i8/i16 for scarily quick results) and 256-bit vector costs for AVX1.
Richard Trieu [Sat, 11 May 2019 03:36:16 +0000 (03:36 +0000)]
[SystemZ] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
Richard Trieu [Sat, 11 May 2019 02:59:02 +0000 (02:59 +0000)]
[Sparc] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
Richard Trieu [Sat, 11 May 2019 02:43:58 +0000 (02:43 +0000)]
[RISCV] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
Richard Trieu [Sat, 11 May 2019 02:33:18 +0000 (02:33 +0000)]
[PowerPC] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
Richard Trieu [Sat, 11 May 2019 02:09:13 +0000 (02:09 +0000)]
[NVPTX] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
Richard Trieu [Sat, 11 May 2019 01:58:52 +0000 (01:58 +0000)]
[MSP430] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
Richard Trieu [Sat, 11 May 2019 01:38:56 +0000 (01:38 +0000)]
[Mips] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
Richard Trieu [Sat, 11 May 2019 01:25:58 +0000 (01:25 +0000)]
[Lanai] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
Richard Trieu [Sat, 11 May 2019 01:13:21 +0000 (01:13 +0000)]
[BPF] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
Richard Trieu [Sat, 11 May 2019 01:03:03 +0000 (01:03 +0000)]
[AVR] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
Richard Trieu [Sat, 11 May 2019 00:34:07 +0000 (00:34 +0000)]
[ARM] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
Richard Trieu [Sat, 11 May 2019 00:13:01 +0000 (00:13 +0000)]
[ARC] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
Richard Trieu [Sat, 11 May 2019 00:03:35 +0000 (00:03 +0000)]
[AMDGPU] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
Richard Trieu [Fri, 10 May 2019 23:50:01 +0000 (23:50 +0000)]
[AArch64] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
Richard Trieu [Fri, 10 May 2019 23:36:49 +0000 (23:36 +0000)]
[XCore] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
Richard Trieu [Fri, 10 May 2019 23:24:38 +0000 (23:24 +0000)]
[X86] Move InstPrinter files to MCTargetDesc. NFC
For some targets, there is a circular dependency between InstPrinter and
MCTargetDesc. Merging them together will fix this. For the other targets,
the merging is to maintain consistency so all targets will have the same
structure.
Philip Reames [Fri, 10 May 2019 22:55:42 +0000 (22:55 +0000)]
Factor out redzone ABI checks [NFCI]
As requested in D58632, clean up our red zone detection logic in the X86 backend. The existing X86MachineFunctionInfo flag is used to track whether we *use* the redzone (via a particular optimization?), but there's no common way to check whether the function *has* a red zone.
I'd appreciate careful review of the uses being updated. I think they are NFC, but a careful eye from someone else would be appreciated.
Lang Hames [Fri, 10 May 2019 22:24:37 +0000 (22:24 +0000)]
[JITLink][MachO] Mark atoms in sections with 'no-dead-strip' set live by default.
If a MachO section has the no-dead-strip attribute set then its atoms should
be preserved, regardless of whether they're public or referenced elsewhere in
the object.
Craig Topper [Fri, 10 May 2019 22:03:33 +0000 (22:03 +0000)]
[X86] Disable speculative load hardening for operations with an explicit RSP base.
After D58632, we can create idempotent atomic operations to the top of stack.
This confused speculative load hardening because it thinks accesses should have
a virtual register base except for the cases it already excluded.
This commit adds a new exclusion for this case. I'll try to reduce a test case
for this, but this fix was verified to work by the reporter. This should avoid
needing to revert D58632.
Craig Topper [Fri, 10 May 2019 21:42:27 +0000 (21:42 +0000)]
[LegalizeVectorOps] Remove calls to LegalizeOp on the return value from ExpandLoad/ExpandStore.
We already updated the LegalizedNodes map at the end of the Expand call. This
would have marked the new node as being mapped to itself. So the LegalizeOp
call will find that and immediately return.
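The reasoning above can be sketched with a toy memo table. This is only a model under stated assumptions: integer ids stand in for nodes, `Legalizer`, `expand`, and `legalizeOp` are hypothetical names, and the real LegalizedNodes map and Expand routines are considerably more involved.

```cpp
#include <unordered_map>

// Toy model: node ids instead of SDValue, a plain map instead of the real
// LegalizedNodes table. All names here are hypothetical.
struct Legalizer {
  std::unordered_map<int, int> LegalizedNodes;
  int ExpandCalls = 0;

  // Pretend expansion: creates a new node and records both mappings,
  // including the new node mapped to itself.
  int expand(int Node) {
    ++ExpandCalls;
    int NewNode = Node + 1000;
    LegalizedNodes[Node] = NewNode;
    LegalizedNodes[NewNode] = NewNode;
    return NewNode;
  }

  // Returns the recorded result if the node was already handled,
  // otherwise expands it.
  int legalizeOp(int Node) {
    auto It = LegalizedNodes.find(Node);
    if (It != LegalizedNodes.end())
      return It->second;
    return expand(Node);
  }
};
```

In this model, calling `legalizeOp` on the value returned by `expand` hits the self-mapping and returns without doing further work, which is why the extra call is redundant.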
Mircea Trofin [Fri, 10 May 2019 21:27:55 +0000 (21:27 +0000)]
Skip over prefetches
Summary: Skip over prefetches when assigning debug info to instructions with memory operands. This way, the debug info is stable after instrumenting a binary with prefetches, allowing for iterative profiling and instrumentation.
Nikita Popov [Fri, 10 May 2019 20:42:48 +0000 (20:42 +0000)]
[SDAG] Recursively legalize both vector mulo results
Split out from D61692 per RKSimon's suggestion. Vector op
legalization will automatically recursively legalize the returned
SDValue, but we need to take care of the other results ourselves.
Otherwise it will end up getting legalized only during op
legalization, by which point it might be too late (though I'm not
aware of any specific cases right now).
There are codegen differences because expansion occurs earlier now
and we don't get a DAGCombiner run in between.
Teresa Johnson [Fri, 10 May 2019 20:08:24 +0000 (20:08 +0000)]
[ThinLTO] Auto-hide prevailing linkonce_odr only when all copies eligible
Summary:
We hit undefined references building with ThinLTO when one source file
contained explicit instantiations of a template method (weak_odr) but
there were also implicit instantiations in another file (linkonce_odr),
and the latter was the prevailing copy. In this case the symbol was
marked hidden when the prevailing linkonce_odr copy was promoted to
weak_odr. It led to unsatisfied references when the resulting shared library was linked
with other code that contained a reference (expecting to be resolved due
to the explicit instantiation).
Add a CanAutoHide flag to the GV summary to allow the thin link to
identify when all copies are eligible for auto-hiding (because they were
all originally linkonce_odr global unnamed addr), and only do the
auto-hide in that case.
Most of the changes here are due to plumbing the new flag through the
bitcode and llvm assembly, and resulting test changes. I augmented the
existing auto-hide test to check for this situation.
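A minimal sketch of the "only when all copies are eligible" rule, assuming a simplified per-copy record. `CopySummary` and `shouldAutoHide` are hypothetical names; the real GV summary carries many more fields and the thin link logic is not reproduced here.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical, simplified summary record for one copy of a symbol.
struct CopySummary {
  // True if this copy was originally linkonce_odr with global unnamed_addr.
  bool CanAutoHide;
};

// Auto-hide only when every copy of the symbol is eligible. A single
// ineligible copy (e.g. an explicit weak_odr instantiation) blocks hiding.
bool shouldAutoHide(const std::vector<CopySummary> &Copies) {
  return !Copies.empty() &&
         std::all_of(Copies.begin(), Copies.end(),
                     [](const CopySummary &C) { return C.CanAutoHide; });
}
```

With one weak_odr copy among several linkonce_odr copies, `shouldAutoHide` returns false, so the symbol stays visible to external references.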
David Blaikie [Fri, 10 May 2019 19:15:29 +0000 (19:15 +0000)]
DebugInfo: Only move types out of type units if they're named or type united
Follow up to r359122, after a bug was reported in it - the original
change too aggressively tried to move related types out of type units,
which included unnamed types (like array types) which can't reasonably
be declared-but-not-defined.
A step beyond that is that some types in type units can be anonymous, if
they are types with a name for linkage purposes (eg: "typedef struct { }
x;"). So ensure those don't get turned into plain declarations (without
signatures) because, lacking names, they can't be resolved to the
definition.
[Also include a fix for llvm-dwarfdump/libDebugInfoDWARF to pretty print
types in type units]
Amara Emerson [Fri, 10 May 2019 17:29:35 +0000 (17:29 +0000)]
[LSR] Tweak setup cost depth threshold to 10.
The original change introduced a depth limit of 7 which caused a 22% regression
in the Swift MapReduceLazyCollection & Ackermann benchmarks. This new threshold
still ensures that the original test case doesn't hang.
Fangrui Song [Fri, 10 May 2019 17:09:25 +0000 (17:09 +0000)]
[MC][ELF] Copy top 3 bits of st_other to .symver aliases
On PowerPC64 ELFv2 ABI, the top 3 bits of st_other encode the local
entry offset. A versioned symbol alias created by .symver should copy
the bits from the source symbol.
This partly fixes PR41048. A full fix needs tracking of .set assignments
and updating st_other fields when finish() is called, see D56586.
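The bit copy described above amounts to a mask-and-merge on the one-byte st_other field. The sketch below assumes the top 3 bits are bits 5-7 (mask 0xE0); `LocalEntryMask` and `copyLocalEntryBits` are names chosen for this illustration, not identifiers from the patch.

```cpp
#include <cstdint>

// Mask covering the top 3 bits (bits 5-7) of the 8-bit st_other field.
// The name is chosen for this sketch.
constexpr std::uint8_t LocalEntryMask = 0xE0;

// Returns the alias's st_other with its top 3 bits replaced by the source
// symbol's top 3 bits; the alias's low 5 bits are left unchanged.
std::uint8_t copyLocalEntryBits(std::uint8_t AliasOther,
                                std::uint8_t SourceOther) {
  return static_cast<std::uint8_t>((AliasOther & ~LocalEntryMask) |
                                   (SourceOther & LocalEntryMask));
}
```

For example, an alias with st_other 0x02 and a source symbol with st_other 0x60 yields 0x62.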
Momchil Velikov [Fri, 10 May 2019 16:54:32 +0000 (16:54 +0000)]
Adjust MachineScheduler to use ProcResource counts
This fix allows the scheduler to take into account the number of instances of
each ProcResource specified. Previously a declaration in a scheduler of
ProcResource<1> would be treated identically to a declaration of
ProcResource<2>. Now the hazard recognizer reports a hazard only after all
of the resource instances are busy.
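The difference between one and two resource instances can be modelled roughly as follows. This is a toy model, not the MachineScheduler implementation: `ResourceState`, `hasHazard`, and `reserve` are hypothetical names, and each instance is tracked only by the cycle at which it becomes free.

```cpp
#include <vector>

// Toy model of one ProcResource kind with several instances.
// BusyUntil[i] is the first cycle at which instance i is free again.
struct ResourceState {
  std::vector<unsigned> BusyUntil;

  explicit ResourceState(unsigned NumUnits) : BusyUntil(NumUnits, 0) {}

  // A hazard exists only if no instance is free at the given cycle.
  bool hasHazard(unsigned Cycle) const {
    for (unsigned Free : BusyUntil)
      if (Free <= Cycle)
        return false;
    return true;
  }

  // Reserve the first free instance for Latency cycles.
  // Returns false if every instance is busy.
  bool reserve(unsigned Cycle, unsigned Latency) {
    for (unsigned &Free : BusyUntil)
      if (Free <= Cycle) {
        Free = Cycle + Latency;
        return true;
      }
    return false;
  }
};
```

With a single instance, one reservation is enough to cause a hazard on the next cycle; with two instances, a hazard appears only after both have been reserved.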
Robert Lougher [Fri, 10 May 2019 15:55:06 +0000 (15:55 +0000)]
[X86] Avoid SFB - Fix inconsistent codegen with/without debug info
Fixes https://bugs.llvm.org/show_bug.cgi?id=40969
The functions findPotentiallyBlockedCopies and buildCopy are currently not
accounting for the presence of debug instructions. In the former this results
in the optimization not being triggered, and in the latter results in
inconsistent codegen.
This patch enables the optimization to be performed in a debug build and
ensures the codegen is consistent with non-debug builds.
Nemanja Ivanovic [Fri, 10 May 2019 15:44:56 +0000 (15:44 +0000)]
Another attempt to fix the build bot breaks after r360426
The test case checks were produced by the update_test_checks.py
scripts and I assumed that is sufficient. However, the behaviour
is different with different default target triples. Specify the
triple explicitly in the test case.
If this doesn't clean up the build bot breaks, I'll remove the test
case until I can get to the bottom of why the behaviour on build bots
is different from my machine.
Michael Liao [Fri, 10 May 2019 14:57:42 +0000 (14:57 +0000)]
[InferAddressSpaces] Enhance the handling of constexpr.
Summary:
- Constant expressions may not be added in strict postorder by the
forward instruction scan. For example, for a constant expression (CE0), if
its operand (CE1) is used in a previous instruction, they are not in
postorder. However, different from
`cloneInstructionWithNewAddressSpace`,
`cloneConstantExprWithNewAddressSpace` doesn't bookkeep uninferred
instructions for later resolving. That results in failure of inferring
constant address.
- This patch adds support for inferring a constant expression operand
recursively if that operand is another constant expression. This is
safe because constant expressions cannot form a loop.
James Henderson [Fri, 10 May 2019 12:58:52 +0000 (12:58 +0000)]
[llvm-objcopy] Add additional testing for various cases
This patch adds a number of tests to test various cases not covered by
existing tests. All of them work correctly, with no need to change
llvm-objcopy itself, although some do indicate possible areas for
improvement.
Jeremy Morse [Fri, 10 May 2019 10:03:41 +0000 (10:03 +0000)]
[DebugInfo] Use zero linenos for debug intrinsics when promoting dbg.declare
In certain circumstances, optimizations pick line numbers from debug
intrinsic instructions as the new location for altered instructions. This
is problematic because the line number of a debugging intrinsic is
meaningless (it doesn't produce any machine instruction), only the scope
information is valid. The result can be the line number of a variable
declaration "leaking" into real code from debugging intrinsics, making the
line table unnecessarily jumpy, and potentially different with / without
variable locations.
Fix this by using zero line numbers when promoting dbg.declare intrinsics
into dbg.values: this is safe for debug intrinsics as their line numbers
are meaningless, and reduces the scope for damage / misleading stepping
when optimizations pick locations from the wrong place.
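The fix can be pictured as keeping the scope of the original location while zeroing its line. The sketch below uses a simplified, hypothetical `SimpleLoc` record and helper name; LLVM's real DebugLoc also carries column and inlined-at information, which is omitted here.

```cpp
// Hypothetical, simplified location record for illustration.
struct SimpleLoc {
  unsigned Line; // 0 means "no meaningful line"
  int ScopeId;   // stand-in for the lexical scope
};

// When a dbg.declare is promoted to dbg.value, keep its scope but drop
// its line number so the line cannot leak into real code later.
SimpleLoc locForPromotedDbgValue(const SimpleLoc &DeclareLoc) {
  return SimpleLoc{0, DeclareLoc.ScopeId};
}
```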
Philip Reames [Thu, 9 May 2019 23:23:42 +0000 (23:23 +0000)]
[X86] Improve lowering of idempotent RMW operations
The current lowering uses an mfence. mfences have substantially higher latency than the locked operations originally requested, but we do want to avoid contention on the original cache line. As such, use a locked instruction on a cache line assumed to be thread local.
Lang Hames [Thu, 9 May 2019 23:17:41 +0000 (23:17 +0000)]
[JITLink] Fixed a signedness bug when processing X86_64_RELOC_SUBTRACTOR.
Subtractor relocation addends are signed, so we need to read them via signed
int pointers. Accidentally treating 32-bit addends as unsigned leads to
out-of-range errors when we try to add very large (>INT32_MAX) bogus addends.
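The effect of the signedness mistake can be reproduced in isolation. The sketch below is not the JITLink code: both function names are hypothetical, and `memcpy` is used to read the 32-bit field so the example does not depend on pointer alignment.

```cpp
#include <cstdint>
#include <cstring>

// Reads a 32-bit addend as a signed value, then widens it.
// Widening an int32_t sign-extends, so small negative addends stay small.
std::int64_t readSignedAddend32(const unsigned char *FixupBytes) {
  std::int32_t Addend;
  std::memcpy(&Addend, FixupBytes, sizeof(Addend));
  return Addend;
}

// The buggy variant: reads the same bytes as unsigned, then widens.
// Widening a uint32_t zero-extends, so a negative addend turns into a
// very large positive one (> INT32_MAX).
std::int64_t readUnsignedAddend32(const unsigned char *FixupBytes) {
  std::uint32_t Addend;
  std::memcpy(&Addend, FixupBytes, sizeof(Addend));
  return Addend;
}
```

For four 0xFF bytes, the signed read yields -1 while the unsigned read yields 4294967295, which is the kind of bogus addend that triggers an out-of-range error.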
Philip Reames [Thu, 9 May 2019 23:13:09 +0000 (23:13 +0000)]
Compile time tweak for libcall lookup
If we have a large module which is mostly intrinsics, we hammer the libcall lookup path from CodeGenPrepare. Adding a fastpath reduces compile time by 15% for one such example.
The problem is really more general than intrinsics - a module with lots of non-intrinsics non-libcall calls has the same problem - but we might as well avoid an easy case quickly.
Lang Hames [Thu, 9 May 2019 22:03:57 +0000 (22:03 +0000)]
[JITLink] Improve/fix some JITLink debugging output.
Adds full edge details (rather than just edge targets) when out-of-range errors
are generated. Also fixes a bug where debugging output accessed an invalidated
DenseMap iterator by moving the debugging output above the invalidation point.
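The ordering fix for the iterator bug follows a general pattern: read everything needed through the iterator first, then perform the operation that invalidates it. The sketch below uses `std::unordered_map` as a stand-in for DenseMap, and `eraseWithLog` is a hypothetical helper, not code from the patch.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical helper: erases Key from M and returns a debug line
// describing the removed entry.
std::string eraseWithLog(std::unordered_map<int, std::string> &M, int Key) {
  auto It = M.find(Key);
  if (It == M.end())
    return "not found";
  // Build the debug message while the iterator is still valid...
  std::string Msg =
      "erasing " + std::to_string(It->first) + " -> " + It->second;
  // ...and only then invalidate it. Reading It after this line would be
  // undefined behavior.
  M.erase(It);
  return Msg;
}
```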