Simon Atanasyan [Wed, 21 Aug 2019 18:54:51 +0000 (18:54 +0000)]
[mips] Replace call `expandLoadAddress` by `loadAndAddSymbolAddress`. NFC
In case of expanding `lw/sw $reg, symbol($reg)` instruction for PIC it's
enough to call the `loadAndAddSymbolAddress` method. Additional work
performed by the `expandLoadAddress` is not required here.
Amaury Sechet [Wed, 21 Aug 2019 18:51:08 +0000 (18:51 +0000)]
[DAGCombiner] Remove mostly redundant calls to AddToWorklist
Summary:
These calls change the order in which some nodes are processed and so have an effect on codegen.
The change in fixup-bw-copy.ll is due to (and (load anyext)) gets transformed into (load zext) while previously the and was removed by SimplifyDemandedBits, so the (load anyext) remained.
Florian Hahn [Wed, 21 Aug 2019 18:20:11 +0000 (18:20 +0000)]
[BitcodeReader] Check if we can create a null constant for type.
We cannot create null constants for certain types, e.g. VoidTy,
FunctionTy or LabelTy. getNullValue asserts if we pass in an
unsupported type. We should also check for opaque types, but I'm not
sure how.
This fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14795.
Jordan Rupprecht [Wed, 21 Aug 2019 18:00:17 +0000 (18:00 +0000)]
[docs] Convert remaining command guide entries from md to rst.
Summary:
Linking between markdown and rst files is currently not supported very well, e.g. the current llvm-addr2line docs [1] link to "llvm-symbolizer" instead of "llvm-symbolizer.html". This is weirdly broken in different ways depending on which versions of sphinx and recommonmark are being used, so workaround the bug by using rst everywhere.
Mitch Phillips [Wed, 21 Aug 2019 17:53:51 +0000 (17:53 +0000)]
[GWP-ASan] Add public-facing documentation [6].
Summary:
Note: Do not submit this documentation until Scudo support is reviewed and submitted (should be #[5]).
See D60593 for further information.
This patch introduces the public-facing documentation for GWP-ASan, as well as updating the definition of one of the options, which wasn't properly merged. The document describes the design and features of GWP-ASan, as well as how to use GWP-ASan from both a user's standpoint, and development documentation for supporting allocators.
Alina Sbirlea [Wed, 21 Aug 2019 17:00:57 +0000 (17:00 +0000)]
[LoopPassManager + MemorySSA] Only enable use of MemorySSA for LPMs known to preserve it.
Summary:
Add a flag to the FunctionToLoopAdaptor that allows enabling MemorySSA only for the loop pass managers that are known to preserve it.
If an LPM is known to have only loop transforms that *all* preserve MemorySSA, then use MemorySSA if `EnableMSSALoopDependency` is set.
If an LPM has loop passes that do not preserve MemorySSA, then the flag passed is `false`, regardless of the value of `EnableMSSALoopDependency`.
When using a custom loop pass pipeline via `passes=...`, use keyword `loop` vs `loop-mssa` to use MemorySSA in that LPM. If a loop that does not preserve MemorySSA is added while using the `loop-mssa` keyword, that's an error.
Add the new `loop-mssa` keyword to a few tests where a difference occurs when enabling MemorySSA.
Matt Arsenault [Wed, 21 Aug 2019 16:59:10 +0000 (16:59 +0000)]
GlobalISel: Implement moreElementsVector for G_UNMERGE_VALUES sources
This is necessary for handling <3 x s16> on AMDGPU, assuming this
should be handled as 2 separate legalization actions. The alternative
would be for fewerElementsVector to handle 3->2.
[LLVM][Alignment] Introduce Alignment In MachineFrameInfo
Summary:
This is patch is part of a serie to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Sanjay Patel [Wed, 21 Aug 2019 11:56:08 +0000 (11:56 +0000)]
[InstCombine] narrow icmp with extended operands of different widths
An intermediate extend is used to widen the narrow operand to the width of
the other (wider) operand. At that point, we have the same logic as the
existing transform that was restricted to folds of equal width zext/sext.
This mostly solves PR42700:
https://bugs.llvm.org/show_bug.cgi?id=42700
Sam McCall [Wed, 21 Aug 2019 11:37:06 +0000 (11:37 +0000)]
[gtest] Fix printing of StringRef and SmallString in assert messages.
Summary:
These are detected by gtest as containers, and so previously printed as e.g.
{ '.' (46, 0x2E), 's' (115, 0x73), 'e' (101, 0x65), 'c' (99, 0x63), '0' (48, 0x30) },
gtest itself overloads PrintTo for std::string and friends, we use the same mechanism.
Pavel Labath [Wed, 21 Aug 2019 11:30:48 +0000 (11:30 +0000)]
MinidumpYAML: move serialization code to MinidumpEmitter.cpp
Summary:
The code for serializing minidumps was living in MinidumpYAML.cpp
so that it would be accessible from unit tests. While this had its
advantages, it was also unfortunate because it broke symmetry with all
other yaml2obj serializers.
Fortunately, nowadays all of yaml2obj is a library, so we don't need to
do anything special. This patch improves the code consistency by moving
the serialization code to MinidumpEmitter.cpp to match the style used in
other backends. It also removes the writeAsBinary entry point in favor
of the more general convertYAML interface.
This patch is just massaging the code a bit. There shouldn't be any
functional change here.
George Rimar [Wed, 21 Aug 2019 11:07:31 +0000 (11:07 +0000)]
[llvm-objdump] - Cleanup the error reporting.
The error reporting function are not consistent.
Before this change:
* They had inconsistent naming (e.g. 'error' vs 'report_error').
* Some of them reported the object name, others - dont.
* Some of them accepted the case when there was no error. (i.e. error code or Error had a success value).
This patch tries to cleanup it a bit.
It also renames report_error -> reportError, report_warning -> reportWarning
and removes a full stop from messages.
Jeremy Morse [Wed, 21 Aug 2019 09:22:31 +0000 (09:22 +0000)]
[DebugInfo] Avoid dropping location info across block boundaries
LiveDebugValues propagates variable locations between blocks by creating
new DBG_VALUE insts in the successors, then interpreting them when it
passes back through the block at a later time. However, this flushes out
any extra information about the location that LiveDebugValues holds: for
example, connections between variable locations such as discussed in
D65368. And as reported in PR42772 this causes us to lose track of the
fact that a spill-location is actually a spill, not a register location.
This patch fixes that by deferring the creation of propagated DBG_VALUEs
until after propagation has completed: instead location propagation occurs
only by sharing location ID numbers between blocks.
Luke Cheeseman [Wed, 21 Aug 2019 09:09:56 +0000 (09:09 +0000)]
[AArch64] Update MTE system register encodings
The encodings for the system registers TFSRE0_EL1, TFSR_EL1 TFSR_EL2, TFSR_EL3
and TFSR_EL12 have been changed so that they consistently have CRn=5 and CRm=6
as per https://developer.arm.com/docs/ddi0487/latest.
Serge Guelton [Wed, 21 Aug 2019 07:54:42 +0000 (07:54 +0000)]
Be explicit about Windows coff name trailing character policy
It's okay to *not* copy the trailing zero of a windows section/symbol name.
This is compatible with strncpy behavior but gcc doesn't know that and
throws an invalid warning. Encode this behavior in a proper function.
Raphael Isemann [Wed, 21 Aug 2019 07:39:17 +0000 (07:39 +0000)]
[NFC] Mark CallTargetComparator() as const to fix libc++ warnings
We currently get this warning when compiling with libc++:
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/set:454:26: warning: the specified comparator type does not provide a const call operator [-Wuser-defined-warnings]
static_assert(sizeof(__diagnose_non_const_comparator<_Key, _Compare>()), "");
^
llvm-project/llvm/include/llvm/ProfileData/SampleProf.h:193:29: note: in instantiation of template class 'std::__1::set<std::__1::pair<llvm::StringRef, unsigned long long>, llvm::sampleprof::SampleRecord::CallTargetComparator, std::__1::allocator<std::__1::pair<llvm::StringRef, unsigned long long> > >' requested here
const SortedCallTargetSet getSortedCallTargets() const {
^
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/__tree:967:5: note: from 'diagnose_if' attribute on '__diagnose_non_const_comparator<std::__1::pair<llvm::StringRef, unsigned long long>, llvm::sampleprof::SampleRecord::CallTargetComparator>':
_LIBCPP_DIAGNOSE_WARNING(!std::__invokable<_Compare const&, _Tp const&, _Tp const&>::value,
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/__config:1320:21: note: expanded from macro '_LIBCPP_DIAGNOSE_WARNING'
__attribute__((diagnose_if(__VA_ARGS__, "warning")))
^ ~~~~~~~~~~~
1 warning generated.
Chris Bieneman [Wed, 21 Aug 2019 01:48:28 +0000 (01:48 +0000)]
Autogenerate the shebang lines for tools/opt-viewer
Summary:
Since these files depend on the built python modules, they need to use
the right python binary to run them. So use configure_file
to set the right shebang line.
Andrew Trick [Tue, 20 Aug 2019 23:29:28 +0000 (23:29 +0000)]
Add TinyPtrVector support for general pointer-like things.
In particular, make TinyPtrVector<PtrIntPair<T *, 1>> work. Remove all
unnecessary assumptions that the element type has a formal "null"
representation. The important property to maintain is that
default-constructed element type has the same internal representation
as the default-constructed PointerUnion (all zero bits).
Remove the incorrect recursive behavior from
PointerUnion::isNull. This was never generally correct because it only
recursed over the first type parameter. With variadic templates it's
completely unnecessary.
Sean Fertile [Tue, 20 Aug 2019 23:24:47 +0000 (23:24 +0000)]
Fix assert in XCOFFObjectWriter related to program code csects.
Removed code that added program code csects to a collection as part
of addressing review comments, but I failed to update an assert affected
by the change before commiting.
Alina Sbirlea [Tue, 20 Aug 2019 22:29:06 +0000 (22:29 +0000)]
[MemorySSA] Fix existing phis when inserting defs.
Summary:
When inserting a new Def, and inserting Phis in the IDF when needed,
also mark the already existing Phis in the IDF as non-optimized, since
these may need fixing as well.
In the test attached, there is a Phi in the IDF that happens to be
trivial, and is wrongfully removed by the call to getLastDef that
follows. This is a valid situation and the existing IDF Phis need to
marked as "may need fixing" as well.
Resolves PR43044.
Jessica Paquette [Tue, 20 Aug 2019 22:18:06 +0000 (22:18 +0000)]
[AArch64][GlobalISel] Select patterns which use shifted register operands
This adds GlobalISel equivalents for the following from AArch64InstrFormats:
- arith_shifted_reg32
- arith_shifted_reg64
And partial support for
- logical_shifted_reg32
- logical_shifted_reg32
The only thing missing for the logical cases is support for rotates. Other than
the missing support, the transformation is identical for the arithmetic shifted
register and the logical shifted register.
Lots of tests here:
- Add select-arith-shifted-reg.mir to show that we correctly select add and
sub instructions which use this pattern.
- Add select-logical-shifted-reg.mir to cover patterns which are not shared
between the arithmetic and logical cases.
- Update addsub-shifted.ll to show that we correctly fold shifts into
adds/subs.
- Update eon.ll to show that we can select the eon instruction by folding xors.
Craig Topper [Tue, 20 Aug 2019 22:12:50 +0000 (22:12 +0000)]
[DAGCombiner][X86] Teach visitCONCAT_VECTORS to combine (concat_vectors (concat_vectors X, Y), undef)) -> (concat_vectors X, Y, undef, undef)
I also had to add a new combine to X86's combineExtractSubvector to prevent a regression.
This helps our vXi1 code see the full concat operation and allow it optimize undef to a zero if there is already a zero in the concat. This helped us use a movzx instead of an AND in some of the tests. In those tests, one concat comes from SelectionDAGBuilder and the second comes from type legalization of v4i1->i4 bitcasts which uses an additional concat. Though these changes weren't my original motivation.
I'm looking at making X86ISelLowering's narrowShuffle emit a concat_vectors instead of an insert_subvector since concat_vectors is more canonical during early DAG combine. This patch helps prevent a regression from my experiments with that.
Sean Fertile [Tue, 20 Aug 2019 22:03:18 +0000 (22:03 +0000)]
Adds support for writing the .bss section for XCOFF object files.
Adds Wrapper classes for MCSymbol and MCSection into the XCOFF target
object writer. Also adds a class to represent the top-level sections, which we
materialize in the ObjectWriter.
executePostLayoutBinding will map all csects into the appropriate
container depending on its storage mapping class, and map all symbols
into their containing csect. Once all symbols have been processed we
- Assign addresses and symbol table indices.
- Calaculte section sizes.
- Build the section header table.
- Assign the sections raw-pointer value for non-virtual sections.
Since the .bss section is virtual, writing the header table is enough to
add support. Writing of a sections raw data, or of any relocations is
not included in this patch.
Testing is done by dumping the section header table, but it needs to be
extended to include dumping the symbol table once readobj support for
dumping auxiallary entries lands.
Martin Storsjo [Tue, 20 Aug 2019 20:58:02 +0000 (20:58 +0000)]
[test] Fix tests when run on windows after SVN r369426. NFC.
When running tests on windows, invoking "llc -march=<arch>" will
implicitly use windows as the target os, making these tests misbehave
after this change.
Fix the issue by using more specific -mtriple values instead of plain
-march in these tests.
This should hopefully fix buildbot failures like
http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/9816.
Wenlei He [Tue, 20 Aug 2019 20:52:00 +0000 (20:52 +0000)]
[AutoFDO] Make call targets order deterministic for sample profile
Summary:
StringMap is used for storing call target to frequency map for AutoFDO. However the iterating order of StringMap is non-deterministic, which leads to non-determinism in AutoFDO profile output. Now new API getSortedCallTargets and SortCallTargets are added for deterministic ordering and output.
Roundtrip test for text profile and binary profile is added.
Jinsong Ji [Tue, 20 Aug 2019 20:45:16 +0000 (20:45 +0000)]
[llvm-extract] Update the help message for group extraction feature
Summary:
https://reviews.llvm.org/D60973 exposed the group extraction feature of
the BlockExtractor to llvm-extract.
However, the help message was not updated, so users might not be able to
know how to use this feature without looking into history/commits.
This patch just update the help message to show how to use this group
extraction feature.
Craig Topper [Tue, 20 Aug 2019 20:20:04 +0000 (20:20 +0000)]
[X86] Add a DAG combine to transform (i8 (bitcast (v8i1 (extract_subvector (v16i1 X), 0)))) -> (i8 (trunc (i16 (bitcast (v16i1 X))))) on KNL target
Without AVX512DQ we don't have KMOVB so we can't really copy 8-bits of a k-register to a GPR. We have to copy 16 bits instead. We do this even if the DAG copy is from v8i1->v16i1. If we detect the (i8 (bitcast (v8i1 (extract_subvector (v16i1 X), 0)))) we should rewrite the types to match the copy we do support. By doing this, we can help known bits to propagate without losing the upper 8 bits of the input to the extract_subvector. This allows some zero extends to be removed since we have an isel pattern to use kmovw for (zero_extend (i16 (bitcast (v16i1 X))).
Craig Topper [Tue, 20 Aug 2019 19:43:48 +0000 (19:43 +0000)]
[X86] Add isel patterns for (i64 (zext (i8 (bitcast (v16i1 X))))) to use a KMOVW and a SUBREG_TO_REG. Similar for i8 and anyextend.
We already had patterns for extending to i32 to take advantage of
the impliciting zeroing of the upper bits of a 32-bit GPR that is
done by KMOVW/KMOVB. But the extend might be all the way to i64,
in which case the existing patterns would fail and we'd get a
KMOVW/B followed by a MOVZX. By adding patterns for i64 we can
use the fact that KMOVW/B zero the upper bits of the 32-bit GPR
and the normal property that 32-bit GPR writes implicitly zero the
upper 32-bits of the full 64-bit GPR.
The anyextend patterns are slightly different since we don't care
about the upper zeros. For the i8->i64 I think this avoids selecting
the anyextend as a MOVZX to prevent a partial register issue that
doesn't exist. For i16->i64 I think we would have just emitted an
insert_subreg on top of the extract_subreg that the vXi16->i16
bitcast pattern emits. The register coalescer or peephole pass
should combine those, but this saves that work and makes i8/16
consistent.
Sam Clegg [Tue, 20 Aug 2019 18:39:24 +0000 (18:39 +0000)]
[WebAssembly][lld] Fix crash when applying relocations to debug sections
Debug sections are special in that they can contain relocations against
symbols that are not present in the final output (i.e. not live).
However it is also possible to have R_WASM_TABLE_INDEX relocations
against symbols that don't have a table index assigned (since they are
not address taken by actual code.
Thomas Raoux [Tue, 20 Aug 2019 15:54:59 +0000 (15:54 +0000)]
[CodeGen] Add a pass to do block predication on SSA machine IR.
For targets requiring aggressive scheduling and/or software pipeline we need to
apply predication before preRA scheduling. This adds a pass re-using the early
if-cvt infrastructure but generating predicated instructions instead of
speculatively executing instructions. It allows doing if conversion on blocks
containing instructions with side-effects. The pass re-use the target hook from
postRA if-conversion to let the target decide on the heuristic to apply.
Sanjay Patel [Tue, 20 Aug 2019 14:56:44 +0000 (14:56 +0000)]
[InstCombine] improve readability for icmp with cast folds; NFC
1. Update function name and stale code comments.
2. Use variable names that are less ambiguous.
3. Move operand checks into the function as early exits.
Jinsong Ji [Tue, 20 Aug 2019 14:46:02 +0000 (14:46 +0000)]
[BlockExtractor] Avoid assert with wrong line format
Summary:
When the line format is wrong, we may end up accessing out of bound
memory. eg: the test with invalide line will cause assert.
Assertion `idx < size()' failed
The fix is to report fatal when we found mismatched line format.
Sanjay Patel [Tue, 20 Aug 2019 13:39:17 +0000 (13:39 +0000)]
[InstCombine] simplify min/max of min/max with same operands (PR35607)
This is the original integer variant requested in:
https://bugs.llvm.org/show_bug.cgi?id=35607
As noted in the TODO and several similar TODOs around this block,
we could do this in instsimplify, but then it would cost more
because we would be trying to match min/max via ValueTracking
in 2 different places.
There are 4 commuted variants for each of smin/smax/umin/umax
that are not matched here. There are also icmp predicate variants
that are not included in the affected test file because they are
already handled by instsimplify by folding the final icmp to
true/false.
George Rimar [Tue, 20 Aug 2019 13:19:16 +0000 (13:19 +0000)]
[llvm-objdump] - Remove one of `report_error` functions and improve the error reporting.
One of the report_error functions was taking object::Archive::Child as an
argument. It feels excessive, this patch removes it and introduce a helper
function instead. Also I fixed a "TODO" in this patch what improved the message printed.
Igor Kudrin [Tue, 20 Aug 2019 12:52:32 +0000 (12:52 +0000)]
[DWARF] Fix reading 64-bit DWARF type units.
The type_offset field is 8 bytes long in DWARF64. The patch extends
TypeOffset to uint64_t and fixes its reading. The patch also fixes
checking of TypeOffset bounds as it was inaccurate in DWARF64 case.
Alex Bradbury [Tue, 20 Aug 2019 12:32:31 +0000 (12:32 +0000)]
[RISCV] Implement getExprForFDESymbol to ensure RISCV_32_PCREL is used for the FDE location
Follow binutils in using RISCV_32_PCREL for the FDE initial location. As
explained in the relevant binutils commit
<https://github.com/riscv/riscv-binutils-gdb/commit/a6cbf936e3dce68114d28cdf60d510a3f78a6d40>,
the ADD/SUB pair of relocations is problematic in the presence of linker
relaxation.
This patch has the same end goal as D64715 but includes test changes and
avoids adding a new global VariantKind to MCExpr.h (preferring
RISCVMCExpr VKs like the rest of the RISC-V backend).
Pavel Labath [Tue, 20 Aug 2019 12:08:52 +0000 (12:08 +0000)]
Recommit "MemoryBuffer: Add a missing error-check to getOpenFileImpl"
This recommits r368977, which was reverted in r369027 due to test
failures in lldb. The cause of this was different behavior of
readNativeFileSlice on windows and unix. These have been addressed in
r369269.
The original commit message was:
In case the function was called with a desired read size *and* the file
was not an "mmap()" candidate, the function was falling back to a
"pread()", but it was failing to check the result of that system call.
This meant that the function would return "success" even though the read
operation failed, and it returned a buffer full of uninitialized memory.
Simon Pilgrim [Tue, 20 Aug 2019 11:20:05 +0000 (11:20 +0000)]
[CMake] Update C4324 MSVC warning comment to explain its still broken at VS2019
As promised, I've updated the comment for the C4324 MSVC warning that was re-disabled at rL367409 / rG8f823e63e3edf87ab029ba32b68f3eb5d2f392b5 to put it in terms of currently supported VS versions
Andrea Di Biagio [Tue, 20 Aug 2019 10:23:55 +0000 (10:23 +0000)]
[X86][Btver2] Fix latency and throughput of CMPXCHG instructions.
On Jaguar, CMPXCHG has a latency of 11cy, and a maximum throughput of 0.33 IPC.
Throughput is superiorly limited to 0.33 because of the implicit in/out
dependency on register EAX. In the case of repeated non-atomic CMPXCHG with the
same memory location, store-to-load forwarding occurs and values for sequent
loads are quickly forwarded from the store buffer.
Interestingly, the functionality in LLVM that computes the reciprocal throughput
doesn't seem to know about RMW instructions. That functionality only looks at
the "consumed resource cycles" for the throughput computation. It should be
fixed/improved by a future patch. In particular, for RMW instructions, that
logic should also take into account for the write latency of in/out register
operands.
An atomic CMPXCHG has a latency of ~17cy. Throughput is also limited to
~17cy/inst due to cache locking, which prevents other memory uOPs to start
executing before the "lock releasing" store uOP.
CMPXCHG8rr and CMPXCHG8rm are treated specially because they decode to one less
macro opcode. Their latency tend to be the same as the other RR/RM variants. RR
variants are relatively fast 3cy (but still microcoded - 5 macro opcodes).
CMPXCHG8B is 11cy and unfortunately doesn't seem to benefit from store-to-load
forwarding. That means, throughput is clearly limited by the in/out dependency
on GPR registers. The uOP composition is sadly unknown (due to the lack of PMCs
for the Integer pipes). I have reused the same mix of consumed resource from the
other CMPXCHG instructions for CMPXCHG8B too.
LOCK CMPXCHG8B is instead 18cycles.
CMPXCHG16B is 32cycles. Up to 38cycles when the LOCK prefix is specified. Due to
the in/out dependencies, throughput is limited to 1 instruction every 32 (or 38)
cycles dependeing on whether the LOCK prefix is specified or not.
I wouldn't be surprised if the microcode for CMPXCHG16B is similar to 2x
microcode from CMPXCHG8B. So, I have speculatively set the JALU01 consumption to
2x the resource cycles used for CMPXCHG8B.
The two new hasLockPrefix() functions are used by the btver2 scheduling model
check if a MCInst/MachineInst has a LOCK prefix. Calls to hasLockPrefix() have
been encoded in predicates of variant scheduling classes that describe lat/thr
of CMPXCHG.