Diana Picus [Thu, 2 May 2019 09:28:00 +0000 (09:28 +0000)]
[ARM GlobalISel] Select extensions to < 32 bits
Select G_SEXT and G_ZEXT with destination types smaller than 32 bits in
the exact same way as 32 bits. This overwrites the higher bits, but that
should be ok since all legal users of types smaller than 32 bits ignore
those bits anyway.
Kang Zhang [Thu, 2 May 2019 08:15:13 +0000 (08:15 +0000)]
[NFC][PowerPC] Return early if the element type is not byte-sized in combineBVOfConsecutiveLoads
Summary:
Based on the Eli Friedman's comments in https://reviews.llvm.org/D60811 , we'd better return early if the element type is not byte-sized in `combineBVOfConsecutiveLoads`.
Pavel Labath [Thu, 2 May 2019 07:45:42 +0000 (07:45 +0000)]
Object/Minidump: Add support for the ThreadList stream
Summary:
The stream contains the list of threads belonging to the process
described by the minidump. Its structure is the same as the ModuleList
stream, and in fact, I have generalized the ModuleList reading code to
handle this stream too.
Fangrui Song [Thu, 2 May 2019 05:58:09 +0000 (05:58 +0000)]
[Support] Don't check MAP_ANONYMOUS, just use MAP_ANON
Though being marked "deprecated" by the Linux man-pages project
(MAP_ANON is a synonym of MAP_ANONYMOUS), it is the mostly widely
available macro - many systems that don't provide MAP_ANONYMOUS have
MAP_ANON. MAP_ANON is also used here and there in compiler-rt.
Nico Weber [Thu, 2 May 2019 01:52:24 +0000 (01:52 +0000)]
lld-link: Make "duplicate resource" error message a bit more concise
Reduces the error message from:
lld-link: error: failed to parse .res file: duplicate resource: type STRINGTABLE (ID 6)/name ID 3/language 1033, in test1.res and in test2.res
To:
lld-link: error: duplicate resource: type STRINGTABLE (ID 6)/name ID 3/language 1033, in test1.res and in test2.res
Make sure every error message emitted by cvtres contains the name of at
least one ".res" file, so that removing the "failed to parse .res file"
string doesn't lose information.
Bob Haarman [Thu, 2 May 2019 00:37:36 +0000 (00:37 +0000)]
remove inalloca parameters in globalopt and simplify argpromotion
Summary:
Inalloca parameters require special handling in some optimizations.
This change causes globalopt to strip the inalloca attribute from
function parameters when it is safe to do so, removes the special
handling for inallocas from argpromotion, and replaces it with a
simple check that causes argpromotion to skip functions that receive
inallocas (for when the pass is invoked on code that didn't run
through globalopt first). This also avoids a case where argpromotion
would incorrectly try to pass an inalloca in a register.
Summary:
This patch is part of a patch series to add support for FileCheck
numeric expressions. This specific patch introduces the @LINE numeric
expressions.
This commit introduces a new syntax to express a relation a numeric
value in the input text must have with the line number of a given CHECK
pattern: [[#<@LINE numeric expression>]]. Further commits build on that
to express relations between several numeric values in the input text.
To help with naming, regular variables are renamed into pattern
variables and old @LINE expression syntax is referred to as legacy
numeric expression.
Compared to existing @LINE expressions, this new syntax allow arbitrary
spacing between the component of the expression. It offers otherwise the
same functionality but the commit serves to introduce some of the data
structure needed to support more general numeric expressions.
Copyright:
- Linaro (changes up to diff 183612 of revision D55940)
- GraphCore (changes in later versions of revision D55940 and
in new revision created off D55940)
Nico Weber [Wed, 1 May 2019 20:00:45 +0000 (20:00 +0000)]
Try to unbreak sphinx bot after r359714
The now-correctly-referenced label dbi_type_server_map_substream didn't
exist. Rewrite things a bit after looking at NewDBIHdr in dbi.h and its
use in dbi.cpp in the reference implementation.
Nico Weber [Wed, 1 May 2019 19:34:00 +0000 (19:34 +0000)]
Make check-clang depend on the clang-check binary always
check-clang (the target that runs all clang tests) used to
only depend on clang-check (a binary like clang-tidy,
clang-refactor, etc) if the static analyzer is enabled.
However, several lit tests call clang-check unconditionally,
so always depend on it.
Fixes a "could not find clang-check" lit warning in clean builds with
the static analyzer disabled.
Also sort the deps in the CMake file and put just one dep on each line.
Nico Weber [Wed, 1 May 2019 19:29:30 +0000 (19:29 +0000)]
Minor tweaks to PDB docs
- Fix a broken link
- Some spelling fixes
- Remove an unnecessary "amortized"
- Don't say "log(n) random access"; "random access" means O(1)
- Make MSF overview a bit more concise
Simon Pilgrim [Wed, 1 May 2019 17:13:35 +0000 (17:13 +0000)]
[X86][SSE] Fold scalar horizontal add/sub for non-0/1 element extractions
We already perform horizontal add/sub if we extract from elements 0 and 1, this patch extends it to non-0/1 element extraction indices (as long as they are from the lowest 128-bit vector).
Daniel Sanders [Wed, 1 May 2019 16:52:29 +0000 (16:52 +0000)]
[globalisel] Update the legalizer documentation
Summary:
* The getActionDefinitionsBuilder() is now documented.
* Includes descriptions of the various actions (legal*, widenScalar*, lower*,
etc).
* Includes descriptions of the various predicates (*If, *For,
*ForCartesianProduct, etc.)
* Includes the rule-order details
* Removed the out-of-date prohibition on non-power-of-2 types.
* Removed the Vector types section since it was incorrect and vectors follow the
same ruleset as scalars. They're only special in the sense that more of the
actions and predicates are meaningful for them (e.g. moreElements).
* Clarified the position on context sensitive legality (which is not permitted)
and contrasted this with context sensitive legalization (which is permitted).
Nico Weber [Wed, 1 May 2019 16:45:15 +0000 (16:45 +0000)]
Option spell checking: Penalize delimiter flags if input has no argument
If the user passes a flag like `-version` to a program, it's more likely
they mean `--version` than `-version:`, since there's no parameter
passed. Hence, give delimited arguments a penalty of 1 if the user input
doesn't contain the delimiter or no data after it.
The motivation is that with this, lld-link can suggest "--version"
instead of "-version:" for "-version" and "-nodefaultlib" instead of
"-nodefaultlib:" for "-nodefaultlibs".
Teresa Johnson [Wed, 1 May 2019 16:26:59 +0000 (16:26 +0000)]
[ThinLTO] Fix unreachable code when parsing summary entries.
Summary:
Early returns were causing some code to be skipped. This was missed
since the summary entries are typically at the end of the llvm assembly
file.
Keno Fischer [Wed, 1 May 2019 15:58:24 +0000 (15:58 +0000)]
[SCEV] Use isKnownViaNonRecursiveReasoning for smax simplification
Summary:
Commit
rL331949: SCEV] Do not use induction in isKnownPredicate for simplification umax
changed the codepath for umax from isKnownPredicate to
isKnownViaNonRecursiveReasoning to avoid compile time blow up (and as
I found out also stack overflows). However, there is an exact copy of
the code for umax that was lacking this change. In D50167 I want to unify
these codepaths, but to avoid that being a behavior change for the smax
case, pull this independent bit out of it.
Hubert Tong [Wed, 1 May 2019 15:47:16 +0000 (15:47 +0000)]
[lit][tests][AIX] Update expected form of diagnostic messages; use `not` to normalize non-zero exit values
Summary:
Various tests in the `lit` testing suite expect specific return codes
and forms of diagnostic message from utility programs. As per
POSIX.1-2017 XCU Section 1.4, Utility Description Defaults, "[the]
format of diagnostic messages for most utilities is unspecified".
The STDERR subsections of the `cat` and `wc` utilities merely indicate
that "[the] standard error shall be used only for diagnostic messages".
The corresponding EXIT STATUS subsections merely indicate, with regard
to errors, an exit value of >0.
The affected tests are updated to accept the applicable diagnostic
message as produced by the utilities on AIX. The exit value is
normalized using `not` as necessary.
Hubert Tong [Wed, 1 May 2019 15:36:18 +0000 (15:36 +0000)]
[tests] Add host-byteorder-*-endian; update XFAILs of big-endian triples
Summary:
Triple components in `XFAIL` lines are tested against the target triple.
Various tests that are expected to fail on big-endian hosts are marked
as being `XFAIL` for big-endian targets. This patch corrects these tests
by having them test against a new `host-byteorder-big-endian` feature.
Nico Weber [Wed, 1 May 2019 14:46:17 +0000 (14:46 +0000)]
Fix OptTable::findNearest() adding delimiter for free
Prior to this, OptTable::findNearest() thought that the input `--foo`
had an editing distance of 0 from an existing flag `--foo=`, which made
it suggest flags with delimiters more often than flags without one.
After this, it correctly assigns this case an editing distance of 1.
Keno Fischer [Wed, 1 May 2019 14:39:11 +0000 (14:39 +0000)]
[LoopInfo] Faster implementation of setLoopID. NFC.
Summary:
This change was part of D46460. However, in the meantime rL341926 fixed the
correctness issue here. What remained was the performance issue in setLoopID
where it would iterate through all blocks in the loop and their successors,
rather than just the predecessor of the header (the later presumably being
much faster). We already have the `getLoopLatches` to compute precisely these
basic blocks in an efficient manner, so just use it (as the original commit
did for `getLoopID`).
Tim Northover [Wed, 1 May 2019 12:37:30 +0000 (12:37 +0000)]
DAG: allow DAG pointer size different from memory representation.
In preparation for supporting ILP32 on AArch64, this modifies the SelectionDAG
builder code so that pointers are allowed to have a larger type when "live" in
the DAG compared to memory.
Pointers get zero-extended whenever they are loaded, and truncated prior to
stores. In addition, a few not quite so obvious locations need updating:
* A GEP that has not been marked inbounds needs to enforce the IR-documented
2s-complement wrapping at the memory pointer size. Inbounds GEPs are
undefined if they overflow the address space, so no additional operations
are needed.
* Signed comparisons would give incorrect results if performed on the
zero-extended values.
This shouldn't affect CodeGen for now, but will become active when the AArch64
ILP32 support is committed.
Fangrui Song [Wed, 1 May 2019 09:28:24 +0000 (09:28 +0000)]
[llvm-readobj] Change -t to --symbols in tests. NFC
-t is --symbols in llvm-readobj but --section-details (unimplemented) in readelf.
The confusing option should not be used since we aim for improving
compatibility.
Keep just one llvm-readobj -t use case in test/tools/llvm-readobj/symbols.test
Fangrui Song [Wed, 1 May 2019 05:27:20 +0000 (05:27 +0000)]
[llvm-readobj] Change -long-option to --long-option in tests. NFC
We use both -long-option and --long-option in tests. Switch to --long-option for consistency.
In the "llvm-readelf" mode, -long-option is discouraged as it conflicts with grouped short options and it is not accepted by GNU readelf.
While updating the tests, change llvm-readobj -s to llvm-readobj -S to reduce confusion ("s" is --section-headers in llvm-readobj but --symbols in llvm-readelf).
Lang Hames [Wed, 1 May 2019 02:43:52 +0000 (02:43 +0000)]
[JITLink] Make sure we explicitly deallocate memory on failure.
JITLinkGeneric phases 2 and 3 (focused on applying fixups and finalizing memory,
respectively) may fail for various reasons. If this happens, we need to
explicitly de-allocate the memory allocated in phase 1 (explicitly, because
deallocation may also fail and so is implemented as a method returning error).
No testcase yet: I am still trying to decide on the right way to test totally
platform agnostic code like this.
GNU objcopy uses bfd_elf_get_default_section_type to decide the candidate section type,
which roughly translates to our [a] (I assume SEC_COMMON implies SHF_ALLOC):
I suggest we just ignore this clause and follow the common case
behavior, which is done in this patch. Rationales to do so:
If --set-section-flags is a no-op (osec->flags == isec->flags)
(corresponds to the "readonly" test in set-section-flags.test), GNU
objcopy will require (Sec.Flags & ELF::SHF_ALLOC). [a] is essentially:
This special case is not really useful. Non-SHF_ALLOC SHT_NOBITS
sections do not make much sense and it doesn't matter if they are
SHT_NOBITS or SHT_PROGBITS.
For all other RUN lines in set-section-flags.test, the new behavior
matches GNU objcopy, i.e. this patch improves compatibility.
Philip Reames [Tue, 30 Apr 2019 23:09:26 +0000 (23:09 +0000)]
[InstCombine] Limit a vector demanded elts rule which was producing invalid IR.
The demanded elts rules introduced for GEPs in https://reviews.llvm.org/rL356293 replaced vector constants with undefs (by design). It turns out that the LangRef disallows such cases when indexing structs. The right fix is probably to relax the langref requirement, and update other passes to expect the result, but for the moment, limit the transform to avoid compiler crashes.
This should fix https://bugs.llvm.org/show_bug.cgi?id=41624.
Dan Gohman [Tue, 30 Apr 2019 23:04:49 +0000 (23:04 +0000)]
[WebAssembly] Test the "wasm32-wasi" triple
Add triple tests for "wasm32-wasi" and "wasm64-wasi", and also remove the
"-musl" component from the existing wasm triple tests as we're not using that
in practice (WASI libc is derived in part from musl, but it is not fully
musl-compatible).
[AliasAnalysis/NewPassManager] Invalidate AAManager less often.
Summary:
This is a redo of D60914.
The objective is to not invalidate AAManager, which is stateless, unless
there is an explicit invalidate in one of the AAResults.
To achieve this, this patch adds an API to PAC, to check precisely this:
is this analysis not invalidated explicitly == is this analysis not abandoned == is this analysis stateless, so preserved without explicitly being marked as preserved by everyone
[PassManagerBuilder] Add option for interleaved loops, for loop vectorize.
Summary:
Match NewPassManager behavior: add option for interleaved loops in the
old pass manager, and use that instead of the flag used to disable loop unroll.
No changes in the defaults.
Lang Hames [Tue, 30 Apr 2019 21:27:56 +0000 (21:27 +0000)]
[ORC][JITLink] Name in-memory compiled objects after their source modules.
In-memory compiled object buffer identifiers will now be derived from the
identifiers of their source IR modules. This makes it easier to connect
in-memory objects with their source modules in debugging output.
Dan Gohman [Tue, 30 Apr 2019 19:17:59 +0000 (19:17 +0000)]
[WebAssembly] Support f16 libcalls
Add support for f16 libcalls in WebAssembly. This entails adding signatures
for the remaining F16 libcalls, and renaming gnu_f2h_ieee/gnu_h2f_ieee to
truncsfhf2/extendhfsf2 for consistency between f32 and f64/f128 (compiler-rt
already supports this).
[X86] If PreprocessISelDAG reorders a load before a call, make sure we remove dead nodes from the graph
The reordering can leave at least a dead TokenFactor in the graph. This cause the linearize scheduler to fail with something like the assert seen in PR22614. This is only one of many ways we can break the linearize scheduler today so I can't say for sure that any of the other failures in that bug were caused by this issue.
This takes the heavy hammer approach of just running RemoveDeadNodes unconditionally at the end of the PreprocessISelDAG. If this turns out to be a compile time hit, we can try to refine it.
[X86] Initial cleanups on the FixupLEAs pass. Separate Atom LEA creation from other LEA optimizations.
This removes some of the class variables. Merge basic block processing into
runOnMachineFunction to keep the flags local.
Pass MachineBasicBlock around instead of an iterator. We can get the iterator in
the few places that need it. Allows a range-based outer for loop.
Separate the Atom optimization from the rest of the optimizations. This allows
fixupIncDec to create INC/DEC and still allow Atom to turn it back into LEA
when profitable by its heuristics.
I'd like to improve fixupIncDec to turn LEAs into ADD any time the base or index
register is equal to the destination register. This is profitable regardless of
the various slow flags. But again we would want Atom to be able to undo that.
Re-reland "[Option] Fix PR37006 prefix choice in findNearest"
This was first reviewed in https://reviews.llvm.org/D46776 and
landed in r332299, but got reverted because it broke the PS4
bots.
https://reviews.llvm.org/D50410 fixed this, and then this
change was re-reviewed at https://reviews.llvm.org/D50515 and
relanded in r341329. It got reverted due to causing MSan issues.
However, nobody wrote down the error message and the bot link
is dead, so I'm relanding this to capture the MSan error.
I'll then either fix it, or copy it somewhere and revert if
fixing looks difficult.
Russell Gallop [Tue, 30 Apr 2019 15:35:16 +0000 (15:35 +0000)]
Add llvm-profdata to LLVM_TOOLCHAIN_TOOLS
This is required for using PGO on Windows but isn't in the Windows
release packages. Windows packages are built with
LLVM_INSTALL_TOOLCHAIN_ONLY so only includes llvm "tools" listed here.