Alex Brachet [Fri, 12 Jul 2019 10:20:01 +0000 (10:20 +0000)]
[tools] [llvm-nm] Default to reading from stdin not a.out
Summary: This moves away from defaulting to a.out and uses stdin only if stdin has a file redirected to it. This has been discussed on the llvm-dev mailing list [[ https://lists.llvm.org/pipermail/llvm-dev/2019-July/133642.html | here ]].
Simon Atanasyan [Fri, 12 Jul 2019 04:58:45 +0000 (04:58 +0000)]
[mips] Fix JmpLink to texternalsym and tglobaladdr on microMIPS R6
There is no match for the `MipsJmpLink texternalsym` and `MipsJmpLink
tglobaladdr` patterns for microMIPS R6. As a result LLVM incorrectly
selects the `JALRC16` compact 2-byte instruction, which takes a target
instruction address from a register only, and assigns an `R_MIPS_32` relocation
to this instruction. This relocation completely overwrites `JALRC16`
and nearby instructions.
This patch adds the missing matching patterns, selects the `BALC` instruction and
assigns a correct `R_MICROMIPS_PC26_S1` relocation.
Fangrui Song [Fri, 12 Jul 2019 04:51:31 +0000 (04:51 +0000)]
[YAMLIO] Remove trailing spaces when outputting maps
llvm::yaml::Output::paddedKey unconditionally outputs spaces, which
are superfluous if the value to be dumped is a sequence or map.
Change `bool NeedsNewLine` to `StringRef Padding` so that it can be
overridden to `\n` if the value is a sequence or map.
An empty map/sequence is special. It is printed as `{}` or `[]` without
a newline, while a non-empty map/sequence follows a newline. To handle
this distinction, add another variable `PaddingBeforeContainer` and do
the special handling in endMapping/endSequence.
Petr Hosek [Thu, 11 Jul 2019 22:59:23 +0000 (22:59 +0000)]
[sancov] Ignore PC samples with value 0
The sancov runtime for the (Fuchsia) Zircon kernel delivers results
in the standard format, but as the full array of possible samples
with 0 in uncovered slots. That runtime delivers "live" data and
has no final "export" pass to compactify out the uncovered slots,
and it seems silly to require another offline tool just for that.
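The filtering described above can be sketched as follows. This is only an illustration, not the actual sancov change: `dropZeroPCs` is a hypothetical helper name, and the samples are assumed to already be in memory as an array of 64-bit PCs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical helper (not part of sancov): keep only the non-zero PC
// samples, since a "live" runtime reports 0 in every uncovered slot.
std::vector<uint64_t> dropZeroPCs(const std::vector<uint64_t> &Samples) {
  std::vector<uint64_t> Covered;
  Covered.reserve(Samples.size());
  std::copy_if(Samples.begin(), Samples.end(), std::back_inserter(Covered),
               [](uint64_t PC) { return PC != 0; });
  return Covered;
}
```

Doing this while reading the file keeps the on-disk format unchanged, which is why no separate offline compaction tool is needed.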
Leonard Chan [Thu, 11 Jul 2019 22:35:40 +0000 (22:35 +0000)]
[NewPM] Port Sancov
This patch contains a port of SanitizerCoverage to the new pass manager. This one's a bit hefty.
Changes:
- Split SanitizerCoverageModule into 2: SanitizerCoverage for passing over
functions and ModuleSanitizerCoverage for passing over modules.
- ModuleSanitizerCoverage exists for adding 2 module level calls to initialization
functions but only if there's a function that was instrumented by sancov.
- Added legacy and new PM wrapper classes that own instances of the 2 new classes.
- Update llvm tests and add clang tests.
Diego Novillo [Thu, 11 Jul 2019 22:08:35 +0000 (22:08 +0000)]
Fix build errors when LLVM tests are disabled.
Original patch from alanbaker@google.com
Fixes the error:
CMake Error in <...>/llvm/cmake/modules/CMakeLists.txt:
export called with target "LLVMTestingSupport" which requires target
"gtest" that is not in the export set.
This occurs when LLVM is embedded in a larger project, but is configured not to
include tests. If testing is disabled gtest isn't available and LLVM fails to
configure.
Introduce and deduce "nosync" function attribute to indicate that a function
does not synchronize with another thread in a way that the other thread might free memory.
Simon Pilgrim [Thu, 11 Jul 2019 14:45:03 +0000 (14:45 +0000)]
[DAGCombine] narrowInsertExtractVectorBinOp - add CONCAT_VECTORS support
We already split extract_subvector(binop(insert_subvector(v,x),insert_subvector(w,y))) -> binop(x,y).
This patch adds support for extract_subvector(binop(concat_vectors(),concat_vectors())) cases as well.
In particular this means we don't have to wait for X86 lowering to convert concat_vectors to insert_subvector chains, which helps avoid some cases where demandedelts/combine calls occur too late to split large vector ops.
The fast-isel-store.ll load folding regression is annoying but I don't think it is that critical.
The interface predates CallBase, so both it and implementation were
significantly more complicated than they needed to be. There was even
some redundancy that could be eliminated.
Should also help with OpaquePointers by not trying to derive a
function's type from its PointerType.
George Rimar [Thu, 11 Jul 2019 12:26:48 +0000 (12:26 +0000)]
[llvm-readobj/llvm-readelf] - Report a warning instead of an error when dumping a broken dynamic section.
It does not make sense to stop dumping the object if the broken
dynamic section was found. In this patch I changed the behavior from
"report an error" to "report a warning". This matches GNU.
llvm-dis runs out of memory while opening invalid-fcmp-opnum.bc on
llvm-hexagon-elf, probably because the bitcode file contains other
suspicious values.
Fangrui Song [Thu, 11 Jul 2019 10:17:59 +0000 (10:17 +0000)]
[llvm-objcopy] Don't change permissions of non-regular output files
There is currently an EPERM error when a regular user executes `llvm-objcopy a.o /dev/null`.
Worse, root can even change the mode bits of /dev/null.
Fix it by checking if the output file is special.
A new overload of llvm::sys::fs::setPermissions with FD as the parameter
is added. Users should provide `perm & ~umask` as the parameter if they
intend to respect umask.
The existing overload of llvm::sys::fs::setPermissions may be deleted if
we can find an implementation of fchmod() on Windows. fchmod() is
usually better than chmod() because it saves syscalls and can avoid race
conditions.
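A minimal POSIX sketch of the two ideas above: skip special files, and let the caller pass `perm & ~umask`. It does not use the llvm::sys::fs API; `maskedPermissions` and `setPermissionsIfRegular` are hypothetical names, and only fstat() and fchmod() are real calls.

```cpp
#include <cassert>
#include <sys/stat.h>
#include <sys/types.h>

// Hypothetical helper: the mode a caller would pass if it wants to respect
// a given umask, i.e. `perm & ~umask`.
mode_t maskedPermissions(mode_t Perm, mode_t Umask) { return Perm & ~Umask; }

// Hypothetical helper: change the mode bits of an open descriptor only when
// it refers to a regular file. Returns 0 on success or when the file is
// special (nothing to do), and -1 if fstat() or fchmod() fails.
int setPermissionsIfRegular(int FD, mode_t Perm) {
  struct stat St;
  if (::fstat(FD, &St) != 0)
    return -1;
  if (!S_ISREG(St.st_mode))
    return 0; // e.g. /dev/null: leave its mode bits alone
  return ::fchmod(FD, Perm);
}
```

Working on the descriptor rather than the path means the file that was checked is the same file that gets modified, which is the race that fchmod() avoids.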
Fangrui Song [Thu, 11 Jul 2019 10:10:09 +0000 (10:10 +0000)]
[X86] -fno-plt: use GOT __tls_get_addr only if GOTPCRELX is enabled
Summary:
As of binutils 2.32, ld has a bogus TLS relaxation error when the GD/LD
code sequence using R_X86_64_GOTPCREL (instead of R_X86_64_GOTPCRELX) is
attempted to be relaxed to IE/LE (binutils PR24784). gold and lld are good.
In gcc/config/i386/i386.md, there is a configure-time check of as/ld
support and the GOT relaxation will not be used if as/ld doesn't support
it:
if (flag_plt || !HAVE_AS_IX86_TLS_GET_ADDR_GOT)
return "call\t%P2";
return "call\t{*%p2@GOT(%1)|[DWORD PTR %p2@GOT[%1]]}";
In clang, -DENABLE_X86_RELAX_RELOCATIONS=OFF is the default. The ld.bfd
bogus error can be reproduced with:
thread_local int a;
int main() { return a; }
clang -fno-plt -fpic a.cc -fuse-ld=bfd
GOTPCRELX gained relatively good support in 2016, which is considered
relatively new. It is even difficult to conditionally default to
-DENABLE_X86_RELAX_RELOCATIONS=ON due to cross compilation reasons. So
work around the ld.bfd bug by only using GOT when GOTPCRELX is enabled.
Sam Parker [Thu, 11 Jul 2019 09:56:15 +0000 (09:56 +0000)]
[ARM][LowOverheadLoops] Correct offset checking
This patch addresses a couple of problems:
1) The maximum supported offset of LE is -4094.
2) The offset of WLS also needs to be checked; this uses a
maximum positive offset of 4094.
The use of BasicBlockUtils has been changed because the block offsets
weren't being initialised, but the isBBInRange checks both positive
and negative offsets.
ARMISelLowering has been tweaked because the test case presented
another pattern that we weren't supporting.
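The two offset limits listed above can be sketched as simple range checks. This is an illustration only, not the code in the patch: the function and constant names are hypothetical, and treating the limits as inclusive and a zero offset as valid are assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Limits taken from the description above; inclusive bounds and a valid
// zero offset are assumptions of this sketch.
constexpr int64_t MaxLEBackwardOffset = -4094;
constexpr int64_t MaxWLSForwardOffset = 4094;

// LE branches backwards, so only non-positive offsets down to -4094 fit.
bool isLEOffsetInRange(int64_t Offset) {
  return Offset <= 0 && Offset >= MaxLEBackwardOffset;
}

// WLS branches forwards, so only non-negative offsets up to 4094 fit.
bool isWLSOffsetInRange(int64_t Offset) {
  return Offset >= 0 && Offset <= MaxWLSForwardOffset;
}
```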
Simon Tatham [Thu, 11 Jul 2019 09:52:15 +0000 (09:52 +0000)]
[ARM] Remove nonexistent unsigned forms of MVE VQDMLAH.
The VQDMLAH.U8, VQDMLAH.U16 and VQDMLAH.U32 instructions don't
actually exist: the Armv8.1-M architecture spec only lists signed
forms of that instruction. The unsigned ones were added in error: they
existed in an early draft of the spec, but they were removed before
the public version, and we missed that particular spec change.
Also affects the variant forms VQDMLASH, VQRDMLAH and VQRDMLASH.
Petar Avramovic [Thu, 11 Jul 2019 09:28:34 +0000 (09:28 +0000)]
[MIPS GlobalISel] Skip copies in addUseDef and addDefUses
Skip copies between virtual registers during the search for UseDefs
and DefUses.
Since each operand has one def, the search for UseDefs is straightforward.
But since an operand can have many uses, we have to check all uses of
each copy we traverse during the search for DefUses.
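The asymmetry between the two searches can be sketched on a toy model. The `Instr` struct and both function names are hypothetical and do not correspond to the MachineInstr API; the model also ignores the virtual/physical register distinction.

```cpp
#include <cassert>
#include <vector>

// Toy instruction: a copy has exactly one source def; any value can have
// many users.
struct Instr {
  bool IsCopy = false;
  const Instr *SrcDef = nullptr;   // def of the copied value (copies only)
  std::vector<const Instr *> Uses; // users of the value this instr defines
};

// UseDef direction: one def per operand, so follow the chain in a loop.
const Instr *skipCopiesToDef(const Instr *Def) {
  while (Def && Def->IsCopy && Def->SrcDef)
    Def = Def->SrcDef;
  return Def;
}

// DefUse direction: every copy may fan out, so visit all uses of each copy.
void collectUsesThroughCopies(const Instr *Def,
                              std::vector<const Instr *> &Out) {
  for (const Instr *U : Def->Uses) {
    if (U->IsCopy)
      collectUsesThroughCopies(U, Out);
    else
      Out.push_back(U);
  }
}
```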
Petar Avramovic [Thu, 11 Jul 2019 09:22:49 +0000 (09:22 +0000)]
[MIPS GlobalISel] RegBankSelect for chains of ambiguous instructions
When one of the uses/defs of an ambiguous instruction is also ambiguous,
visit it recursively and search its uses/defs for an instruction with
only one mapping available.
When all instructions in a chain are ambiguous, an arbitrary mapping can
be selected. For s64 operands in an ambiguous chain fprb is selected since
it results in fewer instructions than having to narrow scalar s64 to s32.
For s32 both gprb and fprb result in the same number of instructions and
gprb is selected as the general-purpose option.
At the moment we always avoid cross register bank copies.
TODO: Implement a model for costs calculations of different mappings
on same instruction and cross bank copies. Allow cross bank copies
when appropriate according to cost model.
Sam Parker [Thu, 11 Jul 2019 07:47:50 +0000 (07:47 +0000)]
[ARM][ParallelDSP] Change the search for smlads
Two functional changes have been made here:
- Now search up from any add instruction to find the chains of
operations that we may turn into a smlad. This allows the
generation of a smlad which doesn't accumulate into a phi.
- The search function has been corrected to stop it falsely searching
up through an invalid path.
The bulk of the changes have been making the Reduction struct a class
and making it more C++y with getters and setters.
Mikael Holmen [Thu, 11 Jul 2019 07:07:23 +0000 (07:07 +0000)]
[test] Silence gcc 7.4 warning [NFC]
Without this gcc 7.4.0 complains with
../unittests/Analysis/ValueTrackingTest.cpp:937:66: error: ISO C++11 requires at least one argument for the "..." in a variadic macro [-Werror]
::testing::ValuesIn(IsBytewiseValueTests));
^
Replace three "strip & accumulate" implementations with a single one
This patch replaces the three almost identical "strip & accumulate"
implementations for constant pointer offsets with a single one,
combining the respective functionalities. The old interfaces are kept
for now.
[NFC] Adjust "invalid.ll.bc" tests to check for AttrKind #255 not #63
We are about to add enum attributes with AttrKind numbers >= 63. This
means we cannot use AttrKind #63 to test for an invalid attribute number
in the RAW format anymore. This patch changes the number of an invalid
attribute to #255. There is no change to the character of the tests.
[X86] Don't convert 8 or 16 bit ADDs to LEAs on Atom in FixupLEAPass.
We use the functions that convert to three address to do the
conversion, but converting an 8 or 16 bit ADD will cause it to create
a virtual register. This can't be done after register allocation
where this pass runs.
I've switched the pass completely to a white list of instructions
that can be converted to LEA instead of a blacklist that was
incorrect. This will avoid surprises if we enhance the three
address conversion function to include additional instructions
in the future.
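The difference between the two list styles can be sketched as below. The `Opcode` enum and `mayConvertToLEA` are hypothetical stand-ins, not the real X86 opcode enum or the set of instructions the pass accepts; the point is only that anything not explicitly listed is rejected.

```cpp
#include <cassert>

// Hypothetical opcode set for illustration; not the real X86 enum values.
enum class Opcode { ADD8rr, ADD16rr, ADD32rr, ADD64rr, SUB32rr };

// Allow-list style: only opcodes named here are converted. An opcode added
// to the conversion function later stays rejected until it is listed.
bool mayConvertToLEA(Opcode Opc) {
  switch (Opc) {
  case Opcode::ADD32rr:
  case Opcode::ADD64rr:
    return true;
  default:
    return false;
  }
}
```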
This patch doesn't work with binaries built w/ `--emit-relocs`, e.g.
```
$ echo 'int main() { return 0; }' | clang -Wl,--emit-relocs -x c - -o foo && llvm-objcopy --strip-unneeded foo
llvm-objcopy: error: 'foo': not stripping symbol '__gmon_start__' because it is named in a relocation
```
...then flipping the operands in the compare instruction can allow using a subtract that sets compare flags.
Motivated by diffs in D58875 - not sure if this changes anything there,
but this seems like a good thing independent of that.
There's a more involved version of this transform already in IR (in instcombine
although that seems misplaced to me) - see "swapMayExposeCSEOpportunities()".
David Tenty [Wed, 10 Jul 2019 22:13:55 +0000 (22:13 +0000)]
[NFC] Fix IR/MC dependency issue for function descriptor SDAG implementation
Summary: llvm/IR/GlobalValue.h can't be included in MC, as that creates a circular dependency between the MC and IR libraries. This circular dependency is causing an issue for build systems that enforce layering.
[AArch64][GlobalISel] Optimize compare and branch cases with G_INTTOPTR and unknown values.
Since we have distinct types for pointers and scalars, G_INTTOPTRs can sometimes
obstruct attempts to find constant source values. These usually come about when
we try to do some kind of null pointer check. Teaching getConstantVRegValWithLookThrough
about this operation allows the CBZ/CBNZ optimization to catch more cases.
This change also improves the case where we can't find a constant source at all.
Previously we would emit a cmp, cset and tbnz for that. Now we try to just emit
a cmp and conditional branch, saving an instruction.
The cumulative code size improvement of this change plus D64354 is 5.5% geomean
on arm64 CTMark -O0.
[GlobalISel][AArch64][NFC] Use getDefIgnoringCopies from Utils where we can
There are a few places where we walk over copies throughout
AArch64InstructionSelector.cpp. In Utils, there's a function that does exactly
this which we can use instead.
Note that the utility function works with the case where we run into a COPY
from a physical register. We've run into bugs with this a couple times, so using
it should defend us from similar future bugs.
Also update opt-fold-compare.mir to show that we still handle physical registers
properly.
Michael Berg [Wed, 10 Jul 2019 18:23:26 +0000 (18:23 +0000)]
Move three folds for FADD, FSUB and FMUL in the DAG combiner away from Unsafe to more aligned checks that reflect context
Summary: Unsafe alone does not map well to each of these three cases, as it is missing NoNan context when accessed directly with clang. I have migrated the fold guards to reflect the expectations of handling nan and zero contexts directly (NoNan, NSZ) and some tests with it. Unsafe does include NSZ; however, there is already precedent for using the target option directly to reflect that context.