Support a new assembler directive, .import_global, to declare imported
global variables (i.e. those with external linkage and no
initializer). The linker turns these into wasm imports.
Matthias Braun [Wed, 30 Nov 2016 23:49:01 +0000 (23:49 +0000)]
Move most EH from MachineModuleInfo to MachineFunction
Most of the exception handling members in MachineModuleInfo is actually
per function data (talks about the "current function") so it is better
to keep it at the function instead of the module.
This is a necessary step to have machine module passes work properly.
Also:
- Rename TidyLandingPads() to tidyLandingPads()
- Use doxygen member groups instead of "//===- EH ---"... so it is clear
where a group ends.
- I had to add an ugly const_cast at two places in the AsmPrinter
because the available MachineFunction pointers are const, but the code
wants to call tidyLandingPads() in between
(markFunctionEnd()/endFunction()).
Matthias Braun [Wed, 30 Nov 2016 23:48:26 +0000 (23:48 +0000)]
MCStreamer: Use "cfi" for CFI related temp labels.
Choosing a "cfi" name makes the intend a bit clearer in an assembly dump
and more importantly the assembly dumps are slightly more stable as the
numbers don't move around anymore when unrelated code calls
createTempSymbol() more or less often.
As they are temp labels the name doesn't influence the generated object
code.
Maintain the command line resolutions as a map to a list of resolutions
rather than a single resolution, and apply the resolutions in the order
observed. This is not only simpler but allows us to test the scenario where
the two symbols have different resolutions.
Paul Robinson [Wed, 30 Nov 2016 22:49:55 +0000 (22:49 +0000)]
Recommit r288212: Emit 'no line' information for interesting 'orphan' instructions.
The LLDB tests are now ready for this patch.
DWARF specifies that "line 0" really means "no appropriate source
location" in the line table. Use this for branch targets and some
other cases that have no specified source location, to prevent
inheriting unfortunate line numbers from physically preceding
instructions (which might be from completely unrelated source).
David Callahan [Wed, 30 Nov 2016 22:32:58 +0000 (22:32 +0000)]
Only computeRelativePath() on new members
Summary:
When using thin archives, and processing the same archive multiple times, we were mangling existing entries. The root cause is that we were calling computeRelativePath() more than once. Here, we only call it when adding new members to an archive.
Note that D27218 changes the way thin archives are printed, and will break the new unit test included here. Depending on which one lands first, the other will need to be slightly modified.
Joel Jones [Wed, 30 Nov 2016 22:25:24 +0000 (22:25 +0000)]
[AArch64] Refactor LSE support as feature separate from V8.1a support.
Summary:
This is preparation for ThunderX processors that have Large
System Extension (LSE) atomic instructions, but not the
other instructions introduced by V8.1a.
This will mimic changes to GCC as described here:
https://gcc.gnu.org/ml/gcc-patches/2015-06/msg00388.html
Zachary Turner [Wed, 30 Nov 2016 21:44:26 +0000 (21:44 +0000)]
[LibFuzzer] Add Windows implementations of some IO functions.
This patch moves some posix specific file i/o code into a new
file, FuzzerIOPosix.cpp, and provides implementations for these
functions on Windows in FuzzerIOWindows.cpp. This is another
incremental step towards getting libfuzzer working on Windows,
although it still should not be expected to be fully working.
Patch by Marcos Pividori
Differential Revision: https://reviews.llvm.org/D27233
The basic idea is that when the average dynamic trip-count of a loop is known,
based on PGO, to be low, we can expect a performance win by peeling off the
first several iterations of that loop.
Unlike unrolling based on a known trip count, or a trip count multiple, this
doesn't save us the conditional check and branch on each iteration. However,
it does allow us to simplify the straight-line code we get (constant-folding,
etc.). This is important given that we know that we will usually only hit this
code, and not the actual loop.
Mehdi Amini [Wed, 30 Nov 2016 19:12:53 +0000 (19:12 +0000)]
[git-llvm] Use --force-interactive when commiting to enable SVN to prompt password
When svn does not know the password and it has to prompt, it needs to query.
However it won't when invoked from the Python script and instead fails with:
svn: E215004: Authentication failed and interactive prompting is disabled; see the --force-interactive option
Zachary Turner [Wed, 30 Nov 2016 19:06:14 +0000 (19:06 +0000)]
[LibFuzzer] Split up some functions among different headers.
In an effort to get libfuzzer working on Windows, we need to make
a distinction between what functions require platform specific
code (e.g. different code on Windows vs Linux) and what code
doesn't. IO functions, for example, tend to be platform
specific.
This patch separates out some of the functions which will need
to have platform specific implementations into different headers,
so that we can then provide different implementations for each
platform.
Aside from that, this patch contains no functional change. It
is purely a re-organization.
Patch by Marcos Pividori
Differential Revision: https://reviews.llvm.org/D27230
Silviu Baranga [Wed, 30 Nov 2016 17:04:22 +0000 (17:04 +0000)]
[AArch64] Fix useful bits detection for BFM instructions
Summary:
When computing useful bits for a BFM instruction, we need
to take into consideration the case where both operands
of the BFM are equal and provide data that we need to track.
Simon Pilgrim [Wed, 30 Nov 2016 16:33:46 +0000 (16:33 +0000)]
[X86][SSE] Add support for target shuffle constant folding
Initial support for target shuffle constant folding in cases where all shuffle inputs are constant. We may be able to relax this and merge shuffles with only some constant inputs in the future.
I've added the helper function getTargetConstantBitsFromNode (based off a similar function in X86ShuffleDecodeConstantPool.cpp) that could be reused for other cases requiring constant vector extraction.
Pavel Labath [Wed, 30 Nov 2016 15:34:29 +0000 (15:34 +0000)]
[Support] Use HAVE_DLOPEN to guard dlopen(3) usage
Summary:
The usage was previously guarded by HAVE_DLFCN. This breaks on Android with
LLVM_BUILD_STATIC as the platform does not provide a static version of libdl.
Using HAVE_DLOPEN fixes it as the code will only get used if we are actually able
to link an executable using dlopen.
Nemanja Ivanovic [Tue, 29 Nov 2016 23:36:03 +0000 (23:36 +0000)]
[PowerPC] Improvements for BUILD_VECTOR Vol. 2
This patch corresponds to review:
https://reviews.llvm.org/D25980
This is the 2nd patch in a series of 4 that improve the lowering and combining
for BUILD_VECTOR nodes on PowerPC. This particular patch combines a build vector
of fp-to-int conversions into an fp-to-int conversion of a build vector of fp
values. For example:
Converts (build_vector (fp_to_[su]i $A), (fp_to_[su]i $B), ...)
Into (fp_to_[su]i (build_vector $A, $B, ...))).
Which is a natural match for much cleaner code.
Jacques Pienaar [Tue, 29 Nov 2016 23:01:09 +0000 (23:01 +0000)]
[lanai] Manually match 0/-1 with R0/R1.
Summary: Previously 0 and -1 was matched via tablegen rules. But this could cause problems where a physical register was being used where a virtual register was expected (seen in optimizeSelect and TwoAddressInstructionPass). Instead follow AArch64 and match in DAGToDAGISel.
Paul Robinson [Tue, 29 Nov 2016 22:41:16 +0000 (22:41 +0000)]
Emit 'no line' information for interesting 'orphan' instructions.
DWARF specifies that "line 0" really means "no appropriate source
location" in the line table. Use this for branch targets and some
other cases that have no specified source location, to prevent
inheriting unfortunate line numbers from physically preceding
instructions (which might be from completely unrelated source).
Adam Nemet [Tue, 29 Nov 2016 22:37:01 +0000 (22:37 +0000)]
[GVN] Basic optimization remark support
[recommiting patches one-by-one to see which breaks the stage2 LTO bot]
Follow-on patches will add more interesting cases.
The goal of this patch-set is to get the GVN messages printed in
opt-viewer from Dhrystone as was presented in my Dev Meeting talk. This
is the optimization view for the function (the last remark in the
function has a bug which is fixed in this series):
http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/SingleSource/Benchmarks/Dhrystone/CMakeFiles/dry.dir/html/_org_test-suite_SingleSource_Benchmarks_Dhrystone_dry.c.html#L430
This program is for testing features that rely on multi-module bitcode files.
It takes a multi-module bitcode file, extracts one of the modules and writes
it to the output file.
Justin Lebar [Tue, 29 Nov 2016 21:49:02 +0000 (21:49 +0000)]
[StructurizeCFG] Fix infinite loop in rebuildSSA.
Michel Dänzer reported that r288051, "[StructurizeCFG] Use range-based
for loops", introduced a bug into rebuildSSA, wherein we were iterating
over an instruction's use list while modifying it, without taking care
to do this correctly.
Kevin Enderby [Tue, 29 Nov 2016 21:43:40 +0000 (21:43 +0000)]
Add to llvm-objdump the -no-leading-headers option with the use of the -macho option.
In some cases the leading headers of the file name, archive member and
architecture slice name in the output of lvm-objdump is not wanted so the
tool’s output can be directly used by scripts. This matches the -X option
of the Apple otool(1) program.
Mehdi Amini [Tue, 29 Nov 2016 20:45:48 +0000 (20:45 +0000)]
Change Error unittest to use the LLVM_ENABLE_ABI_BREAKING_CHECKS instead of NDEBUG
This is consistent with the header (after r288087) and fixes the
test for the configuration:
-DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_ABI_BREAKING_CHECKS=FORCE_OFF
This interface allows clients to write multiple modules to a single
bitcode file. Also introduce the llvm-cat utility which can be used
to create a bitcode file containing multiple modules.
Matt Arsenault [Tue, 29 Nov 2016 19:39:53 +0000 (19:39 +0000)]
AMDGPU: Disallow exec as SMEM instruction operand
This is not in the list of valid inputs for the encoding.
When spilling, copies from exec can be folded directly
into the spill instruction which results in broken
stores.
This only fixes the operand constraints, more codegen
work is required to avoid emitting the invalid
spills.
This sort of breaks the dbg.value test. Because the
register class of the s_load_dwordx2 changes, there
is a copy to SReg_64, and the copy is the operand
of dbg_value. The copy is later dead, and removed
from the dbg_value.
Geoff Berry [Tue, 29 Nov 2016 19:31:35 +0000 (19:31 +0000)]
[LiveRangeEdit] Handle instructions with no defs correctly.
Summary:
The code in LiveRangeEdit::eliminateDeadDef() that computes isOrigDef
doesn't handle instructions in which operand 0 is not a def (e.g. KILL)
correctly. Add a check that operand 0 is a def before doing the rest of
the isOrigDef computation.
Matt Arsenault [Tue, 29 Nov 2016 19:20:42 +0000 (19:20 +0000)]
AMDGPU: Refactor immediate folding logic
Change the logic for when to fold immediates to
consider the destination operand rather than the
source of the materializing mov instruction.
No change yet, but this will allow for correctly handling
i16/f16 operands. Since 32-bit moves are used to materialize
constants for these, the same bitvalue will not be in the
register.
Geoff Berry [Tue, 29 Nov 2016 18:28:32 +0000 (18:28 +0000)]
[AArch64] Fold spills of COPY of WZR/XZR
Summary:
In AArch64InstrInfo::foldMemoryOperandImpl, catch more cases where the
COPY being spilled is copying from WZR/XZR, but the source register is
not in the COPY destination register's regclass.
For example, when spilling:
%vreg0 = COPY %XZR ; %vreg0:GPR64common
without this change, the code in TargetInstrInfo::foldMemoryOperand()
and canFoldCopy() that normally handles cases like this would fail to
optimize since %XZR is not in GPR64common. So the spill code generated
would be:
Artur Pilipenko [Tue, 29 Nov 2016 16:24:57 +0000 (16:24 +0000)]
[CVP] Remove cvp-dont-process-adds flag
The flag was introduced because the optimization controlled by the flag initially caused regressions. All the regressions were fixed some time ago and the flag has been false for quite a while.
Chandler Carruth [Tue, 29 Nov 2016 12:54:34 +0000 (12:54 +0000)]
[PM] Fix a bad invalid densemap iterator bug in the new invalidation
logic.
Yup, the invalidation logic has an invalid iterator bug. Can't make this
stuff up.
We can recursively insert things into the map so we can't cache the
iterator into that map across those recursive calls. We did this
differently in two places. I have an end-to-end test that triggers at
least one of them. I'm going to work on a nice minimal test case that
triggers these, but I didn't want to leave the bug in the tree while
I tried to trigger it.
Also, the dense map iterator checking stuff we have now is awesome. =D
Alexey Bataev [Tue, 29 Nov 2016 08:21:14 +0000 (08:21 +0000)]
[SLPVectorizer] Improved support of partial tree vectorization.
Currently SLP vectorizer tries to vectorize a binary operation and dies
immediately after unsuccessful the first unsuccessfull attempt. Patch
tries to improve the situation, trying to vectorize all binary
operations of all children nodes in the binop tree.
We now expect each module's identification block to appear immediately before
the module block. Any module block that appears without an identification block
immediately before it is interpreted as if it does not have a module block.
Also change the interpretation of VST and function offsets in bitcode.
The offset is always taken as relative to the start of the identification
(or module if not present) block, minus one word. This corresponds to the
historical interpretation of offsets, i.e. relative to the start of the file.
These changes allow for bitcode modules to be concatenated by copying bytes.
Reid Kleckner [Tue, 29 Nov 2016 01:32:21 +0000 (01:32 +0000)]
[asan/win] Align global registration metadata to its size
This way, when the linker adds padding between globals, we can skip over
the zero padding bytes and reliably find the start of the next metadata
global.
Reid Kleckner [Tue, 29 Nov 2016 00:29:27 +0000 (00:29 +0000)]
Recognize ${:uid} escapes in intel syntax inline asm
It looks like this logic was duplicated long ago and the GCC side of
things has grown additional functionality. We need ${:uid} at least to
generate unique MS inline asm labels (PR23715), so expose these.
Adam Nemet [Tue, 29 Nov 2016 00:09:22 +0000 (00:09 +0000)]
[GVN, OptDiag] Print the interesting instructions involved in missed load-elimination
This includes the intervening store and the load/store that we're trying
to forward from in the optimization remark for the missed load
elimination.
This is hooked up under a new mode in ORE that allows for compile-time
budget for a bit more analysis to print more insightful messages. This
mode is currently enabled for -fsave-optimization-record (-Rpass is
trickier since it is controlled in the front-end).
With this we can now print the red remark in http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/SingleSource/Benchmarks/Dhrystone/CMakeFiles/dry.dir/html/_org_test-suite_SingleSource_Benchmarks_Dhrystone_dry.c.html#L446