Ulrich Weigand [Thu, 1 Dec 2016 17:10:27 +0000 (17:10 +0000)]
[SystemZ] Fix applyFixup for 12-bit fixups
Now that we have fixups that only fill parts of a byte, it turns
out we have to mask off the bits outside the fixup area when
applying them. Failing to do so caused invalid object code to
be emitted for bprp with a negative 12-bit displacement.
Adam Nemet [Thu, 1 Dec 2016 16:40:32 +0000 (16:40 +0000)]
[GVN] Basic optimization remark support
[recommitting after the fix in r288307]
Follow-on patches will add more interesting cases.
The goal of this patch-set is to get the GVN messages printed in
opt-viewer from Dhrystone as was presented in my Dev Meeting talk. This
is the optimization view for the function (the last remark in the
function has a bug which is fixed in this series):
http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/SingleSource/Benchmarks/Dhrystone/CMakeFiles/dry.dir/html/_org_test-suite_SingleSource_Benchmarks_Dhrystone_dry.c.html#L430
Object: Extract a ModuleSymbolTable class from IRObjectFile.
This class represents a symbol table built from in-memory IR. It provides
access to GlobalValues and should only be used if such access is required
(e.g. in the LTO implementation). We will eventually change IRObjectFile
to read from a bitcode symbol table rather than using ModuleSymbolTable,
so it would not be able to expose the module.
Bitcode: Correctly handle Fixed and VBR arrays in BitstreamCursor::skipRecord().
The assertions were wrong; we need to call getEncodingData() on the element,
not the array. While here, simplify the skipRecord() implementation for Fixed
and Char6 arrays. This is tested by the code I added to llvm-bcanalyzer
which makes sure that we can skip any record.
Adam Nemet [Thu, 1 Dec 2016 03:56:43 +0000 (03:56 +0000)]
[GVN] When merging blocks update LoopInfo if it's available
If LoopInfo is available during GVN, BasicAA will use it. However
MergeBlockIntoPredecessor does not update LI as it merges blocks.
This didn't use to cause problems because LI was freed before
GVN/BasicAA. Now with OptimizationRemarkEmitter, the lifetime of LI is
extended so LI needs to be kept up-to-date during GVN.
Ivan Krasin [Thu, 1 Dec 2016 02:54:54 +0000 (02:54 +0000)]
Use trigrams to speed up SpecialCaseList.
Summary:
it's often the case when the rules in the SpecialCaseList
are of the form hel.o*bar. That gives us a chance to build
trigram index to quickly discard 99% of inputs without
running a full regex. A similar idea was used in Google Code Search
as described in the blog post:
https://swtch.com/~rsc/regexp/regexp4.html
The check is defeated, if there's at least one regex
more complicated than that. In this case, all inputs
will go through the regex. That said, the real-world
rules are often simple or can be simplied. That considerably
speeds up compiling Chromium with CFI and UBSan.
As measured on Chromium's content_message_generator.cc:
before, CFI: 44 s
after, CFI: 23 s
after, CFI, no blacklist: 23 s (~1% slower, but 3 runs were unable to show the difference)
after, regular compilation to bitcode: 23 s
Support a new assembler directive, .import_global, to declare imported
global variables (i.e. those with external linkage and no
initializer). The linker turns these into wasm imports.
Matthias Braun [Wed, 30 Nov 2016 23:49:01 +0000 (23:49 +0000)]
Move most EH from MachineModuleInfo to MachineFunction
Most of the exception handling members in MachineModuleInfo is actually
per function data (talks about the "current function") so it is better
to keep it at the function instead of the module.
This is a necessary step to have machine module passes work properly.
Also:
- Rename TidyLandingPads() to tidyLandingPads()
- Use doxygen member groups instead of "//===- EH ---"... so it is clear
where a group ends.
- I had to add an ugly const_cast at two places in the AsmPrinter
because the available MachineFunction pointers are const, but the code
wants to call tidyLandingPads() in between
(markFunctionEnd()/endFunction()).
Matthias Braun [Wed, 30 Nov 2016 23:48:26 +0000 (23:48 +0000)]
MCStreamer: Use "cfi" for CFI related temp labels.
Choosing a "cfi" name makes the intend a bit clearer in an assembly dump
and more importantly the assembly dumps are slightly more stable as the
numbers don't move around anymore when unrelated code calls
createTempSymbol() more or less often.
As they are temp labels the name doesn't influence the generated object
code.
Maintain the command line resolutions as a map to a list of resolutions
rather than a single resolution, and apply the resolutions in the order
observed. This is not only simpler but allows us to test the scenario where
the two symbols have different resolutions.
Paul Robinson [Wed, 30 Nov 2016 22:49:55 +0000 (22:49 +0000)]
Recommit r288212: Emit 'no line' information for interesting 'orphan' instructions.
The LLDB tests are now ready for this patch.
DWARF specifies that "line 0" really means "no appropriate source
location" in the line table. Use this for branch targets and some
other cases that have no specified source location, to prevent
inheriting unfortunate line numbers from physically preceding
instructions (which might be from completely unrelated source).
David Callahan [Wed, 30 Nov 2016 22:32:58 +0000 (22:32 +0000)]
Only computeRelativePath() on new members
Summary:
When using thin archives, and processing the same archive multiple times, we were mangling existing entries. The root cause is that we were calling computeRelativePath() more than once. Here, we only call it when adding new members to an archive.
Note that D27218 changes the way thin archives are printed, and will break the new unit test included here. Depending on which one lands first, the other will need to be slightly modified.
Joel Jones [Wed, 30 Nov 2016 22:25:24 +0000 (22:25 +0000)]
[AArch64] Refactor LSE support as feature separate from V8.1a support.
Summary:
This is preparation for ThunderX processors that have Large
System Extension (LSE) atomic instructions, but not the
other instructions introduced by V8.1a.
This will mimic changes to GCC as described here:
https://gcc.gnu.org/ml/gcc-patches/2015-06/msg00388.html
Zachary Turner [Wed, 30 Nov 2016 21:44:26 +0000 (21:44 +0000)]
[LibFuzzer] Add Windows implementations of some IO functions.
This patch moves some posix specific file i/o code into a new
file, FuzzerIOPosix.cpp, and provides implementations for these
functions on Windows in FuzzerIOWindows.cpp. This is another
incremental step towards getting libfuzzer working on Windows,
although it still should not be expected to be fully working.
Patch by Marcos Pividori
Differential Revision: https://reviews.llvm.org/D27233
The basic idea is that when the average dynamic trip-count of a loop is known,
based on PGO, to be low, we can expect a performance win by peeling off the
first several iterations of that loop.
Unlike unrolling based on a known trip count, or a trip count multiple, this
doesn't save us the conditional check and branch on each iteration. However,
it does allow us to simplify the straight-line code we get (constant-folding,
etc.). This is important given that we know that we will usually only hit this
code, and not the actual loop.
Mehdi Amini [Wed, 30 Nov 2016 19:12:53 +0000 (19:12 +0000)]
[git-llvm] Use --force-interactive when commiting to enable SVN to prompt password
When svn does not know the password and it has to prompt, it needs to query.
However it won't when invoked from the Python script and instead fails with:
svn: E215004: Authentication failed and interactive prompting is disabled; see the --force-interactive option
Zachary Turner [Wed, 30 Nov 2016 19:06:14 +0000 (19:06 +0000)]
[LibFuzzer] Split up some functions among different headers.
In an effort to get libfuzzer working on Windows, we need to make
a distinction between what functions require platform specific
code (e.g. different code on Windows vs Linux) and what code
doesn't. IO functions, for example, tend to be platform
specific.
This patch separates out some of the functions which will need
to have platform specific implementations into different headers,
so that we can then provide different implementations for each
platform.
Aside from that, this patch contains no functional change. It
is purely a re-organization.
Patch by Marcos Pividori
Differential Revision: https://reviews.llvm.org/D27230
Silviu Baranga [Wed, 30 Nov 2016 17:04:22 +0000 (17:04 +0000)]
[AArch64] Fix useful bits detection for BFM instructions
Summary:
When computing useful bits for a BFM instruction, we need
to take into consideration the case where both operands
of the BFM are equal and provide data that we need to track.
Simon Pilgrim [Wed, 30 Nov 2016 16:33:46 +0000 (16:33 +0000)]
[X86][SSE] Add support for target shuffle constant folding
Initial support for target shuffle constant folding in cases where all shuffle inputs are constant. We may be able to relax this and merge shuffles with only some constant inputs in the future.
I've added the helper function getTargetConstantBitsFromNode (based off a similar function in X86ShuffleDecodeConstantPool.cpp) that could be reused for other cases requiring constant vector extraction.
Pavel Labath [Wed, 30 Nov 2016 15:34:29 +0000 (15:34 +0000)]
[Support] Use HAVE_DLOPEN to guard dlopen(3) usage
Summary:
The usage was previously guarded by HAVE_DLFCN. This breaks on Android with
LLVM_BUILD_STATIC as the platform does not provide a static version of libdl.
Using HAVE_DLOPEN fixes it as the code will only get used if we are actually able
to link an executable using dlopen.
Nemanja Ivanovic [Tue, 29 Nov 2016 23:36:03 +0000 (23:36 +0000)]
[PowerPC] Improvements for BUILD_VECTOR Vol. 2
This patch corresponds to review:
https://reviews.llvm.org/D25980
This is the 2nd patch in a series of 4 that improve the lowering and combining
for BUILD_VECTOR nodes on PowerPC. This particular patch combines a build vector
of fp-to-int conversions into an fp-to-int conversion of a build vector of fp
values. For example:
Converts (build_vector (fp_to_[su]i $A), (fp_to_[su]i $B), ...)
Into (fp_to_[su]i (build_vector $A, $B, ...))).
Which is a natural match for much cleaner code.
Jacques Pienaar [Tue, 29 Nov 2016 23:01:09 +0000 (23:01 +0000)]
[lanai] Manually match 0/-1 with R0/R1.
Summary: Previously 0 and -1 was matched via tablegen rules. But this could cause problems where a physical register was being used where a virtual register was expected (seen in optimizeSelect and TwoAddressInstructionPass). Instead follow AArch64 and match in DAGToDAGISel.
Paul Robinson [Tue, 29 Nov 2016 22:41:16 +0000 (22:41 +0000)]
Emit 'no line' information for interesting 'orphan' instructions.
DWARF specifies that "line 0" really means "no appropriate source
location" in the line table. Use this for branch targets and some
other cases that have no specified source location, to prevent
inheriting unfortunate line numbers from physically preceding
instructions (which might be from completely unrelated source).
Adam Nemet [Tue, 29 Nov 2016 22:37:01 +0000 (22:37 +0000)]
[GVN] Basic optimization remark support
[recommiting patches one-by-one to see which breaks the stage2 LTO bot]
Follow-on patches will add more interesting cases.
The goal of this patch-set is to get the GVN messages printed in
opt-viewer from Dhrystone as was presented in my Dev Meeting talk. This
is the optimization view for the function (the last remark in the
function has a bug which is fixed in this series):
http://lab.llvm.org:8080/artifacts/opt-view_test-suite/build/SingleSource/Benchmarks/Dhrystone/CMakeFiles/dry.dir/html/_org_test-suite_SingleSource_Benchmarks_Dhrystone_dry.c.html#L430
This program is for testing features that rely on multi-module bitcode files.
It takes a multi-module bitcode file, extracts one of the modules and writes
it to the output file.
Justin Lebar [Tue, 29 Nov 2016 21:49:02 +0000 (21:49 +0000)]
[StructurizeCFG] Fix infinite loop in rebuildSSA.
Michel Dänzer reported that r288051, "[StructurizeCFG] Use range-based
for loops", introduced a bug into rebuildSSA, wherein we were iterating
over an instruction's use list while modifying it, without taking care
to do this correctly.
Kevin Enderby [Tue, 29 Nov 2016 21:43:40 +0000 (21:43 +0000)]
Add to llvm-objdump the -no-leading-headers option with the use of the -macho option.
In some cases the leading headers of the file name, archive member and
architecture slice name in the output of lvm-objdump is not wanted so the
tool’s output can be directly used by scripts. This matches the -X option
of the Apple otool(1) program.
Mehdi Amini [Tue, 29 Nov 2016 20:45:48 +0000 (20:45 +0000)]
Change Error unittest to use the LLVM_ENABLE_ABI_BREAKING_CHECKS instead of NDEBUG
This is consistent with the header (after r288087) and fixes the
test for the configuration:
-DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_ABI_BREAKING_CHECKS=FORCE_OFF
This interface allows clients to write multiple modules to a single
bitcode file. Also introduce the llvm-cat utility which can be used
to create a bitcode file containing multiple modules.
Matt Arsenault [Tue, 29 Nov 2016 19:39:53 +0000 (19:39 +0000)]
AMDGPU: Disallow exec as SMEM instruction operand
This is not in the list of valid inputs for the encoding.
When spilling, copies from exec can be folded directly
into the spill instruction which results in broken
stores.
This only fixes the operand constraints, more codegen
work is required to avoid emitting the invalid
spills.
This sort of breaks the dbg.value test. Because the
register class of the s_load_dwordx2 changes, there
is a copy to SReg_64, and the copy is the operand
of dbg_value. The copy is later dead, and removed
from the dbg_value.
Geoff Berry [Tue, 29 Nov 2016 19:31:35 +0000 (19:31 +0000)]
[LiveRangeEdit] Handle instructions with no defs correctly.
Summary:
The code in LiveRangeEdit::eliminateDeadDef() that computes isOrigDef
doesn't handle instructions in which operand 0 is not a def (e.g. KILL)
correctly. Add a check that operand 0 is a def before doing the rest of
the isOrigDef computation.