Wei Mi [Thu, 30 Jun 2016 18:42:56 +0000 (18:42 +0000)]
Refine the set of UniformAfterVectorization instructions.
Except the seed uniform instructions (conditional branch and consecutive ptr
instructions), dependencies to be added into uniform set should only be used
by existing uniform instructions or intructions outside of current loop.
Lang Hames [Thu, 30 Jun 2016 17:43:06 +0000 (17:43 +0000)]
[Support] Fix a bug in ErrorList::join / joinErrors.
When concatenating two error lists the ErrorList::join method (which is called
by joinErrors) was failing to set the checked bit on the second error, leading
to a 'failure to check error' assertion.
Zachary Turner [Thu, 30 Jun 2016 17:43:00 +0000 (17:43 +0000)]
[pdb] Re-add code to write PDB files.
Somehow all the functionality to write PDB files got removed,
probably accidentally when uploading the patch perhaps the wrong
one got uploaded. This re-adds all the code, as well as the
corresponding test.
Adrian Prantl [Thu, 30 Jun 2016 17:15:44 +0000 (17:15 +0000)]
[CMake] Add an LLVM_ENABLE_MODULE_DEBUGGING flag for building with -gmodules.
This flag is only effective in builds with debug info and modules.
The default is On for Darwin only.
This removes a hack that was added for the benefit of x86 codegen.
It prevented shrinking the switch condition even to smaller legal (DataLayout) types.
We have a safety mechanism in CGP after:
http://reviews.llvm.org/rL251857
...so we're free to use the optimal (smallest) IR type now.
David Majnemer [Thu, 30 Jun 2016 03:00:20 +0000 (03:00 +0000)]
[CodeView] Implement support for bitfields in LLVM
CodeView need to know the offset of the storage allocation for a
bitfield. Encode this via the "extraData" field in DIDerivedType and
introduced a new flag, DIFlagBitField, to indicate whether or not a
member is a bitfield.
Chandler Carruth [Thu, 30 Jun 2016 02:32:20 +0000 (02:32 +0000)]
[ADT] Add a new data structure for managing a priority worklist where
re-insertion of entries into the worklist moves them to the end.
This is fairly similar to a SetVector, but helps in the case where in
addition to not inserting duplicates you want to adjust the sequence of
a pop-off-the-back worklist.
I'm not at all attached to the name of this data structure if others
have better suggestions, but this is one that David Majnemer brought up
in IRC discussions that seems plausible.
I've trimmed the interface down somewhat from SetVector's interface
because several things make less sense here IMO: iteration primarily.
I'd prefer to add these back as we have users that need them. My use
case doesn't even need all of what is provided here. =]
I've also included a basic unittest to make sure this functions
reasonably.
This patch makes CFLAA answer some ModRef queries. Because we don't
distinguish between reading/writing when making StratifiedSets, we're
unable to offer any of the readonly-related answers.
Adrian Prantl [Thu, 30 Jun 2016 01:46:49 +0000 (01:46 +0000)]
[CMake] Introduce a LLVM_ENABLE_LOCAL_SUBMODULE_VISIBILITY flag.
On Darwin it is currently impossible to build LLVM with modules
because the Darwin system module map is not compatible with
-fmodules-local-submodule-visibility at this point in time. This
patch makes the flag optional and off by default on Darwin so it
becomes possible to build LLVM with modules again.
Matthias Braun [Thu, 30 Jun 2016 00:23:54 +0000 (00:23 +0000)]
RegisterScavenging: Code cleanup; NFC
- Use range based for loops
- No need for some !Reg checks: isPhysicalRegister() reports false for
NoRegister anyway
- Do not repeat function name in documentation comment.
- Do not repeat documentation comment in implementation when we already
have one at the declaration.
- Factor some common subexpressions out.
- Change file comments to use doxygen syntax.
CodeGen: Add an explicit BuildMI overload for MachineInstr&
Add an explicit overload to BuildMI for MachineInstr& to deal with
insertions inside of instruction bundles.
- Use it to re-implement MachineInstr* to give it coverage.
- Document how the overload for MachineBasicBlock::instr_iterator
differs from that for MachineBasicBlock::iterator (the previous
(implicit) overload for MachineInstr&).
- Add a comment explaining why the MachineInstr& and MachineInstr*
overloads don't universally forward to the
MachineBasicBlock::instr_iterator overload.
Thanks to Justin for noticing the API quirk. While this doesn't fix any
known bugs -- all uses of BuildMI with a MachineInstr& were previously
using MachineBasicBlock::iterator -- it protects against future bugs.
CodeGen: Use MachineInstr& in TargetInstrInfo, NFC
This is mostly a mechanical change to make TargetInstrInfo API take
MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator)
when the argument is expected to be a valid MachineInstr. This is a
general API improvement.
Although it would be possible to do this one function at a time, that
would demand a quadratic amount of churn since many of these functions
call each other. Instead I've done everything as a block and just
updated what was necessary.
This is mostly mechanical fixes: adding and removing `*` and `&`
operators. The only non-mechanical change is to split
ARMBaseInstrInfo::getOperandLatencyImpl out from
ARMBaseInstrInfo::getOperandLatency. Previously, the latter took a
`MachineInstr*` which it updated to the instruction bundle leader; now,
the latter calls the former either with the same `MachineInstr&` or the
bundle leader.
As a side effect, this removes a bunch of MachineInstr* to
MachineBasicBlock::iterator implicit conversions, a necessary step
toward fixing PR26753.
Note: I updated WebAssembly, Lanai, and AVR (despite being
off-by-default) since it turned out to be easy. I couldn't run tests
for AVR since llc doesn't link with it turned on.
Object: Replace NewArchiveIterator with a simpler NewArchiveMember class. NFCI.
The NewArchiveIterator class has a problem: it requires too much context. Any
memory buffers added to the archive must be stored within an Archive::Member,
which must have an associated Archive. This makes it harder than necessary
to create new archive members (or new archives entirely) from scratch using
memory buffers.
This patch replaces NewArchiveIterator with a NewArchiveMember class that
stores just the memory buffer and the information that goes into the archive
member header.
Where command1 and command2 contain completely different sets of
valid options. This is backwards compatible with previous uses
of llvm cl which did not support subcommands, as any option
which specifies no optional subcommand (e.g. all existing
code) goes into a special "top level" subcommand that expects
dashed options to appear immediately after the program name.
For example, code which is subcommand unaware would generate
a command line such as the following, where no subcommand
is specified:
llvm-foo.exe -q1 -q2
The top level subcommand can co-exist with actual subcommands,
as it is implemented as an actual subcommand which is searched
if no explicit subcommand is specified. So llvm-foo.exe as
specified above could be written so as to support all three
aforementioned command lines simultaneously.
There is one additional "special" subcommand called AllSubCommands,
which can be used to inject an option into every subcommand.
This is useful to support things like help, so that commands
such as:
All work and display the help for the selected subcommand
without having to explicitly go and write code to handle each
one separately.
This patch is submitted without an example of anything actually
using subcommands, but a followup patch will convert the
llvm-pdbdump tool to use subcommands.
Evgeniy Stepanov [Wed, 29 Jun 2016 20:37:43 +0000 (20:37 +0000)]
StackColoring for SafeStack.
This is a fix for PR27842.
An IR-level implementation of stack coloring tailored to work with
SafeStack. It is a bit weaker than the MI implementation in that it
does not the "lifetime start at first access" logic. This can be
improved in the future.
This patch also replaces the naive implementation of stack frame
layout with a greedy algorithm that can split existing stack slots
and even fit small objects inside the alignment padding of other
objects.
Tim Shen [Wed, 29 Jun 2016 20:10:17 +0000 (20:10 +0000)]
[InstCombine] Simplify and correct folding fcmps with the same children
Summary: Take advantage of FCmpInst::Predicate's bit pattern and handle (fcmp *, x, y) | (fcmp *, x, y) and (fcmp *, x, y) & (fcmp *, x, y) more consistently. Also fold more FCmpInst::FCMP_FALSE and FCmpInst::FCMP_TRUE to constants.
Currently InstCombine wrongly folds (fcmp ogt, x, y) | (fcmp ord, x, y) to (fcmp ogt, x, y); this patch also fixes that.
Nirav Dave [Wed, 29 Jun 2016 19:54:27 +0000 (19:54 +0000)]
Permit memory operands in ins/outs instructions
[x86] (PR15455) While (ins|outs)[bwld] instructions do not take %dx as a
memory operand, various unofficial references do and objdump
disassembles to this format. Extend special treatment of
similar (in|out)[bwld] operations.
Rafael Espindola [Wed, 29 Jun 2016 18:31:48 +0000 (18:31 +0000)]
Don't verify inputs to the Linker if ODR merging.
This fixes pr28072.
The point, as Duncan pointed out, is that the file is already
partially linked by just reading it.
Long term I think the solution is to make metadata owned by the module
and then the linker will lazily read it and be in charge of all the
linking. Running a verifier in each input will defeat the lazy
loading, but will be legal.
Right now we are at the unfortunate position that to support odr
merging we cannot verify the inputs, which mildly annoying (see test
update).
Vedant Kumar [Wed, 29 Jun 2016 17:47:08 +0000 (17:47 +0000)]
[llvm-cov] Change some FileCheck prefixes to make tests reusable (NFC)
I'm planning on extending these two tests with checks that validate
html coverage reports. Make it easier to extend them by not using a
prefix called "CHECK".
Vedant Kumar [Wed, 29 Jun 2016 16:22:12 +0000 (16:22 +0000)]
[llvm-cov] Do not allow ".." to escape the coverage sub-directory
In -output-dir mode, file reports are placed into a "coverage"
directory. If filenames in the coverage mapping contain "..", they might
escape out of this directory.
Fix the problem by removing ".." from source filenames (expand the path
component).
Benjamin Kramer [Wed, 29 Jun 2016 15:04:07 +0000 (15:04 +0000)]
[ManagedStatic] Reimplement double-checked locking with std::atomic.
This gets rid of the memory fence in the hot path (dereferencing the
ManagedStatic), trading for an extra mutex lock in the cold path (when
the ManagedStatic was uninitialized). Since this only happens on the
first accesses it shouldn't matter much. On strict architectures like
x86 this removes any atomic instructions from the hot path.
Also remove the tsan annotations, tsan knows how standard atomics work
so they should be unnecessary now.
Craig Topper [Wed, 29 Jun 2016 04:57:00 +0000 (04:57 +0000)]
Revert "[ValueTracking] Teach computeKnownBits for PHI nodes to compute sign bit for a recurrence with a NSW addition."
This is breaking an optimizaton remark test in clang. I've identified a couple fixes for that, but want to understand it better before I commit to anything.
Adam Nemet [Wed, 29 Jun 2016 04:55:19 +0000 (04:55 +0000)]
[Diag] Add getter shouldAlwaysPrint. NFC
For the new hotness attribute, the API will take the pass rather than
the pass name so we can no longer play the trick of AlwaysPrint being a
special pass name. This adds a getter to help the transition.
Craig Topper [Wed, 29 Jun 2016 03:46:47 +0000 (03:46 +0000)]
[ValueTracking] Teach computeKnownBits for PHI nodes to compute sign bit for a recurrence with a NSW addition.
If a operation for a recurrence is an addition with no signed wrap and both input sign bits are 0, then the result sign bit must also be 0. Similar for the negative case.
I found this deficiency while playing around with a loop in the x86 backend that contained a signed division that could be optimized into an unsigned division if we could prove both inputs were positive. One of them being the loop induction variable. With this patch we can perform the conversion for this case. One of the test cases here is a contrived variation of the loop I was looking at.
Craig Topper [Wed, 29 Jun 2016 03:29:12 +0000 (03:29 +0000)]
[DAGCombine] Teach DAG combine to handle ORs of shuffles involving zero vectors where the zero vector is the first operand to the shuffle instead of the second.
Craig Topper [Wed, 29 Jun 2016 03:29:09 +0000 (03:29 +0000)]
[DAGCombine] Add test cases to show that DAG combining an OR of two shuffles with zero vectors doesn't work if the zero vector is the first operand of the shuffle. Fix coming in a follow up patch.
Eric Christopher [Wed, 29 Jun 2016 03:05:58 +0000 (03:05 +0000)]
Revert "[InstCombine] Avoid combining the bitcast of a var that is used as both address and result of load instructions"
Revert "[InstCombine] Combine A->B->A BitCast"
as this appears to cause PR27996 and as discussed in http://reviews.llvm.org/D20847
Davide Italiano [Wed, 29 Jun 2016 01:56:27 +0000 (01:56 +0000)]
[Triple] Add isLittleEndian().
This allows us to query about the endianness without having to
look at DataLayout. The API will be used (and tested) in lld,
in order to find out the endianness of BitcodeFiles.
Kevin Enderby [Tue, 28 Jun 2016 23:16:13 +0000 (23:16 +0000)]
Finish cleaning up most of the error handling in libObject’s MachOUniversalBinary
and its clients to use the new llvm::Error model for error handling.
Changed getAsArchive() from ErrorOr<...> to Expected<...> so now all
interfaces there use the new llvm::Error model for return values.
In the two places it had if (!Parent) this is actually a program error so changed
from returning errorCodeToError(object_error::parse_failed) to calling
report_fatal_error() with a message.
In getObjectForArch() added error messages to its two llvm::Error return values
instead of returning errorCodeToError(object_error::arch_not_found) with no
error message.
For the llvm-obdump, llvm-nm and llvm-size clients since the only binary files in
Mach-O Universal Binaries that are supported are Mach-O files or archives with
Mach-O objects, updated their logic to generate an error when a slice contains
something like an ELF binary instead of ignoring it. And added a test case for
that.
The last error stuff to be cleaned up for libObject’s MachOUniversalBinary is
the use of errorOrToExpected(Archive::create(ObjBuffer)) which needs
Archive::create() to be changed from ErrorOr<...> to Expected<...> first,
which I’ll work on next.
Weiming Zhao [Tue, 28 Jun 2016 22:30:45 +0000 (22:30 +0000)]
[ARM] Fix 28282: cost computation for constant hoisting
Summary:
This fixes bug: https://llvm.org/bugs/show_bug.cgi?id=28282
Currently the cost model of constant hoisting checks the bit width of the data type of the constants.
However, the actual immediate value is small enough and not need to be hoisted.
This patch checks for the actual bit width needed for the constant.
Dehao Chen [Tue, 28 Jun 2016 21:19:34 +0000 (21:19 +0000)]
Relax the clearance calculating for breaking partial register dependency.
Summary: LLVM assumes that large clearance will hide the partial register spill penalty. But in our experiment, 16 clearance is too small. As the inserted XOR is normally fairly cheap, we should have a higher clearance threshold to aggressively insert XORs that is necessary to break partial register dependency.
Chris Bieneman [Tue, 28 Jun 2016 21:10:26 +0000 (21:10 +0000)]
[YAML] Fix YAML tags appearing before the start of sequence elements
Our existing yaml::Output code writes tags immediately when mapTag is called, without any state handling. This results in tags on sequence elements being written before the element itself. For example, we see this:
Our reader handles reading properly, so this bug only impacts writing yaml sequences with tagged elements.
As a test for this I've modified the Mach-O yaml encoding to allways apply the !mach-o tag when encoding MachOYAML::Object entries. This results in the !mach-o tag appearing as expected in dumped fat files.
Zhan Jun Liau [Tue, 28 Jun 2016 21:03:19 +0000 (21:03 +0000)]
[SystemZ] Use NILL instruction instead of NILF where possible
Summary: SystemZ shift instructions only use the last 6 bits of the shift
amount. When the result of an AND operation is used as a shift amount, this
means that we can use the NILL instruction (which operates on the last 16 bits)
rather than NILF (which operates on the last 32 bits) for a 16-bit savings in
instruction size.
Rafael Espindola [Tue, 28 Jun 2016 20:13:36 +0000 (20:13 +0000)]
Use isPositionIndependent in a few more places.
I think this converts all the simple cases that really just care about
the generated code being position independent or not. The remaining
uses are a bit more complicated and are checking things like "is this
a library or executable" or "can this symbol be preempted".
Where command1 and command2 contain completely different sets of
valid options. This is backwards compatible with previous uses
of llvm cl which did not support subcommands, as any option
which specifies no optional subcommand (e.g. all existing
code) goes into a special "top level" subcommand that expects
dashed options to appear immediately after the program name.
For example, code which is subcommand unaware would generate
a command line such as the following, where no subcommand
is specified:
llvm-foo.exe -q1 -q2
The top level subcommand can co-exist with actual subcommands,
as it is implemented as an actual subcommand which is searched
if no explicit subcommand is specified. So llvm-foo.exe as
specified above could be written so as to support all three
aforementioned command lines simultaneously.
There is one additional "special" subcommand called AllSubCommands,
which can be used to inject an option into every subcommand.
This is useful to support things like help, so that commands
such as:
All work and display the help for the selected subcommand
without having to explicitly go and write code to handle each
one separately.
This patch is submitted without an example of anything actually
using subcommands, but a followup patch will convert the
llvm-pdbdump tool to use subcommands.