Sanjoy Das [Tue, 8 Nov 2016 20:46:01 +0000 (20:46 +0000)]
[TBAA] Drop support for "old style" scalar TBAA tags
Summary:
We've had support for auto upgrading old style scalar TBAA access
metadata tags into the "new" struct path aware TBAA metadata for 3 years
now. The only way to actually generate old style TBAA was explicitly
through the IRBuilder API. I think this is a good time for dropping
support for old style scalar TBAA.
I'm not removing support for textual or bitcode upgrade -- if you have
IR with the old style scalar TBAA tags that go through the AsmParser orf
the bitcode parser before LLVM sees them, they will keep working as
usual.
Tim Northover [Tue, 8 Nov 2016 20:39:03 +0000 (20:39 +0000)]
GlobalISel: allow CodeGen to fallback on VReg type/class issues.
After instruction selection we perform some checks on each VReg just before
discarding the type information. These checks were assertions before, but that
breaks the fallback path so this patch moves the logic into the main flow and
reports a better error on failure.
Ulrich Weigand [Tue, 8 Nov 2016 20:18:41 +0000 (20:18 +0000)]
[SystemZ] Add missing FP extension instructions
This completes assembler / disassembler support for all BFP
instructions provided by the floating-point extensions facility.
The instructions added here are not currently used for codegen.
Ulrich Weigand [Tue, 8 Nov 2016 20:17:02 +0000 (20:17 +0000)]
[SystemZ] Add program mask and addressing mode instructions
Add several instructions that operate on the program mask
or the addressing mode. These are not really needed for
code generation under Linux, but are provided for completeness
for the assembler/disassembler.
Ulrich Weigand [Tue, 8 Nov 2016 20:15:26 +0000 (20:15 +0000)]
[SystemZ] Model access registers as LLVM registers
Add the 16 access registers as LLVM registers. This allows removing
a lot of special cases in the assembler and disassembler where we
were handling access registers; this can all just use the generic
register code now.
Also add a bunch of instructions to operate on access registers,
for assembler/disassembler use only. No change in code generation
intended.
Dan Gohman [Tue, 8 Nov 2016 19:40:38 +0000 (19:40 +0000)]
[WebAssembly] Convert stackified IMPLICIT_DEF into constant 0.
Since IMPLIFIT_DEF instructions are omitted in the output, when the output
of an IMPLICIT_DEF instruction is stackified, the resulting register lacks
an explicit push, leading to a push/pop mismatch. Fix this by converting
such IMPLICIT_DEFs into CONST_I32 0 instructions so that they have explicit
pushes.
Ahmed Bougacha [Tue, 8 Nov 2016 19:27:10 +0000 (19:27 +0000)]
[GlobalISel] Permit select() to erase.
Erasing reverse_iterators is problematic; iterate manually.
While there, keep track of the range of inserted instructions.
It can miss instructions inserted elsewhere, but those are harder
to track.
Davide Italiano [Tue, 8 Nov 2016 19:18:20 +0000 (19:18 +0000)]
[LibcallsShrinkWrap] This pass doesn't preserve the CFG.
For example, it invalidates the domtree, causing assertions
in later passes which need dominator infos. Make it preserve
GlobalsAA, as suggested by Eli.
Ulrich Weigand [Tue, 8 Nov 2016 18:36:31 +0000 (18:36 +0000)]
[SystemZ] Refactor InstRR* instruction format patterns
This changes the InstRR (and related) patterns to no longer
automatically add an "r" at the end of the mnemonic. This
makes the .td files more obviously understandable, and also
allows using the patterns for those few instructions that
do not follow the *r scheme.
Also add some more sub-formats of the RRF format class, to
match operand names and sequence from the PoP better.
Wei Mi [Tue, 8 Nov 2016 18:19:36 +0000 (18:19 +0000)]
[RegAllocGreedy] Another fix about NewVRegs for last chance recoloring after r281783.
About when we should move a vreg from CurrentNewVRegs to NewVRegs,
if the vreg in CurrentNewVRegs was added into RecoloringCandidate and was
evicted, it shouldn't be added to NewVRegs because its physical register
will be restored at the end of tryLastChanceRecoloring after the recoloring
failed. If the vreg in CurrentNewVRegs was not in RecoloringCandidate, i.e.
it was evicted in selectOrSplitImpl inside tryRecoloringCandidates, its
physical register will not be restored even if the recoloring failed. In
that case, we need to add the vreg to NewVRegs.
Same as r281783, the problem was seen on out-of-tree target and we didn't
have a test case that reproduce the problem with in-tree targets.
Simon Pilgrim [Tue, 8 Nov 2016 15:07:01 +0000 (15:07 +0000)]
[TargetLowering] Fix undef vector element issue with true/false result handling
Fixed an issue with vector usage of TargetLowering::isConstTrueVal / TargetLowering::isConstFalseVal boolean result matching.
The comment said we shouldn't handle constant splat vectors with undef elements. But the the actual code was returning false if the build vector contained no undef elements....
This patch now ignores the number of undefs (getConstantSplatNode will return null if the build vector is all undefs).
The change has also unearthed a couple of missed opportunities in AVX512 comparison code that will need to be addressed.
Pablo Barrio [Tue, 8 Nov 2016 14:53:30 +0000 (14:53 +0000)]
[JumpThreading] Unfold selects that depend on the same condition
Summary:
These are good candidates for jump threading. This enables later opts
(such as InstCombine) to combine instructions from the selects with
instructions out of the selects. SimplifyCFG will fold the select
again if unfolding wasn't worth it.
Under -enable-unsafe-fp-math, SELECT_CC lowering in AArch64
transforms floating point comparisons of the form "a == 0.0 ? 0.0 : x" to
"a == 0.0 ? a : x". But it incorrectly assumes that 'x' and 'a' have
the same type which can lead to a wrong CSEL node that crashes later
due to nonsensical copies.
Craig Topper [Tue, 8 Nov 2016 06:58:53 +0000 (06:58 +0000)]
[AVX-512] Add an avx512f without avx512vl command line to vec_fp_to_int.ll and regenerate. This will make a change in a future patch easier to see. NFC
IR, Bitcode: Change bitcode reader to no longer own its memory buffer.
Unique ownership is just one possible ownership pattern for the memory buffer
underlying the bitcode reader. In practice, as this patch shows, ownership can
often reside at a higher level. With the upcoming change to allow multiple
modules in a single bitcode file, it will no longer be appropriate for
modules to generally have unique ownership of their memory buffer.
The C API exposes the ownership relation via the LLVMGetBitcodeModuleInContext
and LLVMGetBitcodeModuleInContext2 functions, so we still need some way for
the module to own the memory buffer. This patch does so by adding an owned
memory buffer field to Module, and using it in a few other places where it
is convenient.
Justin Bogner [Tue, 8 Nov 2016 05:02:18 +0000 (05:02 +0000)]
cmake: Don't try to install exports if there aren't any
When using LLVM_DISTRIBUTION_COMPONENTS, it's possible for LLVM's
export list to be empty. If this happens the install(EXPORTS) command
will fail, but since there isn't anything to install anyway we really
just want to skip it.
Bitcode: Decouple block info block state from reader.
As proposed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2016-October/106630.html
Move block info block state to a new class, BitstreamBlockInfo.
Clients may set the block info for a particular cursor with the
BitstreamCursor::setBlockInfo() method.
At this point BitstreamReader is not much more than a container for an
ArrayRef<uint8_t>, so remove it and replace all uses with direct uses
of memory buffers.
Summary:
Set _install_rpath to CMAKE_INSTALL_RPATH if it is defined, so that eventually
INSTALL_RPATH is set to CMAKE_INSTALL_RPATH.
The "if(NOT DEFINED CMAKE_INSTALL_RPATH)" was missing a corresponding else
clause.
This also cleans up the fix made in r285908.
Tim Northover [Tue, 8 Nov 2016 00:34:06 +0000 (00:34 +0000)]
GlobalISel: constrain PHI registers on AArch64.
Self-referencing PHI nodes need their destination operands to be constrained
because nothing else is likely to do so. For now we just pick a register class
naively.
[AMDGPU] Allow hoisting of comparisons out of a loop and eliminate condition copies
Codegen prepare sinks comparisons close to a user is we have only one register
for conditions. For AMDGPU we have many SGPRs capable to hold vector conditions.
Changed BE to report we have many condition registers. That way IR LICM pass
would hoist an invariant comparison out of a loop and codegen prepare will not
sink it.
With that done a condition is calculated in one block and used in another.
Current behavior is to store workitem's condition in a VGPR using v_cndmask
and then restore it with yet another v_cmp instruction from that v_cndmask's
result. To mitigate the issue a forward propagation of a v_cmp 64 bit result
to an user is implemented. Additional side effect of this is that we may
consume less VGPRs in a cost of more SGPRs in case if holding of multiple
conditions is needed, and that is a clear win in most cases.
Adam Nemet [Mon, 7 Nov 2016 22:41:13 +0000 (22:41 +0000)]
[OptDiag, opt-viewer] Save callee's location and display as link
With this we get a new field in the YAML record if the value being
streamed out has a debug location. For examples, please see the changes
to the tests.
This is then used in opt-viewer to display a link for the callee
function in the inlining remarks.
Sanjin Sijaric [Mon, 7 Nov 2016 22:39:02 +0000 (22:39 +0000)]
[AArch64] Transfer memory operands when lowering vector load/store intrinsics
Summary:
Some vector loads and stores generated from AArch64 intrinsics alias each other
unnecessarily, preventing better scheduling. We just need to transfer memory
operands during lowering.
Lang Hames [Mon, 7 Nov 2016 22:33:13 +0000 (22:33 +0000)]
[docs] Add a pointer to ExitOnError to the discussion of handleErrors in the
programmer's manual.
ExitOnError is often a better alternative to handleErrors for tool code. This
patch makes it easier to find the ExitOnError discussion when reading the
handleErrors section.
Mehdi Amini [Mon, 7 Nov 2016 22:13:38 +0000 (22:13 +0000)]
Add experimental support for unofficial monorepo-like directory layout
Summary:
This allows to have clang and llvm and the other subprojects
side-by-side instead of nested. This can be used with the monorepo or
multiple repos.
It will help having a single set of sources checked out but allows to
have a build directory with llvm and another one with llvm+clang.
Basically it abstracts LLVM_EXTERNAL_xxxx_SOURCE_DIR making it more
convenient by adopting a convention.
Derek Schuff [Mon, 7 Nov 2016 22:00:48 +0000 (22:00 +0000)]
[WebAssembly] Emit a BasePointer when we have overly-aligned stack objects
Because we shift the stack pointer by an unknown amount, we need an
additional pointer. In the case where we have variable-size objects
as well, we can't reuse the frame pointer, thus three pointers.
Sanjoy Das [Mon, 7 Nov 2016 21:01:49 +0000 (21:01 +0000)]
Avoid tail recursion elimination across calls with operand bundles
Summary:
In some specific scenarios with well understood operand bundle types
(like `"deopt"`) it may be possible to go ahead and convert recursion to
iteration, but TailRecursionElimination does not have that logic today
so avoid doing the right thing for now.
I need some input on whether `"funclet"` operand bundles should also
block tail recursion elimination. If not, I'll allow TRE across calls
with `"funclet"` operand bundles and add a test case.
Mehdi Amini [Mon, 7 Nov 2016 20:00:47 +0000 (20:00 +0000)]
Add some facilities to work with a git monorepo (experimental setup)
Add a new script in llvm/utils/git-svn/. When present in the $PATH,
it enables a `git llvm` command. It is providing at this
point only the ability to push from the git monorepo: `git llvm push`.
It is intended to evolves with more features, for instance I plan on
features like `git llvm show r284955` to help working with sequential
revision numbers.
The push feature is taken from Justin Lebar's script available here:
https://github.com/jlebar/llvm-repo-tools/
Kuba Brecka [Mon, 7 Nov 2016 19:09:56 +0000 (19:09 +0000)]
[tsan] Cast floating-point types correctly when instrumenting atomic accesses, LLVM part
Although rare, atomic accesses to floating-point types seem to be valid, i.e. `%a = load atomic float ...`. The TSan instrumentation pass however tries to emit inttoptr, which is incorrect, we should use a bitcast here. Anyway, IRBuilder already has a convenient helper function for this.
Matt Arsenault [Mon, 7 Nov 2016 19:09:27 +0000 (19:09 +0000)]
AMDGPU: Preserve vcc undef flags when inverting branch
If the branch was on a read-undef of vcc, passes that used
analyzeBranch to invert the branch condition wouldn't preserve
the undef flag resulting in a verifier error.
Fixes verifier failures in a future commit.
Also fix verifier error when inserting copy for vccz
corruption bug.
Benjamin Kramer [Mon, 7 Nov 2016 17:47:28 +0000 (17:47 +0000)]
[MemCpyOpt] Don't emit IR in an unspecified order
Argument evaluation order is one of the edge cases where Clang differs
from GCC, yielding different IR depending on which compiler LLVM was
built with. Make the order deterministic and tune the test to actually
verify the order instead of trying to hide it.
Mehdi Amini [Mon, 7 Nov 2016 17:40:28 +0000 (17:40 +0000)]
Add some facilities to work with a git monorepo (experimental setup)
Summary:
Some changes are made to cmake, especially the addition of a new
LLVM_ENABLE_PROJECTS option that makes the build system aware of
the monorepo directory structure.
Also a new script is added in llvm/utils/git-svn/. When present in
the $PATH, it enables a `git llvm` command. It is providing at this
point only the ability to push from the git monorepo: `git llvm push`.
It is intended to evolves with more features, for instance I plan on
features like `git llvm show r284955` to help working with sequential
revision numbers.
The push feature is taken from Justin Lebar's script available here:
https://github.com/jlebar/llvm-repo-tools/
Sanjay Patel [Mon, 7 Nov 2016 15:52:45 +0000 (15:52 +0000)]
[InstCombine] allow splat vector folds in adjustMinMax() (retry r285732)
This was reverted at r285866 because there was a crash handling a scalar
select of vectors. I added a check for that pattern and a test case based
on the example provided in the post-commit thread for r285732.
Jonas Paulsson [Mon, 7 Nov 2016 15:45:06 +0000 (15:45 +0000)]
[SystemZ] Correct the SchedModel regarding vector unit / instructions.
* Use a generic vector unit to model the issue unit more accurately.
* Update some vector instructions that actually use the vector unit for more
than one cycle.
James Molloy [Mon, 7 Nov 2016 13:38:21 +0000 (13:38 +0000)]
[Thumb1] Move padding earlier when synthesizing TBBs off of the PC
When the base register (register pointing to the jump table) is the PC, we expect the jump table to directly follow the jump sequence with no intervening padding.
If there is intervening padding, the calculated offsets will not be correct. One solution would be to account for any padding in the emitted LDRB instruction, but at the moment we don't support emitting MCExprs for the load offset.
In the meantime, it's correct and only a slight amount worse to just move the padding up, from just before the jump table to just before the jump instruction sequence. We can do that by emitting code alignment before the jump sequence, as we know the number of instructions in the sequence is always 4.
Craig Topper [Mon, 7 Nov 2016 00:13:46 +0000 (00:13 +0000)]
[X86] Remove GCCBuiltins from cvtsi2ss/cvtsi2sd/cvtss2sd intrinsics as they aren't used by clang. Add TODOs to remove these and some other unused intrinsics.