Andrea Di Biagio [Tue, 25 Oct 2016 16:45:17 +0000 (16:45 +0000)]
[IndVarSimplify][Dwarf] When widening the IV increment, correctly set the debug loc.
When indvars widened an induction variable, the debug location for the loop
increment computation was incorrectly set equal to the debug loc of the loop
latch terminator.
This patch fixes the issue by propagating the correct location from the
original loop increment instruction to the new widened increment.
Geoff Berry [Tue, 25 Oct 2016 16:18:47 +0000 (16:18 +0000)]
[EarlyCSE] Make MemorySSA memory dependency check more aggressive.
Now that MemorySSA keeps track of whether MemoryUses are optimized, use
getClobberingMemoryAccess() to check MemoryUse memory dependencies since
it should no longer be so expensive.
This is a follow-up change to https://reviews.llvm.org/D25881
Ulrich Weigand [Tue, 25 Oct 2016 15:39:15 +0000 (15:39 +0000)]
[SystemZ] Do not use LOC(G) for volatile loads
It is not safe to use LOAD ON CONDITION to implement access to a memory
location marked "volatile", since the architecture leaves it unspecified
whether or not an access happens if the condition is false.
The current code already appears to care about that:
def LOC : CondUnaryRSY<"loc", 0xEBF2, nonvolatile_load, GR32, 4>;
Unfortunately, that "nonvolatile_load" operator is simply ignored
by the CondUnaryRSY class, and there was no test to catch it.
Simon Pilgrim [Tue, 25 Oct 2016 14:29:25 +0000 (14:29 +0000)]
[X86][SSE] Add support for (V)PMOVSX* constant folding
We already have (V)PMOVZX* combining support, this is the beginning of handling (V)PMOVSX* similarly - other combines in combineVSZext can be generalized in future patches.
This unearthed an interesting bug in that we were generating illegal build vectors on 32-bit targets - it was proving difficult to create a test for it from PMOVZX, but it fired immediately with PMOVSX. I've created a more general form of the existing getConstVector to handle these cases - ideally this should be handled in non-target-specific code but I couldn't find an equivalent.
Keeping the shuffle allows lowering the constant build_vector to a materialized
constant vector (such as a vector-load from the constant-pool or some other idiom).
This patch implements an API as close to that as possible. The
implementation on top of the current IRObjectFile is a bit hackish,
but should map just fine over a symbol table and is very convenient to
use.
Craig Topper [Tue, 25 Oct 2016 04:00:29 +0000 (04:00 +0000)]
[AVX-512] Add support for creating SIGN_EXTEND_VECTOR_INREG and ZERO_EXTEND_VECTOR_INREG for 512-bit vectors to support vpmovzxbq and vpmovsxbq.
Summary: The one tricky thing about this is that the sign/zero_extend_inreg uses v64i8 as an input type which isn't legal without BWI support. Though the vpmovsxbq and vpmovzxbq instructions themselves don't require BWI. To support this we need to add custom lowering for ZERO_EXTEND_VECTOR_INREG with v64i8 input. This can mostly reuse the existing sign extend code with a couple checks for sign extend vs zero extend added.
Matthias Braun [Tue, 25 Oct 2016 02:55:17 +0000 (02:55 +0000)]
MachineInstrBundle: Pass iterators to getBundle(Start|End); NFC
This is a function to go backwards in a block to find the first
instruction in a bundle, so iterator is a more natural choice for
parameter/return rather than a reference to a MachineInstruction.
Vedant Kumar [Tue, 25 Oct 2016 00:08:33 +0000 (00:08 +0000)]
[llvm-cov] Do not print out the filename of the object file
When we load coverage data from multiple objects, we don't have a way to
attribute a source object to a function record. Printing out the object
filename next to the source filename is already not very useful: soon,
it'll actually become misleading. Stop printing out the filename now.
Dan Gohman [Mon, 24 Oct 2016 23:27:49 +0000 (23:27 +0000)]
[WebAssembly] Implement more WebAssembly binary encoding.
This changes locals from being declared by the emitLocal hook in
WebAssemblyTargetStreamer, rather than with an instruction. After exploring
the infastructure in LLVM more, this seems to make more sense since
declaring locals doesn't use an encoded opcode.
This also adds more 0xd opcodes, type encodings, and miscellaneous
binary encoding bits.
Matthias Braun [Mon, 24 Oct 2016 23:23:02 +0000 (23:23 +0000)]
CodeGen/Passes: Pass MachineFunction as functor arg; NFC
Passing a MachineFunction as argument is more natural and avoids an
unnecessary round-trip through the logic determining the correct
Subtarget because MachineFunction already has a reference anyway.
Justin Bogner [Mon, 24 Oct 2016 21:58:58 +0000 (21:58 +0000)]
cmake: Rename installhdrs to install-llvm-headers and fix the dependencies
The installhdrs target was inconsistently named and would behave
differently depending on whether or not you ran a build first. This
renames it to install-llvm-headers to match other target names and
adds a dependency on intrinsics_gen so that it will always install the
same set of things.
Eli Friedman [Mon, 24 Oct 2016 21:47:44 +0000 (21:47 +0000)]
Fix regression from my recent GlobalsAA fix.
There are two fixes here: one, AnalyzeUsesOfPointer can't return
false until it has checked all the uses of the pointer. Two, if a
global uses another global, we have to assume the address of the
first global escapes.
Matthias Braun [Mon, 24 Oct 2016 21:36:43 +0000 (21:36 +0000)]
Use MachineInstr::mop_iterator instead of MIOperands; NFC
(Const)?MIOperands is equivalent to the C++ style
MachineInstr::mop_iterator. Use the latter for consistency except for a
few callers of MIOperands::analyzePhysReg().
Dan Gohman [Mon, 24 Oct 2016 19:49:43 +0000 (19:49 +0000)]
[WebAssembly] Add an option to make get_local/set_local explicit.
This patch adds a pass, controlled by an option and off by default for
now, for making implicit get_local/set_local explicit. This simplifies
emitting wasm with MC.
Target: Change various section classifiers in TargetLoweringObjectFile to take a GlobalObject.
These functions are about classifying a global which will actually be
emitted, so it does not make sense for them to take a GlobalValue which may
for example be an alias.
Change the Mach-O object writer and the Hexagon, Lanai and MIPS backends to
look through aliases before using TargetLoweringObjectFile interfaces. These
are functional changes but all appear to be bug fixes.
Adrian Prantl [Mon, 24 Oct 2016 18:23:51 +0000 (18:23 +0000)]
add-discriminators: Fix handling of lexical scopes.
This fixes a bug in the handling of lexical scopes, when more than one
scope is defined on the same line or functions are inlined into call
sites that are on the same line as the function definition. This
situation can easily happen in macro expansions.
The problem is solved by introducing a SmallDenseMap<DIScope *,
DILexicalBlockFile *, 1> that keeps track of all the different lexical
scopes that share a line/file location.
Geoff Berry [Mon, 24 Oct 2016 15:54:00 +0000 (15:54 +0000)]
[EarlyCSE] Optimize MemoryPhis and reduce memory clobber queries w/ MemorySSA
Summary:
When using MemorySSA, re-optimize MemoryPhis when removing a store since
this may create MemoryPhis with all identical arguments.
Also, when using MemorySSA to check if two MemoryUses are reading from
the same version of the heap, use the defining access instead of calling
getClobberingAccess, since the latter can currently result in many more
AA calls. Once the MemorySSA use optimization tracking changes are
done, we can remove this limitation, which should result in more loads
being CSE'd.
Ehsan Amiri [Mon, 24 Oct 2016 15:46:58 +0000 (15:46 +0000)]
[PPC] Better codegen for AND, ANY_EXT, SRL sequence
https://reviews.llvm.org/D24924
This improves the code generated for a sequence of AND, ANY_EXT, SRL instructions. This is a targetted fix for this special pattern. The pattern is generated by target independet dag combiner and so a more general fix may not be necessary. If we come across other similar cases, some ideas for handling it are discussed on the code review.
Nicolai Haehnle [Mon, 24 Oct 2016 14:56:02 +0000 (14:56 +0000)]
AMDGPU: Fix Two Address problems with v_movreld
Summary:
The v_movreld machine instruction is used with three operands that are
in a sense tied to each other (the explicit VGPR_32 def and the implicit
VGPR_NN def and use). There is no way to express that using the currently
available operand bits, and indeed there are cases where the Two Address
instructions pass does the wrong thing.
This patch introduces a new set of pseudo instructions that are identical
in intended semantics as v_movreld, but they only have two tied operands.
Having to add a new set of pseudo instructions is admittedly annoying, but
it's a fairly straightforward and solid approach. The only alternative I
see is to try to teach the Two Address instructions pass about Three Address
instructions, and I'm afraid that's trickier and is going to end up more
fragile.
Note that v_movrels does not suffer from this problem, and so this patch
does not touch it.
This fixes several GL45-CTS.shaders.indexing.* tests.
Nico Weber [Mon, 24 Oct 2016 14:52:04 +0000 (14:52 +0000)]
Revert 284971.
It seems to break selfhost on some bots, see e.g.
http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/21
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/20
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/22
Pavel Labath [Mon, 24 Oct 2016 14:19:28 +0000 (14:19 +0000)]
[Chrono] Fix !HAVE_FUTIMENS build
If we don't have futimens(), we fall back to futimes(), which only supports
microsecond timestamps. In that case, we need to explicitly cast away the extra
precision in setLastModificationAndAccessTime().
Pavel Labath [Mon, 24 Oct 2016 13:38:27 +0000 (13:38 +0000)]
[Object] Replace TimeValue with std::chrono
Summary:
Most of the changes are very straight-forward. The only choice I had to make was
to use second-precision time points in the Archive classes. I did this because
the archive files use that precision in the on-disk representation anyway.
Joel Jones [Mon, 24 Oct 2016 13:37:13 +0000 (13:37 +0000)]
AArch64 ILP32 relocations for assembly and ELF
Summary:
Add relocations for AArch64 ILP32. Includes:
- Addition of definitions for R_AARCH32_*
- Definition of new -target-abi: ilp32
- Definition of data layout string
- Tests for added relocations. Not comprehensive, but matches
existing tests for 64-bit. Renames "CHECK-OBJ" to "CHECK-OBJ-LP64".
- Tests for llvm-readobj
Pablo Barrio [Mon, 24 Oct 2016 13:04:45 +0000 (13:04 +0000)]
[JumpThreading] Unfold selects that depend on the same condition
Summary:
These are good candidates for jump threading. This enables later opts
(such as InstCombine) to combine instructions from the selects with
instructions out of the selects. SimplifyCFG will fold the select
again if unfolding wasn't worth it.
Pavel Labath [Mon, 24 Oct 2016 10:59:17 +0000 (10:59 +0000)]
Remove TimeValue usage from llvm/Support
Summary:
This is a follow-up to D25416. It removes all usages of TimeValue from
llvm/Support library (except for the actual TimeValue declaration), and replaces
them with appropriate usages of std::chrono. To facilitate this, I have added
small utility functions for converting time points and durations into appropriate
OS-specific types (FILETIME, struct timespec, ...).
Simon Dardis [Mon, 24 Oct 2016 10:23:59 +0000 (10:23 +0000)]
[mips] synci microMIPS instruction definition.
Add synci to the microMIPS instruction definitions, mark the MIPS sync & synci
as not being part of microMIPS. This does not cover the sync instruction alias,
as that will be handled with a different patch. Add sync to the valid tests for
microMIPS.
Simon Pilgrim [Sun, 23 Oct 2016 16:49:04 +0000 (16:49 +0000)]
[X86][SSE] Add SSE41/AVX1 costs for vector shifts.
We were defaulting to SSE2 costs which weren't taking into account the availability of PBLENDW/PBLENDVB to improve merging of per-element shift results.
Brian Gesiak [Sat, 22 Oct 2016 17:27:31 +0000 (17:27 +0000)]
[lit] Add more testing instructions to README
Summary:
r283710 introduced two regressions, one to llvm-lit, and the other to
lit executables that were installed via setuptools. Add instructions on
how to test for these regressions in the future.
Craig Topper [Sat, 22 Oct 2016 06:51:52 +0000 (06:51 +0000)]
[X86] Add support for lowering v4i64 and v8i64 shuffles directly to PALIGNR. I think shuffle combine can figure it out later, but we should try to get it right up front.
Craig Topper [Sat, 22 Oct 2016 06:51:44 +0000 (06:51 +0000)]
[X86] Remove 128-bit lane handling from the main loop of matchVectorShuffleAsByteRotate. Instead check for is128LaneRepeatedSuffleMask before the loop and just loop over the repeated mask.
I plan to use the loop to support VALIGND/Q shuffles so this makes it easier to reuse.
Gerolf Hoflehner [Sat, 22 Oct 2016 02:41:39 +0000 (02:41 +0000)]
[BasicAA] Fix - missed alias in GEP expressions
In BasicAA GEP operand values get adjusted ("wrap-around") based on the
pointersize. Otherwise, in non-64b modes, AA could report false negatives.
However, a wrap-around is valid only for a fully evaluated expression.
It had been introduced to fix an alias problem in
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160118/326163.html.
This commit restricts the wrap-around to constant gep operands only where the
value is known at compile-time.
David L. Jones [Fri, 21 Oct 2016 23:30:39 +0000 (23:30 +0000)]
Fix map insertion that is elided in release build.
The assert() macro doesn't actually execute its body in Release builds, so using
it to check cache invariants requires that the insertion be outside of the
assert() statement. This change does that, and also makes sure to return the
actual map contents.
Justin Lebar [Fri, 21 Oct 2016 22:10:23 +0000 (22:10 +0000)]
[ADT] Don't rely on string literals not being convertible to non-const char* in CachedHashString.
The build was breaking on some platforms because we assumed that
CachedHashString("foo") would match the CachedHashString(StringRef)
constructor rather than the CachedHashString(char*) constructor.
To fix this, provide a CachedHashString(const char*) constructor, and
add a dummy argument to the old CachedHashString(char*) constructor.