Simon Pilgrim [Tue, 28 Jun 2016 08:08:15 +0000 (08:08 +0000)]
[X86][SSE] Added support for combining target shuffles to (V)PSHUFD/VPERMILPD/VPERMILPS immediate permutes
This patch allows target shuffles to be combined to single input immediate permute instructions - (V)PSHUFD/VPERMILPD/VPERMILPS - allowing more general pattern matching than what we current do and improves the likelihood of memory folding compared to existing patterns which tend to reuse the input in multiple arguments.
Further permute instructions (V)PSHUFLW/(V)PSHUFHW/(V)PERMQ/(V)PERMPD may be added in the future but its proven tricky to create tests cases for them so far. (V)PSHUFLW/(V)PSHUFHW is already handled quite well in combineTargetShuffle so it may be that removing some of that code may allow us to perform more of the combining in one place without duplication.
This patch enhances dot graph viewer to show hot regions
with hot bbs/edges displayed in red. The ratio of the bb
freq to the max freq of the function needs to be no less
than the value specified by view-hot-freq-percent option.
The default value is 10 (i.e. 10%).
MBFI supports profile count dumping and function
name based filtering. Add these two feature to
BFI as well. The filtering option is shared between
BFI and MBFI: -view-bfi-func-name=..
Adam Nemet [Tue, 28 Jun 2016 04:02:47 +0000 (04:02 +0000)]
[LLE] Don't hoist conditionally executed loads
If the load is conditional we can't hoist its 0-iteration instance to
the preheader because that would make it unconditional. Thus we would
access a memory location that the original loop did not access.
BFI and MBFI's dot traits class share most of the
code and all future enhancement. This patch extracts
common implementation into base class BFIDOTGraphTraitsBase.
This patch also enables BFI graph to show branch probability
on edges as MBFI does before.
Vedant Kumar [Tue, 28 Jun 2016 02:09:39 +0000 (02:09 +0000)]
Reapply "[llvm-cov] Add an -output-dir option for the show sub-command""
Passing -output-dir path/to/dir to llvm-cov show creates path/to/dir if
it doesn't already exist, and prints reports into that directory.
In function view mode, all views are written into
path/to/dir/functions.$EXTENSION. In file view mode, all views are
written into path/to/dir/coverage/$PATH.$EXTENSION.
Vedant Kumar [Tue, 28 Jun 2016 00:18:57 +0000 (00:18 +0000)]
[llvm-cov] Add an -output-dir option for the show sub-command
Passing -output-dir path/to/dir to llvm-cov show creates path/to/dir if
it doesn't already exist, and prints reports into that directory.
In function view mode, all views are written into
path/to/dir/functions.$EXTENSION. In file view mode, all views are
written into path/to/dir/coverage/$PATH.$EXTENSION.
Chandler Carruth [Mon, 27 Jun 2016 23:26:08 +0000 (23:26 +0000)]
[PM] Improve the debugging and logging facilities of the CGSCC bits of
the new pass manager.
This adds operator<< overloads for the various bits of the
LazyCallGraph, dump methods for use from the debugger, and debug logging
using them to the CGSCC pass manager.
Having this was essential for debugging the call graph update patch, and
I've extracted what I could from that patch here to minimize the delta.
Kevin Enderby [Mon, 27 Jun 2016 21:39:39 +0000 (21:39 +0000)]
Change all but the last ErrorOr<...> use for MachOUniversalBinary to Expected<...> to
allow a good error message to be produced.
I added the one test case that the object file tools could produce an error
message. The other two errors can’t be triggered if the input file is passed
through sys::fs::identify_magic(). But the malformedError("bad magic number")
does get triggered by the logic in llvm-dsymutil when dealing with a normal
Mach-O file. The other "File too small ..." error would take a logic error
currently to produce and is not tested for.
Davide Italiano [Mon, 27 Jun 2016 20:38:39 +0000 (20:38 +0000)]
[llvm-ar] Ignore -plugin option.
binutils ar uses -plugin to specify the LTO plugin, but LLVM doesn't
need this as it doesn't use a plugin for LTO. Accepting (and ignoring)
the option allows interoperability with existing build systems and
make downstream consumers life much easier.
Chris Bieneman [Mon, 27 Jun 2016 19:53:53 +0000 (19:53 +0000)]
[yaml2obj] Remove --format option in favor of YAML tags
Summary:
Our YAML library's handling of tags isn't perfect, but it is good enough to get rid of the need for the --format argument to yaml2obj. This patch does exactly that.
Instead of requiring --format, it infers the format based on the tags found in the object file. The supported tags are:
!ELF
!COFF
!mach-o
!fat-mach-o
I have a corresponding patch that is quite large that fixes up all the in-tree test cases.
Artur Pilipenko [Mon, 27 Jun 2016 16:54:33 +0000 (16:54 +0000)]
Revert -r273892 "Support arbitrary addrspace pointers in masked load/store intrinsics" since some of the clang tests don't expect to see the updated signatures.
Artur Pilipenko [Mon, 27 Jun 2016 16:29:26 +0000 (16:29 +0000)]
Support arbitrary addrspace pointers in masked load/store intrinsics
This is a resubmittion of 263158 change after fixing the existing problem with intrinsics mangling (see LTO and intrinsics mangling llvm-dev thread for details).
This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace.
The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics.
Kuba Brecka [Mon, 27 Jun 2016 15:57:08 +0000 (15:57 +0000)]
[asan] fix false dynamic-stack-buffer-overflow report with constantly-sized dynamic allocas, LLVM part
See the bug report at https://github.com/google/sanitizers/issues/691. When a dynamic alloca has a constant size, ASan instrumentation will treat it as a regular dynamic alloca (insert calls to poison and unpoison), but the backend will turn it into a regular stack variable. The poisoning/unpoisoning is then broken. This patch will treat such allocas as static.
Renato Golin [Mon, 27 Jun 2016 14:42:20 +0000 (14:42 +0000)]
[ARM] Fix Thumb text sections' flags under COFF/Windows
The main issue here is that the "thumb" flag wasn't set for some of these
sections, making MSVC's link.exe fails to correctly relocate code
against the symbols inside these sections. link.exe could fail for
instance with the "fixup is not aligned for target 'XX'" error. If
linking doesn't fail, the relocation process goes wrong in the end and
invalid code is generated by the linker.
This patch adds Thumb/ARM information so that the right flags are set
on COFF/Windows.
Diana Picus [Mon, 27 Jun 2016 09:08:23 +0000 (09:08 +0000)]
[ARM] Do not test for CPUs, use SubtargetFeatures (Part 2). NFCI
This is a follow-up for r273544.
The end goal is to get rid of the isSwift / isCortexXY / isWhatever methods.
Since the ARM backend seems to have quite a lot of calls to these methods, I
intend to submit 5-6 subtarget features at a time, instead of one big lump.
Pawel Bylica [Mon, 27 Jun 2016 08:46:23 +0000 (08:46 +0000)]
CachePruning: correct comment about file order. NFC
Summary: Actually the list of cached files is sorted by file size, not by last accessed time. Also remove unused file access time param for a helper function.
Simon Pilgrim [Mon, 27 Jun 2016 07:44:32 +0000 (07:44 +0000)]
[X86][AVX] Peek through bitcasts to find the source of broadcasts
AVX1 can only broadcast vectors as floats/doubles, so for 256-bit vectors we insert bitcasts if we are shuffling v8i32/v4i64 types. Unfortunately the presence of these bitcasts prevents the current broadcast lowering code from peeking through cases where we have concatenated / extracted vectors to create the 256-bit vectors.
This patch allows us to peek through bitcasts as long as the number of elements doesn't change (i.e. element bitwidth is the same) so the broadcast index is not affected.
Note this bitcast peek is different from the stage later on which doesn't care about the type and is just trying to find a load node.
[lit] Add SANITIZER_IGNORE_CVE_2016_2143 to pass_vars.
This variable is used by ASan (and other sanitizers in the future)
on s390x-linux to override a check for CVE-2016-2143 in the running
kernel (see revision 267747 on compiler-rt). Since the check simply
checks if the kernel version is in a whitelist of known-good versions,
it may miss distribution kernels, or manually-patched kernels - hence
the need for this variable. To enable running the ASan testsuite on
such kernels, this variable should be passed from the environment
down to the testcases.