[x86][inline-asm][clang] accept 'v' constraint
Commit on behalf of: Coby Tayree
1.'v' constraint for (x86) non-avx arch imitates the already implemented 'x' constraint, i.e. allows XMM{0-15} & YMM{0-15} depending on the apparent arch & mode (32/64).
2.for the avx512 arch it allows [X,Y,Z]MM{0-31} (mode dependent)
This patch applies the needed changes to clang
LLVM patch: https://reviews.llvm.org/D25005
Craig Topper [Tue, 1 Nov 2016 05:47:56 +0000 (05:47 +0000)]
[AVX-512] Remove masked vector insert builtins and replace with native shufflevectors and selects.
Unfortunately, the backend currently doesn't fold masks into the instructions correctly when they come from these shufflevectors. I'll work on that in a future commit.
Richard Smith [Tue, 1 Nov 2016 01:34:46 +0000 (01:34 +0000)]
Implement ABI proposal for throwing noexcept function pointers, per discussion
on cxx-abi-dev (thread starting 2016-10-11). This is currently hidden behind a
cc1-only -m flag, pending discussion of how best to deal with language changes
that require use of new symbols from the ABI library.
Richard Smith [Tue, 1 Nov 2016 01:31:23 +0000 (01:31 +0000)]
p0012: Teach resolving address of overloaded function with dependent exception
specification to resolve the exception specification as part of the type check,
in C++1z onwards. This is not actually part of P0012 / CWG1330 rules for when
an exception specification is "needed", but is necessary for sanity.
Eric Fiselier [Mon, 31 Oct 2016 23:07:15 +0000 (23:07 +0000)]
[modules] Mark deleted functions as implicitly inline to allow merging
Summary: When merging definitions with ModulesLocalVisibility enabled it's important to make deleted definitions implicitly inline, otherwise they'll be diagnosed as a redefinition.
John McCall [Mon, 31 Oct 2016 21:56:26 +0000 (21:56 +0000)]
A compound literal within a global lambda or block is still within
the body of a function for the purposes of computing its storage
duration and deciding whether its initializer must be constant.
There are a number of problems in our current treatment of compound
literals. C specifies that a compound literal yields an l-value
referring to an object with either static or automatic storage
duration, depending on where it was written; in the latter case,
the literal object has a lifetime tied to the enclosing scope (much
like an ObjC block), not the enclosing full-expression. To get these
semantics fully correct in our current design, we would need to
collect compound literals on the ExprWithCleanups, just like we do
with ObjC blocks; we would probably also want to identify literals
like we do with materialized temporaries. But it gets stranger;
GCC adds compound literals to C++ as an extension, but makes them
r-values, which are generally assumed to have temporary storage
duration. Ignoring destructor ordering, the difference only matters
if the object's address escapes the full-expression, which for an
r-value can only happen with reference binding (which extends
temporaries) or array-to-pointer decay (which does not). GCC then
attempts to lock down on array-to-pointer decay in ad hoc ways.
Arguably a far superior language solution for C++ (and perhaps even
array r-values in C, which can occur in other ways) would be to
propagate lifetime extension through array-to-pointer decay, so
that initializing a pointer object to a decayed r-value array
extends the lifetime of the complete object containing the array.
But this would be a major change in semantics which arguably ought
to be blessed by the committee(s).
Anyway, I'm not fixing any of that in this patch; I did try, but
it got out of hand.
Artem Dergachev [Mon, 31 Oct 2016 21:04:54 +0000 (21:04 +0000)]
[analyzer] MacOSXAPIChecker: Improve warnings for __block vars in dispatch_once.
The checker already warns for __block-storage variables being used as a
dispatch_once() predicate, however it refers to them as local which is not quite
accurate, so we fix that.
Richard Smith [Mon, 31 Oct 2016 18:18:29 +0000 (18:18 +0000)]
When diagnosing that a defaulted function is ill-formed because it would be
implicitly deleted and overrides a non-deleted function, explain why the
function is deleted. For PR30844.
Artem Dergachev [Mon, 31 Oct 2016 17:27:26 +0000 (17:27 +0000)]
[analyzer] MacOSXAPIChecker: Disallow dispatch_once_t in ivars and heap.
Unlike global/static variables, calloc etc. functions that allocate ObjC
objects behave differently in terms of memory barriers, and hacks that make
dispatch_once as fast as it possibly could be start failing.
[x86][inline-asm][AVX512][clang][PART-1] Introducing "k" and "Yk" constraints for extended inline assembly, enabling use of AVX512 masked vectorized instructions.
Commit on behalf of mharoush
Extending inline assembly support, compatible with GCC as folowing:
"k" constraint hints the compiler to select any of AVX512 k0-k7 registers.
"Yk" constraint is a subset of "k" excluding k0 which is not allowd to be used as a mask.
[x86][inline-asm] Add support for curly brackets escape using "%" in extended inline asm.
Commit on behalf of mharoush
After LGTM and check all:
This patch is a compatibility fix for clang, matching GCC support for charter escape when using extended in-line assembly (i.e, "%{" ,"%}" --> "{" ,"}" ).
It is meant to enable support for advanced features such as AVX512 conditional\masked vector instructions/broadcast assembly syntax.
Ulrich Weigand [Mon, 31 Oct 2016 14:38:05 +0000 (14:38 +0000)]
[SystemZ] Add -march=archX aliases
For compatibility with other compilers on the platform, allow specifying
levels of the z/Architecture instead of model names with -march. In
particular, the following aliases are now supported:
Daniel Jasper [Mon, 31 Oct 2016 13:23:00 +0000 (13:23 +0000)]
Skip over AnnotatedLines with >50 levels of nesting; don't format them.
Reasoning:
- ExpressionParser uses a lot of stack for these, bad in some environments.
- Our formatting algorithm is N^3 and gets really slow.
- The resulting formatting is unlikely to be any good.
- This is probably generated code we're formatting by accident.
We treat these as unparseable, and signal incomplete formatting. 50 is
an arbitrary number, I've only seen real problems from ~150 levels.
David Majnemer [Mon, 31 Oct 2016 05:37:48 +0000 (05:37 +0000)]
Add support for __builtin_alloca_with_align
__builtin_alloca always uses __BIGGEST_ALIGNMENT__ for the alignment of
the allocation. __builtin_alloca_with_align allows the programmer to
specify the alignment of the allocation.
Craig Topper [Mon, 31 Oct 2016 04:30:56 +0000 (04:30 +0000)]
[AVX-512] Remove masked vector extract builtins and replace with native shufflevectors and selects.
Unfortunately, the backend currently doesn't fold masks into the instructions correctly when they come from these shufflevectors. I'll work on that in a future commit.
Mehdi Amini [Sun, 30 Oct 2016 23:26:13 +0000 (23:26 +0000)]
Fix clang installed path to handle case where clang is invoked through a symlink
This code path is used when generating the path to libLTO.dylib, which
is passed to the linker as `-lto_library'.
Without this, if clang is invoked through a symlink, libLTO is
searched in a path relative to where the symlink is instead of
where clang is actually installed.
Piotr Padlewski [Sat, 29 Oct 2016 15:28:30 +0000 (15:28 +0000)]
[Devirtualization] Decorate vfunction load with invariant.load
Summary:
This patch was introduced one year ago, but because my google account
was disabled, I didn't get email with failing buildbot and I missed
revert of this commit. There was small but in test regex.
I am back.
[X86][AVX512][Clang][Intrinsics][reduce] Adding missing reduce (max|min) intrinsics to Clang .
After LGTM and Check-all
Vector-reduction arithmetic accepts vectors as inputs and produces
scalars as outputs.This class of vector operation forms the basis
of many scientific computations. In vector-reduction arithmetic,
the evaluation off is independent of the order of the input elements of V.
Justin Lebar [Fri, 28 Oct 2016 16:46:39 +0000 (16:46 +0000)]
Relax assertion in FunctionDecl::doesDeclarationForceExternallyVisibleDefinition.
Previously we were asserting that this declaration doesn't have a body
*and* won't have a body after we continue parsing. This is too strong
and breaks the go-bindings test during codegen.
Implement the -dI as supported by GCC: Output ‘#include’ directives in addition
to the result of preprocessing.
This change aims to add this option, pass it through to the preprocessor via
the options class, and when inclusions occur we output some information (+ test
cases).
Justin Lebar [Fri, 28 Oct 2016 16:26:26 +0000 (16:26 +0000)]
[CUDA] [AST] Allow isInlineDefinitionExternallyVisible to be called on functions without bodies.
Summary:
In CUDA compilation, we call isInlineDefinitionExternallyVisible (via
getGVALinkageForFunction) on functions while parsing their definitions.
At the point in time when we call getGVALinkageForFunction, we haven't
yet added the body to the function, so we trip this assert. But as far
as I can tell, this is harmless.
To work around this, we add a new flag to FunctionDecl, "WillHaveBody".
There was other code that was working around the existing assert with a
really awful hack -- this change lets us get rid of that hack.
OpenCL disallows using variadic arguments (s6.9.e and s6.12.5 OpenCL v2.0)
apart from some exceptions:
- printf
- enqueue_kernel
This change adds error diagnostic for variadic functions but accepts printf
and any compiler internal function (which should cover __enqueue_kernel_XXX cases).
It also unifies diagnostic with block prototype and adds missing uncaught cases for blocks.
This patch adds an objc_subclassing_restricted attribute into clang. This
attribute acts similarly to 'final' - Objective-C classes with this attribute
can't be subclassed. However, @interface declarations that have
objc_subclassing_restricted but don't have @implementation are allowed to
inherit other @interface declarations with objc_subclassing_restricted. This is
needed to describe the Swift class hierarchy in clang while making sure that
the Objective-C classes cannot subclass the Swift classes.
This attribute is already implemented in a fork of clang that's used for Swift
(https://github.com/apple/swift-clang) and this patch moves that code to the
upstream clang repository.
Erik Verbruggen [Fri, 28 Oct 2016 08:28:42 +0000 (08:28 +0000)]
Sema: do not warn about unused const vars if main file is a header
If we pass a header to libclang, e.g. because it's open in an editor in
an IDE, warnings about unused const vars are not useful: other files
that include the header might use those constants. So when -x *-header
is passed as command-line option, suppress this warning.
[Modules] Add testcase for builtins used in umbrella headers
This used to work before r284797 + r285152, which exposed something
interesting; some users include builtins from umbrella headers.
Clang should emit a warning to warn users this is not a good practice
and umbrella headers shouldn't get the
implicitly-add-the-builtin-version behavior for builtin header names.
While we're not there, add the testcase to represent the way it
currently works.
Richard Trieu [Fri, 28 Oct 2016 00:15:24 +0000 (00:15 +0000)]
Fix a crash on invalid code.
The diagnostic was attempting to access the QualType of a TypeDecl by calling
TypeDecl::getTypeForDecl. However, the Type pointer stored there is lazily
loaded by functions in ASTContext. In most cases, the pointer is loaded and
this does not cause a problem. However, when more that 50 or so unknown types
are seen beforehand, this causes the Type to not be loaded, passing a null
Type to the diagnostics, leading to the crash. Using
ASTContext::getTypeDeclType will give a proper QualType for all cases.
Anna Zaks [Thu, 27 Oct 2016 21:38:44 +0000 (21:38 +0000)]
[docs] Update the TSan and MSan docs to refer to the new no_sanitize attribute
TSan and MSan were the only remaining sanitizers referring to the deprecated
attribute for issue suppression. Update the docs to recommend
__attribute__((no_sanitize("..."))) instead.
Samuel Antao [Thu, 27 Oct 2016 18:14:55 +0000 (18:14 +0000)]
[Driver][OpenMP] Add support to create jobs for unbundling actions.
Summary:
This patch adds the support to create jobs for the `OffloadBundlingAction` which will invoke the `clang-offload-bundler` tool to unbundle input files.
Unlike other actions, unbundling actions have multiple outputs. Therefore, this patch adds the required changes to have a variant of `Tool::ConstructJob` with multiple outputs.
The way the naming of the results is implemented is also slightly modified so that the same action can use a different offloading prefix for each use by the different offloading actions.
With this patch, it is possible to compile a functional OpenMP binary with offloading support, even with separate compilation.
Samuel Antao [Thu, 27 Oct 2016 18:00:51 +0000 (18:00 +0000)]
[Driver][OpenMP] Update actions builder to create unbundling action when necessary.
Summary:
Each time that offloading support is requested by the user and the input file is not a source file, an action `OffloadUnbundlingAction` is created to signal that the input file may contain bundles, so that the proper tool is then invoked to attempt to extract the components of the bundle. This patch adds the logic to create that action in offload action builder.
The job creation for the new action will be proposed in a separate patch.
Samuel Antao [Thu, 27 Oct 2016 17:50:43 +0000 (17:50 +0000)]
[Driver][OpenMP] Update actions builder to create bundling action when necessary.
Summary:
In order to save the user from dealing with multiple output files (for host and device) while using separate compilation, a new action `OffloadBundlingAction` is used when the last phase is not linking. This action will then result in a job that uses the proposed bundling tool to create a single preprocessed/IR/ASM/Object file from multiple ones.
The job creation for the new action will be proposed in a separate patch.
Samuel Antao [Thu, 27 Oct 2016 17:39:44 +0000 (17:39 +0000)]
[Driver][OpenMP] Add logic for offloading-specific argument translation.
Summary:
This patch includes support for argument translation that is specific of a given offloading kind. Additionally, it implements the translation for OpenMP device kinds in the gcc tool chain.
With this patch, it is possible to compile a functional OpenMP application with offloading capabilities with no separate compilation.
Samuel Antao [Thu, 27 Oct 2016 17:31:22 +0000 (17:31 +0000)]
[Driver][OpenMP] Build jobs for OpenMP offloading actions for targets using gcc tool chains.
Summary:
This patch adds logic to create jobs for OpenMP offloading actions by:
- tuning the jobs result information to use the offloading prefix even for (device) linking actions.
- replacing the device inputs of the host linking jobs by a linker script that embed them in the right sections.