Marina Yatsina [Mon, 7 Mar 2016 18:10:25 +0000 (18:10 +0000)]
[ms-inline-asm][AVX512] Add ability to use k registers in MS inline asm + fix bug with curly braces
Until now curly braces could only be used in MS inline assembly to mark block start/end.
All curly braces were removed completely at a very early stage.
This approach caused bugs like:
"m{o}v eax, ebx" turned into "mov eax, ebx" without any error.
In addition, AVX-512 added special operands (e.g., k registers), which are also surrounded by curly braces that mark them as such.
Now, we need to keep the curly braces and identify at a later stage if they are marking block start/end (if so, ignore them), or surrounding special AVX-512 operands (if so, parse them as such).
This patch fixes the bug described above and enables the use of AVX-512 special operands.
This commit is the clang part of the patch.
The clang part of the review is: http://reviews.llvm.org/D17766
The llvm part of the review is: http://reviews.llvm.org/D17767
Carlo Bertolli [Mon, 7 Mar 2016 16:04:49 +0000 (16:04 +0000)]
Reapply r262741 [OPENMP] Codegen for distribute directive
This patch provides a basic implementation of codegen for the distribute directive, excluding all clauses except dist_schedule. It also fixes parts of the AST reader/writer to enable correct pre-compiled header handling.
David Majnemer [Mon, 7 Mar 2016 08:51:17 +0000 (08:51 +0000)]
[MS ABI] Mangle symbol names longer than 4096 characters correctly
Really long symbols are hashed using MD5 and prefixed/suffixed with the
usual sigils. There is an additional reason beyond the usual
compatibility with MSVC: it is important to keep COFF symbols shorter
than 0xFFFF because the CodeView debugging format has a maximum
symbol/record size of 0xFFFF.
There are some quirks worth noting:
- Some mangled names reference other entities which are mangled
separately. A quick example:
int I;
template <int *> struct S {};
S<&I> s;
In this case, the mangling for 's' doesn't depend directly on the
mangling for 'I'. While 's' would need an MD5 hash if 'I' also needed
one, the hash for 's' is applied to the fully realized mangled name. In
other words, the mangled name for 's' will not depend on the MD5 of the
mangled name for 'I'.
- Some mangled names, like the venerable CatchableType, embed the MD5
verbatim.
- Finally, the complete object locator is handled as a special case.
Complete object locators are mangled exactly like a VFTable except for
a small deviation in the prefix sigils. However, complete object
locators for hashed vftables result in a complete object locator whose
name is identical to the vftable except for an additional suffix.
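The length rule above can be sketched roughly as follows. This is only an illustration under stated assumptions: `shortenMangledName` and `kMaxMangledLength` are hypothetical names, `std::hash` stands in for MD5 purely to keep the sketch self-contained, and the "??@...@" wrapper is an assumed form of the "usual sigils" rather than something taken from this commit.

```cpp
#include <cstddef>
#include <functional>
#include <iomanip>
#include <sstream>
#include <string>

// Threshold taken from the commit title; the constant name is hypothetical.
constexpr std::size_t kMaxMangledLength = 4096;

// Hypothetical helper: returns the name unchanged when it is short enough,
// otherwise replaces it with a fixed-size digest wrapped in sigils.
// NOTE: std::hash is only a stand-in for MD5, and the "??@" / "@" wrapper
// is an assumption made for illustration.
std::string shortenMangledName(const std::string &mangled) {
  if (mangled.size() <= kMaxMangledLength)
    return mangled;

  // The digest is computed over the fully realized mangled name.
  const std::size_t digest = std::hash<std::string>{}(mangled);

  std::ostringstream os;
  os << "??@" << std::hex << std::setw(16) << std::setfill('0') << digest
     << "@";
  return os.str();
}
```

Because the digest is taken over the complete mangled string, the result for 's' in the example above does not depend on whether 'I' was hashed on its own.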
Richard Trieu [Sat, 5 Mar 2016 04:04:57 +0000 (04:04 +0000)]
Add null check to diagnostic path for lambda captures.
Previously, the failed capture of a variable in nested lambdas could crash when
the lambda pointer was null. Only give the note if a location can be retrieved
from the lambda pointer.
Devin Coughlin [Sat, 5 Mar 2016 01:32:43 +0000 (01:32 +0000)]
[analyzer] Nullability: add option to not report on calls to system headers.
Add an -analyzer-config 'nullability:NoDiagnoseCallsToSystemHeaders' option to
the nullability checker. When enabled, this option stops the analyzer from
reporting null/nullable values passed to functions and methods declared
in system headers.
This option is motivated by the observation that large projects may have many
nullability warnings. These projects may find warnings about nullability
annotations that they have explicitly added themselves higher priority to fix
than warnings on calls to system libraries.
Nico Weber [Fri, 4 Mar 2016 21:59:42 +0000 (21:59 +0000)]
clang-cl: Enable PCH flags by default.
Now that pragma comment and pragma detect_mismatch are implemented, this might
just work.
Some pragmas aren't serialized yet (off the top of my head: code_seg, bss_seg,
data_seg, const_seg, init_seg, section, vtordisp), but these are as far as I
know usually pushed and popped within the header and usually don't leak out.
If it turns out the current PCH support isn't good enough yet, we can turn it
off again.
Stephen Hines [Fri, 4 Mar 2016 20:57:22 +0000 (20:57 +0000)]
Switch krait to use -mcpu=cortex-a15 for assembler tool invocations.
Summary:
Using -no-integrated-as causes -mcpu=krait to be transformed into
-march=armv7-a today. This precludes the assembler from using
instructions like sdiv, which are present for krait. Cortex-a15 is the
closest subset of functionality for krait, so we should switch the
assembler to use that instead.
Carlo Bertolli [Fri, 4 Mar 2016 20:24:58 +0000 (20:24 +0000)]
[OPENMP] Codegen for distribute directive
This patch provides a basic implementation of codegen for the distribute directive, excluding all clauses except dist_schedule. It also fixes parts of the AST reader/writer to enable correct pre-compiled header handling.
James Y Knight [Fri, 4 Mar 2016 19:00:41 +0000 (19:00 +0000)]
Make TargetInfo store an actual DataLayout instead of a string.
Use it to calculate UserLabelPrefix, instead of specifying it (often
incorrectly).
Note that the *actual* user label prefix has always come from the
DataLayout, and is handled within LLVM. The main thing clang's
TargetInfo::UserLabelPrefix did was to set the #define value. Having
these be different from each other is just silly.
Devin Coughlin [Fri, 4 Mar 2016 18:09:58 +0000 (18:09 +0000)]
[analyzer] Add diagnostic in ObjCDeallocChecker for use of -dealloc instead of -release.
In dealloc methods, the analyzer now warns when -dealloc is called directly on
a synthesized retain/copy ivar instead of -release. This is intended to find mistakes of
the form:
- (void)dealloc {
  [_ivar dealloc]; // Mistaken call to -dealloc instead of -release
}
Pavel Labath [Fri, 4 Mar 2016 10:00:08 +0000 (10:00 +0000)]
[SemaExprCXX] Avoid calling isInSystemHeader for invalid source locations
Summary:
While diagnosing a CXXNewExpr warning, we were calling isInSystemHeader(), which expects to be
called with a valid source location. This causes an assertion failure if the location is unknown.
A quick grep shows it's not without precedent to guard calls to the function with a
"Loc.isValid()".
This fixes a test failure in LLDB, which always creates objects with invalid source locations as it
does not (always) have access to the source.
Vedant Kumar [Fri, 4 Mar 2016 08:07:15 +0000 (08:07 +0000)]
[Coverage] Fix the start/end locations of switch statements
While pushing switch statements onto the region stack we neglected to
specify their start/end locations. This results in a crash (PR26825) if
we end up in nested macro expansions without enough information to
handle the relevant file exits.
I added a test in switchmacro.c and fixed up a bunch of incorrect CHECK
lines that specify strange end locations for switches.
David Majnemer [Fri, 4 Mar 2016 05:26:16 +0000 (05:26 +0000)]
[X86] Pass __m64 types via SSE registers for GCC compatibility
For compatibility with GCC, classify __m64 as SSE.
However, clang is a platform compiler for certain targets; retain our
old behavior on those targets: classify __m64 as integer.
David Majnemer [Fri, 4 Mar 2016 05:26:14 +0000 (05:26 +0000)]
[VFS] Switch from close to SafelyCloseFileDescriptor
The SafelyCloseFileDescriptor machinery does the right thing in the face
of signals while close will do something platform specific which results
in the FD potentially getting leaked.
Carlo Bertolli [Thu, 3 Mar 2016 22:09:40 +0000 (22:09 +0000)]
[OPENMP] firstprivate and private clauses of teams, host code generation
Add code generation support for firstprivate and private clauses of teams on the host. Add extensive regression tests including lambda functions and VLA testing.
[OpenCL] Improve diagnostics of address spaces for variables in function
- Prevent local variables from being declared in the global AS
- Diagnose the AS of local variables with an extern storage class
as if they were at program scope
Samuel Antao [Thu, 3 Mar 2016 16:20:23 +0000 (16:20 +0000)]
[OpenMP] Code generation for teams - kernel launching
Summary:
This patch implements the launching of a target region in the presence of a nested teams region, i.e., it calls tgt_target_teams with the required arguments gathered from the enclosed teams directive.
The actual codegen of the region enclosed by the teams construct will be contributed in a separate patch.
[OpenCL] Apply missing restrictions for Blocks in OpenCL v2.0
Applying the following restrictions for block types in OpenCL (v2.0 s6.12.5):
- __block storage class is disallowed
- every block declaration must be const qualified and initialized
- a block can't be used as a return type of a function
- a block can't be used to declare a structure or union field
- extern specifier is disallowed
Corrected image and sampler type diagnostics for structs and unions.
Alexey Bataev [Thu, 3 Mar 2016 05:21:39 +0000 (05:21 +0000)]
[OPENMP 4.0] Initial support for 'omp declare reduction' construct.
Add parsing, sema analysis and serialization/deserialization for 'declare reduction' construct.
User-defined reductions are defined as
#pragma omp declare reduction( reduction-identifier : typename-list : combiner ) [initializer ( initializer-expr )]
These custom reductions may be used in 'reduction' clauses of OpenMP constructs. The combiner specifies how partial results can be combined into a single value. The
combiner can use the special variable identifiers omp_in and omp_out that are of the type of the variables being reduced with this reduction-identifier. Each of them will
denote one of the values to be combined before executing the combiner. It is assumed that the special omp_out identifier will refer to the storage that holds the resulting
combined value after executing the combiner.
As the initializer-expr value of a user-defined reduction is not known a priori the initializer-clause can be used to specify one. Then the contents of the initializer-clause
will be used as the initializer for private copies of reduction list items where the omp_priv identifier will refer to the storage to be initialized. The special identifier
omp_orig can also appear in the initializer-clause and it will refer to the storage of the original variable to be reduced.
Differential Revision: http://reviews.llvm.org/D11182
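As a rough illustration of the syntax described above, here is a minimal sketch of a user-defined reduction. The reduction-identifier `maxval` and the function `maxValue` are hypothetical names chosen for this example. Actually running the loop in parallel needs a compiler with full 'declare reduction' support and OpenMP enabled (this commit covers parsing, sema and serialization only); without OpenMP the pragmas are ignored and the loop runs serially with the same result.

```cpp
#include <vector>

// User-defined reduction "maxval" over int (hypothetical identifier).
// Combiner: fold the partial result omp_in into omp_out.
// Initializer: each private copy (omp_priv) starts from the original
// variable's value (omp_orig).
#pragma omp declare reduction(maxval : int : omp_out = (omp_in > omp_out ? omp_in : omp_out)) initializer(omp_priv = omp_orig)

// Returns the largest element of `values`, or `floor` if none is larger.
int maxValue(const std::vector<int> &values, int floor) {
  int best = floor;
  const int n = static_cast<int>(values.size());

  // The custom reduction is named in an ordinary 'reduction' clause.
#pragma omp parallel for reduction(maxval : best)
  for (int i = 0; i < n; ++i) {
    if (values[i] > best)
      best = values[i];
  }
  return best;
}
```

Initializing from omp_orig is a design choice that suits a maximum: every private copy starts at a value that can never exceed the final answer.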
Alexey Bataev [Thu, 3 Mar 2016 03:52:24 +0000 (03:52 +0000)]
[OPENMP 4.5] Initial support for data members in 'linear' clause.
OpenMP 4.5 allows privatizing data members of the current class in member
functions. The patch adds initial support for privatization of data members
in the 'linear' clause, without codegen support.
Sean Callanan [Thu, 3 Mar 2016 02:22:05 +0000 (02:22 +0000)]
Caught and fixed a typo in r262572.
I should have checked and imported D's in-class initializer.
Instead I accidentally used ToField's in-class initializer,
which is always NULL so ToField will never get one.
Sean Callanan [Thu, 3 Mar 2016 01:21:28 +0000 (01:21 +0000)]
Fixed a problem where the ASTImporter mishandled in-class initializers.
Previously, the ASTImporter, when copying a FieldDecl, would make the
new FieldDecl use the exact same in-class initializer as the original
FieldDecl, which is a problem since the initializer is in the wrong AST.
The initializer must be imported, just like all the other parts of the
field.
This patch adds doxygen comments for all the intrinsics in the header file tmmintrin.h.
The doxygen comments are automatically generated based on Sony's intrinsics document.
I got an OK from Eric Christopher to commit doxygen comments without prior code review upstream.
Nico Weber [Wed, 2 Mar 2016 23:22:00 +0000 (23:22 +0000)]
Serialize `pragma ms_struct` state.
pragma ms_struct has an effect on struct decls, and the effect is serialized
correctly already. But the "is ms_struct currently on" state wasn't before
this change.
This uses the same approach as `pragma clang optimize`: the state is
serialized only when writing a pch file, not when writing a module.
Turns "foo" into 'foo' (or vice versa, depending on configuration).
This makes it more convenient to follow the Google JavaScript style
guide:
https://google.github.io/styleguide/javascriptguide.xml?showone=Strings#Strings
This functionality is behind the option "JavaScriptQuotes", which can be:
* "leave" (no re-quoting)
* "single" (change to single quotes)
* "double" (change to double quotes)
This also changes single quoted JavaScript string literals to be treated
as tok::string_literal, not tok::char_literal, which fixes two unrelated
tests.
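The re-quoting described above has to deal with quote characters inside the literal. The following is a minimal sketch of one way to do that, not clang-format's actual implementation; `requote` is a hypothetical helper, and its escaping policy (escape the new quote character, drop the escape from the old one) is an assumption made for illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: rewrites a JavaScript string literal so that it is
// delimited by `target` (either '"' or '\''). Literals that are malformed
// or already use `target` are returned unchanged.
std::string requote(const std::string &lit, char target) {
  if (lit.size() < 2)
    return lit;
  const char from = lit.front();
  if ((from != '"' && from != '\'') || lit.back() != from || from == target)
    return lit;

  std::string out(1, target);
  for (std::size_t i = 1; i + 1 < lit.size(); ++i) {
    const char c = lit[i];
    if (c == '\\' && i + 2 < lit.size()) {
      // Escape sequence: keep it, except that the old delimiter no longer
      // needs a backslash inside the new delimiters.
      const char next = lit[++i];
      if (next != from)
        out += '\\';
      out += next;
    } else if (c == target) {
      // An unescaped occurrence of the new delimiter must now be escaped.
      out += '\\';
      out += c;
    } else {
      out += c;
    }
  }
  out += target;
  return out;
}
```

In terms of the option values listed above, "single" would correspond to calling this with a target of '\'', "double" to a target of '"', and "leave" to not calling it at all.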
Artem Belevich [Wed, 2 Mar 2016 21:03:20 +0000 (21:03 +0000)]
Fixed test failure on platforms with name mangling different from Linux.
* Run cc with -triple x86_64-linux-gnu to make symbol mangling predictable.
* Use temporary file as a fake GPU input so its content
does not interfere with pattern matching.
Rong Xu [Wed, 2 Mar 2016 20:59:36 +0000 (20:59 +0000)]
[PGO] Change profile use cc1 option to handle IR level profiles
This patch changes cc1 option for PGO profile use from
-fprofile-instr-use=<path> to -fprofile-instrument-use-path=<path>.
-fprofile-instr-use=<path> is now a driver only option.
In addition to decoupling the cc1 option from the driver-level option, this patch
also enables IR-level profile use. cc1 option handling now reads the profile
header and sets CodeGenOpt ProfileUse (valid values are {None, Clang, LLVM}
-- this is a common enum for -fprofile-instrument={}, for the profile
instrumentation), and invokes the pipeline to enable the respective PGO use pass.
Nico Weber [Wed, 2 Mar 2016 19:28:54 +0000 (19:28 +0000)]
Serialize `#pragma detect_mismatch`.
This is like r262493, but for pragma detect_mismatch instead of pragma comment.
The two pragmas have similar behavior, so use the same approach for both.
Nico Weber [Wed, 2 Mar 2016 17:28:48 +0000 (17:28 +0000)]
Serialize `#pragma comment`.
`#pragma comment` was handled by Sema calling a function on ASTConsumer, and
CodeGen then implementing this function and writing things to its output.
Instead, introduce a PragmaCommentDecl AST node and hang one off the
TranslationUnitDecl for every `#pragma comment` line, and then use the regular
serialization machinery. (Since PragmaCommentDecl has codegen relevance, it's
eagerly deserialized.)
Alexey Bataev [Wed, 2 Mar 2016 04:57:40 +0000 (04:57 +0000)]
[OPENMP 4.5] Codegen for data members in 'reduction' clause.
OpenMP 4.5 allows privatizing non-static data members of the current class
in non-static member functions. The patch supports codegen for non-static
data members in 'reduction' clauses.
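A minimal sketch of what this permits, assuming an OpenMP 4.5 compiler: the class `Accumulator` and the helper `sumWithAccumulator` are hypothetical names used only for this example. Without OpenMP enabled the pragma is ignored and the loop runs serially with the same result.

```cpp
#include <vector>

// Hypothetical class whose non-static data member is used directly in a
// 'reduction' clause inside a non-static member function.
class Accumulator {
  long total = 0;

public:
  void addAll(const std::vector<int> &values) {
    const int n = static_cast<int>(values.size());

    // 'total' names the data member of the current object (this->total).
#pragma omp parallel for reduction(+ : total)
    for (int i = 0; i < n; ++i)
      total += values[i];
  }

  long get() const { return total; }
};

// Small driver so the class can be exercised with a single call.
long sumWithAccumulator(const std::vector<int> &values) {
  Accumulator acc;
  acc.addAll(values);
  return acc.get();
}
```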
Nico Weber [Tue, 1 Mar 2016 23:16:44 +0000 (23:16 +0000)]
clang-cl: Implement initial limited support for precompiled headers.
In the gcc precompiled header model, one explicitly runs clang with `-x
c++-header` on a .h file to produce a gch file, and then includes the header
with `-include foo.h` and if a .gch file exists for that header it gets used.
This is documented at
http://clang.llvm.org/docs/UsersManual.html#precompiled-headers
cl.exe's model is fairly different, and controlled by the two flags /Yc and
/Yu. A pch file is generated as a side effect of a regular compilation when
/Ycheader.h is passed. While the compilation is running, the compiler keeps
track of #include lines in the main translation unit and writes everything up
to an `#include "header.h"` line into a pch file. Conversely, /Yuheader.h tells
the compiler to skip all code in the main TU up to and including `#include
"header.h"` and instead load header.pch. (It's also possible to use /Yc and /Yu
without an argument; in that case a `#pragma hdrstop` takes the role of
controlling the point where pch ends and real code begins.)
This patch implements limited support for this in that it requires the pch
header to be passed as a /FI force include flag – with this restriction,
it can be implemented almost completely in the driver with fairly small amounts
of code. For /Yu, this is trivial, and for /Yc a separate pch action is added
that runs before the actual compilation. After r261774, the first failing
command makes a compilation stop – this means if the pch fails to build the
main compilation won't run, which is what we want. However, in /fallback builds
we need to run the main compilation even if the pch build fails so that the
main compilation's fallback can run. To achieve this, add a ForceSuccessCommand
that pretends that the pch build always succeeded in /fallback builds (the main
compilation will then fail to open the pch and run the fallback cl.exe
invocation).
If /Yc /Yu are used in a setup that clang-cl doesn't implement yet, clang-cl
will now emit a "not implemented yet; flag ignored" warning that can be
disabled using -Wno-clang-cl-pch.
Since clang-cl doesn't yet serialize some important things (most notably
`pragma comment(lib, ...)`), this feature is disabled by default and only
enabled by an internal driver flag. Once it's more stable, this internal flag
will disappear.
(The default stdafx.h setup passes stdafx.h as explicit argument to /Yc but not
as /FI – instead every single TU has to `#include <stdafx.h>` as first thing it
does. Implementing support for this should be possible with the approach in
this patch with minimal frontend changes by passing a --stop-at / --start-at
flag from the driver to the frontend. This is left for a follow-up. I don't
think we ever want to support `#pragma hdrstop`, and supporting it with this
approach isn't easy: This approach relies on the driver knowing the pch
filename in advance, and `#pragma hdrstop(out.pch)` can set the output
filename, so the driver can't know about it in advance.)
clang-cl now also honors /Fp and puts pch files in the same spot that cl.exe
would put them, but the pch file format is of course incompatible. This has
ramifications on /fallback, so /Yc /Yu aren't passed through to cl.exe in
/fallback builds.
John McCall [Tue, 1 Mar 2016 22:18:03 +0000 (22:18 +0000)]
Mangle extended qualifiers in the proper order and mangle the
ARC ownership-convention function type modifications.
According to the Itanium ABI, vendor extended qualifiers are
supposed to be mangled in reverse-alphabetical order before
any CVR qualifiers. The ARC function type conventions are
plausibly order-significant (they are associated with the
function type), which permits us to ignore the need to correctly
inter-order them with any other vendor qualifiers on the parameter
and return types.
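The ordering rule can be sketched as below. This is an illustration, not clang's mangler: `mangleQualifiers` is a hypothetical free function, the vendor qualifier names in the usage note are made up, and plain byte-wise string comparison is assumed as the meaning of "alphabetical". The `U <length> <name>` encoding for vendor qualifiers and the `r`, `V`, `K` letters for restrict, volatile and const follow the Itanium ABI grammar.

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <vector>

// Hypothetical sketch: emit vendor extended qualifiers in reverse
// alphabetical order, followed by the CVR qualifiers.
// ASSUMPTION: "alphabetical" is taken to mean byte-wise string order.
std::string mangleQualifiers(std::vector<std::string> vendor, bool isRestrict,
                             bool isVolatile, bool isConst) {
  // Reverse-alphabetical: later names come first in the mangled string.
  std::sort(vendor.begin(), vendor.end(), std::greater<std::string>());

  std::string out;
  for (const std::string &name : vendor) {
    out += 'U';                         // vendor extended qualifier marker
    out += std::to_string(name.size()); // <source-name> length prefix
    out += name;
  }

  // CVR qualifiers follow all vendor qualifiers.
  if (isRestrict)
    out += 'r';
  if (isVolatile)
    out += 'V';
  if (isConst)
    out += 'K';
  return out;
}
```

For example, with the made-up vendor qualifiers "alpha" and "beta" on a const type, this sketch yields "U4betaU5alphaK".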
Implementing these rules correctly is technically an ABI break.
Apple is comfortable with the risk of incompatibility here for
the ARC features, and I believe that address-space qualification
is still uncommon enough to allow us to adopt the conforming
rule without serious risk. Still, targets which make heavy
use of address space qualification may want to revert to the
non-conforming order.