[OpenMP] Support for the num_threads-clause on 'target parallel' on the NVPTX device.
This patch adds support for the Spmd construct 'target parallel' on the
NVPTX device. This involves ignoring the num_threads clause on the device
since the number of threads in this combined construct is already set on
the host through the call to __tgt_target_teams().
[OpenMP] Support for the num_threads-clause on 'target parallel'.
The num_threads-clause on the combined directive applies to the
'parallel' region of this construct. We modify the NumThreadsClause
class to capture the clause expression within the 'target' region.
The offload runtime call for 'target parallel' is changed to
__tgt_target_teams() with 1 team and the number of threads set by
this clause or a default if none.
Richard Smith [Tue, 24 Jan 2017 19:39:46 +0000 (19:39 +0000)]
[docs] Add TableGen-based generator for command line argument documentation,
and generate documentation for all (non-hidden) options supported by the
'clang' driver.
Alex Lorenz [Tue, 24 Jan 2017 14:15:08 +0000 (14:15 +0000)]
[CodeCompletion] Ensure that ObjC root class completes instance
methods from protocols and categories as well
Code completion results for class methods already include instance methods
from Objective-C root classes. This commit ensures that the results also include
instance methods from protocols that the root class implements and root class
categories as well.
David L. Jones [Tue, 24 Jan 2017 01:04:30 +0000 (01:04 +0000)]
Switch TableGen to emit calls to ASTRecordReader for AttrPCHRead.
Summary:
This patch changes TableGen-generated code in AttrPCHRead to call functions on
ASTRecordReader, instead of passing separate parameters to ASTReader. This is a
follow-up to r290217.
[sanitizer-coverage] emit __sanitizer_cov_trace_pc_guard w/o a preceding 'if' by default. Update the docs, also add deprecation notes around other parts of sanitizer coverage
David L. Jones [Mon, 23 Jan 2017 23:16:58 +0000 (23:16 +0000)]
Add LF_ prefix to LibFunc enums in TargetLibraryInfo.
Summary:
The LibFunc::Func enum holds enumerators named for libc functions.
Unfortunately, there are real situations, including libc implementations, where
function names are actually macros (musl uses "#define fopen64 fopen", for
example; any other transitively visible macro would have similar effects).
Strictly speaking, a conforming C++ Standard Library should provide any such
macros as functions instead (via <cstdio>). However, there are some "library"
functions which are not part of the standard, and thus not subject to this
rule (fopen64, for example). So, in order to be both portable and consistent,
the enum should not use the bare function names.
The old enum naming used a namespace LibFunc and an enum Func, with bare
enumerators. This patch changes LibFunc to be an enum with enumerators prefixed
with "LF_". (Unfortunately, a scoped enum is not sufficient to override macros.)
These changes are for clang. See https://reviews.llvm.org/D28476 for LLVM.
Tim Shen [Mon, 23 Jan 2017 22:39:35 +0000 (22:39 +0000)]
[APFloat] Switch from (PPCDoubleDoubleImpl, IEEEdouble) layout to (IEEEdouble, IEEEdouble)
Summary:
This patch changes the layout of DoubleAPFloat, and adjust all
operations to do either:
1) (IEEEdouble, IEEEdouble) -> (uint64_t, uint64_t) -> PPCDoubleDoubleImpl,
then run the old algorithm.
2) Do the right thing directly.
I could split this into two patches, e.g. use
1) for all operatoins first, then incrementally change some of them to
2). I didn't do that, because 1) involves code that converts data between
PPCDoubleDoubleImpl and (IEEEdouble, IEEEdouble) back and forth, and may
pessimize the compiler. Instead, I find easy functions and use
approach 2) for them directly.
Next step is to implement move multiply and divide from 1) to 2). I don't
have plans for other functions in 1).
[OpenMP] DSAChecker bug fix for combined directives.
The DSAChecker code in SemaOpenMP looks at the captured statement
associated with an OpenMP directive. A combined directive such as
'target parallel' has nested capture statements, which have to be
fully traversed before executing the DSAChecker. This is a patch
to perform the traversal for such combined directives.
Antonio Maiorano [Mon, 23 Jan 2017 13:20:23 +0000 (13:20 +0000)]
clang-format: remove tests that assume no config file will be found as this is not always the case
These tests fail for developers who place their build directories under the
llvm root directory because llvm's own .clang-format file will be found.
Anyway these cases are covered by FormatStyle.GetStyleOfFile tests
(FormatTest.cpp).
Aleksei Sidorin [Mon, 23 Jan 2017 09:30:36 +0000 (09:30 +0000)]
ASTImporter: improve support for C++ templates
* Support template partial specialization
* Avoid infinite recursion in IsStructurallyEquivalent for TemplateArgument with implementing IsStructurallyEquivalent for TemplateName
David Blaikie [Mon, 23 Jan 2017 02:24:03 +0000 (02:24 +0000)]
DebugInfo: Omit class definitions even in the presence of available_externally vtables
To ensure optimization level doesn't pessimize the -fstandalone-debug
vtable debug info optimization (where class definitions are only emitted
where the vtable is emitted - reducing redundant debug info) ensure the
debug info class definition is still omitted when an
available_externally vtable definition is emitted for optimization
purposes.
Tim Shen [Fri, 20 Jan 2017 22:05:33 +0000 (22:05 +0000)]
[Altivec] Change vec_sl to a << (b % (sizeof(a) * 8))
For a << b (as original vec_sl does), if b >= sizeof(a) * 8, the
behavior is undefined. However, Power instructions do define the
behavior, which is equivalent to a << (b % (sizeof(a) * 8)).
This patch changes altivec.h to use a << (b % (sizeof(a) * 8)), to
ensure the consistent semantic of the instructions. Then it combines
the generated multiple instructions back to a single shift.
This patch handles left shift only. Right shift, on the other hand, is
more complicated, considering arithematic/logical right shift.
Alex Lorenz [Fri, 20 Jan 2017 15:38:58 +0000 (15:38 +0000)]
[Sema] Improve the error diagnostic for dot destructor calls on pointer objects
This commit improves the mismatched destructor type error by detecting when the
destructor call has used a '.' instead of a '->' on a pointer to the destructed
type. The diagnostic now suggests to use '->' instead of '.', and adds a fixit
where appropriate.
[clang-format] Remove redundant test in style-on-command-line.cpp
Summary:
rL292562 added a fix to always format if the fallback style is set to "none".
In test/Format/style-on-command-line.cpp:19 is redundant, since -fallback-style
has a default value of LLVM set in ClangFormat.cpp:72.
@amaiorano: I believe that the rest of the test cases still cover your change in
case the fallback style is explicitly set to "none". Please, if this is not the
case, initiate a discussion.
Philipp Stephani [Fri, 20 Jan 2017 09:37:50 +0000 (09:37 +0000)]
Use UTF-8 for all communication with clang-format
Summary: Instead of picking the buffer file coding system, always use utf-8-unix for communicating with clang-format. This is fine because clang-format never actually reads the file to be formatted, only standard input. This is a bit simpler (process coding system is now a constant) and potentially faster, as utf-8-unix is Emacs's internal coding system. Also add an end-to-end test that actually invokes clang-format.
Alexey Bataev [Fri, 20 Jan 2017 08:57:28 +0000 (08:57 +0000)]
[OPENMP] Fix for PR31643: Clang crashes when compiling code on Windows
with SEH and openmp
In some cituations (during codegen for Windows SEH constructs)
CodeGenFunction instance may have CurFn equal to nullptr. OpenMP related
code does not expect such situation during cleanup.
Antonio Maiorano [Fri, 20 Jan 2017 01:22:42 +0000 (01:22 +0000)]
clang-format: fix fallback style set to "none" not always formatting
This fixes clang-format not formatting if fallback-style is explicitly set to
"none", and either a config file is found or YAML is passed in without a
"BasedOnStyle". With this change, passing "none" in these cases will have no
affect, and LLVM style will be used as the base style.
Richard Smith [Thu, 19 Jan 2017 21:00:13 +0000 (21:00 +0000)]
PR13403 (+duplicates): implement C++ DR1310 (http://wg21.link/cwg1310).
Under this defect resolution, the injected-class-name of a class or class
template cannot be used except in very limited circumstances (when declaring a
constructor, in a nested-name-specifier, in a base-specifier, or in an
elaborated-type-specifier). This is apparently done to make parsing easier, but
it's a pain for us since we don't know whether a template-id using the
injected-class-name is valid at the point when we annotate it (we don't yet
know whether the template-id will become part of an elaborated-type-specifier).
As a tentative resolution to a perceived language defect, mem-initializer-ids
are added to the list of exceptions here (they generally follow the same rules
as base-specifiers).
When the reference to the injected-class-name uses the 'typename' or 'template'
keywords, we permit it to be used to name a type or template as an extension;
other compilers also accept some cases in this area. There are also a couple of
corner cases with dependent template names that we do not yet diagnose, but
which will also get this treatment.
Malcolm Parsons [Thu, 19 Jan 2017 17:19:22 +0000 (17:19 +0000)]
[Sema] Reword unused lambda capture warning
Summary:
The warning doesn't know why the variable was looked up but not
odr-used, so reword it to not claim that it was used in an unevaluated
context.
Dehao Chen [Thu, 19 Jan 2017 00:44:21 +0000 (00:44 +0000)]
Add -fdebug-info-for-profiling to emit more debug info for sample pgo profile collection
Summary:
SamplePGO uses profile with debug info to collect profile. Unlike the traditional debugging purpose, sample pgo needs more accurate debug info to represent the profile. We add -femit-accurate-debug-info for this purpose. It can be combined with all debugging modes (-g, -gmlt, etc). It makes sure that the following pieces of info is always emitted:
* start line of all subprograms
* linkage name of all subprograms
* standalone subprograms (functions that has neither inlined nor been inlined)
The impact on speccpu2006 binary size (size increase comparing with -g0 binary, also includes data for -g binary, which does not change with this patch):
Move vtable type metadata emission behind a cc1-level flag.
In ThinLTO mode, type metadata will require the module to be written as a
multi-module bitcode file, which is currently incompatible with the Darwin
linker. It is also useful to be able to enable or disable multi-module bitcode
for testing purposes. This introduces a cc1-level flag, -f{,no-}lto-unit,
which is used by the driver to enable multi-module bitcode on all but
Darwin+ThinLTO, and can also be used to enable/disable the feature manually.
David Blaikie [Wed, 18 Jan 2017 21:15:18 +0000 (21:15 +0000)]
Remove now redundant code that ensured debug info for class definitions was emitted under certain circumstances
Introduced in r181561 - it may've been subsumed by work done to allow
emission of declarations for vtable types while still emitting some of
their member functions correctly for those declarations. Whatever the
reason, the tests pass without this code now.
[OpenMP] Support for the if-clause on the combined directive 'target parallel'.
The if-clause on the combined directive potentially applies to both the
'target' and the 'parallel' regions. Codegen'ing the if-clause on the
combined directive requires additional support because the expression in
the clause must be captured by the 'target' capture statement but not
the 'parallel' capture statement. Note that this situation arises for
other clauses such as num_threads.
The OMPIfClause class inherits OMPClauseWithPreInit to support capturing
of expressions in the clause. A member CaptureRegion is added to
OMPClauseWithPreInit to indicate which captured statement (in this case
'target' but not 'parallel') captures these expressions.
To ensure correct codegen of captured expressions in the presence of
combined 'target' directives, OMPParallelScope was added to 'parallel'
codegen.
Graydon Hoare [Wed, 18 Jan 2017 20:36:59 +0000 (20:36 +0000)]
[ASTReader] Add a DeserializationListener callback for IMPORTED_MODULES
Summary:
Add a callback from ASTReader to DeserializationListener when the former
reads an IMPORTED_MODULES block. This supports Swift in using PCH for
bridging headers.
[OpenMP] Codegen for the 'target parallel' directive on the NVPTX device.
This patch adds codegen for the 'target parallel' directive on the NVPTX
device. We term offload OpenMP directives such as 'target parallel' and
'target teams distribute parallel for' as SPMD constructs. SPMD constructs,
in contrast to Generic ones like the plain 'target', can never contain
a serial region.
SPMD constructs can be handled more efficiently on the GPU and do not
require the Warp Loop of the Generic codegen scheme. This patch adds
SPMD codegen support for 'target parallel' on the NVPTX device and can
be reused for other SPMD constructs.
This rule permits the injected-class-name of a class template to be used as
both a template type argument and a template template argument, with no extra
syntax required to disambiguate.
[OpenMP] Codegen support for 'target parallel' on the host.
This patch adds support for codegen of 'target parallel' on the host.
It is also the first combined directive that requires two or more
captured statements. Support for this functionality is included in
the patch.
A combined directive such as 'target parallel' has two captured
statements, one for the 'target' and the other for the 'parallel'
region. Two captured statements are required because each has
different implicit parameters (see SemaOpenMP.cpp). For example,
the 'parallel' has 'global_tid' and 'bound_tid' while the 'target'
does not. The patch adds support for handling multiple captured
statements based on the combined directive.
When codegen'ing the 'target parallel' directive, the 'target'
outlined function is created using the outer captured statement
and the 'parallel' outlined function is created using the inner
captured statement.
Benjamin Kramer [Wed, 18 Jan 2017 15:50:26 +0000 (15:50 +0000)]
[Basic] Remove source manager references from diag state points.
This is just wasted space, we don't support state points from multiple
source managers. Validate that there's no state when resetting the
source manager and use the 'global' reference to the sourcemanager
instead of the ones in the diag state.
Jonathan Roelofs [Wed, 18 Jan 2017 15:31:11 +0000 (15:31 +0000)]
Warn when calling a non interrupt function from an interrupt on ARM
The idea for this originated from a really tricky bug: ISRs on ARM don't
automatically save off the VFP regs, so if say, memcpy gets interrupted and the
ISR itself calls memcpy, the regs are left clobbered when the ISR is done.
[OpenMP] Codegen support for 'target parallel' on the host.
This patch adds support for codegen of 'target parallel' on the host.
It is also the first combined directive that requires two or more
captured statements. Support for this functionality is included in
the patch.
A combined directive such as 'target parallel' has two captured
statements, one for the 'target' and the other for the 'parallel'
region. Two captured statements are required because each has
different implicit parameters (see SemaOpenMP.cpp). For example,
the 'parallel' has 'global_tid' and 'bound_tid' while the 'target'
does not. The patch adds support for handling multiple captured
statements based on the combined directive.
When codegen'ing the 'target parallel' directive, the 'target'
outlined function is created using the outer captured statement
and the 'parallel' outlined function is created using the inner
captured statement.
Akira Hatanaka [Tue, 17 Jan 2017 19:35:54 +0000 (19:35 +0000)]
[Sema] Fix bug in handling of designated initializer.
CheckDesignatedInitializer wasn't taking into account the base classes
when computing the index for the field in the derived class, which
caused the test case to crash during IRGen because of a malformed AST.
George Rimar [Tue, 17 Jan 2017 15:45:31 +0000 (15:45 +0000)]
[Clang] - Update code to match upcoming llvm::zlib API.
D28684 changed llvm::zlib to return Error instead of Status.
It was accepted and committed in r292214, but then reverted in r292217
because I missed that clang code also needs to be updated.
Frederic Riss [Tue, 17 Jan 2017 03:38:45 +0000 (03:38 +0000)]
Fix AArch64 global-merge backend option name.
-mglobal-merge is translated to the appropriate backend option in
the driver. r277322 changed the AArch64 option name in the backend,
but the driver was never updated.
Richard Smith [Tue, 17 Jan 2017 02:14:37 +0000 (02:14 +0000)]
Partial revert of r290511.
The rules around typechecking deduced template arguments during partial
ordering are not clear, and while the prior behavior does not seem to be
correct (it doesn't follow the general model of partial ordering where each
template parameter is replaced by a non-dependent but unique value), the new
behavior is also not clearly right and breaks some existing idioms.
The new behavior is retained for dealing with non-type template parameters
with 'auto' types, as without it even the most basic uses of that feature
don't work. We can revisit this once CWG has come to an agreement on how
partial ordering with 'auto' non-type template parameters is supposed to
work.
Antonio Maiorano [Tue, 17 Jan 2017 00:12:27 +0000 (00:12 +0000)]
clang-format: Make GetStyle return Expected<FormatStyle> instead of FormatStyle
Change the contract of GetStyle so that it returns an error when an error occurs
(i.e. when it writes to stderr), and only returns the fallback style when it
can't find a configuration file.