Samuel Antao [Fri, 15 Jul 2016 23:13:27 +0000 (23:13 +0000)]
[CUDA][OpenMP] Create generic offload action
Summary:
This patch replaces the CUDA-specific action with a generic offload action. The offload action may have multiple dependences classified as "host" and "device". The way this generic offloading action is used is very similar to what is done today by the CUDA implementation: it is used to set a specific toolchain and architecture for its dependences during the generation of jobs.
This patch also proposes propagating the offloading information through the action graph so that the information can be easily retrieved at any time during the generation of commands. This allows e.g. the "clang tool" to evaluate whether CUDA should be supported for the device or host and ptxas to easily retrieve the target architecture.
This is an example of what the action graph would look like (compilation of a single CUDA file with two GPU architectures):
```
0: input, "cudatests.cu", cuda, (host-cuda)
1: preprocessor, {0}, cuda-cpp-output, (host-cuda)
2: compiler, {1}, ir, (host-cuda)
3: input, "cudatests.cu", cuda, (device-cuda, sm_35)
4: preprocessor, {3}, cuda-cpp-output, (device-cuda, sm_35)
5: compiler, {4}, ir, (device-cuda, sm_35)
6: backend, {5}, assembler, (device-cuda, sm_35)
7: assembler, {6}, object, (device-cuda, sm_35)
8: offload, "device-cuda (nvptx64-nvidia-cuda:sm_35)" {7}, object
9: offload, "device-cuda (nvptx64-nvidia-cuda:sm_35)" {6}, assembler
10: input, "cudatests.cu", cuda, (device-cuda, sm_37)
11: preprocessor, {10}, cuda-cpp-output, (device-cuda, sm_37)
12: compiler, {11}, ir, (device-cuda, sm_37)
13: backend, {12}, assembler, (device-cuda, sm_37)
14: assembler, {13}, object, (device-cuda, sm_37)
15: offload, "device-cuda (nvptx64-nvidia-cuda:sm_37)" {14}, object
16: offload, "device-cuda (nvptx64-nvidia-cuda:sm_37)" {13}, assembler
17: linker, {8, 9, 15, 16}, cuda-fatbin, (device-cuda)
18: offload, "host-cuda (powerpc64le-unknown-linux-gnu)" {2}, "device-cuda (nvptx64-nvidia-cuda)" {17}, ir
19: backend, {18}, assembler
20: assembler, {19}, object
21: input, "cuda", object
22: input, "cudart", object
23: linker, {20, 21, 22}, image
```
The changes in this patch pass the existing regression tests (keeping the existing functionality), and the resulting binaries execute correctly on a Power8+K40 machine.
Reviewers: echristo, hfinkel, jlebar, ABataev, tra
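The propagation idea can be illustrated with a toy model. This is only a sketch under stated assumptions: `Action`, its fields, and `propagateOffloadInfo` are hypothetical names invented here and are not Clang's driver classes. It shows offload information set on one action being pushed down to its dependences so that any node can later answer "which device and architecture am I built for?".

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified model of a driver action; not Clang's real class.
struct Action {
  std::string Kind;             // e.g. "input", "preprocessor", "compiler"
  std::vector<Action *> Inputs; // dependences of this action
  std::string OffloadKind;      // e.g. "host-cuda" or "device-cuda"; empty if unset
  std::string Arch;             // e.g. "sm_35"; empty if unset
};

// Push offload information from an action down to every dependence
// that does not carry any yet (assumption: first writer wins).
void propagateOffloadInfo(Action &A, const std::string &OffloadKind,
                          const std::string &Arch) {
  if (!A.OffloadKind.empty())
    return;
  A.OffloadKind = OffloadKind;
  A.Arch = Arch;
  for (Action *In : A.Inputs)
    propagateOffloadInfo(*In, OffloadKind, Arch);
}
```

In this model, calling `propagateOffloadInfo` on the compiler node of a three-node chain also tags the preprocessor and input nodes, which mirrors the `(device-cuda, sm_35)` annotations in the graph dump.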
Richard Smith [Fri, 15 Jul 2016 20:53:25 +0000 (20:53 +0000)]
Push alias-declarations and alias-template declarations into scope even if
they're redeclarations. This is necessary in order for name lookup to correctly
find the most recent declaration of the name (which affects default template
argument lookup and cross-module merging, among other things).
Extend the __declspec(dll*) attribute to cover ObjC interfaces. This was
requested by Microsoft for their ObjC support. Cover both import and export.
This only adds the semantic analysis portion of the support, code-generation
still remains outstanding. Add some basic initial documentation on the
attributes that were previously empty. Tweak the previous tests to use the
relative expected-warnings to make the tests easier to read.
Removing a few more :option: tags that we do not have corresponding .. option directives for; these are causing the sphinx bot to fail (http://lab.llvm.org:8011/builders/clang-sphinx-docs/builds/15214/steps/docs-clang-html/logs/stdio).
Removing a few more :option: tags that we do not have corresponding .. option directives for; these are causing the sphinx bot to fail (http://lab.llvm.org:8011/builders/clang-sphinx-docs/builds/15213/steps/docs-clang-html/logs/stdio).
Frontend: Simplify ownership model for clang's output streams.
This changes the CompilerInstance::createOutputFile function to return
a std::unique_ptr<llvm::raw_ostream>, rather than an llvm::raw_ostream
implicitly owned by the CompilerInstance. This in most cases required that
I move ownership of the output stream to the relevant ASTConsumer.
The motivation for this change is to allow BackendConsumer to be a client
of interfaces such as D20268 which take ownership of the output stream.
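A minimal sketch of the ownership pattern described above, using only the standard library. `createOutputStream`, `Consumer`, and `runConsumer` are hypothetical stand-ins, not the real `CompilerInstance` or `ASTConsumer` interfaces; the point is only that the factory hands the stream to the caller, who then moves it into the consumer.

```cpp
#include <cassert>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical factory: the caller receives sole ownership of the stream.
std::unique_ptr<std::ostream> createOutputStream(std::stringbuf &Buffer) {
  return std::make_unique<std::ostream>(&Buffer);
}

// Hypothetical consumer that owns its output stream.
class Consumer {
public:
  explicit Consumer(std::unique_ptr<std::ostream> OS) : Out(std::move(OS)) {}
  void emit(const std::string &Text) { *Out << Text; }

private:
  std::unique_ptr<std::ostream> Out;
};

std::string runConsumer() {
  std::stringbuf Buffer;
  auto OS = createOutputStream(Buffer);
  Consumer C(std::move(OS)); // ownership moves into the consumer
  assert(OS == nullptr);     // the creator no longer holds the stream
  C.emit("object code");
  return Buffer.str();
}
```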
Richard Smith [Thu, 14 Jul 2016 21:50:09 +0000 (21:50 +0000)]
[modules] Don't pass interesting decls to the consumer for a module file that's
passed on the command line but never actually used. We consider a (top-level)
module to be used if any part of it is imported, either by the current
translation unit, or by any part of a top-level module that is itself used.
(Put another way, a module is used if an implicit modules build would have
loaded its .pcm file.)
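The "used" rule above is a reachability computation. The sketch below is a simplification under stated assumptions: it treats each top-level module as a single node (ignoring submodules), and `ImportGraph` and `usedModules` are hypothetical names, not part of Clang.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical import graph: module name -> modules it imports.
using ImportGraph = std::map<std::string, std::vector<std::string>>;

// Returns every module reachable from the translation unit's direct imports.
std::set<std::string> usedModules(const ImportGraph &Imports,
                                  const std::vector<std::string> &DirectImports) {
  std::set<std::string> Used;
  std::vector<std::string> Worklist(DirectImports);
  while (!Worklist.empty()) {
    std::string M = Worklist.back();
    Worklist.pop_back();
    if (!Used.insert(M).second)
      continue; // already visited
    auto It = Imports.find(M);
    if (It != Imports.end())
      Worklist.insert(Worklist.end(), It->second.begin(), It->second.end());
  }
  return Used;
}
```

A module file named on the command line but absent from the returned set would, in this model, be the kind whose interesting decls are not passed to the consumer.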
Ben Langmuir [Thu, 14 Jul 2016 20:08:43 +0000 (20:08 +0000)]
Attempt to work around Windows bots after my previous commit
For some reason it seems the second invocation is getting DMOD_OTHER_H
set to a path with/forward/slashes, but one of the use sites
has\back\slashes. There should be no difference with what was already
there, but for now try to avoid checking those paths.
Sean Callanan [Thu, 14 Jul 2016 19:53:44 +0000 (19:53 +0000)]
When importing classes and structs with anonymous structs, it is critical that
distinct anonymous structs remain distinct despite having similar layout.
This is already ensured by distinguishing based on their placement in the parent
struct, using the function `findAnonymousStructOrUnionIndex`.
The problem is that this function only handles anonymous structs, like
```
class Foo { struct { int a; }; };
```
and not untagged structs like
```
class Foo { struct { int a; } var; };
```
Both need to be handled, and this patch fixes that. The test case ensures that this functionality doesn't regress.
Ben Langmuir [Thu, 14 Jul 2016 18:51:55 +0000 (18:51 +0000)]
[index] Index system ImportDecls even when there is a DeclarationsOnly filter
Whether we call an ImportDecl a decl or a reference symbol role is
somewhat academic, but in practice it's more like a declaration because
it is interesting even to consumers who wouldn't care about references.
Most importantly, we want to report the module dependencies of system
modules even when we have declaration-only filtering.
Despite there being an option, it seems that Sphinx has decided that "=123" is part of the option directive name, and so having "=0" in the option tag is problematic. Since the option tag is part of the option directive definition, it's superfluous, and so I've removed it.
Removing a few more :option: tags that we do not have corresponding .. option directives for; these are causing the sphinx bot to fail (http://lab.llvm.org:8011/builders/clang-sphinx-docs/builds/15197/steps/docs-clang-html/logs/stdio).
Benjamin Kramer [Thu, 14 Jul 2016 15:06:57 +0000 (15:06 +0000)]
[OpenCL] In test/Driver/opencl.cl, don't require name of Clang binary to contain "clang"
The test currently fails if the name of the Clang binary doesn't contain "clang".
This patch removes that requirement, as some environments may choose to run the test with a differently named binary. This shouldn't make the test any less strict -- the only place where the flags we're searching for can really occur is the Clang command line.
Diagnose taking address and reference binding of packed members
This patch implements PR#22821.
Taking the address of a packed member is dangerous since the reduced
alignment of the pointee is lost. This can lead to memory alignment
faults in some architectures if the pointer value is dereferenced.
This change adds a new warning to clang emitted when taking the address
of a packed member. A packed member is either a field/data member
declared as __attribute__((packed)) or belonging to a struct/class
declared as such. The associated flag is -Waddress-of-packed-member.
Conversions (either implicit or via a valid casting) to pointer types
with lower or equal alignment requirements (e.g. void* or char*)
silence the warning.
This change also adds a new error diagnostic when the user attempts to
bind a reference to a packed member, regardless of the alignment.
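A small sketch of the hazard and one way to avoid it, assuming a GCC/Clang-compatible compiler and a target where `char` is 1 byte and `std::uint32_t` is 4 bytes. `Record` and `readValue` are hypothetical names. Instead of forming a `std::uint32_t *` to the packed member, the value is copied out through a byte pointer.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

struct __attribute__((packed)) Record {
  char Tag;
  std::uint32_t Value; // sits at offset 1, so it may be misaligned
};

static_assert(sizeof(Record) == 5, "packed layout has no padding");

// Copies the member out bytewise; no uint32_t pointer to the packed
// member is ever created, so alignment information is never lost.
std::uint32_t readValue(const Record &R) {
  std::uint32_t Copy;
  std::memcpy(&Copy,
              reinterpret_cast<const char *>(&R) + offsetof(Record, Value),
              sizeof(Copy));
  return Copy;
}
```

By contrast, an expression such as `std::uint32_t *P = &R.Value;` is the pattern the new warning is described as diagnosing.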
Removing more :option: tags that we do not have corresponding .. option directives for; these are causing the sphinx bot to fail (http://lab.llvm.org:8011/builders/clang-sphinx-docs/builds/15195/steps/docs-clang-html/logs/stdio).
Pierre Gousseau [Thu, 14 Jul 2016 13:58:27 +0000 (13:58 +0000)]
The test added in r275267 does not work on read-only checkouts because of the use of touch -m -t.
Following Tom Rybka's suggestion, the test files are now copied to a temporary directory first.
This is a malformed :option: tag -- we don't have an option directive that matches it, so turning it into actual text instead of a markup tag. This will hopefully fix the clang docs build (http://lab.llvm.org:8011/builders/clang-sphinx-docs/builds/15194/steps/docs-clang-html/logs/stdio)
Benjamin Kramer [Thu, 14 Jul 2016 12:56:21 +0000 (12:56 +0000)]
[OpenCL] Actually activate Frontend/opencl.cl test and fix test bugs
rL275318 added the Frontend/opencl.cl test, but that test was never actually run because Frontend/lit.local.cfg doesn't contain the '.cl' file suffix.
Once the test is activated, it fails with (unintended) compile errors in the newly added CHECK_INVALID_OPENCL_VERSION checks.
This patch adds the '.cl' file suffix to Frontend/lit.local.cfg to activate the test and fixes the test bug by adding '-fblocks' to the relevant command lines.
Summary:
Depends on D21982 which implements the in-memory logging implementation of the
XRay runtime. These additional changes also depend on D20352 which adds the
bulk of XRay flags/dependencies when using the `-fxray-instrument` flag from
Clang.
Add XRay flags to Clang. We implement two flags to control the XRay behaviour:
-fxray-instrument: enables XRay annotation of IR
-fxray-instruction-threshold: configures the threshold for function size (looking at IR instructions), and allows LLVM to decide whether to add the nop sleds later on in the process.
Also implements the related xray_always_instrument and xray_never_instrument function attributes.
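A sketch of how the two function attributes might be applied, assuming the Clang `[[clang::...]]` spelling is available. The macro names `ALWAYS_XRAY` and `NEVER_XRAY` and both functions are hypothetical; the macros expand to nothing on compilers that do not recognize the attributes, so the code still builds there.

```cpp
// Fall back to no-ops on compilers that lack the Clang-spelled attributes.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::xray_always_instrument)
#    define ALWAYS_XRAY [[clang::xray_always_instrument]]
#    define NEVER_XRAY [[clang::xray_never_instrument]]
#  endif
#endif
#ifndef ALWAYS_XRAY
#  define ALWAYS_XRAY
#  define NEVER_XRAY
#endif

// Requested to be instrumented even though it is tiny.
ALWAYS_XRAY int tinyButInteresting(int X) { return X * 2; }

// Requested to be left uninstrumented regardless of its size.
NEVER_XRAY int hotInnerLoop(int X) { return X + 1; }
```

The attributes only have an effect when the translation unit is built with `-fxray-instrument`; without the flag they are accepted and ignored.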
Carlo Bertolli [Wed, 13 Jul 2016 15:37:16 +0000 (15:37 +0000)]
[OpenMP] Initial implementation of parse+sema for clause use_device_ptr of 'target data'
http://reviews.llvm.org/D21904
This patch is similar to the implementation of 'private' clause: it adds a list of private pointers to be used within the target data region to store the device pointers returned by the runtime.
Please refer to the following document for a full description of what the runtime will return in this case (pages 10 and 11):
https://github.com/clang-omp/OffloadingDesign
I am happy to answer any questions related to the runtime interface to help with reviewing this patch.
Pierre Gousseau [Wed, 13 Jul 2016 11:58:28 +0000 (11:58 +0000)]
[PCH] Fix timestamp check on windows hosts.
On Linux, if the timestamp of a header file, included in the pch, is modified, then including the pch without regenerating it causes a fatal error, which is reasonable.
On Windows the check is ifdefed out, allowing the compilation to continue in a broken state.
The root of the broken state is that, if timestamps don't match, the preprocessor will reparse a header without discarding the pch data.
This leads to "#pragma once" headers being included twice.
The check was ifdefed out 6 years ago, and the reason for doing so is undocumented.
This change tentatively removes the ifdefing.
Initialise more members in initializer lists. Invert the condition that had
grown to be pretty confusing. The `_objc_empty_vtable` is only used on macOS
<10.9. This simplifies the code. NFC.
[CUDA] Use the multi-element remove function in EraseUnwantedCUDAMatches.
Summary:
Bug pointed out by Benjamin Kramer in r264008. I think the bug is
benign because by the time this is called, we should only have at most
two overloads to consider (either a host and a device overload, or a
host+device overload, but not all three).
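The commit text does not spell out the exact bug, so the following is only a sketch of the general pitfall, assuming it is the classic one: after `std::remove_if`, the single-iterator `erase` removes one element, while the two-iterator form removes the whole leftover tail. `eraseOdd` is a hypothetical example function, not Clang code.

```cpp
#include <algorithm>
#include <vector>

// Removes every odd value from V.
void eraseOdd(std::vector<int> &V) {
  auto NewEnd = std::remove_if(V.begin(), V.end(),
                               [](int X) { return X % 2 != 0; });
  // Multi-element erase: drops everything from NewEnd to the end.
  // V.erase(NewEnd) alone would drop only a single element.
  V.erase(NewEnd, V.end());
}
```

With at most one element to remove, the two forms can behave the same in practice, which is consistent with the bug being described as benign.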
[CUDA] Add additional testcases for EraseUnwantedCUDAMatches.
Summary:
Specifically, this patch adds testcases for all three calls to
EraseUnwantedCUDAMatches. The addr-of-overloaded-fn test I accidentally
neutered in r264207, which moved much of
CodeGenCUDA/function-overload.cu into SemaCUDA/function-overload.cu.
The coverage from overloaded-delete test is new.
[CUDA] Don't assume that destructors can't be overloaded.
Summary:
You can overload a destructor in CUDA, and SemaOverload needs to be
tweaked not to crash when it sees an explicit call to an overloaded
destructor.
A BuiltinTemplateDecl has no underlying templated decl and as such they
cannot be relied upon for mangling. The ItaniumMangler had some bugs
here which led to crashes.
I am not sure exactly which test breakage Martin was trying to fix in
r273694. For now, fix the behavior for top-level conditionals, which
(surprisingly) are actually used somewhat commonly.
Do not assign file IDs to source regions located within system headers,
and do not construct counter mapping regions out of them.
This makes coverage reports less cluttered and less mysterious. E.g.,
using the "assert" macro doesn't cause assert.h to appear in reports,
and it no longer shows the "assertion failed" branch as an uncovered
region.
It also makes coverage mapping sections a bit smaller (e.g. a 1%
reduction in a stage2 build of bin/llvm-as).
Wolfgang Pieb [Mon, 11 Jul 2016 22:22:23 +0000 (22:22 +0000)]
Prevent the creation of empty (forwarding) blocks resulting from nested ifs.
Summary:
Nested if statements can generate empty BBs whose terminators branch
unconditionally to their successors. These branches are not eliminated
to help generate better line number information in some cases, but there
is no reason to keep the empty blocks that result from nested ifs.
Eric Liu [Mon, 11 Jul 2016 13:53:12 +0000 (13:53 +0000)]
Make tooling::applyAllReplacements return llvm::Expected<string> instead of empty string to indicate potential error.
Summary:
Return llvm::Expected<> to carry error status and error information.
This is the first step towards introducing "Error" into tooling::Replacements.
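The motivation can be sketched without LLVM: when an empty string doubles as the error signal, a replacement that legitimately produces empty output is indistinguishable from a failure. The code below is a stand-in built on `std::variant` (C++17), not the `llvm::Expected` API; `ReplaceError`, `ExpectedString`, and `applyReplacement` are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <variant>

struct ReplaceError {
  std::string Message;
};

// Hypothetical minimal stand-in for llvm::Expected<std::string>.
using ExpectedString = std::variant<std::string, ReplaceError>;

// Replaces Length characters of Code starting at Offset with Text.
ExpectedString applyReplacement(const std::string &Code, std::size_t Offset,
                                std::size_t Length, const std::string &Text) {
  if (Offset > Code.size() || Length > Code.size() - Offset)
    return ReplaceError{"replacement range is out of bounds"};
  std::string Result = Code;
  Result.replace(Offset, Length, Text);
  return Result; // may legitimately be empty
}
```

A caller checks `std::holds_alternative<ReplaceError>(R)` before reading the string, so an empty result and an error are no longer conflated.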
- Changes diagnostics for Blocks to be implicitly
const qualified OpenCL v2.0 s6.12.5.
- Added and unified diagnostics of some OpenCL special types:
blocks, images, samplers, pipes. These types are intended for use
with the OpenCL builtin functions only and, therefore, most regular
uses are not allowed including assignments, arithmetic operations,
pointer dereferencing, etc.
Driver: Stop linking to C++ when using sanitizers on Darwin
Sanitizers on Darwin are built as dynamic libraries, not static libraries.
Sanitizers will have their C++ dependency satisfied internally (LC_LOAD_DYLIB)
in the libclang_rt dylib. As long as the sanitizers stay dynamic and not static,
linking against C++ when enabling a sanitizer amounts to overlinking.
Add CLANG_BUILD_TOOLS as a clang counterpart for LLVM_BUILD_TOOLS
LLVM_BUILD_TOOLS is a boolean variable that controls whether or not generated
targets for llvm tools are built by the "all" target. CLANG_BUILD_TOOLS is an
analogous variable for clang targets.
This is useful functionality for selectively disabling the building of clang
targets by default to speed up builds.
In terms of implementation, I just followed the model of LLVM's implementation
of this functionality.
David Majnemer [Sat, 9 Jul 2016 19:26:25 +0000 (19:26 +0000)]
[MS ABI] Some code cleanups
Don't create unnecessary truncations if the result will not be used.
Also prefer performing math before the truncation, it makes it a little
easier to reason about.
Martin Probst [Sat, 9 Jul 2016 15:11:18 +0000 (15:11 +0000)]
clang-format: [JS] Sort imports case insensitive.
Summary: ASCII case sorting does not help with finding imported symbols quickly, and it is common to have e.g. class Foo and function fooFactory exported/imported from the same file.
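The effect of the ordering change can be sketched in a few lines. This is not clang-format's implementation; `lessIgnoringCase` and `sortImports` are hypothetical helpers that only show how a case-insensitive comparison keeps `Foo` and `fooFactory` adjacent, whereas plain ASCII ordering would place all uppercase names before all lowercase ones.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Compares two names character by character, ignoring ASCII case.
bool lessIgnoringCase(const std::string &A, const std::string &B) {
  return std::lexicographical_compare(
      A.begin(), A.end(), B.begin(), B.end(),
      [](unsigned char L, unsigned char R) {
        return std::tolower(L) < std::tolower(R);
      });
}

// Returns the names sorted case-insensitively; equal names keep their order.
std::vector<std::string> sortImports(std::vector<std::string> Names) {
  std::stable_sort(Names.begin(), Names.end(), lessIgnoringCase);
  return Names;
}
```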