Rafael Espindola [Mon, 13 Jul 2015 22:26:30 +0000 (22:26 +0000)]
This reverts commit r242058, r242065, r242067.
The tests were failing on OS X.
Revert "[cuda] Driver changes to compile and stitch together host and device-side CUDA code."
Revert "Fixed regex to properly match '64' in the test case."
Revert "clang/test/Driver/cuda-options.cu REQUIRES clang-driver, at least."
[cuda] Driver changes to compile and stitch together host and device-side CUDA code.
- Changed driver pipeline to compile host and device side of CUDA
files and incorporate results of device-side compilation into host
object file.
- Added a test for cuda pipeline creation in clang driver.
New clang options:
--cuda-host-only - Do host-side compilation only.
--cuda-device-only - Do device-side compilation only.
--cuda-gpu-arch=<ARCH> - specify GPU architecture for device-side
compilation. E.g. sm_35, sm_30. Default is sm_20. May be used more
than once in which case one device-compilation will be done per
unique specified GPU architecture.
Ben Langmuir [Mon, 13 Jul 2015 19:48:52 +0000 (19:48 +0000)]
[Modules] Allow missing header before a missing requirement
And make the module unavailable without breaking any parent modules.
If there's a missing requirement after we've already seen a missing
header, still update the IsMissingRequiement bit correctly. Also,
diagnose missing requirements before missing headers, since the
existence of the header is moot if there are missing requirements.
clang-cl: For files setting output names, mention which flags they belong to.
It always takes me a while to figure out how to say "preprocess to file
foo.txt" with clang-cl. With this, it might be easier.
http://reviews.llvm.org/D10890
Mark Heffernan [Mon, 13 Jul 2015 18:31:37 +0000 (18:31 +0000)]
Update documentation for unroll pragmas on loops with runtime trip counts.
This change updates the documentation for the loop unrolling pragma behavior
change in r242047. Specifically, with that change "#pragma unroll" will not
unroll loops with a runtime trip count.
Support alternate attribute spelling __enable_if__
Attribute names usually support an alternate spelling that uses double
underscores before and after the attribute name, like e.g. attribute
((__aligned__)) for attribute ((aligned)). This is necessary to allow
use of attributes in system headers without polluting the name space.
However, for attribute ((enable_if)) that alternate spelling does not
work correctly. This is because of code in Parser::ParseGNUAttributeArgs
(ParseDecl.cpp) that specifically checks for the "enable_if" spelling
without allowing the alternate spelling.
Similar code in ParseDecl.cpp uses the normalizeAttrName helper to allow
both spellings. This patch adds use of that helper for the "enable_if"
check as well, which fixes attribute ((__enable_if__)).
David Majnemer [Mon, 13 Jul 2015 02:53:19 +0000 (02:53 +0000)]
Intrin.h: Clean up our atomic intrinsics
Three things:
- The atomic intrinsics mandate memory barriers, let's start emitting
some.
- We don't need to manually create RMW operations, we can just do
__atomic_fetch_foo instead of performing __atomic_foo_fetch and
undoing foo.
- Don't use inline assembly, we don't need it for these intrinsics.
Richard Smith [Sun, 12 Jul 2015 23:43:21 +0000 (23:43 +0000)]
[modules] Improve performance when there is a local declaration of an entity
before the first imported declaration.
We don't need to track all formerly-canonical declarations of an entity; it's sufficient to track those ones for which no other formerly-canonical declaration was imported into the same module. We call those ones "key declarations", and use them as our starting points for collecting redeclarations and performing namespace lookups.
Davide Italiano [Sun, 12 Jul 2015 22:10:56 +0000 (22:10 +0000)]
[Sema] If lvalue to rvalue reference cast is valid don't emit diagnostic.
In the test, y1 is not reference compatible to y2 and we currently assume
the cast is ill-formed so we emit a diagnostic. Instead, in order to honour
the standard, if y1 it's not reference-compatible to y2 then it can't be
converted using a static_cast, and a reinterpret_cast should be tried instead.
Richard Smith provided the correct interpretation of the standard and
explanation about the subtle difference between "can't be cast" and "the cast
is ill-formed". The former applies in this case.
Sema: Allow null names to be passed in to isAcceptableTagRedeclaration
It's possible for TagRedeclarations to involve decls without a name,
ie, anonymous enums. We hit some undefined behaviour if we bind these
null names to the reference here.
We never dereference the name, so it's harmless if it's null - make it
a pointer to allow that.
Fixes the Modules/submodules-merge-defs.cpp test under ubsan.
Richard Smith [Fri, 10 Jul 2015 22:27:17 +0000 (22:27 +0000)]
[modules] When checking the include guard for a header, check whether it's
visible in the module we're considering entering. Previously we assumed that if
we knew the include guard for a modular header, we'd already parsed it, but
that need not be the case if a header is present in the current module and one
of its dependencies; the result of getting this wrong was that the current
module's submodule for the header would end up empty.
Disable C++ EH by default for clang-cl and MSVC environments
We don't need any more bug reports from users telling us that MSVC-style
C++ exceptions are broken. Developers and adventurous users can still
test the existing functionality by passing along -fexceptions to either
clang or clang-cl.
Jordan Rose [Fri, 10 Jul 2015 21:41:59 +0000 (21:41 +0000)]
[analyzer] When forced to fake a block type, do it correctly.
BlockDecl has a poor AST representation because it doesn't carry its type
with it. Instead, the containing BlockExpr has the full type. This almost
never matters for the analyzer, but if the block decl contains static
local variables we need to synthesize a region to put them in, and this
region will necessarily not have the right type.
Even /that/ doesn't matter, unless
(1) the block calls the function or method containing the block, and
(2) the value of the block expr is used in some interesting way.
In this case, we actually end up needing the type of the block region,
and it will be set to our synthesized type. It turns out we've been doing
a terrible job faking that type -- it wasn't a block pointer type at all.
This commit fixes that to at least guarantee a block pointer type, using
the signature written by the user if there is one.
This is not really a correct answer because the block region's type will
/still/ be wrong, but further efforts to make this right in the analyzer
would probably be silly. We should just change the AST.
[inlineasm] Attach readonly and readnone to inline-asm instructions.
Previously, clang/llvm treated inline-asm instructions conservatively,
choosing not to eliminate the instructions or hoisting them out of a loop
even when it was safe to do so. This commit makes changes to attach a
readonly or readnone attribute to an inline-asm instruction, which enables
passes such as LICM and EarlyCSE to move or optimize away the instruction.
Teach clang that -no-pthread is a valid command line option
The winpthreads library in mingw-w64 passes -no-pthread when building
since pthreads is not available to build itself and pthreads it is linked
by default. clang does not link to pthreads by default but did error on
unknown -no-pthread option thus stopping the winpthreads build.
Eric Christopher [Fri, 10 Jul 2015 18:25:54 +0000 (18:25 +0000)]
Refactor PPC ABI handling to accept and silently ignore -mabi=altivec.
All of the ABIs we support are altivec style anyhow and so the option
doesn't make much sense with the modern ABIs. We could make this a more
noisy ignore, but it would break builds for projects that just pass
it along by default because of historical reasons.
Diego Novillo [Fri, 10 Jul 2015 18:00:07 +0000 (18:00 +0000)]
Factor PGO and coverage flag processing out of Clang::ConstructJob
The function is massively large and GCC is emitting stack overflow
errors when building it (stack, as counted by the compiler, grows to
more than 16Kb).
The new flag processing logic added in r241825 took it over the limit.
The "align 4" is actually wrong. Accessing all of "gbitfield.y" as a single
i32 is of course possible, but that still doesn't make it 4-byte aligned as
it remains packed at offset 1 in the surrounding gbitfield object.
This alignment was changed by commit r169489, which also introduced changes
to bitfield access code in CGExpr.cpp. Code before that change used to take
into account *both* the alignment of the field to be accessed within the
current struct, *and* the alignment of that outer struct itself; this logic
was removed by the above commit.
Neglecting to consider both values can cause incorrect code to be generated
(I've seen an unaligned access crash on SystemZ due to this bug).
In order to always use the best known alignment value, this patch removes
the CGBitFieldInfo::StorageAlignment member and replaces it with a
StorageOffset member specifying the offset from the start of the surrounding
struct to the bitfield's underlying storage. This offset can then be combined
with the best-known alignment for a bitfield access lvalue to determine the
alignment to use when accessing the bitfield's storage.
Add missing builtins to altivec.h for ABI compliance (vol. 3)
This patch corresponds to review:
http://reviews.llvm.org/D10972
Fix for the handling of dependent features that are enabled by default
on some CPU's (such as -mvsx, -mpower8-vector).
Also provides a number of new interfaces or fixes existing ones in
altivec.h.
Changed signatures to conform to ABI:
vector short vec_perm(vector signed short, vector signed short, vector unsigned char)
vector int vec_perm(vector signed int, vector signed int, vector unsigned char)
vector long long vec_perm(vector signed long long, vector signed long long, vector unsigned char)
vector signed char vec_sld(vector signed char, vector signed char, const int)
vector unsigned char vec_sld(vector unsigned char, vector unsigned char, const int)
vector bool char vec_sld(vector bool char, vector bool char, const int)
vector unsigned short vec_sld(vector unsigned short, vector unsigned short, const int)
vector signed short vec_sld(vector signed short, vector signed short, const int)
vector signed int vec_sld(vector signed int, vector signed int, const int)
vector unsigned int vec_sld(vector unsigned int, vector unsigned int, const int)
vector float vec_sld(vector float, vector float, const int)
vector signed char vec_splat(vector signed char, const int)
vector unsigned char vec_splat(vector unsigned char, const int)
vector bool char vec_splat(vector bool char, const int)
vector signed short vec_splat(vector signed short, const int)
vector unsigned short vec_splat(vector unsigned short, const int)
vector bool short vec_splat(vector bool short, const int)
vector pixel vec_splat(vector pixel, const int)
vector signed int vec_splat(vector signed int, const int)
vector unsigned int vec_splat(vector unsigned int, const int)
vector bool int vec_splat(vector bool int, const int)
vector float vec_splat(vector float, const int)
Added a VSX path to:
vector float vec_round(vector float)
Added interfaces:
vector signed char vec_eqv(vector signed char, vector signed char)
vector signed char vec_eqv(vector bool char, vector signed char)
vector signed char vec_eqv(vector signed char, vector bool char)
vector unsigned char vec_eqv(vector unsigned char, vector unsigned char)
vector unsigned char vec_eqv(vector bool char, vector unsigned char)
vector unsigned char vec_eqv(vector unsigned char, vector bool char)
vector signed short vec_eqv(vector signed short, vector signed short)
vector signed short vec_eqv(vector bool short, vector signed short)
vector signed short vec_eqv(vector signed short, vector bool short)
vector unsigned short vec_eqv(vector unsigned short, vector unsigned short)
vector unsigned short vec_eqv(vector bool short, vector unsigned short)
vector unsigned short vec_eqv(vector unsigned short, vector bool short)
vector signed int vec_eqv(vector signed int, vector signed int)
vector signed int vec_eqv(vector bool int, vector signed int)
vector signed int vec_eqv(vector signed int, vector bool int)
vector unsigned int vec_eqv(vector unsigned int, vector unsigned int)
vector unsigned int vec_eqv(vector bool int, vector unsigned int)
vector unsigned int vec_eqv(vector unsigned int, vector bool int)
vector signed long long vec_eqv(vector signed long long, vector signed long long)
vector signed long long vec_eqv(vector bool long long, vector signed long long)
vector signed long long vec_eqv(vector signed long long, vector bool long long)
vector unsigned long long vec_eqv(vector unsigned long long, vector unsigned long long)
vector unsigned long long vec_eqv(vector bool long long, vector unsigned long long)
vector unsigned long long vec_eqv(vector unsigned long long, vector bool long long)
vector float vec_eqv(vector float, vector float)
vector float vec_eqv(vector bool int, vector float)
vector float vec_eqv(vector float, vector bool int)
vector double vec_eqv(vector double, vector double)
vector double vec_eqv(vector bool long long, vector double)
vector double vec_eqv(vector double, vector bool long long)
vector bool long long vec_perm(vector bool long long, vector bool long long, vector unsigned char)
vector double vec_round(vector double)
vector double vec_splat(vector double, const int)
vector bool long long vec_splat(vector bool long long, const int)
vector signed long long vec_splat(vector signed long long, const int)
vector unsigned long long vec_splat(vector unsigned long long,
vector bool int vec_sld(vector bool int, vector bool int, const int)
vector bool short vec_sld(vector bool short, vector bool short, const int)
Respect alignment when loading up a coerced function argument
Code in CGCall.cpp that loads up function arguments that need to be
coerced to a different type may in some cases ignore the fact that
the source of the argument is not naturally aligned. This may cause
incorrect code to be generated. In some places in CreateCoercedLoad,
we already have setAlignment calls to address this, but I ran into one
where it was missing, causing wrong code generation on SystemZ.
However, in that location, we do not actually know what alignment of
the source location we can rely on; the callers do not pass anything
to this routine. This is already an issue in other places in
CreateCoercedLoad; and the same problem exists for CreateCoercedStore.
To avoid pessimising code, and to fix the FIXMEs already in place,
this patch also adds an alignment argument to the CreateCoerced*
routines and uses it instead of forcing an alignment of 1. The
callers are changed to pass in the best information they have.
This actually requires changes in a number of existing test cases
since we now get better alignment in many places.
Daniel Jasper [Fri, 10 Jul 2015 05:57:23 +0000 (05:57 +0000)]
Remove test that tests referring to the current working directory. You
cannot assume that the current working directory is writable in all test
environments. I don't know a better way to write this test of hand, lets
discuss. Possibly, a better option would be to put these together with
other test testing the driver directly.
CFI: Emit correct bit set information if RTTI is disabled under MS ABI.
We were previously creating bit set entries at virtual table offset
sizeof(void*) unconditionally under the Microsoft C++ ABI. This is incorrect
if RTTI data is disabled; in that case the "address point" is at offset
0. This change modifies bit set emission to take into account whether RTTI
data is being emitted.
Also make a start on a blacklisting scheme for records.
Diego Novillo [Thu, 9 Jul 2015 17:23:53 +0000 (17:23 +0000)]
Add GCC-compatible flags -fprofile-generate and -fprofile-use.
This patch adds support for specifying where the profile is emitted in a
way similar to GCC. These flags are used to specify directories instead
of filenames. When -fprofile-generate=DIR is used, the compiler will
generate code to write to <DIR>/default.profraw.
The patch also adds a couple of extensions: LLVM_PROFILE_FILE can still be
used to override the directory and file name to use and -fprofile-use
accepts both directories and filenames.
To simplify the set of flags used in the backend, all the flags get
canonicalized to -fprofile-instr-{generate,use} when passed to the
backend. The decision to use a default name for the profile is done
in the driver.
Add clang_free to libclang to free memory allocated in libclang.
One of the problems libclang tests has running under Windows is memory
allocated in libclang.dll but being freed in the test executable, possibly
by a different memory manager. This patch exposes a new export function,
clang_free(), used to free any allocated memory with the same libclang.dll
memory manager that allocated the memory.
Move the diagnostic back to codegen so that we can compile ATL on the
self-host bot. We don't actually end up emitting code for the __try, so
the diagnostic won't be hit.
Richard Smith [Wed, 8 Jul 2015 21:49:31 +0000 (21:49 +0000)]
[modules] Fix merging support for forward-declared enums with fixed underlying types. A visible declaration is enough to make the type complete, but not enough to make the definition visible.
Adrian Prantl [Wed, 8 Jul 2015 21:18:34 +0000 (21:18 +0000)]
Cleanup the doxygen comments in CGDebugInfo.cpp according to the coding
standards. Remove several hilariously out-of-date and redundant comments
and move the non-redundant ones into the header file.
Desugar doesn't necessarily initialize ShouldAKA, but as of r241542 it
may read it. Fix the misuse of the API and initialize this before
passing it in.
Petar Jovanovic [Wed, 8 Jul 2015 13:07:31 +0000 (13:07 +0000)]
[MIPS] Add support for direct-to-nacl in Clang
For Mips direct-to-nacl, the goal is to be close to le32 front-end and
use Mips32EL backend. This patch defines new NaClMips32ELTargetInfo and
modifies it slightly to be close to le32. It also adds necessary parts,
inline with ARM and X86.
[EH] Fix for clang bug 24005 - no cleanup for array of memcpy-able objects in struct (patch by Denis Zobnin)
The fix is to emit cleanup for arrays of memcpy-able objects in struct if an exception is thrown later during copy-construction.
Differential Revision: http://reviews.llvm.org/D10989
This reverts commit r239846 and r239879. They caused clang's
-fms-extensions behavior to incorrectly parse lambdas and includes a
testcase to ensure we don't regress again.
David Majnemer [Wed, 8 Jul 2015 05:07:05 +0000 (05:07 +0000)]
[CodeGen] Don't crash classifying a union of an AVX vector and an int
We forgot to run postMerge after decided that the union had to be
classified as MEMORY. This left us with Lo == MEMORY and Hi == SSEUp
which is an invalid combination.
Richard Smith [Wed, 8 Jul 2015 02:22:15 +0000 (02:22 +0000)]
[modules] When determining the visible module set during template
instantiation, use the set of modules visible from the template definition, not
from whichever declaration the specialization was instantiated from.
Adrian Prantl [Wed, 8 Jul 2015 01:39:38 +0000 (01:39 +0000)]
Change the expectation for test/Tooling/ms-asm-no-target.cpp since
clang-check now initializes the available targets to support
clang module containers.
Adrian Prantl [Tue, 7 Jul 2015 20:11:29 +0000 (20:11 +0000)]
Wrap clang modules and pch files in an object file container.
This patch adds ObjectFilePCHContainerOperations uses the LLVM backend
to put the contents of a PCH into a __clangast section inside a COFF, ELF,
or Mach-O object file container.
This is done to facilitate module debugging by makeing it possible to
store the debug info for the types defined by a module alongside the AST.