Ranjeet Singh [Fri, 17 Jun 2016 00:59:41 +0000 (00:59 +0000)]
[ARM] Add mrrc/mrrc2 intrinsics and update existing mcrr/mcrr2 intrinsics.
Reapplying patch in r272777 which was reverted
because the llvm patch which added support
for generating the mcrr/mcrr2 instructions
from the intrinsic was causing an assertion
failure. This has now been fixed in llvm.
Olivier Goffart [Thu, 16 Jun 2016 21:39:46 +0000 (21:39 +0000)]
Functions declared in a scope should not hide previous declaration in earlier scopes
This code should be an error:
void foo(int);
void f3() {
int foo(float);
{
float foo(int); // expected-error {{functions that differ only in their return type cannot be overloaded}}
}
}
the foo(float) function declared at function scope should not hide the float(int)
while trying to redeclare functions.
Samuel Antao [Thu, 16 Jun 2016 15:09:31 +0000 (15:09 +0000)]
[OpenMP] Cast captures by copy when passed to fork call so that they are compatible to what the runtime library expects.
Summary:
This patch fixes an issue detected when firstprivate variables are passed to an OpenMP outlined function vararg list. Currently they are not compatible with what the runtime library expects causing malfunction in some targets.
This patch fixes the issue by moving the casting logic already in place for offloading to the common code that creates the outline function and arguments and updates the regression tests accordingly.
Andrey Turetskiy [Thu, 16 Jun 2016 12:26:20 +0000 (12:26 +0000)]
Patch "Compilation for Intel MCU (Part 2/3)" caused the clang-x64-ninja-win7
buildbot to fail because of inaccurate CHECK in the test. This is a quick fix
for the test to make it platform independent.
Andrey Turetskiy [Thu, 16 Jun 2016 10:36:09 +0000 (10:36 +0000)]
Compilation for Intel MCU (Part 2/3)
This is the second patch required to support compilation for Intel MCU target (e.g. Intel(R) Quark(TM) micro controller D 2000).
When IAMCU triple is used:
* Recognize and use IAMCU GCC toolchain
* Set up include paths
* Forbid C++
Sanjay Patel [Wed, 15 Jun 2016 21:20:04 +0000 (21:20 +0000)]
[x86] translate SSE packed FP comparison builtins to IR
As noted in the code comment, a potential follow-on would be to remove
the builtins themselves. Other than ord/unord, this already works as
expected. Eg:
typedef float v4sf __attribute__((__vector_size__(16)));
v4sf fcmpgt(v4sf a, v4sf b) { return a > b; }
Matthias Braun [Wed, 15 Jun 2016 19:24:55 +0000 (19:24 +0000)]
cc1_main: Do not print statistics twice in -disable_free mode.
llvm statistics are currently printed when the destructor of a "static
ManagedStatic<StatisticInfo> StatInfo" in llvm runs. This destructor
currently runs in each case as part of llvm_shutdown() which is run even
in disable_free mode as part of main(). I assume that this hasn't always
been the case.
Removing the special code here avoids the statistics getting printed
twice.
John Brawn [Wed, 15 Jun 2016 14:14:51 +0000 (14:14 +0000)]
Don't use static variables in LambdaCapture
When static variables are used in inline functions in header files anything that
uses that function ends up with a reference to the variable. Because
RecursiveASTVisitor uses the inline functions in LambdaCapture that use static
variables any AST plugin that uses RecursiveASTVisitor, such as the
PrintFunctionNames example, ends up with a reference to these variables. This is
bad on Windows when building with MSVC with LLVM_EXPORT_SYMBOLS_FOR_PLUGINS=ON
as variables used across a DLL boundary need to be explicitly dllimported in
the DLL using them.
This patch avoids that by adjusting LambdaCapture to be similar to before
r263921, with a capture of either 'this' or a VLA represented by a null Decl
pointer in DeclAndBits with an extra flag added to the bits to distinguish
between the two. This requires the use of an extra bit, and while Decl does
happen to be sufficiently aligned to allow this it's done in a way that means
PointerIntPair doesn't realise it and gives an assertion failure. Therefore I
also adjust Decl slightly to use LLVM_ALIGNAS to allow this.
Alexey Bataev [Wed, 15 Jun 2016 11:19:39 +0000 (11:19 +0000)]
[MSVC] Late parsing of in-class defined member functions in template
classes.
MSVC actively uses unqualified lookup in dependent bases, lookup at the
instantiation point (non-dependent names may be resolved on things
declared later) etc. and all this stuff is the main cause of
incompatibility between clang and MSVC.
Clang tries to emulate MSVC behavior but it may fail in many cases.
clang could store lexed tokens for member functions definitions within
ClassTemplateDecl for later parsing during template instantiation.
It will allow resolving many possible issues with lookup in dependent
base classes and removing many already existing MSVC-specific
hacks/workarounds from the clang code.
The reason is that this (a) seems to work just fine and (b) useful when building stuff with
sanitizer+coverage, but need to exclude the sanitizer for a particular source file.
Vedant Kumar [Tue, 14 Jun 2016 19:06:48 +0000 (19:06 +0000)]
[perf-training] Ignore 'Profile Note' warnings from the runtime
After r272599, -DLLVM_BUILD_INSTRUMENTED passes a default argument to
-fprofile-instr-generate. This confuses the perf-helper script because
the runtime emits a note stating that the default is overridden by the
LLVM_PROFILE_FILE environment variable.
Change the perf-helper script s.t it does not treat these notes as
failures.
This isn't a strictly NFC change, but I don't see a simple way to add a
test for it.
Rafael Espindola [Tue, 14 Jun 2016 12:47:24 +0000 (12:47 +0000)]
Start adding support for Musl.
The two patches together enable clang to support targets like
"x86_64-pc-linux-musl" and build binaries against musl-libc instead of
glibc. This make it easy for clang to work on some musl-based systems
like Alpine Linux and certain flavors of Gentoo.
Adam Nemet [Tue, 14 Jun 2016 12:04:26 +0000 (12:04 +0000)]
Add loop pragma for Loop Distribution
Summary:
This is similar to other loop pragmas like 'vectorize'. Currently it
only has state values: distribute(enable) and distribute(disable). When
one of these is specified the corresponding loop metadata is generated:
!{!"llvm.loop.distribute.enable", i1 true/false}
As a result, loop distribution will be attempted on the loop even if
Loop Distribution in not enabled globally. Analogously, with 'disable'
distribution can be turned off for an individual loop even when the pass
is otherwise enabled.
There are some slight differences compared to the existing loop pragmas.
1. There is no 'assume_safety' variant which makes its handling slightly
different from 'vectorize'/'interleave'.
2. Unlike the existing loop pragmas, it does not have a corresponding
numeric pragma like 'vectorize' -> 'vectorize_width'. So for the
consistency checks in CheckForIncompatibleAttributes we don't need to
check it against other pragmas. We just need to check for duplicates of
the same pragma.
Daniel Sanders [Tue, 14 Jun 2016 08:58:50 +0000 (08:58 +0000)]
[mips] Defer validity check for CPU/ABI pairs and improve error message for invalid cases.
Summary:
The validity of ABI/CPU pairs is no longer checked on the fly but is
instead checked after initialization. As a result, invalid CPU/ABI pairs
can be reported as being known but invalid instead of being unknown. For
example, we now emit:
error: ABI 'n32' is not supported on CPU 'mips32r2'
instead of:
error: unknown target ABI 'n64'
Faisal Vali [Tue, 14 Jun 2016 03:23:15 +0000 (03:23 +0000)]
Fix PR28100 - Allow redeclarations of deleted explicit specializations.
See https://llvm.org/bugs/show_bug.cgi?id=28100.
In r266561 when I implemented allowing explicit specializations of function templates to override deleted status, I mistakenly assumed (and hence introduced a violable assertion) that when an explicit specialization was being declared, the corresponding specialization of the most specialized function template that it would get linked to would always be the one that was implicitly generated - and so if it was marked as 'deleted' it must have inherited it from the primary template and so should be safe to reset its deleted status, and set it to being an explicit specialization. Obviously during redeclaration of a deleted explicit specialization, in order to avoid a recursive reset, we need to check that the previous specialization is not an explicit specialization (instead of assuming and asserting it) and that it hasn't been referenced, and so only then is it safe to reset its 'deleted' status.
All regression tests pass.
Thanks to Zhendong Su for reporting the bug and David Majnemer for tracking it to my commit r266561, and promptly bringing it to my attention.
Serge Pavlov [Tue, 14 Jun 2016 02:55:56 +0000 (02:55 +0000)]
Detect recursive default argument definition
If definition of default function argument uses itself, clang crashed,
because corresponding function parameter is not associated with the default
argument yet. With this fix clang emits appropriate error message.
Richard Smith [Tue, 14 Jun 2016 01:13:21 +0000 (01:13 +0000)]
Remove nonsense and simplify. To forward a reference, we always just load the
pointer-to-pointer representing the parameter. An aggregate rvalue representing
a pointer does not make sense.
We got away with this weirdness because CGCall happens to blindly load an
RValue in aggregate form in this case, without checking whether an RValue for
the type should be in scalar or aggregate form.
Samuel Antao [Mon, 13 Jun 2016 18:10:57 +0000 (18:10 +0000)]
[CUDA][OpenMP] Create generic offload toolchains
Summary:
This patch introduces the concept of offloading tool chain and offloading kind. Each tool chain may have associated an offloading kind that marks it as used in a given programming model that requires offloading.
It also adds the logic to iterate on the tool chains based on the kind. Currently, only CUDA is supported, but in general a programming model (an offloading kind) may have associated multiple tool chains that require supporting offloading.
This patch does not add tests - its goal is to keep the existing functionality.
This patch is the first of a series of three that attempts to make the current support of CUDA more generic and easier to extend to other programming models, namely OpenMP. It tries to capture the suggestions/improvements/concerns on the initial proposal in http://lists.llvm.org/pipermail/cfe-dev/2016-February/047547.html. It only tackles the more consensual part of the proposal, i.e.does not address the problem of intermediate files bundling yet.
Reviewers: ABataev, jlebar, echristo, hfinkel, tra
There is no need to use a target-specific intrinsic to implement
_bit_scan_forward or _bit_scan_reverse, reimplementing them using
generic intrinsics makes it more likely that the middle end will
understand what's going on.
Taking the address of a packed member is dangerous since the reduced
alignment of the pointee is lost. This can lead to memory alignment
faults in some architectures if the pointer value is dereferenced.
This change adds a new warning to clang emitted when taking the address
of a packed member. A packed member is either a field/data member
declared as attribute((packed)) or belonging to a struct/class
declared as such. The associated flag is -Waddress-of-packed-member
Simon Pilgrim [Mon, 13 Jun 2016 09:57:52 +0000 (09:57 +0000)]
[Clang][X86] Convert non-temporal store builtins to generic __builtin_nontemporal_store in headers
We can now use __builtin_nontemporal_store instead of target specific builtins for naturally aligned nontemporal stores which avoids the need for handling in CGBuiltin.cpp
The scalar integer nontemporal (unaligned) store builtins will have to wait as __builtin_nontemporal_store currently assumes natural alignment and doesn't accept the 'packed struct' trick that we use for normal unaligned load/stores.
The nontemporal loads require further backend support before we can safely convert them to __builtin_nontemporal_load
This also applies to variable declarations, e.g.:
int const * a;
However, this form is very uncommon (most people would write
"const int* a" instead) and contracting to "const*" might actually send
the wrong signal of what the const binds to.
Mike Spertus [Mon, 13 Jun 2016 04:02:35 +0000 (04:02 +0000)]
Improved Visual Studio visualization of OpaquePtr
Create a special visualizer for OpaquePtr<QualType> because the
standard visualizer doesn't work with OpaquePtr<QualType>
due to QualType being heavily dependent on traits to be pointer-like.
Also, created an identical visualizer for UnionOpaquePtr
Devin Coughlin [Mon, 13 Jun 2016 03:58:58 +0000 (03:58 +0000)]
[analyzer] Remove some list initialization from MPI Checker to make MSVC bots happy.
This is a speculative attempt to fix the compiler error: "list initialization inside
member initializer list or non-static data member initializer is not implemented" with
r272529.
Devin Coughlin [Mon, 13 Jun 2016 03:22:41 +0000 (03:22 +0000)]
[analyzer] Add checker to verify the correct usage of the MPI API
This commit adds a static analysis checker to verify the correct usage of the MPI API in C
and C++. This version updates the reverted r271981 to fix a memory corruption found by the
ASan bots.
Three path-sensitive checks are included:
- Double nonblocking: Double request usage by nonblocking calls without intermediate wait
- Missing wait: Nonblocking call without matching wait.
- Unmatched wait: Waiting for a request that was never used by a nonblocking call
Examples of how to use the checker can be found at https://github.com/0ax1/MPI-Checker
Mike Spertus [Sun, 12 Jun 2016 22:21:56 +0000 (22:21 +0000)]
Visual Studio native visualizer for ParsedTemplateArgument
Does a good job with type and non-type template arguments
and lays the groundwork for template template arguments to
visualize well once there is a TemplateName visualizer.
Also fixed what looks like an incorrect comment in the
header for ParsedTemplate.h.