Vedant Kumar [Tue, 14 Jun 2016 19:06:48 +0000 (19:06 +0000)]
[perf-training] Ignore 'Profile Note' warnings from the runtime
After r272599, -DLLVM_BUILD_INSTRUMENTED passes a default argument to
-fprofile-instr-generate. This confuses the perf-helper script because
the runtime emits a note stating that the default is overridden by the
LLVM_PROFILE_FILE environment variable.
Change the perf-helper script s.t it does not treat these notes as
failures.
This isn't a strictly NFC change, but I don't see a simple way to add a
test for it.
Rafael Espindola [Tue, 14 Jun 2016 12:47:24 +0000 (12:47 +0000)]
Start adding support for Musl.
The two patches together enable clang to support targets like
"x86_64-pc-linux-musl" and build binaries against musl-libc instead of
glibc. This make it easy for clang to work on some musl-based systems
like Alpine Linux and certain flavors of Gentoo.
Adam Nemet [Tue, 14 Jun 2016 12:04:26 +0000 (12:04 +0000)]
Add loop pragma for Loop Distribution
Summary:
This is similar to other loop pragmas like 'vectorize'. Currently it
only has state values: distribute(enable) and distribute(disable). When
one of these is specified the corresponding loop metadata is generated:
!{!"llvm.loop.distribute.enable", i1 true/false}
As a result, loop distribution will be attempted on the loop even if
Loop Distribution in not enabled globally. Analogously, with 'disable'
distribution can be turned off for an individual loop even when the pass
is otherwise enabled.
There are some slight differences compared to the existing loop pragmas.
1. There is no 'assume_safety' variant which makes its handling slightly
different from 'vectorize'/'interleave'.
2. Unlike the existing loop pragmas, it does not have a corresponding
numeric pragma like 'vectorize' -> 'vectorize_width'. So for the
consistency checks in CheckForIncompatibleAttributes we don't need to
check it against other pragmas. We just need to check for duplicates of
the same pragma.
Daniel Sanders [Tue, 14 Jun 2016 08:58:50 +0000 (08:58 +0000)]
[mips] Defer validity check for CPU/ABI pairs and improve error message for invalid cases.
Summary:
The validity of ABI/CPU pairs is no longer checked on the fly but is
instead checked after initialization. As a result, invalid CPU/ABI pairs
can be reported as being known but invalid instead of being unknown. For
example, we now emit:
error: ABI 'n32' is not supported on CPU 'mips32r2'
instead of:
error: unknown target ABI 'n64'
Faisal Vali [Tue, 14 Jun 2016 03:23:15 +0000 (03:23 +0000)]
Fix PR28100 - Allow redeclarations of deleted explicit specializations.
See https://llvm.org/bugs/show_bug.cgi?id=28100.
In r266561 when I implemented allowing explicit specializations of function templates to override deleted status, I mistakenly assumed (and hence introduced a violable assertion) that when an explicit specialization was being declared, the corresponding specialization of the most specialized function template that it would get linked to would always be the one that was implicitly generated - and so if it was marked as 'deleted' it must have inherited it from the primary template and so should be safe to reset its deleted status, and set it to being an explicit specialization. Obviously during redeclaration of a deleted explicit specialization, in order to avoid a recursive reset, we need to check that the previous specialization is not an explicit specialization (instead of assuming and asserting it) and that it hasn't been referenced, and so only then is it safe to reset its 'deleted' status.
All regression tests pass.
Thanks to Zhendong Su for reporting the bug and David Majnemer for tracking it to my commit r266561, and promptly bringing it to my attention.
Serge Pavlov [Tue, 14 Jun 2016 02:55:56 +0000 (02:55 +0000)]
Detect recursive default argument definition
If definition of default function argument uses itself, clang crashed,
because corresponding function parameter is not associated with the default
argument yet. With this fix clang emits appropriate error message.
Richard Smith [Tue, 14 Jun 2016 01:13:21 +0000 (01:13 +0000)]
Remove nonsense and simplify. To forward a reference, we always just load the
pointer-to-pointer representing the parameter. An aggregate rvalue representing
a pointer does not make sense.
We got away with this weirdness because CGCall happens to blindly load an
RValue in aggregate form in this case, without checking whether an RValue for
the type should be in scalar or aggregate form.
Samuel Antao [Mon, 13 Jun 2016 18:10:57 +0000 (18:10 +0000)]
[CUDA][OpenMP] Create generic offload toolchains
Summary:
This patch introduces the concept of offloading tool chain and offloading kind. Each tool chain may have associated an offloading kind that marks it as used in a given programming model that requires offloading.
It also adds the logic to iterate on the tool chains based on the kind. Currently, only CUDA is supported, but in general a programming model (an offloading kind) may have associated multiple tool chains that require supporting offloading.
This patch does not add tests - its goal is to keep the existing functionality.
This patch is the first of a series of three that attempts to make the current support of CUDA more generic and easier to extend to other programming models, namely OpenMP. It tries to capture the suggestions/improvements/concerns on the initial proposal in http://lists.llvm.org/pipermail/cfe-dev/2016-February/047547.html. It only tackles the more consensual part of the proposal, i.e.does not address the problem of intermediate files bundling yet.
Reviewers: ABataev, jlebar, echristo, hfinkel, tra
There is no need to use a target-specific intrinsic to implement
_bit_scan_forward or _bit_scan_reverse, reimplementing them using
generic intrinsics makes it more likely that the middle end will
understand what's going on.
Taking the address of a packed member is dangerous since the reduced
alignment of the pointee is lost. This can lead to memory alignment
faults in some architectures if the pointer value is dereferenced.
This change adds a new warning to clang emitted when taking the address
of a packed member. A packed member is either a field/data member
declared as attribute((packed)) or belonging to a struct/class
declared as such. The associated flag is -Waddress-of-packed-member
Simon Pilgrim [Mon, 13 Jun 2016 09:57:52 +0000 (09:57 +0000)]
[Clang][X86] Convert non-temporal store builtins to generic __builtin_nontemporal_store in headers
We can now use __builtin_nontemporal_store instead of target specific builtins for naturally aligned nontemporal stores which avoids the need for handling in CGBuiltin.cpp
The scalar integer nontemporal (unaligned) store builtins will have to wait as __builtin_nontemporal_store currently assumes natural alignment and doesn't accept the 'packed struct' trick that we use for normal unaligned load/stores.
The nontemporal loads require further backend support before we can safely convert them to __builtin_nontemporal_load
This also applies to variable declarations, e.g.:
int const * a;
However, this form is very uncommon (most people would write
"const int* a" instead) and contracting to "const*" might actually send
the wrong signal of what the const binds to.
Mike Spertus [Mon, 13 Jun 2016 04:02:35 +0000 (04:02 +0000)]
Improved Visual Studio visualization of OpaquePtr
Create a special visualizer for OpaquePtr<QualType> because the
standard visualizer doesn't work with OpaquePtr<QualType>
due to QualType being heavily dependent on traits to be pointer-like.
Also, created an identical visualizer for UnionOpaquePtr
Devin Coughlin [Mon, 13 Jun 2016 03:58:58 +0000 (03:58 +0000)]
[analyzer] Remove some list initialization from MPI Checker to make MSVC bots happy.
This is a speculative attempt to fix the compiler error: "list initialization inside
member initializer list or non-static data member initializer is not implemented" with
r272529.
Devin Coughlin [Mon, 13 Jun 2016 03:22:41 +0000 (03:22 +0000)]
[analyzer] Add checker to verify the correct usage of the MPI API
This commit adds a static analysis checker to verify the correct usage of the MPI API in C
and C++. This version updates the reverted r271981 to fix a memory corruption found by the
ASan bots.
Three path-sensitive checks are included:
- Double nonblocking: Double request usage by nonblocking calls without intermediate wait
- Missing wait: Nonblocking call without matching wait.
- Unmatched wait: Waiting for a request that was never used by a nonblocking call
Examples of how to use the checker can be found at https://github.com/0ax1/MPI-Checker
Mike Spertus [Sun, 12 Jun 2016 22:21:56 +0000 (22:21 +0000)]
Visual Studio native visualizer for ParsedTemplateArgument
Does a good job with type and non-type template arguments
and lays the groundwork for template template arguments to
visualize well once there is a TemplateName visualizer.
Also fixed what looks like an incorrect comment in the
header for ParsedTemplate.h.
Mike Spertus [Sat, 11 Jun 2016 20:15:19 +0000 (20:15 +0000)]
Visual Studio Visualizers for ActionResult, LocInfoType, and and TypeSourceInfo
Created a visualizer for ActionResult that displayed the validity and the pointer,
but many of them initially displayed poorly. It turns out that the primary culprit
is that LocInfoType is often passed in an action result, but it is not the same
as other types. For example, LocInfoType is not in TypeNodes.def and clang::Type::TypeClass
does not have a LocInfoType enum. After adding a special visualizer for LocInfoType,
the display was more useful
Faisal Vali [Sat, 11 Jun 2016 16:41:54 +0000 (16:41 +0000)]
Fix cv-qualification of '*this' captures and nasty bug PR27507
The bug report by Gonzalo (https://llvm.org/bugs/show_bug.cgi?id=27507 -- which results in clang crashing when generic lambdas that capture 'this' are instantiated in contexts where the Functionscopeinfo stack is not in a reliable state - yet getCurrentThisType expects it to be) - unearthed some additional bugs in regards to maintaining proper cv qualification through 'this' when performing by value captures of '*this'.
This patch attempts to correct those bugs and makes the following changes:
o) when capturing 'this', we do not need to remember the type of 'this' within the LambdaScopeInfo's Capture - it is never really used for a this capture - so remove it.
o) teach getCurrentThisType to walk the stack of lambdas (even in scenarios where we run out of LambdaScopeInfo's such as when instantiating call operators) looking for by copy captures of '*this' and resetting the type of 'this' based on the constness of that capturing lambda's call operator.
This patch has been baking in review-hell for > 6 weeks - all the comments so far have been addressed and the bug (that it addresses in passing, and I regret not submitting as a separate patch initially) has been reported twice independently, so is frequent and important for us not to just sit on. I merged the cv qualification-fix and the PR-fix initially in one patch, since they resulted from my initial implementation of star-this and so were related. If someone really feels strongly, I can put in the time to revert this - separate the two out - and recommit. I won't claim it's immunized against all bugs, but I feel confident enough about the fix to land it for now.
Craig Topper [Sat, 11 Jun 2016 03:31:13 +0000 (03:31 +0000)]
[AVX512] Implement 512-bit and masked shufflelo and shufflehi intrinsics directly with __builtin_shufflevector and __builtin_ia32_select. Also improve the formatting of the AVX2 version.
Mike Spertus [Sat, 11 Jun 2016 03:02:33 +0000 (03:02 +0000)]
Visual Studio visualizers associated with LookupResults
Visualizers for DeclAccessPair, UnresolvedSet, and LookupResult. For example,
when combined with LLVM diff D21256 (currently in review), a Lookup set will
show much more naturally in the Locals window something like
David Majnemer [Sat, 11 Jun 2016 01:25:04 +0000 (01:25 +0000)]
[Sema] Return an appropriate result from CheckSpecifiedExceptionType
We shouldn't return true from CheckSpecifiedExceptionType if
the record type is incomplete and -fms-extensions is engaged. Otherwise
we will have an incomplete AST.
Driver: make it easier to select the SjLj EH model
GCC still permits enabling the SjLj EH model. This is something which can be
done on various targets. Hoist the -fsjlj-exceptions option into the driver and
pass it through. This allows one to opt into the alternative EH model while
retaining the default to be the target's default.
Josh Gao [Fri, 10 Jun 2016 18:30:33 +0000 (18:30 +0000)]
Strip Android version when looking up toolchain paths.
Summary:
Android target triples can include a version number in the abi field
(e.g. 'aarch64-linux-android21'), used for checking for availability.
However, the driver was searching for toolchain binaries using the
passed in triple as a prefix.
David Majnemer [Fri, 10 Jun 2016 18:24:41 +0000 (18:24 +0000)]
[-fms-extensions] Permit incomplete types in dynamic exception specifications
Microsoft headers, comdef.h and comutil.h, assume that this is an OK
thing to do. Downgrade the hard error to a warning if we are in
-fms-extensions mode.
Ben Craig [Fri, 10 Jun 2016 13:22:13 +0000 (13:22 +0000)]
Preallocate ExplodedNode hash table
Rehashing the ExplodedNode table is very expensive. The hashing
itself is expensive, and the general activity of iterating over the
hash table is highly cache unfriendly. Instead, we guess at the
eventual size by using the maximum number of steps allowed. This
generally avoids a rehash. It is possible that we still need to
rehash if the backlog of work that is added to the worklist
significantly exceeds the number of work items that we process. Even
if we do need to rehash in that scenario, this change is still a
win, as we still have fewer rehashes that we would have prior to
this change.
For small work loads, this will increase the memory used. For large
work loads, it will somewhat reduce the memory used. Speed is
significantly increased. A large .C file took 3m53.812s to analyze
prior to this change. Now it takes 3m38.976s, for a ~6% improvement.
Richard Trieu [Fri, 10 Jun 2016 04:52:09 +0000 (04:52 +0000)]
Check for null pointers before calling the Stmt Profiler
Some calls from OMPClauseProfiler were calling the Stmt Profiler with null
pointers, but the profiler can only handle non-null pointers. Add an assert
to the VisitStmt for valid pointers, and check all calls from OMPClauseProfiler
to be non-null pointers.
Serge Pavlov [Fri, 10 Jun 2016 04:39:07 +0000 (04:39 +0000)]
Fix recognition of shadowed template parameter
Crash reported in PR28023 is caused by the fact that non-type template
parameters are found by tag name lookup. In the code provided in that PR:
template<int V> struct A {
struct B {
template <int> friend struct V;
};
};
the template parameter V is found when lookup for redeclarations of 'struct V'
is made. Latter on the error about shadowing of 'V' is emitted but the semantic
context of 'struct V' is already determined wrong: 'struct A' instead of
translation unit.
The fix moves the check for shadowing toward the beginning of the method and
thus prevents from wrong context calculations.
Summary:
Add RenderScript language type and associate it with ".rs" extensions.
Test that the driver passes "-x renderscript" to the frontend for ".rs"
files.
(Also add '.rs' to the list of suffixes tested by lit).
Tim Shen [Thu, 9 Jun 2016 19:54:46 +0000 (19:54 +0000)]
[Temporary] Add an ExprWithCleanups for each C++ MaterializeTemporaryExpr.
These ExprWithCleanups are added for holding a RunCleanupsScope not
for destructor calls; rather, they are for lifetime marks. This requires
ExprWithCleanups to keep a bit to indicate whether it have cleanups with
side effects (e.g. dtor calls).
Chris Bieneman [Thu, 9 Jun 2016 17:24:16 +0000 (17:24 +0000)]
Revert "[CMake] Fix an issue building out-of-tree introduced in r272200"
This reverts r272275. This actually wasn't the right way to fix the problem. The correct solution is in r272279.
Applying the fix to LLVM as done in r272279, means this fix will get picked up by all projects building out of tree using LLVM's CMake modules. As opposed to the fix I had in r272275, which would require each project to change.