[mips][msa] Range check MSA intrinsics with immediates
This patch teaches clang to range check immediates for MIPS MSA instrinsics.
This checking is done strictly in comparison to some existing GCC
implementations. E.g. msa_andvi_b(var, 257) does not result in andvi $wX, 1.
Similarily msa_ldi_b takes a range of -128 to 127.
As part of this effort, correct the existing MSA test as it has both illegal
types and immediates.
Adjust type-trait evaluation to properly handle Using(Shadow)Decls
Since r274049, for an inheriting constructor declaration, the name of the using
declaration (and using shadow declaration comes from the using declaration) is
the name of a derived class, not the base class (line 8225-8232 of
lib/Sema/SemaDeclCXX.cpp in https://reviews.llvm.org/rL274049). Because of
this, name-based lookup performed inside Sema::LookupConstructors returns not
only CXXConstructorDecls but also Using(Shadow)Decls, which results assertion
failure reported in PR29087.
[OPENMP] Fixed codegen for 'omp cancel' construct.
If 'omp cancel' construct is used in a worksharing construct it may
cause hanging of the software in case if reduction clause is used. Patch fixes this problem by avoiding extra reduction processing for branches that were canceled.
------------------------------------------------------------------------
Having the following code pattern will result in incorrect diagnostic
int main() {
int arr[10];
#pragma omp target data map(arr[:])
#pragma omp target map(arr)
{}
}
t.cpp:4:24: error: original storage of expression in data environment is shared
but data environment do not fully contain mapped expression storage
#pragma omp target map(arr)
^~~
t.cpp:3:29: note: used here
#pragma omp target data map(arr[:])
^~~~~~
1 error generated.
Fix a couple of wrong-code bugs in switch-on-constant optimization:
* recurse through intermediate LabelStmts and AttributedStmts when checking
whether a statement inside a switch declares a variable
* if the end of a compound statement is reachable from the chosen case label,
and the compound statement contains a variable declaration, it's not valid
to just emit the contents of the compound statement -- we must emit the
statement itself or we lose the scope (and thus end lifetimes at the wrong
point)
There was a bug in the implementation of captured statements. If it has
a lambda expression in it and the same lambda expression is used outside
the captured region, clang produced an error:
```
error: definition with same mangled name as another definition
```
Here is an example:
```
struct A {
template <typename L>
void g(const L&) { }
};
Error report:
```
main.cpp:3:10: error: definition with same mangled name as another
definition
void g(const L&) { }
^
main.cpp:3:10: note: previous definition is here
```
Patch fixes this bug.
------------------------------------------------------------------------
[OPENMP] Fixed codegen for 'omp cancel' construct.
If 'omp cancel' construct is used in a worksharing construct it may cause
hanging of the software in case if reduction clause is used. Patch fixes
this problem by avoiding extra reduction processing for branches that
were canceled.
------------------------------------------------------------------------
Clang emits error message for the following code:
```
template <class F> void parallel_loop(F &&f) { f(0); }
int main() {
int x;
parallel_loop([&](auto y) {
{
x = y;
};
});
}
```
$ clang++ --std=gnu++14 clang_test.cc -o clang_test
clang_test.cc:9:7: error: reference to local variable 'x' declared in enclosing function 'main'
x = y;
^
clang_test.cc:2:48: note: in instantiation of function template specialization 'main()::(anonymous class)::operator()<int>' requested here
template <class F> void parallel_loop(F &&f) { f(0); }
^
clang_test.cc:6:3: note: in instantiation of function template specialization 'parallel_loop<(lambda at clang_test.cc:6:17)>' requested here parallel_loop([&](auto y) {
^
clang_test.cc:5:7: note: 'x' declared here
int x;
^
1 error generated.
Patch fixes this issue.
------------------------------------------------------------------------
[OPENMP] Fixed codegen for __real/__imag expressions in atomic
constructs.
For __real/__imag unary expressions clang emits lvalue with the
associated type from the original complex expression, but not the
underlying builtin integer or float type. This causes crash in codegen
for atomic constructs, if __real/__imag expression are used in atomic
constructs.
------------------------------------------------------------------------
Fix for PR30639: CGDebugInfo Null dereference with OpenMP array
access, by Erich Keane
OpenMP creates a variable array type with a a null size-expr. The Debug
generation failed to due to this. This patch corrects the openmp
implementation, updates the tests, and adds a new one for this
condition.
The diagnostic was attempting to access the QualType of a TypeDecl by calling
TypeDecl::getTypeForDecl. However, the Type pointer stored there is lazily
loaded by functions in ASTContext. In most cases, the pointer is loaded and
this does not cause a problem. However, when more that 50 or so unknown types
are seen beforehand, this causes the Type to not be loaded, passing a null
Type to the diagnostics, leading to the crash. Using
ASTContext::getTypeDeclType will give a proper QualType for all cases.
With templated classes, is possible to not be able to determine is a member
function is a special member function before the class is instantiated. Only
these special member functions can be defaulted. In some cases, knowing
whether a function is a special member function can't be determined until
instantiation, so an uninstantiated function could possibly be defaulted too.
Add a case to the error diagnostic when the function marked with a default is
not known to be a special member function.
Fix interaction between serialization and c++1z feature.
In c++1z, static_assert is not required to have a StringLiteral message, where
previously it was required. Update the AST Reader to be able to handle a
null StringLiteral.
PR12298 et al: don't recursively instantiate a template specialization from
within the instantiation of that same specialization. This could previously
happen for eagerly-instantiated function templates, variable templates,
exception specifications, default arguments, and a handful of other cases.
We still have an issue here for default template arguments that recursively
make use of themselves and likewise for substitution into the type of a
non-type template parameter, but in those cases we're producing a different
entity each time, so they should instead be caught by the instantiation depth
limit. However, currently we will typically run out of stack before we reach
it. :(
Summary:
The eprintf library was added before the general OS X builtins library existed as a place to store one builtin function. Since we have for several years had an actual mandated builtin library for OS X > 10.5, we should just merge eprintf into the main library.
This change will resolve PR28855.
As a follow up I'll also patch compiler-rt to not generate the eprintf library anymore.
PR26423: Assert on valid use of using declaration of a function with an undeduced auto return type
For now just disregard the using declaration in this case. Suboptimal,
but wiring up the ability to have declarations of functions that are
separate from their definition (we currently only do that for member
functions) and have differing return types (we don't have any support
for that) is more work than seems reasonable to at least fix this crash.
------------------------------------------------------------------------
PR28978: If we need overload resolution for the move constructor of an
anonymous union member of a class, we need overload resolution for the move
constructor of the class itself too; we can't rely on Sema to do the right
thing for us for anonymous union types.
[ADT] Change iterator_adaptor_base's default template arguments to forward more underlying typedefs
Summary:
The corresponding LLVM change: D23217.
LazyVector::iterator breaks, because int isn't an iterator type.
Since iterator_adaptor_base shouldn't be blamed to break at the call to
iterator_traits<int>::xxx, I'd rather "fix" LazyVector::iterator.
The perfect solution is to model "relative pointer", but it's beyond the goal of this patch.
Ed Schouten [Sat, 13 Aug 2016 20:43:56 +0000 (20:43 +0000)]
Merge r278393 and r278395.
LLVM/Clang 3.8 is directly usable as a cross compiler for CloudABI/i686.
In 3.9rc1 there are a couple of regressions in the driver that cause it
to be less usable:
- PIE was enabled unconditionally, even though it's only available for
x86-64 and aarch64.
- Some inline assembly fails to build, due to a shortage of registers,
as frame pointers are not omitted.
Both these changes are fairly low risk (read: they don't affect other
targets), so go ahead and merge them into 3.9, so we can use an
unmodified compiler on all architectures.
[Sema] Fix a crash on variadic enable_if functions.
Currently, when trying to evaluate an enable_if condition, we try to
evaluate all arguments a user passes to a function. Given that we can't
use variadic arguments from said condition anyway, not converting them
is a reasonable thing to do. So, this patch makes us ignore any varargs
when attempting to check an enable_if condition.
We'd crash because, in order to convert an argument, we need its
ParmVarDecl. Variadic arguments don't have ParmVarDecls.
[clang-cl] Make -gline-tables-only imply -gcodeview
It's surprising that you have to pass /Z7 in addition to -gcodeview to
get debug info. The sanitizer runtime, for example, expects that if the
compiler supports the -gline-tables-only flag, then it will emit debug
info.
------------------------------------------------------------------------
Allow -1 to assign max value to unsigned bitfields.
Silence the -Wbitfield-constant-conversion warning for when -1 or other
negative values are assigned to unsigned bitfields, provided that the bitfield
is wider than the minimum number of bits needed to encode the negative value.
When the type being diffed is a type alias, and the orginal type is not a
templated type, then there will be no unsugared TemplateSpecializationType.
When this happens, exit early from the constructor. Also add assertions to
the other iterator accessor to prevent the iterator from being used.
Fix false positive in -Wunsequenced and templates.
For builtin logical operators, there is a well-defined ordering of argument
evaluation. For overloaded operator of the same type, there is no argument
evaluation order, similar to other function calls. When both are present,
uninstantiated templates with an operator&& is treated as an unresolved
function call. Unresolved function calls are treated as normal function calls,
and may result in false positives when the builtin logical operator is used.
Have the unsequenced checker ignore dependent expressions to avoid this
false positive. The check also happens in template instantiations to catch
when the overloaded operator is used.
If the return type is a pointer and the function returns the reference to a
pointer, don't warn since only the value is returned, not the reference.
If a reference function parameter appears in the reference chain, don't warn
since binding happens at the caller scope, so addresses returned are not
to local stack. This includes default arguments as well.
[OpenCL] Added underscores to the names of 'to_addr' OpenCL built-ins.
Summary:
In order to re-define OpenCL built-in functions
'to_{private,local,global}' in OpenCL run-time library LLVM names must
be different from the clang built-in function names.
Diana Picus [Tue, 2 Aug 2016 14:34:15 +0000 (14:34 +0000)]
Merging r277457
[clang-cl] Fix PCH tests to use x86_64 as target
These tests require x86-registered-target, but they don't force the target as
x86 on the command line, which means they will be run and they might fail when
building the x86 backend on another platform (such as AArch64).
Add more gcc compatibility names to clang's cpuid.h
Summary:
Some cpuid bit defines are named slightly different from how gcc's
cpuid.h calls them.
Define a few more compatibility names to appease software built for gcc:
* `bit_PCLMUL` alias of `bit_PCLMULQDQ`
* `bit_SSE4_1` alias of `bit_SSE41`
* `bit_SSE4_2` alias of `bit_SSE42`
* `bit_AES` alias of `bit_AESNI`
* `bit_CMPXCHG8B` alias of `bit_CX8`
While here, add the misssing 29th bit, `bit_F16C` (which is how gcc
calls this bit).
Make test not fail on hosts where the default omp library is gomp.
This is the case on some linuxes, just force libomp so we get the
desired results.
------------------------------------------------------------------------
The '#pragma once' directive was erroneously ignored when encountered
in the header-file specified in generate-PCH-mode. This resulted in
compile-time errors in some cases with legal code, and also a misleading
warning being produced.
[OpenMP][CUDA] Do not forward OpenMP flags for CUDA device actions.
Summary:
This patch prevents OpenMP flags from being forwarded to CUDA device commands. That was causing the CUDA frontend to attempt to emit OpenMP code which is not supported.
This fixes the bug reported in https://llvm.org/bugs/show_bug.cgi?id=28723.
[X86][SSE] Reimplement SSE fp2si conversion intrinsics instead of using generic IR
D20859 and D20860 attempted to replace the SSE (V)CVTTPS2DQ and VCVTTPD2DQ truncating conversions with generic IR instead.
It turns out that the behaviour of these intrinsics is different enough from generic IR that this will cause problems, INF/NAN/out of range values are guaranteed to result in a 0x80000000 value - which plays havoc with constant folding which converts them to either zero or UNDEF. This is also an issue with the scalar implementations (which were already generic IR and what I was trying to match).
This patch changes both scalar and packed versions back to using x86-specific builtins.
It also deals with the other scalar conversion cases that are runtime rounding mode dependent and can have similar issues with constant folding.
[Coverage] Do not write out coverage mappings with zero entries
After r275121, we stopped mapping regions from system headers. Lambdas
declared in regions belonging to system headers started producing empty
coverage mappings, since the files corresponding to their spelling locs
were being ignored.
The coverage reader doesn't know what to do with these empty mappings.
This commit makes sure that we don't produce them and adds a test. I'll
make the reader stricter in a follow-up commit.
------------------------------------------------------------------------
[modules] Teach the ASTWriter to ignore mutations coming from the ASTReader.
Processing update records (and loading a module, in general) might trigger
unexpected calls to the ASTWriter (being a mutation listener). Now we have a
mechanism to suppress those calls to the ASTWriter but notify other possible
mutation listeners.
Reverting r275115 which caused PR28634.
When empty (forwarding) basic blocks that are referenced by user labels
are removed, incorrect code may be generated.
[analyzer] Add checker modeling potential C++ self-assignment
This checker checks copy and move assignment operators whether they are
protected against self-assignment. Since C++ core guidelines discourages
explicit checking for `&rhs==this` in general we take a different approach: in
top-frame analysis we branch the exploded graph for two cases, where &rhs==this
and &rhs!=this and let existing checkers (e.g. unix.Malloc) do the rest of the
work. It is important that we check all copy and move assignment operator in top
frame even if we checked them already since self-assignments may happen
undetected even in the same translation unit (e.g. using random indices for an
array what may or may not be the same).
Kelvin Li [Mon, 18 Jul 2016 16:09:53 +0000 (16:09 +0000)]
[OpenMP] update test cases for -std=c++11 compile
target_parallel_for_simd_collapse_messages.cpp and target_parallel_for_simd_ordered_messages.cpp give different diagnostic messages in compiling with -std=c++11. The test cases are updated to make it compatible.
Add support for ObjC types to respect the DLLImport/DLLExport storage
annotations. This only effects COFF output. This would allow usage with
clang/C2, but not with clang/LLVM due to hard coded section names.
David Majnemer [Sun, 17 Jul 2016 00:39:12 +0000 (00:39 +0000)]
[CodeGen] Some assorted cleanups
No functional change, just some cleanups:
- Use auto when it is appropriate.
- There were some strange static_casts which were superfluous.
- Use range-based for loops when appropriate.
- The dyn_cast_or_null construct was used when null was impossible.
CodeGen: use StringRefs more in ObjC class generation, NFC
Rather than building up a number of SmallString-s in order to construct a
std::string, use more StringRefs and construct the string once before use. This
avoids unnecessary string constructions. NFC.
Add a couple of local variables for the class interface and the super class
interface. This allows for the repeated access of the information to be cached
and makes the code simpler to understand. NFC.
Erik Pilkington [Sat, 16 Jul 2016 00:35:23 +0000 (00:35 +0000)]
[ObjC] Implement @available in the Parser and AST
This patch adds a new AST node: ObjCAvailabilityCheckExpr, and teaches the
Parser and Sema to generate it. This node represents an availability check of
the form:
@available(macos 10.10, *);
Which will eventually compile to a runtime check of the host's OS version. This
is the first patch of the feature I proposed here:
http://lists.llvm.org/pipermail/cfe-dev/2016-July/049851.html
Richard Smith [Sat, 16 Jul 2016 00:35:14 +0000 (00:35 +0000)]
Reimplement ExternalSemaSource delegation in terms of
MultiplexExternalSemaSource to remove one of the places that needs updating
every time the ExternalSemaSource interface changes.
Samuel Antao [Fri, 15 Jul 2016 23:13:27 +0000 (23:13 +0000)]
[CUDA][OpenMP] Create generic offload action
Summary:
This patch replaces the CUDA specific action by a generic offload action. The offload action may have multiple dependences classier in “host” and “device”. The way this generic offloading action is used is very similar to what is done today by the CUDA implementation: it is used to set a specific toolchain and architecture to its dependences during the generation of jobs.
This patch also proposes propagating the offloading information through the action graph so that that information can be easily retrieved at any time during the generation of commands. This allows e.g. the "clang tool” to evaluate whether CUDA should be supported for the device or host and ptas to easily retrieve the target architecture.
This is an example of how the action graphs would look like (compilation of a single CUDA file with two GPU architectures)
```
0: input, "cudatests.cu", cuda, (host-cuda)
1: preprocessor, {0}, cuda-cpp-output, (host-cuda)
2: compiler, {1}, ir, (host-cuda)
3: input, "cudatests.cu", cuda, (device-cuda, sm_35)
4: preprocessor, {3}, cuda-cpp-output, (device-cuda, sm_35)
5: compiler, {4}, ir, (device-cuda, sm_35)
6: backend, {5}, assembler, (device-cuda, sm_35)
7: assembler, {6}, object, (device-cuda, sm_35)
8: offload, "device-cuda (nvptx64-nvidia-cuda:sm_35)" {7}, object
9: offload, "device-cuda (nvptx64-nvidia-cuda:sm_35)" {6}, assembler
10: input, "cudatests.cu", cuda, (device-cuda, sm_37)
11: preprocessor, {10}, cuda-cpp-output, (device-cuda, sm_37)
12: compiler, {11}, ir, (device-cuda, sm_37)
13: backend, {12}, assembler, (device-cuda, sm_37)
14: assembler, {13}, object, (device-cuda, sm_37)
15: offload, "device-cuda (nvptx64-nvidia-cuda:sm_37)" {14}, object
16: offload, "device-cuda (nvptx64-nvidia-cuda:sm_37)" {13}, assembler
17: linker, {8, 9, 15, 16}, cuda-fatbin, (device-cuda)
18: offload, "host-cuda (powerpc64le-unknown-linux-gnu)" {2}, "device-cuda (nvptx64-nvidia-cuda)" {17}, ir
19: backend, {18}, assembler
20: assembler, {19}, object
21: input, "cuda", object
22: input, "cudart", object
23: linker, {20, 21, 22}, image
```
The changes in this patch pass the existent regression tests (keeps the existent functionality) and resulting binaries execute correctly in a Power8+K40 machine.
Reviewers: echristo, hfinkel, jlebar, ABataev, tra