[clang-format] Break template declarations followed by comments
Summary:
This patch fixes two bugs in clang-format where the template wrapper doesn't skip over
comments causing a long template declaration to not be split into multiple lines.
These were latent and exposed by r332436.
Gabor Marton [Wed, 23 May 2018 13:53:36 +0000 (13:53 +0000)]
Fix duplicate class template definitions problem
Summary:
We fail to import a `ClassTemplateDecl` if the "To" context already
contains a definition and then a forward decl. This is because
`localUncachedLookup` does not find the definition. This is not a
lookup error, the parser behaves differently than assumed in the
importer code. A `DeclContext` contains one DenseMap (`LookupPtr`)
which maps names to lists. The list is a special list `StoredDeclsList`
which is optimized to have one element. During building the initial
AST, the parser first adds the definition to the `DeclContext`. Then
during parsing the second declaration (the forward decl) the parser
again calls `DeclContext::addDecl` but that will not add a new element
to the `StoredDeclsList` rarther it simply overwrites the old element
with the most recent one. This patch fixes the error by finding the
definition in the redecl chain. Added tests for the same issue with
`CXXRecordDecl` and with `ClassTemplateSpecializationDecl`. These tests
pass and they pass because in `VisitRecordDecl` and in
`VisitClassTemplateSpecializationDecl` we already use
`D->getDefinition()` after the lookup.
Hans Wennborg [Wed, 23 May 2018 08:24:01 +0000 (08:24 +0000)]
Revert r333044 "Use zeroinitializer for (trailing zero portion of) large array initializers"
It caused asserts, see PR37560.
> Use zeroinitializer for (trailing zero portion of) large array initializers
> more reliably.
>
> Clang has two different ways it emits array constants (from InitListExprs and
> from APValues), and both had some ability to emit zeroinitializer, but neither
> was able to catch all cases where we could use zeroinitializer reliably. In
> particular, emitting from an APValue would fail to notice if all the explicit
> array elements happened to be zero. In addition, for large arrays where only an
> initial portion has an explicit initializer, we would emit the complete
> initializer (which could be huge) rather than emitting only the non-zero
> portion. With this change, when the element would have a suffix of more than 8
> zero elements, we emit the array constant as a packed struct of its initial
> portion followed by a zeroinitializer constant for the trailing zero portion.
>
> In passing, I found a bug where SemaInit would sometimes walk the entire array
> when checking an initializer that only covers the first few elements; that's
> fixed here to unblock testing of the rest.
>
> Differential Revision: https://reviews.llvm.org/D47166
Craig Topper [Wed, 23 May 2018 05:51:52 +0000 (05:51 +0000)]
[X86] In the floating point max reduction intrinsics, negate infinity before feeding it to set1.
Previously we negated the whole vector after splatting infinity. But its better to negate the infinity before splatting. This generates IR with the negate already folded with the infinity constant.
Craig Topper [Wed, 23 May 2018 04:51:54 +0000 (04:51 +0000)]
[X86] Remove mask argument from more builtins that are handled completely in CGBuiltin.cpp. Just wrap a select builtin around them in the header file instead.
Richard Smith [Wed, 23 May 2018 00:09:29 +0000 (00:09 +0000)]
Use zeroinitializer for (trailing zero portion of) large array initializers
more reliably.
Clang has two different ways it emits array constants (from InitListExprs and
from APValues), and both had some ability to emit zeroinitializer, but neither
was able to catch all cases where we could use zeroinitializer reliably. In
particular, emitting from an APValue would fail to notice if all the explicit
array elements happened to be zero. In addition, for large arrays where only an
initial portion has an explicit initializer, we would emit the complete
initializer (which could be huge) rather than emitting only the non-zero
portion. With this change, when the element would have a suffix of more than 8
zero elements, we emit the array constant as a packed struct of its initial
portion followed by a zeroinitializer constant for the trailing zero portion.
In passing, I found a bug where SemaInit would sometimes walk the entire array
when checking an initializer that only covers the first few elements; that's
fixed here to unblock testing of the rest.
Sanjay Patel [Tue, 22 May 2018 23:02:13 +0000 (23:02 +0000)]
[CodeGen] use nsw negation for builtin abs
The clang builtins have the same semantics as the stdlib functions.
The stdlib functions are defined in section 7.20.6.1 of the C standard with:
"If the result cannot be represented, the behavior is undefined."
That lets us mark the negation with 'nsw' because "sub i32 0, INT_MIN" would
be UB/poison.
Craig Topper [Tue, 22 May 2018 22:19:19 +0000 (22:19 +0000)]
[X86] As mentioned in post-commit feedback in D47174, move the 128 bit f16c intrinsics into f16cintrin.h and remove __emmintrin_f16c.h
These were included in emmintrin.h to match Intel Intrinsics Guide documentation. But this is because icc is capable of emulating them on targets that don't support F16C using library calls. Clang/LLVM doesn't have this emulation support. So it makes more sense to include them in immintrin.h instead.
I've left a comment behind to hopefully deter someone from trying to move them again in the future.
Craig Topper [Tue, 22 May 2018 20:48:24 +0000 (20:48 +0000)]
[X86] Remove mask argument from some builtins that are handled completely in CGBuiltin.cpp. Just wrap a select builtin around them in the header file instead.
Craig Topper [Tue, 22 May 2018 18:54:19 +0000 (18:54 +0000)]
[X86] Move 128-bit f16c intrinsics to __emmintrin_f16c.h include from emmintrin.h. Move 256-bit f16c intrinsics back to f16cintrin.h
Intel documents the 128-bit versions as being in emmintrin.h and the 256-bit version as being in immintrin.h.
This patch makes a new __emmtrin_f16c.h to hold the 128-bit versions to be included from emmintrin.h. And makes the existing f16cintrin.h contain the 256-bit versions and include it from immintrin.h with an error if its included directly.
Yaxun Liu [Tue, 22 May 2018 14:36:26 +0000 (14:36 +0000)]
Call CreateTempMemWithoutCast for ActiveFlag
Introduced CreateMemTempWithoutCast and CreateTemporaryAllocaWithoutCast to emit alloca
without casting to default addr space.
ActiveFlag is a temporary variable emitted for clean up. It is defined as AllocaInst* type and there is
a cast to AlllocaInst in SetActiveFlag. An alloca casted to generic pointer causes assertion in
SetActiveFlag.
Since there is only load/store of ActiveFlag, it is safe to use the original alloca, therefore use
CreateMemTempWithoutCast is called.
Craig Topper [Mon, 21 May 2018 20:58:23 +0000 (20:58 +0000)]
[X86] Remove masking from pternlog llvm intrinsics and use a select instruction instead.
Because the intrinsics in the headers are implemented as macros, we can't just use a select builtin and pternlog builtin. This would require one of the macro arguments to be used twice. Depending on what was passed to the macro we could expand an expression twice leading to weird behavior. We could maybe declare our local variable in the macro, but that would need to worry about name collisions.
To avoid that just generate IR directly in CGBuiltin.cpp.
Craig Topper [Mon, 21 May 2018 20:19:17 +0000 (20:19 +0000)]
[X86] Use __builtin_convertvector to implement some of the packed integer to packed float conversion intrinsics.
I believe this is safe assuming default default FP environment. The conversion might be inexact, but it can never overflow the FP type so this shouldn't be undefined behavior for the uitofp/sitofp instructions.
We already do something similar for scalar conversions.
Mark Searles [Mon, 21 May 2018 17:29:08 +0000 (17:29 +0000)]
[Clang Tablegen] Add llvm_unreachable() to getModifierName()
Fix internal build failure:
../../../ClangDiagnosticsEmitter.cpp -o ClangDiagnosticsEmitter.o
../../../ClangDiagnosticsEmitter.cpp: In function 'llvm::StringRef
{anonymous}::getModifierName({anonymous}::ModifierType)':
../../../ClangDiagnosticsEmitter.cpp:495:1: error: control reaches end of non-void function [-Werror=return-type]
}
^
Build failure triggered by git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@332799 91177308-0d34-0410-b5e6-96231b3b80d8
Alexey Bataev [Mon, 21 May 2018 16:40:32 +0000 (16:40 +0000)]
[OPENMP-SIMD] Fix PR37536: Fix definition of _OPENMP macro.
if `-fopenmp-simd` is specified alone, `_OPENMP` macro should not be
defined. If `-fopenmp-simd` is specified along with the `-fopenmp`,
`_OPENMP` macro should be defined with the value `201511`.
Daniil Fukalov [Mon, 21 May 2018 16:18:07 +0000 (16:18 +0000)]
[AMDGPU] fixes for lds f32 builtins
1. added restrictions to memory scope, order and volatile parameters
2. added custom processing for these builtins - currently is not used code,
needed to switch off GCCBuiltin link to the builtins (ongoing change to llvm
tree)
3. builtins renamed as requested
Serge Pavlov [Mon, 21 May 2018 16:09:54 +0000 (16:09 +0000)]
[CodeGen] Recognize more cases of zero initialization
If a variable has an initializer, codegen tries to build its value. If
the variable is large in size, building its value requires substantial
resources. It causes strange behavior from user viewpoint: compilation
of huge zero initialized arrays like:
char data_1[2147483648u] = { 0 };
consumes enormous amount of time and memory.
With this change codegen tries to determine if variable initializer is
equivalent to zero initializer. In this case variable value is not
constructed.
Handle attributes before checking the record layout (e.g. underalignment check
during `alignas` processing), as layout may be cached without taking into
account attributes that may affect it.
Pavel Labath [Mon, 21 May 2018 11:47:45 +0000 (11:47 +0000)]
[CodeGen] Disable aggressive structor optimizations at -O0, take 2
The first version of the patch (r332228) was flawed because it was
putting structors into C5/D5 comdats very eagerly. This is correct only
if we can ensure the comdat contains all required versions of the
structor (which wasn't the case). This version uses a more nuanced
approach:
- for local structor symbols we use an alias because we don't have to
worry about comdats or other compilation units.
- linkonce symbols are emitted separately, as we cannot guarantee we
will have all symbols we need to form a comdat (they are emitted
lazily, only when referenced).
- available_externally symbols are also emitted separately, as the code
seemed to be worried about emitting an alias in this case.
- other linkage types are not affected by the optimization level. They
either get put into a comdat (weak) or get aliased (external).
Craig Topper [Mon, 21 May 2018 06:07:49 +0000 (06:07 +0000)]
[X86] Remove some preprocessor feature checks from intrinsic headers
Summary:
These look to be a couple things that weren't removed when we switched to target attribute.
The popcnt makes including just smmintrin.h also include popcntintrin.h. The popcnt file itself already contains target attrributes.
The prefetch ones are just wrappers around __builtin_prefetch which we have graceful fallbacks for in the backend if the exact instruction isn't available. So there's no reason to hide them. And it makes them available in functions that have the write target attribute but not a -march command line flag.
Brian Gesiak [Sat, 19 May 2018 11:46:58 +0000 (11:46 +0000)]
[Driver] Loosen test for LLVM findNearest
Summary:
When https://reviews.llvm.org/D46776 landed to improve the behavior of
`llvm::OptTable::findNearest`, a PS4 buildbot began failing due to an
assertion that a suggestion "-debug-info-macro" should be provided for
the unrecognized option `clang -cc1as -debug-info-macros`. All other
buildbots succeeded in this check, and the PS4 buildbot succeeded in the
other `findNearest` tests.
Temporarily loosen this check in order to reland the `findNearest`
change.
JF Bastien [Sat, 19 May 2018 04:21:26 +0000 (04:21 +0000)]
CodeGen: block capture shouldn't ICE
When a lambda capture captures a __block in the same statement, the compiler asserts out because isCapturedBy assumes that an Expr can only be a BlockExpr, StmtExpr, or if it's a Stmt then all the statement's children are expressions. That's wrong, we need to visit all sub-statements even if they're not expressions to see if they also capture.
Fix this issue by pulling out the isCapturedBy logic to use RecursiveASTVisitor.
Eric Fiselier [Sat, 19 May 2018 03:12:04 +0000 (03:12 +0000)]
[Clang Tablegen][RFC] Allow Early Textual Substitutions in `Diagnostic` messages.
Summary:
There are cases where the same string or select is repeated verbatim in a lot of diagnostics. This can be a pain to maintain and update. Tablegen provides no way stash the common text somewhere and reuse it in the diagnostics, until now!
This patch allows diagnostic texts to contain `%sub{<definition-name>}`, where `<definition-name>` names a Tablegen record of type `TextSubstitution`. These substitutions are done early, before the diagnostic string is otherwise processed. All `%sub` modifiers will be replaced before the diagnostic definitions are emitted.
The substitution must specify all arguments used by the substitution, and modifier indexes in the substitution are re-numbered accordingly. For example:
```
def select_ovl_candidate : TextSubstitution<"%select{function|constructor}0%select{| template| %2}1">;
```
when used as
```
"candidate `%sub{select_ovl_candidate}3,2,1 not viable"
```
will act as if we wrote:
```
"candidate %select{function|constructor}3%select{| template| %1}2 not viable"
```
Sunil Srivastava [Fri, 18 May 2018 23:32:01 +0000 (23:32 +0000)]
Do not enable RTTI with -fexceptions, for PS4
NFC for targets other than PS4.
This patch is a change in behavior for PS4, in that PS4 will no longer enable
RTTI when -fexceptions is specified (RTTI and Exceptions are disabled by default
on PS4). RTTI will remain disabled except for types being thrown or caught.
Also, '-fexceptions -fno-rtti' (previously prohibited on PS4) is now accepted,
as it is for other targets.
This patch removes some PS4 specific code, making the code cleaner.
Also, in the test file rtti-options.cpp, PS4 tests where the behavior is the
same as the generic x86_64-linux are removed, making the test cleaner.
Petr Hosek [Fri, 18 May 2018 18:33:07 +0000 (18:33 +0000)]
[Support] Avoid normalization in sys::getDefaultTargetTriple
The return value of sys::getDefaultTargetTriple, which is derived from
-DLLVM_DEFAULT_TRIPLE, is used to construct tool names, default target,
and in the future also to control the search path directly; as such it
should be used textually, without interpretation by LLVM.
Normalization of this value may lead to unexpected results, for example
if we configure LLVM with -DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-linux-gnu,
normalization will transform that value to x86_64--linux-gnu. Driver will
use that value to search for tools prefixed with x86_64--linux-gnu- which
may be confusing. This is also inconsistent with the behavior of the
--target flag which is taken as-is without any normalization and overrides
the value of LLVM_DEFAULT_TARGET_TRIPLE.
Users of sys::getDefaultTargetTriple already perform their own
normalization as needed, so this change shouldn't impact existing logic.
MC: Change the streamer ctors to take an object writer instead of a stream. NFCI.
The idea is that a client that wants split dwarf would create a
specific kind of object writer that creates two files, and use it to
create the streamer.
Summary:
Previously, clang-format's parser would fail to annotate the
selector in a single-component Objective-C method invocation with
`TT_SelectorName`. For example, the following:
Eric Liu [Fri, 18 May 2018 14:16:37 +0000 (14:16 +0000)]
Move #include manipulation code to new lib/Tooling/Inclusions.
Summary:
clangToolingCore is linked into almost everything (incl. clang), but
not few tools need #include manipulation at this point. So pull this into a
separate library in Tooling.
This patch aims to match the changes introduced
in gcc by https://gcc.gnu.org/ml/gcc-cvs/2018-04/msg00534.html.
The -mibt feature flag is being removed, and the -fcf-protection
option now also defines a CET macro and causes errors when used
on non-X86 targets, while X86 targets no longer check for -mibt
and -mshstk to determine if -fcf-protection is supported. -mshstk
is now used only to determine availability of shadow stack intrinsics.
Reid Kleckner [Thu, 17 May 2018 18:12:18 +0000 (18:12 +0000)]
Fix a mangling failure on clang-cl C++17
MethodVFTableLocations in MigrosoftVTableContext contains canonicalized
decl. But, it's sometimes asked to lookup for non-canonicalized decl,
and that causes assertion failure, and compilation failure.
Walter Lee [Thu, 17 May 2018 18:04:39 +0000 (18:04 +0000)]
[sanitizer] Don't add --export-dynamic for Myriad
This is to work around a bug in some versions of gnu ld, where
--export-dynamic implies -shared even if -static is explicitly given.
Myriad supports static linking only, so --export-dynamic is never
needed.
Justin Lebar [Thu, 17 May 2018 16:15:07 +0000 (16:15 +0000)]
[CUDA] Allow "extern __shared__ Foo foo[]" within anon. namespaces.
Summary:
Previously this triggered a -Wundefined-internal warning. But it's not
an undefined variable -- any variable of this form is a pointer to the
base of GPU core's shared memory.
Nico Weber [Thu, 17 May 2018 15:26:37 +0000 (15:26 +0000)]
Fix __uuidof handling on non-type template parameter in C++17
Clang used to pass the base lvalue of a non-type template parameter
to the template instantiation phase when the base part is __uuidof
and it's running in C++17 mode.
However, that drops its LValuePath, and unintentionally transforms
&__uuidof(...) to __uuidof(...).
This CL fixes that by passing whole expr. Fixes PR24986.
https://reviews.llvm.org/D46820?id=146557
Patch from Taiju Tsuiki <tzik@chromium.org>!
Peter Smith [Thu, 17 May 2018 13:17:33 +0000 (13:17 +0000)]
[AArch64] Correct inline assembly test case for S modifier [NFC]
The existing test for the AArch64 inline assembly constraint S uses the
A and L modifiers. These modifiers were implemented in the original
AArch64 backend but were not carried forward to the merged backend. The
A is associated with ADRP and does nothing, the L is associated with
:lo12: . Given that A and L are not supported by GCC and not supported
by the new implementation of constraint S in LLVM (see D46745) I've
altered the test to put :lo12: directly in the string so that A and L
are not needed.
Jan Korous [Thu, 17 May 2018 11:51:49 +0000 (11:51 +0000)]
Use dotted format of version tuple for availability diagnostics
E. g. use "10.11" instead of "10_11".
We are maintaining backward compatibility by parsing underscore-delimited version tuples but no longer keep track of the separator and using dot format for output.