Taiju Tsuiki [Wed, 30 May 2018 03:53:16 +0000 (03:53 +0000)]
Update NRVO logic to support early return
Summary:
The previous implementation misses an opportunity to apply NRVO (Named Return Value
Optimization) below. That discourages user to write early return code.
```
struct Foo {};
Foo f(bool b) {
if (b)
return Foo();
Foo oo;
return oo;
}
```
That is, we can/should apply RVO for a local variable if:
* It's directly returned by at least one return statement.
* And, all reachable return statements in its scope returns the variable directly.
While, the previous implementation disables the RVO in a scope if there are multiple return
statements that refers different variables.
On the new algorithm, local variables are in NRVO_Candidate state at first, and a return
statement changes it to NRVO_Disabled for all visible variables but the return statement refers.
Then, at the end of the function AST traversal, NRVO is enabled for variables in NRVO_Candidate
state and refers from at least one return statement.
Sema: Add a flag for rejecting member pointers with incomplete base types.
Codebases that need to be compatible with the Microsoft ABI can pass
this flag to avoid issues caused by the lack of a fixed ABI for
incomplete member pointers.
Richard Smith [Wed, 30 May 2018 01:52:16 +0000 (01:52 +0000)]
Make the mangled name collision diagnostic a bit more useful by listing the mangling.
This helps especially when the collision is for a template specialization,
where the template arguments are not available from anywhere else in the
diagnostic, and are likely relevant to the problem.
Eric Fiselier [Wed, 30 May 2018 01:00:41 +0000 (01:00 +0000)]
[Sema] Use %sub to cleanup overload diagnostics
Summary:
This patch adds the newly added `%sub` diagnostic modifier to cleanup repetition in the overload candidate diagnostics.
I think this should be good to go.
@rsmith: Some of the notes now emit `function template` where they only said `function` previously. It seems OK to me, but I would like your sign off on it.
Yaxun Liu [Wed, 30 May 2018 00:49:10 +0000 (00:49 +0000)]
Add action builder for HIP
To support separate compile/link and linking across device IR in different source files,
a new HIP action builder is introduced. Basically it compiles/links host and device
code separately, and embed fat binary in host linking stage through linker script.
Richard Trieu [Tue, 29 May 2018 22:43:00 +0000 (22:43 +0000)]
Check pointer null-ness before dereferencing it.
-Warc-repeated-use-of-weak may trigger a segmentation fault when the Decl
being checked is outside of a function scope, leaving the current function
info pointer null. This adds a check before using the function info.
Petr Hosek [Tue, 29 May 2018 22:35:39 +0000 (22:35 +0000)]
[Driver] Rename DefaultTargetTriple to TargetTriple
While this value is initialized with the DefaultTargetTriple, it
can be later overriden using the -target flag so TargetTriple is
a more accurate name. This change also provides an accessor which
could be accessed from ToolChain implementations.
Akira Hatanaka [Tue, 29 May 2018 18:28:49 +0000 (18:28 +0000)]
[CodeGen][Darwin] Set the calling-convention of thread-local variable
initialization functions to 'cxx_fast_tlscc'.
This fixes a bug where instructions calling initialization functions for
thread-local static members of c++ template classes were using calling
convention 'cxx_fast_tlscc' while the called functions weren't annotated
with the calling convention.
David Stenberg [Tue, 29 May 2018 13:07:58 +0000 (13:07 +0000)]
Fix emission of phony dependency targets when adding extra deps
Summary:
This commit fixes a bug where passing extra dependency entries
(using -fdepfile-entry) would result in -MP incorrectly emitting
a phony target for the input file, and no phony target for the
first extra dependency.
The extra dependencies are added first to the filename vector in
DFGImpl. That clashed with the emission of the phony targets, as
the code assumed that the first index always correspond to the
input file.
Craig Topper [Tue, 29 May 2018 03:26:38 +0000 (03:26 +0000)]
[X86] Merge the 3 different flavors of masked vpermi2var/vpermt2var builtins to a single version without masking. Use select builtins with appropriate operand instead.
Craig Topper [Sat, 26 May 2018 18:57:41 +0000 (18:57 +0000)]
Revert r333347 "[X86] Rewrite the max and min reduction intrinsics to make better use of other functions and to reduce width to 256 and 128 bits were possible."
Craig Topper [Sat, 26 May 2018 18:55:24 +0000 (18:55 +0000)]
[X86] Rewrite the max and min reduction intrinsics to make better use of other functions and to reduce width to 256 and 128 bits were possible.
Summary:
We only need to use 512 bit vectors all the way through v8i64 reductions since those max instructions are new to avx512f and only available in 512 bits until SKX.
For v16i32 and floating point we have legacy 128/256 bit instructions we can use.
I've tried to use other intrinsics to reduce the verbosity of the code and avoid having to mention all the shuffles. I've also removed all the -1 shuffle indices so the output sequence is fully specified and not left to backend optimization.
David Bolvansky [Sat, 26 May 2018 09:24:00 +0000 (09:24 +0000)]
[ClangDiagnostics] Silence warning about fallthrough after PrintFatalError
Summary:
ClangDiagnosticsEmitter.cpp:1047:57: warning: this statement may fall through [-Wimplicit-fallthrough=]
Builder.PrintFatalError("Unknown modifier type: " + Modifier);
~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~
ClangDiagnosticsEmitter.cpp:1048:5: note: here
case MT_Select: {
^
Paul Robinson [Fri, 25 May 2018 20:59:29 +0000 (20:59 +0000)]
[DebugInfo] Don't bother with MD5 checksums of preprocessed files.
The checksum will not reflect the real source, so there's no clear
reason to include them in the debug info. Also this was causing a
crash on the DWARF side.
Petr Hosek [Fri, 25 May 2018 20:39:37 +0000 (20:39 +0000)]
[Support] Avoid normalization in sys::getDefaultTargetTriple
The return value of sys::getDefaultTargetTriple, which is derived from
-DLLVM_DEFAULT_TRIPLE, is used to construct tool names, default target,
and in the future also to control the search path directly; as such it
should be used textually, without interpretation by LLVM.
Normalization of this value may lead to unexpected results, for example
if we configure LLVM with -DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-linux-gnu,
normalization will transform that value to x86_64--linux-gnu. Driver will
use that value to search for tools prefixed with x86_64--linux-gnu- which
may be confusing. This is also inconsistent with the behavior of the
--target flag which is taken as-is without any normalization and overrides
the value of LLVM_DEFAULT_TARGET_TRIPLE.
Users of sys::getDefaultTargetTriple already perform their own
normalization as needed, so this change shouldn't impact existing logic.
Alexey Bataev [Fri, 25 May 2018 20:16:03 +0000 (20:16 +0000)]
[OPENMP, NVPTX] Fixed codegen for orphaned parallel region.
If orphaned parallel region is found, the next code must be emitted:
```
if(__kmpc_is_spmd_exec_mode() || __kmpc_parallel_level(loc, gtid))
Serialized execution.
else if (IsMasterThread())
Prepare and signal worker.
else
Outined function call.
```
JF Bastien [Fri, 25 May 2018 17:36:49 +0000 (17:36 +0000)]
Follow-up fix for nonnull atomic non-member functions
Handling of the third parameter was only checking for *_n and not for the C11 variant, which means that cmpxchg of a 'desired' 0 value was erroneously warning. Handle C11 properly, and add extgensive tests for this as well as NULL pointers in a bunch of places.
Kristof Umann [Fri, 25 May 2018 13:18:38 +0000 (13:18 +0000)]
[analyzer] Added template argument lists to the Pathdiagnostic output
Because template parameter lists were not displayed
in the plist output, it was difficult to decide in
some cases whether a given checker found a true or a
false positive. This patch aims to correct this.
Ivan Donchevskii [Fri, 25 May 2018 12:56:26 +0000 (12:56 +0000)]
Optionally add code completion results for arrow instead of dot
Currently getting such completions requires source correction, reparsing
and calling completion again. And if it shows no results and rollback is
required then it costs one more reparse.
With this change it's possible to get all results which can be later
filtered to split changes which require correction.
JF Bastien [Fri, 25 May 2018 00:07:09 +0000 (00:07 +0000)]
Make atomic non-member functions as nonnull
Summary:
As a companion to libc++ patch https://reviews.llvm.org/D47225, mark builtin atomic non-member functions which accept pointers as nonnull.
The atomic non-member functions accept pointers to std::atomic / std::atomic_flag as well as to the non-atomic value. These are all dereferenced unconditionally when lowered, and therefore will fault if null. It's a tiny gotcha for new users, especially when they pass in NULL as expected value (instead of passing a pointer to a NULL value).
Richard Smith [Thu, 24 May 2018 20:03:51 +0000 (20:03 +0000)]
Improve diagnostics for config mismatches with -fmodule-file.
Unless the user uses -Wno-module-file-config-mismatch (or -Wno-error=...),
allow the AST reader to produce errors describing the nature of the config
mismatch.
Ben Langmuir [Thu, 24 May 2018 16:25:40 +0000 (16:25 +0000)]
[bash-completion] Fix tab separation on macOS
We have a regex that needs to match a tab character in the command
output, but on macOS sed doesn't support '\t', causing it to split on
the 't' character instead. Fix by having bash expand the \t first.
Jacek Olesiak [Thu, 24 May 2018 10:50:36 +0000 (10:50 +0000)]
[clang-format] Fix putting ObjC message arguments in one line for multiline receiver
Summary:
Reapply reverted changes from D46879.
Currently BreakBeforeParameter is set to true everytime message receiver spans multiple lines, e.g.:
```
[[object block:^{
return 42;
}] aa:42 bb:42];
```
will be formatted:
```
[[object block:^{
return 42;
}] aa:42
bb:42];
```
even though arguments could fit into one line. This change fixes this behavior.
Test Plan:
make -j12 FormatTests && tools/clang/unittests/Format/FormatTests
Gabor Marton [Thu, 24 May 2018 08:41:07 +0000 (08:41 +0000)]
[ASTImporter] Add unit tests for structural equivalence
Summary:
This patch add new tests for structural equivalence. For that a new
common header is created which holds the test related language specific
types and functions.
Eric Christopher [Thu, 24 May 2018 06:00:50 +0000 (06:00 +0000)]
Migrate libcalls-fno-builtin.c test from checking optimized assembly
to checking for attributes on the call site - and fix up builtin
functions that we were testing for but not ensuring wouldn't be
optimized by the backend.
Leave one set of asm tests to make sure that we're also communicating
builtin-ness to TLI.
Steven Wu [Thu, 24 May 2018 01:01:43 +0000 (01:01 +0000)]
[Sema][ObjC] Do not DiagnoseUseOfDecl in LookupMemberExpr
Summary:
Remove the call to DiagnoseUseOfDecl in LookupMemberExpr because:
1. LookupMemberExpr eagerly lookup both getter and setter, reguardless
if they are used or not. It causes wrong diagnostics if you are only
using getter.
2. LookupMemberExpr only diagnoses getter, but not setter.
3. ObjCPropertyOpBuilder already DiagnoseUseOfDecl when building getter
and setter. Doing it again in LookupMemberExpr causes duplicated
diagnostics.
Richard Smith [Wed, 23 May 2018 21:18:00 +0000 (21:18 +0000)]
Rework __builtin_classify_type support to better match GCC and to not assert on
unusual types.
Following the observed behavior of GCC, we now return -1 for vector types
(along with all of our extensions that GCC doesn't support), and for atomic
types we classify the underlying type.
GCC appears to have changed its classification for function and array arguments
between version 5 and version 6. Previously it would classify them as pointers
in C and as functions or arrays in C++, but from version 6 onwards, it
classifies them as pointers. We now follow the more recent GCC behavior rather
than emulating what I can only assume to be a historical bug in their C++
support for this builtin.
Finally, no version of GCC that I can find has ever used the "method"
classification for C++ pointers to member functions. Instead, GCC classifies
them as record types, presumably reflecting an internal implementation detail,
but whatever the reason we now produce compatible results.
Raphael Isemann [Wed, 23 May 2018 20:59:46 +0000 (20:59 +0000)]
[modules] Mark __wmmintrin_pclmul.h/__wmmintrin_aes.h as textual
Summary:
Since clang r332929 these two headers throw errors when included from somewhere else than their wrapper header. It seems marking them as textual is the best way to fix the builds.
Fixes this new module build error:
While building module '_Builtin_intrinsics' imported from ...:
In file included from <module-includes>:2:
In file included from lib/clang/7.0.0/include/immintrin.h:54:
In file included from lib/clang/7.0.0/include/wmmintrin.h:29:
lib/clang/7.0.0/include/__wmmintrin_aes.h:25:2: error: "Never use <__wmmintrin_aes.h> directly; include <wmmintrin.h> instead."
#error "Never use <__wmmintrin_aes.h> directly; include <wmmintrin.h> instead."
Craig Topper [Wed, 23 May 2018 18:32:58 +0000 (18:32 +0000)]
[X86] Move all Intel defined intrinsic includes into immintrin.h
This matches the Intel documentation which shows them available by importing immintrin.h. x86intrin.h also includes immintrin.h so anyone including x86intrin.h will still get them.
This is different than gcc, but I don't think we were a perfect match there already. I'm unclear what gcc's policy is about how they choose which to add things to.
[clang-format] Break template declarations followed by comments
Summary:
This patch fixes two bugs in clang-format where the template wrapper doesn't skip over
comments causing a long template declaration to not be split into multiple lines.
These were latent and exposed by r332436.
Gabor Marton [Wed, 23 May 2018 13:53:36 +0000 (13:53 +0000)]
Fix duplicate class template definitions problem
Summary:
We fail to import a `ClassTemplateDecl` if the "To" context already
contains a definition and then a forward decl. This is because
`localUncachedLookup` does not find the definition. This is not a
lookup error, the parser behaves differently than assumed in the
importer code. A `DeclContext` contains one DenseMap (`LookupPtr`)
which maps names to lists. The list is a special list `StoredDeclsList`
which is optimized to have one element. During building the initial
AST, the parser first adds the definition to the `DeclContext`. Then
during parsing the second declaration (the forward decl) the parser
again calls `DeclContext::addDecl` but that will not add a new element
to the `StoredDeclsList` rarther it simply overwrites the old element
with the most recent one. This patch fixes the error by finding the
definition in the redecl chain. Added tests for the same issue with
`CXXRecordDecl` and with `ClassTemplateSpecializationDecl`. These tests
pass and they pass because in `VisitRecordDecl` and in
`VisitClassTemplateSpecializationDecl` we already use
`D->getDefinition()` after the lookup.
Hans Wennborg [Wed, 23 May 2018 08:24:01 +0000 (08:24 +0000)]
Revert r333044 "Use zeroinitializer for (trailing zero portion of) large array initializers"
It caused asserts, see PR37560.
> Use zeroinitializer for (trailing zero portion of) large array initializers
> more reliably.
>
> Clang has two different ways it emits array constants (from InitListExprs and
> from APValues), and both had some ability to emit zeroinitializer, but neither
> was able to catch all cases where we could use zeroinitializer reliably. In
> particular, emitting from an APValue would fail to notice if all the explicit
> array elements happened to be zero. In addition, for large arrays where only an
> initial portion has an explicit initializer, we would emit the complete
> initializer (which could be huge) rather than emitting only the non-zero
> portion. With this change, when the element would have a suffix of more than 8
> zero elements, we emit the array constant as a packed struct of its initial
> portion followed by a zeroinitializer constant for the trailing zero portion.
>
> In passing, I found a bug where SemaInit would sometimes walk the entire array
> when checking an initializer that only covers the first few elements; that's
> fixed here to unblock testing of the rest.
>
> Differential Revision: https://reviews.llvm.org/D47166
Craig Topper [Wed, 23 May 2018 05:51:52 +0000 (05:51 +0000)]
[X86] In the floating point max reduction intrinsics, negate infinity before feeding it to set1.
Previously we negated the whole vector after splatting infinity. But its better to negate the infinity before splatting. This generates IR with the negate already folded with the infinity constant.
Craig Topper [Wed, 23 May 2018 04:51:54 +0000 (04:51 +0000)]
[X86] Remove mask argument from more builtins that are handled completely in CGBuiltin.cpp. Just wrap a select builtin around them in the header file instead.
Richard Smith [Wed, 23 May 2018 00:09:29 +0000 (00:09 +0000)]
Use zeroinitializer for (trailing zero portion of) large array initializers
more reliably.
Clang has two different ways it emits array constants (from InitListExprs and
from APValues), and both had some ability to emit zeroinitializer, but neither
was able to catch all cases where we could use zeroinitializer reliably. In
particular, emitting from an APValue would fail to notice if all the explicit
array elements happened to be zero. In addition, for large arrays where only an
initial portion has an explicit initializer, we would emit the complete
initializer (which could be huge) rather than emitting only the non-zero
portion. With this change, when the element would have a suffix of more than 8
zero elements, we emit the array constant as a packed struct of its initial
portion followed by a zeroinitializer constant for the trailing zero portion.
In passing, I found a bug where SemaInit would sometimes walk the entire array
when checking an initializer that only covers the first few elements; that's
fixed here to unblock testing of the rest.
Sanjay Patel [Tue, 22 May 2018 23:02:13 +0000 (23:02 +0000)]
[CodeGen] use nsw negation for builtin abs
The clang builtins have the same semantics as the stdlib functions.
The stdlib functions are defined in section 7.20.6.1 of the C standard with:
"If the result cannot be represented, the behavior is undefined."
That lets us mark the negation with 'nsw' because "sub i32 0, INT_MIN" would
be UB/poison.
Craig Topper [Tue, 22 May 2018 22:19:19 +0000 (22:19 +0000)]
[X86] As mentioned in post-commit feedback in D47174, move the 128 bit f16c intrinsics into f16cintrin.h and remove __emmintrin_f16c.h
These were included in emmintrin.h to match Intel Intrinsics Guide documentation. But this is because icc is capable of emulating them on targets that don't support F16C using library calls. Clang/LLVM doesn't have this emulation support. So it makes more sense to include them in immintrin.h instead.
I've left a comment behind to hopefully deter someone from trying to move them again in the future.
Craig Topper [Tue, 22 May 2018 20:48:24 +0000 (20:48 +0000)]
[X86] Remove mask argument from some builtins that are handled completely in CGBuiltin.cpp. Just wrap a select builtin around them in the header file instead.
Craig Topper [Tue, 22 May 2018 18:54:19 +0000 (18:54 +0000)]
[X86] Move 128-bit f16c intrinsics to __emmintrin_f16c.h include from emmintrin.h. Move 256-bit f16c intrinsics back to f16cintrin.h
Intel documents the 128-bit versions as being in emmintrin.h and the 256-bit version as being in immintrin.h.
This patch makes a new __emmtrin_f16c.h to hold the 128-bit versions to be included from emmintrin.h. And makes the existing f16cintrin.h contain the 256-bit versions and include it from immintrin.h with an error if its included directly.