Akira Hatanaka [Sat, 27 Jan 2018 00:34:09 +0000 (00:34 +0000)]
[CodeGen] Use the non-virtual alignment when emitting the base
constructor.
Previously, clang would emit an over-aligned (16-byte) store to
initialize B::x in B's base constructor when compiling the following
code:
struct A {
__attribute__((aligned(16))) double data1;
};
struct B : public virtual A {
B() : x(123) {}
double a;
int x;
};
struct C : public virtual B {};
void test() { B b; C c; }
This was happening because the code in IRGen that does member
initialization was using the alignment of a complete object instead of
the non-virtual alignment.
Matt Davis [Sat, 27 Jan 2018 00:25:29 +0000 (00:25 +0000)]
Always allow "#pragma region".
Summary:
Both MS and PS4 targets are capable of recognizing the
existence of: #pragma region, #pragma endregion.
Since this pragma is only a hint for certain editors, and has no logic,
it seems helpful to permit this pragma in all cases, not just MS compatibility mode.
[index] Fix crash when indexing a C++14 PCH/module related to TemplateTemplateParmDecls of alias templates
TemplateTemplateParmDecls of alias templates ended-up serialized as 'file-level decls' which was causing a crash while trying to index a PCH/module file that contained them.
Commit makes sure TemplateTemplateParmDecls are not recorded as such kind of decls.
AST: support protocol conformances on id/class/interfaces in MS ABI
Add support for mangling ObjC protocol conformances in MS ABI as if they are
COM interfaces. By diverging from the itanium mangling of `objc_protocol`
prefixed names, this approach allows for a semi-reasonable, albeit of
questionable sanity, undecoration via existing tooling. There is also the
possibility of adding an extension and taking part of the namespace to add the
conformance via the `L` and `Z` "modifiers", but the existing tooling would not
be able to properly undecorated the symbol even though incidentally `undname`
currently produces something legible while wine's implementation is not able to
cope with the extension.
This allows for the disambiguation of overloads where the parameter differs
only in the protocol conformance of the ObjC type, e.g.
Martin Probst [Fri, 26 Jan 2018 15:07:49 +0000 (15:07 +0000)]
clang-format: [JS] Prevent ASI before [ and (.
Summary:
JavaScript automatic semicolon insertion can trigger before [ and (, so
avoid breaking before them if the previous token is likely to terminate
an expression.
Benjamin Kramer [Fri, 26 Jan 2018 14:14:11 +0000 (14:14 +0000)]
[AST] Use bit packing to reduce sizeof(TypedefNameDecl) from 88 to 80.
We can stash the cached transparent tag bit in existing pointer padding.
Everything coming out of ASTContext is always aligned to a multiple of
8, so we have 8 spare bits.
Alexey Bader [Fri, 26 Jan 2018 11:48:46 +0000 (11:48 +0000)]
[OpenCL] Add "cles_khr_int64" extension.
Summary:
For OpenCL 1.1 embedded profile 64 bit integers i.e. long,
ulong including the appropriate vector data types and operations
on 64-bit integers are optional. The "cles_khr_int64" extension
string will be reported if the embedded profile implementation
supports 64-bit integers.
Nico Weber [Thu, 25 Jan 2018 15:24:43 +0000 (15:24 +0000)]
[clang-cl] Add support for /arch:AVX512F and /arch:AVX512
For /arch:AVX512F:
clang-cl and cl.exe both defines __AVX512F__ __AVX512CD__.
clang-cl also defines __AVX512ER__ __AVX512PF__.
64-bit cl.exe also defines (according to /Bz) _NO_PREFETCHW.
For /arch:AVX512:
clang-cl and cl.exe both define
__AVX512F__ __AVX512CD__ __AVX512BW__ __AVX512DQ__ __AVX512VL__.
64-bit cl.exe also defines _NO_PREFETCHW.
So not 100% identical, but pretty close.
Also refactor the existing AVX / AVX2 code to not repeat itself in both the
32-bit and 64-bit cases.
Nico Weber [Thu, 25 Jan 2018 14:38:29 +0000 (14:38 +0000)]
clang-cl: Simplify handling of /arch: flag.
r213083 initially implemented /arch: support by mapping it to CPU features.
Then r241077 additionally mapped it to CPU, which made the feature flags
redundant (if harmless). This change here removes the redundant mapping to
feature flags, and rewrites test/Driver/cl-x86-flags.c to be a bit more of an
integration test that checks for preprocessor defines like AVX (like documented
on MSDN) instead of for driver flags.
To keep emitting warn_drv_unused_argument, use getLastArgNoClaim() followed by an explicit claim() if needed.
This is in preparation for adding support for /arch:AVX512(F).
[clang-format] Fixes indentation of inner text proto messages
Summary:
Consider the text proto:
```
message {
sub { key: value }
}
```
Previously the first `{` was TT_Unknown, which caused the inner message to be
indented by the continuation width. This didn't happen for:
```
message {
sub: { key: value }
}
```
This is because the code to mark the first `{` as a TT_DictLiteral was only
considering the case where it marches forward and reaches a `:`.
This patch updates this by looking not only for `:`, but also for `<` and `{`.
[analyzer] Do not attempt to get the pointee of void*
Do not attempt to get the pointee of void* while generating a bug report
(otherwise it will trigger an assert inside RegionStoreManager::getBinding
assert(!T->isVoidType() && "Attempting to dereference a void pointer!")).
Brian Gesiak [Wed, 24 Jan 2018 22:15:42 +0000 (22:15 +0000)]
[coroutines] Pass coro func args to promise ctor
Summary:
Use corutine function arguments to initialize a promise type, but only
if the promise type defines a constructor that takes those arguments.
Otherwise, fall back to the default constructor.
Artem Dergachev [Wed, 24 Jan 2018 21:24:10 +0000 (21:24 +0000)]
[analyzer] NFC: Run many existing C++ tests with a custom operator new().
In order to provide more test coverage for inlined operator new(), add more
run-lines to existing test cases, which would trigger our fake header
to provide a body for operator new(). Most of the code should still behave
reasonably. When behavior intentionally changes, #ifs are provided.
Artem Dergachev [Wed, 24 Jan 2018 20:59:40 +0000 (20:59 +0000)]
[analyzer] Enable c++-allocator-inlining by default.
This allows the analyzer to analyze ("inline") custom operator new() calls and,
even more importantly, inline constructors of objects that were allocated
by any operator new() - not necessarily a custom one.
All changes in the tests in the current commit are intended improvements,
even if they didn't carry any explicit FIXME flag.
It is possible to restore the old behavior via
-analyzer-config c++-allocator-inlining=false
(this flag is supported by scan-build as well, and it can be into a clang
--analyze invocation via -Xclang .. -Xclang ..). There is no intention to
remove the old behavior for now.
Artem Dergachev [Wed, 24 Jan 2018 20:32:26 +0000 (20:32 +0000)]
[analyzer] Assume that the allocated value is non-null before construction.
I.e. not after. In the c++-allocator-inlining=true mode, we need to make the
assumption that the conservatively evaluated operator new() has returned a
non-null value. Previously we did this on CXXNewExpr, but now we have to do that
before calling the constructor, because some clever constructors are sometimes
assuming that their "this" is null and doing weird stuff. We would also crash
upon evaluating CXXNewExpr when the allocator was inlined and returned null and
had a throw specification; this is UB even for custom allocators, but we still
need not to crash.
Added more FIXME tests to ensure that eventually we fix calling the constructor
for null return values.
Rafael Espindola [Wed, 24 Jan 2018 18:58:32 +0000 (18:58 +0000)]
Don't create hidden dllimport global values.
Hidden visibility is almost the opposite of dllimport. We were
producing them before (dllimport wins in the existing llvm
implementation), but now the llvm verifier produces an error.
Artem Belevich [Wed, 24 Jan 2018 17:41:02 +0000 (17:41 +0000)]
[CUDA] Disable PGO and coverage instrumentation in NVPTX.
NVPTX does not have runtime support necessary for profiling to work
and even call arc collection is prohibitively expensive. Furthermore,
there's no easy way to collect the samples. NVPTX also does not
support global constructors that clang generates if sample/arc collection
is enabled.
Wei Mi [Tue, 23 Jan 2018 23:27:57 +0000 (23:27 +0000)]
Adjust MaxAtomicInlineWidth for i386/i486 targets.
This is to fix the bug reported in https://bugs.llvm.org/show_bug.cgi?id=34347#c6.
Currently, all MaxAtomicInlineWidth of x86-32 targets are set to 64. However,
i386 doesn't support any cmpxchg related instructions. i486 only supports cmpxchg.
So in this patch MaxAtomicInlineWidth is reset as follows:
For i386, the MaxAtomicInlineWidth should be 0 because no cmpxchg is supported.
For i486, the MaxAtomicInlineWidth should be 32 because it supports cmpxchg.
For others 32 bits x86 cpu, the MaxAtomicInlineWidth should be 64 because of cmpxchg8b.
We would previously treat `SEL` as a pointer-only type. This is not the
case. It should be treated similarly to `id` and `Class`. Add some
test cases to ensure that it will be properly handled as well.
These symbols are supposed to be preserved even by the linker. Use the
`llvm.used` to ensure that the symbols are not removed by DCE in the
linker. This should be a no-op change on MachO since the symbols are
annotated as `no_dead_strip`.
George Karpenkov [Tue, 23 Jan 2018 19:28:52 +0000 (19:28 +0000)]
[analyzer] Show full analyzer invocation for reproducibility in HTML reports
Analyzing problems which appear in scan-build results can be very
difficult, as after the launch no exact invocation is stored, and it's
super-hard to launch the debugger.
With this patch, the exact analyzer invocation appears in the footer,
and can be copied to debug/check reproducibility/etc.
AST: adjust ObjC MS mangling to work with typedefs
Rather than hardcode the pointerness of the `id` and `class` types,
handle them generically. This allows for the template type
specialization of `remove_pointer<id>` which would look through the `id`
type and deal with the `objc_object` structure without the pointer.
Artem Belevich [Tue, 23 Jan 2018 19:08:18 +0000 (19:08 +0000)]
[CUDA] CUDA has no device-side library builtins.
We should (almost) never consider a device-side declaration to match a
library builtin functio. Otherwise clang may ignore the implementation
provided by the CUDA headers and emit clang's idea of the builtin.
The tests are targeting Windows but do not specify an environment. When
executed on Linux, they would use an ELF output rather than the COFF
output. Explicitly provide an environment.
Fedor Sergeev [Tue, 23 Jan 2018 12:24:01 +0000 (12:24 +0000)]
[Solaris] Make RHEL devtoolsets handling Linux-specific
Summary:
This patch is meant to address the last outstanding review comment on the already approved
(but not yet commited) https://reviews.llvm.org/D35755, namely making the handling of the RHEL
devtoolsets Linux-specific.
Don't know if it's best integrated into the former or applied subsequently.
Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu.
[clang-format] Ignore UnbreakableTailLength sometimes during breaking
Summary:
This patch fixes an issue where the UnbreakableTailLength would be counted towards
the length of a token during breaking, even though we can break after the token.
For example, this proto text with column limit 20
```
# ColumnLimit: 20 V
foo: {
bar: {
bazoo: "aaaaaaa"
}
}
```
was broken:
```
# ColumnLimit: 20 V
foo: {
bar: {
bazoo:
"aaaaaaa"
}
}
```
because the 2 closing `}` were counted towards the string literal's `UnbreakableTailLength`.
Volodymyr Sapsai [Mon, 22 Jan 2018 22:29:24 +0000 (22:29 +0000)]
Reland "[CodeGen] Fix crash when a function taking transparent union is redeclared."
When a function taking transparent union is declared as taking one of
union members earlier in the translation unit, clang would hit an
"Invalid cast" assertion during EmitFunctionProlog. This case
corresponds to function f1 in test/CodeGen/transparent-union-redecl.c.
We decided to cast i32 to union because after merging function
declarations function parameter type becomes int,
CGFunctionInfo::ArgInfo type matches with ABIArgInfo type, so we decide
it is a trivial case. But these types should also be castable to
parameter declaration type which is not the case here.
Now the fix is in converting from ABIArgInfo type to VarDecl type and using
argument demotion when necessary.
Additional tests in Sema/transparent-union.c capture current behavior and make
sure there are no regressions.
Chandler Carruth [Mon, 22 Jan 2018 22:05:25 +0000 (22:05 +0000)]
Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre..
Summary:
First, we need to explain the core of the vulnerability. Note that this
is a very incomplete description, please see the Project Zero blog post
for details:
https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html
The basis for branch target injection is to direct speculative execution
of the processor to some "gadget" of executable code by poisoning the
prediction of indirect branches with the address of that gadget. The
gadget in turn contains an operation that provides a side channel for
reading data. Most commonly, this will look like a load of secret data
followed by a branch on the loaded value and then a load of some
predictable cache line. The attacker then uses timing of the processors
cache to determine which direction the branch took *in the speculative
execution*, and in turn what one bit of the loaded value was. Due to the
nature of these timing side channels and the branch predictor on Intel
processors, this allows an attacker to leak data only accessible to
a privileged domain (like the kernel) back into an unprivileged domain.
The goal is simple: avoid generating code which contains an indirect
branch that could have its prediction poisoned by an attacker. In many
cases, the compiler can simply use directed conditional branches and
a small search tree. LLVM already has support for lowering switches in
this way and the first step of this patch is to disable jump-table
lowering of switches and introduce a pass to rewrite explicit indirectbr
sequences into a switch over integers.
However, there is no fully general alternative to indirect calls. We
introduce a new construct we call a "retpoline" to implement indirect
calls in a non-speculatable way. It can be thought of loosely as
a trampoline for indirect calls which uses the RET instruction on x86.
Further, we arrange for a specific call->ret sequence which ensures the
processor predicts the return to go to a controlled, known location. The
retpoline then "smashes" the return address pushed onto the stack by the
call with the desired target of the original indirect call. The result
is a predicted return to the next instruction after a call (which can be
used to trap speculative execution within an infinite loop) and an
actual indirect branch to an arbitrary address.
On 64-bit x86 ABIs, this is especially easily done in the compiler by
using a guaranteed scratch register to pass the target into this device.
For 32-bit ABIs there isn't a guaranteed scratch register and so several
different retpoline variants are introduced to use a scratch register if
one is available in the calling convention and to otherwise use direct
stack push/pop sequences to pass the target address.
This "retpoline" mitigation is fully described in the following blog
post: https://support.google.com/faqs/answer/7625886
We also support a target feature that disables emission of the retpoline
thunk by the compiler to allow for custom thunks if users want them.
These are particularly useful in environments like kernels that
routinely do hot-patching on boot and want to hot-patch their thunk to
different code sequences. They can write this custom thunk and use
`-mretpoline-external-thunk` *in addition* to `-mretpoline`. In this
case, on x86-64 thu thunk names must be:
```
__llvm_external_retpoline_r11
```
or on 32-bit:
```
__llvm_external_retpoline_eax
__llvm_external_retpoline_ecx
__llvm_external_retpoline_edx
__llvm_external_retpoline_push
```
And the target of the retpoline is passed in the named register, or in
the case of the `push` suffix on the top of the stack via a `pushl`
instruction.
There is one other important source of indirect branches in x86 ELF
binaries: the PLT. These patches also include support for LLD to
generate PLT entries that perform a retpoline-style indirection.
The only other indirect branches remaining that we are aware of are from
precompiled runtimes (such as crt0.o and similar). The ones we have
found are not really attackable, and so we have not focused on them
here, but eventually these runtimes should also be replicated for
retpoline-ed configurations for completeness.
For kernels or other freestanding or fully static executables, the
compiler switch `-mretpoline` is sufficient to fully mitigate this
particular attack. For dynamic executables, you must compile *all*
libraries with `-mretpoline` and additionally link the dynamic
executable and all shared libraries with LLD and pass `-z retpolineplt`
(or use similar functionality from some other linker). We strongly
recommend also using `-z now` as non-lazy binding allows the
retpoline-mitigated PLT to be substantially smaller.
When manually apply similar transformations to `-mretpoline` to the
Linux kernel we observed very small performance hits to applications
running typical workloads, and relatively minor hits (approximately 2%)
even for extremely syscall-heavy applications. This is largely due to
the small number of indirect branches that occur in performance
sensitive paths of the kernel.
When using these patches on statically linked applications, especially
C++ applications, you should expect to see a much more dramatic
performance hit. For microbenchmarks that are switch, indirect-, or
virtual-call heavy we have seen overheads ranging from 10% to 50%.
However, real-world workloads exhibit substantially lower performance
impact. Notably, techniques such as PGO and ThinLTO dramatically reduce
the impact of hot indirect calls (by speculatively promoting them to
direct calls) and allow optimized search trees to be used to lower
switches. If you need to deploy these techniques in C++ applications, we
*strongly* recommend that you ensure all hot call targets are statically
linked (avoiding PLT indirection) and use both PGO and ThinLTO. Well
tuned servers using all of these techniques saw 5% - 10% overhead from
the use of retpoline.
We will add detailed documentation covering these components in
subsequent patches, but wanted to make the core functionality available
as soon as possible. Happy for more code review, but we'd really like to
get these patches landed and backported ASAP for obvious reasons. We're
planning to backport this to both 6.0 and 5.0 release streams and get
a 5.0 release with just this cherry picked ASAP for distros and vendors.
This patch is the work of a number of people over the past month: Eric, Reid,
Rui, and myself. I'm mailing it out as a single commit due to the time
sensitive nature of landing this and the need to backport it. Huge thanks to
everyone who helped out here, and everyone at Intel who helped out in
discussions about how to craft this. Also, credit goes to Paul Turner (at
Google, but not an LLVM contributor) for much of the underlying retpoline
design.
Ilya Biryukov [Mon, 22 Jan 2018 17:18:28 +0000 (17:18 +0000)]
[CodeComplete] Fix completion in the middle of idents in macro calls
Summary:
This patch removes IdentifierInfo from completion token after remembering
the identifier in the preprocessor.
Prior to this patch, completion token had the IdentifierInfo set to null when
completing at the start of identifier and to the II for completion prefix
when in the middle of identifier.
This patch unifies how code completion token is handled when it is insterted
before the identifier and in the middle of the identifier.
The actual IdentifierInfo can still be obtained from the Preprocessor.
Raphael Isemann [Mon, 22 Jan 2018 15:27:25 +0000 (15:27 +0000)]
[modules] Correctly overload getModule in the MultiplexExternalSemaSource
Summary:
The MultiplexExternalSemaSource doesn't correctly overload the `getModule` function,
causing the multiplexer to not forward this call as intended.
Devin Coughlin [Sat, 20 Jan 2018 23:11:17 +0000 (23:11 +0000)]
[analyzer] Provide a check name when MallocChecker enables CStringChecker
Fix an assertion failure caused by a missing CheckName. The malloc checker
enables "basic" support in the CStringChecker, which causes some CString
bounds checks to be enabled. In this case, make sure that we have a
valid CheckName for the BugType.
Craig Topper [Sat, 20 Jan 2018 18:36:06 +0000 (18:36 +0000)]
[X86] Put the code that defines __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16 for the preprocessor with the other __GCC_HAVE_SYNC_COMPARE_AND_SWAP_* defines. NFC
Kamil Rytarowski [Sat, 20 Jan 2018 01:03:45 +0000 (01:03 +0000)]
Link sanitized programs on NetBSD with -lkvm
Summary:
kvm - kernel memory interface
The kvm(3) functions like kvm_open(), kvm_getargv() or kvm_getenvv()
are used in programs that can request information about a kernel and
its processes. The LLVM sanitizers will make use of them on NetBSD.
Volodymyr Sapsai [Fri, 19 Jan 2018 23:41:47 +0000 (23:41 +0000)]
[Lex] Fix crash on code completion in comment in included file.
This fixes PR32732 by updating CurLexerKind to reflect available lexers.
We were hitting null pointer in Preprocessor::Lex because CurLexerKind
was CLK_Lexer but CurLexer was null. And we set it to null in
Preprocessor::HandleEndOfFile when exiting a file with code completion
point.
To reproduce the crash it is important for a comment to be inside a
class specifier. In this case in Parser::ParseClassSpecifier we improve
error recovery by pushing a semicolon token back into the preprocessor
and later on try to lex a token because we haven't reached the end of
file.
Also clang crashes only on code completion in included file, i.e. when
IncludeMacroStack is not empty. Though we reset CurLexer even if include
stack is empty. The difference is that during pushing back a semicolon
token, preprocessor calls EnterCachingLexMode which decides it is
already in caching mode because various lexers are null and
IncludeMacroStack is not empty. As the result, CurLexerKind remains
CLK_Lexer instead of updating to CLK_CachingLexer.
Richard Trieu [Fri, 19 Jan 2018 20:46:19 +0000 (20:46 +0000)]
Allow BlockDecl in CXXRecord scope to have no access specifier.
Using a BlockDecl in a default member initializer causes it to be attached to
CXXMethodDecl without its access specifier being set. This prevents a crash
where getAccess is called on this BlockDecl, since that method expects any
Decl in CXXRecord scope to have an access specifier.
Don Hinton [Fri, 19 Jan 2018 18:31:12 +0000 (18:31 +0000)]
[cmake] Also pass CMAKE_ASM_COMPILER_ID to next stage when bootstrapping
Summary:
When setting CMAKE_ASM_COMPILER=clang, we also need to set
CMAKE_ASM_COMPILER_ID=Clang.
This is needed because cmake won't set CMAKE_ASM_COMPILER_ID if
CMAKE_ASM_COMPILER is already set.
Without CMAKE_ASM_COMPILER_ID, cmake can't set
CMAKE_ASM_COMPILER_OPTIONS_TARGET either, which means
CMAKE_ASM_COMPILER_TARGET is ignored, causing cross compiling to fail,
i.e., `--target=${CMAKE_ASM_COMPILER_TARGET}` isn't passed.
Daniel Neilson [Fri, 19 Jan 2018 17:12:54 +0000 (17:12 +0000)]
Change memcpy/memove/memset to have dest and source alignment attributes (Step 1).
Summary:
Upstream LLVM is changing the the prototypes of the @llvm.memcpy/memmove/memset
intrinsics. This change updates the Clang tests for this change.
The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument
which is required to be a constant integer. It represents the alignment of the
dest (and source), and so must be the minimum of the actual alignment of the
two.
This change removes the alignment argument in favour of placing the alignment
attribute on the source and destination pointers of the memory intrinsic call.
For example, code which used to read:
call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false)
will now read
call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false)
At this time the source and destination alignments must be the same (Step 1).
Step 2 of the change, to be landed shortly, will relax that contraint and allow
the source and destination to have different alignments.
Sanjay Patel [Fri, 19 Jan 2018 15:14:51 +0000 (15:14 +0000)]
[CodeGenCXX] annotate a GEP to a derived class with 'inbounds' (PR35909)
The standard says:
[expr.static.cast] p11: "If the prvalue of type “pointer to cv1 B” points to a B
that is actually a subobject of an object of type D, the resulting pointer points
to the enclosing object of type D. Otherwise, the behavior is undefined."
Therefore, the GEP must be inbounds.
This should solve the failure to optimize away a null check shown in PR35909:
https://bugs.llvm.org/show_bug.cgi?id=35909