CodeGen: Avoid dereferencing end() in ScalarExprEmitter::EmitOverflowCheckedBinOp
Use BB.getNextNode(), which returns nullptr on end(), instead of
&*BB.getIterator(), which is UB on end().
CodeGenFunction::createBasicBlock expects nullptr in this case already.
Richard Smith [Wed, 17 Aug 2016 01:05:07 +0000 (01:05 +0000)]
If possible, set the stack rlimit to at least 8MiB on cc1 startup, and work
around a Linux kernel bug where the actual amount of available stack may be a
*lot* lower than the rlimit.
GCC also sets a higher stack rlimit on startup, but it goes all the way to
64MiB. We can increase this limit if it proves necessary.
The kernel bug is as follows: Linux kernels prior to version 4.1 may choose to
map the process's heap as little as 128MiB before the process's stack for a PIE
binary, even in a 64-bit virtual address space. This means that allocating more
than 128MiB before you reach the process's stack high water mark can lead to
crashes, even if you don't recurse particularly deeply.
We work around the kernel bug by touching a page deep within the stack (after
ensuring that we know how big it is), to preallocate virtual address space for
the stack so that the kernel doesn't allow the brk() area to wander into it,
when building clang as a Linux PIE binary.
Chris Bieneman [Tue, 16 Aug 2016 22:16:29 +0000 (22:16 +0000)]
[CMake] Workflow improvements to PGO generation
This patch adds a few new convenience options used by the PGO CMake cache to setup options on bootstrap stages. The new options are:
PGO_INSTRUMENT_LTO - Builds the instrumented and final builds with LTO
PGO_BUILD_CONFIGURATION - Accepts a CMake cache script that can be used for complex configuration of the stage2-instrumented and stage2 builds.
The patch also includes a fix for bootstrap dependencies so that the instrumented LTO tools don't get used when building the final stage, and it adds distribution targets to the passthrough.
Adrian McCarthy [Tue, 16 Aug 2016 22:11:18 +0000 (22:11 +0000)]
Emit debug info for dynamic classes if they are imported from a DLL.
With -debug-info-kind=limited, we omit debug info for dynamic classes that live in other TUs. This reduces duplicate type information. When statically linked, the type information comes together. But if your binary has a class derived from a base in a DLL, the base class info is not available to the debugger.
The decision is made in shouldOmitDefinition (CGDebugInfo.cpp). Per a suggestion from rnk, I've tweaked the decision so that we do include definitions for classes marked as DLL imports. This should be a relatively small number of classes, so we don't pay a large price for duplication of the type info, yet it should cover most cases on Windows.
Essentially this makes debug info for DLLs independent, but we still assume that all TUs within the same DLL will be consistently built with (or without) debug info and the debugger will be able to search across the debug info within that scope to resolve any declarations into definitions, etc.
Erik Pilkington [Tue, 16 Aug 2016 17:44:11 +0000 (17:44 +0000)]
[ObjC] Warn on unguarded use of partial declaration
This commit adds a traversal of the AST after Sema of a function that diagnoses
unguarded references to declarations that are partially available (based on
availability attributes). This traversal is only done when we would otherwise
emit -Wpartial-availability.
This commit is part of a feature I proposed here:
http://lists.llvm.org/pipermail/cfe-dev/2016-July/049851.html
Aaron Ballman [Tue, 16 Aug 2016 14:48:39 +0000 (14:48 +0000)]
Reduce the number of allocations required for AST attributes. In test cases, the max resident memory changed from 65760k to 64476k which is 1.9% improvement. Allocations in grow_pod changed from 8847 to 4872 according to tcmalloc heap profiler. Overall running time remained the same.
Samuel Antao [Tue, 16 Aug 2016 14:31:39 +0000 (14:31 +0000)]
Add empty --gcc-toolchain empty to cuda-detect test.
Unless we overload the default gcc toolchain with an empty string
the system root used in the tests will be ignored if the user builds
clang with a custom gcc toolchain.
Richard Smith [Tue, 16 Aug 2016 00:13:47 +0000 (00:13 +0000)]
PR28978: If we need overload resolution for the move constructor of an
anonymous union member of a class, we need overload resolution for the move
constructor of the class itself too; we can't rely on Sema to do the right
thing for us for anonymous union types.
Justin Lebar [Mon, 15 Aug 2016 23:00:49 +0000 (23:00 +0000)]
[CUDA] Raise an error if a wrong-side call is codegen'ed.
Summary:
Some function calls in CUDA are allowed to appear in
semantically-correct programs but are an error if they're ever
codegen'ed. Specifically, a host+device function may call a host
function, but it's an error if such a function is ever codegen'ed in
device mode (and vice versa).
Previously, clang made no attempt to catch these errors. For the most
part, they would be caught by ptxas, and reported as "call to unknown
function 'foo'".
Now we catch these errors and report them the same as we report other
illegal calls (e.g. a call from a host function to a device function).
This has a small change in error-message behavior for calls that were
previously disallowed (e.g. calls from a host to a device function).
Previously, we'd catch disallowed calls fairly early, before doing
additional semantic checking e.g. of the call's arguments. Now we catch
these illegal calls at the very end of our semantic checks, so we'll
only emit a "illegal CUDA call" error if the call is otherwise
well-formed.
Manman Ren [Mon, 15 Aug 2016 21:05:00 +0000 (21:05 +0000)]
Objective-C diagnostics: isObjCNSObjectType should check through AttributedType.
For the following example:
typedef __attribute__((NSObject)) CGColorRef ColorAttrRef;
@property (strong, nullable) ColorAttrRef color;
The property type should be ObjC NSObject type and the compiler should not emit
error: property with 'retain (or strong)' attribute must be of object type
Justin Lebar [Mon, 15 Aug 2016 20:38:56 +0000 (20:38 +0000)]
Add the notion of deferred diagnostics.
Summary:
This patch lets you create diagnostics that are emitted if and only if a
particular FunctionDecl is codegen'ed.
This is necessary for CUDA, where some constructs -- e.g. calls from
host+device functions to host functions when compiling for device -- are
allowed to appear in semantically-correct programs, but only if they're
never codegen'ed.
Justin Lebar [Mon, 15 Aug 2016 20:38:52 +0000 (20:38 +0000)]
[CUDA] Include CUDA headers before anything else.
Summary:
There's no point to --cuda-path if we then go and include /usr/include
first. And if you install the right packages, Ubuntu will install (very
old) CUDA headers there.
David Majnemer [Mon, 15 Aug 2016 07:20:40 +0000 (07:20 +0000)]
[CodeGen] Ignore unnamed bitfields before handling vector fields
We processed unnamed bitfields after our logic for non-vector field
elements in records larger than 128 bits. The vector logic would
determine that the bit-field disqualifies the record from occupying a
register despite the unnamed bit-field not participating in the record
size nor its alignment.
Richard Smith [Mon, 15 Aug 2016 02:37:43 +0000 (02:37 +0000)]
cxx_status: mark decomposition declarations as "partial": the implementation is
essentially complete, other than parts where design questions have been raised
(lambda capture, decomposition of arrays by copy).
Richard Smith [Sun, 14 Aug 2016 23:15:52 +0000 (23:15 +0000)]
Explicitly generate a reference variable to hold the initializer for a
tuple-like decomposition declaration. This significantly simplifies the
semantics of BindingDecls for AST consumers (they can now always be evalated
at the point of use).
Currently, when parsing umbrella directories, we use
vfs::recursive_directory_iterator to gather the header files to generate the
equivalent modules for. If the external contents for a header does not exist,
we currently are unable to build a module, since the VFS
vfs::recursive_directory_iterator will fail when it finds an entry without a
reliable real path.
Since the YAML file could be prepared ahead of time and shared among
different compiler invocations, an entry might not yet have a reliable
path in 'external-contents', breaking the iteration.
Give the VFS the capability to skip such entries whenever
'ignore-non-existent-contents' property is set in the YAML file.
Zachary Turner [Fri, 12 Aug 2016 17:47:52 +0000 (17:47 +0000)]
[Driver] Set the default driver mode based on the executable.
Currently, if --driver-mode is not passed at all, it will default
to GCC style driver. This is never an issue for clang because
it manually constructs a --driver-mode option and passes it.
However, we should still try to do as good as we can even if no
--driver-mode is passed. LibTooling, for example, does not pass
a --driver-mode option and while it could, it seems like we should
still fallback to the best possible default we can.
This is one of two steps necessary to get clang-tidy working on Windows.
Taking the address of a packed member is dangerous since the reduced
alignment of the pointee is lost. This can lead to memory alignment
faults in some architectures if the pointer value is dereferenced.
This change adds a new warning to clang emitted when taking the address
of a packed member. A packed member is either a field/data member
declared as attribute((packed)) or belonging to a struct/class
declared as such. The associated flag is -Waddress-of-packed-member.
Conversions (either implicit or via a valid casting) to pointer types
with lower or equal alignment requirements (e.g. void* or char*)
will silence the warning.
[Sema] Fix a crash on variadic enable_if functions.
Currently, when trying to evaluate an enable_if condition, we try to
evaluate all arguments a user passes to a function. Given that we can't
use variadic arguments from said condition anyway, not converting them
is a reasonable thing to do. So, this patch makes us ignore any varargs
when attempting to check an enable_if condition.
We'd crash because, in order to convert an argument, we need its
ParmVarDecl. Variadic arguments don't have ParmVarDecls.
Currently, when parsing umbrella directories, we use
vfs::recursive_directory_iterator to gather the header files to generate the
equivalent modules for. If the external contents for a header does not exist,
we currently are unable to build a module, since the VFS
vfs::recursive_directory_iterator will fail when it finds an entry without a
reliable real path.
Since the YAML file could be prepared ahead of time and shared among
different compiler invocations, an entry might not yet have a reliable
path in 'external-contents', breaking the iteration.
Give the VFS the capability to skip such entries whenever
'ignore-non-existent-contents' property is set in the YAML file.
[VFS] Add 'ignore-non-existent-contents' field to YAML files
Add 'ignore-non-existent-contents' to tell the VFS whether an invalid path
obtained via 'external-contents' should cause iteration on the VFS to stop.
If 'true', the VFS should ignore the entry and continue with the next. Allows
YAML files to be shared across multiple compiler invocations regardless of
prior existent paths in 'external-contents'. This global value is overridable
on a per-file basis.
This adds the parsing and write test part, but use by VFS comes next.
Richard Smith [Thu, 11 Aug 2016 22:25:46 +0000 (22:25 +0000)]
P0217R3: Perform semantic checks and initialization for the bindings in a
decomposition declaration for arrays, aggregate-like structs, tuple-like
types, and (as an extension) for complex and vector types.
Ed Schouten [Thu, 11 Aug 2016 20:03:22 +0000 (20:03 +0000)]
Don't enable PIE on i686-unknown-cloudabi.
We're only going to provide support for using PIE on architectures that
provide PC-relative addressing. i686 is not one of those, so add the
necessary bits for only passing in -pie -zrelro conditionally.
Ed Schouten [Thu, 11 Aug 2016 19:23:30 +0000 (19:23 +0000)]
Pass in frame pointer omitting compiler flags for CloudABI as well.
On Linux we pass in -fomit-frame-pointer flags (and similar)
automatically if optimization is enabled. Let's do the same thing on
CloudABI. Without this, Clang seems to run out of registers quite
quickly while trying to build code with inline assembly.
Devin Coughlin [Thu, 11 Aug 2016 18:41:29 +0000 (18:41 +0000)]
[analyzer] Teach RetainCountChecker about CVFooRetain
Change the retain count checker to treat CoreFoundation-style "CV"-prefixed
reference types from CoreVideo similarly to CoreGraphics types. With this
change, we treat CVFooRetain() on a CVFooRef type as a retain. CVFooRelease()
APIs are annotated as consuming their parameter, so this change prevents false
positives about incorrect decrements of reference counts.
[Sema] Add more strict check for sizeof diagnostics for bzero
Follow-up from r278264 after Joerg's feedback.
Since bzero is not standard, be more strict: also check if the first
argument is a pointer, which harden the check for when it does not come
originally from a builtin.
Chris Bieneman [Thu, 11 Aug 2016 00:19:51 +0000 (00:19 +0000)]
[Order Files] Don't use empty order files
LD64 does optimization on symbol layouts that gets disabled whenever an order file is passed (even if it is empty). This change prevents disabling that optimization, and still enables iterative generation and usage of order files.
If the order file is empty it does not setup the order file flags, instead it sets the empty order file as a configuration dependency. When the order file changes it will then trigger a re-configuration that adds the linker flag.
Reapply r277787. For memset (and others) we can get diagnostics like:
struct stat { int x; };
void foo(struct stat *stamps) {
bzero(stamps, sizeof(stamps));
memset(stamps, 0, sizeof(stamps));
}
t.c:7:28: warning: 'memset' call operates on objects of type 'struct stat' while the size is based on a different type 'struct stat *' [-Wsizeof-pointer-memaccess]
memset(stamps, 0, sizeof(stamps));
~~~~~~ ^~~~~~
t.c:7:28: note: did you mean to dereference the argument to 'sizeof' (and multiply it by the number of elements)?
memset(stamps, 0, sizeof(stamps));
^~~~~~
This patch implements the same class of warnings for bzero.
Eric Liu [Wed, 10 Aug 2016 09:32:23 +0000 (09:32 +0000)]
Make clang-format remove duplicate headers when sorting #includes.
Summary: When sorting #includes, #include directives that have the same text will be deduplicated when sorting #includes, and only the first #include in the duplicate #includes remains. If the `Cursor` is provided and put on a deleted #include, it will be put on the remaining #include in the duplicate #includes.
Chandler Carruth [Wed, 10 Aug 2016 07:32:47 +0000 (07:32 +0000)]
[x86] Fix a really nasty bug introduced in r276417 where alignment
constraints were added to _mm256_broadcast_{pd,ps} intel intrinsics.
The spec for these intrinics is ... pretty much silent on alignment.
This is especially frustrating considering the amount of discussion of
alignment in the load and store instrinsics. So I was forced to rely on
the specification for the VBROADCASTF128 instruction.
That instruction's spec is *also* completely silent on alignment.
Fortunately, when it comes to the instruction's spec, silence is enough.
There is no #GP fault option for an underaligned address so this
instruction, and by inference the intrinsic, can read any alignment.
As it happens, the old code worked exactly this way and in fact we have
plenty of code that hands pointers with less than 16-byte alignment to
these intrinsics. This code broke pretty spectacularly with this commit.
Fortunately, the fix is super simple! Change a 16 to a 1, and ta da!
Anyways, a lot of debugging for a really boring fix. =]
Justin Lebar [Wed, 10 Aug 2016 01:09:18 +0000 (01:09 +0000)]
[CUDA] Print a "previous-decl" note when calling an illegal member fn.
Summary:
When we emit err_ref_bad_target, we should emit a "'method' declared
here" note. We already do so in most places, just not in
BuildCallToMemberFunction.
Justin Lebar [Wed, 10 Aug 2016 01:09:14 +0000 (01:09 +0000)]
[CUDA] Add __device__ overloads for placement new and delete.
Summary:
Previously these sort of worked because they didn't end up resulting in
calls at the ptx layer. But I'm adding stricter checks that break
placement new without these changes.
Tim Shen [Tue, 9 Aug 2016 20:22:55 +0000 (20:22 +0000)]
[ADT] Change iterator_adaptor_base's default template arguments to forward more underlying typedefs
Summary:
The corresponding LLVM change: D23217.
LazyVector::iterator breaks, because int isn't an iterator type.
Since iterator_adaptor_base shouldn't be blamed to break at the call to
iterator_traits<int>::xxx, I'd rather "fix" LazyVector::iterator.
The perfect solution is to model "relative pointer", but it's beyond the goal of this patch.
Let the driver pass the option to frontend. Do not set precision metadata for division instructions when this option is set. Set function attribute "correctly-rounded-divide-sqrt-fp-math" based on this option.
Yaxun Liu [Tue, 9 Aug 2016 19:43:38 +0000 (19:43 +0000)]
[OpenCL][AMDGPU] Add support for -cl-denorms-are-zero
Adjust target features for amdgcn target when -cl-denorms-are-zero is set.
Denormal support is controlled by feature strings fp32-denormals fp64-denormals in amdgcn target. If -cl-denorms-are-zero is not set and the command line does not set fp32/64-denormals feature string, +fp32-denormals +fp64-denormals will be on for GPU's supporting them.
A new virtual function virtual void TargetInfo::adjustTargetOptions(const CodeGenOptions &CGOpts, TargetOptions &TargetOpts) const is introduced to allow adjusting target option by codegen option.
Reid Kleckner [Tue, 9 Aug 2016 17:23:56 +0000 (17:23 +0000)]
[clang-cl] Make -gline-tables-only imply -gcodeview
It's surprising that you have to pass /Z7 in addition to -gcodeview to
get debug info. The sanitizer runtime, for example, expects that if the
compiler supports the -gline-tables-only flag, then it will emit debug
info.