Justin Bogner [Tue, 14 Oct 2014 00:40:55 +0000 (00:40 +0000)]
Frontend: Extract SerializedDiagnosticReader out of CXLoadedDiagnostic (NFC)
We currently read serialized diagnostics directly in the C API, which
makes it difficult to reuse this logic elsewhere. This extracts the
core of the serialized diagnostic parsing logic into a base class that
can be subclassed using a visitor pattern.
Alexey Samsonov [Mon, 13 Oct 2014 23:59:00 +0000 (23:59 +0000)]
Sanitize upcasts and conversion to virtual base.
This change adds UBSan check to upcasts. Namely, when we
perform derived-to-base conversion, we:
1) check that the pointer-to-derived has suitable alignment
and underlying storage, if this pointer is non-null.
2) if vptr-sanitizer is enabled, and we perform conversion to
virtual base, we check that pointer-to-derived has a matching vptr.
Objective-C [Sema]. Fixes a bug in comparing qualified
Objective-C pointer types. In this case, checker incorrectly
claims incompatible pointer types if redundant protocol conformance
is specified. rdar://18491222
Samuel Benzaquen [Mon, 13 Oct 2014 18:17:11 +0000 (18:17 +0000)]
Fix order of evaluation bug in DynTypedMatcher::constructVariadic().
Fix order of evaluation bug in DynTypedMatcher::constructVariadic().
If it evaluates right-to-left, the vector gets moved before we read the
kind from it.
Samuel Benzaquen [Mon, 13 Oct 2014 17:38:12 +0000 (17:38 +0000)]
Fix bug in DynTypedMatcher::constructVariadic() that would cause false negatives.
Summary:
Change r219118 fixed the bug for anyOf and eachOf, but it is still
present for unless.
The variadic wrapper doesn't have enough information to know how to
restrict the type. Different operators handle restrict failures in
different ways.
Bradley Smith [Mon, 13 Oct 2014 10:16:06 +0000 (10:16 +0000)]
[AArch64] Add workaround for Cortex-A53 erratum (835769)
Some early revisions of the Cortex-A53 have an erratum (835769) whereby it is
possible for a 64-bit multiply-accumulate instruction in AArch64 state to
generate an incorrect result. The details are quite complex and hard to
determine statically, since branches in the code may exist in some
circumstances, but all cases end with a memory (load, store, or prefetch)
instruction followed immediately by the multiply-accumulate operation.
The safest work-around for this issue is to make the compiler avoid emitting
multiply-accumulate instructions immediately after memory instructions and the
simplest way to do this is to insert a NOP.
This patch implements clang options to enable this workaround in the backend.
The work-around code generation is not enabled by default.
Alexey Bataev [Mon, 13 Oct 2014 08:23:51 +0000 (08:23 +0000)]
[OPENMP] Codegen for 'num_threads' clause in 'parallel' directive.
This patch generates call to "kmpc_push_num_threads(ident_t *loc, kmp_int32 global_tid, kmp_int32 num_threads);" library function before calling "kmpc_fork_call" each time there is an associated "num_threads" clause in the "omp parallel" directive.
Differential Revision: http://reviews.llvm.org/D5145
Alexey Bataev [Mon, 13 Oct 2014 06:02:40 +0000 (06:02 +0000)]
[OPENMP] Codegen for 'if' clause in 'parallel' directive.
Adds codegen for 'if' clause. Currently only for 'if' clause used with the 'parallel' directive.
If condition evaluates to true, the code executes parallel version of the code by calling __kmpc_fork_call(loc, 1, microtask, captured_struct/*context*/), where loc - debug location, 1 - number of additional parameters after "microtask" argument, microtask - is outlined finction for the code associated with the 'parallel' directive, captured_struct - list of variables captured in this outlined function.
If condition evaluates to false, the code executes serial version of the code by executing the following code:
Where loc - debug location, global_thread_id - global thread id, returned by __kmpc_global_thread_num() call or passed as a first parameter in microtask() call, global_thread_id.addr - address of the variable, where stored global_thread_id value, zero.addr - implicit bound thread id (should be set to 0 for serial call), microtask() and captured_struct are the same as in parallel call.
Also this patch checks if the condition is constant and if it is constant it evaluates its value and then generates either parallel version of the code (if the condition evaluates to true), or the serial version of the code (if the condition evaluates to false).
Differential Revision: http://reviews.llvm.org/D4716
Tyler Nowicki [Sun, 12 Oct 2014 20:46:07 +0000 (20:46 +0000)]
Allow constant expressions in pragma loop hints.
Previously loop hints such as #pragma loop vectorize_width(#) required a constant. This patch allows a constant expression to be used as well. Such as a non-type template parameter or an expression (2 * c + 1).
Chandler Carruth [Sat, 11 Oct 2014 11:03:30 +0000 (11:03 +0000)]
[complex] Teach the other two binary operators on complex numbers (==
and !=) to support mixed complex and real operand types.
This requires removing an assert from SemaChecking, and adding support
both to the constant evaluator and the code generator to synthesize the
imaginary part when needed. This seemed somewhat cleaner than having
just the comparison operators force real-to-complex conversions.
I've added test cases for these operations. I'm really terrified that
there were *no* tests in-tree which exercised this.
This turned up when trying to build R after my change to the complex
type lowering.
Chandler Carruth [Sat, 11 Oct 2014 09:24:41 +0000 (09:24 +0000)]
[complex] Use the much more powerful EmitCall routine to call libcalls
for complex math.
This should fix the windows build bots that started having trouble here
and generally fix complex libcall emission on targets which use sret for
complex data types. It also makes the code a bit simpler (despite
calling into a much more complex bucket of code).
Chandler Carruth [Sat, 11 Oct 2014 00:57:18 +0000 (00:57 +0000)]
[complex] Teach Clang to preserve different-type operands to arithmetic
operators where one type is a C complex type, and to emit both the
efficient and correct implementation for complex arithmetic according to
C11 Annex G using this extra information.
For both multiply and divide the old code was writing a long-hand
reduced version of the math without any of the special handling of inf
and NaN recommended by the standard here. Instead of putting more
complexity here, this change does what GCC does which is to emit
a libcall for the fully general case.
However, the old code also failed to do the proper minimization of the
set of operations when there was a mixed complex and real operation. In
those cases, C provides a spec for much more minimal operations that are
valid. Clang now emits the exact suggested operations. This change isn't
*just* about performance though, without minimizing these operations, we
again lose the correct handling of infinities and NaNs. It is critical
that this happen in the frontend based on assymetric type operands to
complex math operations.
The performance implications of this change aren't trivial either. I've
run a set of benchmarks in Eigen, an open source mathematics library
that makes heavy use of complex. While a few have slowed down due to the
libcall being introduce, most sped up and some by a huge amount: up to
100% and 140%.
In order to make all of this work, also match the algorithm in the
constant evaluator to the one in the runtime library. Currently it is
a broken port of the simplifications from C's Annex G to the long-hand
formulation of the algorithm.
Splitting this patch up is very hard because none of this works without
the AST change to preserve non-complex operands. Sorry for the enormous
change.
Follow-up changes will include support for sinking the libcalls onto
cold paths in common cases and fastmath improvements to allow more
aggressive backend folding.
Richard Smith [Sat, 11 Oct 2014 00:37:16 +0000 (00:37 +0000)]
[modules] When instantiating a class member, don't expect to find the previous
declaration in the instantiation if the previous declaration came from another
definition of the class template that got merged into the pattern definition.
Bob Wilson [Fri, 10 Oct 2014 23:10:10 +0000 (23:10 +0000)]
Treat -mios-simulator-version-min option as an alias for -mios-version-min.
We can safely rely on the architecture to distinguish iOS device builds from
iOS simulator builds. We already have code to do that, in fact. This simplifies
some of the error checking for the option handling.
Justin Bogner [Fri, 10 Oct 2014 22:20:26 +0000 (22:20 +0000)]
Correctly handle reading locations from serialized diagnostics
When reading a serialized diagnostic location with no file ID, we were
failing to increment the cursor past the rest of the location. This
would lead to the flags and category always appearing blank in such
diagnostics.
This changes the function to unconditionally increment the cursor and
updates the test to check for the correct output instead of testing
that we were doing this wrong. I've also updated the error check to
check for the correct number of fields.
Objective-C [qoi]. When reporting that a property is not
auto synthesized because it is synthesized in its super
class. locate property declaration in super class
which will default synthesize the property. rdar://18488727
Bob Wilson [Fri, 10 Oct 2014 19:38:34 +0000 (19:38 +0000)]
Remove a FIXME: use the ios_simulator_version_min linker option consistently.
This was previously only used when explicitly requested with a command line
option because it had to work with some old versions of the linker when it
was first introduced. That is ancient history now, and it should be safe to
use the correct option even when using the IPHONEOS_DEPLOYMENT_TARGET
environment variable to specify that the target is the iOS simulator.
Besides updating the test for this, I also added a few more tests for the
iOS linker options.
Alexey Bataev [Fri, 10 Oct 2014 18:58:13 +0000 (18:58 +0000)]
Bugfix for predefined expressions in dependent context.
This bug break compilation with precompiled headers and predefined expressions in dependent context.
It's possible to construct cases where the first field we are trying to
copy is in the middle of an IR field. In some complicated cases, we
would fail to use an appropriate offset inside the object. Earlier
builds of clang seemed to miscompile the code by copying an insufficient
number of bytes. Up until now, we would assert: the copying offset was
insufficiently aligned.
John McCall [Fri, 10 Oct 2014 18:44:34 +0000 (18:44 +0000)]
Change how we distinguish bitfield widths, in-class
initializers, and captured VLA types so that we can
answer questions like "is this a bit-field" without
looking at the enclosing DeclContext. NFC.
Bill Schmidt [Fri, 10 Oct 2014 15:09:43 +0000 (15:09 +0000)]
[PowerPC] Add feature for Power8 vector extensions
The current VSX feature for PowerPC specifies availability of the VSX
instructions added with the 2.06 architecture version. With 2.07, the
architecture adds new instructions to both the Category:Vector and
Category:VSX instruction sets. Additionally, unaligned vector storage
operations have improved performance.
This patch adds a feature to provide access to the new instructions
and performance capabilities of Power8. For compatibility with GCC,
the feature is controlled via a new -mpower8-vector switch, and the
feature causes the __POWER8_VECTOR__ builtin define to be generated by
the preprocessor.
There is a companion patch for llvm being committed at the same time.
Alexey Bataev [Fri, 10 Oct 2014 12:19:54 +0000 (12:19 +0000)]
Code reformatting and improvement for OpenMP.
Moved CGOpenMPRegionInfo from CGOpenMPRuntime.h to CGOpenMPRuntime.cpp file and reworked the code for this change. Also added processing of ThreadID variable passed as an argument in outlined functions in parallel and task directives.
Alexey Bataev [Fri, 10 Oct 2014 09:48:26 +0000 (09:48 +0000)]
Code improvements in OpenMP CodeGen.
This patch makes class OMPPrivateScope a common class for all private variables. Reworked processing of firstprivate variables (now it is based on OMPPrivateScope too).
Bob Wilson [Fri, 10 Oct 2014 03:12:15 +0000 (03:12 +0000)]
Remove support for the IOS_SIMULATOR_DEPLOYMENT_TARGET env var.
It turns out that this was never used. Instead we just use the
IPHONEOS_DEPLOYMENT_TARGET variable for both iOS devices and simulator.
rdar://problem/18596744
Dan Albert [Fri, 10 Oct 2014 01:01:29 +0000 (01:01 +0000)]
PR21195: Emit .gcno files to the proper location.
When building with coverage, -no-integrated-as, and -c, the driver was
emitting -cc1 -coverage-file pointing at a file in /tmp. Ensure the
coverage file is emitted in the same directory as the output file.
Reid Kleckner [Fri, 10 Oct 2014 00:05:45 +0000 (00:05 +0000)]
Promote null pointer constants used as arguments to variadic functions
Make it possible to pass NULL through variadic functions on 64-bit
Windows targets. The Visual C++ headers define NULL to 0, when they
should define it to 0LL on Win64 so that NULL is a pointer-sized
integer.
Fix completion logic to allow for heterogeneous argument types in matcher overloads.
Summary:
There was an assumption that there were no matchers that were overloaded
on matchers and other types of arguments.
This assumption was broken recently with the addition of new matcher
overloads.
Objective-C SDK modernization. import Foundation even
when a previous definition of NS_OPTION is available
; e.g. from a pch. enhancement to rdar://18498550
Special case 0 and 1 matcher in makeAllOfComposite().
Summary:
Remove unnecessary wrapping for the 0 and 1 matcher cases of
makeAllOfComposite(). We don't need a variadic wrapper for those cases.
Refactor TrueMatcher to take advandage of the new conversions between
DynTypedMatcher and Matcher<T>. Also, make it a singleton.
This change improves our clang-tidy related benchmarks by ~12%.
Add experimental clang/driver flag -fsanitize-address-field-padding=N
Summary:
This change adds an experimental flag -fsanitize-address-field-padding=N (0, 1, 2)
to clang and driver. With this flag ASAN will be able to detect some cases of
intra-object-overflow bugs,
see https://code.google.com/p/address-sanitizer/wiki/IntraObjectOverflow
There is no actual functionality here yet, just the flag parsing.
The functionality is being reviewed at http://reviews.llvm.org/D5687
Test Plan: Build and run SPEC, LLVM Bootstrap, Chrome with this flag.
Alexey Bataev [Thu, 9 Oct 2014 08:45:04 +0000 (08:45 +0000)]
Fix for bug http://llvm.org/PR17427.
Assertion failed: "Computed __func__ length differs from type!"
Reworked PredefinedExpr representation with internal StringLiteral field for function declaration.
Differential Revision: http://reviews.llvm.org/D5365
Alexey Bataev [Thu, 9 Oct 2014 04:18:56 +0000 (04:18 +0000)]
[OPENMP] 'omp teams' directive basic support.
Includes parsing and semantic analysis for 'omp teams' directive support from OpenMP 4.0. Adds additional analysis to 'omp target' directive with 'omp teams' directive.
Replace a destructor of EHCleanupScope with a Destroy() method to reflect the current usage.
Summary:
The current code uses memset to re-initialize EHCleanupScope objects
with breaks the assumptions of the upcoming asan's intra-object-overflow checker.
If there is no DTOR, the new checker will refuse to work.
Alexey Bataev [Wed, 8 Oct 2014 14:01:46 +0000 (14:01 +0000)]
[OPENMP] Codegen for 'firstprivate' clause.
This patch generates some helper variables that used as private copies of the corresponding original variables inside an OpenMP 'parallel' directive. These generated variables are initialized by copy using values of the original variables (with the copy constructor, if any). For arrays, initializator is generated for single element and in the codegen procedure this initial value is automatically propagated between all elements of the private copy.
In outlined function, references to original variables are replaced by the references to these private helper variables. At the end of the initialization of the private variables an implicit barier is generated by calling __kmpc_barrier(...) runtime function to be sure that all threads were initialized using original values of the variables.
Differential Revision: http://reviews.llvm.org/D5140
Remove threshold on object size for inserting lifetime begin / end
Boostrapping LLVM+Clang+LLDB without threshold on object size for
lifetime markers insertion has shown there was no significant change
in compile time, so let the stack slot colorizer do its optimization
for all slots.
Alexey Bataev [Wed, 8 Oct 2014 11:35:04 +0000 (11:35 +0000)]
[OPENMP] Codegen for 'firstprivate' clause.
This patch generates some helper variables that used as private copies of the corresponding original variables inside an OpenMP 'parallel' directive. These generated variables are initialized by copy using values of the original variables (with the copy constructor, if any). For arrays, initializator is generated for single element and in the codegen procedure this initial value is automatically propagated between all elements of the private copy.
In outlined function, references to original variables are replaced by the references to these private helper variables. At the end of the initialization of the private variables an implicit barier is generated by calling __kmpc_barrier(...) runtime function to be sure that all threads were initialized using original values of the variables.
Differential Revision: http://reviews.llvm.org/D5140
Alexey Bataev [Wed, 8 Oct 2014 10:42:55 +0000 (10:42 +0000)]
[OPENMP] Codegen for 'firstprivate' clause.
This patch generates some helper variables that used as private copies of the corresponding original variables inside an OpenMP 'parallel' directive. These generated variables are initialized by copy using values of the original variables (with the copy constructor, if any). For arrays, initializator is generated for single element and in the codegen procedure this initial value is automatically propagated between all elements of the private copy.
In outlined function, references to original variables are replaced by the references to these private helper variables. At the end of the initialization of the private variables an implicit barier is generated by calling __kmpc_barrier(...) runtime function to be sure that all threads were initialized using original values of the variables.
Differential Revision: http://reviews.llvm.org/D5140
Renato Golin [Wed, 8 Oct 2014 09:06:45 +0000 (09:06 +0000)]
Revert "[OPENMP] 'omp teams' directive basic support. Includes parsing and semantic analysis for 'omp teams' directive support from OpenMP 4.0. Adds additional analysis to 'omp target' directive with 'omp teams' directive."
This reverts commit r219197 because it broke ARM self-hosting buildbots with
segmentation fault errors in many tests.
Reid Kleckner [Wed, 8 Oct 2014 01:07:54 +0000 (01:07 +0000)]
Fix IRGen for referencing a static local before emitting its decl
Summary:
Previously CodeGen assumed that static locals were emitted before they
could be accessed, which is true for automatic storage duration locals.
However, it is possible to have CodeGen emit a nested function that uses
a static local before emitting the function that defines the static
local, breaking that assumption.
Fix it by creating the static local upon access and ensuring that the
deferred function body gets emitted. We may not be able to emit the
initializer properly from outside the function body, so don't try.
Fixes PR18020. See also previous attempts to fix static locals in
PR6769 and PR7101.
Objective-C SDK modernization. When modernizing to
use NS_ENUM/NS_OPTIONS macros, add an import of
Foundation.h (or its module) as necessary.
rdar://18498550
Alexey Bataev [Tue, 7 Oct 2014 10:13:33 +0000 (10:13 +0000)]
[OPENMP] 'omp teams' directive basic support.
Includes parsing and semantic analysis for 'omp teams' directive support from OpenMP 4.0. Adds additional analysis to 'omp target' directive with 'omp teams' directive.
[OPENMP] Small refactoring of EmitOMPSimdLoop helper routine.
No functional changes intended.
Renamed EmitOMPSimdLoop to EmitOMPInnerLoop, I plan to re-use
it to emit inner loop in the future patches for CodeGen of the
worksharing loop directives (omp for, omp for simd).