Chandler Carruth [Sat, 11 Oct 2014 09:24:41 +0000 (09:24 +0000)]
[complex] Use the much more powerful EmitCall routine to call libcalls
for complex math.
This should fix the windows build bots that started having trouble here
and generally fix complex libcall emission on targets which use sret for
complex data types. It also makes the code a bit simpler (despite
calling into a much more complex bucket of code).
Chandler Carruth [Sat, 11 Oct 2014 00:57:18 +0000 (00:57 +0000)]
[complex] Teach Clang to preserve different-type operands to arithmetic
operators where one type is a C complex type, and to emit both the
efficient and correct implementation for complex arithmetic according to
C11 Annex G using this extra information.
For both multiply and divide the old code was writing a long-hand
reduced version of the math without any of the special handling of inf
and NaN recommended by the standard here. Instead of putting more
complexity here, this change does what GCC does which is to emit
a libcall for the fully general case.
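To illustrate why the long-hand formula is not enough, here is a minimal sketch of the textbook complex multiply applied to an infinite operand. The `Cplx` struct and `naiveMul` are hypothetical names used only for this example; they are not Clang's code or the runtime library's routine.

```cpp
#include <cmath>
#include <limits>

// Hypothetical stand-in for a C complex double.
struct Cplx {
  double re;
  double im;
};

// Long-hand multiply: (a + bi)(c + di) = (ac - bd) + (ad + bc)i.
// It has no special handling for inf or NaN operands.
Cplx naiveMul(Cplx a, Cplx b) {
  return {a.re * b.re - a.im * b.im, a.re * b.im + a.im * b.re};
}

// (1 + 0i) * (inf + inf*i): each 0 * inf term yields NaN, so both
// parts of the result become NaN even though one operand is infinite.
Cplx naiveInfExample() {
  const double inf = std::numeric_limits<double>::infinity();
  return naiveMul({1.0, 0.0}, {inf, inf});
}
```

Annex G expects a product with an infinite operand to stay infinite, which is the kind of case the commit message says is deferred to a libcall.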
However, the old code also failed to do the proper minimization of the
set of operations when there was a mixed complex and real operation. In
those cases, C provides a spec for much more minimal operations that are
valid. Clang now emits the exact suggested operations. This change isn't
*just* about performance though: without minimizing these operations, we
again lose the correct handling of infinities and NaNs. It is critical
that this happen in the frontend based on asymmetric type operands to
complex math operations.
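The difference between the minimal mixed-type operation and a promote-then-multiply approach can be sketched as follows. `MixedCplx`, `scaleByReal`, and `mulPromoted` are hypothetical names for illustration only; the sketch assumes IEEE floating point without fast-math options.

```cpp
#include <cmath>
#include <limits>

// Hypothetical stand-in for a C complex double.
struct MixedCplx {
  double re;
  double im;
};

// Minimal form for complex * real: scale each part by the real operand.
MixedCplx scaleByReal(MixedCplx z, double r) {
  return {z.re * r, z.im * r};
}

// Promote the real operand to (r + 0i) first, then use the long-hand
// complex multiply. The extra 0 * inf term can poison the result.
MixedCplx mulPromoted(MixedCplx z, double r) {
  MixedCplx w{r, 0.0};
  return {z.re * w.re - z.im * w.im, z.re * w.im + z.im * w.re};
}

// Helper returning the operand (inf + 1i) used to compare the two forms.
MixedCplx infinitePlusI() {
  return {std::numeric_limits<double>::infinity(), 1.0};
}
```

For `(inf + 1i) * 2.0` the minimal form keeps the imaginary part at 2, while the promoted form computes `inf * 0 + 1 * 2` and ends up with NaN there.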
The performance implications of this change aren't trivial either. I've
run a set of benchmarks in Eigen, an open source mathematics library
that makes heavy use of complex. While a few have slowed down due to the
libcall being introduced, most sped up and some by a huge amount: up to
100% and 140%.
In order to make all of this work, also match the algorithm in the
constant evaluator to the one in the runtime library. Currently it is
a broken port of the simplifications from C's Annex G to the long-hand
formulation of the algorithm.
Splitting this patch up is very hard because none of this works without
the AST change to preserve non-complex operands. Sorry for the enormous
change.
Follow-up changes will include support for sinking the libcalls onto
cold paths in common cases and fastmath improvements to allow more
aggressive backend folding.
Richard Smith [Sat, 11 Oct 2014 00:37:16 +0000 (00:37 +0000)]
[modules] When instantiating a class member, don't expect to find the previous
declaration in the instantiation if the previous declaration came from another
definition of the class template that got merged into the pattern definition.
Bob Wilson [Fri, 10 Oct 2014 23:10:10 +0000 (23:10 +0000)]
Treat -mios-simulator-version-min option as an alias for -mios-version-min.
We can safely rely on the architecture to distinguish iOS device builds from
iOS simulator builds. We already have code to do that, in fact. This simplifies
some of the error checking for the option handling.
Justin Bogner [Fri, 10 Oct 2014 22:20:26 +0000 (22:20 +0000)]
Correctly handle reading locations from serialized diagnostics
When reading a serialized diagnostic location with no file ID, we were
failing to increment the cursor past the rest of the location. This
would lead to the flags and category always appearing blank in such
diagnostics.
This changes the function to unconditionally increment the cursor and
updates the test to check for the correct output instead of testing
that we were doing this wrong. I've also updated the error check to
check for the correct number of fields.
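A rough sketch of the cursor handling described above, assuming a hypothetical record layout in which every location occupies four consecutive integer fields (file ID, line, column, offset). The names `DiagLocation`, `kFieldsPerLocation`, and `readLocation` are invented for this example and are not the actual reader's API.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical decoded location; file == 0 means "no file".
struct DiagLocation {
  std::uint64_t file;
  std::uint64_t line;
  std::uint64_t column;
  std::uint64_t offset;
};

// Assumed layout: four fields per location.
constexpr std::size_t kFieldsPerLocation = 4;

// Reads one location starting at `cursor`. Returns false if the record is
// too short. The cursor is advanced for every successfully read location,
// including one whose file ID is 0; returning early in that case would
// leave the cursor pointing into the middle of the location.
bool readLocation(const std::vector<std::uint64_t>& record,
                  std::size_t& cursor, DiagLocation& out) {
  if (cursor > record.size() ||
      record.size() - cursor < kFieldsPerLocation) {
    return false;
  }
  out = DiagLocation{record[cursor], record[cursor + 1],
                     record[cursor + 2], record[cursor + 3]};
  cursor += kFieldsPerLocation;  // unconditional advance
  return true;
}
```

With the cursor advanced consistently, whatever fields follow the location (flags and category in the commit's description) are read from the right position.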
Objective-C [qoi]. When reporting that a property is not
auto synthesized because it is synthesized in its super
class, locate the property declaration in the super class
which will default synthesize the property. rdar://18488727
Bob Wilson [Fri, 10 Oct 2014 19:38:34 +0000 (19:38 +0000)]
Remove a FIXME: use the ios_simulator_version_min linker option consistently.
This was previously only used when explicitly requested with a command line
option because it had to work with some old versions of the linker when it
was first introduced. That is ancient history now, and it should be safe to
use the correct option even when using the IPHONEOS_DEPLOYMENT_TARGET
environment variable to specify that the target is the iOS simulator.
Besides updating the test for this, I also added a few more tests for the
iOS linker options.
Alexey Bataev [Fri, 10 Oct 2014 18:58:13 +0000 (18:58 +0000)]
Bugfix for predefined expressions in dependent context.
This bug breaks compilation with precompiled headers and predefined expressions in dependent context.
It's possible to construct cases where the first field we are trying to
copy is in the middle of an IR field. In some complicated cases, we
would fail to use an appropriate offset inside the object. Earlier
builds of clang seemed to miscompile the code by copying an insufficient
number of bytes. Up until now, we would assert: the copying offset was
insufficiently aligned.
John McCall [Fri, 10 Oct 2014 18:44:34 +0000 (18:44 +0000)]
Change how we distinguish bitfield widths, in-class
initializers, and captured VLA types so that we can
answer questions like "is this a bit-field" without
looking at the enclosing DeclContext. NFC.
Bill Schmidt [Fri, 10 Oct 2014 15:09:43 +0000 (15:09 +0000)]
[PowerPC] Add feature for Power8 vector extensions
The current VSX feature for PowerPC specifies availability of the VSX
instructions added with the 2.06 architecture version. With 2.07, the
architecture adds new instructions to both the Category:Vector and
Category:VSX instruction sets. Additionally, unaligned vector storage
operations have improved performance.
This patch adds a feature to provide access to the new instructions
and performance capabilities of Power8. For compatibility with GCC,
the feature is controlled via a new -mpower8-vector switch, and the
feature causes the __POWER8_VECTOR__ builtin define to be generated by
the preprocessor.
There is a companion patch for llvm being committed at the same time.
Alexey Bataev [Fri, 10 Oct 2014 12:19:54 +0000 (12:19 +0000)]
Code reformatting and improvement for OpenMP.
Moved CGOpenMPRegionInfo from CGOpenMPRuntime.h to CGOpenMPRuntime.cpp file and reworked the code for this change. Also added processing of ThreadID variable passed as an argument in outlined functions in parallel and task directives.
Alexey Bataev [Fri, 10 Oct 2014 09:48:26 +0000 (09:48 +0000)]
Code improvements in OpenMP CodeGen.
This patch makes class OMPPrivateScope a common class for all private variables. Reworked processing of firstprivate variables (now it is based on OMPPrivateScope too).
Bob Wilson [Fri, 10 Oct 2014 03:12:15 +0000 (03:12 +0000)]
Remove support for the IOS_SIMULATOR_DEPLOYMENT_TARGET env var.
It turns out that this was never used. Instead we just use the
IPHONEOS_DEPLOYMENT_TARGET variable for both iOS devices and simulator.
rdar://problem/18596744
Dan Albert [Fri, 10 Oct 2014 01:01:29 +0000 (01:01 +0000)]
PR21195: Emit .gcno files to the proper location.
When building with coverage, -no-integrated-as, and -c, the driver was
emitting -cc1 -coverage-file pointing at a file in /tmp. Ensure the
coverage file is emitted in the same directory as the output file.
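As an illustration of the intended result, the sketch below derives a coverage notes path that sits next to the output file. `coverageNotesPath` is a hypothetical helper, and the rule of swapping the output's extension for `.gcno` is an assumption made for this example, not a description of the driver's code.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: place the .gcno file beside the output file by
// replacing the output's extension (if any) with ".gcno".
std::string coverageNotesPath(const std::string& output) {
  const std::size_t slash = output.find_last_of('/');
  const std::size_t dot = output.find_last_of('.');
  // Only treat the dot as an extension separator if it is in the
  // final path component.
  const bool hasExtension =
      dot != std::string::npos &&
      (slash == std::string::npos || dot > slash);
  const std::string stem = hasExtension ? output.substr(0, dot) : output;
  return stem + ".gcno";
}
```

For an output of `build/obj/foo.o` this yields `build/obj/foo.gcno` rather than a path under /tmp.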
Reid Kleckner [Fri, 10 Oct 2014 00:05:45 +0000 (00:05 +0000)]
Promote null pointer constants used as arguments to variadic functions
Make it possible to pass NULL through variadic functions on 64-bit
Windows targets. The Visual C++ headers define NULL to 0, when they
should define it to 0LL on Win64 so that NULL is a pointer-sized
integer.
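The kind of code affected looks roughly like the sentinel-terminated variadic function below. `countStrings` is a hypothetical example; the call in the usage note passes an explicitly typed null pointer so that it is portable regardless of how NULL is defined.

```cpp
#include <cstdarg>
#include <cstddef>

// Counts the C strings passed before a null pointer sentinel.
// Each variadic argument is read back as a pointer-sized value, so a
// sentinel passed as a 32-bit int 0 on a 64-bit target would not
// reliably match what va_arg expects.
int countStrings(const char* first, ...) {
  int count = 0;
  va_list args;
  va_start(args, first);
  for (const char* s = first; s != nullptr; s = va_arg(args, const char*)) {
    ++count;
  }
  va_end(args);
  return count;
}
```

A portable call is `countStrings("a", "b", static_cast<const char*>(nullptr))`. Code that writes a bare `NULL` as the sentinel is what the promotion described above is meant to keep working on Win64.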
Fix completion logic to allow for heterogeneous argument types in matcher overloads.
Summary:
There was an assumption that there were no matchers that were overloaded
on matchers and other types of arguments.
This assumption was broken recently with the addition of new matcher
overloads.
Objective-C SDK modernization. Import Foundation even
when a previous definition of NS_OPTION is available,
e.g. from a PCH. Enhancement to rdar://18498550
Special case 0 and 1 matcher in makeAllOfComposite().
Summary:
Remove unnecessary wrapping for the 0 and 1 matcher cases of
makeAllOfComposite(). We don't need a variadic wrapper for those cases.
Refactor TrueMatcher to take advantage of the new conversions between
DynTypedMatcher and Matcher<T>. Also, make it a singleton.
This change improves our clang-tidy related benchmarks by ~12%.
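The idea of special-casing zero and one sub-matchers can be sketched with plain predicates. `IntMatcher`, `trueMatcher`, and `makeAllOf` are hypothetical stand-ins for this example and do not reflect the real DynTypedMatcher interface.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical matcher type: a predicate over int.
using IntMatcher = std::function<bool(int)>;

// Shared always-true matcher, created once.
const IntMatcher& trueMatcher() {
  static const IntMatcher instance = [](int) { return true; };
  return instance;
}

// Builds a conjunction, skipping the wrapper when it adds nothing.
IntMatcher makeAllOf(std::vector<IntMatcher> parts) {
  if (parts.empty()) {
    return trueMatcher();            // nothing to check
  }
  if (parts.size() == 1) {
    return std::move(parts.front()); // the single matcher is the result
  }
  return [parts](int value) {        // general case: all must match
    for (const IntMatcher& m : parts) {
      if (!m(value)) return false;
    }
    return true;
  };
}
```

The zero and one cases avoid an extra layer of indirection on every match, which is the sort of saving the commit message attributes the benchmark improvement to.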
Add experimental clang/driver flag -fsanitize-address-field-padding=N
Summary:
This change adds an experimental flag -fsanitize-address-field-padding=N (0, 1, 2)
to clang and driver. With this flag ASAN will be able to detect some cases of
intra-object-overflow bugs,
see https://code.google.com/p/address-sanitizer/wiki/IntraObjectOverflow
There is no actual functionality here yet, just the flag parsing.
The functionality is being reviewed at http://reviews.llvm.org/D5687
Test Plan: Build and run SPEC, LLVM Bootstrap, Chrome with this flag.
Alexey Bataev [Thu, 9 Oct 2014 08:45:04 +0000 (08:45 +0000)]
Fix for bug http://llvm.org/PR17427.
Assertion failed: "Computed __func__ length differs from type!"
Reworked PredefinedExpr representation with internal StringLiteral field for function declaration.
Differential Revision: http://reviews.llvm.org/D5365
Alexey Bataev [Thu, 9 Oct 2014 04:18:56 +0000 (04:18 +0000)]
[OPENMP] 'omp teams' directive basic support.
Includes parsing and semantic analysis for 'omp teams' directive support from OpenMP 4.0. Adds additional analysis to 'omp target' directive with 'omp teams' directive.
Replace a destructor of EHCleanupScope with a Destroy() method to reflect the current usage.
Summary:
The current code uses memset to re-initialize EHCleanupScope objects
which breaks the assumptions of the upcoming asan's intra-object-overflow checker.
If there is no DTOR, the new checker will refuse to work.
Alexey Bataev [Wed, 8 Oct 2014 14:01:46 +0000 (14:01 +0000)]
[OPENMP] Codegen for 'firstprivate' clause.
This patch generates some helper variables that are used as private copies of the corresponding original variables inside an OpenMP 'parallel' directive. These generated variables are initialized by copy using values of the original variables (with the copy constructor, if any). For arrays, an initializer is generated for a single element and in the codegen procedure this initial value is automatically propagated between all elements of the private copy.
In the outlined function, references to original variables are replaced by references to these private helper variables. At the end of the initialization of the private variables an implicit barrier is generated by calling the __kmpc_barrier(...) runtime function to be sure that all threads were initialized using original values of the variables.
Differential Revision: http://reviews.llvm.org/D5140
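Conceptually, the private-copy scheme described above resembles the hand-written sketch below. It is plain C++ with no OpenMP pragmas; `outlinedRegion` is a hypothetical stand-in for the compiler-generated outlined function, and the barrier is only indicated by a comment.

```cpp
#include <vector>

// Hypothetical stand-in for the outlined body of a parallel region with
// firstprivate(values). Each "thread" receives a reference to the
// original variable and works on its own copy.
int outlinedRegion(int threadId, const std::vector<int>& original) {
  // Private helper variable, initialized via the copy constructor.
  std::vector<int> privateCopy(original);

  // A barrier would be placed here in the generated code, so that every
  // thread finishes copying before any later code can change the original.

  // All uses inside the region refer to the private copy.
  privateCopy[0] += threadId;
  return privateCopy[0];
}
```

Calling `outlinedRegion(3, v)` with `v = {10, 20}` returns 13 and leaves `v` untouched, mirroring how each thread sees the original value without modifying it.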
Remove threshold on object size for inserting lifetime begin / end
Bootstrapping LLVM+Clang+LLDB without threshold on object size for
lifetime markers insertion has shown there was no significant change
in compile time, so let the stack slot colorizer do its optimization
for all slots.
Alexey Bataev [Wed, 8 Oct 2014 11:35:04 +0000 (11:35 +0000)]
[OPENMP] Codegen for 'firstprivate' clause.
This patch generates some helper variables that are used as private copies of the corresponding original variables inside an OpenMP 'parallel' directive. These generated variables are initialized by copy using values of the original variables (with the copy constructor, if any). For arrays, an initializer is generated for a single element and in the codegen procedure this initial value is automatically propagated between all elements of the private copy.
In the outlined function, references to original variables are replaced by references to these private helper variables. At the end of the initialization of the private variables an implicit barrier is generated by calling the __kmpc_barrier(...) runtime function to be sure that all threads were initialized using original values of the variables.
Differential Revision: http://reviews.llvm.org/D5140
Alexey Bataev [Wed, 8 Oct 2014 10:42:55 +0000 (10:42 +0000)]
[OPENMP] Codegen for 'firstprivate' clause.
This patch generates some helper variables that are used as private copies of the corresponding original variables inside an OpenMP 'parallel' directive. These generated variables are initialized by copy using values of the original variables (with the copy constructor, if any). For arrays, an initializer is generated for a single element and in the codegen procedure this initial value is automatically propagated between all elements of the private copy.
In the outlined function, references to original variables are replaced by references to these private helper variables. At the end of the initialization of the private variables an implicit barrier is generated by calling the __kmpc_barrier(...) runtime function to be sure that all threads were initialized using original values of the variables.
Differential Revision: http://reviews.llvm.org/D5140
Renato Golin [Wed, 8 Oct 2014 09:06:45 +0000 (09:06 +0000)]
Revert "[OPENMP] 'omp teams' directive basic support. Includes parsing and semantic analysis for 'omp teams' directive support from OpenMP 4.0. Adds additional analysis to 'omp target' directive with 'omp teams' directive."
This reverts commit r219197 because it broke ARM self-hosting buildbots with
segmentation fault errors in many tests.
Reid Kleckner [Wed, 8 Oct 2014 01:07:54 +0000 (01:07 +0000)]
Fix IRGen for referencing a static local before emitting its decl
Summary:
Previously CodeGen assumed that static locals were emitted before they
could be accessed, which is true for automatic storage duration locals.
However, it is possible to have CodeGen emit a nested function that uses
a static local before emitting the function that defines the static
local, breaking that assumption.
Fix it by creating the static local upon access and ensuring that the
deferred function body gets emitted. We may not be able to emit the
initializer properly from outside the function body, so don't try.
Fixes PR18020. See also previous attempts to fix static locals in
PR6769 and PR7101.
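One shape of source code in which a nested function refers to an enclosing function's static local is shown below. This is an illustrative guess at the pattern, not the test case from PR18020; `nextId` and `Bump` are hypothetical names.

```cpp
// A local class member function is a separate function as far as code
// generation is concerned, yet it may name a static local of the
// enclosing function.
inline int nextId() {
  static int counter = 0;

  struct Bump {
    // Uses the enclosing function's static local directly.
    static int run() { return ++counter; }
  };

  return Bump::run();
}
```

If the compiler emits `Bump::run` before the body of `nextId`, it has to be able to create `counter` on first access, which is the situation the fix addresses.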
Objective-C SDK modernization. When modernizing to
use NS_ENUM/NS_OPTIONS macros, add an import of
Foundation.h (or its module) as necessary.
rdar://18498550
Alexey Bataev [Tue, 7 Oct 2014 10:13:33 +0000 (10:13 +0000)]
[OPENMP] 'omp teams' directive basic support.
Includes parsing and semantic analysis for 'omp teams' directive support from OpenMP 4.0. Adds additional analysis to 'omp target' directive with 'omp teams' directive.
[OPENMP] Small refactoring of EmitOMPSimdLoop helper routine.
No functional changes intended.
Renamed EmitOMPSimdLoop to EmitOMPInnerLoop, I plan to re-use
it to emit inner loop in the future patches for CodeGen of the
worksharing loop directives (omp for, omp for simd).
David Majnemer [Mon, 6 Oct 2014 23:52:23 +0000 (23:52 +0000)]
driver: Map closed standard file descriptors to /dev/null
Utilize Process::FixupStandardFileDescriptors, introduced in r219170, to
guard against files being treated as one of the standard file
descriptors.
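The general technique can be sketched with POSIX calls as follows. This is not the implementation of Process::FixupStandardFileDescriptors; `fixupStandardFds` is a hypothetical function written for illustration and assumes a POSIX system with /dev/null.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <unistd.h>

// For each of stdin, stdout, and stderr: if the descriptor is closed,
// point it at /dev/null so that a later open() cannot hand out 0, 1, or 2
// for an unrelated file.
bool fixupStandardFds() {
  for (int fd = 0; fd <= 2; ++fd) {
    if (fcntl(fd, F_GETFD) != -1) {
      continue;  // descriptor is already open
    }
    if (errno != EBADF) {
      return false;  // unexpected failure
    }
    const int nullFd = open("/dev/null", O_RDWR);
    if (nullFd < 0) {
      return false;
    }
    // open() returns the lowest free descriptor, which is normally `fd`
    // itself; otherwise duplicate it into place.
    if (nullFd != fd) {
      if (dup2(nullFd, fd) < 0) {
        close(nullFd);
        return false;
      }
      close(nullFd);
    }
  }
  return true;
}
```

Without such a guard, a process started with stdout closed could open an output file as descriptor 1 and then have diagnostics written into it.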
Objective-C SDK modernizer. Patch to support modernization
to NS_ENUM/NS_OPTION macros when typedef names are other
than NSInteger/NSUInteger (int8_t, etc.).
rdar://18532199
Bill Schmidt [Mon, 6 Oct 2014 19:02:20 +0000 (19:02 +0000)]
[PATCH][Power] Fix (and deprecate) vec_lvsl and vec_lvsr for little endian
The use of the vec_lvsl and vec_lvsr interfaces is discouraged for
little endian targets since Power8 hardware is a minimum requirement,
and Power8 provides reasonable performance for unaligned vector loads
and stores. Up till now we have not provided "correct" (i.e., big-
endian-compatible) code generation for these interfaces, as to do so
produces poorly performing code. However, this has become the source
of too many questions.
With this patch, LLVM will now produce compatible code for these
interfaces, but will also produce a deprecation warning message for
PPC64LE when one of them is used. This should make the porting direction
clearer to programmers. A similar patch has recently been committed to
GCC.
This patch includes a test for the warning message. There is a companion
patch that adds two unit tests to projects/test-suite.
Patch to wrap up support for '_' as a separator in version numbers
in the availability attribute: preserve this information
in VersionTuple and use it when pretty-printing attributes,
while still using '.' as the separator when diagnosing unavailable
message calls. rdar://18490958
Fix bug in DynTypedMatcher::constructVariadic() that would cause false negatives.
Summary:
DynTypedMatcher::constructVariadic() where the restrict kind of the
different matchers are not related causes the matcher to have a "None"
restrict kind. This causes false negatives for anyOf and eachOf.
Change the logic to get a common ancestor if there is one.
Also added regression tests that fail without the fix.
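Finding a common ancestor in a kind hierarchy can be sketched as below. The `NodeKind` enum and its parent relation are a simplified, hypothetical hierarchy invented for this example; they do not reproduce Clang's AST node kinds or the DynTypedMatcher code.

```cpp
// Hypothetical, simplified kind hierarchy:
//   Stmt <- Expr <- CallExpr
//   Decl <- NamedDecl
enum NodeKind { NK_None, NK_Decl, NK_NamedDecl, NK_Stmt, NK_Expr, NK_CallExpr };

// Parent of each kind; roots report NK_None.
NodeKind parentOf(NodeKind kind) {
  switch (kind) {
    case NK_NamedDecl: return NK_Decl;
    case NK_Expr:      return NK_Stmt;
    case NK_CallExpr:  return NK_Expr;
    default:           return NK_None;
  }
}

// True if `base` is `derived` or one of its ancestors.
bool isBaseOf(NodeKind base, NodeKind derived) {
  for (NodeKind k = derived; k != NK_None; k = parentOf(k)) {
    if (k == base) return true;
  }
  return false;
}

// Most derived kind that is an ancestor of both, or NK_None when the
// two kinds are unrelated.
NodeKind commonAncestor(NodeKind a, NodeKind b) {
  for (NodeKind k = a; k != NK_None; k = parentOf(k)) {
    if (isBaseOf(k, b)) return k;
  }
  return NK_None;
}
```

Using the common ancestor as the restrict kind lets a combined matcher still apply to nodes that any of its parts could match, instead of collapsing to "None" and rejecting everything.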
David Blaikie [Mon, 6 Oct 2014 05:18:55 +0000 (05:18 +0000)]
DebugInfo: Don't include implicit special members in the list of class members
By leaving these members out of the member list, we avoid them being
emitted into type unit definitions - while still allowing the
definition/declaration to be injected into the compile unit as expected.
David Blaikie [Mon, 6 Oct 2014 05:06:54 +0000 (05:06 +0000)]
DebugInfo: Don't include member function template specializations in the list of class members
By leaving these members out of the member list, we avoid them being
emitted into type unit definitions - while still allowing the
definition/declaration to be injected into the compile unit as expected.
David Majnemer [Sun, 5 Oct 2014 06:44:53 +0000 (06:44 +0000)]
MS ABI: Use '1' (instead of '0') relative scope discriminators
This changes the scope discriminator's behavior to start at '1' instead
of '0'. Symbol table diffing, for ABI compatibility testing, kept
finding these as false positives.
David Majnemer [Sun, 5 Oct 2014 05:05:40 +0000 (05:05 +0000)]
MS ABI: Implement thread_local for global variables
Summary:
This adds support for the C++11 feature, thread_local global variables.
The ABI Clang implements is an improvement of the MSVC ABI. Sadly,
further improvements could be made but not without sacrificing ABI
compatibility.
The feature is implemented as follows:
- All thread_local initialization routines are pointed to from the
.CRT$XDU section.
- All non-weak thread_local variables have their initialization routines
called from a single function instead of getting their own .CRT$XDU
section entry. This is done to open up optimization opportunities to
the compiler.
- All weak thread_local variables have their own .CRT$XDU section entry.
This entry is in a COMDAT with the global variable it is initializing;
this ensures that we will initialize the global exactly once.
- Destructors are registered in the initialization function using
__tlregdtor.
Hal Finkel [Sat, 4 Oct 2014 15:26:49 +0000 (15:26 +0000)]
Emit @llvm.assume for non-parameter lvalue align_value-attribute loads
We already add the align parameter attribute for function parameters that have
the align_value attribute (or those with a typedef type having that attribute),
which is an important special case, but does not handle pointers with value
alignment assumptions that come into scope in any other way. To handle the
general case, emit an @llvm.assume-based alignment assumption whenever we load
the pointer-typed lvalue of an align_value-attributed variable (except for
function parameters, which we already deal with at entry).
I'll also note that this is more general than Intel's described support in:
https://software.intel.com/en-us/articles/data-alignment-to-assist-vectorization
which states that the compiler inserts __assume_aligned directives in response
to align_value-attributed variables only for function parameters and for the
initializers of local variables. I think that we can make the optimizer deal
with this more-general scheme (which could lead to a lot of calls to
@llvm.assume inside of loop bodies, for example), but if not, I'll rework this
to be less aggressive.
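For readers unfamiliar with alignment assumptions, the sketch below shows a comparable hint written by hand with `__builtin_assume_aligned`, a builtin available in both GCC and Clang. It is an analogue chosen for illustration, not the align_value attribute itself; `sumAligned` and the 32-byte figure are assumptions made for this example.

```cpp
#include <cstddef>

// Sums `n` doubles from a buffer the caller promises is 32-byte aligned.
// The builtin returns its argument and tells the optimizer it may assume
// that alignment; passing a misaligned pointer is undefined behavior.
double sumAligned(const double* data, std::size_t n) {
  const double* p =
      static_cast<const double*>(__builtin_assume_aligned(data, 32));
  double total = 0.0;
  for (std::size_t i = 0; i < n; ++i) {
    total += p[i];
  }
  return total;
}
```

With the attribute-based approach described above, the variable's declaration carries the promise, so the assumption does not need to be restated at each use.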
David Majnemer [Sat, 4 Oct 2014 06:51:54 +0000 (06:51 +0000)]
MS ABI: Disallow dllimported/exported variables from having TLS
Windows TLS relies on indexing through a tls_index in order to get at
the DLL's thread local variables. However, this index is not exported
along with the variable: it is assumed that all accesses to thread local
variables are inside the same module which created the variable in the
first place.
While there are several implementation techniques we could adopt to fix
this (notably, the Itanium ABI gets this for free), it is not worth the
heroics.
Instead, let's just ban this combination. We could revisit this in the
future if we need to.