Alexey Bataev [Fri, 5 Dec 2014 04:09:23 +0000 (04:09 +0000)]
[OPENMP] Codegen for 'omp barrier' directive.
Adds generation of a call to the "i32 __kmpc_cancel_barrier(ident_t *, i32)" libcall for explicitly specified barriers (the OMP_IDENT_BARRIER_EXPL flag is added to the "flags" field of the "ident_t" structure).
This patch also replaces all calls to the "__kmpc_barrier" function with calls to the "__kmpc_cancel_barrier" function, which provides additional functionality for OpenMP 4.0.
Also, the library-specific enum OpenMPLocationFlags is moved to the private section of the CGOpenMPRuntime class to make it more independent of the library implementation.
Differential Revision: http://reviews.llvm.org/D6447
Richard Smith [Fri, 5 Dec 2014 02:33:27 +0000 (02:33 +0000)]
[modules] Instead of storing absolute paths in a .pcm file, store the path to
the root of the module and use paths relative to that directory wherever
possible. This is a step towards allowing explicit modules to be relocated
without being rebuilt, which is important for some kinds of distributed builds,
for good paths in diagnostics, and for appropriate .d output.
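A rough illustration of the idea only, not the actual serialization code: storing a path relative to the module root might look like the sketch below. The helper name relativeToModuleRootSketch is hypothetical, and the sketch assumes '/'-separated paths and a root without a trailing slash.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of Clang: returns `file` relative to `root`
// when `file` lies under `root`, otherwise returns `file` unchanged.
// Assumes '/'-separated paths and a root without a trailing slash.
std::string relativeToModuleRootSketch(const std::string& file,
                                       const std::string& root) {
    assert(root.empty() || root.back() != '/');
    const bool underRoot = !root.empty() &&
                           file.size() > root.size() + 1 &&
                           file.compare(0, root.size(), root) == 0 &&
                           file[root.size()] == '/';
    // Drop "<root>/" so the stored path stays valid if the module is moved.
    return underRoot ? file.substr(root.size() + 1) : file;
}
```

Paths outside the root are kept absolute here, which matches the "wherever possible" wording above.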
Have the driver and the target code agree on what the default ABI
is for each machine. Fix up darwin tests that were testing for
aapcs on armv7-ios when the actual ABI is apcs.
Alexey Bataev [Thu, 4 Dec 2014 07:23:53 +0000 (07:23 +0000)]
[OPENMP] Codegen for 'omp master' directive
The patch adds two library functions to the OpenMPRuntime class: int32 __kmpc_master(ident_t *, int32 gtid) and void __kmpc_end_master(ident_t *, int32 gtid).
For the 'omp master' directive, the following code is generated:
  if (__kmpc_master(loc, gtid)) {
    <Associated structured block>;
    __kmpc_end_master(loc, gtid);
  }
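The generated pattern can be simulated without an OpenMP runtime. In the sketch below, IdentSketch, mockKmpcMaster and mockKmpcEndMaster are hypothetical stand-ins for ident_t and the two runtime entry points, and the mock assumes that thread 0 is the master thread.

```cpp
#include <cassert>

struct IdentSketch {};  // stand-in for the runtime's ident_t

// Mock runtime entry points (assumption: thread 0 is the master thread).
int mockKmpcMaster(IdentSketch*, int gtid) { return gtid == 0 ? 1 : 0; }
void mockKmpcEndMaster(IdentSketch*, int) {}

// Runs the lowered pattern once per simulated thread id and returns how
// many threads executed the structured block.
int countMasterExecutionsSketch(int numThreads) {
    assert(numThreads > 0);
    IdentSketch loc;
    int executed = 0;
    for (int gtid = 0; gtid < numThreads; ++gtid) {
        if (mockKmpcMaster(&loc, gtid)) {
            ++executed;  // <Associated structured block>
            mockKmpcEndMaster(&loc, gtid);
        }
    }
    return executed;
}
```

Whatever the team size, only one simulated thread enters the block, and the end call is made only by that thread.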
Currently, kernel argument metadata is omitted unless the
"-cl-kernel-arg-info" option is specified. But the SPIR 1.2 spec
requires that all metadata except kernel_arg_name should always be
emitted, and kernel_arg_name is only emitted when
"-cl-kernel-arg-info" is specified.
Patch ported by Ryan Burn from the Khronos SPIR generator.
https://github.com/KhronosGroup/SPIR
Create a helper function to construct a value for the ARM hint intrinsic
rather than inlining the construction. In order to avoid the use of the sentinel
value, inline the use of intrinsic instruction retrieval. NFC.
Reid Kleckner [Wed, 3 Dec 2014 21:00:21 +0000 (21:00 +0000)]
Cast vtable address points to i32 (...)** to enable more globalopt
We currently use i32 (...)** as the type of the vptr field in the LLVM
struct type. LLVM's GlobalOpt prefers any bitcasts to be on the side of
the data being stored rather than on the pointer being stored to.
Hal Finkel [Wed, 3 Dec 2014 08:19:17 +0000 (08:19 +0000)]
Preserve LD_LIBRARY_PATH when using the 'env' command
In many Linux environments (and similar), just-built applications won't run
correctly without making use of the current LD_LIBRARY_PATH environmental
variable in order to find dynamic libraries. Propagate it through the 'env'
command (hopefully this works on all platforms).
Nico Weber [Wed, 3 Dec 2014 01:21:41 +0000 (01:21 +0000)]
Fix incorrect codegen for devirtualized calls to virtual overloaded operators.
Consider this program:
  #include <stdio.h>

  struct A {
    virtual void operator-() { printf("base\n"); }
  };
  struct B final : public A {
    virtual void operator-() override { printf("derived\n"); }
  };
  int main() {
    B* b = new B;
    -static_cast<A&>(*b);
  }
Before this patch, clang saw the virtual call to A::operator-(), figured out
that it can be devirtualized, and then just called A::operator-() directly,
without going through the vtable. Instead, it should've looked up which
operator-() the call devirtualizes to and should've called that.
For regular virtual member calls, clang gets all this right already. So
instead of giving EmitCXXOperatorMemberCallee() all the logic that
EmitCXXMemberCallExpr() already has, cut the latter function into two pieces,
call the second piece EmitCXXMemberOrOperatorMemberCallExpr(), and use it also
to generate code for calls to virtual member operators.
This way, virtual overloaded operators automatically don't get devirtualized
if they have covariant returns (like it was done for regular calls in r218602),
etc.
This also happens to fix (or at least improve) codegen for explicit constructor
calls (`A a; a.A::A()`) in MS mode with -fsanitize-address-field-padding=1.
(This adjustment for virtual operator calls seems still wrong with the MS ABI.)
David Majnemer [Tue, 2 Dec 2014 23:30:24 +0000 (23:30 +0000)]
Intrin: Add _umul128
Implement _umul128; it provides the high and low halves of a 128-bit
multiply. We can simply use our __int128 arithmetic to implement this, and
we generate great code for it:
movq %rdx, %rax
mulq %rcx
movq %rdx, (%r8)
retq
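A minimal sketch of the __int128 approach described above, assuming a GCC- or Clang-compatible compiler on a 64-bit target. The name umul128Sketch is hypothetical, chosen to avoid clashing with the real intrinsic; it returns the low half and stores the high half through the pointer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the intrinsic: returns the low 64 bits of a * b
// and writes the high 64 bits to *high. Relies on the __int128 extension.
std::uint64_t umul128Sketch(std::uint64_t a, std::uint64_t b,
                            std::uint64_t* high) {
    assert(high != nullptr);
    const unsigned __int128 product = static_cast<unsigned __int128>(a) * b;
    *high = static_cast<std::uint64_t>(product >> 64);
    return static_cast<std::uint64_t>(product);
}
```

For example, multiplying 0xFFFFFFFFFFFFFFFF by 2 yields a low half of 0xFFFFFFFFFFFFFFFE and a high half of 1.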
Justin Bogner [Tue, 2 Dec 2014 23:15:30 +0000 (23:15 +0000)]
InstrProf: Use the same names for variables as we use in the profile
There's no need to use different names for the local variables than we
use in the profile itself, and it's a bit simpler and easier to debug
if we're consistent.
This patch fixes a crash involving the use of predefined
expressions. It fixes a crash when mangling the name of a block's helper
function used inside a constructor/destructor.
rdar://19065361.
Summary:
Skip some unnecessary type checks wrt DynTypedNodes.
Add DynTypedNode::getUnchecked() to skip the runtime check when the type
is known.
Speed up DynTypedNode::operator== by using isSame() instead of
isBaseOf().
Skip the type check in MatcherInterface<T>::matches(). All calls come
from DynTypedMatcher::matches(), which already did the type check.
This change speeds up our clang-tidy benchmark by ~4%.
Fix invalid calling convention used for libcalls on ARM.
The ARM ABI specifies that all libcalls use the soft FP ABI
(even in hard FP binaries). Currently clang emits __mulsc3 / __muldc3
calls with the default (C) calling convention, which is translated
into the AAPCS_VFP LLVM calling convention, so the result of complex
multiplication is bogus.
Introduce a way for a target to explicitly specify the calling
convention for libcalls. For now this is a temporary correctness
fix. Ultimately, we'll end up with an intrinsic for complex
multiplication, and all calling convention decisions for libcalls
will be made in the backend.
Serge Pavlov [Tue, 2 Dec 2014 11:06:09 +0000 (11:06 +0000)]
Emit a warning when defining or undefining a reserved identifier or keyword as a macro.
Summary:
This change implements warnings for macro names that are identical to a keyword
or reserved identifier. The warnings differ depending on the "danger"
of the operation. The warning for defining a macro that replaces a keyword is
on by default. The other cases produce a warning that is off by default but can
be turned on with the -Wreserved-id-macro option.
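The split between the default-on and default-off cases can be sketched as below. This is not Clang's implementation: the function names are hypothetical, the keyword list is a small illustrative subset, and the reserved-identifier test uses the standard C++ rule (a double underscore anywhere, or a leading underscore followed by an uppercase letter).

```cpp
#include <cassert>
#include <cctype>
#include <string>

enum class MacroNameKind { Ordinary, ReservedId, Keyword };

// Hypothetical classifier; the keyword list is an illustrative subset only.
MacroNameKind classifyMacroNameSketch(const std::string& name) {
    assert(!name.empty());
    static const char* const kKeywords[] = {"if", "for", "while", "return", "int"};
    for (const char* kw : kKeywords) {
        if (name == kw) return MacroNameKind::Keyword;
    }
    const bool doubleUnderscore = name.find("__") != std::string::npos;
    const bool underscoreUpper =
        name.size() >= 2 && name[0] == '_' &&
        std::isupper(static_cast<unsigned char>(name[1])) != 0;
    return (doubleUnderscore || underscoreUpper) ? MacroNameKind::ReservedId
                                                 : MacroNameKind::Ordinary;
}

// Per the description above: only *defining* a macro named like a keyword
// warns by default; the other flagged cases need -Wreserved-id-macro.
bool warnsByDefaultSketch(MacroNameKind kind, bool isDefine) {
    return kind == MacroNameKind::Keyword && isDefine;
}
```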
Rely on fewer features of the 'env' command. Darwin only supports '-i'.
I'm explicitly setting LC_ALL=C somewhat for documentation, but
hopefully this also removes some host variation from the test results.
Add a test that ensures the Clang driver behaves itself when the PATH
environment variable is changed to strange things out from under it.
Prior to r223099 in LLVM, these test cases would crash in various ways
(assert fails, stack exhaustion, etc.).
Bob Wilson [Tue, 2 Dec 2014 00:27:35 +0000 (00:27 +0000)]
Remove special case for aarch64 static vs. PIC code in iOS kernel code.
I added this check a while back but then made a note to myself that it
should be completely unnecessary since iOS always uses PIC code-gen for
aarch64. Since I could never come up with any reason why it would be
necessary, I'm just going to remove it and we'll see if anything breaks.
rdar://problem/13627985
Richard Smith [Tue, 2 Dec 2014 00:08:08 +0000 (00:08 +0000)]
[modules] Track how 'header' directives were written in module map files,
rather than trying to extract this information from the FileEntry after the
fact.
This has a number of beneficial effects. For instance, diagnostic messages for
failed module builds give a path relative to the "module root" rather than an
absolute file path, and the contents of the module includes file is no longer
dependent on what files the including TU happened to inspect prior to
triggering the module build.
Zachary Turner [Mon, 1 Dec 2014 23:06:47 +0000 (23:06 +0000)]
Make -fuse-ld=lld work properly on Windows.
Using lld on Windows requires calling link-lld.exe instead of
lld.exe. This patch puts this knowledge into clang so that when
using the GCC style clang driver, it can properly delegate to
lld.
Reid Kleckner [Mon, 1 Dec 2014 22:02:27 +0000 (22:02 +0000)]
Use nullptr to silence -Wsentinel when self-hosting on Windows
Richard rejected my Sema change to interpret an integer literal zero in
a varargs context as a null pointer, so -Wsentinel sees an integer
literal zero and fires off a warning. Only CodeGen currently knows that
it promotes integer literal zeroes in this context to pointer size on
Windows. I didn't want to teach -Wsentinel about that compatibility
hack. Therefore, I'm migrating to C++11 nullptr.
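For illustration, a sentinel-terminated variadic function and a nullptr-terminated call might look like the sketch below. The function countStringsSketch is hypothetical, and __attribute__((sentinel)) is a GCC/Clang extension, so the sketch assumes one of those compilers.

```cpp
#include <cassert>
#include <cstdarg>

// Hypothetical variadic function whose argument list must end with a null
// pointer. The attribute lets -Wsentinel check call sites.
int countStringsSketch(const char* first, ...) __attribute__((sentinel));

int countStringsSketch(const char* first, ...) {
    assert(first != nullptr);
    int count = 1;  // counts `first`
    va_list args;
    va_start(args, first);
    // Read pointer-sized arguments until the terminating null pointer.
    while (va_arg(args, const char*) != nullptr) {
        ++count;
    }
    va_end(args);
    return count;
}
```

A call such as countStringsSketch("a", "b", nullptr) always passes a pointer-sized terminator, whereas a literal 0 is passed as an int, which can be narrower than a pointer.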
Nico Weber [Mon, 1 Dec 2014 17:48:04 +0000 (17:48 +0000)]
Add a test for devirtualization of virtual operator calls.
There was no test coverage for this before: Modifying
EmitCXXOperatorMemberCallee() to not call CanDevirtualizeMemberFunctionCall()
didn't make any test fail.
Make the function pointer a template argument instead of a runtime value.
Summary:
Speed up the variadic matchers by removing one indirect call.
Making the function pointer a template argument allows the compiler to
inline the call instead of doing a runtime call by pointer.
Also, optimize the allOf() case to avoid redundant kind checks.
This speeds up our clang-tidy benchmark by ~2%
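The general technique, independent of the matcher code itself, can be sketched as follows. All names here are hypothetical and are not taken from the ASTMatchers sources.

```cpp
#include <cassert>

int doubleIt(int x) { return 2 * x; }

// Runtime value: the call goes through a pointer that the optimizer may not
// be able to see through.
int applyRuntimeSketch(int (*func)(int), int x) {
    assert(func != nullptr);
    return func(x);
}

// Template argument: the callee is fixed at compile time for each
// instantiation, so the call can be inlined.
template <int (*Func)(int)>
int applyStaticSketch(int x) {
    return Func(x);
}
```

Both applyRuntimeSketch(&doubleIt, 21) and applyStaticSketch<&doubleIt>(21) compute the same value; only the second names its callee in the type system.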
Remove threshold for lifetime marker insertion of named temporaries
Now that TailRecursionElimination has been fixed with r222354, the
threshold on size for lifetime marker insertion can be removed. This
only affects named temporaries though, as the patch for unnamed temporaries
is still in progress.
Richard Barton [Fri, 28 Nov 2014 20:39:59 +0000 (20:39 +0000)]
Add additional arguments for -mfpu options
Add neon-vfpv3 to allow specifying both at the same time. This is not an
option that GCC supports, but follows the same track and should be
non-controversial.
Alexey Bataev [Fri, 28 Nov 2014 07:21:40 +0000 (07:21 +0000)]
[OPENMP] Additional processing of 'omp atomic write' directive.
According to the OpenMP standard, Section 2.12.6, atomic Construct, '#pragma omp atomic write' may only be used for expression statements of the form 'x = expr;', where x is an lvalue expression and expr is an expression with scalar type. The patch adds checks for this.
Sean Hunt [Fri, 28 Nov 2014 00:53:20 +0000 (00:53 +0000)]
Create a new 'flag_enum' attribute.
This attribute serves as a hint to improve warnings about the ranges of
enumerators used as flag types. It currently has no working C++ implementation
due to different semantics for enums in C++. For more explanation, see the docs
and testcases.
Tim Northover [Thu, 27 Nov 2014 21:02:49 +0000 (21:02 +0000)]
AArch64: simplify PCS mapping.
Now that LLVM can count the registers needed to implement AAPCS rules, we don't
need to duplicate that logic here. This means we can drop the explicit padding
and also use more natural types in many cases (e.g. "struct { float arr[3]; }"
used to end up as "[2 x double]" to avoid holes on the stack).
The one wrinkle is that AAPCS va_arg was also using the register counting
machinery. But the local replacement isn't too bad.
Aaron Ballman [Thu, 27 Nov 2014 15:45:59 +0000 (15:45 +0000)]
Sphinx does not have a lexer for OpenCL, so falling back to C for the language on the code block. Also fixing an indentation warning. NFC to the content of the documentation itself.
[OpenCL] Implemented restrictions for pointer conversions specified in OpenCL v2.0.
OpenCL v2.0 s6.5.5 restricts conversion of pointers to different address spaces:
- the named address spaces (__global, __local, and __private) => __generic - implicitly converted;
- __generic => named - with an explicit cast;
- named <=> named - disallowed;
- __constant <=> any other - disallowed.
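The rules listed above can be written down as a small classification function. This is a sketch, not the Sema implementation: the enum and function names are hypothetical, and treating a conversion within the same address space as implicit is an assumption, since the list does not cover that case.

```cpp
#include <cassert>

enum class AddrSpaceSketch { Global, Local, Private, Constant, Generic };
enum class ConversionSketch { Implicit, ExplicitCast, Disallowed };

// Hypothetical helper mirroring the four rules listed above.
ConversionSketch classifyPointerConversionSketch(AddrSpaceSketch from,
                                                 AddrSpaceSketch to) {
    // Assumption: staying in the same address space is always fine.
    if (from == to) return ConversionSketch::Implicit;
    // __constant <=> any other: disallowed.
    if (from == AddrSpaceSketch::Constant || to == AddrSpaceSketch::Constant)
        return ConversionSketch::Disallowed;
    // named => __generic: implicit.
    if (to == AddrSpaceSketch::Generic) return ConversionSketch::Implicit;
    // __generic => named: explicit cast required.
    if (from == AddrSpaceSketch::Generic) return ConversionSketch::ExplicitCast;
    // Only two distinct named spaces remain: disallowed.
    assert(from != to);
    return ConversionSketch::Disallowed;
}
```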