Don't actually generate code for testing the frontend's target cpu flag,
just verify. This should fix the bots where the x86 backend isn't built
into Clang. Sorry for the breakage.
Re-work the Clang system for classifying Intel x86 CPUs to use their
basic microarchitecture names, and add support (with tests) for parsing
all of the masic microarchitecture names for CPUs documented to be
accepted by GCC with -march. I didn't go back through the 32-bit-only
old microarchitectures, but this at least brings the recent architecture
names up to speed. This is essentially the follow-up to the LLVM commit
r223769 which did similar cleanups for the LLVM CPUs.
One particular benefit is that you can now use -march=westmere in Clang
and get the LLVM westmere processor which is a different ISA variant (!)
and so quite significant.
Much like with r223769, I would appreciate the Intel folks carefully
thinking about the macros defined, names used, etc for the atom chips
and newest primary x86 chips. The current patterns seem quite strange to
me, especially here in Clang.
Note that I haven't replicated the per-microarchitecture macro defines
provided by GCC. I'm really opposed to source code using these rather
than using ISA feature macros.
Richard Smith [Tue, 9 Dec 2014 03:20:04 +0000 (03:20 +0000)]
[modules] Add experimental -fmodule-map-file-home-is-cwd flag to -cc1.
For files named by -fmodule-map-file=, and files found by 'extern module'
directives, this flag specifies that we should resolve filenames relative to
the current working directory rather than relative to the directory in which
the module map file resides. This is aimed at fixing path handling, in
particular for relative -I paths, when building modules that represent
components of the current project (rather than libraries installed on the
current system, which the current project has as dependencies, where we'd
typically expect the module map files to be looked up implicitly).
David Blaikie [Tue, 9 Dec 2014 00:32:22 +0000 (00:32 +0000)]
DebugInfo: Correctly identify the location of C++ member initializer list elements
This particularly helps the fidelity of ASan reports (which can occur
even in these examples - if, for example, one uses placement new over a
buffer of insufficient size - now ASan will correctly identify which
member's initialization went over the end of the buffer).
This doesn't cover all types of members - more coming.
David Majnemer [Tue, 9 Dec 2014 00:12:30 +0000 (00:12 +0000)]
Revert "Driver: Objective-C should respect -fno-exceptions"
This reverts commit r223455. It's been succesfully argued that
-fexceptions (at the driver level) is a misnomer and has little to do
with -fobjc-exceptions.
Fix isInstantiated and isInTemplateInstantiation to not recreate the matchers on each call.
Summary:
Store the result matcher after the first call and reuse it later on.
Recreating the matchers just to use them once incurs in a lot of
unnecessary temporary memory allocations.
This change speeds up our clang-tidy benchmarks by ~2%.
[libclang] Use same USR encoding for 'class' as 'struct'.
'class' and 'struct' can be used interchangebly for forward references.
Use the same encoding otherwise we may get into a weird situation where the USR for the same
declaration is different based on whether the definition of the tag reference is visible or not.
[libclang] Function templates can be 'overloaded' by return type, so encode the return type in the USR
and handle DependentNameType in order to be able to distinguish them.
Richard Smith [Sat, 6 Dec 2014 03:21:08 +0000 (03:21 +0000)]
[modules] If we import a module, and we've seen a module map that describes the
module, use the path from the module map file in preference to the path from
the .pcm file when resolving relative paths in the .pcm file. This allows
diagnostics (and .d output) to give relative paths if the module was found via
a relative path.
Richard Smith [Fri, 5 Dec 2014 22:42:13 +0000 (22:42 +0000)]
[modules] Instead of storing absolute paths in a .pcm file, store the path to
the root of the module and use paths relative to that directory wherever
possible. This is a step towards allowing explicit modules to be relocated
without being rebuilt, which is important for some kinds of distributed builds,
for good paths in diagnostics, and for appropriate .d output.
This is a recommit of r223443, reverted in r223465; when joining together
imported file paths, we now use the system's separator rather than always
using '/'. This avoids path mismatches between the original module build and
the module user on Windows (at least, in some cases). A more comprehensive
fix will follow.
Matt Arsenault [Fri, 5 Dec 2014 18:03:58 +0000 (18:03 +0000)]
Workaround attribute ordering issue with kernel only attributes
Placing the attribute after the kernel keyword would incorrectly
reject the attribute, so use the smae workaround that other
kernel only attributes use.
Also add a FIXME because there are two different phrasings now
for the same error, althoug amdgpu_num_[sv]gpr uses a consistent one.
Aaron Ballman [Fri, 5 Dec 2014 15:24:55 +0000 (15:24 +0000)]
Modify __has_attribute so that it only looks for GNU-style attributes. Removes the ability to look for generic attributes and keywords via this macro, which has the potential to be a breaking change. However, since there is __has_cpp_attribute and __has_declspec_attribute, and given the limited usefulness of querying a generic attribute name regardless of syntax, this seems like the correct path forward.
Aaron Ballman [Fri, 5 Dec 2014 15:05:29 +0000 (15:05 +0000)]
Added a new preprocessor macro: __has_declspec_attribute. This can be used as a way to determine whether Clang supports a __declspec spelling for a given attribute, similar to __has_attribute and __has_cpp_attribute.
David Majnemer [Fri, 5 Dec 2014 08:56:55 +0000 (08:56 +0000)]
Driver: Objective-C should respect -fno-exceptions
Clang attempted to replicate a GCC bug: -fobjc-exceptions forces
-fexceptions to be enabled. However, this has unintended effects and
other awkard side effects that Clang doesn't "correctly" ape (e.g. it's
impossible to turn off C++ exceptions in ObjC++ mode).
Instead, -f[no]objc-exceptions and -f[no]cxx-exceptions now have an
identical relationship with -f[no]exceptions.
Alexey Bataev [Fri, 5 Dec 2014 04:09:23 +0000 (04:09 +0000)]
[OPENMP] Codegen for 'omp barrier' directive.
Adds generation of call to "i32 kmpc_cancel_barrier(ident_t *, i32)" libcall for explicitly specified barriers (OMP_IDENT_BARRIER_EXPL flag is added to "flags" field of "ident_t" structure).
Also this patch replaces all calls to "kmpc_barrier" function by calls of "__kmpc_cancel_barrier" function which provides additional functionality for OpenMP 4.0.
Also, library specific enum OpenMPLocationFlags moved to private section of CGOpenMPRuntime class to make it more independent from library implementation.
Differential Revision: http://reviews.llvm.org/D6447
Richard Smith [Fri, 5 Dec 2014 02:33:27 +0000 (02:33 +0000)]
[modules] Instead of storing absolute paths in a .pcm file, store the path to
the root of the module and use paths relative to that directory wherever
possible. This is a step towards allowing explicit modules to be relocated
without being rebuilt, which is important for some kinds of distributed builds,
for good paths in diagnostics, and for appropriate .d output.
Have the driver and the target code agree on what the default ABI
is for each machine. Fix up darwin tests that were testing for
aapcs on armv7-ios when the actual ABI is apcs.
Alexey Bataev [Thu, 4 Dec 2014 07:23:53 +0000 (07:23 +0000)]
[OPENMP] Codegen for 'omp master' directive
Patch adds 2 library functions to OpenMPRuntime class - int32 kmpc_master(ident_t *, int32 gtid) and void kmpc_end_master(ident_t *, int32 gtid);
For 'omp master' directive the next code is generated:
if (__kmpc_master(loc, gtid)) {
<Associated structured block>;
__kmpc_end_master(log, gtid);
}
Currently, kernel argument metadata is omitted unless the
"-cl-kernel-arg-info" option is specified. But the SPIR 1.2 spec
requires that all metadata except kernel_arg_name should always be
emitted, and kernel_arg_name is only emitted when
"-cl-kernel-arg-info" is specified.
Patch ported by Ryan Burn from the Khronos SPIR generator.
https://github.com/KhronosGroup/SPIR
Create a helper function to construct a value for the ARM hint intrinsic
rather than inling the construction. In order to avoid the use of the sentinel
value, inline the use of intrinsic instruction retrieval. NFC.
Reid Kleckner [Wed, 3 Dec 2014 21:00:21 +0000 (21:00 +0000)]
Cast vtable address points to i32 (...)** to enable more globalopt
We currently use i32 (...)** as the type of the vptr field in the LLVM
struct type. LLVM's GlobalOpt prefers any bitcasts to be on the side of
the data being stored rather than on the pointer being stored to.
Hal Finkel [Wed, 3 Dec 2014 08:19:17 +0000 (08:19 +0000)]
Preserve LD_LIBRARY_PATH when using the 'env' command
In many Linux environments (and similar), just-built applications won't run
correctly without making use of the current LD_LIBRARY_PATH environmental
variable in order to find dynamic libraries. Propagate it through the 'env'
command (hopefully this works on all platforms).
Nico Weber [Wed, 3 Dec 2014 01:21:41 +0000 (01:21 +0000)]
Fix incorrect codegen for devirtualized calls to virtual overloaded operators.
Consider this program:
struct A {
virtual void operator-() { printf("base\n"); }
};
struct B final : public A {
virtual void operator-() override { printf("derived\n"); }
};
int main() {
B* b = new B;
-static_cast<A&>(*b);
}
Before this patch, clang saw the virtual call to A::operator-(), figured out
that it can be devirtualized, and then just called A::operator-() directly,
without going through the vtable. Instead, it should've looked up which
operator-() the call devirtualizes to and should've called that.
For regular virtual member calls, clang gets all this right already. So
instead of giving EmitCXXOperatorMemberCallee() all the logic that
EmitCXXMemberCallExpr() already has, cut the latter function into two pieces,
call the second piece EmitCXXMemberOrOperatorMemberCallExpr(), and use it also
to generate code for calls to virtual member operators.
This way, virtual overloaded operators automatically don't get devirtualized
if they have covariant returns (like it was done for regular calls in r218602),
etc.
This also happens to fix (or at least improve) codegen for explicit constructor
calls (`A a; a.A::A()`) in MS mode with -fsanitize-address-field-padding=1.
(This adjustment for virtual operator calls seems still wrong with the MS ABI.)
David Majnemer [Tue, 2 Dec 2014 23:30:24 +0000 (23:30 +0000)]
Intrin: Add _umul128
Implement _umul128; it provides the high and low halves of a 128-bit
multiply. We can simply use our __int128 arithmetic to implement this,
we generate great code for it:
movq %rdx, %rax
mulq %rcx
movq %rdx, (%r8)
retq
Justin Bogner [Tue, 2 Dec 2014 23:15:30 +0000 (23:15 +0000)]
InstrProf: Use the same names for variables as we use in the profile
There's no need to use different names for the local variables than we
use in the profile itself, and it's a bit simpler and easier to debug
if we're consistent.
This patch fixes a crash involving use of predefined
expressions. It fixes crash when mangling name for block's helper
function used inside a constructor/destructor.
rdar://19065361.
Summary:
Skip some unnecessary type checks wrt DynTypedNodes.
Add DynTypedNode::getUnchecked() to skip the runtime check when the type
is known.
Speed up DynTypedNode::operator== by using isSame() instead of
isBaseOf().
Skip the type check in MatcherInterface<T>::matches(). All calls come
from DynTypedMatcher::matches(), which already did the type check.
This change speeds up our clang-tidy benchmark by ~4%.
Fix invalid calling convention used for libcalls on ARM.
ARM ABI specifies that all the libcalls use soft FP ABI
(even hard FP binaries). These days clang emits _mulsc3 / _muldc3
calls with default (C) calling convention which would be translated
into AAPCS_VFP LLVM calling and thus the result of complex
multiplication will be bogus.
Introduce a way for a target to specify explicitly calling
convention for libcalls. Right now this is temporary correctness
fix. Ultimately, we'll end with intrinsic for complex
multiplication and all calling convention decisions for libcalls
will be put into backend.