David Majnemer [Mon, 23 May 2016 17:21:55 +0000 (17:21 +0000)]
Clang support for __is_assignable intrinsic
MSVC now supports the __is_assignable type trait intrinsic,
to enable easier and more efficient implementation of the
Standard Library's is_assignable trait.
As of Visual Studio 2015 Update 3, the VC Standard Library
implementation uses the new intrinsic unconditionally.
The implementation is pretty straightforward due to the previously
existing is_nothrow_assignable and is_trivially_assignable.
We handle __is_assignable via the same code as the other two except
that we skip the extra checks for nothrow or triviality.
David Majnemer [Mon, 23 May 2016 17:16:12 +0000 (17:16 +0000)]
[MS ABI] Implement __declspec(empty_bases) and __declspec(layout_version)
The layout_version attribute is pretty straightforward: use the layout
rules from version XYZ of MSVC when used like
struct __declspec(layout_version(XYZ)) S {};
The empty_bases attribute is more interesting. It tries to get the C++
empty base optimization to fire more often by tweaking the MSVC ABI
rules in subtle ways:
1. Disable the leading and trailing zero-sized object flags if a class
is marked __declspec(empty_bases) and is empty.
This means that given:
struct __declspec(empty_bases) A {};
struct __declspec(empty_bases) B {};
struct C : A, B {};
'C' will have size 1 and nvsize 0 despite not being annotated
__declspec(empty_bases).
2. When laying out virtual or non-virtual bases, disable the injection
of padding between classes if the most derived class is marked
__declspec(empty_bases).
This means that given:
struct A {};
struct B {};
struct __declspec(empty_bases) C : A, B {};
'C' will have size 1 and nvsize 0.
3. When calculating the offset of a non-virtual base, choose offset zero
if the most derived class is marked __declspec(empty_bases) and the
base is empty _and_ has an nvsize of 0.
Because of the ABI rules, this does not mean that empty bases
reliably get placed at offset 0!
For example:
struct A {};
struct B {};
struct __declspec(empty_bases) C : A, B { virtual ~C(); };
'C' will be pointer sized to account for the vfptr at offset 0.
'A' and 'B' will _not_ be at offset 0 despite being empty!
Instead, they will be located right after the vfptr.
This occurs due to the interaction betweeen non-virtual base layout
and virtual function pointer injection: injection occurs after the
nv-bases and shifts them down by the size of a pointer.
Exherbo has an alternative file system layout to accommodate multiarch. The
loader is located at /usr/${triple}/lib/${loader}. Adjust the Linux toolchain
to support that on exherbo.
Driver: sink getLinuxDynamicLoader into the Toolchain
The parameter already requires the toolchain, sink the method into the class.
This also enables the use of the distro detection logic which will be needed to
support Exherbo's multiarch approach.
Convert the cascading if/else to a switch. This makes it easier to follow the
logic. Minor difference is that we no longer default to x86_64 but rather to
the architecture specified by the architecture.
OpenCL builtin functions to_{global|local|private} accepts argument of pointer type to arbitrary pointee type, and return a pointer to the same pointee type in different addr space, i.e.
global gentype *to_global(gentype *p);
It is not desirable to declare it as
global void *to_global(void *);
in opencl header file since it misses diagnostics.
This patch implements these builtin functions as Clang builtin functions. In the builtin def file they are defined to have signature void*(void*). When handling call expressions, their declarations are re-written to have correct parameter type and return type corresponding to the call argument.
In codegen call to addr void *to_addr(void*) is generated with addrcasts or bitcasts to facilitate implementation in builtin library.
Dimitry Andric [Fri, 20 May 2016 17:27:22 +0000 (17:27 +0000)]
Make __FreeBSD_cc_version predefined macro configurable at build time
The `FreeBSDTargetInfo` class has always set the `__FreeBSD_cc_version`
predefined macro to a rather static value, calculated from the major OS
version.
In the FreeBSD base system, we will start incrementing the value of this
macro whenever we make any signifant change to clang, so we need a way
to configure the macro's value at build time.
Use `FREEBSD_CC_VERSION` for this, which we can define in the FreeBSD
build system using either the `-D` command line option, or an include
file. Stock builds will keep the earlier value.
Benjamin Kramer [Fri, 20 May 2016 15:21:08 +0000 (15:21 +0000)]
Add all the avx512 flavors to __builtin_cpu_supports's list.
This is matching what trunk gcc is accepting. Also adds a missing ssse3
case. PR27779. The amount of duplication here is annoying, maybe it
should be factored into a separate .def file?
Martin Probst [Fri, 20 May 2016 11:24:24 +0000 (11:24 +0000)]
clang-format: [JS] sort ES6 imports.
Summary:
This change automatically sorts ES6 imports and exports into four groups:
absolute imports, parent imports, relative imports, and then exports. Exports
are sorted in the same order, but not grouped further.
To keep JS import sorting out of Format.cpp, this required extracting the
TokenAnalyzer infrastructure to separate header and implementation files.
Vedant Kumar [Thu, 19 May 2016 23:44:02 +0000 (23:44 +0000)]
[Lexer] Don't merge macro args from different macro files
The lexer sets the end location of macro arguments incorrectly *if*,
while merging consecutive args to fit into a single SLocEntry, it finds
args which come from different macro files.
Fix the issue by using separate SLocEntries in this situation.
This fixes a code coverage crasher (rdar://problem/26181005). Because
the lexer reported end locations for certain macro args incorrectly, we
would generate bogus coverage mappings with negative line offsets.
Anton Yartsev [Thu, 19 May 2016 23:03:49 +0000 (23:03 +0000)]
[analyzer] Fix for PR23790 : constrain return value of strcmp() rather than returning a concrete value.
The function strcmp() can return any value, not just {-1,0,1} : "The strcmp(const char *s1, const char *s2) function returns an integer greater than, equal to, or less than zero, accordingly as the string pointed to by s1 is greater than, equal to, or less than the string pointed to by s2." [C11 7.24.4.2p3]
https://llvm.org/bugs/show_bug.cgi?id=23790
http://reviews.llvm.org/D16317
Justin Lebar [Thu, 19 May 2016 22:49:13 +0000 (22:49 +0000)]
[CUDA] Implement __ldg using intrinsics.
Summary:
Previously it was implemented as inline asm in the CUDA headers.
This change allows us to use the [addr+imm] addressing mode when
executing ld.global.nc instructions. This translates into a 1.3x
speedup on some benchmarks that call this instruction from within an
unrolled loop.
Artem Belevich [Thu, 19 May 2016 20:13:53 +0000 (20:13 +0000)]
[CUDA] Do not allow non-empty destructors for global device-side variables.
According to Cuda Programming guide (v7.5, E2.3.1):
> __device__, __constant__ and __shared__ variables defined in namespace
> scope, that are of class type, cannot have a non-empty constructor or a
> non-empty destructor.
Clang already deals with device-side constructors (see D15305).
This patch enforces similar rules for destructors.
Artem Belevich [Thu, 19 May 2016 20:13:39 +0000 (20:13 +0000)]
[CUDA] Split device-var-init.cu tests into separate Sema and CodeGen parts.
Codegen tests for device-side variable initialization are subset of test
cases used to verify Sema's part of the job.
Including CodeGenCUDA/device-var-init.cu from SemaCUDA makes it easier to
keep both sides in sync.
Simon Atanasyan [Thu, 19 May 2016 15:07:21 +0000 (15:07 +0000)]
[driver] Do not pass install dir to the MultilibSet include dirs callback
All additional include directories are relative to the toolchain install
folder. So let's do not pass this folder to each callback to simplify
and slightly reduce the code.
Simon Atanasyan [Thu, 19 May 2016 15:05:22 +0000 (15:05 +0000)]
[driver][mips] Hardcode triple name in case of CodeSourcery toolchain. NFC
CodeSourcery toolchain is a standalone toolchain which always uses
the same triple name in its paths. It is independent from target
triple used by the driver.
Ranjeet Singh [Thu, 19 May 2016 13:04:34 +0000 (13:04 +0000)]
[ARM] Fix cdp intrinsic
- Fixed cdp intrinsic to only accept compile time
constant values previously you could pass in a
variable to the builtin which would result in
illegal llvm assembly output
Richard Smith [Thu, 19 May 2016 01:39:10 +0000 (01:39 +0000)]
Make Sema::getPrintingPolicy less ridiculously expensive. This used to perform
an identifier table lookup, *and* copy the LangOptions (including various
std::vector<std::string>s). Twice. We call this function once each time we start
parsing a declaration specifier sequence, and once for each call to Sema::Diag.
This reduces the compile time for a sample .c file from the linux kernel by 20%.
Steven Wu [Wed, 18 May 2016 17:04:52 +0000 (17:04 +0000)]
[Driver] Fix the case when use -fembed-bitcode and -flto= together
Summary:
-fembed-bitcode was only checking for old style LTO flag (-flto) but not
considering the new -flto= style option. That makes clang output bitcode
embedded in bitcode object when using -flto= and -fembed-bitcode= together.
Now clang should output normal bitcode file when using LTO and ignores
-fembed-bitcode option.
Ashutosh Nema [Wed, 18 May 2016 11:56:23 +0000 (11:56 +0000)]
Add new intrinsic support for MONITORX and MWAITX instructions
Summary:
MONITORX/MWAITX instructions provide similar capability to the MONITOR/MWAIT
pair while adding a timer function, such that another termination of the MWAITX
instruction occurs when the timer expires. The presence of the MONITORX and
MWAITX instructions is indicated by CPUID 8000_0001, ECX, bit 29.
The MONITORX and MWAITX instructions are intercepted by the same bits that
intercept MONITOR and MWAIT. MONITORX instruction establishes a range to be
monitored. MWAITX instruction causes the processor to stop instruction
execution and enter an implementation-dependent optimized state until
occurrence of a class of events.
Opcode of MONITORX instruction is "0F 01 FA". Opcode of MWAITX instruction is
"0F 01 FB". These opcode information is used in adding tests for the
disassembler.
These instructions are enabled for AMD's bdver4 architecture.
Craig Topper [Wed, 18 May 2016 04:11:25 +0000 (04:11 +0000)]
[Sema,CodeGen] Remove comment from SemaChecking about a builtin_shufflevector form that it doesn't support. Remove CodeGen support for the same form since it could never have been used due to the missing support in Sema.
I couldn't find any documentation that this form existed either. Nor is there documentation for one of the remaining two forms, but there is a testcase that uses it.
Richard Smith [Tue, 17 May 2016 22:44:15 +0000 (22:44 +0000)]
PR27754: CXXRecordDecl::data() needs to perform an update even if it's called
on a declaration that already knows the location of the DefinitionData object.
Richard Smith [Tue, 17 May 2016 21:48:41 +0000 (21:48 +0000)]
Revert r269717. That change alone did not provide the intended benefit (which
would come from changing the type of ASTContext::DeclAttrs from
DenseMap<Decl*,AttrVec*> to DenseMap<Decl*,AttrVec>), and it turns out to be
impractical to avoid the allocation there, because we expose the address of the
attribute vector in ways that are hard to fix.
Yaron Keren [Tue, 17 May 2016 19:01:16 +0000 (19:01 +0000)]
Teach clang to look for libcxx in /usr/local/include/c++ on Linux
As The default CMAKE install prefix is /usr/local ( https://cmake.org/cmake/help/v3.0/variable/CMAKE_INSTALL_PREFIX.html ),
sudo ninja install ends up installing clang, LLVM and libcxx under /usr/local.
In development scenario, when clang is run from the build location it will not
find libcxx at neither (build location)/../include/c++ nor /usr/include/c++.
This patch lets development clang find system installed libcxx without adding
-isystem /usr/local/include/c++. Also addresses the FIXME by explaining the
use-case for these include paths.
Reid Kleckner [Tue, 17 May 2016 16:50:45 +0000 (16:50 +0000)]
Tentatively enable -Wcast-calling-convention by default
In Chrome, this would have found two true positives around CreateThread
if we hadn't already fixed them while rolling out ASan. We didn't get
any other hits in Chrome. I'm curious to hear if this warning finds
anything in other projects.