Richard Smith [Fri, 8 Jan 2016 22:36:45 +0000 (22:36 +0000)]
[modules] Make sure we always include the contents of private headers when
building a module. Prior to this change, the private header's content would
only be included if the header were included by another header in the same
module. If not (if the private header is only used by the .cc files of the
module, or is included from outside the module via -Wno-private-header),
a #include of that file would be silently ignored.
David Majnemer [Fri, 8 Jan 2016 20:48:26 +0000 (20:48 +0000)]
[MS ABI] Complete and base constructor GlobalDecls must have the same name
Clang got itself into the situation where we mangled the same
constructor twice with two different constructor types. After one of
the constructors were utilized, the tag used for one of the types
changed from class to struct because a class template became complete.
This resulted in one of the constructor types varying from the other
constructor.
Instead, force "base" constructor types to "complete" if the ABI doesn't
have constructor variants. This will ensure that GlobalDecls for both
variants will get the same mangled name.
Teresa Johnson [Fri, 8 Jan 2016 17:04:29 +0000 (17:04 +0000)]
[ThinLTO] Leverage new in-place renaming support
Due to the new in-place renaming support added in r257174, we no
longer need to invoke ThinLTO global renaming from clang. It will be
invoked on the module in the FunctionImport pass (by an immediately
following llvm commit).
As a result, we don't need to load the FunctionInfoIndex as early,
so that is moved down into EmitAssemblyHelper::EmitAssembly.
Dimitry Andric [Thu, 7 Jan 2016 22:09:47 +0000 (22:09 +0000)]
Turn off lldb debug tuning by default for FreeBSD
Summary:
This is the clang part of D15966. In rL256104, debugger tuning was
added to the clang driver, and again the default for FreeBSD was set to
lldb. The default needs to be gdb instead.
Coverage mapping data may reference names of functions
that are skipped by FE (e.g, unused inline functions). Since
those functions are skipped, normal instr-prof function lowering
pass won't put those names in the right section, so special
handling is needed to walk through coverage mapping structure
and recollect the references.
With this patch, only names that are skipped are processed. This
simplifies the lowering code and it no longer needs to make
assumptions coverage mapping data layout. It should also be
more efficient.
[Sema] Teach overload resolution about unaddressable functions.
Given an expression like `(&Foo)();`, we perform overload resolution as
if we are calling `Foo` directly. This causes problems if `Foo` is a
function that can't have its address taken. This patch teaches overload
resolution to ignore functions that can't have their address taken in
such cases.
John McCall [Wed, 6 Jan 2016 23:34:20 +0000 (23:34 +0000)]
Properly bind up any cleanups in an ExprWithCleanups after
instantiating a default argument expression.
This was previously just working implicitly by reinstantiating
in the current context, but caching means that we weren't
registering cleanups in subsequent uses.
Richard Smith [Wed, 6 Jan 2016 22:49:11 +0000 (22:49 +0000)]
PR26048, PR26050: put non-type template parameters and indirect field decls
into IDNS_Tag in C++, because they conflict with redeclarations of tags. (This
doesn't affect elaborated-type-specifier lookup, which looks for IDNS_Type in
C++).
John McCall [Wed, 6 Jan 2016 22:34:54 +0000 (22:34 +0000)]
Only instantiate a default argument once.
By storing the instantiated expression back in the ParmVarDecl,
we remove the last need for separately storing the sub-expression
of a CXXDefaultArgExpr. This makes PCH/Modules merging quite
simple: CXXDefaultArgExpr records are serialized as references
to the ParmVarDecl, and we ignore redundant attempts to overwrite
the instantiated expression.
This has some extremely marginal impact on user-facing semantics.
However, the major effect is that it avoids IRGen errors about
conflicting definitions due to lambdas in the argument being
instantiated multiple times while sharing the same mangling.
It should also slightly improve memory usage and module file size.
Richard Trieu [Wed, 6 Jan 2016 21:11:18 +0000 (21:11 +0000)]
Improve conditional checking during template instantiation.
When the condition in an if statement, while statement, or for loop is created
during template instantiation, it calls MakeFullExpr with only the condition
expression. However, when these conditions are created for non-templated
code in the Parser, an additional SourceLocation is passed to MakeFullExpr.
The impact of this was that non-dependent templated code could produce
diagnostics that the same code outside templates would not. Adding the missing
SourceLocation makes diagnostics consistent between templated and non-templated
code.
Nico Weber [Wed, 6 Jan 2016 20:55:00 +0000 (20:55 +0000)]
Add -Wfor-loop-analysis to -Wall.
This warning seems to have 0 false positives and some true positives in
practice, without a measurable compile time cost. It should be in -Wall, and
possibly even become a default warning at some point.
Dan Gohman [Wed, 6 Jan 2016 19:43:32 +0000 (19:43 +0000)]
[WebAssembly] Add --gc-sections to the link line.
This will eventually be accompanied with a change to enable -ffunction-sections
and -fdata-sections by default, which is currently delayed by some development
process issues.
Erik Verbruggen [Wed, 6 Jan 2016 15:12:51 +0000 (15:12 +0000)]
Show inclusions from a preamble in clang_getInclusions.
When reparsing a translation unit with preamble generation turned on,
no includes are found. This is due to the fact that all SLocs from
AST/PCH files are skipped as they are 'loaded', and inclusions from a
preamble are also 'loaded'. So, in case a file has a preamble, it first
needs to process those loaded inclusions, and then check for any local
inclusions. This latter one is for any includes that are not part of the
preamble, like includes half-way through a file.
Dimitry Andric [Wed, 6 Jan 2016 07:42:18 +0000 (07:42 +0000)]
Add -fno-movt frontend option, to disable movt/movw on ARM
Summary:
In rL256641, @davide turned off movt generation by default for FreeBSD.
This was because our ld is very old, and did not support the relocations
for it. However, Ian Lepore added the support very recently, so we
would like to revert rL256641, and replace it with a new `-fno-movt`
frontend option. This way, it can be turned off when needed.
Change the set of actions built for external gcc tools.
A gcc tool has an "integrated" assembler (usually gas) that it
will call to produce an object. Let it use that assembler so
that we don't have to deal with assembly syntax incompatibilities.
Richard Smith [Wed, 6 Jan 2016 03:52:10 +0000 (03:52 +0000)]
[modules] When a tag type that was imported from a module is referenced via an
elaborated-type-specifier, create a declaration of it to track that the current
module makes it visible too.
Anna Zaks [Wed, 6 Jan 2016 00:32:52 +0000 (00:32 +0000)]
[analyzer] Suppress reports coming from std::__independent_bits_engine
The analyzer reports a shift by a negative value in the constructor. The bug can
be easily triggered by calling std::random_shuffle on a vector
(<rdar://problem/19658126>).
(The shift by a negative value is reported because __w0_ gets constrained to
63 by the conditions along the path:__w0_ < _WDt && __w0_ >= _WDt-1,
where _WDt is 64. In normal execution, __w0_ is not 63, it is 1 and there is
no overflow. The path is infeasible, but the analyzer does not know about that.)
After configuration the following additional targets will be generated:
stage2-instrumented:
Builds a stage1 x86 compiler, runtime, and required tools (llvm-config, llvm-profdata) then uses that compiler to build an instrumented stage2 compiler.
stage2-instrumented-generate-profdata:
Depends on "stage2-instrumented" and will use the instrumented compiler to generate profdata based on the training files in <clang>/utils/perf-training
stage2:
Depends on "stage2-instrumented-generate-profdata" and will use the stage1 compiler with the stage2 profdata to build a PGO-optimized compiler.
stage2-check-llvm:
Depends on stage2 and runs check-llvm using the stage3 compiler.
stage2-check-clang:
Depends on stage2 and runs check-clang using the stage3 compiler.
stage2-check-all:
Depends on stage2 and runs check-all using the stage3 compiler.
stage2-test-suite:
Depends on stage2 and runs the test-suite using the stage3 compiler (requires in-tree test-suite).
Oleg Ranevskyy [Tue, 5 Jan 2016 19:54:39 +0000 (19:54 +0000)]
[Clang/Support/Windows/Unix] Command lines created by clang may exceed the command length limit set by the OS
Summary:
LLVM part of the patch is D15831.
When clang runs an external tool such as a linker it may create a command line that exceeds the length limit.
Clang uses the llvm::sys::argumentsFitWithinSystemLimits function to check if command line length fits the OS
limitation. There are two problems in this function that may cause exceeding of the limit:
1. It ignores the length of the program path in its calculations. On the other hand, clang adds the program
path to the command line when it runs the program.
2. It assumes no space character is inserted after the last argument, which is not true for Windows. The flattenArgs function adds the trailing space for *each* argument. The result of this is that the terminating NULL character is not counted and may be placed beyond the length limit if the command line is exactly 32768 characters long. The WinAPI's CreateProcess does not find the NULL character and fails.
[PGO] Enable clang to pass compiler-rt profile support library to linker on Windows
Summary: This change enables clang to automatically link binaries built with the -fprofile-instr-generate against the clang_rt.profile-i386.lib library.
Samuel Antao [Tue, 5 Jan 2016 16:23:04 +0000 (16:23 +0000)]
[OpenMP] Offloading descriptor registration and device codegen.
Summary:
In order to offloading work properly two things need to be in place:
- a descriptor with all the offloading information (device entry functions, and global variable) has to be created by the host and registered in the OpenMP offloading runtime library.
- all the device functions need to be emitted for the device and a convention has to be in place so that the runtime library can easily map the host ID of an entry point with the actual function in the device.
This patch adds support for these two things. However, only entry functions are being registered given that 'declare target' directive is not yet implemented.
About offloading descriptor:
The details of the descriptor are explained with more detail in http://goo.gl/L1rnKJ. Basically the descriptor will have fields that specify the number of devices, the pointers to where the device images begin and end (that will be defined by the linker), and also pointers to a the begin and end of table whose entries contain information about a specific entry point. Each entry has the type:
```
struct __tgt_offload_entry{
void *addr;
char *name;
int64_t size;
};
```
and will be implemented in a pre determined (ELF) section `.omp_offloading.entries` with 1-byte alignment, so that when all the objects are linked, the table is in that section with no padding in between entries (will be like a C array). The code generation ensures that all `__tgt_offload_entry` entries are emitted in the same order for both host and device so that the runtime can have the corresponding entries in both host and device in same index of the table, and efficiently implement the mapping.
The resulting descriptor is registered/unregistered with the runtime library using the calls `__tgt_register_lib` and `__tgt_unregister_lib`. The registration is implemented in a high priority global initializer so that the registration happens always before any initializer (that can potentially include target regions) is run.
The driver flag -omptargets= was created to specify a comma separated list of devices the user wants to support so that the new functionality can be exercised. Each device is specified with its triple.
About target codegen:
The target codegen is pretty much straightforward as it reuses completely the logic of the host version for the same target region. The tricky part is to identify the meaningful target regions in the device side. Unlike other programming models, like CUDA, there are no already outlined functions with attributes that mark what should be emitted or not. So, the information on what to emit is passed in the form of metadata in host bc file. This requires a new option to pass the host bc to the device frontend. Then everything is similar to what happens in CUDA: the global declarations emission is intercepted to check to see if it is an "interesting" declaration. The difference is that instead of checking an attribute, the metadata information in checked. Right now, there is only a form of metadata to pass information about the device entry points (target regions). A class `OffloadEntriesInfoManagerTy` was created to manage all the information and queries related with the metadata. The metadata looks like this:
```
!omp_offload.info = !{!0, !1, !2, !3, !4, !5, !6}
!0 = !{i32 0, i32 52, i32 77426347, !"_ZN2S12r1Ei", i32 479, i32 13, i32 4}
!1 = !{i32 0, i32 52, i32 77426347, !"_ZL7fstatici", i32 461, i32 11, i32 5}
!2 = !{i32 0, i32 52, i32 77426347, !"_Z9ftemplateIiET_i", i32 444, i32 11, i32 6}
!3 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 99, i32 11, i32 0}
!4 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 272, i32 11, i32 3}
!5 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 127, i32 11, i32 1}
!6 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 159, i32 11, i32 2}
```
The fields in each metadata entry are (in sequence):
Entry 1) an ID of the type of metadata - right now only zero is used meaning "OpenMP target region".
Entry 2) a unique ID of the device where the input source file that contain the target region lives.
Entry 3) a unique ID of the file where the input source file that contain the target region lives.
Entry 4) a mangled name of the function that encloses the target region.
Entries 5) and 6) line and column number where the target region was found.
Entry 7) is the order the entry was emitted.
Entry 2) and 3) are required to distinguish files that have the same function name.
Entry 4) is required to distinguish different instances of the same declaration (usually templated ones)
Entries 5) and 6) are required to distinguish the particular target region in body of the function (it is possible that a given target region is not an entry point - if clause can evaluate always to zero - and therefore we need to identify the "interesting" target regions. )
This patch replaces http://reviews.llvm.org/D12306.
[OpenCL] Disallow taking an address of a function.
An undecorated function designator implies taking the address of a function,
which is illegal in OpenCL. Implementing a check for this earlier to allow
the error to be reported even in the presence of other more obvious errors.
Easwaran Raman [Mon, 4 Jan 2016 23:32:28 +0000 (23:32 +0000)]
Remove setting of inlinehint and cold attributes based on profile data
NFC. These hints are only used for inlining and the inliner now uses
the same criteria to identify hot and cold callees and set appropriate
thresholds without relying on these hints. Hence this removed code is
superfluous.
Dimitry Andric [Mon, 4 Jan 2016 10:17:48 +0000 (10:17 +0000)]
Convert test/CXX/lex/lex.literal/lex.string/p4.cpp back to DOS line
endings, since the file is supposed to have them, according to its
comments. Also set its svn:eol-style property. Noticed by Nico Weber.
[PGO] Cleanup: Use covmap header definition in the template file
This is one last remaining instrumentatation related structure
that needs to be migrate to use the centralized template
definition. With this change, instrumentation code
related to coverage module header will be kept in sync
with the coverage mapping reader. The remaining code
which makes implicit assumption about covmap control
structure layout in the the lowering pass will cleaned
up in a different patch. This patch is not intended to
have no functional change.