Richard Trieu [Thu, 27 Aug 2015 23:38:45 +0000 (23:38 +0000)]
Fix macro backtrace printing.
Sometimes, a macro that expands to another macro name will not be printed in
the macro backtrace. This patch finds the missed macro expansions and prints
them. Fixes PR16799
Ahmed Bougacha [Thu, 27 Aug 2015 22:24:56 +0000 (22:24 +0000)]
[X86] Use AVX features instead of ABI to init. SimdDefaultAlign.
The ABI string only exists to communicate with TargetCodeGenInfo.
Concretely, since we only used "avx*" ABI strings on x86_64 (as AVX
doesn't affect the i386 ABIs), this meant that, when initializing
SimdDefaultAlign, we would ignore AVX/AVX512 on i386, for no good
reason.
Instead, directly check the features. A similar change for
MaxVectorAlign will follow.
Piotr Padlewski [Thu, 27 Aug 2015 21:35:41 +0000 (21:35 +0000)]
Assume loads fix #2
There was linker problem, and it turns out that it is not always safe
to refer to vtable. If the vtable is used, then we can refer to it
without any problem, but because we don't know when it will be used or
not, we can only check if vtable is external or it is safe to to emit it
speculativly (when class it doesn't have any inline virtual functions).
It should be fixed in the future.
Adrian Prantl [Thu, 27 Aug 2015 21:21:19 +0000 (21:21 +0000)]
CGDebugInfo: Factor out a getOrCreateStandaloneType() method.
Usually debug info is created on the fly while during codegen.
With this API it becomes possible to create standalone debug info
for types that are not referenced by any code, such as emitting debug info
for a clang module or for implementing something like -gfull.
Because on-the-fly debug info generation may still insert retained types
on top of them, all RetainedTypes are uniqued in CGDebugInfo::finalize().
Eric Christopher [Thu, 27 Aug 2015 19:59:34 +0000 (19:59 +0000)]
Rewrite the code generation handling for function feature and cpu attributes.
A couple of changes here:
a) Do less work in the case where we don't have a target attribute on the
function. We've already canonicalized the attributes for the function -
no need to do more work.
b) Use the newer canonicalized feature adding functions from TargetInfo
to do the work when we do have a target attribute. This enables us to diagnose
some warnings in the case of conflicting written attributes (only ppc does
this today) and also make sure to get all of the features for a cpu that's
listed rather than just change the cpu.
Updated all testcases accordingly and added a new testcase to verify that we'll
error out on ppc if we have some incompatible options using the existing diagnosis
framework there.
Adrian Prantl [Thu, 27 Aug 2015 19:46:20 +0000 (19:46 +0000)]
Add a -gmodules option to the driver and a -dwarf-ext-refs to cc1
to enable the use of external type references in the debug info
(a.k.a. module debugging).
The driver expands -gmodules to "-g -fmodule-format=obj -dwarf-ext-refs"
and passes that to cc1. All this does at the moment is set a flag
codegenopts.
Tyler Nowicki [Thu, 27 Aug 2015 18:58:34 +0000 (18:58 +0000)]
Improve options printed on vectorization analysis diagnostics.
The LLVM patch changes the analysis diagnostics produced when loops with
floating-point recurrences or memory operations are identified. The new messages
say "cannot prove it is safe to reorder * operations; allow reordering by
specifying #pragma clang loop vectorize(enable)". Depending on the type of
diagnostic the message will include additional options such as ffast-math or
__restrict__.
Gabor Horvath [Thu, 27 Aug 2015 18:57:00 +0000 (18:57 +0000)]
[Static Analyzer] BugReporter.cpp:2869: Assertion failed: !RemainingNodes.empty() && "No error node found in the trimmed graph"
The assertion is caused by reusing a “filler” ExplodedNode as an error node.
The “filler” nodes are only used for intermediate processing and are not
essential for analyzer history, so they can be reclaimed when the
ExplodedGraph is trimmed by the “collectNode” function. When a checker finds a
bug, they generate a new transition in the ExplodedGraph. The analyzer will
try to reuse the existing predecessor node. If it cannot, it creates a new
ExplodedNode, which always has a tag to uniquely identify the creation site.
The assertion is caused when the analyzer reuses a “filler” node.
In the test case, some “filler” nodes were reused and then reclaimed later
when the ExplodedGraph was trimmed. This caused an assertion because the node
was needed to generate the report. The “filler” nodes should not be reused as
error nodes. The patch adds a constraint to prevent this happening, which
solves the problem and makes the test cases pass.
Artem Belevich [Thu, 27 Aug 2015 18:10:41 +0000 (18:10 +0000)]
[CUDA] Improve CUDA compilation pipeline creation.
Current implementation tries to guess which Action will result in a
job which needs to incorporate device-side GPU binaries. The guessing
was attempting to work around the fact that multiple actions may be
combined into a single compiler invocation. If CudaHostAction ends up
being combined (and thus bypassed during action list traversal) no
device-side actions it pointed to were processed. The guessing worked
for most of the usual cases, but fell apart when external assembler
was used.
This change removes the guessing and makes sure we create and pass
device-side jobs regardless of how the jobs get combined.
* CudaHostAction is always inserted either at Compile phase or the
FinalPhase of current compilation, whichever happens first.
* If selectToolForJob combines CudaHostAction with other actions, it
passes info about CudaHostAction up to the caller
* When it sees that CudaHostAction got combined with other actions
(and hence will never be passed to BuildJobsForActions),
BuildJobsForActions creates device-side jobs the same way they would
be created if CudaHostAction was passed to BuildJobsForActions
directly.
* Added two more test cases to make sure GPU binaries are passed to
correct jobs.
Richard Smith [Wed, 26 Aug 2015 23:55:49 +0000 (23:55 +0000)]
[modules] The key to a DeclContext name lookup table is not actually a
DeclarationName (because all ctor names are considered the same, and so on).
Reflect this in the type used as the lookup table key. As a side-effect, remove
one copy of the duplicated code used to compute the hash of the key.
DI: Clarify meaning of createTempFunctionFwdDecl() arg, NFC
I stared at `false /*declaration*/` for quite some time before giving up
and checking the actual function to see what it meant. Replacing with
`/* isDefinition = */ false` to save myself effort later.
Akira Hatanaka [Wed, 26 Aug 2015 19:00:11 +0000 (19:00 +0000)]
[ARM] Error out if float-ab=hard and abi=apcs-gnu on macho platforms.
Error out if -mfloat-abi=hard or -mhard-float is specified on the command
line and the target ABI is APCS. Previously clang issued no warnings or
errors and just passed the option to the backend, which had no effect on
code generation for targets using APCS.
This commit corrects the patch commited in r245866, which didn't take into
account the fact that not all darwin targets use APCS.
Derek Schuff [Wed, 26 Aug 2015 17:14:08 +0000 (17:14 +0000)]
Change Native Client x86 usr include and link path to match SDK expectations
GNU multilib style uses x86_64-nacl/include and x86_64-nacl/usr/include
but the SDK expects i686-nacl/usr/include for its files. Change the driver
to use this.
Yaron Keren [Wed, 26 Aug 2015 08:10:22 +0000 (08:10 +0000)]
Make FileManager::getFileSystemOptions consistent with CompilerInstance::getFileSystemOpts
and CompilerInvocation::getFileSystemOpts by renaming it to getFileSystemOpts,
marking the const-returning access method const and adding a non-const version,
making the function prototypes identical to CompilerInstance::getFileSystemOpts.
David Majnemer [Wed, 26 Aug 2015 05:13:19 +0000 (05:13 +0000)]
[Sema] Don't assume CallExpr::getDirectCallee will succeed
We tried to provide a very nice diagnostic when diagnosing an assignment
to a const int & produced by a function call. However, we cannot always
determine what function was called.
Nathan Wilson [Wed, 26 Aug 2015 04:19:36 +0000 (04:19 +0000)]
Modify DeclaratorChuck::getFunction to be passed an Exception Specification SourceRange
Summary:
- Store the exception specification range's begin and end SourceLocation in DeclaratorChuck::FunctionTypeInfo. These SourceLocations can be used in a FixItHint Range.
- Add diagnostic; function concept having an exception specification.
Ahmed Bougacha [Tue, 25 Aug 2015 23:09:05 +0000 (23:09 +0000)]
[Headers][X86] Add -O0 assembly tests for avx2 intrinsics.
We agreed for r245605 that, as long as we don't affect -O0 codegen
too much, it's OK to use native constructs rather than intrinsics.
Let's test that, starting with AVX2 here.
Simon Pilgrim [Tue, 25 Aug 2015 21:27:46 +0000 (21:27 +0000)]
[X86] Remove unnecessary MMX declarations from Intrin.h
As discussed in PR23648 - the intrinsics _m_from_int, _m_to_int and _m_prefetch are defined in mmintrin.h and prfchwintrin.h so we don't need to in Intrin.h
David Majnemer [Tue, 25 Aug 2015 16:44:38 +0000 (16:44 +0000)]
[Sema] Handle leading and trailing __ for GNU attributes
GNU attributes can have a leading and trailing __ appended/prepended to
the attribute name. While the parser and AttributeList::getKind did the
right thing, AttributeList::getAttributeSpellingListIndex did not.
Diego Novillo [Tue, 25 Aug 2015 15:25:13 +0000 (15:25 +0000)]
Convert SampleProfile pass into a Module pass.
Eventually, we will need sample profiles to be incorporated into the
inliner's cost models. To do this, we need the sample profile pass to
be a module pass.
This patch makes no functional changes beyond the mechanical adjustments
needed to run SampleProfile as a module pass.
Alexey Bataev [Tue, 25 Aug 2015 14:24:04 +0000 (14:24 +0000)]
[OPENMP 4.0] Initial support for array sections.
Adds parsing/sema analysis/serialization/deserialization for array sections in OpenMP constructs (introduced in OpenMP 4.0).
Currently it is allowed to use array sections only in OpenMP clauses that accepts list of expressions.
Differential Revision: http://reviews.llvm.org/D10732
Eric Christopher [Tue, 25 Aug 2015 13:45:28 +0000 (13:45 +0000)]
Rewrite the PPC target feature handling to more resemble other targets.
This involved specializing handleUserFeatures so that we could perform
diagnostics on -only- user supplied features and migrating the rest of
the initialization functions to set features based on enabling and disabling
full feature sets. No functional change intended.
[X86] Expose the various _rot intrinsics on non-MS platforms
_rotl, _rotwl and _lrotl (and their right-shift counterparts) are official x86
intrinsics, and should be supported regardless of environment. This is in contrast
to _rotl8, _rotl16, and _rotl64 which are MS-specific.
Note that the MS documentation for _lrotl is different from the Intel
documentation. Intel explicitly documents it as a 64-bit rotate, while for MS,
since sizeof(unsigned long) for MSVC is always 4, a 32-bit rotate is implied.
Akira Hatanaka [Mon, 24 Aug 2015 19:50:35 +0000 (19:50 +0000)]
[ARM] Error out on apple darwin platforms if float-abi is "hard".
Error out if the user tries to use float-abi="hard" since it isn't
supported on darwin platforms. Previously clang issued no warnings or
erros and just passed the option to the backend, which had no effect on
code generation for targets using apcs.
Keith Walker [Mon, 24 Aug 2015 10:11:14 +0000 (10:11 +0000)]
[AArch64] Define the macro __ARM_FP16_ARGS
The ACLE (ARM C Language Extensions) 2.0 defines that the predefined macro
__ARM_FP16_ARGS should be defined if __fp16 can be used as an argument and
result.
The support for __fp16 to be used as an argument and result is already
implemented for AArch64 so this change is just adding the missing macro.
Richard Smith [Mon, 24 Aug 2015 03:38:11 +0000 (03:38 +0000)]
[modules] If local submodule visibility is disabled, don't bother checking
whether the owning module of a hidden declaration is visible -- it can't be.
Richard Smith [Mon, 24 Aug 2015 03:33:22 +0000 (03:33 +0000)]
[modules] Stop updating all identifiers when writing a module. This is
unnecessary in C++ modules (where we don't need the identifiers for their
Decls) and expensive.
Faisal Vali [Sun, 23 Aug 2015 13:14:42 +0000 (13:14 +0000)]
Add a missing 'classof' to AST Node TypoExpr to identify its 'Kind'.
I'm not sure why TypoExpr had its classof left out - but I expect every AST node should fulfill the 'contract of classof' (http://llvm.org/docs/HowToSetUpLLVMStyleRTTI.html).
There should be no functionality change. I just happened to notice it was missing, while messing around with something else.
Joseph Tremoulet [Sun, 23 Aug 2015 00:26:48 +0000 (00:26 +0000)]
[WinEH] Update to new EH pad/ret signatures (with tokens required)
Summary:
The signatures of the methods in LLVM for creating EH pads/rets are changing
to require token arguments on rets and assume token return type on pads.
Update creation code accordingly.
Richard Smith [Sat, 22 Aug 2015 21:37:34 +0000 (21:37 +0000)]
Improve the performance of resolving a lookup result. We usually don't need to
pick the most recent declaration, and we can often tell which declaration is
more recent without walking the redeclaration chain. Do so when possible.
Richard Smith [Sat, 22 Aug 2015 20:13:39 +0000 (20:13 +0000)]
[modules] Further simplification and speedup of redeclaration chain loading.
Instead of eagerly deserializing a list of DeclIDs when we load a module file
and doing a binary search to find the redeclarations of a decl, store a list of
redeclarations of each chain before the first declaration and load it directly.
Jingyue Wu [Sat, 22 Aug 2015 05:49:28 +0000 (05:49 +0000)]
[CUDA] Change initializer for CUDA device code based on CUDA documentation.
Summary:
According to CUDA documentation, global variables declared with __device__,
__constant__ can be initialized from host code, so mark them as
externally initialized. Because __shared__ variables cannot have an
initialization as part of their declaration and since the value maybe kept
across different kernel invocation, the value of __shared__ is effectively
undefined instead of zero initialized.
Wrongly using zero initializer may cause illegitimate optimization, e.g.
removing unused __constant__ variable because it's not updated in the device
code and the value is initialized with zero.