Simon Pilgrim [Tue, 25 Aug 2015 21:27:46 +0000 (21:27 +0000)]
[X86] Remove unnecessary MMX declarations from Intrin.h
As discussed in PR23648 - the intrinsics _m_from_int, _m_to_int and _m_prefetch are defined in mmintrin.h and prfchwintrin.h so we don't need to in Intrin.h
David Majnemer [Tue, 25 Aug 2015 16:44:38 +0000 (16:44 +0000)]
[Sema] Handle leading and trailing __ for GNU attributes
GNU attributes can have a leading and trailing __ appended/prepended to
the attribute name. While the parser and AttributeList::getKind did the
right thing, AttributeList::getAttributeSpellingListIndex did not.
Diego Novillo [Tue, 25 Aug 2015 15:25:13 +0000 (15:25 +0000)]
Convert SampleProfile pass into a Module pass.
Eventually, we will need sample profiles to be incorporated into the
inliner's cost models. To do this, we need the sample profile pass to
be a module pass.
This patch makes no functional changes beyond the mechanical adjustments
needed to run SampleProfile as a module pass.
Alexey Bataev [Tue, 25 Aug 2015 14:24:04 +0000 (14:24 +0000)]
[OPENMP 4.0] Initial support for array sections.
Adds parsing/sema analysis/serialization/deserialization for array sections in OpenMP constructs (introduced in OpenMP 4.0).
Currently it is allowed to use array sections only in OpenMP clauses that accepts list of expressions.
Differential Revision: http://reviews.llvm.org/D10732
Eric Christopher [Tue, 25 Aug 2015 13:45:28 +0000 (13:45 +0000)]
Rewrite the PPC target feature handling to more resemble other targets.
This involved specializing handleUserFeatures so that we could perform
diagnostics on -only- user supplied features and migrating the rest of
the initialization functions to set features based on enabling and disabling
full feature sets. No functional change intended.
[X86] Expose the various _rot intrinsics on non-MS platforms
_rotl, _rotwl and _lrotl (and their right-shift counterparts) are official x86
intrinsics, and should be supported regardless of environment. This is in contrast
to _rotl8, _rotl16, and _rotl64 which are MS-specific.
Note that the MS documentation for _lrotl is different from the Intel
documentation. Intel explicitly documents it as a 64-bit rotate, while for MS,
since sizeof(unsigned long) for MSVC is always 4, a 32-bit rotate is implied.
Akira Hatanaka [Mon, 24 Aug 2015 19:50:35 +0000 (19:50 +0000)]
[ARM] Error out on apple darwin platforms if float-abi is "hard".
Error out if the user tries to use float-abi="hard" since it isn't
supported on darwin platforms. Previously clang issued no warnings or
erros and just passed the option to the backend, which had no effect on
code generation for targets using apcs.
Keith Walker [Mon, 24 Aug 2015 10:11:14 +0000 (10:11 +0000)]
[AArch64] Define the macro __ARM_FP16_ARGS
The ACLE (ARM C Language Extensions) 2.0 defines that the predefined macro
__ARM_FP16_ARGS should be defined if __fp16 can be used as an argument and
result.
The support for __fp16 to be used as an argument and result is already
implemented for AArch64 so this change is just adding the missing macro.
Richard Smith [Mon, 24 Aug 2015 03:38:11 +0000 (03:38 +0000)]
[modules] If local submodule visibility is disabled, don't bother checking
whether the owning module of a hidden declaration is visible -- it can't be.
Richard Smith [Mon, 24 Aug 2015 03:33:22 +0000 (03:33 +0000)]
[modules] Stop updating all identifiers when writing a module. This is
unnecessary in C++ modules (where we don't need the identifiers for their
Decls) and expensive.
Faisal Vali [Sun, 23 Aug 2015 13:14:42 +0000 (13:14 +0000)]
Add a missing 'classof' to AST Node TypoExpr to identify its 'Kind'.
I'm not sure why TypoExpr had its classof left out - but I expect every AST node should fulfill the 'contract of classof' (http://llvm.org/docs/HowToSetUpLLVMStyleRTTI.html).
There should be no functionality change. I just happened to notice it was missing, while messing around with something else.
Joseph Tremoulet [Sun, 23 Aug 2015 00:26:48 +0000 (00:26 +0000)]
[WinEH] Update to new EH pad/ret signatures (with tokens required)
Summary:
The signatures of the methods in LLVM for creating EH pads/rets are changing
to require token arguments on rets and assume token return type on pads.
Update creation code accordingly.
Richard Smith [Sat, 22 Aug 2015 21:37:34 +0000 (21:37 +0000)]
Improve the performance of resolving a lookup result. We usually don't need to
pick the most recent declaration, and we can often tell which declaration is
more recent without walking the redeclaration chain. Do so when possible.
Richard Smith [Sat, 22 Aug 2015 20:13:39 +0000 (20:13 +0000)]
[modules] Further simplification and speedup of redeclaration chain loading.
Instead of eagerly deserializing a list of DeclIDs when we load a module file
and doing a binary search to find the redeclarations of a decl, store a list of
redeclarations of each chain before the first declaration and load it directly.
Jingyue Wu [Sat, 22 Aug 2015 05:49:28 +0000 (05:49 +0000)]
[CUDA] Change initializer for CUDA device code based on CUDA documentation.
Summary:
According to CUDA documentation, global variables declared with __device__,
__constant__ can be initialized from host code, so mark them as
externally initialized. Because __shared__ variables cannot have an
initialization as part of their declaration and since the value maybe kept
across different kernel invocation, the value of __shared__ is effectively
undefined instead of zero initialized.
Wrongly using zero initializer may cause illegitimate optimization, e.g.
removing unused __constant__ variable because it's not updated in the device
code and the value is initialized with zero.
Richard Smith [Sat, 22 Aug 2015 01:47:18 +0000 (01:47 +0000)]
[modules] Rearrange how redeclaration chains are loaded, to remove a walk over
all modules and reduce the number of declarations we load when loading a
redeclaration chain.
The new approach is:
* when loading the first declaration of an entity within a module file, we
first load all declarations of the entity that were imported into that
module file, and then load all the other declarations of that entity from
that module file and build a suitable decl chain from them
* when loading any other declaration of an entity, we first load the first
declaration from the same module file
As before, we complete redecl chains through name lookup where necessary.
To make this work, I also had to change the way that template specializations
are stored -- it no longer suffices to track only canonical specializations; we
now emit all "first local" declarations when emitting a list of specializations
for a template.
On one testcase with several thousand imported module files, this reduces the
total runtime by 72%.
Ahmed Bougacha [Sat, 22 Aug 2015 01:30:13 +0000 (01:30 +0000)]
[ARM NEON] Remove special-case for f16 vcvt handling. NFCI.
We can use the 'H' typespec modifier to use 128-bit vectors directly
in the only two users of this special-case: the vcvt f16 intrinsics.
This also lets us use more meaningful prototype modifiers.
John McCall [Sat, 22 Aug 2015 00:35:27 +0000 (00:35 +0000)]
When building a pseudo-object assignment, and the RHS is
a contextually-typed expression that semantic analysis will
probably need to invasively rewrite, don't include the
RHS OVE as a separate semantic expression, and check the
operation with the original RHS expression.
There are two contextually-typed expressions that can survive
to here: overloaded function references, which are at least
safe to double-emit, and C++11 initializer list expressions,
which are not at all safe to double-emit and which often
don't update the original syntactic InitListExpr with
implicit conversions to member types, etc.
This means that the original RHS may appear, undecorated by
an OVE, in the semantic expressions. Fortunately, it will
only ever be used in a single place there, and I don't
believe there are clients that rely on being able to pick
out the original RHS from the semantic expressions.
But this could be problematic if there are clients that do
visit the entire tree and rely on not seeing the same
expression multiple times, once in the syntactic and once
in the semantic expressions. This is a very fiddly part
of the compiler.
Ahmed Bougacha [Fri, 21 Aug 2015 23:34:20 +0000 (23:34 +0000)]
[ARM NEON] Use the common naming scheme for vcvt f16 builtins. NFC.
We had "vcvt_f16" and "VCVT_HIGH_F16": for other FP types, this naming
is used for intrinsics with integer overloads. The FP->FP conversions,
on the other hand, use the full "vcvt_f32_f64" name instead.
Use the same naming convention for the f16<->f32 conversions.
While there, reorder the definitions a little bit.
Alexey Bataev [Fri, 21 Aug 2015 12:19:04 +0000 (12:19 +0000)]
[OPENMP 4.1] Add codegen for 'simdlen' clause.
Add emission of metadata for simd loops in presence of 'simdlen' clause.
If 'simdlen' clause is provided without 'safelen' clause, the vectorizer width for the loop is set to value of 'simdlen' clause + all read/write ops in loop are marked with '!llvm.mem.parallel_loop_access' metadata.
If 'simdlen' clause is provided along with 'safelen' clause, the vectorizer width for the loop is set to value of 'simdlen' clause + all read/write ops in loop are not marked with '!llvm.mem.parallel_loop_access' metadata.
If 'safelen' clause is provided without 'simdlen' clause, the vectorizer width for the loop is set to value of 'safelen' clause + all read/write ops in loop are not marked with '!llvm.mem.parallel_loop_access' metadata.
The block formatting logic in JavaScript will probably go some other changes,
too, and we'll potentially be able to make the rules more consistent again. For
now, this seems to be the best approach for C++.
Alexey Bataev [Fri, 21 Aug 2015 11:14:16 +0000 (11:14 +0000)]
[OPENMP 4.1] Initial support for 'simdlen' clause.
Add parsing/sema analysis for 'simdlen' clause in simd directives. Also add check that if both 'safelen' and 'simdlen' clauses are specified, the value of 'simdlen' parameter is less than the value of 'safelen' parameter.
David Majnemer [Fri, 21 Aug 2015 06:44:10 +0000 (06:44 +0000)]
[Sema] Don't crash when diagnosing hack in libstdc++
While working around a bug in certain standard library implementations,
we would try to diagnose the issue so that library implementors would
fix their code. However, we assumed an entity being initialized was
a non-static data member subobject when other circumstances are
possible.
Alexey Bataev [Fri, 21 Aug 2015 06:41:23 +0000 (06:41 +0000)]
[OPENMP 4.1] Improved codegen for 'uval' qualifier of 'linear' clause.
According to standard the 'uval' modifier declares the address of the original list item to have an invariant value for all iterations of the associated loop(s). Patch improves codegen for this qualifier by removing usage of the original reference variable and replacing by referenced l-value.
Richard Trieu [Fri, 21 Aug 2015 03:43:09 +0000 (03:43 +0000)]
Fix a few things with -Winfinite-recursion. NFC
Now that -Winfinite-recursion no longer uses recursive calls to before path
analysis, several bits of the code can be improved. The main changes:
1) Early return when finding a path to the exit block without a recursive call
2) Moving the states vector into checkForRecursiveFunctionCall instead of
passing it in by reference
3) Change checkForRecursiveFunctionCall to return a bool when the warning
should be emitted.
4) Use the State vector instead of storing it in the Stack vector.
Richard Smith [Fri, 21 Aug 2015 03:04:33 +0000 (03:04 +0000)]
[modules] When we see a definition of a function for which we already have a
non-visible definition, skip the new definition to avoid ending up with a
function with multiple definitions.
Evgeniy Stepanov [Thu, 20 Aug 2015 21:47:16 +0000 (21:47 +0000)]
Revert r245344.
That change is causing strange test failures on Fedora 22 (PR24503),
and it does not have any effect with Gold linker anyway (PR15823,
https://sourceware.org/bugzilla/show_bug.cgi?id=18859).
Ahmed Bougacha [Thu, 20 Aug 2015 20:27:21 +0000 (20:27 +0000)]
[Headers][X86] Use __builtin_shufflevector in AVX2 broadcasts.
This lets us optimize them better. We agreed to remove the intrinsics,
instead of combining them later, as, at -O0, we generate the expected
instructions. Plus, it's a nice cleanup.
Olivier Goffart [Thu, 20 Aug 2015 13:11:14 +0000 (13:11 +0000)]
Fix crash with two typos in the arguments of a function
The problem is that the arguments are of TheCall are reset later
to the ones in Args, making TypoExpr put back. Some TypoExpr that have
already been diagnosed and will assert later in Sema::getTypoExprState
Alexey Bataev [Thu, 20 Aug 2015 12:15:57 +0000 (12:15 +0000)]
[OPENMP 4.1] Allow to use 'uval' and 'ref' modifiers for reference types only.
Standard allows to use 'uval' and 'ref' modifiers in 'linear' clause for variables with reference types only. Added check for it and modified test.
Alexey Bataev [Thu, 20 Aug 2015 10:54:39 +0000 (10:54 +0000)]
[OPENMP 4.1] Initial support for modifiers in 'linear' clause.
OpenMP 4.1 adds 3 optional modifiers to 'linear' clause.
Format of 'linear' clause has changed to:
```
linear(linear-list[ : linear-step])
```
where linear-list is one of the following
```
list
modifier(list)
```
where modifier is one of the following:
```
ref (C++)
val (C/C++)
uval (C++)
```
Patch adds parsing and sema analysis for these modifiers.
John McCall [Wed, 19 Aug 2015 22:42:36 +0000 (22:42 +0000)]
Fix the layout of bitfields in ms_struct unions: their
alignment is ignored, and they always allocate a complete
storage unit.
Also, change the dumping of AST record layouts: use the more
readable C++-style dumping even in C, include bitfield offset
information in the dump, and don't print sizeof/alignof
information for fields of record type, since we don't do so
for bases or other kinds of field.
Richard Smith [Wed, 19 Aug 2015 20:49:38 +0000 (20:49 +0000)]
Internal-linkage variables with constant-evaluatable initializers do not need to be emitted. (Also reduces the set of variables that need to be eagerly deserialized when using PCH / modules.)
James Y Knight [Wed, 19 Aug 2015 15:12:02 +0000 (15:12 +0000)]
Properly pass through the PIC mode to the integrated assembler when
doing assembly-only, and unify the Driver's PIC argument parsing.
On a few architectures, parsing of assembly files annoyingly depends
on whether PIC is enabled or not. This was handled for external 'as'
already (passing -KPIC), but was missed for calls to the standalone
internal assembler.
The integrated-as.s test needed to be modified to not expect
-fsanitize=address to be unused, as now fsanitize *IS* used for
assembly, since -fsanitize=memory can sometimes imply -fPIE, which the
assembler needs to know (gack!!).