Philip Reames [Fri, 27 Mar 2015 05:09:33 +0000 (05:09 +0000)]
Require a GC strategy be specified for functions which use gc.statepoint
This was discussed a while back and I left it optional for migration. Since it's been far more than the 'week or two' that was discussed, time to actually make this manditory.
Philip Reames [Fri, 27 Mar 2015 04:52:48 +0000 (04:52 +0000)]
Allow explicit spill slots to be specified for a gc.statepoint
This patch adds support for explicitly provided spill slots in the GC arguments of a gc.statepoint. This is somewhat analogous to gcroot, but leverages the STATEPOINT MI node and StackMap infrastructure. The motivation for this is:
1) The stack spilling code for gc.statepoints hasn't advanced as fast as I'd like. One major option is to give up on doing spilling in the backend and do it at the IR level instead. We'd give up the ability to have gc values in registers, but that's a minor cost in practice. We are not neccessarily moving in that direction, but having the ability to prototype such a thing cheaply is interesting.
2) I want to port the gcroot lowering to use the statepoint infastructure. Given the metadata printers for gcroot expect a fixed set of stack roots, it's easiest to just reuse the explicit stack slots and pass them directly to the underlying statepoint.
I'm holding off on the documentation for the new feature until I'm reasonable sure this is going to stick around.
David Majnemer [Fri, 27 Mar 2015 04:17:07 +0000 (04:17 +0000)]
WinEH: Create a parent frame alloca for HandlerType xdata tables
We don't have any logic to emit those tables yet, so the SDAG lowering
of this intrinsic is just a stub. We can see the intrinsic in the
prepared IR, though.
Karthik Bhat [Fri, 27 Mar 2015 03:44:15 +0000 (03:44 +0000)]
Refactor Code inside LoopVectorizer's function isInductionVariable.
This patch exposes LoopVectorizer's isInductionVariable function as common
a functionality.
http://reviews.llvm.org/D8608
Andrew Trick [Fri, 27 Mar 2015 03:44:13 +0000 (03:44 +0000)]
Fix a bug in SelectionDAG scheduling backtracking code: PR22304.
It can happen (by line CurSU->isPending = true; // This SU is not in
AvailableQueue right now.) that a SUnit is mark as available but is
not in the AvailableQueue. For SUnit being selected for scheduling
both conditions must be met.
This patch mainly defensively protects from invalid removing a node
from a queue. Sometimes nodes are marked isAvailable but are not in
the queue because they have been defered due to some hazard.
DebugInfo: Update testcases with invalid variables
Fix testcases whose variables are invalid. I'm working on a patch that
adds `Verifier` checks for `MDLocalVariable` (and `MDGlobalVariable`),
and these failed because:
- `scope:` fields need to point at `MDLocalScope` and can't be null.
- `file:` fields need to point at `MDFile`.
- `inlinedAt:` fields need to point at `MDLocation`.
DIBuilder: Change a few helpers to return downcasted MDNodes
Change `getNonCompileUnitScope()` to return `MDScope` and
`getConstantAsMetadata()` to return `ConstantAsMetadata`. This will
make it easier to start requiring more type safety in the debug info
hierarchy.
AsmWriter: Cleanup debug info fields with MDFieldPrinter, NFC
Move all the `MDNode` field helper methods into a new class,
`MDFieldPrinter`, and add helpers for integers, bools, and `DW_*`
symbolic constants. This reduces a ton of code duplication, and makes
it more mechanical to update `AsmWriter` to print broken code in the
context of stricter accessors (like in r233322).
Ahmed Bougacha [Thu, 26 Mar 2015 22:44:58 +0000 (22:44 +0000)]
[CodeGen] Don't pretend we can expand f16 libcalls.
We used to mark a bunch of libm nodes as Expand for f16. There are no
libcalls we can use for those, so we eventually just hit an unhelpful
llvm_unreachable in ExpandFPLibCall.
Instead, just ignore them altogether. If nothing else changes, we'll
then get the more descriptive and pleasant "Cannot select" fatal error.
There's an argument to be made for consistency, but f16 is already
special in all the good ways, and as long as there's no f16 support in
the ops expander (this patch), as well as the Soften/Expand float
legalizers (which, when hit, will currently segfault), I think there's
no point in even pretending we can legalize any of this.
This shouldn't affect anything that's not already broken.
Derek Schuff [Thu, 26 Mar 2015 22:11:00 +0000 (22:11 +0000)]
Use movw/movt instead of constant pool loads to lower byval parameter copies
Summary:
The ARM backend can use a loop to implement copying byval parameters before
a call. In non-thumb2 mode it uses a constant pool load to materialize the
trip count. For targets that need movt instead (e.g. Native Client), use
the same code as in thumb2 mode to materialize the trip count.
Check accessors of `MDLocation`, and change them to `cast<>` down to the
right types. Also add type-safe factory functions.
All the callers that handle broken code need to use the new versions of
the accessors (`getRawScope()` instead of `getScope()`) that still
return `Metadata*`. This is also necessary for things like
`MDNodeKeyImpl<MDLocation>` (in LLVMContextImpl.h) that need to unique
the nodes when their operands might still be forward references of the
wrong type.
In the `Value` hierarchy, consumers that handle broken code use
`getOperand()` directly. However, debug info nodes have a ton of
operands, and their order (even their existence) isn't stable yet. It's
safer and more maintainable to add an explicit "raw" accessor on the
class itself.
Derek Schuff [Thu, 26 Mar 2015 21:58:46 +0000 (21:58 +0000)]
Default to armv7 cpu for NaCl when march=arm
Summary:
When the arch is given as "arm" clang uses the default target CPU from
LLVM to determine what the real arch should be (i.e. "arm" becomes
"armv4t" because LLVM's getARMCPUForArch falls back to "arm7tdmi").
Default to "cortex-a8" so that we end up with "armv7" in clang.
the nacl-direct.c test in clang also covers this case.
Yaron Keren [Thu, 26 Mar 2015 19:45:19 +0000 (19:45 +0000)]
Fix rare case where APInt divide algorithm applied un-needed transformation.
APInt uses Knuth's D algorithm for long division. In rare cases the
implementation applied a transformation that was not needed.
Added unit tests for long division. KnuthDiv() procedure is fully covered.
There is a case in APInt::divide() that I believe is never used (marked with
a comment) as all users of divide() handle trivial cases earlier.
Sanjoy Das [Thu, 26 Mar 2015 19:25:01 +0000 (19:25 +0000)]
[ADT][CMake][AutoConf] Fail-fast iterators for DenseMap
Summary:
This patch is an attempt at making `DenseMapIterator`s "fail-fast".
Fail-fast iterators that have been invalidated due to insertion into
the host `DenseMap` deterministically trip an assert (in debug mode)
on access, instead of non-deterministically hitting memory corruption
issues.
Enabling fail-fast iterators breaks the LLVM C++ ABI, so they are
predicated on `LLVM_ENABLE_ABI_BREAKING_CHECKS`.
`LLVM_ENABLE_ABI_BREAKING_CHECKS` by default flips with
`LLVM_ENABLE_ASSERTS`, but can be clamped to ON or OFF using the CMake /
autoconf build system.
Reapply "Linker: Drop function pointers for overridden subprograms"
This reverts commit r233254, effectively reapplying r233164 (and its
successors), with an additional testcase for when subprograms match
exactly. This fixes PR22792 (again).
I'm using the same approach, but I've moved up the call to
`stripReplacedSubprograms()`. The function pointers need to be dropped
before mapping any metadata from the source module, or else this can
drop the function from new subprograms that have merged (via Metadata
uniquing) with the old ones. Dropping the pointers first prevents them
from merging.
**** The original commit message follows. ****
Linker: Drop function pointers for overridden subprograms
Instead of dropping subprograms that have been overridden, just set
their function pointers to `nullptr`. This is a minor adjustment to the
stop-gap fix for PR21910 committed in r224487, and fixes the crasher
from PR22792.
The problem that r224487 put a band-aid on: how do we find the canonical
subprogram for a `Function`? Since the backend currently relies on
`DebugInfoFinder` (which does a naive in-order traversal of compile
units and picks the first subprogram) for this, r224487 tried dropping
non-canonical subprograms.
Dropping subprograms fails because the backend *also* builds up a map
from subprogram to compile unit (`DwarfDebug::SPMap`) based on the
subprogram lists. A missing subprogram causes segfaults later when an
inlined reference (such as in this testcase) is created.
Instead, just drop the `Function` pointer to `nullptr`, which nicely
mirrors what happens when an already-inlined `Function` is optimized
out. We can't really be sure that it's the same definition anyway, as
the testcase demonstrates.
This still isn't completely satisfactory. Two flaws at least that I can
think of:
- I still haven't found a straightforward way to make this symmetric
in the IR. (Interestingly, the DWARF output is already symmetric,
and I've tested for that to be sure we don't regress.)
- Using `DebugInfoFinder` to find the canonical subprogram for a
function is kind of crazy. We should just attach metadata to the
function, like this:
[AArch64, ARM] Add v8.1a architecture and generic cpu
New architecture and cpu added, following http://community.arm.com/groups/processors/blog/2014/12/02/the-armv8-a-architecture-and-its-ongoing-development
Aaron Ballman [Thu, 26 Mar 2015 16:24:38 +0000 (16:24 +0000)]
Sometimes report_fatal_error is called when there is not a handler function used to fail gracefully. In that case, RunInterruptHandlers is called, which attempts to enter a critical section object. Ensure that the critical section is properly initialized so that this code functions properly, and tools like clang-tidy do not crash in Debug builds.
Revert "Linker: Drop function pointers for overridden subprograms"
This reverts commit r233164 and its testcase follow-ups in r233165,
r233207, r233214, and r233221. It apparently unleashed an LTO bootstrap
failure, at least on Darwin:
Quentin Colombet [Thu, 26 Mar 2015 01:01:48 +0000 (01:01 +0000)]
[RegisterCoalescer] Add a rule to consider more profitable copies first when
those are in the same basic block.
The previous approach was the topological order of the basic block.
Eric Christopher [Thu, 26 Mar 2015 00:50:23 +0000 (00:50 +0000)]
Add computeFSAdditions to the function based subtarget creation
for PPC due to some unfortunate default setting via TargetMachine
creation. I've added a FIXME on how this can be unraveled in the
backend and a test to make sure we successfully legalize 64-bit things
if we say we're 64-bits.
Otherwise, broken input modules can cause assertions. I've updated two
of the testcases that started failing (modules that had `Require` flags
but didn't meet their own requirements), but Rafael and I decided that
test/Linker/2011-08-22-ResolveAlias.ll should just be deleted outright
-- it's a leftover of the way llvm-gcc used to implement weakref.
Simon Pilgrim [Wed, 25 Mar 2015 22:30:31 +0000 (22:30 +0000)]
[DAGCombiner] Add support for TRUNCATE + FP_EXTEND vector constant folding
This patch adds supports for the vector constant folding of TRUNCATE and FP_EXTEND instructions and tidies up the SINT_TO_FP and UINT_TO_FP instructions to match.
It also moves the vector constant folding for the FNEG and FABS instructions to use the DAG.getNode() functionality like the other unary instructions.
Linker: Stop using -gmlt test/Linker/subprogram-linkonce-weak.ll
As dblaikie pointed out, if I stop setting `emissionKind: 2` then the
backend won't do magical things on Linux vs. Darwin. I had wrongly
assumed that there were stricter requirements on the input if we weren't
in line-tables-only mode, but apparently not.
With that knowledge, clean up this testcase a little more.
- Set `emissionKind: 1`.
- Add back checks for the weak version of @foo.
- Check more robustly that we have the right subprograms by checking
the `DW_AT_decl_file` and `DW_AT_decl_line` which now show up.
- Check the line table in isolation (since it's no longer doubling as
an indirect test for the subprogram of the weak version of @foo).
Matthias Braun [Wed, 25 Mar 2015 21:18:24 +0000 (21:18 +0000)]
RegisterCoalescer: Fix implicit def handling in register coalescer
If liveranges induced by an IMPLICIT_DEF get completely covered by a
proper liverange the IMPLICIT_DEF instructions and its corresponding
definitions have to be removed from the live ranges. This has to happen
in the subregister live ranges as well (I didn't see this case earlier
because in most programs only some subregisters are covered and the
IMPLCIT_DEF won't get removed).
No testcase, I spent hours trying to create one for one of the public
targets, but ultimately failed because I couldn't manage to properly
control the placement of COPY and IMPLICIT_DEF instructions from an .ll
file.
Reid Kleckner [Wed, 25 Mar 2015 20:10:36 +0000 (20:10 +0000)]
WinEH: Create an unwind help alloca for __CxxFrameHandler3 xdata tables
We don't have any logic to emit those tables yet, so the sdag lowering
of this intrinsic is just a stub. We can see the intrinsic in the
prepared IR, though.
Rewrite the checks from r233164 that I temporarily disabled in r233165.
It turns out that the line-tables only debug info we emit from `llc` is
(intentionally) different on Linux than on Darwin. r218129 started
skipping emission of subprograms with no inlined subroutines, and
r218702 was a spiritual revert of that behaviour for Darwin.
I think we can still test this in a platform-neutral way.
- Stop checking for the possibly missing `DW_TAG_subprogram` defining
the debug info for the real version of `@foo`.
- Start checking the line tables, ensuring that the right debug info
was used to generate them (grabbing `DW_AT_low_pc` from the compile
unit).
- I changed up the line numbers used in the "weak" version so it's
easier to follow.
Kit Barton [Wed, 25 Mar 2015 19:36:23 +0000 (19:36 +0000)]
Add Hardware Transactional Memory (HTM) Support
This patch adds Hardware Transaction Memory (HTM) support supported by ISA 2.07
(POWER8). The intrinsic support is based on GCC one [1], but currently only the
'PowerPC HTM Low Level Built-in Function' are implemented.
The HTM instructions follows the RC ones and the transaction initiation result
is set on RC0 (with exception of tcheck). Currently approach is to create a
register copy from CR0 to GPR and comapring. Although this is suboptimal, since
the branch could be taken directly by comparing the CR0 value, it generates code
correctly on both test and branch and just return value. A possible future
optimization could be elimitate the MFCR instruction to branch directly.
The HTM usage requires a recently newer kernel with PPC HTM enabled. Tested on
powerpc64 and powerpc64le.
This is send along a clang patch to enabled the builtins and option switch.
DebugInfo: Permit DW_TAG_structure_type, DW_TAG_member, DW_TAG_typedef tags with empty file names.
Some languages, such as Go, have pre-defined structure types (e.g. "string"
is essentially a pointer/length pair) or pre-defined "typedef" types
(e.g. "error" is essentially a typedef for a specific interface type).
Such types do not have associated source location, so a Go frontend would
be correct not to associate a file name with such types.
This change relaxes the DIType verifier to permit unlocated types with
these tags.
Sanjay Patel [Wed, 25 Mar 2015 17:34:11 +0000 (17:34 +0000)]
use update_llc_test_checks.py to tighten checking in these tests
1. There were no CHECK-LABELs, so we could match instructions from the wrong function.
2. The use of zero operands meant multiple xor instructions could match some CHECKs.
3. The test was over-specified to need a Sandybridge CPU and Darwin triple.
Justin Bogner [Wed, 25 Mar 2015 08:07:47 +0000 (08:07 +0000)]
test: Fix the dependencies for the check-llvm-* targets
In r233009 we gained specific check-llvm-* build targets for invoking
specific parts of the test suite, but they were copying the
dependencies for check-all, rather than just listing the dependencies
for check-llvm.
This moves the creation of these targets next to the check-llvm
target, and uses that target's configuration rather than the check-all
config.
Craig Topper [Wed, 25 Mar 2015 04:16:50 +0000 (04:16 +0000)]
[X86] Remove GetCpuIDAndInfo, GetCpuIDAndInfoEx and DetectFamilyModel functions from X86 MC layer. They haven't been used since CPU autodetection was removed from X86Subtarget.cpp.
Linker: Temporarily disable dwarfdump checks from r233164
At least one Linux bot [1] doesn't like my dwarfdump checks, so I've
disable those until I can investigate what's going on there. I'll
continue to track this in PR22792.
Linker: Drop function pointers for overridden subprograms
Instead of dropping subprograms that have been overridden, just set
their function pointers to `nullptr`. This is a minor adjustment to the
stop-gap fix for PR21910 committed in r224487, and fixes the crasher
from PR22792.
The problem that r224487 put a band-aid on: how do we find the canonical
subprogram for a `Function`? Since the backend currently relies on
`DebugInfoFinder` (which does a naive in-order traversal of compile
units and picks the first subprogram) for this, r224487 tried dropping
non-canonical subprograms.
Dropping subprograms fails because the backend *also* builds up a map
from subprogram to compile unit (`DwarfDebug::SPMap`) based on the
subprogram lists. A missing subprogram causes segfaults later when an
inlined reference (such as in this testcase) is created.
Instead, just drop the `Function` pointer to `nullptr`, which nicely
mirrors what happens when an already-inlined `Function` is optimized
out. We can't really be sure that it's the same definition anyway, as
the testcase demonstrates.
This still isn't completely satisfactory. Two flaws at least that I can
think of:
- I still haven't found a straightforward way to make this symmetric
in the IR. (Interestingly, the DWARF output is already symmetric,
and I've tested for that to be sure we don't regress.)
- Using `DebugInfoFinder` to find the canonical subprogram for a
function is kind of crazy. We should just attach metadata to the
function, like this:
Paul Robinson [Wed, 25 Mar 2015 00:10:24 +0000 (00:10 +0000)]
'optnone' should not disable DAG combiner.
Reverts the code change from r221168 and the relevant test.
It was a mistake to disable the combiner, and based on the ultimate
definition of 'optnone' we shouldn't have considered the test case
as failing in the first place.
Philip Reames [Tue, 24 Mar 2015 23:54:54 +0000 (23:54 +0000)]
!invariant.load semantics with potentially clobbering calls
A load from an invariant location is assumed to not alias any otherwise potentially aliasing stores. Our implementation only applied this rule to store instructions themselves whereas they it should apply for any memory accessing instruction. This results in both FRE and PRE becoming more effective at eliminating invariant loads.
Note that as a follow on change I will likely move this into AliasAnalysis itself. That's where the TBAA constant flag is handled and the semantics are essentially the same. I'd like to separate the semantic change from the refactoring and thus have extended the hack that's already in MemoryDependenceAnalysis for this change.
Reid Kleckner [Tue, 24 Mar 2015 23:46:01 +0000 (23:46 +0000)]
X86: Fix frameescape when not using an FP
We can't use TargetFrameLowering::getFrameIndexOffset directly, because
Win64 really wants the offset from the stack pointer at the end of the
prologue. Instead, use X86FrameLowering::getFrameIndexOffsetFromSP(),
which is a pretty close approximiation of that. It fails to handle cases
with interestingly large stack alignments, which is pretty uncommon on
Win64 and is TODO.
Justin Bogner [Tue, 24 Mar 2015 23:34:36 +0000 (23:34 +0000)]
llvm-cov: Require a subcommand when invoked as llvm-cov
A while ago llvm-cov gained support for clang's instrumentation based
profiling in addition to its gcov support, and subcommands were added
to choose which behaviour to use. When no subcommand was specified, we
fell back to gcov compatibility with a warning that a subcommand would
be required in the future. Now, we require the subcommand.
Note that if the basename of llvm-cov is gcov (via symlink or
hardlink, for example), we still use the gcov compatible behaviour
with no subcommand required.
David Blaikie [Tue, 24 Mar 2015 23:34:31 +0000 (23:34 +0000)]
Opaque Pointer Types: GEP API migrations to specify the gep type explicitly
The changes to InstCombine (& SCEV) do seem a bit silly - it doesn't make
anything obviously better to have the caller access the pointers element
type (the thing I'm trying to remove) than the GEP itself, but it's a
helpful migration step. This will allow me to more obviously lock down
GEP (& Load, etc) API usage, then fix all the code that accesses pointer
element types except the places that need to be removed (most of the
InstCombines) anyway - at which point I'll need to just remove all that
code because it won't be meaningful anymore (there will be no pointer
types, so no bitcasts to combine)
SCEV looks like it'll need some restructuring - we'll have to do a bit
more work for GEP canonicalization, since it'll depend on how it's used
if we can even manage to canonicalize it to a non-ugly GEP. I guess we
can do some fun stuff like voting (do 2 out of 3 load from the GEP with
a certain type that gives a pretty GEP? Does every typed use of the GEP
use either a specific type or a generic type (i8*, etc)?)
Frederic Riss [Tue, 24 Mar 2015 23:11:07 +0000 (23:11 +0000)]
[dsymutil] Temporarily disable some tests on windows.
It seems one windows bot fails since I added ilne table linking to
llvm-dsymutil (see r232333 commit thread).
Disable the affected tests until I can figure out what's happening.
Sanjay Patel [Tue, 24 Mar 2015 22:39:29 +0000 (22:39 +0000)]
optimize the AVX2 (integer) version of vperm2 into a shuffle
...because this is what happens when an instruction
set puts its underwear on after its pants.
This is an extension of r232852, r233100, and 233110:
http://llvm.org/viewvc/llvm-project?view=revision&revision=232852
http://llvm.org/viewvc/llvm-project?view=revision&revision=233100
http://llvm.org/viewvc/llvm-project?view=revision&revision=233110