Alex Lorenz [Tue, 30 Jun 2015 17:55:00 +0000 (17:55 +0000)]
MIR Parser: refactor error reporting for machine instruction parser errors. NFC.
This commit extracts the code that reports an error that's produced by the
machine instruction parser into a new method that can be reused in other places.
Alex Lorenz [Tue, 30 Jun 2015 17:47:50 +0000 (17:47 +0000)]
MIR Parser: make the machine instruction parsing interface more consistent. NFC.
This commit refactors the interface for machine instruction parser. It adopts
the pattern of returning a bool and passing in the result in the first argument
that is used by the other parsing methods for the the method 'parse' and the
function 'parseMachineInstr'.
Dan Liew [Tue, 30 Jun 2015 17:16:39 +0000 (17:16 +0000)]
[CMake] Make the CMake files (LLVMConfig.cmake and LLVMExports.cmake)
generated by the Autoconf/Makefile build system relocatable.
Previously the generated CMake files contained hardcoded paths which
prevented a binary installation from being relocated to a different
place in the file system. This problem was most noticeable in LLVM's
official binary releases which were completely unusable by a downstream
project trying to import the CMake targets.
Package maintainers who choose to modify the install location of the
CMake directory without using the ``PROJ_cmake`` Makefile variable
override will need to patch the generated``LLVMConfig.cmake`` so that
``LLVM_INSTALL_PREFIX`` and ``_LLVM_CMAKE_DIR`` variables are set
correctly.
Alex Lorenz [Tue, 30 Jun 2015 16:51:29 +0000 (16:51 +0000)]
MIR Parser: adopt the 'maybeLex...' pattern. NFC.
This commit refactors the machine instruction lexer so that the lexing
functions use the 'maybeLex...' pattern, where they determine if they
can lex the current token by themselves.
Duplicating an FP register "as itself" is a bad idea, since it violates the
invariant that every FP register is mapped to at most one FPU stack slot.
Use the scratch FP register instead.
Toma Tabacu [Tue, 30 Jun 2015 13:46:03 +0000 (13:46 +0000)]
[mips] [IAS] Add support for the .module softfloat/hardfloat directives.
These directives are used to set the default value of the SoftFloat feature.
They have the same effect as setting -m{soft, hard}-float from the command line.
Ranjeet Singh [Tue, 30 Jun 2015 11:30:42 +0000 (11:30 +0000)]
There are a few places where subtarget features are still
represented by uint64_t, this patch replaces these
usages with the FeatureBitset (std::bitset) type.
Matthias Braun [Tue, 30 Jun 2015 00:33:44 +0000 (00:33 +0000)]
RegisterCoalescer: Cleanup empty subranges after shrinkToUses()
A call to removeEmptySubranges() is necessary after every operation that
potentially removes all segments from a subregister range; this case in
the register coalescer was missing.
Rui Ueyama [Tue, 30 Jun 2015 00:03:56 +0000 (00:03 +0000)]
Object/COFF: Define coff_symbol_generic.
If you only need Name and Value fields in the COFF symbol,
you don't need to distinguish 32 bit and 64 bit COFF symbols.
These fields start at the same offsets and have the same size.
This data strucutre is one pointer smaller than COFFSymbolRef
thus slightly efficient. I'll use this class in LLD as we create
millions of LLD symbol objects that currently contain COFFSymbolRef.
Shaving off 8 byte (or 4 byte on 32 bit) from that class actually
matters becasue of the number of objects we create in LLD.
Rafael Espindola [Mon, 29 Jun 2015 23:55:05 +0000 (23:55 +0000)]
Use asserts for checks that should never fail.
If a section is not SHT_REL or SHT_RELA, we never create a valid iterator,
so the getRelocation* methods should always see a section with the correct type.
lto: Clean up C libLTO interfaces pertaining to linker flags.
Specifically, remove the dependent library interface and replace the existing
linker option interface with a new one that returns a single list of flags.
Adrian Prantl [Mon, 29 Jun 2015 23:03:47 +0000 (23:03 +0000)]
Add a DIModule metadata node to the IR.
It is meant to be used to record modules @imported by the current
compile unit, so a debugger an import the same modules to replicate this
environment before dropping into the expression evaluator.
DIModule is a sibling to DINamespace and behaves quite similarly.
In addition to the name of the module it also records the module
configuration details that are necessary to uniquely identify the module.
This includes the configuration macros (e.g., -DNDEBUG), the include path
where the module.map file is to be found, and the isysroot.
The idea is that the backend will turn this into a DW_TAG_module.
Add all the new `Metadata` codes since LLVM 3.6, and at the same time
follow the precedent set in other blocks by removing the `METADATA_`
prefix from the string output.
Ben Langmuir [Mon, 29 Jun 2015 22:16:39 +0000 (22:16 +0000)]
Reapply "Use gethostuuid() on Mac to identify hosts for LockFileManager"
Reapplies r241005 after fixing the build on non-Mac platforms. Original
commit message below.
The hostname can be very unstable when there are many machines on the
network competing for the same name. Using the hardware UUID makes it
less likely to have collisions or to consider files written by the
current host to be owned by a different one at a later time.
Teach LTOModule to emit linker flags for dllexported symbols, plus interface cleanup.
This change unifies how LTOModule and the backend obtain linker flags
for globals: via a new TargetLoweringObjectFile member function named
emitLinkerFlagsForGlobal. A new function LTOModule::getLinkerOpts() returns
the list of linker flags as a single concatenated string.
This change affects the C libLTO API: the function lto_module_get_*deplibs now
exposes an empty list, and lto_module_get_*linkeropts exposes a single element
which combines the contents of all observed flags. libLTO should never have
tried to parse the linker flags; it is the linker's job to do so. Because
linkers will need to be able to parse flags in regular object files, it
makes little sense for libLTO to have a redundant mechanism for doing so.
The new API is compatible with the old one. It is valid for a user to specify
multiple linker flags in a single pragma directive like this:
The previous implementation would not have exposed
either flag via lto_module_get_*deplibs (as the test in
TargetLoweringObjectFileCOFF::getDepLibFromLinkerOpt was case sensitive)
and would have exposed "/defaultlib:foo /defaultlib:bar" as a single flag via
lto_module_get_*linkeropts. This may have been a bug in the implementation,
but it does give us a chance to fix the interface.
Ben Langmuir [Mon, 29 Jun 2015 21:47:44 +0000 (21:47 +0000)]
Use gethostuuid() on Mac to identify hosts for LockFileManager
The hostname can be very unstable when there are many machines on the
network competing for the same name. Using the hardware UUID makes it
less likely to have collisions or to consider files written by the
current host to be owned by a different one at a later time.
This is a new version of http://reviews.llvm.org/D10260.
It turned out that when you specify an integer register in inline asm on
x86 you get the register of the required type size back. That means that
X86TargetLowering::getRegForInlineAsmConstraint() has to accept any of
the integer registers and adapt its size to the given target size which
may be any 8/16/32/64 bit sized type. Surprisingly that means given a
constraint of "{ax}" and a type of MVT::F32 we need to return X86::EAX.
This change makes this face explicit, the previous code seemed like
working by accident because there it never returned an error once a
register was found. On the other hand this rewrite allows to actually
return errors for invalid situations like requesting an integer register
for an i128 type.
Alexey Samsonov [Mon, 29 Jun 2015 21:30:14 +0000 (21:30 +0000)]
[LoopSimplify] Set proper debug location in loop backedge blocks.
Set debug location for terminator instruction in loop backedge block
(which is an unconditional jump to loop header). We can't copy debug
location from original backedges, as there can be several of them,
with different debug info locations. So, we follow the approach of
SplitBlockPredecessors, and copy the debug info from first non-PHI
instruction in the header (i.e. destination block).
Dan Liew [Mon, 29 Jun 2015 18:45:56 +0000 (18:45 +0000)]
Fix bug #23967. The gtest and gtest_main targets were exported into the
CMake files and should not be by both build systems and also the targets
were also installed by the CMake build system which they should not be.
The problem was that
- the CMake build of LLVM installs and exports the gtest library
targets. We should not being doing this, these are not part of LLVM.
- the Autoconf/Makefile build of LLVM still had gtest libraries in the
installed LLVMConfig.cmake.
These problems would cause problems for an external project because when
calling llvm_map_components_to_libnames(XXX all) ${XXX} would to contain
LLVM's internal gtest libraries.
Avoid listing inclusions (like `!projects/LLVMBuild.txt`) for files
directly underneath `projects/` in `.gitignore`. Instead, change the
`projects/*` exclusion to the more specific `projects/*/`.
Ben Langmuir [Mon, 29 Jun 2015 17:08:41 +0000 (17:08 +0000)]
Clean up unique lock files on signal and always release the lock
Make sure to remove the unique lock file, which is what the .lock
symlink points to, if there is a signal while the lock is held. This
will release the lock, since the symlink will point to nothing (already
tested in unit tests). For good measure, also clean up the unique lock
file if there is an error or signal before the lock is acquired.
Alex Lorenz [Mon, 29 Jun 2015 16:57:06 +0000 (16:57 +0000)]
MIR Serialization: Serialize the register mask machine operands.
This commit implements serialization of the register mask machine
operands. This commit serializes only the call preserved register
masks that are defined by a target, it doesn't serialize arbitrary
register masks.
This commit also extends the TargetRegisterInfo class and TableGen so that
the users of TRI can get the list of all the call preserved register masks and
their names.
Tobias Grosser [Mon, 29 Jun 2015 14:42:48 +0000 (14:42 +0000)]
Move delinearization from SCEVAddRecExpr to ScalarEvolution
The expressions we delinearize do not necessarily have to have a SCEVAddRecExpr
at the outermost level. At this moment, the additional flexibility is not
exploited in LLVM itself, but in Polly we will soon soonish use this
functionality. For LLVM, this change should not affect existing functionality
(which is covered by test/Analysis/Delinearization/)
Rafael Espindola [Mon, 29 Jun 2015 12:38:31 +0000 (12:38 +0000)]
Remove Elf_Sym_Iter.
It was a fairly broken concept for an ELF only class.
An ELF file can have two symbol tables, but they have exactly the same
format. There is no concept of a dynamic or a static symbol. Storing this
on the iterator also makes us do more work per symbol than necessary. To fetch
a name we would:
* Find if we had a static or a dynamic symbol.
* Look at the corresponding symbol table and find the string table section.
* Look at the string table section to fetch its contents.
* Compute the name as a substring of the string table.
All but the last step can be done per symbol table instead of per symbol. This
is a step in that direction.
Javed Absar [Mon, 29 Jun 2015 09:32:29 +0000 (09:32 +0000)]
[ARM]: Extend -mfpu options for half-precision and vfpv3xd
Some of the the permissible ARM -mfpu options, which are supported in GCC,
are currently not present in llvm/clang.This patch adds the options:
'neon-fp16', 'vfpv3-fp16', 'vfpv3-d16-fp16', 'vfpv3xd' and 'vfpv3xd-fp16.
These are related to half-precision floating-point and single precision.
Benjamin Kramer [Sat, 27 Jun 2015 20:33:26 +0000 (20:33 +0000)]
[SDAG] Now that we have a way to communicate the exact bit on sdiv use it to simplify sdiv by a constant.
We had a hack in SDAGBuilder in place to work around this but now we
can avoid that. Call BuildExactSDIV from BuildSDIV so DAGCombiner can
perform this trick automatically.
The added check in DAGCombiner is necessary to prevent exact sdiv by pow2
from regressing as the target-specific pow2 lowering is not aware of
exact bits yet.
This is mostly covered by existing tests. One side effect is that we
get the better lowering for exact vector sdivs now too :)
Adrian Prantl [Sat, 27 Jun 2015 20:12:43 +0000 (20:12 +0000)]
Debug Info: One more bitfield bugfix. While yesterday's r240853 fixed
the DW_AT_bit_offset computation, the byte offset is in fact also
endian-dependent as it needs to point to the storage unit containing the
most-significant bit of the the bitfield.
I'm so looking forward to emitting the endian-agnostic DWARF 3 version
instead.
Peter Zotov [Sat, 27 Jun 2015 14:32:30 +0000 (14:32 +0000)]
[OCaml] Bump ctypes dependency to 0.4.
ctypes 0.3 and earlier contains an interface-definig bug:
its ptr_of_raw_address accepts Int64 and not Nativeint. ctypes 0.4
was not released during the 3.6 cycle, and because of that, LLVM 3.6
was released with ctypes 0.3 as a dependency, which now breaks
the build on modern ctypes.
David Majnemer [Sat, 27 Jun 2015 08:38:17 +0000 (08:38 +0000)]
[LoopVectorize] Pointer indicies may be wider than the pointer
If we are dealing with a pointer induction variable, isInductionPHI
gives back a step value of Stride / size of pointer. However, we might
be indexing with a legal type wider than the pointer width.
Handle this by inserting casts where appropriate instead of crashing.
David Majnemer [Sat, 27 Jun 2015 07:52:53 +0000 (07:52 +0000)]
[PruneEH] A naked, noinline function can return via InlineAsm
The PruneEH pass tries to annotate functions as 'noreturn' if it doesn't
see a ReturnInst. However, a naked function containing inline assembly
can contain control flow leaving the function.
Lang Hames [Sat, 27 Jun 2015 03:49:25 +0000 (03:49 +0000)]
[Stackmap] Pre-assemble the stackmap parser test case. (Fix builders).
This case had been failing on testers that didn't have x86 support. Rather
than XFAIL it on testers without x86 support, I've just assembled it and used
the raw object as the test input.
Petr Hosek [Sat, 27 Jun 2015 01:54:17 +0000 (01:54 +0000)]
[MC] Ensure that pending labels are flushed when -mc-relax-all flag is used
Summary:
The current implementation doesn't always flush all pending labels
beforeemitting data which can result in an incorrectly placed labels in
case when when instruction bundling is enabled and -mc-relax-all flag is
being used. To address this issue, we always flush pending labels before
emitting data.
The change was tested by running PNaCl toolchain trybots with
-mc-relax-all flag set.
Petr Hosek [Sat, 27 Jun 2015 01:49:53 +0000 (01:49 +0000)]
[MC] Align fragments when -mc-relax-all flag is used
Summary:
Ensure that fragments are bundle aligned when instruction bundling
is enabled and the -mc-relax-all flag is set. This is implicitly
assumed by the bundle padding implementation but this assumption
does not hold when custom alignment is being used.
The change was tested by running PNaCl toolchain trybots with
-mc-relax-all flag set.
AsmPrinter: Document why DIEValueList uses a linked-list, NFC
There are two main reasons why a linked-list makes sense for
`DIEValueList`.
1. We want `DIE` to be on a `BumpPtrAllocator` to improve teardown
efficiency. Making `DIEValueList` array-based would make that much
more complicated.
2. The singly-linked list is fairly memory efficient. The histogram
[1] shows that most DIEs have relatively few values, so we often pay
less than the 2/3-pointer static overhead of a vector. Furthermore,
we don't know ahead of time exactly how many values a `DIE` needs,
so a vector-like scheme will on average over-allocate by ~50%. As
it happens, that's the same memory overhead as the linked list node.
Allow callers of `Value::print()` and `Metadata::print()` to pass in a
`ModuleSlotTracker`. This allows them to pay only once for calculating
module-level slots (such as Metadata).
This is related to PR23865, where there was a huge cost for
`MachineFunction::print()`. Although I don't have a *particular* user
in mind for this new code, I have hit big slowdowns before when running
`opt -debug`, and I think this will be useful. Going forward, if
someone hits a big slowdown with `print()` statements, they can create a
`ModuleSlotTracker` and send it through. Similarly, adding support to
`Value::dump()` and `Metadata::dump()` should be trivial.
I added unit tests to be sure the `print()` functions actually behave
the same way with and without the slot tracker.
LowerBitSets: Ignore bitset entries that do not directly refer to a global.
It is possible for a global to be substituted with another global of a
different type or a different kind (i.e. an alias) at IR link time. One
example of this scenario is when a Microsoft ABI vtable is substituted with
an alias referring to a larger vtable containing an RTTI reference.
This will cause the global to be RAUW'd with a possibly bitcasted reference
to the other global. This will of course also affect any references to the
global in bitset metadata.
The right way to handle such metadata is simply to ignore it. This is sound
because the linked module should contain another copy of the bitset entries as
applied to the new global.
Lang Hames [Fri, 26 Jun 2015 23:56:53 +0000 (23:56 +0000)]
[StackMaps] Add a lightweight parser for stackmap version 1 sections.
The parser provides a convenient interface for reading llvm stackmap v1 sections
in object files.
This patch also includes a new option for llvm-readobj, '-stackmap', which uses
the parser to pretty-print stackmap sections for debugging/testing purposes.
CodeGen: Create a proper ModuleSlotTracker for MachineInstr
Another follow-up related to r240848: try a little harder to share slot
tracking calculations within a single `MachineInstr` dump. This is
unrelated to `MachineFunction::print()`, since that should be passing
through the function's `ModuleSlotTracker` by now, but could affect the
speed of dumping from a debugger if there is more than one IR-level
operand.
Alex Lorenz [Fri, 26 Jun 2015 22:56:48 +0000 (22:56 +0000)]
MIR Serialization: Serialize global address machine operands.
This commit serializes the global address machine operands.
This commit doesn't serialize the operand's offset and target
flags, it serializes only the global value reference.
Philip Reames [Fri, 26 Jun 2015 22:47:37 +0000 (22:47 +0000)]
[RewriteStatepointsForGC] Generalized vector phi/select handling for base pointers
This change extends the detection of base pointers for vector constructs to handle arbitrary phi and select nodes. The existing non-vector code already handles those, so this is basically just extending the vector special case to be less special cased. It still isn't generalized vector handling since we can't handle arbitrary vector instructions (e.g. shufflevectors), but it's a lot closer.
The general structure of the change is as follows:
* Extend the base defining value relation over a subset of vector instructions and vector typed phi & select instructions.
* Move scalarization from before base pointer rewriting to after base pointer rewriting. The extension of the BDV relation is sufficient to find vector base phis for vector inputs.
* Preserve the existing special case logic for when the base of a vector element is locally obvious. This general idea could be extended to the scalar case as well.
CodeGen: Push the ModuleSlotTracker through MachineOperands
Push `ModuleSlotTracker` through `MachineOperand`s, dropping the time
for `llc -print-machineinstrs` on the testcase in PR23865 from ~13
seconds to ~9 seconds. Now `SlotTracker::processFunctionMetadata()`
accounts for only 8% of the runtime, which seems reasonable.
CodeGen: Use a single SlotTracker in MachineFunction::print()
Expose enough of the IR-level `SlotTracker` so that
`MachineFunction::print()` can use a single one for printing
`BasicBlock`s. Next step would be to lift this through a few more APIs
so that we can make other print methods faster.
Fixes PR23865, changing the runtime of `llc -print-machineinstrs` from
many minutes (killed after 3 minutes, but it wasn't very close) to
13 seconds for a 502185 line dump.