Alex Lorenz [Mon, 15 Jun 2015 23:52:35 +0000 (23:52 +0000)]
MIR Serialization: move the MIR printer out of the MIR printing pass.
This commit decouples the MIR printer and the MIR printing pass so
that it will be possible to move the MIR printer into a separate
machine IR library later on.
Alex Lorenz [Mon, 15 Jun 2015 23:07:38 +0000 (23:07 +0000)]
MIR Serialization: Create dummy functions when the MIR file doesn't have LLVM IR.
This commit creates a dummy LLVM IR function with one basic block and an unreachable
instruction for each parsed machine function when the MIR file doesn't have LLVM IR.
This change is required as the machine function analysis pass creates machine
functions only for the functions that are defined in the current LLVM module.
Alex Lorenz [Mon, 15 Jun 2015 22:23:23 +0000 (22:23 +0000)]
MIR Serialization: Report an error when machine functions have the same name.
This commit reports an error when the MIR parser encounters a machine
function with the name that is the same as the name of a different
machine function.
Add safestack attribute to LLVMAttribute enum and Go bindings. Correct
constants in commented-out part of LLVMAttribute enum. Add tests that verify
that the safestack attribute is only allowed as a function attribute.
Colin LeMahieu [Mon, 15 Jun 2015 21:52:13 +0000 (21:52 +0000)]
[Hexagon] PC-relative offsets are relative to packet start rather than the offset of the relocation. Set relocation addend and check it's correct in the ELF.
Protection against stack-based memory corruption errors using SafeStack
This patch adds the safe stack instrumentation pass to LLVM, which separates
the program stack into a safe stack, which stores return addresses, register
spills, and local variables that are statically verified to be accessed
in a safe way, and the unsafe stack, which stores everything else. Such
separation makes it much harder for an attacker to corrupt objects on the
safe stack, including function pointers stored in spilled registers and
return addresses. You can find more information about the safe stack, as
well as other parts of or control-flow hijack protection technique in our
OSDI paper on code-pointer integrity (http://dslab.epfl.ch/pubs/cpi.pdf)
and our project website (http://levee.epfl.ch).
The overhead of our implementation of the safe stack is very close to zero
(0.01% on the Phoronix benchmarks). This is lower than the overhead of
stack cookies, which are supported by LLVM and are commonly used today,
yet the security guarantees of the safe stack are strictly stronger than
stack cookies. In some cases, the safe stack improves performance due to
better cache locality.
Our current implementation of the safe stack is stable and robust, we
used it to recompile multiple projects on Linux including Chromium, and
we also recompiled the entire FreeBSD user-space system and more than 100
packages. We ran unit tests on the FreeBSD system and many of the packages
and observed no errors caused by the safe stack. The safe stack is also fully
binary compatible with non-instrumented code and can be applied to parts of
a program selectively.
This patch is our implementation of the safe stack on top of LLVM. The
patches make the following changes:
- Add the safestack function attribute, similar to the ssp, sspstrong and
sspreq attributes.
- Add the SafeStack instrumentation pass that applies the safe stack to all
functions that have the safestack attribute. This pass moves all unsafe local
variables to the unsafe stack with a separate stack pointer, whereas all
safe variables remain on the regular stack that is managed by LLVM as usual.
- Invoke the pass as the last stage before code generation (at the same time
the existing cookie-based stack protector pass is invoked).
- Add unit tests for the safe stack.
Original patch by Volodymyr Kuznetsov and others at the Dependable Systems
Lab at EPFL; updates and upstreaming by myself.
Alex Lorenz [Mon, 15 Jun 2015 20:30:22 +0000 (20:30 +0000)]
MIR Serialization: Connect the machine function analysis pass to the MIR parser.
This commit connects the machine function analysis pass (which creates machine
functions) to the MIR parser, which will initialize the machine functions
with the state from the MIR file and reconstruct the machine IR.
This commit introduces a new interface called 'MachineFunctionInitializer',
which can be used to provide custom initialization for the machine functions.
This commit also introduces a new diagnostic class called
'DiagnosticInfoMIRParser' which is used for MIR parsing errors.
This commit modifies the default diagnostic handling in LLVMContext - now the
the diagnostics are printed directly into llvm::errs() so that the MIR parsing
errors can be printed with colours.
Colin LeMahieu [Mon, 15 Jun 2015 19:05:35 +0000 (19:05 +0000)]
[Hexagon] Moving pass declarations out of header and in to implementation files. Removing unused function getSubtargetInfo from HexagonMCCodeEmitter.cpp Removing deletion of copy construction and assignment operator since parent already deletes it.
Sanjoy Das [Mon, 15 Jun 2015 18:44:27 +0000 (18:44 +0000)]
[CodeGen] Add a pass to fold null checks into nearby memory operations.
Summary:
This change adds an "ImplicitNullChecks" target dependent pass. This
pass folds null checks into memory operation using the FAULTING_LOAD
pseudo-op introduced in previous patches.
Depends on D10197
Depends on D10199
Depends on D10200
Sanjoy Das [Mon, 15 Jun 2015 18:44:21 +0000 (18:44 +0000)]
[TargetInstrInfo] Add new hook: AnalyzeBranchPredicate.
Summary:
NFC: no one uses AnalyzeBranchPredicate yet.
Add TargetInstrInfo::AnalyzeBranchPredicate and implement for x86. A
later change adding support for page-fault based implicit null checks
depends on this.
Sanjoy Das [Mon, 15 Jun 2015 18:44:14 +0000 (18:44 +0000)]
[TargetInstrInfo] Rename getLdStBaseRegImmOfs and implement for x86.
Summary:
TargetInstrInfo::getLdStBaseRegImmOfs to
TargetInstrInfo::getMemOpBaseRegImmOfs and implement for x86. The
implementation only handles a few easy cases now and will be made more
sophisticated in the future.
This is NFCI: the only user of `getLdStBaseRegImmOfs` (now
`getmemOpBaseRegImmOfs`) is `LoadClusterMotion` and `LoadClusterMotion`
is disabled for x86.
Sanjoy Das [Mon, 15 Jun 2015 18:44:08 +0000 (18:44 +0000)]
[CodeGen] Introduce a FAULTING_LOAD_OP pseudo-op.
Summary:
This instruction encodes a loading operation that may fault, and a label
to branch to if the load page-faults. The locations of potentially
faulting loads and their "handler" destinations are recorded in a
FaultMap section, meant to be consumed by LLVM's clients.
Nothing generates FAULTING_LOAD_OP instructions yet, but they will be
used in a future change.
The documentation (FaultMaps.rst) needs improvement and I will update
this diff with a more expanded version shortly.
LLVM targeting aarch64 doesn't correctly produce aligned accesses for non-aligned
data at -O0/fast-isel (-mno-unaligned-access).
The root cause seems to be in fast-isel not producing unaligned access correctly
for -mno-unaligned-access.
The patch just aborts fast-isel for loads and stores when -mno-unaligned-access is
present.
The regression test is updated to check this new test case (-mno-unaligned-access
together with fast-isel).
Daniel Sanders [Mon, 15 Jun 2015 09:19:41 +0000 (09:19 +0000)]
Replace string GNU Triples with llvm::Triple in InitMCObjectFileInfo. NFC.
Summary:
This affects other tools so the previous C++ API has been retained as a
deprecated function for the moment. Clang has been updated with a trivial
patch (not covered by the pre-commit review) to avoid breaking -Werror builds.
Other in-tree tools will be fixed with similar trivial patches.
This continues the patch series to eliminate StringRef forms of GNU triples
from the internals of LLVM that began in r239036.
This patch fixes a compilation time issue, when MachineSink faces PHIs
with a huge number of operands. This can happen for example in goto table
based interpreters, where some basic blocks can have several of those PHIs,
each one with several hundreds operands. MachineSink was spending a
significant time re-building and re-sorting the list of successors of
the current MachineBasicBlock. The computing and sorting of the current
MachineBasicBlock successors is now cached.
Hao Liu [Mon, 15 Jun 2015 01:35:49 +0000 (01:35 +0000)]
[AArch64] Match interleaved memory accesses into ldN/stN instructions.
Re-commit after adding "-aarch64-neon-syntax=generic" to fix the failure on OS X.
This patch was firstly committed in r239514, then reverted in r239544 because of a syntax incompatible failure on OS X.
NAKAMURA Takumi [Sun, 14 Jun 2015 21:47:29 +0000 (21:47 +0000)]
[CMake] Try to fix r239612, not to miss resources/windows_version_resource.rc in clang build.
- Who defines ${LLVM_SOURCE_DIR} ?
- Would windows_version_resource.rc be available in an *installed* llvm tree?
I suggest it may be installed in ${PREFIX}/share.
Igor Breger [Sun, 14 Jun 2015 12:44:55 +0000 (12:44 +0000)]
AVX-512: Implemented cvtsi2ss/d cvtusi2ss/d instructions with round control for KNL.
Added intrinsics for cvtsi2ss/d instructions.
Added tests for intrinsics and encoding.
Rafael Espindola [Sat, 13 Jun 2015 17:23:04 +0000 (17:23 +0000)]
Don't use std::errc.
As noted on Errc.h:
// * std::errc is just marked with is_error_condition_enum. This means that
// common patters like AnErrorCode == errc::no_such_file_or_directory take
// 4 virtual calls instead of two comparisons.
And on some libstdc++ those virtual functions conclude that
Eric Fiselier [Sat, 13 Jun 2015 06:55:44 +0000 (06:55 +0000)]
[LIT] Fix failing LIT tests
Summary:
I spend some time trying to get the LIT test suite passing. Here are the changes that I needed to make on my machine.
I made the following changes for the following reasons.
1. google-test.py: The Google test format now checks for "[ PASSED ] 1 test." to check if a test passes.
2. discovery.py: The output appears in a different order on my machine than it did in the test.
3. unittest-adaptor.py: The output appears in a different order on my machine than it did in the test.
4. The classname is now formed differently in `getJUnitXML(...)`.
I'm not sure what is causing the output order to differ in discovery.py and unittest-adaptor.py. Does anybody have any thoughts?
Matthias Braun [Sat, 13 Jun 2015 03:42:16 +0000 (03:42 +0000)]
Rename TargetSubtargetInfo::enablePostMachineScheduler() to enablePostRAScheduler()
r213101 changed the behaviour of this method to not only affect the
PostMachineScheduler scheduler but also the PostRAScheduler scheduler,
renaming should make this fact clear. Also document that the preferred
way is to specify this in the scheduling model instead of overriding
this method.
Matt Wala [Fri, 12 Jun 2015 22:49:11 +0000 (22:49 +0000)]
[Scalarizer] Fix potential for stale data in Scattered across invocations
Summary:
Scalarizer has two data structures that hold information about changes
to the function, Gathered and Scattered. These are cleared in finish()
at the end of runOnFunction() if finish() detects any changes to the
function.
However, finish() was checking for changes by only checking if
Gathered was non-empty. The function visitStore() only modifies
Scattered without touching Gathered. As a result, Scattered could have
ended up having stale data if Scalarizer only scalarized store
instructions. Since the data in Scattered is used during the execution
of the pass, this introduced dangling pointer errors.
The fix is to check whether both Scattered and Gathered are empty
before deciding what to do in finish().
Lang Hames [Fri, 12 Jun 2015 21:31:15 +0000 (21:31 +0000)]
[Orc] Fix a bug in the CompileOnDemand layer where stub decls were not cloned
into partitions. Also, add an option to clone stub definitions (not just decls)
into partitions: these definitions could be inlined in some places to avoid the
overhead of calling via the stub.
Found by inspection - no test case yet, although I plan to add a unit test for
this once the CompileOnDemand layer refactoring settles down.
Douglas Katzman [Fri, 12 Jun 2015 18:31:38 +0000 (18:31 +0000)]
Add 'shave' processor name to Triple
Based on ArchType, Clang's driver can select a non-Clang compiler.
String parsing in Clang would have sufficed if it were only that,
however this change anticipates true llvm support.
David Blaikie [Fri, 12 Jun 2015 18:22:03 +0000 (18:22 +0000)]
Refix a use of explicit pointer types in GEP constant folding
In the glorious future of opaque pointer types, it won't be possible to
retrieve the pointee type of a pointer type which is what's being done
in this GEP loop - but the first iteration is always a pointer type and
the loop doesn't care about that case, except whether or not the index
is a constant.
So pull that special case out before the loop and start at the second
iteration (index 1) instead.
Originally committed in r236670 and reverted with a test case in
r239015. This change keeps the test case working while also avoiding
depending on pointee types.
Pete Cooper [Fri, 12 Jun 2015 17:48:18 +0000 (17:48 +0000)]
Move OperandList to be allocated prior to User for hung off subclasses.
For hung off uses, we need a Use* to tell use where the operands are.
This was User::OperandList but we want to remove that to save space
of all subclasses which aren't making use of 'hung off uses'.
Hung off uses now allocate their own 'OperandList' Use* in the
User::new which they call.
getOperandList() now uses the hung off uses bit to work out where the
Use* for the OperandList lives. If a User has hung off uses, then this
bit tells them to go back a single Use* from the User* and use that
value as the OperandList.
If a User has no hung off uses, then we get the first operand by
subtracting (NumOperands * sizeof(Use)) from the User this pointer.
This saves a pointer from User and all subclasses. Given the average
size of a subclass of User is 112 or 128 bytes, this saves around 7% of space
With malloc tending to align to 16-bytes the real saving is typically more like 3.5%.
On 'opt -O2 verify-uselistorder.lto.bc', peak memory usage prior to this change
is 149MB and after is 143MB so the savings are around 2.5% of peak.
Looking at some passes which allocate many Instructions and Values, parseIR drops
from 54.25MB to 52.21MB while the Inliner calls to Instruction::clone() drops
from 28.20MB to 27.05MB.
Pete Cooper [Fri, 12 Jun 2015 17:48:14 +0000 (17:48 +0000)]
Added a version of User::new for hung off uses.
There are now 2 versions of User::new. The first takes a size_t and is the current
implementation for subclasses which need 0 or more Use's allocated for their operands.
The new version takes no extra arguments to say that this subclass needs 'hung off uses'.
The HungOffUses bool is now set in this version of User::new and we can assert in
allocHungOffUses that we are allowed to have hung off uses.
This ensures we call the correct version of User::new for subclasses which need hung off uses.
A future commit will then allocate space for a single Use* which will be used
in place of User::OperandList once that field has been removed.
Pete Cooper [Fri, 12 Jun 2015 17:48:10 +0000 (17:48 +0000)]
Rename NumOperands to make it clear its managed by the User. NFC.
This is to try make it very clear that subclasses shouldn't be changing
the value directly. Now that OperandList for normal instructions is computed
using the NumOperands, its critical that the NumOperands is accurate or we
could compute the wrong offset to the first operand.
I looked over all places which update NumOperands and they are all safe.
Hung off use User's don't use NumOperands to compute the OperandList so they
are safe to continue to manipulate it. The only other User which changed it
was GlobalVariable which has an optional init list but always allocated space
for a single Use. It was correctly setting NumOperands to 1 before setting an
initializer, and setting it to 0 after clearing the init list, so the order was safe.
Added some comments to that code to make sure that this isn't changed in future
without being aware of this constraint.
Pete Cooper [Fri, 12 Jun 2015 17:48:05 +0000 (17:48 +0000)]
Replace all accesses to User::OperandList with getter and setter methods. NFC.
We don't want anyone to access OperandList directly as its going to be removed
and computed instead. This uses getter's and setter's instead in which we
can later change the underlying implementation of OperandList.
Pete Cooper [Fri, 12 Jun 2015 16:13:54 +0000 (16:13 +0000)]
Don't create instructions from ConstantExpr's in CFLAliasAnalysis.
The CFLAA code currently calls ConstantExpr::getAsInstruction which creates an instruction from a constant expr.
We then pass that instruction to the InstVisitor to analyze it.
Its not necessary to create these instructions as we can just cast from Constant to Operator in the visitor. This is how other InstVisitor’s such as SelectionDAGBuilder handle ConstantExpr.
Greg Bedwell [Fri, 12 Jun 2015 15:58:29 +0000 (15:58 +0000)]
In MSVC builds embed a VERSIONINFO resource in our exe and DLL files.
This reinstates my commits r238740/r238741 which I reverted due to a failure
in the clang-cl selfhost tests on Windows. I've now fixed the issue in
clang-cl that caused the failure so hopefully all should be well now.
John Brawn [Fri, 12 Jun 2015 09:38:51 +0000 (09:38 +0000)]
[ARM] Disabling vfp4 should disable fp16
ARMTargetParser::getFPUFeatures should disable fp16 whenever it
disables vfp4, as otherwise something like -mcpu=cortex-a7 -mfpu=none
leaves us with fp16 enabled (though the only effect that will have is
a wrong build attribute).
LowerBitSets: Give names to aliases of unnamed bitset element objects.
It is valid for globals to be unnamed, but aliases must have a name. To avoid
creating invalid IR, we need to assign names to any aliases we create that
point to unnamed objects that have been moved into combined globals.