Zachary Turner [Sat, 3 Jun 2017 00:33:35 +0000 (00:33 +0000)]
[PDB] Fix use after free.
Previously MappedBlockStream owned its own BumpPtrAllocator that
it would allocate from when a read crossed a block boundary. This
way it could still return the user a contiguous buffer of the
requested size. However, It's not uncommon to open a stream, read
some stuff, close it, and then save the information for later.
After all, since the entire file is mapped into memory, the data
should always be available as long as the file is open.
Of course, the exception to this is when the data isn't *in* the
file, but rather in some buffer that we temporarily allocated to
present this contiguous view. And this buffer would get destroyed
as soon as the strema was closed.
The fix here is to force the user to specify the allocator, this
way it can provide an allocator that has whatever lifetime it
chooses.
Matthias Braun [Sat, 3 Jun 2017 00:26:35 +0000 (00:26 +0000)]
LiveRegUnits: Port recent LivePhysRegs bugfixes
Adjust code to look more like the code in LivePhysRegs and port over the
fix for LivePhysRegs from r304001 and adapt to the new CSR management in
MachineRegisterInfo.
Sanjay Patel [Fri, 2 Jun 2017 23:40:46 +0000 (23:40 +0000)]
[x86] fix over-specific triple; NFC
There's nothing darwin-specific in these tests, and using
that setting causes extra phantom diffs when the auto-generated
check lines are regenerated today.
Philip Reames [Fri, 2 Jun 2017 23:03:26 +0000 (23:03 +0000)]
[Statepoint] Be consistent about using deopt naming [NFCI]
We'd called this "vm state" in the early days, but have long since standardized on calling it "deopt" in line with the operand bundle tag. Fix a few cases we'd missed.
[RABasic] Properly update the LiveRegMatrix when LR splitting occur
Prior to this patch we used to not touch the LiveRegMatrix while doing
live-range splitting. In other words, when live-range splitting was
occurring, the LiveRegMatrix was not reflecting the changes.
This is generally fine because it means the query to the LiveRegMatrix
will be conservately correct. However, when decisions are taken based on
what is going to happen on the interferences (e.g., when we spill a
register and know that it is going to be available for another one), we
might hit an assertion that the color used for the assignment is still
in use.
This patch makes sure the changes on the live-ranges are properly
reflected in the LiveRegMatrix, so the assertions don't break.
An alternative could have been to remove the assertion, but it would
make the invariants of the code and the general reasoning more
complicated in my opnion.
Jun Bum Lim [Fri, 2 Jun 2017 20:42:54 +0000 (20:42 +0000)]
[InlineCost] Enable the new switch cost heuristic
Summary:
This is to enable the new switch inline cost heuristic (r301649) by removing the
old heuristic as well as the flag itself.
In my experiment for LLVM test suite and spec2000/2006, +17.82% performance and
8% code size reduce was observed in spec2000/vertex with O3 LTO in AArch64.
No significant code size / performance regression was found in O3/O2/Os. No
significant complain was reported from the llvm-dev thread.
Reviewers: hans, chandlerc, eraman, haicheng, mcrosier, bmakam, eastig, ddibyend, echristo
Ahmed Bougacha [Fri, 2 Jun 2017 20:02:59 +0000 (20:02 +0000)]
[X86] Correctly broadcast NaN-like integers as float on AVX.
Since r288804, we try to lower build_vectors on AVX using broadcasts of
float/double. However, when we broadcast integer values that happen to
have a NaN float bitpattern, we lose the NaN payload, thereby changing
the integer value being broadcast.
This is caused by ConstantFP::get, to which we pass the splat i32 as
a float (by bitcasting it using bitsToFloat). ConstantFP::get takes
a double parameter, so we end up lossily converting a single-precision
NaN to double-precision.
Instead, avoid any kinds of conversions by directly building an APFloat
from the splatted APInt.
Note that this also fixes another piece of code (broadcast of
subvectors), that currently isn't susceptible to the same problem.
Also note that we could really just use APInt and ConstantInt
throughout: the constant pool type doesn't matter much. Still, for
consistency, use the appropriate type.
Zachary Turner [Fri, 2 Jun 2017 19:49:14 +0000 (19:49 +0000)]
[CodeView] Support CodeView subsections in any order.
Previously we would expect certain subsections to appear
in a certain order because some subsections would reference
other subsections, but in practice we need to support
arbitrary orderings since some object file and PDB file
producers generate them this way. This also paves the
way for supporting Yaml <-> Object File conversion of
CodeView, since Object Files typically have quite a
large number of subsections in their debug info.
Keno Fischer [Fri, 2 Jun 2017 19:04:17 +0000 (19:04 +0000)]
[SROA] Fix crash due to bad bitcast
Summary:
As shown in the test case, SROA was crashing when trying to split
stores (to the alloca) of loads (from anywhere), because it assumed
the pointer operand to the loads and stores had to have the same
address space. This isn't the case. Make sure to use the correct
pointer type for both the load and the store.
Since D17854 LinkerSubsectionsViaSymbols is unnecessary.
It is interfering with ThinLTO implementation of CFI-ICall, where
the aliases used on the !LinkerSubsectionsViaSymbols branch are
needed to export jump tables to ThinLTO backends.
Reid Kleckner [Fri, 2 Jun 2017 17:53:06 +0000 (17:53 +0000)]
Re-land "COFF: migrate def parser from LLD to LLVM"
This reverts commit r304561 and re-lands r303490 & co.
The fix was to use "SymbolName" when translating LLD's internal export
list to lib/Object's short export struct. The SymbolName reflects the
actual symbol name, which may include fastcall and stdcall mangling bits
not included in the /EXPORT or .def file EXPORTS name:
@@ -434,8 +434,7 @@ std::vector<COFFShortExport> createCOFFShortExportFromConfig() {
std::vector<COFFShortExport> Exports;
for (Export &E1 : Config->Exports) {
COFFShortExport E2;
- E2.Name = E1.Name;
+ // Use SymbolName, which will have any stdcall or fastcall qualifiers.
+ E2.Name = E1.SymbolName;
E2.ExtName = E1.ExtName;
E2.Ordinal = E1.Ordinal;
E2.Noname = E1.Noname;
David Blaikie [Fri, 2 Jun 2017 17:24:26 +0000 (17:24 +0000)]
Tidy up a bit of r304516, use SmallVector::assign rather than for loop
This might give a few better opportunities to optimize these to memcpy
rather than loops - also a few minor cleanups (StringRef-izing,
templating (to avoid std::function indirection), etc).
The SmallVector::assign(iter, iter) could be improved with the use of
SFINAE, but the (iter, iter) ctor and append(iter, iter) need it to and
don't have it - so, workaround it for now rather than bothering with the
added complexity.
(also, as noted in the added FIXME, these assign ops could potentially
be optimized better at least for non-trivially-copyable types)
Philip Reames [Fri, 2 Jun 2017 17:02:33 +0000 (17:02 +0000)]
Verify a couple more fields in STATEPOINT instructions
While doing so, clarify the comments and update them to reflect current reality.
Note: I'm going to let this sit for a week or so before adding further verification. I want to give this time to cycle through bots and merge it into our downstream tree before pushing this further.
Philip Reames [Fri, 2 Jun 2017 16:36:37 +0000 (16:36 +0000)]
Add placeholder for more extensive verification of psuedo ops
This initial patch doesn't actually do much useful. It's just to show where the new code goes. Once this is in, I'll extend the verification logic to check more useful properties.
For those curious, the more complicated version of this patch already found one very suspicious thing.
Craig Topper [Fri, 2 Jun 2017 16:17:32 +0000 (16:17 +0000)]
[InstSimplify][ConstantFolding] Teach constant folding how to handle icmp null, (inttoptr x) as well as it handles icmp (inttoptr x), null
Summary:
The constant folding code currently assumes that the constant expression will always be on the left and the simple null will be on the right. But that's not true at least on the path from InstSimplify.
This patch adds support to ConstantFolding to detect the reversed case.
Zoran Jovanovic [Fri, 2 Jun 2017 14:14:21 +0000 (14:14 +0000)]
[mips][microMIPS] Extending size reduction pass with LBU16, LHU16, SB16 and SH16
Author: milena.vujosevic.janicic
Reviewers: sdardis
The patch extends size reduction pass for MicroMIPS.
The following instructions are examined and transformed, if possible:
LBU instruction is transformed into 16-bit instruction LBU16
LHU instruction is transformed into 16-bit instruction LHU16
SB instruction is transformed into 16-bit instruction SB16
SH instruction is transformed into 16-bit instruction SH16
Differential Revision: https://reviews.llvm.org/D33091
Benjamin Kramer [Fri, 2 Jun 2017 10:50:22 +0000 (10:50 +0000)]
[X86] Don't fold into memory operands into insertps in the generated folding tables.
insertps behaves differently, the register form selects from an input
register based on the immediate operand while the memory form just loads
the given address. We have custom code to change the immediate in cases
where that's legal, so completely remove insertps from the generated
tables.
Diana Picus [Fri, 2 Jun 2017 10:16:48 +0000 (10:16 +0000)]
[ARM] GlobalISel: Support struct params/returns
Very very similar to the support for arrays. As with arrays, we don't
support returning large structs that wouldn't fit in R0-R3. Most
front-ends would likely use sret arguments for that anyway.
The only significant difference is that when splitting a struct, we need
to make sure we set the correct original alignment on each member,
otherwise it may get split incorrectly between stack and registers.
Javed Absar [Fri, 2 Jun 2017 08:53:19 +0000 (08:53 +0000)]
[ARM] Cortex-A57 scheduling model for ARM backend (AArch32)
This patch implements the Cortex-A57 scheduling model.
The main code is in ARMScheduleA57.td, ARMScheduleA57WriteRes.td.
Small changes in cpp,.h files to support required scheduling predicates.
Scheduling model implemented according to:
http://infocenter.arm.com/help/topic/com.arm.doc.uan0015b/Cortex_A57_Software_Optimization_Guide_external.pdf.
Patch by : Andrew Zhogin (submitted on his behalf, as requested).
Rewiewed by: Renato Golin, Diana Picus, Javed Absar, Kristof Beyls.
Differential Revision: https://reviews.llvm.org/D28152
Max Kazantsev [Fri, 2 Jun 2017 07:11:00 +0000 (07:11 +0000)]
[SelectionDAG] Get rid of recursion in findNonImmUse
The recursive implementation of findNonImmUse may overflow stack
on extremely long use chains. This patch replaces it with an equivalent
iterative implementation.
Gor Nishanov [Fri, 2 Jun 2017 02:18:36 +0000 (02:18 +0000)]
[coroutines] PR33271: Remove stray coro.save intrinsics during CoroSplit
Summary:
Optimization passes may remove llvm.coro.suspend intrinsic while leaving matching llvm.coro.save intrinsic orphaned.
Make sure we clean up orphaned coro.saves. The bug manifested with a crash similar to this:
Teresa Johnson [Fri, 2 Jun 2017 01:56:02 +0000 (01:56 +0000)]
[ThinLTO] Efficiency improvement when writing module path string table
Summary:
When writing the combined index, we are walking the entire module
path StringMap in the full index, and checking whether each one should be
included in the index being written. For distributed backends, where we
write an individual combined index for each file, each with only a few
module paths, this is incredibly inefficient. Add a method that takes
a callback and hides the details of whether we are writing the full
combined index, or just a slice, and in the latter case it walks the set
of modules to include instead of the entire index.
For a huge application with around 23K files (i.e. where we were iterating
through the 23K-entry modulePath StringMap 23K times), this change improved
the thin link time by a whopping 48%.
Jacob Gravelle [Fri, 2 Jun 2017 01:26:17 +0000 (01:26 +0000)]
Revert r304117 - WebAssembly object format isn't ready to be the default
Summary: Wasm object format has some functionality regressions from the ELF format, and doesn't play nicely with the rest of the toolchain. It should eventually be the default, but not yet.
Tim Shen [Thu, 1 Jun 2017 23:13:44 +0000 (23:13 +0000)]
[ThinLTO] Move -lto-use-new-pm to llvm-lto2, and change it to -use-new-pm.
Summary:
As we teach Clang to use ThinkLTO + new PM, it's good for the users to
inject through Config, instead of setting a flag in the LTOBackend
library. Move the flag to llvm-lto2.
As it moves to llvm-lto2, a new name -use-new-pm seems simpler and as
clear.
Keno Fischer [Thu, 1 Jun 2017 23:02:12 +0000 (23:02 +0000)]
Reapply "[Cloning] Take another pass at properly cloning debug info"
This was rL304226, reverted in 304228 due to a clang assertion failure
on the build bots. That problem should have been addressed by clang
commit rL304470.
Zachary Turner [Thu, 1 Jun 2017 21:52:41 +0000 (21:52 +0000)]
[CodeView] Properly align symbol records on read/write.
Object files have symbol records not aligned to any particular
boundary (e.g. 1-byte aligned), while PDB files have symbol
records padded to 4-byte aligned boundaries. Since they share
the same reading / writing code, we have to provide an option to
specify the alignment and propagate it up to the producer or
consumer who knows what the alignment is supposed to be for the
given container type.
Added a test for this by modifying the existing PDB -> YAML -> PDB
round-tripping code to round trip symbol records as well as types.