CMake: Stop using LLVM's custom parse_arguments. NFC
Summary:
Use CMake's cmake_parse_arguments() instead.
It's called in a slightly different way, but supports all our use cases.
It's in CMake 2.8.8, which is our minimum supported version.
CMake 3.0 doc (roughly the same. No direct link to 2.8.8 doc):
http://www.cmake.org/cmake/help/v3.0/module/CMakeParseArguments.html?highlight=cmake_parse_arguments
Ahmed Bougacha [Fri, 19 Jun 2015 02:32:35 +0000 (02:32 +0000)]
[ARM] Look through concat when lowering in-place shuffles (VZIP, ..)
Currently, we canonicalize shuffles that produce a result larger than
their operands with:
shuffle(concat(v1, undef), concat(v2, undef))
->
shuffle(concat(v1, v2), undef)
because we can access quad vectors (see PerformVECTOR_SHUFFLECombine).
This is useful in the general case, but there are special cases where
native shuffles produce larger results: the two-result ops.
We can look through the concat when lowering them:
shuffle(concat(v1, v2), undef)
->
concat(VZIP(v1, v2):0, :1)
This lets us generate the native shuffles instead of scalarizing to
dozens of VMOVs.
Alex Lorenz [Thu, 18 Jun 2015 22:46:27 +0000 (22:46 +0000)]
MIR Serialization: Reenable one of the MIRParser tests by reverting r239805.
The test 'llvm/test/CodeGen/MIR/machine-function.mir' was disabled on
x86 msc18 in r239805 as it failed. My commit r240054 have fixed the
problem, so this commit reverts the commit that disabled the test as
it should pass now.
Hans Wennborg [Thu, 18 Jun 2015 22:22:30 +0000 (22:22 +0000)]
Switch lowering: enable whole-switch jump tables at -O0.
To same compile time, the analysis to find dense case-clusters in switches is
not done at -O0. However, when the whole switch is dense enough, it is easy to
turn it into a jump table, resulting in much faster code with no extra effort.
Sanjay Patel [Thu, 18 Jun 2015 21:12:24 +0000 (21:12 +0000)]
fixed to test attributes and use better checks
1. Used update_llc_test_checks.py to tighten checks
2. Fixed triple (nothing Darwin-specific here)
3. Replaced CPU specifiers with attributes
4. Fixed comments
5. Removed IvyBridge run because it did not add any coverage
Alex Lorenz [Thu, 18 Jun 2015 20:57:41 +0000 (20:57 +0000)]
MIR Serialization: initialize the fields without the default initializers in yaml::MachineFunction
My commit r239790 which introduced serialization for simple machine function attributes didn't
initialize them when parsing because I have misread the documentation for YAML IO's mapOptional
method. The mapOptional method doesn't actually set the values to the values returned by the
default constructor for that type when the key value pair is missing, it just doesn't modify
those values, so they still contain the value that was set during initialization by the default
constructor. But the fields in yaml::MachineFunction with types like unsigned and bool are not
initialized by default, and thus they can still be uninitialized after mapOptional during parsing.
This commit adds default initialization for those fields to prevent this.
Reid Kleckner [Thu, 18 Jun 2015 20:22:12 +0000 (20:22 +0000)]
[X86] Refactor stack adjustments into X86FrameLowering::BuildStackAdjustment
Deduplicates some code and lets us use LEA on atom when adjusting the
stack around callee-cleanup calls. This is the only intended
functionality change.
Sanjoy Das [Thu, 18 Jun 2015 19:28:26 +0000 (19:28 +0000)]
[CallGraph] Teach the CallGraph about non-leaf intrinsics.
Summary:
Currently intrinsics don't affect the creation of the call graph.
This is not accurate with respect to statepoint and patchpoint
intrinsics -- these do call (or invoke) LLVM level functions.
This change fixes this inconsistency by adding a call to the external
node for call sites that call these non-leaf intrinsics. This coupled
with the fact that these intrinsics also escape the function pointer
they call gives us a conservatively correct call graph.
James Y Knight [Thu, 18 Jun 2015 15:05:15 +0000 (15:05 +0000)]
[SPARC] Repair GOT references to internal symbols.
They had been getting emitted as a section + offset reference, which
is bogus since the value needs to be the offset within the GOT, not
the actual address of the symbol's object.
Daniel Jasper [Thu, 18 Jun 2015 11:51:16 +0000 (11:51 +0000)]
Update LLVM bindings after r239940. Apparently these aren't included in
any tests and I even don't know how to run the tests. This seems like a
minimal change to make them work again, although I can't really verify
at this point. Additionally, it probably makes sense to propagate the
personality parameter removal further.
AVX-512: (fixed) Added encoding of all forms of VPERMT2W/D/Q/PS/PD and VPERMI2W/D/Q/PS/PD.
Intrinsics and tests for them are comming in the next patch.
Benjamin Kramer [Wed, 17 Jun 2015 23:55:17 +0000 (23:55 +0000)]
[AsmPrinter] Make isRepeatedByteSequence smarter about odd integer types
- zext the value to alloc size first, then check if the value repeats
with zero padding included. If so we can still emit a .space
- Do the checking with APInt.isSplat(8), which handles non-pow2 types
- Also handle large constants (bit width > 64)
- In a ConstantArray all elements have the same type, so it's sufficient
to check the first constant recursively and then just compare if all
following constants are the same by pointer compare
Alex Lorenz [Wed, 17 Jun 2015 23:26:01 +0000 (23:26 +0000)]
YAML: Assign a value returned by the default constructor to the value in an optional mapping.
This commit ensures that a value that's passed into YAML's IO mapOptional method
is going to be assigned a value returned by the default constructor for that
value's type when the appropriate key is not present in the YAML mapping.
Jingyue Wu [Wed, 17 Jun 2015 22:31:02 +0000 (22:31 +0000)]
Add NVPTXLowerAlloca pass to convert alloca'ed memory to local address
Summary:
This is done by first adding two additional instructions to convert the
alloca returned address to local and convert it back to generic. Then
replace all uses of alloca instruction with the converted generic
address. Then we can rely NVPTXFavorNonGenericAddrSpace pass to combine
the generic addresscast and the corresponding Load, Store, Bitcast, GEP
Instruction together.
Reid Kleckner [Wed, 17 Jun 2015 21:31:17 +0000 (21:31 +0000)]
[X86] Cache variables that only depend on the subtarget
There is a one-to-one relationship between X86Subtarget and
X86FrameLowering, but every frame lowering method would previously pull
the subtarget off the MachineFunction and query some subtarget
properties.
Over time, these locals began to grow in complexity and it became
important to keep their names and meaning in sync across all of the
frame lowering methods, leading to duplication. We can eliminate that
duplication by computing them once in the constructor.
David Majnemer [Wed, 17 Jun 2015 20:52:32 +0000 (20:52 +0000)]
Move the personality function from LandingPadInst to Function
The personality routine currently lives in the LandingPadInst.
This isn't desirable because:
- All LandingPadInsts in the same function must have the same
personality routine. This means that each LandingPadInst beyond the
first has an operand which produces no additional information.
- There is ongoing work to introduce EH IR constructs other than
LandingPadInst. Moving the personality routine off of any one
particular Instruction and onto the parent function seems a lot better
than have N different places a personality function can sneak onto an
exceptional function.
Ahmed Bougacha [Wed, 17 Jun 2015 20:44:32 +0000 (20:44 +0000)]
[CodeGenPrepare] Generalize inserted set from truncs to any inst.
It's been used before to avoid infinite loops caused by separate CGP
optimizations undoing one another. We found one more such issue
caused by r238054. To avoid it, generalize the "InsertedTruncs"
set to any inst, and use it to avoid touching those again.
Rafael Espindola [Wed, 17 Jun 2015 17:53:31 +0000 (17:53 +0000)]
Allow aliases to be unnamed.
If globals can be unnamed, there is no reason for aliases to be different.
The restriction was there since the original implementation in r36435. I
can only guess it was there because of the old bison parser for the old
alias syntax.
Toma Tabacu [Wed, 17 Jun 2015 12:30:37 +0000 (12:30 +0000)]
[mips] [IAS] Fix LA with relative label operands.
Summary:
Call MCSymbolRefExpr::create() with a MCSymbol* argument, not with a StringRef
of the Symbol's name, in order to avoid creating invalid temporary symbols for
relative labels (e.g. {$,.L}tmp00, {$,.L}tmp10 etc.).
Toma Tabacu [Wed, 17 Jun 2015 10:43:45 +0000 (10:43 +0000)]
[mips] [IAS] Fix LW with relative label operands.
Summary:
Previously, MCSymbolRefExpr::create() was called with a StringRef of the symbol
name, which it would then search for in the Symbols StringMap (from MCContext).
However, relative labels (which are temporary symbols) are apparently not stored
in the Symbols StringMap, so we end up creating a new {$,.L}tmp symbol
({$,.L}tmp00, {$,.L}tmp10 etc.) each time we create an MCSymbolRefExpr by
passing in the symbol name as a StringRef.
Fortunately, there is a version of MCSymbolRefExpr::create() which takes an
MCSymbol* and we already have an MCSymbol* at that point, so we can just pass
that in instead of the StringRef.
I also removed the local StringRef calls to MCSymbolRefExpr::create() from
expandMemInst(), as those cases can be handled by evaluateRelocExpr() anyway.
Chandler Carruth [Wed, 17 Jun 2015 07:21:38 +0000 (07:21 +0000)]
[PM/AA] Remove the UnknownSize static member from AliasAnalysis.
This is now living in MemoryLocation, which is what it pertains to. It
is also an enum there rather than a static data member which is left
never defined.
Chandler Carruth [Wed, 17 Jun 2015 07:18:54 +0000 (07:18 +0000)]
[PM/AA] Remove the Location typedef from the AliasAnalysis class now
that it is its own entity in the form of MemoryLocation, and update all
the callers.
This is an entirely mechanical change. References to "Location" within
AA subclases become "MemoryLocation", and elsewhere
"AliasAnalysis::Location" becomes "MemoryLocation". Hope that helps
out-of-tree folks update.
Chandler Carruth [Wed, 17 Jun 2015 07:12:40 +0000 (07:12 +0000)]
[PM/AA] Split the location computation out of getArgLocation so the
virtual interface on AliasAnalysis only deals with ModRef information.
This interface was both computing memory locations by using TLI and
other tricks to estimate the size of memory referenced by an operand,
and computing ModRef information through similar investigations. This
change narrows the scope of the virtual interface on AliasAnalysis
slightly.
Note that all of this code could live in BasicAA, and be done with
a single investigation of the argument, if it weren't for the fact that
the generic code in AliasAnalysis::getModRefBehavior for a callsite
calls into the virtual aspect of (now) getArgModRefInfo. But this
patch's arrangement seems a not terrible way to go for now.
The other interesting wrinkle is how we could reasonably extend LLVM
with support for custom memory location sizes and mod/ref behavior for
library routines. After discussions with Hal on the review, the
conclusion is that this would be best done by fleshing out the much
desired support for extensions to TLI, and support these types of
queries in that interface where we would likely be doing other library
API recognition and analysis.
Rafael Espindola [Tue, 16 Jun 2015 23:22:02 +0000 (23:22 +0000)]
Rename and improve emitSectionOffset.
Different object formats represent references from dwarf in different ways.
ELF uses a relocation to the referenced point (except for .dwo) and
COFF/MachO use the offset of the referenced point inside its section.
This patch renames emitSectionOffset because
* It doesn't produce an offset on ELF.
* It changes behavior depending on how DWARF is represented, so adding
dwarf to its name is probably a good thing.
The patch also adds an option to force the use of offsets.That avoids
funny looking code like
if (!UseOffsets)
Asm->emitSectionOffset....
It was correct, but read as if the ! was inverted.
Simon Pilgrim [Tue, 16 Jun 2015 21:40:28 +0000 (21:40 +0000)]
[X86][SSE] Vectorize v2i32 to v2f64 conversions
This patch enables support for the conversion of v2i32 to v2f64 to use the CVTDQ2PD xmm instruction and stay on the SSE unit instead of scalarizing, sign extending to i64 and using CVTSI2SDQ scalar conversions.