David Blaikie [Sat, 14 Mar 2015 19:24:04 +0000 (19:24 +0000)]
[opaque pointer type] more gep API migrations
Adding nullptr to all the IRBuilder stuff because it's the first thing
that fails to build when testing without the back-compat functions, so
I'll keep having to re-add these locally for each chunk of migration I
do. Might as well check them in to save me the churn. Eventually I'll
have to migrate these too, but I'm going breadth-first.
Recover the ability to 'b CheckFailed' after r231577
Given that the stated purpose of `CheckFailed()` is to provide a nice
spot for a breakpoint, it'd be nice not to have to use a regex to break
on it. Recover the ability to simply use `b CheckFailed` by
specializing the message-only version, and by changing the variadic
version to call into the message-only version.
Frederic Riss [Sat, 14 Mar 2015 15:49:07 +0000 (15:49 +0000)]
[dsymutil] Add support for debug_loc section.
There is no need to look into the location expressions to transfer them,
the only modification to apply is to patch their base address to reflect
the linked function address.
Daniel Jasper [Sat, 14 Mar 2015 10:58:38 +0000 (10:58 +0000)]
[MachineLICM] First steps of sinking GEPs near calls.
Specifically, if there are copy-like instructions in the loop header
they are moved into the loop close to their uses. This reduces the live
intervals of the values and can avoid register spills.
This is working towards a fix for http://llvm.org/PR22230.
Review: http://reviews.llvm.org/D7259
Next steps:
- Find a better cost model (which non-copy instructions should be sunk?)
- Make this dependent on register pressure
Frederic Riss [Sat, 14 Mar 2015 03:46:51 +0000 (03:46 +0000)]
[dsymutil] Generate debug_aranges section.
This actually shares most of its implementation with the generation
of the debug_ranges (the absence of 'a' is not a typo) contribution
for the unit's DW_AT_ranges attribute.
Frederic Riss [Sat, 14 Mar 2015 03:46:40 +0000 (03:46 +0000)]
[dsymutil] Identify each CompileUnit with a unique ID.
The ID can eg. de used in MCSymbol names to differentiate the ones
that need to be created for every unit.
The ID is a constructor parameter and not a static class member so
there is no issue with counter updates if we decide to thread that
code.
David Blaikie [Sat, 14 Mar 2015 01:53:18 +0000 (01:53 +0000)]
[opaque pointer type] Start migrating GEP creation to explicitly specify the pointee type
I'm just going to migrate these in a pretty ad-hoc & incremental way -
providing the backwards compatible API for now, then locally removing
it, fixing a few callers, adding it back in and commiting those callers.
Rinse, repeat.
The assertions should ensure that if I get this wrong we'll find out
about it and not just have one giant patch to revert, recommit, revert,
recommit, etc.
LowerBitSets: Do not export symbols for bit set referenced globals on Darwin.
The linker on that platform may re-order symbols or strip dead symbols, which
will break bit set checks. Avoid this by hiding the symbols from the linker.
Frederic Riss [Fri, 13 Mar 2015 23:55:29 +0000 (23:55 +0000)]
[dsymutil] Fix typo in comment.
Next time, when I fix a typo, I'll take the time to reread the whole
comment instead of waiting for the commit email to realize that there
is another one two words later...
Rafael Espindola [Fri, 13 Mar 2015 22:18:18 +0000 (22:18 +0000)]
Use add32ri8 and friends on fast isel.
This fixes pr22854.
The core issue on the bug is that there are multiple instructions that
print the same in assembly. In fact, there doesn't seem to be any
syntax for specifying that a constant that fits in 8 bits should use a 32 bit
immediate.
The attached patch changes fast isel to consider i16immSExt8,
i32immSExt8, and i64immSExt8. They were disabled because fastisel didn’t know
to call the predicate back in the day.
David Blaikie [Fri, 13 Mar 2015 21:03:36 +0000 (21:03 +0000)]
[opaque pointer type] Bitcode support for explicit type parameter on the gep operator
This happened to be fairly easy to support backwards compatibility based
on the number of operands (old format had an even number, new format has
one more operand so an odd number).
test/Bitcode/old-aliases.ll already appears to test old gep operators
(if I remove the backwards compatibility in the BitcodeReader, this and
another test fail) so I'm not adding extra test coverage here.
David Blaikie [Fri, 13 Mar 2015 21:03:34 +0000 (21:03 +0000)]
Turn assertion into bitcode reading error
I don't think we test invalid bitcode records in any detail, so no test
here - just a change for consistency with existing error checks in
surrounding code.
Robert Lougher [Fri, 13 Mar 2015 20:53:01 +0000 (20:53 +0000)]
Reapply "[Reassociate] Add initial support for vector instructions."
This reapplies the patch previously committed at revision 232190. This was
reverted at revision 232196 as it caused test failures in tests that did not
expect operands to be commuted. I have made the tests more resilient to
reassociation in revision 232206.
As a follow-up to r232200, add an `-instcombine` to canonicalize scalar
allocations to `i32 1`. Since r232200, `iX 1` (for X != 32) are only
created by RAUWs, so this shouldn't fire too often. Nevertheless, it's
a cheap check and a nice cleanup.
instcombine: alloca: Limit array size type promotion
Move type promotion of the size of the array allocation to the end of
`simplifyAllocaArraySize()`. This avoids promoting the type of the
array size if it's a `ConstantInt`, since the next -instcombine
iteration will drop it to a scalar allocation anyway. Similarly, this
avoids promoting the type if it's an `UndefValue`, in which case the
alloca gets RAUW'ed.
This is NFC when considered over the lifetime of -instcombine, since
it's just reducing the number of iterations needed to reach fixed point.
Write the `alloca` array size explicitly when it's non-canonical.
Previously, if the array size was `iX 1` (where X is not 32), the type
would mutate to `i32` when round-tripping through assembly.
The testcase I added fails in `verify-uselistorder` (as well as
`FileCheck`), since the use-lists for `i32 1` and `i64 1` change.
(Manman Ren came across this when running `verify-uselistorder` on some
non-trivial, optimized code as part of PR5680.)
The type mutation started with r104911, which allowed array sizes to be
something other than an `i32`. Starting with r204945, we
"canonicalized" to `i64` on 64-bit platforms -- and then on every
round-trip through assembly, mutated back to `i32`.
I bundled a fixup for `-instcombine` to avoid r204945 on scalar
allocations. (There wasn't a clean way to sequence this into two
commits, since the assembly change on its own caused testcase churn, and
the `-instcombine` change can't be tested without the assembly changes.)
An obvious alternative fix -- change `AllocaInst::AllocaInst()`,
`AsmWriter` and `LLParser` to treat `intptr_t` as the canonical type for
scalar allocations -- was rejected out of hand, since this required
teaching them each about the data layout.
A follow-up commit will add an `-instcombine` to canonicalize the scalar
allocation array size to `i32 1` rather than leaving `iX 1` alone.
Manman Ren [Fri, 13 Mar 2015 19:24:30 +0000 (19:24 +0000)]
Add a parameter for getLazyBitcodeModule to lazily load Metadata.
We only defer loading metadata inside ParseModule when ShouldLazyLoadMetadata
is true and we have not loaded any Metadata block yet.
This commit implements all-or-nothing loading of Metadata. If there is a
request to load any metadata block, we will load all deferred metadata blocks.
We make sure the deferred metadata blocks are loaded before we materialize any
function or a module.
The default value of the added parameter ShouldLazyLoadMetadata for
getLazyBitcodeModule is false, so the default behavior stays the same.
We only set the parameter to true when creating LTOModule in local contexts.
These can only really be used for parsing symbols, so it's unnecessary to ever
load the metadata blocks.
If we are going to enable lazy-loading of Metadata for other usages of
getLazyBitcodeModule, where deferred metadata blocks need to be loaded, we can
expose BitcodeReader::materializeMetadata to Module, similar to
Module::materialize.
instcombine: alloca: Split out simplifyAllocaArraySize(), NFC
Follow-up commits will change some of the logic here. Splitting into a
separate function simplifies the logic by allowing early returns instead
of deeper nesting.
Frederic Riss [Fri, 13 Mar 2015 18:35:57 +0000 (18:35 +0000)]
[dsymutil] Fix handling of cross-cu forward references.
We recorded the forward references in the CU that holds the referenced
DIE, but this is wrong as those will get resoled *after* the CU that
holds the reference. Record the references in their originating CU along
with a pointer to the remote CU to be able to compute the fixed up
offset at the right time.
Frederic Riss [Fri, 13 Mar 2015 18:35:39 +0000 (18:35 +0000)]
[dsymutil] Fix location cloning for newer dwarf versions.
The typo got unnoticed because we were testing only on Dwarf 2. Add a
Dwarf4 test that exercises the code path, and also tests some newer
FORMs that the other test doesn't cover.
Robert Lougher [Fri, 13 Mar 2015 18:33:27 +0000 (18:33 +0000)]
[Reassociate] Add initial support for vector instructions.
This patch adds initial support for vector instructions to the reassociation
pass. It enables most parts of the pass to work with vectors but to keep the
size of the patch small, optimization of Xor trees, canonicalization of
negative constants and converting shifts to muls, etc., have been left out.
This will be handled in later patches.
The patch is based on an initial patch by Chad Rosier.
Sanjoy Das [Fri, 13 Mar 2015 18:31:19 +0000 (18:31 +0000)]
[SCEV] Fix PR22856.
Summary:
ScalarEvolutionExpander assumes that the header block of a loop is a
legal place to have a use for a phi node. This is true only for phis
that are either in the header or dominate the header block, but it is
not true for phi nodes that are strictly internal to the loop body.
This change teaches ScalarEvolutionExpander to place uses of PHI nodes
in the basic block the PHI nodes belong to. This is always legal, and
`hoistIVInc` ensures that the said position dominates `IsomorphicInc`.
David Blaikie [Fri, 13 Mar 2015 18:20:45 +0000 (18:20 +0000)]
[opaque pointer type] Add textual IR support for explicit type parameter to gep operator
Similar to gep (r230786) and load (r230794) changes.
Similar migration script can be used to update test cases, which
successfully migrated all of LLVM and Polly, but about 4 test cases
needed manually changes in Clang.
(this script will read the contents of stdin and massage it into stdout
- wrap it in the 'apply.sh' script shown in previous commits + xargs to
apply it over a large set of test cases)
def conv(match):
line = match.group(1)
line += match.group(4)
line += ", "
line += match.group(2)
return line
line = sys.stdin.read()
off = 0
for match in re.finditer(rep, line):
sys.stdout.write(line[off:match.start()])
sys.stdout.write(conv(match))
off = match.end()
sys.stdout.write(line[off:])
Kevin Enderby [Fri, 13 Mar 2015 17:56:32 +0000 (17:56 +0000)]
Add the option, -non-verbose to llvm-objdump used with -macho to print things
using numeric values and not their symbolic constant names.
The routines that print Mach-O stuff already had a verbose parameter and this
change is just changing the passing true to passing !NonVerbose. With just a
couple of fixes and a bunch of test case updates.
Andrea Di Biagio [Fri, 13 Mar 2015 17:29:49 +0000 (17:29 +0000)]
[X86][AVX] Fix wrong lowering of v4x64 shuffles into concat_vector plus extract_subvector nodes.
This patch fixes a bug in the shuffle lowering logic implemented by function
'lowerV2X128VectorShuffle'.
The are few cases where function 'lowerV2X128VectorShuffle' wrongly expands a
shuffle of two v4X64 vectors into a CONCAT_VECTORS of two EXTRACT_SUBVECTOR
nodes. The problematic expansion only occurs when the shuffle mask M has an
'undef' element at position 2, and M is equivalent to mask <0,1,4,5>.
In that case, the algorithm propagates the wrong vector to one of the two
new EXTRACT_SUBVECTOR nodes.
Example:
;;
define <4 x double> @test(<4 x double> %A, <4 x double> %B) {
entry:
%0 = shufflevector <4 x double> %A, <4 x double> %B, <4 x i32><i32 undef, i32 1, i32 undef, i32 5>
ret <4 x double> %0
}
;;
Before this patch, llc (-mattr=+avx) generated:
vinsertf128 $1, %xmm0, %ymm0, %ymm0
With this patch, llc correctly generates:
vinsertf128 $1, %xmm1, %ymm0, %ymm0
David Majnemer [Fri, 13 Mar 2015 16:39:46 +0000 (16:39 +0000)]
ConstantFold: Fix big shift constant folding
Constant folding for shift IR instructions ignores all bits above 32 of
second argument (shift amount).
Because of that, some undef results are not recognized and APInt can
raise an assert failure if second argument has more than 64 bits.
Daniel Sanders [Fri, 13 Mar 2015 12:45:09 +0000 (12:45 +0000)]
Recommit r232027 with PR22883 fixed: Add infrastructure for support of multiple memory constraints.
The operand flag word for ISD::INLINEASM nodes now contains a 15-bit
memory constraint ID when the operand kind is Kind_Mem. This constraint
ID is a numeric equivalent to the constraint code string and is converted
with a target specific hook in TargetLowering.
This patch maps all memory constraints to InlineAsm::Constraint_m so there
is no functional change at this point. It just proves that using these
previously unused bits in the encoding of the flag word doesn't break
anything.
The next patch will make each target preserve the current mapping of
everything to Constraint_m for itself while changing the target independent
implementation of the hook to return Constraint_Unknown appropriately. Each
target will then be adapted in separate patches to use appropriate
Constraint_* values.
PR22883 was caused the matching operands copying the whole of the operand flags
for the matched operand. This included the constraint id which needed to be
replaced with the operand number. This has been fixed with a conversion
function. Following on from this, matching operands also used the operand
number as the constraint id. This has been fixed by looking up the matched
operand and taking it from there.
Summary: Make emitMipsAbiFlags a direct member of MipsTargetELFStreamer, as that's the only place where it's used, and remove the empty implementations from MipsTargetStreamer and MipsTargetAsmStreamer.
Hao Liu [Fri, 13 Mar 2015 05:15:23 +0000 (05:15 +0000)]
[MachineCopyPropagation] Fix a bug causing incorrect removal for the instruction sequences as follows
%Q5_Q6<def> = COPY %Q2_Q3
%D5<def> =
%D3<def> =
%D3<def> = COPY %D6 // Incorrectly removed in MachineCopyPropagation
Using of %D3 results in incorrect result ...
Nick Lewycky [Fri, 13 Mar 2015 01:37:52 +0000 (01:37 +0000)]
When forming an addrec out of a phi don't just look at the last computation and steal its flags for our own, there may be other computations in the middle. Check whether the LHS of the computation is the phi itself and then we know it's safe to steal the flags. Fixes PR22795.
There's a missed optimization opportunity where we could look at the full chain of computation and take the intersection of the flags instead of only looking one instruction deep.
Eric Christopher [Fri, 13 Mar 2015 01:26:39 +0000 (01:26 +0000)]
Use the variable names from the TargetInstrInfo source when we
reference them in the generated files. A few characters aren't huge
here and CFSetupOpcode is much more readable than S0.
Sanjay Patel [Thu, 12 Mar 2015 23:16:18 +0000 (23:16 +0000)]
[X86, AVX2] Replace inserti128 and extracti128 intrinsics with generic shuffles
This should complete the job started in r231794 and continued in r232045:
We want to replace as much custom x86 shuffling via intrinsics
as possible because pushing the code down the generic shuffle
optimization path allows for better codegen and less complexity
in LLVM.
AVX2 introduced proper integer variants of the hacked integer insert/extract
C intrinsics that were created for this same functionality with AVX1.
This should complete the removal of insert/extract128 intrinsics.
The Clang precursor patch for this change was checked in at r232109.
Eric Christopher [Thu, 12 Mar 2015 22:48:50 +0000 (22:48 +0000)]
In preparation for moving ARM's TargetRegisterInfo to the TargetMachine
merge Thumb1RegisterInfo and Thumb2RegisterInfo. This will enable
us to match the TargetMachine for our TargetRegisterInfo classes.
Tom Stellard [Thu, 12 Mar 2015 21:34:22 +0000 (21:34 +0000)]
R600/SI: Remove _e32 and _e64 suffixes from mnemonics
Instead print them as part of the $dst operand. The AsmMatcher
requires the 32-bit and 64-bit encodings have the same mnemonic in
order to parse them correctly.
Eric Christopher [Thu, 12 Mar 2015 21:04:46 +0000 (21:04 +0000)]
Migrate the AArch64 TargetRegisterInfo to its TargetMachine
implementation. This requires a bit of scaffolding and a few fixups
that'll go away once all of the ports have been migrated.
Hal Finkel [Thu, 12 Mar 2015 20:09:39 +0000 (20:09 +0000)]
Revert "r232027 - Add infrastructure for support of multiple memory constraints"
This (r232027) has caused PR22883; so it seems those bits might be used by
something else after all. Reverting until we can figure out what else to do.
Original commit message:
The operand flag word for ISD::INLINEASM nodes now contains a 15-bit
memory constraint ID when the operand kind is Kind_Mem. This constraint
ID is a numeric equivalent to the constraint code string and is converted
with a target specific hook in TargetLowering.
This patch maps all memory constraints to InlineAsm::Constraint_m so there
is no functional change at this point. It just proves that using these
previously unused bits in the encoding of the flag word doesn't break anything.
The next patch will make each target preserve the current mapping of
everything to Constraint_m for itself while changing the target independent
implementation of the hook to return Constraint_Unknown appropriately. Each
target will then be adapted in separate patches to use appropriate Constraint_*
values.
Quentin Colombet [Thu, 12 Mar 2015 19:34:12 +0000 (19:34 +0000)]
[X86] Fix a regression introduced by r223641.
The permps and permd instructions have their operands swapped compared to the
intrinsic definition. Therefore, they do not fall into the INTR_TYPE_2OP
category.
I did not create a new category for those two, as they are the only one AFAICT
in that case.
Frederic Riss [Thu, 12 Mar 2015 18:45:07 +0000 (18:45 +0000)]
[ADT] IntervalMap: use AlignedCharArrayUnion.
Currently IntervalMap would assert when used with keys bigger than host
pointers. This patch uses the AlignedCharArrayUnion functionality to
overcome that limitation.
Eric Christopher [Thu, 12 Mar 2015 17:54:19 +0000 (17:54 +0000)]
Remove the need to cache the subtarget in the X86 TargetRegisterInfo
classes. Use a Triple instead and simplify a lot of the querying
logic to use lookups on the Triple.
Chris Bieneman [Thu, 12 Mar 2015 17:33:34 +0000 (17:33 +0000)]
Refactoring CMake CrossCompile module.
* put most of the cross-compiling support into a function llvm_create_cross_target_internal.
* when CrossCompile is included it still generates a NATIVE target.
* llvm_create_cross_target function takes a target_name which should match a toolchain.
* llvm_create_cross_target can now be used to target more than one cross-compilation target.
Logan Chien [Thu, 12 Mar 2015 17:26:27 +0000 (17:26 +0000)]
[docs] Update the doxygen configuration file.
Update the doxygen configuration file and Makefile build rules
to provide better output (simply use the default stylesheet and template
from the Doxygen distribution.)
This CL has upgrade doxygen.cfg.in to Doxygen 1.8.6.
Chris Bieneman [Thu, 12 Mar 2015 16:19:16 +0000 (16:19 +0000)]
Doing some cleanup to the iOS toolchain.
* There is no reason to require SDKROOT as an environment variable because we can derive it from xcrun
* Setting CMAKE_RANLIB makes our static archives usable
Andrea Di Biagio [Thu, 12 Mar 2015 15:16:58 +0000 (15:16 +0000)]
[X86] Fix wrong target specific combine on SETCC nodes.
Part of the folding logic implemented by function 'PerformISDSETCCCombine'
only worked under the assumption that the condition code in input could have
been either SETNE or SETEQ.
Unfortunately that assumption was incorrect, and in some cases the algorithm
ended up incorrectly folding SETCC nodes.
The incorrect folding only affected SETCC dag nodes where:
- one of the operands was a build_vector of all zeroes;
- the other operand was a SIGN_EXTEND from a vector of MVT:i1 elements;
- the condition code was neither SETNE nor SETEQ.
Sanjay Patel [Thu, 12 Mar 2015 15:15:19 +0000 (15:15 +0000)]
[X86, AVX] replace vextractf128 intrinsics with generic shuffles
Now that we've replaced the vinsertf128 intrinsics,
do the same for their extract twins.
This is very much like D8086 (checked in at r231794):
We want to replace as much custom x86 shuffling via intrinsics
as possible because pushing the code down the generic shuffle
optimization path allows for better codegen and less complexity
in LLVM.
This is also the LLVM sibling to the cfe D8275 patch.
Daniel Sanders [Thu, 12 Mar 2015 11:00:48 +0000 (11:00 +0000)]
Add infrastructure for support of multiple memory constraints.
Summary:
The operand flag word for ISD::INLINEASM nodes now contains a 15-bit
memory constraint ID when the operand kind is Kind_Mem. This constraint
ID is a numeric equivalent to the constraint code string and is converted
with a target specific hook in TargetLowering.
This patch maps all memory constraints to InlineAsm::Constraint_m so there
is no functional change at this point. It just proves that using these
previously unused bits in the encoding of the flag word doesn't break anything.
The next patch will make each target preserve the current mapping of
everything to Constraint_m for itself while changing the target independent
implementation of the hook to return Constraint_Unknown appropriately. Each
target will then be adapted in separate patches to use appropriate Constraint_*
values.