AVX-512: Optimized INSERT_SUBVECTOR for i1 vector types
ISERT_SUBVECTOR for i1 vectors may be done with shifts, when we insert into the lower part, or into the upper part, on into all-zero vector.
CONCAT_VECTORS uses ISERT_SUBVECTOR.
[PGO] move names of runtime sections definitions to InstrProfData.inc
In profile runtime implementation for Darwin, Linux and FreeBSD, the
names of sections holding profile control/counter/naming data need
to be known by the runtime in order to locate the start/end of the
data. Moving the name definitions to the common file to specify the
connection.
Add more complete description of the content and structure
of the template file. Made the comment in C style to be
shared by C runtime. Also enhance the file structure so
that it can included as standalone header for common
definitions.
Rafael Espindola [Sun, 22 Nov 2015 00:16:24 +0000 (00:16 +0000)]
Have a single way for creating unique value names.
We had two code paths. One would create names like "foo.1" and the other
names like "foo1".
For globals it is important to use "foo.1" to help C++ name demangling.
For locals there is no strong reason to go one way or the other so I
kept the most common mangling (foo1).
Teresa Johnson [Sat, 21 Nov 2015 21:55:48 +0000 (21:55 +0000)]
[ThinLTO] Handle bitcode without function summary sections gracefully
Summary:
Several fixes to the handling of bitcode files without function summary
sections so that they are skipped during ThinLTO processing in llvm-lto
and the gold plugin when appropriate instead of aborting.
1 Don't assert when trying to add a FunctionInfo that doesn't have
a summary attached.
2 Skip FunctionInfo structures that don't have attached function summary
sections when trying to create the combined function summary.
3 In both llvm-lto and gold-plugin, check whether a bitcode file has
a function summary section before trying to parse the index, and skip
the bitcode file if it does not.
4 Fix hasFunctionSummaryInMemBuffer in BitcodeReader, which had a bug
where we returned to early while looking for the summary section.
Also added llvm-lto and gold-plugin based tests for cases where we
don't have function summaries in the bitcode file. I verified that
either the first couple fixes described above are enough to avoid the
crashes, or fixes 1,3,4. But have combined them all here for added
robustness.
Simon Pilgrim [Sat, 21 Nov 2015 21:42:26 +0000 (21:42 +0000)]
[MachineInstrBuilder] Support for adding a ConstantPoolIndex MO with an additional offset.
MachineInstrBuilder::addDisp can already add an immediate or global address MO with an adjusted offset, this patch adds support for constant pool indices as well.
All remaining MO types still assert - there are a number of other types that could support adjusted offsets but I have no test cases at this time.
Extended DFA tablegen to:
- added "-debug-only dfa-emitter" support to llvm-tblgen
- defined CVI_PIPE* resources for the V60 vector coprocessor
- allow specification of multiple required resources
- supports ANDs of ORs
- e.g. [SLOT2, SLOT3], [CVI_MPY0, CVI_MPY1] means:
(SLOT2 OR SLOT3) AND (CVI_MPY0 OR CVI_MPY1)
- added support for combo resources
- allows specifying ORs of ANDs
- e.g. [CVI_XLSHF, CVI_MPY01] means:
(CVI_XLANE AND CVI_SHIFT) OR (CVI_MPY0 AND CVI_MPY1)
- increased DFA input size from 32-bit to 64-bit
- allows for a maximum of 4 AND'ed terms of 16 resources
Extended DFA tablegen to:
- added "-debug-only dfa-emitter" support to llvm-tblgen
- defined CVI_PIPE* resources for the V60 vector coprocessor
- allow specification of multiple required resources
- supports ANDs of ORs
- e.g. [SLOT2, SLOT3], [CVI_MPY0, CVI_MPY1] means:
(SLOT2 OR SLOT3) AND (CVI_MPY0 OR CVI_MPY1)
- added support for combo resources
- allows specifying ORs of ANDs
- e.g. [CVI_XLSHF, CVI_MPY01] means:
(CVI_XLANE AND CVI_SHIFT) OR (CVI_MPY0 AND CVI_MPY1)
- increased DFA input size from 32-bit to 64-bit
- allows for a maximum of 4 AND'ed terms of 16 resources
Jonas Paulsson [Sat, 21 Nov 2015 13:25:07 +0000 (13:25 +0000)]
[DAGCombiner] Bugfix for lost chain depenedency.
When MergeConsecutiveStores() combines two loads and two stores into
wider loads and stores, the chain users of both of the original loads
must be transfered to the new load, because it may be that a chain
user only depends on one of the loads.
New test case: test/CodeGen/SystemZ/dag-combine-01.ll
Simon Pilgrim [Sat, 21 Nov 2015 12:38:34 +0000 (12:38 +0000)]
[X86][SSE] Legal XMM Register Class ordering for SSE1
It turns out we have a number of places that just grab the first type attached to a register class for various reasons. This is fine unless for some reason that type isn't legal on the current target, such as for SSE1 which doesn't support v16i8/v8i16/v4i32/v2i64 - all of which were included before 4f32 in the class.
Given that this is such a rare situation I've just re-ordered the types and placed the float types first.
Yaron Keren [Sat, 21 Nov 2015 06:33:54 +0000 (06:33 +0000)]
Unbreak build on OpenBSD by not adding -Wl,-z,defs to linker flags.
This is similar to the fix for FreeBSD in r226862. Without this patch,
the build aborts when linkling libLTO.so, complaining about undefined
references to assert2, cxa_atexit, etc.
Teresa Johnson [Sat, 21 Nov 2015 03:51:23 +0000 (03:51 +0000)]
Move new assert to correct location
This assert was meant to execute at the end of parseMetadata, but
we return early and never reach the end of the function. Caught
by a compile-time warning since the function doesn't return a value
from that location.
Teresa Johnson [Sat, 21 Nov 2015 00:35:38 +0000 (00:35 +0000)]
llvm-link option and test for recent metadata mapping bug
Summary:
Add a -preserve-modules option to llvm-link that simulates LTO
clients that don't destroy modules as they are linked. This enables
reproduction of a recent bug introduced by a metadata linking change
that was only caught when the modules weren't destroyed before
writing bitcode (LTO on Windows).
See http://llvm.org/viewvc/llvm-project?view=revision&revision=253170
for more details on the original bug and the fix.
Confirmed the new test added here reproduces the failure using the new
option when I suppress the fix.
Eric Christopher [Fri, 20 Nov 2015 22:38:20 +0000 (22:38 +0000)]
Power8 and later support fusing addis/addi and addis/ld instruction
pairs that use the same register to execute as a single instruction.
No Functional Change
Geoff Berry [Fri, 20 Nov 2015 22:34:39 +0000 (22:34 +0000)]
[CodeGenPrepare] Create more extloads and fewer ands
Summary:
Add and instructions immediately after loads that only have their low
bits used, assuming that the (and (load x) c) will be matched as a
extload and the ands/truncs fed by the extload will be removed by isel.
Chris Bieneman [Fri, 20 Nov 2015 22:08:49 +0000 (22:08 +0000)]
[CMake] Fix handling of passing through semi-colon separated lists.
When passing around CMake arguments as lists of arguments any arguments containing lists need to have their semi-colons escaped otherwise CMake will split the arguments in the middle.
[ShrinkWrap] Teach ShrinkWrap to handle targets requiring a register scavenger.
The included test only checks for a compiler crash for now. Several people are
facing this issue, so we first resolve the crash, and will increase shrinkwrap's
coverage later in a follow-up patch.
Diego Novillo [Fri, 20 Nov 2015 21:46:38 +0000 (21:46 +0000)]
SamplePGO - Do not count never-executed inlined functions when computing coverage.
If a function was originally inlined but not actually hot at runtime,
its samples will not be counted inside the parent function. This throws
off the coverage calculation because it expects to find more used
records than it should.
Fixed by ignoring functions that will not be inlined into the parent.
Currently, this is inlined functions with 0 samples. In subsequent
patches, I'll change this to mean "cold" functions.
Eric Christopher [Fri, 20 Nov 2015 20:51:31 +0000 (20:51 +0000)]
Weak non-function symbols were being accessed directly, which is
incorrect, as the chosen representative of the weak symbol may not live
with the code in question. Always indirect the access through the TOC
instead.
Bill Seurer [Fri, 20 Nov 2015 20:24:49 +0000 (20:24 +0000)]
Fix test case label check
Several (but not all) of the labels that are checked for in this test case
are checked as strings instead of labels. This can cause an apparent test
case failure if they are tested in an appropriately named directory.
Summary:
This change refactors two aspects of InstrProfRecord:
1) Add a merge() method to InstrProfRecord (previously InstrProfWriter combineInstrProfRecords()) in order to better encapsulate this functionality and to make the InstrProfRecord and SampleRecord APIs more consistent.
2) Make InstrProfRecord mergeValueProfData() a private method since it is only ever called internally by merge().
Artyom Skrobov [Fri, 20 Nov 2015 16:46:09 +0000 (16:46 +0000)]
Handle ARMv6-J as an alias, instead of fake architecture
Summary:
This follows D14577 to treat ARMv6-J as an alias for ARMv6,
instead of an architecture in its own right.
The functional change is that the default CPU when targeting ARMv6-J
changes from arm1136j-s to arm1136jf-s, which is currently used as
the default CPU for ARMv6; both are, in fact, ARMv6-J CPUs.
The J-bit (Jazelle support) is irrelevant to LLVM, and it doesn't
affect code generation, attributes, optimizations, or anything else,
apart from selecting the default CPU.
Diego Novillo [Fri, 20 Nov 2015 15:39:42 +0000 (15:39 +0000)]
SamplePGO - Add line offset and discriminator information to sample reports.
While debugging some sampling coverage problems, I found this useful:
When applying samples from a profile, it helps to also know what line
offset and discriminator the sample belongs to. This makes it easy to
correlate against the input profile.
Teresa Johnson [Fri, 20 Nov 2015 14:51:27 +0000 (14:51 +0000)]
[ThinLTO] Add MODULE_CODE_METADATA_VALUES record
Summary:
This is split out from the ThinLTO metadata mapping patch
http://reviews.llvm.org/D14752.
To avoid needing to parse the module level metadata during function
importing, a new module-level record is added which holds the
number of module-level metadata values. This is required because
metadata value ids are assigned implicitly during parsing, and the
function-level metadata ids start after the module-level metadata ids.
I made a change to this version of the code compared to D14752
in order to add more consistent and thorough assertion checking of the
new record value. We now unconditionally use the record value to
initialize the MDValueList size, and handle it the same in parseMetadata
for all module level metadata cases (lazy loading or not).
Daniel Sanders [Fri, 20 Nov 2015 10:07:11 +0000 (10:07 +0000)]
Revert 253497 and 253539 to try to fix clang-cmake-mips buildbot.
It caused link errors of the form:
InstrProfiling.c:(.text.__llvm_profile_instrument_target+0x1c0): undefined reference to `__sync_fetch_and_add_8'
We had a network outage at the time of the commit so the first build to show a
problem is http://lab.llvm.org:8011/builders/clang-cmake-mips/builds/10827
Dan Gohman [Fri, 20 Nov 2015 03:13:31 +0000 (03:13 +0000)]
[WebAssembly] Remove the AsmPrinter code for printing physical registers.
WebAssembly does not have physical registers, so even if LLVM uses physical
registers like SP, they'll need to be lowered to virtual registers before
AsmPrinter time.
ScalarEvolution: do not set nuw when creating exprs of form <expr> + <all-ones>.
The nuw constraint will not be satisfied unless <expr> == 0.
This bug has been around since r102234 (in 2010!), but was uncovered by
r251052, which introduced more aggressive optimization of nuw scev expressions.
Eric Christopher [Fri, 20 Nov 2015 00:34:54 +0000 (00:34 +0000)]
Split the argument unscheduling loop in the WebAssembly register
coloring pass. Turn the logic into "look for an insert point and
then move things past the insert point".
[LTO] Add options to llvm-lto to select output format and dump merged module
This introduces two new options:
- "llvm-lto -save-merged-module -o outfile" dumps the LTO Module to
outfile.merged.bc prior to CodeGen and after LTO optimizations have been run.
- "llvm-lto -filetype=asm -o outfile" makes llvm-lto emit assembly instead of
object code in outfile.
[LTO] Add option to emit assembly from LTOCodeGenerator
This adds a new API, LTOCodeGenerator::setFileType, to choose the output file
format for LTO CodeGen. A corresponding change to use this new API from
llvm-lto and a test case is coming in a separate commit.