Tim Northover [Mon, 5 Dec 2016 21:47:07 +0000 (21:47 +0000)]
GlobalISel: make G_CONSTANT take a ConstantInt rather than int64_t.
This makes it more similar to the floating-point constant, and also allows for
larger constants to be translated later. There's no real functional change in
this patch though, just syntax updates.
Tim Northover [Mon, 5 Dec 2016 21:40:33 +0000 (21:40 +0000)]
GlobalISel: improve translation fallback for constants.
Returning 0 (NoReg) from getOrCreateVReg leads to unexpected situations later
in the translation. It's better to return a valid (if undefined) register and
let the rest of the instruction carry on as planned.
Keno Fischer [Mon, 5 Dec 2016 21:25:03 +0000 (21:25 +0000)]
[LAA] Prevent invalid IR for loop-invariant bound in loop body
Summary:
If LAA expands a bound that is loop invariant, but not hoisted out
of the loop body, it used to use that value anyway, causing a
non-domination error, because the memcheck block is of course not
dominated by the scalar loop body. Detect this situation and expand
the SCEV expression instead.
[X86] Fix non-intrinsic roundss/roundsd to not read the destination register
This changes the scalar non-intrinsic non-avx roundss/sd instruction
definitions not to read their destination register - allowing partial dependency
breaking.
Matt Arsenault [Mon, 5 Dec 2016 20:23:10 +0000 (20:23 +0000)]
AMDGPU: Refactor exp instructions
Structure the definitions a bit more like the other classes.
The main change here is to split EXP with the done bit set
to a separate opcode, so we can set mayLoad = 1 so that it won't
be reordered before the other exp stores, since this has the special
constraint that if the done bit is set then this should be the last
exp in she shader.
Previously all exp instructions were inferred to have unmodeled
side effects.
Eric Fiselier [Mon, 5 Dec 2016 20:21:21 +0000 (20:21 +0000)]
[lit] Support custom parsers in parseIntegratedTestScript
Summary:
Libc++ frequently has the need to parse more than just the builtin *test keywords* (`RUN`, `REQUIRES`, `XFAIL`, ect). For example libc++ currently needs a new keyword `MODULES-DEFINES: macro list...`. Instead of re-implementing the script parsing in libc++ this patch allows `parseIntegratedTestScript` to take custom parsers.
This patch introduces a new class `IntegratedTestKeywordParser` which implements the logic to parse/process a test keyword. Parsing of various keyword "kinds" are supported out of the box, including 'TAG', 'COMMAND', and 'LIST', which parse keywords such as `END.`, `RUN:` and `XFAIL:` respectively.
As an example after this change libc++ can implement the `MODULES-DEFINES` simply using:
```
mparser = IntegratedTestKeywordParser('MODULES-DEFINES:', ParserKind.LIST)
parseIntegratedTestScript(test, additional_parsers=[mparser])
macro_list = mparser.getValue()
```
Matthias Braun [Mon, 5 Dec 2016 19:44:31 +0000 (19:44 +0000)]
TableGen/AsmMatcherEmitter: Bring sorting check back under EXPENSIVE_CHECKS
Bring the sorting check back that I removed in r288655 but put it under
EXPENSIVE_CHECKS this time. Also document that this the check isn't
purely about having a sorted list but also about operator < having the
correct transitive behavior.
Adrian Prantl [Mon, 5 Dec 2016 18:04:47 +0000 (18:04 +0000)]
[DIExpression] Introduce a dedicated DW_OP_LLVM_fragment operation
so we can stop using DW_OP_bit_piece with the wrong semantics.
The entire back story can be found here:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20161114/405934.html
The gist is that in LLVM we've been misinterpreting DW_OP_bit_piece's
offset field to mean the offset into the source variable rather than
the offset into the location at the top the DWARF expression stack. In
order to be able to fix this in a subsequent patch, this patch
introduces a dedicated DW_OP_LLVM_fragment operation with the
semantics that we used to apply to DW_OP_bit_piece, which is what we
actually need while inside of LLVM. This patch is complete with a
bitcode upgrade for expressions using the old format. It does not yet
fix the DWARF backend to use DW_OP_bit_piece correctly.
Implementation note: We discussed several options for implementing
this, including reserving a dedicated field in DIExpression for the
fragment size and offset, but using an custom operator at the end of
the expression works just fine and is more efficient because we then
only pay for it when we need it.
Sanjay Patel [Mon, 5 Dec 2016 15:58:21 +0000 (15:58 +0000)]
[TargetLowering] add special-case for demanded bits analysis of 'not'
We treat bitwise 'not' as a special operation and try not to reduce its all-ones mask.
Presumably, this is because a 'not' may be cheaper than a generic 'xor' or it may get
folded into another logic op if the target has those. However, if we can remove a logic
instruction by changing the xor's constant mask value, that should always be a win.
Note that the IR version of SimplifyDemandedBits() does not treat 'not' as a special-case
currently (although that's marked with a FIXME). So if you run this IR through -instcombine,
you should get the same end result. I'm hoping to add a different backend transform that
will expose this problem though, so I need to solve this first.
Diana Picus [Mon, 5 Dec 2016 10:40:33 +0000 (10:40 +0000)]
[GlobalISel] Extract handleAssignments out of AArch64CallLowering
This function seems target-independent so far: all the target-specific behaviour
is isolated in the CCAssignFn and the ValueHandler (which we're also extracting
into the generic CallLowering).
Matthias Braun [Mon, 5 Dec 2016 08:15:57 +0000 (08:15 +0000)]
TableGen/AsmMatcherEmitter: Trust that stable_sort works
A debug build of AsmMatcherEmitter would use a quadratic algorithm to
check whether std::stable_sort() actually sorted. Let's hope the authors
of our C++ standard library did that testing for us. Removing the check
gives a 3x speedup in the X86 case.
Matthias Braun [Mon, 5 Dec 2016 07:35:09 +0000 (07:35 +0000)]
TableGen/Record: Shortcut member access in hottest function
This may seem unusual, but makes most debug tblgen builds ~10% faster.
Usually we wouldn't care about speed that much in debug builds, but for
tblgen that also translates into build time.
Matthias Braun [Mon, 5 Dec 2016 06:00:36 +0000 (06:00 +0000)]
TableGen: Use more StringInit instead of StringRef
This forces the code to call StringInit::get on the string early and
avoids storing duplicates in std::string and sometimes allows pointer
comparisons instead of string comparisons.
Kuba Mracek [Mon, 5 Dec 2016 05:21:44 +0000 (05:21 +0000)]
Use Darwin libtool's -no_warning_for_no_symbols if available to silence the "has no symbols" link warning
Building compiler-rt on Darwin produces dozens of meaningless warnings about object files having no symbols during static archive creation. This is very intentional as compiler-rt uses #ifdefs to conditionally compile platform-specific code, and we even have a .cpp source file that only contains static asserts to make sure the environment is configured right. On Linux, this situation is fine and no warning is produced. This patch adds a libtool version detection and if it's new enough, we'll use the -no_warning_for_no_symbols flag that suppresses this warning. Build logs should be much cleaner now!
Matthias Braun [Mon, 5 Dec 2016 05:21:18 +0000 (05:21 +0000)]
TableGen: Factor out STRCONCAT constructor, add shortcut.
Introduce new constructor for STRCONCAT binop with a shortcut that
immediately concatenates if the two arguments are StringInits.
Makes the QualifyName code more readable and tablegen 2-3% faster.
Craig Topper [Mon, 5 Dec 2016 04:51:31 +0000 (04:51 +0000)]
[AVX-512] Teach fast isel to use masked compare and movss for handling scalar cmp and select sequence when AVX-512 is enabled. This matches the behavior of normal isel.
Craig Topper [Mon, 5 Dec 2016 04:51:28 +0000 (04:51 +0000)]
[AVX-512] Add avx512f command lines to fast isel SSE select test.
Currently the fast isel code emits an avx1 instruction sequence even with avx512. This is different than normal isel. A follow up commit will fix this.
Chris Bieneman [Mon, 5 Dec 2016 03:28:03 +0000 (03:28 +0000)]
[CMake] Refactor add_llvm_tool_symlink for reuse
The old implementation of add_llvm_tool_symlink could fail in odd ways when building out of tree. This version solves that problem by not using the LLVM_* variables, and instead reaeding the target's properties.
lib/Target/AMDGPU/SIRegisterInfo.cpp: In member function 'void llvm::SIRegisterInfo::spillSGPR(llvm::MachineBasicBlock::iterator, int, llvm::RegScavenger*) const':
lib/Target/AMDGPU/SIRegisterInfo.cpp:572:30: warning: variable 'SubRC' set but not used [-Wunused-but-set-variable]
const TargetRegisterClass *SubRC = nullptr;
^
lib/Target/AMDGPU/SIRegisterInfo.cpp: In member function 'void llvm::SIRegisterInfo::restoreSGPR(llvm::MachineBasicBlock::iterator, int, llvm::RegScavenger*) const':
lib/Target/AMDGPU/SIRegisterInfo.cpp:723:30: warning: variable 'SubRC' set but not used [-Wunused-but-set-variable]
const TargetRegisterClass *SubRC = nullptr;
^
The variable was assigned to, but never used. The functions called did not
mutate state. Simplify the logic and remove the variable. Identified by gcc
5.4.0.
build: allow specifying the component to `llvm_install_symlink`
Add an optional parameter to `llvm_install_symlink` which allows the symlink
installation to be placed into a specific component rather than the default
value.
Matthias Braun [Sat, 3 Dec 2016 00:52:56 +0000 (00:52 +0000)]
AArch64CollectLOH: Rewrite as block-local analysis.
Previously this pass was using up to 5% compile time in some cases which
is a bit much for what it is doing. The pass featured a full blown
data-flow analysis which in the default configuration was restricted to a
single block.
This rewrites the pass under the assumption that we only ever work on a
single block. This is done in a single pass maintaining a state machine
per general purpose register to catch LOH patterns.
Guozhi Wei [Sat, 3 Dec 2016 00:41:43 +0000 (00:41 +0000)]
[ppc] Correctly compute the cost of loading 32/64 bit memory into VSR
VSX has instructions lxsiwax/lxsdx that can load 32/64 bit value into VSX register cheaply. That patch makes it known to memory cost model, so the vectorization of the test case in pr30990 is beneficial.
Ivan Krasin [Fri, 2 Dec 2016 23:30:16 +0000 (23:30 +0000)]
Support escaping in TrigramIndex.
Summary:
This is a follow up to r288303, where I have introduced TrigramIndex
to speed up SpecialCaseList for the cases when all rules are
simple wildcards, like *hello*wor.d*.
Here, I add support for escaping, so that it's possible to
specify rules like *c\+\+abi*.
Zachary Turner [Fri, 2 Dec 2016 23:02:01 +0000 (23:02 +0000)]
Resubmit "[LibFuzzer] Split FuzzerUtil for Posix and Windows."
This resubmits r288529, which was resubmitted because it broke a
fuzzer bot. According to kcc@ the test that broke was flakey
and it is unlikely to be a result of this patch.
Zachary Turner [Fri, 2 Dec 2016 19:41:17 +0000 (19:41 +0000)]
[LibFuzzer] Introduce a portable WeakAlias implementation.
Windows doesn't really support weak aliases, but with some
linker magic we can get something that's pretty close on
Windows. This introduces an interface to accessing weakly
aliased symbols that will work on any platform. Linker
magic changes to come in a separate patch.
Patch by Marcos Pividori
Differential Revision: https://reviews.llvm.org/D27235
Rong Xu [Fri, 2 Dec 2016 19:10:29 +0000 (19:10 +0000)]
[PGO] Fix PGO use ICE when there are unreachable BBs
For -O0 there might be unreachable BBs, which breaks the assumption that all the
BBs have an auxiliary data structure. In this patch, we add another interface
called findBBInfo() so that a nullptr can be returned for the unreachable BBs
(and the callers can ignore those BBs).
This fixes the bug reported
https://llvm.org/bugs/show_bug.cgi?id=31209
Ulrich Weigand [Fri, 2 Dec 2016 18:24:16 +0000 (18:24 +0000)]
[SystemZ] Support remaining atomic instructions
Add assembler support for all atomic instructions that weren't already
supported. Some of those could be used to implement codegen for 128-bit
atomic operations, but this isn't done here yet.
Ulrich Weigand [Fri, 2 Dec 2016 18:19:22 +0000 (18:19 +0000)]
[SystemZ] Refactor hasSideEffects setting
Move setting of hasSideEffects out of SystemZInstrFormats.td,
to allow use of the format classes for instructions where this
flag shouldn't be set. NFC.