Fix CodeGen/AMDGPU/fcanonicalize-elimination.ll on FreeBSD 11.0
Summary:
On FreeBSD11.0 the FileCheck NOT string "1.0" will be matched by
`.amd_amdgpu_isa "amdgcn-unknown-freebsd11.0--gfx802"` at the end of the
file. Add a CHECK for that directive to avoid failing the test.
Sanjoy Das [Wed, 25 Oct 2017 21:41:00 +0000 (21:41 +0000)]
[SCEV] Fix an assertion failure in the max backedge taken count
Max backedge taken count is always expected to be a constant; and this is
usually true by construction -- it is a SCEV expression with constant inputs.
However, if the max backedge expression ends up being computed to be a udiv with
a constant zero denominator[0], SCEV does not fold the result to a constant
since there is no constant it can fold it to (SCEV has no representation for
"infinity" or "undef").
However, in computeMaxBECountForLT we already know the denominator is positive,
and thus at least 1; and we can use this fact to avoid dividing by zero.
[0]: We can end up with a constant zero denominator if the signed range of the
stride is more precise than the unsigned range.
Balaram Makam [Wed, 25 Oct 2017 21:32:54 +0000 (21:32 +0000)]
Revert r316582 [Local] Fix a bug in the domtree update logic for MergeBasicBlockIntoOnlyPred.
Summary: This reverts commit r316582. It looks like this commit broke tests on one buildbot:
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/5719
Mitch Phillips [Wed, 25 Oct 2017 21:21:16 +0000 (21:21 +0000)]
Add FileVerifier::isCFIProtected().
Add a CFI protection check that is implemented by building a graph and inspecting the output to deduce if the indirect CF instruction is CFI protected. Also added the output of this instruction to printIndirectInstructions().
[Hexagon] Account for negative offset when limiting max deviation
In getOffsetRange, Max can be set to 0 to force the extender replacement
to be at or below the original value. This would cause the new offset to
be non-negative, which is preferred for memory instructions (to reduce
the likelihood of it getting constant-extended due to predication). The
problem happens when the range is shifted by an offset (present in the
instruction being examined) and the offset is negative. The entire range
for the allowable deviation will then be strictly negative. This creates
a problem, since 0 is assumed to be a valid deviation.
Shoaib Meenai [Wed, 25 Oct 2017 17:11:28 +0000 (17:11 +0000)]
[cmake] Restrict resource file usage to Windows build hosts
Resource file compilation requires a working resource compiler.
Unfortunately, llvm-rc isn't quite there yet [1], and cmake's rc
invocation only works on Windows [2]. Until both those issues are
addressed, disable resource file usage on non-Windows build hosts, to
unblock Windows cross-compilation. This is also consistent with the
existing comment, which says "If *on Windows* and building with MSVC".
Matthew Simpson [Wed, 25 Oct 2017 13:40:08 +0000 (13:40 +0000)]
Add CalledValuePropagation pass
This patch adds a new pass for attaching !callees metadata to indirect call
sites. The pass propagates values to call sites by performing an IPSCCP-like
analysis using the generic sparse propagation solver. For indirect call sites
having a small set of possible callees, the attached metadata indicates what
those callees are. The metadata can be used to facilitate optimizations like
intersecting the function attributes of the possible callees, refining the call
graph, performing indirect call promotion, etc.
Daniil Fukalov [Wed, 25 Oct 2017 12:51:32 +0000 (12:51 +0000)]
[inlineasm] Fix crash when number of matched input constraint operands overflows signed char
In a case when number of output constraint operands that has matched input operands
doesn't fit to signed char, TargetLowering::ParseConstraints() can try to access
ConstraintOperands (that is std::vector) with negative index.
Diana Picus [Wed, 25 Oct 2017 12:22:21 +0000 (12:22 +0000)]
[ARM GlobalISel] Remove redundant testcases. NFC
Remove the G_FADD testcases from arm-legalizer.mir, they are covered by
arm-legalizer-fp.mir (I probably forgot to delete them when I created
that test).
Diana Picus [Wed, 25 Oct 2017 11:42:40 +0000 (11:42 +0000)]
[ARM GlobalISel] Fix call opcodes
We were generating BLX for all the calls, which was incorrect in most
cases. Update ARMCallLowering to generate BL for direct calls, and BLX,
BX_CALL or BMOVPCRX_CALL for indirect calls.
Diana Picus [Wed, 25 Oct 2017 11:21:15 +0000 (11:21 +0000)]
[ARM GlobalISel] Split test into 3. NFC
Separate the test cases that deal with calls from the rest of the IR
Translator tests.
We split into 2 different files, one for testing parameter and result
lowering, and one for testing the various different kinds of calls that
can occur (BL, BLX, BX_CALL etc).
Max Kazantsev [Wed, 25 Oct 2017 11:07:43 +0000 (11:07 +0000)]
[SCEV] Enhance SCEVFindUnsafe for division
This patch allows SCEVFindUnsafe algorithm to tread division by any non-positive
value as safe. Previously, it could only recognize non-zero constants.
Clement Courbet [Wed, 25 Oct 2017 11:02:09 +0000 (11:02 +0000)]
Re-land "[CodeGen][ExpandMemcmp][NFC] Allow memcmp to expand to vector loads (1)"
Compute the actual decomposition only after deciding whether to expand
of not. Else, it's easy to make the compiler OOM with:
`memcpy(dst, src, 0xffffffffffffffff);`, which typically happens if
someone mistakenly passes a negative value. Add a test.
Martin Storsjo [Wed, 25 Oct 2017 07:25:18 +0000 (07:25 +0000)]
[AArch64] Add support for dllimport of values and functions
Previously, the dllimport attribute did the right thing in terms
of treating it as a pointer to a value, but this makes sure the
names get mangled properly, and calls to such functions load the
function from the __imp_ pointer.
This is based on SVN r212431 and r212430 where the same was
implemented for ARM.
Matt Arsenault [Wed, 25 Oct 2017 07:14:07 +0000 (07:14 +0000)]
DAG: Fix creating select with wrong condition type
This code added in r297930 assumed that it could create
a select with a condition type that is just an integer
bitcast of the selected type. For AMDGPU any vselect is
going to be scalarized (although the vector types are legal),
and all select conditions must be i1 (the same as getSetCCResultType).
This logic doesn't really make sense to me, but there's
never really been a consistent policy in what the select
condition mask type is supposed to be. Try to extend
the logic for skipping the transform for condition types
that aren't setccs. It doesn't seem quite right to me though,
but checking conditions that seem more sensible (like whether the
vselect is going to be expanded) doesn't work since this
seems to depend on that also.
Max Kazantsev [Wed, 25 Oct 2017 06:47:39 +0000 (06:47 +0000)]
[IRCE] Fix intersection between signed and unsigned ranges
IRCE for unsigned latch conditions was temporarily disabled by rL314881. The motivating
example contained an unsigned latch condition and a signed range check. One of the safe
iteration ranges was `[1, SINT_MAX + 1]`. Its right border was incorrectly interpreted as a negative
value in `IntersectRange` function, this lead to a miscompile under which we deleted a range check
without inserting a postloop where it was needed.
This patch brings back IRCE for unsigned latch conditions. Now we treat range intersection more
carefully. If the latch condition was unsigned, we only try to consider a range check for deletion if:
1. The range check is also unsigned, or
2. Safe iteration range of the range check lies within `[0, SINT_MAX]`.
The same is done for signed latch.
Values from `[0, SINT_MAX]` are unambiguous, these values are non-negative under any interpretation,
and all values of a range intersected with such range are also non-negative.
We also use signed/unsigned min/max functions for range intersection depending on type of the
latch condition.
Mikael Holmen [Wed, 25 Oct 2017 06:15:32 +0000 (06:15 +0000)]
[MemDep] DBG intrinsics don't impact abort limit for call site dependence analysis
Summary:
Memory dependence analysis no longer counts DbgInfoIntrinsics towards the
limit where to abort the analysis. Before, a bunch of calls to dbg.value
could affect the generated code, meaning that with -g we could generate
different code than without.
Max Kazantsev [Wed, 25 Oct 2017 06:10:02 +0000 (06:10 +0000)]
[IRCE] Smarter detection of empty ranges using SCEV
For a SCEV range, this patch replaces the naive emptiness check for SCEV ranges
which looks like `Begin == End` with a SCEV check. The range is guaranteed to be
empty of `Begin >= End`. We should filter such ranges out and do not try to perform
IRCE for them.
For example, we can get such range when intersecting range `[A, B)` and `[C, D)`
where `A < B < C < D`. The resulting range is `[max(A, C), min(B, D)) = [C, B)`.
This range is empty, but its `Begin` does not match with `End`.
Making IRCE for an empty range is basically safe but unprofitable because we
never actually get into the main loop where the range checks are supposed to
be eliminated. This patch uses SCEV mechanisms to treat loops with proved
`Begin >= End` as empty.
Teresa Johnson [Wed, 25 Oct 2017 03:41:31 +0000 (03:41 +0000)]
[ThinLTO] Make test for promoted names more specific
With r314527, promoted values get a suffix that is a decimal value of
the module hash instead of hex. Change the regex to match only decimal
suffix values.
llvm-readobj: Add support for reading relocations in the Android packed format.
This is in preparation for testing lld's upcoming relocation packing
feature (D39152). I have verified that this implementation correctly
unpacks the relocations from a Chromium DSO built with gold and the
Android relocation packer for ARM32 and ARM64.
Mitch Phillips [Tue, 24 Oct 2017 23:56:12 +0000 (23:56 +0000)]
Check special-case-list regex before insertion.
Summary:
Checks that the supplied regex to SpecialCaseList::Matcher::insert(..) is non-empty.
Reported by OSS-fuzz: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3688
Verified that this fixes the provided assertion failure (built with {asan, fuzzer}):
```
mitchp@mitchp2:~/llvm-build/git-fuzz$ ninja llvm-special-case-list-fuzzer[12/12] Linking CXX executable bin/llvm-special-case-list-fuzzer
mitchp@mitchp2:~/llvm-build/git-fuzz$ bin/llvm-special-case-list-fuzzer ~/Downloads/clusterfuzz-testcase-6748633157337088
INFO: Seed: 1697404507
INFO: Loaded 1 modules (18581 inline 8-bit counters): 18581 [0x9e9f60, 0x9ee7f5),
INFO: Loaded 1 PC tables (18581 PCs): 18581 [0x9ee7f8,0xa37148),
bin/llvm-special-case-list-fuzzer: Running 1 inputs 1 time(s) each.
Running: /usr/local/google/home/mitchp/Downloads/clusterfuzz-testcase-6748633157337088
Executed /usr/local/google/home/mitchp/Downloads/clusterfuzz-testcase-6748633157337088 in 0 ms
***
*** NOTE: fuzzing was not performed, you have only
*** executed the target code on a fixed set of inputs.
***
mitchp@mitchp2:~/llvm-build/git-fuzz$
Adrian Prantl [Tue, 24 Oct 2017 22:55:12 +0000 (22:55 +0000)]
Implement salavageDebugInfo functionality for SelectionDAG.
Similar to how llvm::salvagDebugInfo hooks into InstCombine, this adds
a hook that can be invoked before an SDNode that is associated with an
SDDbgValue is erased to capture the effect of the deleted node in a
DIExpression.
The motivating example is an SDDebugValue attached to an ADD operation
that gets folded into a LOAD+OFFSET operation.
Martin Bohme [Tue, 24 Oct 2017 20:40:02 +0000 (20:40 +0000)]
Revert "[CodeGen][ExpandMemcmp][NFC] Allow memcmp to expand to vector loads (1)"
This reverts commit r316417, which causes internal compiles to OOM.
I don't unfortunately have a self-contained test case but will follow up
with courbet.
Artem Belevich [Tue, 24 Oct 2017 20:31:44 +0000 (20:31 +0000)]
[NVPTX] allow address space inference for volatile loads/stores.
If particular target supports volatile memory access operations, we can
avoid AS casting to generic AS. Currently it's only enabled in NVPTX for
loads and stores that access global & shared AS.
Gadi Haber [Tue, 24 Oct 2017 20:19:47 +0000 (20:19 +0000)]
[X86][Broadwell] Added the instruction scheduling information for the Broadwell CPU.
Adding the scheduling information for the Browadwell (BDW) CPU target.
This patch adds the instruction scheduling information for the Broadwell (BDW) architecture target by adding the file X86SchedBroadwell.td located under the X86 Target.
We used the scheduling information retrieved from the Broadwell architects in order to create the file.
The scheduling information includes latency, number of micro-Ops and used ports by each BDW instruction.
The patch continues the scheduling replacement and insertion effort started with the SandyBridge (SNB) target in r310792, the Haswell (HSW) target in r311879, the SkylakeClient (SKL) target in rL313613 + rL315978 and the SkylakeServer (SKX) in rL315175.
Performance fluctuations may be expected due to code alignment effects.
Vedant Kumar [Tue, 24 Oct 2017 20:03:37 +0000 (20:03 +0000)]
[llvm-cov] Use a stable sort on sub-views
We need to use a stable sort on instantiation and expansion sub-views to
produce consistent output. Fortunately, we've gotten lucky and the tests
have checks for the stable order.
This is needed to unblock D39245. Once that lands, we'll have better
test coverage for sort non-determinism.
Justin Bogner [Tue, 24 Oct 2017 18:04:54 +0000 (18:04 +0000)]
MIR: Print the register class or bank in vreg defs
This updates the MIRPrinter to include the regclass when printing
virtual register defs, which is already valid syntax for the
parser. That is, given 64 bit %0 and %1 in a "gpr" regbank,
%1(s64) = COPY %0(s64)
would now be written as
%1:gpr(s64) = COPY %0(s64)
While this change alone introduces a bit of redundancy with the
registers block, it allows us to update the tests to be more concise
and understandable and brings us closer to being able to remove the
registers block completely.
Note: We generally only print the class in defs, but there is one
exception. If there are uses without any defs whatsoever, we'll print
the class on all uses. I'm not completely convinced this comes up in
meaningful machine IR, but for now the MIRParser and MachineVerifier
both accept that kind of stuff, so we don't want to have a situation
where we can print something we can't parse.
David Blaikie [Tue, 24 Oct 2017 17:29:14 +0000 (17:29 +0000)]
BinaryFormat/MachO.h Don't mark header functions as file-scope static
This creates ODR violations if the function is called from another inline function in a header and also creates binary bloat from duplicate definitions.
David Blaikie [Tue, 24 Oct 2017 17:29:14 +0000 (17:29 +0000)]
ValueTracking.h Don't mark header functions as file-scope static
This creates ODR violations if the function is called from another inline function in a header and also creates binary bloat from duplicate definitions.
David Blaikie [Tue, 24 Oct 2017 17:29:13 +0000 (17:29 +0000)]
MemoryBuiltins.h: Don't mark header functions as file-scope static
This creates ODR violations if the function is called from another inline function in a header and also creates binary bloat from duplicate definitions.
David Blaikie [Tue, 24 Oct 2017 17:29:12 +0000 (17:29 +0000)]
IndirectCallSiteVisitor.h:findIndirectCallSites Don't mark header functions as file-scope static
This creates ODR violations if the function is called from another inline function in a header and also creates binary bloat from duplicate definitions.
David Blaikie [Tue, 24 Oct 2017 17:29:12 +0000 (17:29 +0000)]
StringExtras.h Don't mark header functions as file-scope static
This creates ODR violations if the function is called from another inline function in a header and also creates binary bloat from duplicate definitions.
David Blaikie [Tue, 24 Oct 2017 17:29:11 +0000 (17:29 +0000)]
SmallVector.h:capacity_in_bytes Don't mark header functions as file-scope static
This creates ODR violations if the function is called from another inline
function in a header and also creates binary bloat from duplicate definitions.
David Blaikie [Tue, 24 Oct 2017 17:29:11 +0000 (17:29 +0000)]
DenseMap.h:capacity_in_bytes Don't mark header functions as file-scope static
This creates ODR violations if the function is called from another inline
function in a header and also creates binary bloat from duplicate definitions.
David Blaikie [Tue, 24 Oct 2017 17:29:08 +0000 (17:29 +0000)]
BitVector.h:capacity_in_bytes Don't mark header functions as file-scope static
This creates ODR violations if the function is called from another
inline function in a header and also creates binary bloat from duplicate
definitions.
Yonghong Song [Tue, 24 Oct 2017 17:29:03 +0000 (17:29 +0000)]
bpf: fix a bug in bpf-isel trunc-op optimization
In BPF backend, we try to optimize away redundant
trunc operations so that kernel verifier rewrite
remains valid. Previous implementation only works
for a single function.
This patch fixed the issue for multiple functions.
It clears internal map data structure before
performing optimization for each function.
Michael Kruse [Tue, 24 Oct 2017 17:17:27 +0000 (17:17 +0000)]
[opt] Initialize WriteBitcode pass.
Probably due to a change of how some pass initializes its dependencies,
the -write-bitcode pass (Bitcode/Writer/BitcodeWriterPass.cpp) is not
initialized in opt anymore and therefore not usable with
opt -write-bitcode
Explicitly call initializeWriteBitcodePassPass() to make it available
in opt again.
Daniel Sanders [Tue, 24 Oct 2017 17:08:43 +0000 (17:08 +0000)]
[globalisel][tablegen] Multi-insn emission requires that BuildMIAction support not being linked to an InstructionMatcher. NFC
When multi-instruction emission is supported, it will no longer be guaranteed
that every BuildMIAction has a corresponding matched instruction. BuildMIAction
should support not having one to cover the case where a rule produces more
instructions than it matched.
Simon Pilgrim [Tue, 24 Oct 2017 15:38:16 +0000 (15:38 +0000)]
[X86] truncateVectorCompareWithPACKSS - use PACKSSDW/PACKSSWB instead of just PACKSSWB.
By using the widest type possible for PACKSS truncation we have a better chance of being able to peek through bitcasts and improves other combines driven by ComputeNumSignBits.
Sanjay Patel [Tue, 24 Oct 2017 14:32:52 +0000 (14:32 +0000)]
[utils] make retq/retl regex an option that is off by default
Ideally, we should compare 32- and 64-bit versions to see if the
ret line is the only difference and then insert the regex only
in that case. But this is a quick hack to avoid a bunch of noise
as existing tests are updated.
Oliver Stannard [Tue, 24 Oct 2017 14:20:13 +0000 (14:20 +0000)]
[ARM] Tighten up CHECK lines in a test
These tests checked for the line number without a leading ":", so for example,
a missed diagnostic on line 123 could match one on line 1123, 2123, etc,
desynchronising the test for hundreds of lines.
This couldn't cause it to incorrectly pass or fail, but made it hard to track
down test failures.
Oliver Stannard [Tue, 24 Oct 2017 14:19:08 +0000 (14:19 +0000)]
[ARM] Error for invalid shift in memory operand
Report a diagnostic when we fail to parse a shift in a memory operand because
the shift type is not an identifier. Without this, we were silently ignoring
the whole instruction.