Simon Pilgrim [Wed, 31 Jul 2019 12:55:39 +0000 (12:55 +0000)]
[X86][AVX] Ensure chained subvector insertions are the same size (PR42833)
Before combining insert_subvector(insert_subvector(vec, sub0, c0), sub1, c1) patterns, ensure that the subvectors are all the same type. On AVX512 targets especially we might have a mixture of 128/256 subvector insertions.
Simon Pilgrim [Wed, 31 Jul 2019 12:27:47 +0000 (12:27 +0000)]
[X86] Regenerate alias-static-alloca test checks to make D65354 diff easier
I've manually added the stack offsets back as these are worth keeping - we really need a way for update_llc_test_checks.py not to mask out useful address math
Roman Lebedev [Wed, 31 Jul 2019 12:06:51 +0000 (12:06 +0000)]
[DivRemPairs] Recommit: Handling for expanded-form rem - recomposition (PR42673)
Summary:
While `-div-rem-pairs` pass can decompose rem in div+rem pair when div-rem pair
is unsupported by target, nothing performs the opposite fold.
We can't do that in InstCombine or DAGCombine since neither of those has access to TTI.
So it makes most sense to teach `-div-rem-pairs` about it.
If we matched rem in expanded form, we know we will be able to place div-rem pair
next to each other so we won't regress the situation.
Also, we shouldn't decompose rem if we matched already-decomposed form.
This is surprisingly straight-forward otherwise.
The original patch was committed in rL367288 but was reverted in rL367289
because it exposed pre-existing RAUW issues in internal data structures
of the pass; those now have been addressed in a previous patch.
Roman Lebedev [Wed, 31 Jul 2019 12:06:38 +0000 (12:06 +0000)]
[DivRemPairs] Avoid RAUW pitfalls (PR42823)
Summary:
`DivRemPairs` internally creates two maps:
* {sign, divident, divisor} -> div instruction
* {sign, divident, divisor} -> rem instruction
Then it iterates over rem map, and looks if there is an entry
in div map with the same key. Then depending on some internal logic
it may RAUW rem instruction with something else.
But if that rem instruction is an input to other div/rem,
then it was used as a key in these maps, so the old value (used in key)
is now dandling, because RAUW didn't update those maps.
And we can't even RAUW map keys in general, there's `ValueMap`,
but we don't have a single `Value` as key...
The bug was discovered via D65298, and the test there exists.
Now, i'm not sure how to expose this issue in trunk.
The bug is clearly there if i change the map keys to be `AssertingVH`/`PoisoningVH`,
but i guess this didn't miscompiled anything thus far?
I really don't think this is benin without that patch.
The fix is actually rather straight-forward - instead of trying to somehow
shoe-horn `ValueMap` here (doesn't fit, key isn't just `Value`), or writing a new
`ValueMap` with key being a struct of `Value`s, we can just have an intermediate
data structure - a vector, each entry containing matching `Div, Rem` pair,
and pre-filling it before doing any modifications.
This way we won't need to query map after doing RAUW, so no bug is possible.
Sam Elliott [Wed, 31 Jul 2019 09:45:55 +0000 (09:45 +0000)]
[RISCV] Support 'f' Inline Assembly Constraint
Summary:
This adds the 'f' inline assembly constraint, as supported by GCC. An
'f'-constrained operand is passed in a floating point register. Exactly
which kind of floating-point register (32-bit or 64-bit) is decided
based on the operand type and the available standard extensions (-f and
-d, respectively).
This patch adds support in both the clang frontend, and LLVM itself.
Sam Parker [Wed, 31 Jul 2019 09:34:11 +0000 (09:34 +0000)]
[NFC][ARMCGP] Use switch in isSupportedValue
Use a switch instead of many isa<> while checking for supported
values. Also be explicit about which cast instructions are supported;
This allows the removal of SIToFP from GenerateSignBits.
Summary:
* Loads and stores in SVE2 are gather/scatter not contiguous, fixed by
renaming multiclasses to reflect this and also updated comments.
* Remove aliases from load/store multiclasses that reflect the behaviour
of the original form.
* Fix bug in scatter store implementation, vector list should be used as
input, not output.
Simon Cook [Wed, 31 Jul 2019 09:07:21 +0000 (09:07 +0000)]
[RISCV] Add support for lowering floating point inlineasm clobbers
This adds the required extension to RISC-V's getRegForInlineAsmConstraint
in order to be able to correctly distringuish between the 32 and 64-bit
floating point registers when the generic fX name appears in inlineasm
clobber contraints. It also adds a check to validate that callee saved
floating point registers are only saved in this case when a hard-float
ABI is selected.
Summary:
* Clarify comment with SVE2 for predicated shifts and move next to other
shift instructions.
* Clarify comments for various instructions.
* Move FCVTX instruction next to other fp conversions.
* Move FLOGB to next to other fp instructions and fix description.
* Remove "cons" from non-constructive multiclass for bitwise shift-right
and accumulate instructions.
[AArch64][SVE2] Use destination register as source register
Summary:
This patch fixes a bug in the following instructions that should have been
implemented as destructive. A destructive instruction is an instruction where
one of the source registers also acts as the destination register. Therefore,
the contents of the source register, when the instruction begins execution, are
replaced by the result of the instruction when the instruction completes
execution [1]:
Summary:
This patch introduces a type to straighten LLVM's alignment management.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
George Rimar [Wed, 31 Jul 2019 08:12:01 +0000 (08:12 +0000)]
[llvm/Object] - Add a test for "empty string table" error.
This error was never tested. In this patch I improved
the error message, added the test case and also simplified
the code that processes a similar error right below.
Summary: The minimum compilers support all have alignas, and we don't use LLVM_ALIGNAS anywhere anymore. This also removes an MSVC diagnostic which, according to the comment above, isn't relevant anymore.
Zi Xuan Wu [Wed, 31 Jul 2019 02:56:00 +0000 (02:56 +0000)]
[PowerPC] Eliminate loads/swap feeding swap/store for vector type by using big-endian load/store
In PowerPC, there is instruction to load vector in big endian element order when it's in little endian target.
So we can combine vector load + reverse into big endian load to eliminate the swap instruction.
Also combine vector reverse + store into big endian store.
Matt Arsenault [Wed, 31 Jul 2019 00:14:43 +0000 (00:14 +0000)]
TableGen: Add MinAlignment predicate
AMDGPU uses some custom code predicates for testing alignments.
I'm still having trouble comprehending the behavior of predicate bits
in the PatFrag hierarchy. Any attempt to abstract these properties
unexpectdly fails to apply them.
[MemorySSA] Extend allowed behavior for simplified instructions.
Summary:
LoopRotate may simplify instructions, leading to the new instructions not having memory accesses created for them.
Allow this behavior, by allowing the new access to be null when the template is null, and looking upwards for the proper defined access when dealing with simplified instructions.
Michael Liao [Tue, 30 Jul 2019 19:52:01 +0000 (19:52 +0000)]
[NVPTX] Fix PR41651
Summary:
- Use the passed `DL` directly as retrieving data layout from CS by
checking the called function is not reliable. Under indirect function
call, there is no called function.
[dsymutil] Pass LinkOptions by value instead of const ref.
When looping over the difference architectures in a fat binary, we
modify the link options before dispatching the link step to a different
thread. Passing the options by cont reference is not thread safe, as we
might modify its fields before the whole sturct is copied over.
Given that the link options are already stored in the DwarfLinker, we
can easily fix this by passing a copy of the link options instead of a
reference, which would just get copied later on.
[FunctionAttrs] Annotate "willreturn" for AssumeLikeInst
Summary:
In D37215, AssumeLikeInstruction are regarded as `willreturn`. In this patch, annotation is added to those which don't have `willreturn` now(`sideeffect, object_size, experimental_widenable_condition`).
Thomas Lively [Tue, 30 Jul 2019 18:08:39 +0000 (18:08 +0000)]
[WebAssembly] Do not emit tail calls with return type mismatch
Summary:
return_call and return_call_indirect are only valid if the return
types of the callee and caller match. We were previously not enforcing
that, which was producing invalid modules.
LLVM_ALIGNAS was removed from this class in http://llvm.org/r338099 but the comment was left there. The class is still sommewhat relevant despite the comment, let's keep it there with its one use.
[GVN] Preserve loop related analysis/canonical forms.
LoopInfo can be easily preserved by passing it to the functions that
modify the CFG (SplitCriticalEdge and MergeBlockIntoPredecessor.
SplitCriticalEdge also preserves LoopSimplify and LCSSA form when when passing in
LoopInfo. The test case shows that we preserve LoopSimplify and
LoopInfo. Adding addPreservedID(LCSSAID) did not preserve LCSSA for some
reason.
Also I am not sure if it is possible to preserve those in the new pass
manager, as they aren't analysis passes.
[LoopFusion] Extend use of OptimizationRemarkEmitter
Summary:
This patch extends the use of the OptimizationRemarkEmitter to provide
information about loops that are not fused, and loops that are not eligible for
fusion. In particular, it uses the OptimizationRemarkAnalysis to identify loops
that are not eligible for fusion and the OptimizationRemarkMissed to identify
loops that cannot be fused.
It also reuses the statistics to provide the messages used in the
OptimizationRemarks. This provides common message strings between the
optimization remarks and the statistics.
I would like feedback on this approach, in general. If people are OK with this,
I will flesh out additional remarks in subsequent commits.
The @srem_of_srem_expanded case exposed a RAUW pitfall in D65298.
Right now these don't appear to fail verification,
so it should be safe to precommit them.
Sean Fertile [Tue, 30 Jul 2019 15:37:01 +0000 (15:37 +0000)]
Address post commit review comments on revision 366727.
Addresses number of comment made on D64652 after commiting:
- Reorders function decls in the TargetLoweringObjectFileXCOFF class.
- Fix comment in MCSectionXCOFF to include description of external reference
csects.
- Convert several llvm_unreachables to report_fatal_error
- Convert several dyn_casts to casts as they are expected not to fail.
- Avoid copying DataLayout object.
Roman Lebedev [Tue, 30 Jul 2019 15:28:22 +0000 (15:28 +0000)]
[InstCombine] Fold "x ?% y ==/!= 0" to "x & (y-1) ==/!= 0" iff y is power-of-two
Summary:
I have stumbled into this by accident while preparing to extend backend `x s% C ==/!= 0` handling.
While we did happen to handle this fold in most of the cases,
the folding is indirect - we fold `x u% y` to `x & (y-1)` (iff `y` is power-of-two),
or first turn `x s% -y` to `x u% y`; that does handle most of the cases.
But we can't turn `x s% INT_MIN` to `x u% -INT_MIN`,
and thus we end up being stuck with `(x s% INT_MIN) == 0`.
There is no such restriction for the more general fold:
https://rise4fun.com/Alive/IIeS
To be noted, the fold does not enforce that `y` is a constant,
so it may indeed increase instruction count.
This is consistent with what `x u% y`->`x & (y-1)` already does.
I think it makes sense, it's at most one (simple) extra instruction,
while `rem`ainder is really much more un-simple (and likely **very** costly).
Sam Elliott [Tue, 30 Jul 2019 13:40:51 +0000 (13:40 +0000)]
[RISCV] Attempt to make rv{32,64}i-aliases-invalid.s less flaky
These tests have been disabled on Linux and Windows due to failing
there. I think that could be down to a race condition between stdout
and stderr, so I have disabled output to stdout.
For the moment, only re-enable on linux, because I don't have a windows
machine to test on.
George Rimar [Tue, 30 Jul 2019 13:37:02 +0000 (13:37 +0000)]
[llvm-objcopy] - Stop using Inputs/alloc-symtab.o
Initially Inputs/alloc-symtab.o was added in D42222.
It contains an allocatable .symtab section. Today
we are able to create such sections using yaml2obj.
Later people started using this input for no solid reason in their tests.
Now multiple of tests are using it.
(And those tests do not need such a specific case actually).
In this patch I removed this binary and rewrote the few tests.
This is the compantion patch to https://reviews.llvm.org/D64482, needed to ensure
that builds with host compilers that don't yet predefine _FILE_OFFSET_BITS=64 on
Solaris succeed by always making the host and freshly built clang consistent.
Roman Lebedev [Tue, 30 Jul 2019 07:10:00 +0000 (07:10 +0000)]
[DivRemPairs] Handling for expanded-form rem - recomposition (PR42673)
Summary:
While `-div-rem-pairs` pass can decompose rem in div+rem pair when div-rem pair
is unsupported by target, nothing performs the opposite fold.
We can't do that in InstCombine or DAGCombine since neither of those has access to TTI.
So it makes most sense to teach `-div-rem-pairs` about it.
If we matched rem in expanded form, we know we will be able to place div-rem pair
next to each other so we won't regress the situation.
Also, we shouldn't decompose rem if we matched already-decomposed form.
This is surprisingly straight-forward otherwise.
Michael Pozulp [Tue, 30 Jul 2019 07:05:27 +0000 (07:05 +0000)]
Revert "[llvm-objdump] Add warning messages if disassembly + source for problematic inputs"
This reverts r367284 (git commit b1cbe51bdf44098c74f5c74b7bcd8c041a7c6772).
My changes to LLVMSymbolizer caused a test to fail:
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/29488
[NFC] use C++11 in AlignOf.h, remove AlignedCharArray
I removed all uses of AlignedCharArray since the minimum MSVC version can handle
alignas on char arrays correctly. We can therefore remove AlignedCharArray.
This patch also updates AlignedCharArrayUnion to use C++11.
Alex Lorenz [Mon, 29 Jul 2019 23:38:30 +0000 (23:38 +0000)]
[FileCollector] Add a VFS that records FS accesses using the FileCollector
This patch adds a VFS that can be overlaid on top of another VFS
to record file system accesses using the FileCollector.
This can help to gather files that are needed for reproducers.
As discussed in D65249, don't use AlignedCharArray or std::aligned_storage. Just use alignas(X) char Buf[Size];. This will allow me to remove AlignedCharArray entirely, and works on the current minimum version of Visual Studio.
David Bolvansky [Mon, 29 Jul 2019 17:41:00 +0000 (17:41 +0000)]
[UpdateTestChecks] Emit warning when invalid value for -check-prefix(es) option
Summary:
The script is silent for the following issue:
FileCheck %s -check-prefix=CHECK,POPCOUNT
FileCheck will catch it later, but I think we can warn here too.
Now it warns:
./update_llc_test_checks.py file.ll
WARNING: Supplied prefix 'CHECK,POPCOUNT' is invalid. Prefix must contain only alphanumeric characters, hyphens and underscores. Did you mean --check-prefixes=CHECK,POPCOUNT?